[
  {
    "path": ".github/workflows/ci.yaml",
    "content": "name: PZ Merge Checks\n\non:\n  pull_request:\n    branches:\n      - main\n\njobs:\n  test:\n    runs-on: ubuntu-latest\n    steps:\n    - uses: actions/checkout@v4\n    - name: Set up Python\n      uses: actions/setup-python@v5\n      with:\n        python-version: '3.12'\n\n    - name: Install dependencies\n      run: |\n        pip install --upgrade pip\n        pip install .\n\n    - name: Download and register testdata\n      run: |\n        pushd testdata\n        wget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/enron-eval-tiny.tar.gz\n        wget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/real-estate-eval-tiny.tar.gz\n        tar -xzf enron-eval-tiny.tar.gz\n        tar -xzf real-estate-eval-tiny.tar.gz\n        rm *.tar.gz\n        popd\n\n    - name: Test with pytest\n      env:\n        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n        TOGETHER_API_KEY: ${{ secrets.TOGETHER_API_KEY }}\n        ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n      run: |\n        export CI=true\n        export NO_GEMINI=true\n        pip install pytest\n        pytest -v tests/pytest\n\n  lint-and-format:\n    runs-on: ubuntu-latest\n    steps:\n    - uses: actions/checkout@v4\n    - name: Set up Python\n      uses: actions/setup-python@v5\n      with:\n        python-version: '3.x'\n    - name: Install the code linting and formatting tool Ruff\n      run: pip install \"ruff>=0.9.0\"\n    - name: check version\n      run: ruff --version\n    - name: Lint code with Ruff\n      run: ruff check --output-format=github --target-version=py38\n    - name: Check code formatting with Ruff\n      run: ruff check --no-fix . 
--target-version=py38\n      continue-on-error: true\n\n  check-version-bump:\n    runs-on: ubuntu-latest\n    steps:\n    - uses: actions/checkout@v4\n    - name: Check Version Increased\n      run: |\n        git fetch --prune --unshallow\n        git checkout ${{ github.event.pull_request.base.sha }}\n        VERSION=`cat pyproject.toml | grep '^version' | sed -E 's/version.*=.*\\\"(.*)\".*/\\1/'`\n        echo \"Current version is $VERSION\"\n        git checkout ${{ github.event.pull_request.head.sha }}\n        VERSION_PR=`cat pyproject.toml | grep '^version' | sed -E 's/version.*=.*\\\"(.*)\".*/\\1/'`\n        echo \"Version in PR is $VERSION_PR\"\n        if [ \"$VERSION\" = \"$VERSION_PR\" ]; then\n          echo \"Error: Version has not been bumped\"\n          exit 1\n        fi\n"
  },
  {
    "path": ".github/workflows/docs.yaml",
    "content": "name: Deploy Docs to GitHub Pages\n\non:\n  push:\n    branches:\n      - main\n\npermissions:\n  contents: write\n\njobs:\n  build:\n    name: Build Docusaurus\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n      - uses: actions/setup-node@v4\n        with:\n          node-version: 20\n          cache: npm\n          cache-dependency-path: website/package-lock.json\n\n      - name: Install dependencies\n        run: |\n          cd website\n          npm ci\n      - name: Build website\n        run: |\n          cd website\n          npm run build\n          echo \"palimpzest.org\" > build/CNAME\n      - name: Upload Build Artifact\n        uses: actions/upload-pages-artifact@v3\n        with:\n          path: website/build\n\n  deploy:\n    name: Deploy to GitHub Pages\n    needs: build\n\n    # Grant GITHUB_TOKEN the permissions required to make a Pages deployment\n    permissions:\n      pages: write # to deploy to Pages\n      id-token: write # to verify the deployment originates from an appropriate source\n\n    # Deploy to the github-pages environment\n    environment:\n      name: github-pages\n      url: ${{ steps.deployment.outputs.page_url }}\n\n    runs-on: ubuntu-latest\n    steps:\n      - name: Deploy to GitHub Pages\n        id: deployment\n        uses: actions/deploy-pages@v4\n"
  },
  {
    "path": ".github/workflows/package.yaml",
    "content": "name: package\n\non:\n  push:\n    branches:\n      - main\n  pull_request:\n    branches:\n      - main\n\njobs:\n  build:\n    runs-on: ubuntu-latest\n    steps:\n    - uses: actions/checkout@v4\n    - name: Set up Python\n      uses: actions/setup-python@v5\n      with:\n        python-version: '3.x'\n    - name: Build Package\n      run: |\n        pip install --upgrade pip build\n        python3 -m build\n    - name: Store the distribution packages\n      uses: actions/upload-artifact@v4\n      with:\n        name: python-package-distributions\n        path: dist/\n\n  publish:\n    runs-on: ubuntu-latest\n    name: Publish Package\n    if: ${{ github.event_name == 'push' && github.ref == 'refs/heads/main' }}\n    needs:\n    - build\n    environment:\n      name: pypi\n      url: https://pypi.org/p/palimpzest\n    permissions:\n      id-token: write\n    steps:\n      - name: Download all the dists\n        uses: actions/download-artifact@v4\n        with:\n          name: python-package-distributions\n          path: dist/\n      - name: Publish distribution to PyPI\n        uses: pypa/gh-action-pypi-publish@release/v1\n\n  github-release:\n    name: >-\n      Sign distribution w/Sigstore and upload to GitHub Release\n    needs:\n    - publish\n    runs-on: ubuntu-latest\n    permissions:\n      contents: write\n      id-token: write\n    steps:\n    - name: Download all the dists\n      uses: actions/download-artifact@v4\n      with:\n        name: python-package-distributions\n        path: dist/\n    - name: Sign the dists with Sigstore\n      uses: sigstore/gh-action-sigstore-python@v3.0.0\n      with:\n        inputs: >-\n          ./dist/*.tar.gz\n          ./dist/*.whl\n    - name: Create GitHub Release\n      env:\n        GITHUB_TOKEN: ${{ github.token }}\n      run: |\n        PKG_VERSION=`ls dist/ | head -n 1 | sed -E 's/.*palimpzest-([0-9]+\\.[0-9]+\\.[0-9]+)-.*/\\1/'`\n        gh release create \"$PKG_VERSION\" --repo 
\"$GITHUB_REPOSITORY\" --notes \"\"\n    - name: Upload artifact signatures to GitHub Release\n      env:\n        GITHUB_TOKEN: ${{ github.token }}\n      # Upload to GitHub Release using the `gh` CLI.\n      # `dist/` contains the built packages, and the\n      # sigstore-produced signatures and certificates.\n      run: |\n        PKG_VERSION=`ls dist/ | head -n 1 | sed -E 's/.*palimpzest-([0-9]+\\.[0-9]+\\.[0-9]+)-.*/\\1/'`\n        gh release upload \"$PKG_VERSION\" dist/** --repo \"$GITHUB_REPOSITORY\"\n"
  },
  {
    "path": ".github/workflows/test-docs.yaml",
    "content": "name: Test Building Docs\n\non:\n  pull_request:\n    branches:\n      - main\n\njobs:\n  test-deploy:\n    name: Test deployment\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n      - uses: actions/setup-node@v4\n        with:\n          node-version: 20\n          cache: npm\n          cache-dependency-path: website/package-lock.json\n\n      - name: Install dependencies\n        run: |\n          cd website\n          npm ci\n      - name: Test build website\n        run: |\n          cd website\n          npm run build\n"
  },
  {
    "path": ".gitignore",
    "content": "docs/site/\n*.zip\n.cache/\n.env\nbuild/*\ndocs/build/*\ndocs/source/generated/*\ndist/*\n.vscode/*\n.idea/*\n.chroma\n.chroma-biodex\n.chroma-mmqa\n.ragatouille\nplots/\npaper-imgs/\n\n# testdata folders and archive files\ntestdata/enron-tiny.csv\ntestdata/*/\ntestdata/*.tar.gz\ntests/pytest/data/generator_messages/\nscripts/provider_stats/\nscripts/litellm_stats/\n\n# python artifacts\n*.egg-info\n**/__pycache__/\n\n# other\n.DS_Store\n\n# logs\n*.log\n\n# virtual environment(s)\nvenv/\nuv.lock\n\n# tmp\ntestdata/maildir/\ntestdata/real-estate-eval-100.tar\n\n# jupyter\n.ipynb_checkpoints/\n\n# evaluation\nold-eval-results/\neval-results/\n\ntestdata/enron-eval/*.txt\n\n# your zed using open source contributor who only installs in a virtual environment\n.venv\n.zed\npyrightconfig.json\n\nmyenv/\npz-env/\n\n# abacus-research data\nabacus-research/cuad-data/*\nabacus-research/opt-profiling-data/*\nabacus-research/parse-answer-errors/*\n\n# stats\nscripts/litellm_stats/\nscripts/provider_stats/\ntests/pytest/data/generator_messages/\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2024 MIT Data Systems Group\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "![pz-banner](https://palimpzest-workloads.s3.us-east-1.amazonaws.com/palimpzest-cropped.png)\n\n# Palimpzest (PZ)\n[![Discord](https://img.shields.io/discord/1245561987480420445?logo=discord)](https://discord.gg/dN85JJ6jaH)\n[![Docs](https://img.shields.io/badge/Read_the_Docs-purple?logo=readthedocs)](https://palimpzest.org/)\n[![Colab Demo](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Fm8I4yL1az395MsFkQbEIZSmUZs0oGvZ?usp=sharing)\n[![PyPI](https://img.shields.io/pypi/v/palimpzest)](https://pypi.org/project/palimpzest/)\n[![PyPI - Monthly Downloads](https://img.shields.io/pypi/dm/palimpzest?color=teal)](https://pypi.org/project/palimpzest/)\n<!-- [![Paper](https://img.shields.io/badge/Paper-arXiv-b31b1b?logo=arxiv)](https://arxiv.org/pdf/2405.14696) -->\n<!-- [![Video](https://img.shields.io/badge/YouTube-Talk-red?logo=youtube)](https://youtu.be/T8VQfyBiki0?si=eiph57DSEkDNbEIu) -->\n\n## 📚 Learn How to Use PZ\nOur [full documentation](https://palimpzest.org) is the definitive resource for learning how to use PZ. It contains all of the installation and quickstart materials on this page, as well as user guides, full API documentation (coming soon), and much more.\n\n## 🚀 Getting started\nYou can find a stable version of the PZ package on PyPI [here](https://pypi.org/project/palimpzest/). 
To install the package, run:\n```bash\n$ pip install palimpzest\n```\n\nYou can also install PZ with [uv](https://docs.astral.sh/uv/) for a faster installation:\n```bash\n$ uv pip install palimpzest\n```\n\nAlternatively, to install the latest version of the package from this repository, you can clone this repository and run the following commands:\n```bash\n$ git clone git@github.com:mitdbg/palimpzest.git\n$ cd palimpzest\n$ pip install .\n```\n\n## 🙋🏽 Join the PZ Community\nWe are actively hacking on PZ and would love to have you join our community [![Discord](https://img.shields.io/discord/1245561987480420445?logo=discord)](https://discord.gg/dN85JJ6jaH)\n\n[Our Discord server](https://discord.gg/dN85JJ6jaH) is the best place to:\n- Get help with your PZ program(s)\n- Give feedback to the maintainers\n- Discuss the future direction(s) of the project\n- Discuss anything related to data processing with LLMs!\n\nWe are eager to learn more about your workloads and use cases, and will take them into consideration in planning our future roadmap.\n\n### 📓 Citation\nIf you would like to cite our original paper on Palimpzest, please use the following citation:\n```\n@inproceedings{palimpzestCIDR,\n    title={Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing},\n    author={Liu, Chunwei and Russo, Matthew and Cafarella, Michael and Cao, Lei and Chen, Peter Baile and Chen, Zui and Franklin, Michael and Kraska, Tim and Madden, Samuel and Shahout, Rana and Vitagliano, Gerardo},\n    booktitle = {Proceedings of the {{Conference}} on {{Innovative Database Research}} ({{CIDR}})},\n    date = 2025,\n}\n```\n\nIf you would like to cite our paper on Palimpzest's optimizer Abacus, please use the following citation:\n```\n@misc{russo2025abacuscostbasedoptimizersemantic,\n      title={Abacus: A Cost-Based Optimizer for Semantic Operator Systems}, \n      author={Matthew Russo and Sivaprasad Sudhir and Gerardo Vitagliano and Chunwei Liu and Tim Kraska and 
Samuel Madden and Michael Cafarella},\n      year={2025},\n      eprint={2505.14661},\n      archivePrefix={arXiv},\n      primaryClass={cs.DB},\n      url={https://arxiv.org/abs/2505.14661}, \n}\n```\n"
  },
  {
    "path": "abacus-research/README.md",
    "content": "## Chroma Embeddings and MMQA files\nYou can download the chroma embeddings we computed for MMQA and BioDEX by executing the following:\n```sh\n$ ./download_embeddings_and_mmqa.sh\n```\nThis folder also contains questions for the different splits of MMQA -- of which we only use `MMQA_dev.jsonl` for scoring PZ's output. If you need the full MMQA dataset for any reason (e.g. to visualize which images are being retrieved by a pipeline), you can find it here: https://github.com/allenai/multimodalqa/tree/master.\n\n## Table 2\nThe following scripts create the data for Abacus in Table 2 in our Abacus paper.\n- `run_biodex.sh`\n- `run_cuad.sh`\n- `run_mmqa_complex.sh`\n\n## Table 3\nThe following scripts create the data for Abacus in Table 3 in our Abacus paper.\n- `run_biodex_min_cost_latency.sh`\n- `run_cuad_min_cost_latency.sh`\n- `run_mmqa_complex_min_cost_latency.sh`\n\n## Figure 6\nThe following scripts create the data for Figure 6 in our Abacus paper.\n- `run_biodex_priors.sh`\n- `run_biodex_priors_constrained.sh`\n- `run_cuad_priors.sh`\n- `run_cuad_priors_constrained.sh`\n\n## Figure 7\nThe `run_biodex_cost_threshold.sh` and `run_cuad_cost_threshold.sh` scripts create the data for Figure 7 in our Abacus paper.\n\n## Figure 8\nThe `run_ablation_study.sh` script creates the data for Figure 8 in our Abacus paper."
  },
  {
    "path": "abacus-research/README_CUAD_LOCAL.md",
    "content": "# CUAD Local Data Setup and Usage\n\n## Setup\n\nSince HuggingFace datasets no longer supports loading scripts, we've created a local data loading solution.\n\n### 1. Download CUAD Data\n\nFirst, run the setup script to download CUAD data to a local directory:\n\n```bash\npython setup_cuad_data.py\n```\n\nThis will:\n- Create a `cuad-data/` directory\n- Download the CUAD dataset files (train and test JSON files)\n- Download the original dataset script from HuggingFace for reference\n\n### 2. Updated Scripts\n\nThe following scripts have been updated to use local data via `cuad_data_loader.py`:\n\n- **cuad-demo.py**\n- **cuad-max-quality-at-cost.py**\n\n### 3. Running the Scripts\n\n#### Basic CUAD Demo\n```bash\n# Make sure OPENAI_API_KEY is set in .env or environment\nsource ../.env && export OPENAI_API_KEY\n\n# Run from abacus-research directory\nseed=0\nexp_name=\"cuad-final-mab-k6-j4-budget50-seed${seed}\"\npython cuad-demo.py --k 6 --j 4 --sample-budget 50 --seed $seed --exp-name $exp_name --gpt4-mini-only\n```\n\n#### Max Quality at Cost\n```bash\npython cuad-max-quality-at-cost.py --constrained --gpt4-mini-only\n```"
  },
  {
    "path": "abacus-research/biodex-ablation.py",
    "content": "import argparse\nimport json\nimport os\nimport time\n\nimport chromadb\nimport datasets\nfrom chromadb.utils.embedding_functions.openai_embedding_function import OpenAIEmbeddingFunction\n\nimport palimpzest as pz\nfrom palimpzest.constants import Model\n\nbiodex_entry_cols = [\n    {\"name\": \"pmid\", \"type\": str, \"desc\": \"The PubMed ID of the medical paper\"},\n    {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the medical paper\"},\n    {\"name\": \"abstract\", \"type\": str, \"desc\": \"The abstract of the medical paper\"},\n    {\"name\": \"fulltext\", \"type\": str, \"desc\": \"The full text of the medical paper, which contains information relevant for creating a drug safety report.\"},\n]\n\nbiodex_reactions_cols = [\n    {\"name\": \"reactions\", \"type\": list[str], \"desc\": \"The list of all medical conditions experienced by the patient as discussed in the report. Try to provide as many relevant medical conditions as possible.\"},\n]\n\nbiodex_reaction_labels_cols = [\n    {\"name\": \"reaction_labels\", \"type\": list[str], \"desc\": \"Official terms for medical conditions listed in `reactions`\"},\n]\n\nbiodex_ranked_reactions_labels_cols = [\n    {\"name\": \"ranked_reaction_labels\", \"type\": list[str], \"desc\": \"The ranked list of medical conditions experienced by the patient. The most relevant label occurs first in the list. 
Be sure to rank ALL of the inputs.\"},\n]\n\nclass BiodexValidator(pz.Validator):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__()\n\n        # read dataset and prepare entries\n        dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=\"train\").to_pandas()\n        if shuffle:\n            dataset = dataset.sample(n=num_samples, random_state=seed).to_dict(orient=\"records\")\n        else:\n            dataset = dataset.to_dict(orient=\"records\")[:num_samples]\n\n        # compute mapping from pmid --> label (i.e. reactions list)\n        self.pmid_to_label = self._compute_pmid_to_label(dataset)\n\n        # store rp_at_k for computing rank-precision at k metric\n        self.k = rp_at_k\n\n    def _compute_pmid_to_label(self, dataset: list[dict]) -> dict:\n        \"\"\"Compute the label for a BioDEX report given its entry in the dataset.\"\"\"\n        pmid_to_label = {}\n        for entry in dataset:\n            pmid = str(entry[\"pmid\"])\n            reactions_lst = [\n                reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n                for reaction in entry[\"reactions\"].split(\",\")\n            ]\n            pmid_to_label[pmid] = reactions_lst\n\n        return pmid_to_label\n\n    def rank_precision_at_k(self, preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # lower-case each list\n            preds = [pred.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for pred in preds]\n            targets = set([target.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for target in targets])\n\n            # compute rank-precision at k\n            rn = len(targets)\n            denom = min(self.k, rn)\n            total = 0.0\n            for i in range(self.k):\n                total += preds[i] 
in targets if i < len(preds) else 0.0\n\n            return total / denom\n\n        except Exception:\n            os.makedirs(\"rp@k-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"rp@k-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def term_recall(self, preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # normalize terms in each list\n            pred_terms = set([\n                term.strip()\n                for pred in preds\n                for term in pred.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n            target_terms = ([\n                term.strip()\n                for target in targets\n                for term in target.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n\n            # compute term recall and return\n            intersect = pred_terms.intersection(target_terms)\n            term_recall = len(intersect) / len(target_terms)\n\n            return term_recall\n\n        except Exception:\n            os.makedirs(\"term-recall-eval-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"term-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        field_name = fields[0]\n        if field_name == \"reactions\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.term_recall(preds, targets)\n        elif field_name == \"ranked_reaction_labels\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.rank_precision_at_k(preds, targets)\n        else:\n            
raise NotImplementedError(f\"Validator.map_score_fn not implemented for field {field_name}.\")\n\n    def topk_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        field_name = fields[0]\n        if field_name == \"reaction_labels\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[input_record[\"pmid\"]]\n            return self.term_recall(preds, targets)\n        else:\n            raise NotImplementedError(f\"Validator.topk_score_fn not implemented for field {field_name}.\")\n\n\nclass BiodexDataset(pz.IterDataset):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        split: str = \"test\",\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__(id=\"biodex\", schema=biodex_entry_cols)\n\n        self.dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=split).to_pandas()\n        if shuffle:\n            self.dataset = self.dataset.sample(n=num_samples, random_state=seed).to_dict(orient=\"records\")\n        else:\n            self.dataset = self.dataset.to_dict(orient=\"records\")[:num_samples]\n\n        self.rp_at_k = rp_at_k\n        self.num_samples = num_samples\n        self.shuffle = shuffle\n        self.seed = seed\n        self.split = split\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        # get entry\n        entry = self.dataset[idx]\n\n        # get input fields\n        pmid = entry[\"pmid\"]\n        title = entry[\"title\"]\n        abstract = entry[\"abstract\"]\n        fulltext = entry[\"fulltext\"]\n\n        # create item with fields\n        item = {\"pmid\": pmid, \"title\": title, \"abstract\": abstract, \"fulltext\": fulltext}\n\n        return item\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    parser.add_argument(\n 
       \"--optimizer-strategy\",\n        default=\"pareto\",\n        type=str,\n        help=\"The optimizer strategy to use. One of pareto or greedy\",\n    )\n    parser.add_argument(\n        \"--seed\",\n        default=42,\n        type=int,\n        help=\"Seed used to initialize RNG for MAB sampling algorithm\",\n    )\n    parser.add_argument(\n        \"--k\",\n        default=10,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--j\",\n        default=3,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--sample-budget\",\n        default=100,\n        type=int,\n        help=\"Total sample budget in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--exp-name\",\n        default=None,\n        type=str,\n        help=\"The experiment name.\",\n    )\n    parser.add_argument(\n        \"--policy\",\n        default=None,\n        type=str,\n        help=\"The policy (one of 'mincost' or 'maxquality').\",\n    )\n    parser.add_argument(\n        \"--priors-file\",\n        default=None,\n        type=str,\n        help=\"A file with a dictionary mapping physical operator ids to prior belief on their performance\",\n    )\n    args = parser.parse_args()\n\n    # create directory for profiling data\n    os.makedirs(\"ablation-data\", exist_ok=True)\n\n    seed = args.seed\n    k = args.k\n    j = args.j\n    sample_budget = args.sample_budget\n    optimizer_strategy = args.optimizer_strategy\n    exp_name = args.exp_name\n    priors = None\n    if args.priors_file is not None and os.path.exists(args.priors_file):\n        with open(args.priors_file) as f:\n            priors = json.load(f)\n\n    # set the optimization policy; constraint set to 80% of mean quality from unconstrained plans (Table 2)\n    policy 
= (\n        pz.MinCostAtFixedQuality(min_quality=0.8 * 0.261)\n        if args.policy == \"mincost\"\n        else pz.MaxQualityAtFixedCost(max_cost=0.5 * 0.7)\n    )\n    print(f\"USING POLICY: {policy}\")\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # create validator\n    validator = BiodexValidator(\n        rp_at_k=5,\n        num_samples=20,\n        shuffle=True,\n        seed=seed,\n    )\n\n    # create train dataset for validator\n    train_dataset = BiodexDataset(\n        split=\"train\",\n        num_samples=20,\n        shuffle=True,\n        seed=seed,\n    )\n    train_dataset = {train_dataset.id: train_dataset}\n\n    # load index [text-embedding-3-small]\n    chroma_client = chromadb.PersistentClient(\".chroma-biodex\")\n    openai_ef = OpenAIEmbeddingFunction(\n        api_key=os.environ[\"OPENAI_API_KEY\"],\n        model_name=\"text-embedding-3-small\",\n    )\n    index = chroma_client.get_collection(\"biodex-reaction-terms\", embedding_function=openai_ef)\n\n    def search_func(index: chromadb.Collection, query: list[list[float]], k: int) -> list[str]:\n        # execute query with embeddings\n        results = index.query(query, n_results=5)\n\n        # get list of result terms with their cosine similarity scores\n        final_results = []\n        for query_docs, query_distances in zip(results[\"documents\"], results[\"distances\"]):\n            for doc, dist in zip(query_docs, query_distances):\n                cosine_similarity = 1 - dist\n                final_results.append({\"content\": doc, \"similarity\": cosine_similarity})\n\n        # sort the results by similarity score\n        sorted_results = sorted(final_results, key=lambda result: result[\"similarity\"], reverse=True)\n\n        # remove duplicates\n        sorted_results_set = 
set()\n        final_sorted_results = []\n        for result in sorted_results:\n            if result[\"content\"] not in sorted_results_set:\n                sorted_results_set.add(result[\"content\"])\n                final_sorted_results.append(result[\"content\"])\n\n        # return the top-k similar results and generation stats\n        return {\"reaction_labels\": final_sorted_results[:k]}\n\n    # construct plan\n    plan = BiodexDataset(split=\"test\", num_samples=250, shuffle=True, seed=seed)\n    plan = plan.sem_map(biodex_reactions_cols)\n    plan = plan.sem_topk(\n        index=index,\n        search_func=search_func,\n        search_attr=\"reactions\",\n        output_attrs=biodex_reaction_labels_cols,\n    )\n    plan = plan.sem_map(biodex_ranked_reactions_labels_cols, depends_on=[\"title\", \"abstract\", \"fulltext\", \"reaction_labels\"])\n\n    # set models\n    models = [\n        Model.GPT_4o,\n        Model.GPT_4o_MINI,\n        Model.LLAMA3_1_8B,\n        Model.LLAMA3_3_70B,\n        # Model.MIXTRAL,  # NOTE: only available in tag `abacus-paper-experiments`\n        # Model.DEEPSEEK_R1_DISTILL_QWEN_1_5B,\n    ]\n\n    # execute pz plan\n    config = pz.QueryProcessorConfig(\n        policy=policy,\n        optimizer_strategy=optimizer_strategy,\n        execution_strategy=\"parallel\",\n        use_final_op_quality=True,\n        max_workers=64,\n        available_models=models,\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=True,\n        k=k,\n        j=j,\n        sample_budget=sample_budget,\n        # sample_cost_budget=0.10,\n        seed=seed,\n        exp_name=exp_name,\n        priors=priors,\n        dont_use_priors=(priors is None),\n    )\n\n    data_record_collection = plan.optimize_and_run(config=config, train_dataset=train_dataset, validator=validator)\n\n    print(data_record_collection.to_df())\n    
data_record_collection.to_df().to_csv(f\"ablation-data/{exp_name}-output.csv\", index=False)\n\n    # create filepaths for records and stats\n    records_path = f\"ablation-data/{exp_name}-records.json\"\n    stats_path = f\"ablation-data/{exp_name}-profiling.json\"\n\n    # save record outputs\n    record_jsons = []\n    for record in data_record_collection:\n        record_dict = record.to_dict()\n        record_dict = {\n            k: v\n            for k, v in record_dict.items()\n            if k in [\"pmid\", \"reactions\", \"reaction_labels\", \"ranked_reaction_labels\"]\n        }\n        record_jsons.append(record_dict)\n\n    with open(records_path, \"w\") as f:\n        json.dump(record_jsons, f)\n\n    # save statistics\n    execution_stats_dict = data_record_collection.execution_stats.to_json()\n    with open(stats_path, \"w\") as f:\n        json.dump(execution_stats_dict, f)\n\n    # score output\n    test_dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=\"test\").to_pandas()\n    test_dataset = test_dataset.sample(n=250, random_state=seed).to_dict(orient=\"records\")\n\n    # construct mapping from pmid --> label (field, value) pairs\n    def compute_target_record(entry):\n        reactions_lst = [\n            reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n            for reaction in entry[\"reactions\"].split(\",\")\n        ]\n        label_dict = {\"ranked_reaction_labels\": reactions_lst}\n        return label_dict\n\n    label_fields_to_values = {\n        entry[\"pmid\"]: compute_target_record(entry) for entry in test_dataset\n    }\n\n    def rank_precision_at_k(preds: list, targets: list, k: int):\n        if preds is None:\n            return 0.0\n\n        # lower-case each list\n        preds = [pred.lower().replace(\"'\", \"\").replace(\"^\", \"\") for pred in preds]\n        targets = set([target.lower().replace(\"'\", \"\").replace(\"^\", \"\") for target in targets])\n\n        # compute 
rank-precision at k\n        rn = len(targets)\n        denom = min(k, rn)\n        total = 0.0\n        for i in range(k):\n            total += preds[i] in targets if i < len(preds) else 0.0\n\n        return total / denom\n\n    def compute_avg_rp_at_k(records, k=5):\n        total_rp_at_k = 0\n        bad = 0\n        for record in records:\n            pmid = record['pmid']\n            preds = record['ranked_reaction_labels']\n            targets = label_fields_to_values[pmid]['ranked_reaction_labels']\n            try:\n                total_rp_at_k += rank_precision_at_k(preds, targets, k)\n            except Exception:\n                bad += 1\n\n        return total_rp_at_k / len(records), bad\n\n    rp_at_k, bad = compute_avg_rp_at_k(record_jsons, k=5)\n    final_plan_id = list(data_record_collection.execution_stats.plan_stats.keys())[0]\n    final_plan_str = data_record_collection.execution_stats.plan_strs[final_plan_id]\n    stats_dict = {\n        \"rp@5\": rp_at_k,\n        \"optimization_time\": data_record_collection.execution_stats.optimization_time,\n        \"optimization_cost\": data_record_collection.execution_stats.optimization_cost,\n        \"plan_execution_time\": data_record_collection.execution_stats.plan_execution_time,\n        \"plan_execution_cost\": data_record_collection.execution_stats.plan_execution_cost,\n        \"total_execution_time\": data_record_collection.execution_stats.total_execution_time,\n        \"total_execution_cost\": data_record_collection.execution_stats.total_execution_cost,\n        \"plan_str\": final_plan_str,\n    }\n    with open(f\"ablation-data/{exp_name}-metrics.json\", \"w\") as f:\n        json.dump(stats_dict, f)\n\n    print(f\"bad: {bad}\")\n    print(\"-------\")\n    print(f\"rp@k: {rp_at_k:.5f}\")\n    print(f\"Optimization time: {data_record_collection.execution_stats.optimization_time}\")\n    print(f\"Optimization cost: {data_record_collection.execution_stats.optimization_cost}\")\n    
print(f\"Plan Exec. time: {data_record_collection.execution_stats.plan_execution_time}\")\n    print(f\"Plan Exec. cost: {data_record_collection.execution_stats.plan_execution_cost}\")\n    print(f\"Total Execution time: {data_record_collection.execution_stats.total_execution_time}\")\n    print(f\"Total Execution Cost: {data_record_collection.execution_stats.total_execution_cost}\")\n"
  },
  {
    "path": "abacus-research/biodex-demo.py",
    "content": "import argparse\nimport json\nimport os\nimport time\n\nimport chromadb\nimport datasets\nfrom chromadb.utils.embedding_functions.openai_embedding_function import OpenAIEmbeddingFunction\n\nimport palimpzest as pz\nfrom palimpzest.constants import Model\n\nbiodex_entry_cols = [\n    {\"name\": \"pmid\", \"type\": str, \"desc\": \"The PubMed ID of the medical paper\"},\n    {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the medical paper\"},\n    {\"name\": \"abstract\", \"type\": str, \"desc\": \"The abstract of the medical paper\"},\n    {\"name\": \"fulltext\", \"type\": str, \"desc\": \"The full text of the medical paper, which contains information relevant for creating a drug safety report.\"},\n]\n\nbiodex_reactions_cols = [\n    {\"name\": \"reactions\", \"type\": list[str], \"desc\": \"The list of all medical conditions experienced by the patient as discussed in the report. Try to provide as many relevant medical conditions as possible.\"},\n]\n\nbiodex_reaction_labels_cols = [\n    {\"name\": \"reaction_labels\", \"type\": list[str], \"desc\": \"Official terms for medical conditions listed in `reactions`\"},\n]\n\nbiodex_ranked_reactions_labels_cols = [\n    {\"name\": \"ranked_reaction_labels\", \"type\": list[str], \"desc\": \"The ranked list of medical conditions experienced by the patient. The most relevant label occurs first in the list. 
Be sure to rank ALL of the inputs.\"},\n]\n\nclass BiodexValidator(pz.Validator):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__()\n\n        # read dataset and prepare entries\n        dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=\"train\").to_pandas()\n        if shuffle:\n            dataset = dataset.sample(n=num_samples, random_state=seed).to_dict(orient=\"records\")\n        else:\n            dataset = dataset.to_dict(orient=\"records\")[:num_samples]\n\n        # compute mapping from pmid --> label (i.e. reactions list)\n        self.pmid_to_label = self._compute_pmid_to_label(dataset)\n\n        # store rp_at_k for computing rank-precision at k metric\n        self.k = rp_at_k\n\n    def _compute_pmid_to_label(self, dataset: list[dict]) -> dict:\n        \"\"\"Compute the label for a BioDEX report given its entry in the dataset.\"\"\"\n        pmid_to_label = {}\n        for entry in dataset:\n            pmid = str(entry[\"pmid\"])\n            reactions_lst = [\n                reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n                for reaction in entry[\"reactions\"].split(\",\")\n            ]\n            pmid_to_label[pmid] = reactions_lst\n\n        return pmid_to_label\n\n    def rank_precision_at_k(self, preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # lower-case each list\n            preds = [pred.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for pred in preds]\n            targets = set([target.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for target in targets])\n\n            # compute rank-precision at k\n            rn = len(targets)\n            denom = min(self.k, rn)\n            total = 0.0\n            for i in range(self.k):\n                total += preds[i] 
in targets if i < len(preds) else 0.0\n\n            return total / denom\n\n        except Exception:\n            os.makedirs(\"rp@k-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"rp@k-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def term_recall(self, preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # normalize terms in each list\n            pred_terms = set([\n                term.strip()\n                for pred in preds\n                for term in pred.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n            target_terms = ([\n                term.strip()\n                for target in targets\n                for term in target.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n\n            # compute term recall and return\n            intersect = pred_terms.intersection(target_terms)\n            term_recall = len(intersect) / len(target_terms)\n\n            return term_recall\n\n        except Exception:\n            os.makedirs(\"term-recall-eval-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"term-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        field_name = fields[0]\n        if field_name == \"reactions\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.term_recall(preds, targets)\n        elif field_name == \"ranked_reaction_labels\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.rank_precision_at_k(preds, targets)\n        else:\n            
raise NotImplementedError(f\"Validator.map_score_fn not implemented for field {field_name}.\")\n\n    def topk_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        field_name = fields[0]\n        if field_name == \"reaction_labels\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[input_record[\"pmid\"]]\n            return self.term_recall(preds, targets)\n        else:\n            raise NotImplementedError(f\"Validator.topk_score_fn not implemented for field {field_name}.\")\n\n\nclass BiodexDataset(pz.IterDataset):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        split: str = \"test\",\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__(id=\"biodex\", schema=biodex_entry_cols)\n\n        self.dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=split).to_pandas()\n        if shuffle:\n            self.dataset = self.dataset.sample(n=num_samples, random_state=seed).to_dict(orient=\"records\")\n        else:\n            self.dataset = self.dataset.to_dict(orient=\"records\")[:num_samples]\n\n        self.rp_at_k = rp_at_k\n        self.num_samples = num_samples\n        self.shuffle = shuffle\n        self.seed = seed\n        self.split = split\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        # get entry\n        entry = self.dataset[idx]\n\n        # get input fields\n        pmid = entry[\"pmid\"]\n        title = entry[\"title\"]\n        abstract = entry[\"abstract\"]\n        fulltext = entry[\"fulltext\"]\n\n        # create item with fields\n        item = {\"pmid\": pmid, \"title\": title, \"abstract\": abstract, \"fulltext\": fulltext}\n\n        return item\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    
parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\"--progress\", default=False, action=\"store_true\", help=\"Print progress output\")\n    parser.add_argument(\"--constrained\", default=False, action=\"store_true\", help=\"Use constrained objective\")\n    parser.add_argument(\"--gpt4-mini-only\", default=False, action=\"store_true\", help=\"Use only GPT-4o-mini\")\n    parser.add_argument(\n        \"--execution-strategy\",\n        default=\"parallel\",\n        type=str,\n        help=\"The plan executor to use. One of sequential, pipelined, parallel\",\n    )\n    parser.add_argument(\n        \"--sentinel-execution-strategy\",\n        default=\"mab\",\n        type=str,\n        help=\"The sentinel execution strategy to use. One of mab or random\",\n    )\n    parser.add_argument(\n        \"--policy\",\n        default=\"maxquality\",\n        type=str,\n        help=\"One of 'maxquality', 'mincost', 'minlatency', 'mincostatfixedquality', 'minlatencyatfixedquality'\",\n    )\n    parser.add_argument(\n        \"--val-examples\",\n        default=25,\n        type=int,\n        help=\"Number of validation examples to sample from\",\n    )\n    parser.add_argument(\n        \"--model\",\n        default=\"gpt-4o\",\n        type=str,\n        help=\"One of 'gpt-4o', 'gpt-4o-mini', 'llama'\",\n    )\n    parser.add_argument(\n        \"--seed\",\n        default=42,\n        type=int,\n        help=\"Seed used to initialize RNG for MAB sampling algorithm\",\n    )\n    parser.add_argument(\n        \"--k\",\n        default=10,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--j\",\n        default=3,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--sample-budget\",\n        default=100,\n        type=int,\n        
help=\"Total sample budget in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--exp-name\",\n        default=None,\n        type=str,\n        help=\"The experiment name.\",\n    )\n    parser.add_argument(\n        \"--priors-file\",\n        default=None,\n        type=str,\n        help=\"A file with a dictionary mapping physical operator ids to prior belief on their performance\",\n    )\n    parser.add_argument(\n        \"--quality\",\n        default=None,\n        type=float,\n        help=\"Quality threshold\",\n    )\n\n    args = parser.parse_args()\n\n    # create directory for profiling data\n    os.makedirs(\"opt-profiling-data\", exist_ok=True)\n\n    verbose = args.verbose\n    progress = args.progress\n    seed = args.seed\n    val_examples = args.val_examples\n    k = args.k\n    j = args.j\n    sample_budget = args.sample_budget\n    execution_strategy = args.execution_strategy\n    sentinel_execution_strategy = args.sentinel_execution_strategy\n    exp_name = (\n        f\"biodex-final-{sentinel_execution_strategy}-k{k}-j{j}-budget{sample_budget}-seed{seed}\"\n        if args.exp_name is None\n        else args.exp_name\n    )\n    priors = None\n    if args.priors_file is not None:\n        with open(args.priors_file) as f:\n            priors = json.load(f)\n\n    # set the optimization policy; constraint set to 25% percentile from unconstrained plans\n    policy = pz.MaxQuality() if not args.constrained else pz.MaxQualityAtFixedCost(max_cost=2.250)\n    if args.policy == \"mincost\":\n        policy = pz.MinCost()\n    elif args.policy == \"minlatency\":\n        policy = pz.MinTime()\n    elif args.quality is not None and args.policy == \"mincostatfixedquality\":\n        policy = pz.MinCostAtFixedQuality(min_quality=args.quality)\n    elif args.quality is not None and args.policy == \"minlatencyatfixedquality\":\n        policy = pz.MinTimeAtFixedQuality(min_quality=args.quality)\n    print(f\"USING 
POLICY: {policy}\")\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # create validator\n    validator = BiodexValidator(\n        rp_at_k=5,\n        num_samples=val_examples,\n        shuffle=True,\n        seed=seed,\n    )\n\n    # create train dataset for validator\n    train_dataset = BiodexDataset(\n        split=\"train\",\n        num_samples=val_examples,\n        shuffle=True,\n        seed=seed,\n    )\n    train_dataset = {train_dataset.id: train_dataset}\n\n    # load index [text-embedding-3-small]\n    chroma_client = chromadb.PersistentClient(\".chroma-biodex\")\n    openai_ef = OpenAIEmbeddingFunction(\n        api_key=os.environ[\"OPENAI_API_KEY\"],\n        model_name=\"text-embedding-3-small\",\n    )\n    index = chroma_client.get_collection(\"biodex-reaction-terms\", embedding_function=openai_ef)\n\n    def search_func(index: chromadb.Collection, query: list[list[float]], k: int) -> list[str]:\n        # execute query with embeddings\n        results = index.query(query, n_results=5)\n\n        # get list of result terms with their cosine similarity scores\n        final_results = []\n        for query_docs, query_distances in zip(results[\"documents\"], results[\"distances\"]):\n            for doc, dist in zip(query_docs, query_distances):\n                cosine_similarity = 1 - dist\n                final_results.append({\"content\": doc, \"similarity\": cosine_similarity})\n\n        # sort the results by similarity score\n        sorted_results = sorted(final_results, key=lambda result: result[\"similarity\"], reverse=True)\n\n        # remove duplicates\n        sorted_results_set = set()\n        final_sorted_results = []\n        for result in sorted_results:\n            if result[\"content\"] not in sorted_results_set:\n                
sorted_results_set.add(result[\"content\"])\n                final_sorted_results.append(result[\"content\"])\n\n        # return the top-k similar results and generation stats\n        return {\"reaction_labels\": final_sorted_results[:k]}\n\n    # construct plan\n    plan = BiodexDataset(split=\"test\", num_samples=250, shuffle=True, seed=seed)\n    plan = plan.sem_map(biodex_reactions_cols)\n    plan = plan.sem_topk(\n        index=index,\n        search_func=search_func,\n        search_attr=\"reactions\",\n        output_attrs=biodex_reaction_labels_cols,\n    )\n    plan = plan.sem_map(biodex_ranked_reactions_labels_cols, depends_on=[\"title\", \"abstract\", \"fulltext\", \"reaction_labels\"])\n\n    # set models\n    models = [Model.GPT_4o_MINI] if args.gpt4_mini_only else [\n        Model.GPT_4o,\n        Model.GPT_4o_MINI,\n        Model.LLAMA3_1_8B,\n        Model.LLAMA3_3_70B,\n        # Model.MIXTRAL,  # NOTE: only available in tag `abacus-paper-experiments`\n        Model.DEEPSEEK_R1_DISTILL_QWEN_1_5B,\n    ]\n\n    # execute pz plan\n    config = pz.QueryProcessorConfig(\n        policy=policy,\n        optimizer_strategy=\"pareto\",\n        sentinel_execution_strategy=sentinel_execution_strategy,\n        execution_strategy=execution_strategy,\n        use_final_op_quality=True,\n        max_workers=64,\n        verbose=verbose,\n        available_models=models,\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=progress,\n        k=k,\n        j=j,\n        sample_budget=sample_budget,\n        # sample_cost_budget=0.10,\n        seed=seed,\n        exp_name=exp_name,\n        priors=priors,\n    )\n\n    data_record_collection = plan.optimize_and_run(config=config, train_dataset=train_dataset, validator=validator)\n\n    print(data_record_collection.to_df())\n    data_record_collection.to_df().to_csv(f\"opt-profiling-data/{exp_name}-output.csv\", 
index=False)\n\n    # create filepaths for records and stats\n    records_path = f\"opt-profiling-data/{exp_name}-records.json\"\n    stats_path = f\"opt-profiling-data/{exp_name}-profiling.json\"\n\n    # save record outputs\n    record_jsons = []\n    for record in data_record_collection:\n        record_dict = record.to_dict()\n        record_dict = {\n            k: v\n            for k, v in record_dict.items()\n            if k in [\"pmid\", \"reactions\", \"reaction_labels\", \"ranked_reaction_labels\"]\n        }\n        record_jsons.append(record_dict)\n\n    with open(records_path, \"w\") as f:\n        json.dump(record_jsons, f)\n\n    # save statistics\n    execution_stats_dict = data_record_collection.execution_stats.to_json()\n    with open(stats_path, \"w\") as f:\n        json.dump(execution_stats_dict, f)\n\n    # score output\n    test_dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=\"test\").to_pandas()\n    test_dataset = test_dataset.sample(n=250, random_state=seed).to_dict(orient=\"records\")\n\n    # construct mapping from pmid --> label (field, value) pairs\n    def compute_target_record(entry):\n        reactions_lst = [\n            reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n            for reaction in entry[\"reactions\"].split(\",\")\n        ]\n        label_dict = {\"ranked_reaction_labels\": reactions_lst}\n        return label_dict\n\n    label_fields_to_values = {\n        entry[\"pmid\"]: compute_target_record(entry) for entry in test_dataset\n    }\n\n    def rank_precision_at_k(preds: list, targets: list, k: int):\n        if preds is None:\n            return 0.0\n\n        # lower-case each list\n        preds = [pred.lower().replace(\"'\", \"\").replace(\"^\", \"\") for pred in preds]\n        targets = set([target.lower().replace(\"'\", \"\").replace(\"^\", \"\") for target in targets])\n\n        # compute rank-precision at k\n        rn = len(targets)\n        denom = min(k, 
rn)\n        total = 0.0\n        for i in range(k):\n            total += preds[i] in targets if i < len(preds) else 0.0\n\n        return total / denom\n\n    def compute_avg_rp_at_k(records, k=5):\n        total_rp_at_k = 0\n        bad = 0\n        for record in records:\n            pmid = record['pmid']\n            preds = record['ranked_reaction_labels']\n            targets = label_fields_to_values[pmid]['ranked_reaction_labels']\n            try:\n                total_rp_at_k += rank_precision_at_k(preds, targets, k)\n            except Exception:\n                bad += 1\n\n        return total_rp_at_k / len(records), bad\n\n    rp_at_k, bad = compute_avg_rp_at_k(record_jsons, k=5)\n    final_plan_id = list(data_record_collection.execution_stats.plan_stats.keys())[0]\n    final_plan_str = data_record_collection.execution_stats.plan_strs[final_plan_id]\n    stats_dict = {\n        \"rp@5\": rp_at_k,\n        \"optimization_time\": data_record_collection.execution_stats.optimization_time,\n        \"optimization_cost\": data_record_collection.execution_stats.optimization_cost,\n        \"plan_execution_time\": data_record_collection.execution_stats.plan_execution_time,\n        \"plan_execution_cost\": data_record_collection.execution_stats.plan_execution_cost,\n        \"total_execution_time\": data_record_collection.execution_stats.total_execution_time,\n        \"total_execution_cost\": data_record_collection.execution_stats.total_execution_cost,\n        \"plan_str\": final_plan_str,\n    }\n    with open(f\"opt-profiling-data/{exp_name}-metrics.json\", \"w\") as f:\n        json.dump(stats_dict, f)\n\n    print(f\"bad: {bad}\")\n    print(\"-------\")\n    print(f\"rp@k: {rp_at_k:.5f}\")\n    print(f\"Optimization time: {data_record_collection.execution_stats.optimization_time}\")\n    print(f\"Optimization cost: {data_record_collection.execution_stats.optimization_cost}\")\n    print(f\"Plan Exec. 
time: {data_record_collection.execution_stats.plan_execution_time}\")\n    print(f\"Plan Exec. cost: {data_record_collection.execution_stats.plan_execution_cost}\")\n    print(f\"Total Execution time: {data_record_collection.execution_stats.total_execution_time}\")\n    print(f\"Total Execution Cost: {data_record_collection.execution_stats.total_execution_cost}\")\n"
  },
  {
    "path": "abacus-research/biodex-max-quality-at-cost.py",
    "content": "import argparse\nimport json\nimport os\nimport time\n\nimport chromadb\nimport datasets\nfrom chromadb.utils.embedding_functions.openai_embedding_function import OpenAIEmbeddingFunction\n\n# from ragatouille import RAGPretrainedModel\nimport palimpzest as pz\nfrom palimpzest.constants import Model\nfrom palimpzest.policy import MaxQuality, MaxQualityAtFixedCost\n\nbiodex_entry_cols = [\n    {\"name\": \"pmid\", \"type\": str, \"desc\": \"The PubMed ID of the medical paper\"},\n    {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the medical paper\"},\n    {\"name\": \"abstract\", \"type\": str, \"desc\": \"The abstract of the medical paper\"},\n    {\"name\": \"fulltext\", \"type\": str, \"desc\": \"The full text of the medical paper, which contains information relevant for creating a drug safety report.\"},\n]\n\nbiodex_reactions_cols = [\n    {\"name\": \"reactions\", \"type\": list[str], \"desc\": \"The list of all medical conditions experienced by the patient as discussed in the report. Try to provide as many relevant medical conditions as possible.\"},\n]\n\nbiodex_reaction_labels_cols = [\n    {\"name\": \"reaction_labels\", \"type\": list[str], \"desc\": \"Official terms for medical conditions listed in `reactions`\"},\n]\n\nbiodex_ranked_reactions_labels_cols = [\n    {\"name\": \"ranked_reaction_labels\", \"type\": list[str], \"desc\": \"The ranked list of medical conditions experienced by the patient. The most relevant label occurs first in the list. 
Be sure to rank ALL of the inputs.\"},\n]\n\nclass BiodexValidator(pz.Validator):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__()\n\n        # read dataset and prepare entries\n        dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=\"train\").to_pandas()\n        if shuffle:\n            dataset = dataset.sample(n=num_samples, random_state=seed).to_dict(orient=\"records\")\n        else:\n            dataset = dataset.to_dict(orient=\"records\")[:num_samples]\n\n        # compute mapping from pmid --> label (i.e. reactions list)\n        self.pmid_to_label = self._compute_pmid_to_label(dataset)\n\n        # store rp_at_k for computing rank-precision at k metric\n        self.k = rp_at_k\n\n    def _compute_pmid_to_label(self, dataset: list[dict]) -> dict:\n        \"\"\"Compute the label for a BioDEX report given its entry in the dataset.\"\"\"\n        pmid_to_label = {}\n        for entry in dataset:\n            pmid = str(entry[\"pmid\"])\n            reactions_lst = [\n                reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n                for reaction in entry[\"reactions\"].split(\",\")\n            ]\n            pmid_to_label[pmid] = reactions_lst\n\n        return pmid_to_label\n\n    def rank_precision_at_k(self, preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # lower-case each list\n            preds = [pred.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for pred in preds]\n            targets = set([target.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for target in targets])\n\n            # compute rank-precision at k\n            rn = len(targets)\n            denom = min(self.k, rn)\n            total = 0.0\n            for i in range(self.k):\n                total += preds[i] 
in targets if i < len(preds) else 0.0\n\n            return total / denom\n\n        except Exception:\n            os.makedirs(\"rp@k-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"rp@k-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def term_recall(self, preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # normalize terms in each list\n            pred_terms = set([\n                term.strip()\n                for pred in preds\n                for term in pred.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n            target_terms = ([\n                term.strip()\n                for target in targets\n                for term in target.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n\n            # compute term recall and return\n            intersect = pred_terms.intersection(target_terms)\n            term_recall = len(intersect) / len(target_terms)\n\n            return term_recall\n\n        except Exception:\n            os.makedirs(\"term-recall-eval-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"term-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        field_name = fields[0]\n        if field_name == \"reactions\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.term_recall(preds, targets)\n        elif field_name == \"ranked_reaction_labels\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.rank_precision_at_k(preds, targets)\n        else:\n            
raise NotImplementedError(f\"Validator.map_score_fn not implemented for field {field_name}.\")\n\n    def topk_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        field_name = fields[0]\n        if field_name == \"reaction_labels\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[input_record[\"pmid\"]]\n            return self.term_recall(preds, targets)\n        else:\n            raise NotImplementedError(f\"Validator.topk_score_fn not implemented for field {field_name}.\")\n\n\nclass BiodexDataset(pz.IterDataset):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        split: str = \"test\",\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__(id=\"biodex\", schema=biodex_entry_cols)\n\n        self.dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=split).to_pandas()\n        if shuffle:\n            self.dataset = self.dataset.sample(n=num_samples, random_state=seed).to_dict(orient=\"records\")\n        else:\n            self.dataset = self.dataset.to_dict(orient=\"records\")[:num_samples]\n\n        self.rp_at_k = rp_at_k\n        self.num_samples = num_samples\n        self.shuffle = shuffle\n        self.seed = seed\n        self.split = split\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        # get entry\n        entry = self.dataset[idx]\n\n        # get input fields\n        pmid = entry[\"pmid\"]\n        title = entry[\"title\"]\n        abstract = entry[\"abstract\"]\n        fulltext = entry[\"fulltext\"]\n\n        # create item with fields\n        item = {\"fields\": {}, \"labels\": {}, \"score_fn\": {}}\n        item[\"fields\"][\"pmid\"] = pmid\n        item[\"fields\"][\"title\"] = title\n        item[\"fields\"][\"abstract\"] = abstract\n        item[\"fields\"][\"fulltext\"] = fulltext\n\n        return 
item\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\"--progress\", default=False, action=\"store_true\", help=\"Print progress output\")\n    parser.add_argument(\n        \"--execution-strategy\",\n        default=\"parallel\",\n        type=str,\n        help=\"The plan executor to use. One of sequential, pipelined, parallel\",\n    )\n    parser.add_argument(\n        \"--sentinel-execution-strategy\",\n        default=\"mab\",\n        type=str,\n        help=\"The sentinel execution strategy to use. One of mab or random\",\n    )\n    parser.add_argument(\n        \"--optimizer-strategy\",\n        default=\"pareto\",\n        type=str,\n        help=\"The optimizer to use. One of pareto or greedy\",\n    )\n    parser.add_argument(\n        \"--val-examples\",\n        default=30,\n        type=int,\n        help=\"Number of validation examples to sample from\",\n    )\n    parser.add_argument(\n        \"--seed\",\n        default=42,\n        type=int,\n        help=\"Seed used to initialize RNG for MAB sampling algorithm\",\n    )\n    parser.add_argument(\n        \"--k\",\n        default=10,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--j\",\n        default=3,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--sample-budget\",\n        default=100,\n        type=int,\n        help=\"Total sample budget in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--cost\",\n        default=1.0,\n        type=float,\n        help=\"The cost budget for the optimization\",\n    )\n    
parser.add_argument(\n        \"--exp-name\",\n        default=None,\n        type=str,\n        help=\"The experiment name.\",\n    )\n    parser.add_argument(\n        \"--priors-file\",\n        default=None,\n        type=str,\n        help=\"A file with a dictionary mapping physical operator ids to prior belief on their performance\",\n    )\n\n    args = parser.parse_args()\n\n    # create directory for profiling data\n    os.makedirs(\"max-quality-at-cost-data\", exist_ok=True)\n\n    verbose = args.verbose\n    progress = args.progress\n    seed = args.seed\n    val_examples = args.val_examples\n    k = args.k\n    j = args.j\n    sample_budget = args.sample_budget\n    execution_strategy = args.execution_strategy\n    sentinel_execution_strategy = args.sentinel_execution_strategy\n    optimizer_strategy = args.optimizer_strategy\n    cost = args.cost\n    exp_name = (\n        f\"biodex-strategy-{optimizer_strategy}-k{k}-j{j}-budget{sample_budget}-seed{seed}\"\n        if args.exp_name is None\n        else args.exp_name\n    )\n    priors = None\n    if args.priors_file is not None:\n        with open(args.priors_file) as f:\n            priors = json.load(f)\n    print(f\"EXPERIMENT NAME: {exp_name}\")\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # create validator\n    validator = BiodexValidator(\n        rp_at_k=5,\n        num_samples=val_examples,\n        shuffle=True,\n        seed=seed,\n    )\n\n    # create validation data source\n    train_dataset = BiodexDataset(\n        split=\"train\",\n        num_samples=val_examples,\n        shuffle=True,\n        seed=seed,\n    )\n    train_dataset = {train_dataset.id: train_dataset}\n\n    # load index [text-embedding-3-small]\n    chroma_client = chromadb.PersistentClient(\".chroma-biodex\")\n    openai_ef 
= OpenAIEmbeddingFunction(\n        api_key=os.environ[\"OPENAI_API_KEY\"],\n        model_name=\"text-embedding-3-small\",\n    )\n    index = chroma_client.get_collection(\"biodex-reaction-terms\", embedding_function=openai_ef)\n\n    def search_func(index: chromadb.Collection, query: list[list[float]], k: int) -> list[str]:\n        # execute query with embeddings\n        results = index.query(query, n_results=5)\n\n        # get list of result terms with their cosine similarity scores\n        final_results = []\n        for query_docs, query_distances in zip(results[\"documents\"], results[\"distances\"]):\n            for doc, dist in zip(query_docs, query_distances):\n                cosine_similarity = 1 - dist\n                final_results.append({\"content\": doc, \"similarity\": cosine_similarity})\n\n        # sort the results by similarity score\n        sorted_results = sorted(final_results, key=lambda result: result[\"similarity\"], reverse=True)\n\n        # remove duplicates\n        sorted_results_set = set()\n        final_sorted_results = []\n        for result in sorted_results:\n            if result[\"content\"] not in sorted_results_set:\n                sorted_results_set.add(result[\"content\"])\n                final_sorted_results.append(result[\"content\"])\n\n        # return the top-k similar results and generation stats\n        return {\"reaction_labels\": final_sorted_results[:k]}\n\n    # construct plan\n    plan = BiodexDataset(split=\"test\", num_samples=250, shuffle=True, seed=seed)\n    plan = plan.sem_map(biodex_reactions_cols)\n    plan = plan.sem_topk(\n        index=index,\n        search_func=search_func,\n        search_attr=\"reactions\",\n        output_attrs=biodex_reaction_labels_cols,\n    )\n    plan = plan.sem_map(biodex_ranked_reactions_labels_cols, depends_on=[\"title\", \"abstract\", \"fulltext\", \"reaction_labels\"])\n\n    # set policy\n    policy = MaxQualityAtFixedCost(max_cost=cost) if cost < 999 else 
MaxQuality()\n\n    # execute pz plan\n    config = pz.QueryProcessorConfig(\n        policy=policy,\n        optimizer_strategy=optimizer_strategy,\n        sentinel_execution_strategy=sentinel_execution_strategy,\n        execution_strategy=execution_strategy,\n        use_final_op_quality=True,\n        max_workers=64,\n        verbose=verbose,\n        available_models=[\n            Model.GPT_4o,\n            Model.GPT_4o_MINI,\n            Model.LLAMA3_1_8B,\n            Model.LLAMA3_3_70B,\n            # Model.MIXTRAL, # NOTE: only available in tag `abacus-paper-experiments`\n            Model.DEEPSEEK_R1_DISTILL_QWEN_1_5B,\n        ],\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=progress,\n        k=k,\n        j=j,\n        sample_budget=sample_budget,\n        seed=seed,\n        exp_name=exp_name,\n        priors=priors,\n    )\n\n    data_record_collection = plan.optimize_and_run(config=config, train_dataset=train_dataset, validator=validator)\n\n    print(data_record_collection.to_df())\n    data_record_collection.to_df().to_csv(f\"max-quality-at-cost-data/{exp_name}-output.csv\", index=False)\n\n    # create filepaths for records and stats\n    records_path = f\"max-quality-at-cost-data/{exp_name}-records.json\"\n    stats_path = f\"max-quality-at-cost-data/{exp_name}-profiling.json\"\n\n    # save record outputs\n    record_jsons = []\n    for record in data_record_collection:\n        record_dict = record.to_dict()\n        record_dict = {\n            k: v\n            for k, v in record_dict.items()\n            if k in [\"pmid\", \"reactions\", \"reaction_labels\", \"ranked_reaction_labels\"]\n        }\n        record_jsons.append(record_dict)\n\n    with open(records_path, \"w\") as f:\n        json.dump(record_jsons, f)\n\n    # save statistics\n    execution_stats_dict = data_record_collection.execution_stats.to_json()\n    with 
open(stats_path, \"w\") as f:\n        json.dump(execution_stats_dict, f)\n\n    # score output\n    test_dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=\"test\").to_pandas()\n    test_dataset = test_dataset.sample(n=250, random_state=seed).to_dict(orient=\"records\")\n\n    # construct mapping from pmid --> label (field, value) pairs\n    def compute_target_record(entry):\n        reactions_lst = [\n            reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n            for reaction in entry[\"reactions\"].split(\",\")\n        ]\n        label_dict = {\"ranked_reaction_labels\": reactions_lst}\n        return label_dict\n\n    label_fields_to_values = {\n        entry[\"pmid\"]: compute_target_record(entry) for entry in test_dataset\n    }\n\n    def rank_precision_at_k(preds: list, targets: list, k: int):\n        if preds is None:\n            return 0.0\n\n        # lower-case each list\n        preds = [pred.lower().replace(\"'\", \"\").replace(\"^\", \"\") for pred in preds]\n        targets = set([target.lower().replace(\"'\", \"\").replace(\"^\", \"\") for target in targets])\n\n        # compute rank-precision at k\n        rn = len(targets)\n        denom = min(k, rn)\n        total = 0.0\n        for i in range(k):\n            total += preds[i] in targets if i < len(preds) else 0.0\n\n        return total / denom\n\n    def compute_avg_rp_at_k(records, k=5):\n        total_rp_at_k, bad = 0, 0\n        for record in records:\n            pmid = record['pmid']\n            preds = record['ranked_reaction_labels']\n            targets = label_fields_to_values[pmid]['ranked_reaction_labels']\n            try:\n                total_rp_at_k += rank_precision_at_k(preds, targets, k)\n            except Exception:\n                print(f\"Error computing rank precision at k for record with pmid {pmid}\")\n                bad += 1\n\n        return total_rp_at_k / len(records), bad\n\n    rp_at_k, failed = 
compute_avg_rp_at_k(record_jsons, k=5)\n    final_plan_id = list(data_record_collection.execution_stats.plan_stats.keys())[0]\n    final_plan_str = data_record_collection.execution_stats.plan_strs[final_plan_id]\n    stats_dict = {\n        \"rp@5\": rp_at_k,\n        \"failed\": failed,\n        \"optimization_time\": data_record_collection.execution_stats.optimization_time,\n        \"optimization_cost\": data_record_collection.execution_stats.optimization_cost,\n        \"plan_execution_time\": data_record_collection.execution_stats.plan_execution_time,\n        \"plan_execution_cost\": data_record_collection.execution_stats.plan_execution_cost,\n        \"total_execution_time\": data_record_collection.execution_stats.total_execution_time,\n        \"total_execution_cost\": data_record_collection.execution_stats.total_execution_cost,\n        \"plan_str\": final_plan_str,\n    }\n    with open(f\"max-quality-at-cost-data/{exp_name}-metrics.json\", \"w\") as f:\n        json.dump(stats_dict, f)\n\n    print(f\"rp@k: {rp_at_k:.5f}\")\n    print(f\"failed: {failed}\")\n    print(f\"Optimization time: {data_record_collection.execution_stats.optimization_time}\")\n    print(f\"Optimization cost: {data_record_collection.execution_stats.optimization_cost}\")\n    print(f\"Plan Exec. time: {data_record_collection.execution_stats.plan_execution_time}\")\n    print(f\"Plan Exec. cost: {data_record_collection.execution_stats.plan_execution_cost}\")\n    print(f\"Total Execution time: {data_record_collection.execution_stats.total_execution_time}\")\n    print(f\"Total Execution Cost: {data_record_collection.execution_stats.total_execution_cost}\")\n"
  },
  {
    "path": "abacus-research/biodex-min-at-fixed-quality.py",
    "content": "import argparse\nimport json\nimport os\nimport time\n\nimport chromadb\nimport datasets\nfrom chromadb.utils.embedding_functions.openai_embedding_function import OpenAIEmbeddingFunction\n\n# from ragatouille import RAGPretrainedModel\nimport palimpzest as pz\nfrom palimpzest.constants import Model\n\nbiodex_entry_cols = [\n    {\"name\": \"pmid\", \"type\": str, \"desc\": \"The PubMed ID of the medical paper\"},\n    {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the medical paper\"},\n    {\"name\": \"abstract\", \"type\": str, \"desc\": \"The abstract of the medical paper\"},\n    {\"name\": \"fulltext\", \"type\": str, \"desc\": \"The full text of the medical paper, which contains information relevant for creating a drug safety report.\"},\n]\n\nbiodex_reactions_cols = [\n    {\"name\": \"reactions\", \"type\": list[str], \"desc\": \"The list of all medical conditions experienced by the patient as discussed in the report. Try to provide as many relevant medical conditions as possible.\"},\n]\n\nbiodex_reaction_labels_cols = [\n    {\"name\": \"reaction_labels\", \"type\": list[str], \"desc\": \"Official terms for medical conditions listed in `reactions`\"},\n]\n\nbiodex_ranked_reactions_labels_cols = [\n    {\"name\": \"ranked_reaction_labels\", \"type\": list[str], \"desc\": \"The ranked list of medical conditions experienced by the patient. The most relevant label occurs first in the list. 
Be sure to rank ALL of the inputs.\"},\n]\n\nclass BiodexValidator(pz.Validator):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__()\n\n        # read dataset and prepare entries\n        dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=\"train\").to_pandas()\n        if shuffle:\n            dataset = dataset.sample(n=num_samples, random_state=seed).to_dict(orient=\"records\")\n        else:\n            dataset = dataset.to_dict(orient=\"records\")[:num_samples]\n\n        # compute mapping from pmid --> label (i.e. reactions list)\n        self.pmid_to_label = self._compute_pmid_to_label(dataset)\n\n        # store rp_at_k for computing rank-precision at k metric\n        self.k = rp_at_k\n\n    def _compute_pmid_to_label(self, dataset: list[dict]) -> dict:\n        \"\"\"Compute the label for a BioDEX report given its entry in the dataset.\"\"\"\n        pmid_to_label = {}\n        for entry in dataset:\n            pmid = str(entry[\"pmid\"])\n            reactions_lst = [\n                reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n                for reaction in entry[\"reactions\"].split(\",\")\n            ]\n            pmid_to_label[pmid] = reactions_lst\n\n        return pmid_to_label\n\n    def rank_precision_at_k(self, preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # lower-case each list\n            preds = [pred.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for pred in preds]\n            targets = set([target.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for target in targets])\n\n            # compute rank-precision at k\n            rn = len(targets)\n            denom = min(self.k, rn)\n            total = 0.0\n            for i in range(self.k):\n                total += preds[i] 
in targets if i < len(preds) else 0.0\n\n            return total / denom\n\n        except Exception:\n            os.makedirs(\"rp@k-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"rp@k-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def term_recall(self, preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # normalize terms in each list\n            pred_terms = set([\n                term.strip()\n                for pred in preds\n                for term in pred.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n            target_terms = ([\n                term.strip()\n                for target in targets\n                for term in target.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n\n            # compute term recall and return\n            intersect = pred_terms.intersection(target_terms)\n            term_recall = len(intersect) / len(target_terms)\n\n            return term_recall\n\n        except Exception:\n            os.makedirs(\"term-recall-eval-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"term-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        field_name = fields[0]\n        if field_name == \"reactions\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.term_recall(preds, targets)\n        elif field_name == \"ranked_reaction_labels\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.rank_precision_at_k(preds, targets)\n        else:\n            
raise NotImplementedError(f\"Validator.map_score_fn not implemented for field {field_name}.\")\n\n    def topk_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        field_name = fields[0]\n        if field_name == \"reaction_labels\":\n            preds = output.get(field_name)\n            # keys of pmid_to_label are strings, so normalize the pmid as map_score_fn does\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.term_recall(preds, targets)\n        else:\n            raise NotImplementedError(f\"Validator.topk_score_fn not implemented for field {field_name}.\")\n\n\nclass BiodexDataset(pz.IterDataset):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        split: str = \"test\",\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__(id=\"biodex\", schema=biodex_entry_cols)\n\n        self.dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=split).to_pandas()\n        if shuffle:\n            self.dataset = self.dataset.sample(n=num_samples, random_state=seed).to_dict(orient=\"records\")\n        else:\n            self.dataset = self.dataset.to_dict(orient=\"records\")[:num_samples]\n\n        self.rp_at_k = rp_at_k\n        self.num_samples = num_samples\n        self.shuffle = shuffle\n        self.seed = seed\n        self.split = split\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        # get entry\n        entry = self.dataset[idx]\n\n        # get input fields\n        pmid = entry[\"pmid\"]\n        title = entry[\"title\"]\n        abstract = entry[\"abstract\"]\n        fulltext = entry[\"fulltext\"]\n\n        # create item with fields\n        item = {\"pmid\": pmid, \"title\": title, \"abstract\": abstract, \"fulltext\": fulltext}\n\n        return item\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run the BioDEX min-cost/min-latency at fixed quality experiment\")\n    
parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\"--progress\", default=False, action=\"store_true\", help=\"Print progress output\")\n    parser.add_argument(\n        \"--execution-strategy\",\n        default=\"parallel\",\n        type=str,\n        help=\"The plan executor to use. One of sequential, pipelined, parallel\",\n    )\n    parser.add_argument(\n        \"--sentinel-execution-strategy\",\n        default=\"mab\",\n        type=str,\n        help=\"The sentinel execution strategy to use. One of mab or random\",\n    )\n    parser.add_argument(\n        \"--optimizer-strategy\",\n        default=\"pareto\",\n        type=str,\n        help=\"The optimizer to use. One of pareto or greedy\",\n    )\n    parser.add_argument(\n        \"--val-examples\",\n        default=30,\n        type=int,\n        help=\"Number of validation examples to sample from\",\n    )\n    parser.add_argument(\n        \"--seed\",\n        default=42,\n        type=int,\n        help=\"Seed used to initialize RNG for MAB sampling algorithm\",\n    )\n    parser.add_argument(\n        \"--k\",\n        default=10,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--j\",\n        default=3,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--sample-budget\",\n        default=100,\n        type=int,\n        help=\"Total sample budget in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--metric\",\n        default=None,\n        type=str,\n        help=\"whether to minimize latency or cost\",\n    )\n    parser.add_argument(\n        \"--exp-name\",\n        default=None,\n        type=str,\n        help=\"The experiment name.\",\n    )\n    
parser.add_argument(\n        \"--priors-file\",\n        default=None,\n        type=str,\n        help=\"A file with a dictionary mapping physical operator ids to prior belief on their performance\",\n    )\n\n    args = parser.parse_args()\n\n    assert args.metric in [\"cost\", \"latency\"], \"metric must be one of cost or latency\"\n    metric = args.metric\n\n    # create directory for profiling data\n    os.makedirs(f\"min-{metric}-at-quality-data\", exist_ok=True)\n\n    verbose = args.verbose\n    progress = args.progress\n    seed = args.seed\n    val_examples = args.val_examples\n    k = args.k\n    j = args.j\n    sample_budget = args.sample_budget\n    execution_strategy = args.execution_strategy\n    sentinel_execution_strategy = args.sentinel_execution_strategy\n    optimizer_strategy = args.optimizer_strategy\n    exp_name = (\n        f\"biodex-min-{metric}-strategy-{optimizer_strategy}-k{k}-j{j}-budget{sample_budget}-seed{seed}\"\n        if args.exp_name is None\n        else args.exp_name\n    )\n    priors = None\n    if args.priors_file is not None:\n        with open(args.priors_file) as f:\n            priors = json.load(f)\n    print(f\"EXPERIMENT NAME: {exp_name}\")\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # create validator\n    validator = BiodexValidator(\n        rp_at_k=5,\n        num_samples=val_examples,\n        shuffle=True,\n        seed=seed,\n    )\n\n    # create validation data source\n    train_dataset = BiodexDataset(\n        split=\"train\",\n        num_samples=val_examples,\n        shuffle=True,\n        seed=seed,\n    )\n    train_dataset = {train_dataset.id: train_dataset}\n\n    # load index [text-embedding-3-small]\n    chroma_client = chromadb.PersistentClient(\".chroma-biodex\")\n    openai_ef = 
OpenAIEmbeddingFunction(\n        api_key=os.environ[\"OPENAI_API_KEY\"],\n        model_name=\"text-embedding-3-small\",\n    )\n    index = chroma_client.get_collection(\"biodex-reaction-terms\", embedding_function=openai_ef)\n\n    def search_func(index: chromadb.Collection, query: list[list[float]], k: int) -> dict[str, list[str]]:\n        # execute query with embeddings\n        results = index.query(query, n_results=5)\n\n        # get list of result terms with their cosine similarity scores\n        final_results = []\n        for query_docs, query_distances in zip(results[\"documents\"], results[\"distances\"]):\n            for doc, dist in zip(query_docs, query_distances):\n                cosine_similarity = 1 - dist\n                final_results.append({\"content\": doc, \"similarity\": cosine_similarity})\n\n        # sort the results by similarity score\n        sorted_results = sorted(final_results, key=lambda result: result[\"similarity\"], reverse=True)\n\n        # remove duplicates\n        sorted_results_set = set()\n        final_sorted_results = []\n        for result in sorted_results:\n            if result[\"content\"] not in sorted_results_set:\n                sorted_results_set.add(result[\"content\"])\n                final_sorted_results.append(result[\"content\"])\n\n        # return the top-k similar results, keyed by the output attribute name\n        return {\"reaction_labels\": final_sorted_results[:k]}\n\n    # construct plan\n    plan = BiodexDataset(split=\"test\", num_samples=250, shuffle=True, seed=seed)\n    plan = plan.sem_map(biodex_reactions_cols)\n    plan = plan.sem_topk(\n        index=index,\n        search_func=search_func,\n        search_attr=\"reactions\",\n        output_attrs=biodex_reaction_labels_cols,\n    )\n    plan = plan.sem_map(biodex_ranked_reactions_labels_cols, depends_on=[\"title\", \"abstract\", \"fulltext\", \"reaction_labels\"])\n\n    # set policy\n    policy = pz.MinCostAtFixedQuality(min_quality=0.216) if metric == 
\"cost\" else pz.MinTimeAtFixedQuality(min_quality=0.216)\n\n    # execute pz plan\n    config = pz.QueryProcessorConfig(\n        policy=policy,\n        optimizer_strategy=optimizer_strategy,\n        sentinel_execution_strategy=sentinel_execution_strategy,\n        execution_strategy=execution_strategy,\n        use_final_op_quality=True,\n        max_workers=64,\n        verbose=verbose,\n        available_models=[\n            Model.GPT_4o_MINI,\n        ],\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=progress,\n        k=k,\n        j=j,\n        sample_budget=sample_budget,\n        seed=seed,\n        exp_name=exp_name,\n        priors=priors,\n    )\n\n    data_record_collection = plan.optimize_and_run(config=config, train_dataset=train_dataset, validator=validator)\n\n    print(data_record_collection.to_df())\n    data_record_collection.to_df().to_csv(f\"min-{metric}-at-quality-data/{exp_name}-output.csv\", index=False)\n\n    # create filepaths for records and stats\n    records_path = f\"min-{metric}-at-quality-data/{exp_name}-records.json\"\n    stats_path = f\"min-{metric}-at-quality-data/{exp_name}-profiling.json\"\n\n    # save record outputs\n    record_jsons = []\n    for record in data_record_collection:\n        record_dict = record.to_dict()\n        record_dict = {\n            k: v\n            for k, v in record_dict.items()\n            if k in [\"pmid\", \"reactions\", \"reaction_labels\", \"ranked_reaction_labels\"]\n        }\n        record_jsons.append(record_dict)\n\n    with open(records_path, \"w\") as f:\n        json.dump(record_jsons, f)\n\n    # save statistics\n    execution_stats_dict = data_record_collection.execution_stats.to_json()\n    with open(stats_path, \"w\") as f:\n        json.dump(execution_stats_dict, f)\n\n    # score output\n    test_dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", 
split=\"test\").to_pandas()\n    test_dataset = test_dataset.sample(n=250, random_state=seed).to_dict(orient=\"records\")\n\n    # construct mapping from pmid --> label (field, value) pairs\n    def compute_target_record(entry):\n        reactions_lst = [\n            reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n            for reaction in entry[\"reactions\"].split(\",\")\n        ]\n        label_dict = {\"ranked_reaction_labels\": reactions_lst}\n        return label_dict\n\n    label_fields_to_values = {\n        entry[\"pmid\"]: compute_target_record(entry) for entry in test_dataset\n    }\n\n    def rank_precision_at_k(preds: list, targets: list, k: int):\n        if preds is None:\n            return 0.0\n\n        # lower-case each list\n        preds = [pred.lower().replace(\"'\", \"\").replace(\"^\", \"\") for pred in preds]\n        targets = set([target.lower().replace(\"'\", \"\").replace(\"^\", \"\") for target in targets])\n\n        # compute rank-precision at k\n        rn = len(targets)\n        denom = min(k, rn)\n        total = 0.0\n        for i in range(k):\n            total += preds[i] in targets if i < len(preds) else 0.0\n\n        return total / denom\n\n    def compute_avg_rp_at_k(records, k=5):\n        total_rp_at_k, bad = 0, 0\n        for record in records:\n            pmid = record['pmid']\n            preds = record['ranked_reaction_labels']\n            targets = label_fields_to_values[pmid]['ranked_reaction_labels']\n            try:\n                total_rp_at_k += rank_precision_at_k(preds, targets, k)\n            except Exception:\n                print(f\"Error computing rank precision at k for record with pmid {pmid}\")\n                bad += 1\n\n        return total_rp_at_k / len(records), bad\n\n    rp_at_k, failed = compute_avg_rp_at_k(record_jsons, k=5)\n    final_plan_id = list(data_record_collection.execution_stats.plan_stats.keys())[0]\n    final_plan_str = 
data_record_collection.execution_stats.plan_strs[final_plan_id]\n    stats_dict = {\n        \"rp@5\": rp_at_k,\n        \"failed\": failed,\n        \"optimization_time\": data_record_collection.execution_stats.optimization_time,\n        \"optimization_cost\": data_record_collection.execution_stats.optimization_cost,\n        \"plan_execution_time\": data_record_collection.execution_stats.plan_execution_time,\n        \"plan_execution_cost\": data_record_collection.execution_stats.plan_execution_cost,\n        \"total_execution_time\": data_record_collection.execution_stats.total_execution_time,\n        \"total_execution_cost\": data_record_collection.execution_stats.total_execution_cost,\n        \"plan_str\": final_plan_str,\n    }\n    with open(f\"min-{metric}-at-quality-data/{exp_name}-metrics.json\", \"w\") as f:\n        json.dump(stats_dict, f)\n\n    print(f\"rp@k: {rp_at_k:.5f}\")\n    print(f\"failed: {failed}\")\n    print(f\"Optimization time: {data_record_collection.execution_stats.optimization_time}\")\n    print(f\"Optimization cost: {data_record_collection.execution_stats.optimization_cost}\")\n    print(f\"Plan Exec. time: {data_record_collection.execution_stats.plan_execution_time}\")\n    print(f\"Plan Exec. cost: {data_record_collection.execution_stats.plan_execution_cost}\")\n    print(f\"Total Execution time: {data_record_collection.execution_stats.total_execution_time}\")\n    print(f\"Total Execution Cost: {data_record_collection.execution_stats.total_execution_cost}\")\n"
  },
  {
    "path": "abacus-research/biodex-pareto-cascades.py",
    "content": "import argparse\nimport json\nimport os\nimport time\n\nimport chromadb\nimport datasets\nfrom chromadb.utils.embedding_functions.openai_embedding_function import OpenAIEmbeddingFunction\n\nimport palimpzest as pz\nfrom palimpzest.constants import Model\nfrom palimpzest.policy import MaxQualityAtFixedCost\n\nbiodex_entry_cols = [\n    {\"name\": \"pmid\", \"type\": str, \"desc\": \"The PubMed ID of the medical paper\"},\n    {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the medical paper\"},\n    {\"name\": \"abstract\", \"type\": str, \"desc\": \"The abstract of the medical paper\"},\n    {\"name\": \"fulltext\", \"type\": str, \"desc\": \"The full text of the medical paper, which contains information relevant for creating a drug safety report.\"},\n]\n\nbiodex_reactions_cols = [\n    {\"name\": \"reactions\", \"type\": list[str], \"desc\": \"The list of all medical conditions experienced by the patient as discussed in the report. Try to provide as many relevant medical conditions as possible.\"},\n]\n\nbiodex_reaction_labels_cols = [\n    {\"name\": \"reaction_labels\", \"type\": list[str], \"desc\": \"Official terms for medical conditions listed in `reactions`\"},\n]\n\nbiodex_ranked_reactions_labels_cols = [\n    {\"name\": \"ranked_reaction_labels\", \"type\": list[str], \"desc\": \"The ranked list of medical conditions experienced by the patient. The most relevant label occurs first in the list. 
Be sure to rank ALL of the inputs.\"},\n]\n\nclass BiodexValidator(pz.Validator):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__()\n\n        # read dataset and prepare entries\n        dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=\"train\").to_pandas()\n        if shuffle:\n            dataset = dataset.sample(n=num_samples, random_state=seed).to_dict(orient=\"records\")\n        else:\n            dataset = dataset.to_dict(orient=\"records\")[:num_samples]\n\n        # compute mapping from pmid --> label (i.e. reactions list)\n        self.pmid_to_label = self._compute_pmid_to_label(dataset)\n\n        # store rp_at_k for computing rank-precision at k metric\n        self.k = rp_at_k\n\n    def _compute_pmid_to_label(self, dataset: list[dict]) -> dict:\n        \"\"\"Compute the label for a BioDEX report given its entry in the dataset.\"\"\"\n        pmid_to_label = {}\n        for entry in dataset:\n            pmid = str(entry[\"pmid\"])\n            reactions_lst = [\n                reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n                for reaction in entry[\"reactions\"].split(\",\")\n            ]\n            pmid_to_label[pmid] = reactions_lst\n\n        return pmid_to_label\n\n    def rank_precision_at_k(self, preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # lower-case each list\n            preds = [pred.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for pred in preds]\n            targets = set([target.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for target in targets])\n\n            # compute rank-precision at k\n            rn = len(targets)\n            denom = min(self.k, rn)\n            total = 0.0\n            for i in range(self.k):\n                total += preds[i] 
in targets if i < len(preds) else 0.0\n\n            return total / denom\n\n        except Exception:\n            os.makedirs(\"rp@k-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"rp@k-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def term_recall(self, preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # normalize terms in each list\n            pred_terms = set([\n                term.strip()\n                for pred in preds\n                for term in pred.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n            target_terms = ([\n                term.strip()\n                for target in targets\n                for term in target.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n\n            # compute term recall and return\n            intersect = pred_terms.intersection(target_terms)\n            term_recall = len(intersect) / len(target_terms)\n\n            return term_recall\n\n        except Exception:\n            os.makedirs(\"term-recall-eval-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"term-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        field_name = fields[0]\n        if field_name == \"reactions\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.term_recall(preds, targets)\n        elif field_name == \"ranked_reaction_labels\":\n            preds = output.get(field_name)\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.rank_precision_at_k(preds, targets)\n        else:\n            
raise NotImplementedError(f\"Validator.map_score_fn not implemented for field {field_name}.\")\n\n    def topk_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        field_name = fields[0]\n        if field_name == \"reaction_labels\":\n            preds = output.get(field_name)\n            # keys of pmid_to_label are strings, so normalize the pmid as map_score_fn does\n            targets = self.pmid_to_label[str(input_record[\"pmid\"])]\n            return self.term_recall(preds, targets)\n        else:\n            raise NotImplementedError(f\"Validator.topk_score_fn not implemented for field {field_name}.\")\n\n\nclass BiodexDataset(pz.IterDataset):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        split: str = \"test\",\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__(id=\"biodex\", schema=biodex_entry_cols)\n\n        self.dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=split).to_pandas()\n        if shuffle:\n            self.dataset = self.dataset.sample(n=num_samples, random_state=seed).to_dict(orient=\"records\")\n        else:\n            self.dataset = self.dataset.to_dict(orient=\"records\")[:num_samples]\n\n        self.rp_at_k = rp_at_k\n        self.num_samples = num_samples\n        self.shuffle = shuffle\n        self.seed = seed\n        self.split = split\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        # get entry\n        entry = self.dataset[idx]\n\n        # get input fields\n        pmid = entry[\"pmid\"]\n        title = entry[\"title\"]\n        abstract = entry[\"abstract\"]\n        fulltext = entry[\"fulltext\"]\n\n        # create item with fields\n        item = {\"fields\": {}, \"labels\": {}, \"score_fn\": {}}\n        item[\"fields\"][\"pmid\"] = pmid\n        item[\"fields\"][\"title\"] = title\n        item[\"fields\"][\"abstract\"] = abstract\n        item[\"fields\"][\"fulltext\"] = fulltext\n\n        return 
item\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\"--progress\", default=False, action=\"store_true\", help=\"Print progress output\")\n    parser.add_argument(\"--constrained\", default=False, action=\"store_true\", help=\"Use constrained objective\")\n    parser.add_argument(\n        \"--execution-strategy\",\n        default=\"parallel\",\n        type=str,\n        help=\"The plan executor to use. One of sequential, pipelined, parallel\",\n    )\n    parser.add_argument(\n        \"--sentinel-execution-strategy\",\n        default=\"mab\",\n        type=str,\n        help=\"The sentinel execution strategy to use. One of mab or random\",\n    )\n    parser.add_argument(\n        \"--optimizer-strategy\",\n        default=\"pareto\",\n        type=str,\n        help=\"The optimizer to use. 
One of pareto or greedy\",\n    )\n    parser.add_argument(\n        \"--val-examples\",\n        default=30,\n        type=int,\n        help=\"Number of validation examples to sample from\",\n    )\n    parser.add_argument(\n        \"--seed\",\n        default=42,\n        type=int,\n        help=\"Seed used to initialize RNG for MAB sampling algorithm\",\n    )\n    parser.add_argument(\n        \"--k\",\n        default=10,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--j\",\n        default=3,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--sample-budget\",\n        default=100,\n        type=int,\n        help=\"Total sample budget in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--cost\",\n        default=1.0,\n        type=float,\n        help=\"The cost budget for the optimization\",\n    )\n    parser.add_argument(\n        \"--exp-name\",\n        default=None,\n        type=str,\n        help=\"The experiment name.\",\n    )\n    parser.add_argument(\n        \"--priors-file\",\n        default=None,\n        type=str,\n        help=\"A file with a dictionary mapping physical operator ids to prior belief on their performance\",\n    )\n\n    args = parser.parse_args()\n\n    # create directory for profiling data\n    os.makedirs(\"pareto-cascades-data\", exist_ok=True)\n\n    verbose = args.verbose\n    progress = args.progress\n    seed = args.seed\n    val_examples = args.val_examples\n    k = args.k\n    j = args.j\n    sample_budget = args.sample_budget\n    execution_strategy = args.execution_strategy\n    sentinel_execution_strategy = args.sentinel_execution_strategy\n    optimizer_strategy = args.optimizer_strategy\n    cost = args.cost\n    exp_name = (\n        
f\"biodex-strategy-{optimizer_strategy}-k{k}-j{j}-budget{sample_budget}-seed{seed}\"\n        if args.exp_name is None\n        else args.exp_name\n    )\n    priors = None\n    if args.priors_file is not None:\n        with open(args.priors_file) as f:\n            priors = json.load(f)\n    print(f\"EXPERIMENT NAME: {exp_name}\")\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # create validator\n    validator = BiodexValidator(\n        rp_at_k=5,\n        num_samples=val_examples,\n        shuffle=True,\n        seed=seed,\n    )\n\n    # create validation data source\n    train_dataset = BiodexDataset(\n        split=\"train\",\n        num_samples=val_examples,\n        shuffle=True,\n        seed=seed,\n    )\n    train_dataset = {train_dataset.id: train_dataset}\n\n    # load index [text-embedding-3-small]\n    chroma_client = chromadb.PersistentClient(\".chroma-biodex\")\n    openai_ef = OpenAIEmbeddingFunction(\n        api_key=os.environ[\"OPENAI_API_KEY\"],\n        model_name=\"text-embedding-3-small\",\n    )\n    index = chroma_client.get_collection(\"biodex-reaction-terms\", embedding_function=openai_ef)\n\n    def search_func(index: chromadb.Collection, query: list[list[float]], k: int) -> dict[str, list[str]]:\n        # execute query with embeddings\n        results = index.query(query, n_results=5)\n\n        # get list of result terms with their cosine similarity scores\n        final_results = []\n        for query_docs, query_distances in zip(results[\"documents\"], results[\"distances\"]):\n            for doc, dist in zip(query_docs, query_distances):\n                cosine_similarity = 1 - dist\n                final_results.append({\"content\": doc, \"similarity\": cosine_similarity})\n\n        # sort the results by similarity score\n        sorted_results = 
sorted(final_results, key=lambda result: result[\"similarity\"], reverse=True)\n\n        # remove duplicates\n        sorted_results_set = set()\n        final_sorted_results = []\n        for result in sorted_results:\n            if result[\"content\"] not in sorted_results_set:\n                sorted_results_set.add(result[\"content\"])\n                final_sorted_results.append(result[\"content\"])\n\n        # return the top-k most similar unique results\n        return {\"reaction_labels\": final_sorted_results[:k]}\n\n    # construct plan\n    plan = BiodexDataset(split=\"test\", num_samples=250, shuffle=True, seed=seed)\n    plan = plan.sem_map(biodex_reactions_cols)\n    plan = plan.sem_topk(\n        index=index,\n        search_func=search_func,\n        search_attr=\"reactions\",\n        output_attrs=biodex_reaction_labels_cols,\n    )\n    plan = plan.sem_map(biodex_ranked_reactions_labels_cols, depends_on=[\"title\", \"abstract\", \"fulltext\", \"reaction_labels\"])\n\n    # execute pz plan\n    config = pz.QueryProcessorConfig(\n        policy=MaxQualityAtFixedCost(max_cost=cost),\n        optimizer_strategy=optimizer_strategy,\n        sentinel_execution_strategy=sentinel_execution_strategy,\n        execution_strategy=execution_strategy,\n        use_final_op_quality=True,\n        max_workers=64,\n        verbose=verbose,\n        available_models=[\n            Model.GPT_4o_MINI,\n            Model.LLAMA3_2_3B,\n            Model.LLAMA3_1_8B,\n            Model.LLAMA3_3_70B,\n            # Model.MIXTRAL, # NOTE: only available in tag `abacus-paper-experiments`\n            Model.DEEPSEEK_R1_DISTILL_QWEN_1_5B,\n        ],\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=progress,\n        k=k,\n        j=j,\n        sample_budget=sample_budget,\n        seed=seed,\n        exp_name=exp_name,\n        priors=priors,\n    )\n\n    
data_record_collection = plan.optimize_and_run(config=config, train_dataset=train_dataset, validator=validator)\n\n    print(data_record_collection.to_df())\n    data_record_collection.to_df().to_csv(f\"pareto-cascades-data/{exp_name}-output.csv\", index=False)\n\n    # create filepaths for records and stats\n    records_path = f\"pareto-cascades-data/{exp_name}-records.json\"\n    stats_path = f\"pareto-cascades-data/{exp_name}-profiling.json\"\n\n    # save record outputs\n    record_jsons = []\n    for record in data_record_collection:\n        record_dict = record.to_dict()\n        record_dict = {\n            k: v\n            for k, v in record_dict.items()\n            if k in [\"pmid\", \"reactions\", \"reaction_labels\", \"ranked_reaction_labels\"]\n        }\n        record_jsons.append(record_dict)\n\n    with open(records_path, \"w\") as f:\n        json.dump(record_jsons, f)\n\n    # save statistics\n    execution_stats_dict = data_record_collection.execution_stats.to_json()\n    with open(stats_path, \"w\") as f:\n        json.dump(execution_stats_dict, f)\n\n    # score output\n    test_dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=\"test\").to_pandas()\n    test_dataset = test_dataset.sample(n=250, random_state=seed).to_dict(orient=\"records\")\n\n    # construct mapping from pmid --> label (field, value) pairs\n    def compute_target_record(entry):\n        reactions_lst = [\n            reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n            for reaction in entry[\"reactions\"].split(\",\")\n        ]\n        label_dict = {\"ranked_reaction_labels\": reactions_lst}\n        return label_dict\n\n    label_fields_to_values = {\n        entry[\"pmid\"]: compute_target_record(entry) for entry in test_dataset\n    }\n\n    def rank_precision_at_k(preds: list, targets: list, k: int):\n        if preds is None:\n            return 0.0\n\n        # lower-case each list\n        preds = 
[pred.lower().replace(\"'\", \"\").replace(\"^\", \"\") for pred in preds]\n        targets = set([target.lower().replace(\"'\", \"\").replace(\"^\", \"\") for target in targets])\n\n        # compute rank-precision at k\n        rn = len(targets)\n        denom = min(k, rn)\n        total = 0.0\n        for i in range(k):\n            total += preds[i] in targets if i < len(preds) else 0.0\n\n        return total / denom\n\n    def compute_avg_rp_at_k(records, k=5):\n        total_rp_at_k, bad = 0, 0\n        for record in records:\n            pmid = record['pmid']\n            preds = record['ranked_reaction_labels']\n            targets = label_fields_to_values[pmid]['ranked_reaction_labels']\n            try:\n                total_rp_at_k += rank_precision_at_k(preds, targets, k)\n            except Exception:\n                print(f\"Error computing rank precision at k for record with pmid {pmid}\")\n                bad += 1\n\n        return total_rp_at_k / len(records), bad\n\n    rp_at_k, failed = compute_avg_rp_at_k(record_jsons, k=5)\n    final_plan_id = list(data_record_collection.execution_stats.plan_stats.keys())[0]\n    final_plan_str = data_record_collection.execution_stats.plan_strs[final_plan_id]\n    stats_dict = {\n        \"rp@5\": rp_at_k,\n        \"failed\": failed,\n        \"optimization_time\": data_record_collection.execution_stats.optimization_time,\n        \"optimization_cost\": data_record_collection.execution_stats.optimization_cost,\n        \"plan_execution_time\": data_record_collection.execution_stats.plan_execution_time,\n        \"plan_execution_cost\": data_record_collection.execution_stats.plan_execution_cost,\n        \"total_execution_time\": data_record_collection.execution_stats.total_execution_time,\n        \"total_execution_cost\": data_record_collection.execution_stats.total_execution_cost,\n        \"plan_str\": final_plan_str,\n    }\n    with open(f\"pareto-cascades-data/{exp_name}-metrics.json\", \"w\") as f:\n 
       json.dump(stats_dict, f)\n\n    print(f\"rp@k: {rp_at_k:.5f}\")\n    print(f\"failed: {failed}\")\n    print(f\"Optimization time: {data_record_collection.execution_stats.optimization_time}\")\n    print(f\"Optimization cost: {data_record_collection.execution_stats.optimization_cost}\")\n    print(f\"Plan Exec. time: {data_record_collection.execution_stats.plan_execution_time}\")\n    print(f\"Plan Exec. cost: {data_record_collection.execution_stats.plan_execution_cost}\")\n    print(f\"Total Execution time: {data_record_collection.execution_stats.total_execution_time}\")\n    print(f\"Total Execution Cost: {data_record_collection.execution_stats.total_execution_cost}\")\n"
  },
  {
    "path": "abacus-research/biodex-priors-cascades.json",
    "content": "{\"0005c18b69\": {\"quality\": 0.19444444444444442, \"cost\": 0.0038703809999999996, \"time\": 61.16110005378724}, \"009df798a3\": {\"quality\": 0.09252136752136753, \"cost\": 0.003526569, \"time\": 74.32940173149109}, \"00c93aec22\": {\"quality\": 0.10641025641025642, \"cost\": 0.010484405999999998, \"time\": 72.25670802593231}, \"00e1fecc4c\": {\"quality\": 0.21752136752136753, \"cost\": 0.006413474000000001, \"time\": 63.18833725452423}, \"00f4acd0d3\": {\"quality\": 0.0, \"cost\": 0.005918445, \"time\": 48.42981550693512}, \"01413aa72d\": {\"quality\": 0.16538461538461538, \"cost\": 0.008015024, \"time\": 69.29275906085968}, \"01c2f973ad\": {\"quality\": 0.1626068376068376, \"cost\": 0.002413815, \"time\": 42.94722969532013}, \"02078988c1\": {\"quality\": 0.0, \"cost\": 0.006904731999999999, \"time\": 27.019422554969786}, \"021604dec1\": {\"quality\": 0.3175213675213675, \"cost\": 0.008399656999999998, \"time\": 67.33289885520935}, \"02410c662e\": {\"quality\": 0.21752136752136753, \"cost\": 0.006360101999999999, \"time\": 77.91296794414521}, \"0262668df7\": {\"quality\": 0.21752136752136753, \"cost\": 0.0075904909999999996, \"time\": 59.82583842277527}, \"0267c97b70\": {\"quality\": 0.0, \"cost\": 0.003249014999999999, \"time\": 50.19215798377991}, \"02ae38e4aa\": {\"quality\": 0.1876068376068376, \"cost\": 0.004417125, \"time\": 69.9335319519043}, \"02f49fe0fd\": {\"quality\": 0.14807692307692308, \"cost\": 0.00044366999999999996, \"time\": 35.21025230884552}, \"030756558c\": {\"quality\": 0.04444444444444444, \"cost\": 0.002745915, \"time\": 53.435633063316345}, \"033ca325e6\": {\"quality\": 0.20277777777777778, \"cost\": 0.002408562, \"time\": 27.521768379211426}, \"038a5f0a62\": {\"quality\": 0.1876068376068376, \"cost\": 0.006526305, \"time\": 52.622441697120664}, \"041b5af43d\": {\"quality\": 0.21752136752136753, \"cost\": 0.010993874, \"time\": 64.76753988265992}, \"042d933706\": {\"quality\": 0.19444444444444442, \"cost\": 0.008023619, 
\"time\": 57.533840489387515}, \"04397effa0\": {\"quality\": 0.2064102564102564, \"cost\": 0.007009628999999999, \"time\": 76.10882368087769}, \"0539e0b42d\": {\"quality\": 0.25085470085470085, \"cost\": 0.007662647000000001, \"time\": 80.42491667270662}, \"0554568b86\": {\"quality\": 0.14038461538461539, \"cost\": 0.008002722, \"time\": 58.663866949081424}, \"06493715cc\": {\"quality\": 0.02222222222222222, \"cost\": 0.005358681, \"time\": 78.51178731918336}, \"067ee6e91b\": {\"quality\": 0.2098290598290598, \"cost\": 0.0072115949999999995, \"time\": 56.89639482498169}, \"068b66f00d\": {\"quality\": 0.1, \"cost\": 0.006528483999999999, \"time\": 68.53199887275696}, \"0695f9b5fc\": {\"quality\": 0.06944444444444445, \"cost\": 0.007583937, \"time\": 73.26675381660462}, \"073ed5b301\": {\"quality\": 0.07371794871794872, \"cost\": 0.005419403999999999, \"time\": 60.23991825580597}, \"079feb14a8\": {\"quality\": 0.2098290598290598, \"cost\": 0.008877133, \"time\": 62.487732887268066}, \"07a3a7daf7\": {\"quality\": 0.0, \"cost\": 0.006907794, \"time\": 43.434891891479495}, \"08127cd6dd\": {\"quality\": 0.1987179487179487, \"cost\": 0.012973673000000002, \"time\": 55.40713529586792}, \"0833133620\": {\"quality\": 0.1987179487179487, \"cost\": 0.005889005999999999, \"time\": 65.5436268568039}, \"089565077c\": {\"quality\": 0.025, \"cost\": 0.000339832, \"time\": 65.31151757240295}, \"08bf8cc191\": {\"quality\": 0.10555555555555557, \"cost\": 0.011634817000000002, \"time\": 95.24387204647064}, \"08e1802287\": {\"quality\": 0.23974358974358972, \"cost\": 0.0033798820000000003, \"time\": 25.597942876815797}, \"090cd3ef31\": {\"quality\": 0.23974358974358972, \"cost\": 0.0031741180000000005, \"time\": 25.463910150527955}, \"0944a921e8\": {\"quality\": 0.0702991452991453, \"cost\": 0.00792617, \"time\": 85.74197387695312}, \"0947216ece\": {\"quality\": 0.1987179487179487, \"cost\": 0.015770336, \"time\": 82.8155886888504}, \"096d51f670\": {\"quality\": 0.18482905982905984, 
\"cost\": 0.009268239, \"time\": 104.61172659397124}, \"09791c731b\": {\"quality\": 0.0438034188034188, \"cost\": 0.006413886, \"time\": 85.14133274555206}, \"0990c0d4f8\": {\"quality\": 0.2098290598290598, \"cost\": 0.007531591, \"time\": 77.474085521698}, \"0a128688c1\": {\"quality\": 0.18333333333333332, \"cost\": 0.014743780000000001, \"time\": 100.52884330749512}, \"0a4c1bbb4a\": {\"quality\": 0.16538461538461538, \"cost\": 0.005475987, \"time\": 77.0306186914444}, \"0ac969dde3\": {\"quality\": 0.19444444444444442, \"cost\": 0.009684573, \"time\": 102.05468263626099}, \"0af1efab0e\": {\"quality\": 0.0, \"cost\": 0.0035705999999999993, \"time\": 76.96206550598144}, \"0b1ed7ff58\": {\"quality\": 0.19252136752136753, \"cost\": 0.008049169, \"time\": 92.94120838642121}, \"0b3dc2e896\": {\"quality\": 0.0, \"cost\": 0.00892118, \"time\": 96.49321339130401}, \"0b43e94f3f\": {\"quality\": 0.1987179487179487, \"cost\": 0.001608864, \"time\": 40.973159313201904}, \"0b4ab72197\": {\"quality\": 0.061111111111111116, \"cost\": 0.010094982, \"time\": 72.49727232456206}, \"0be862a0dc\": {\"quality\": 0.04038461538461539, \"cost\": 0.0062549730000000005, \"time\": 56.527322101593015}, \"0bf3129ae8\": {\"quality\": 0.16538461538461538, \"cost\": 0.0028206660000000003, \"time\": 54.240378093719485}, \"0c020b86a3\": {\"quality\": 0.18482905982905984, \"cost\": 0.008657178000000001, \"time\": 49.74211900234222}, \"0c6c7fe96a\": {\"quality\": 0.12863247863247865, \"cost\": 0.012195626999999999, \"time\": 80.83456852436066}, \"0c81c8996a\": {\"quality\": 0.023076923076923078, \"cost\": 0.006098082000000001, \"time\": 71.85198886394501}, \"0cdc5954dd\": {\"quality\": 0.11666666666666667, \"cost\": 0.008310576, \"time\": 66.0079866170883}, \"0d25188bf7\": {\"quality\": 0.04807692307692308, \"cost\": 0.008176553999999999, \"time\": 84.21552875041962}, \"0d9d767ae5\": {\"quality\": 0.04444444444444444, \"cost\": 0.012340264, \"time\": 71.0254203081131}, \"0e36342fe7\": {\"quality\": 
0.09252136752136753, \"cost\": 0.00467736, \"time\": 38.08519749641418}, \"0e7e862290\": {\"quality\": 0.0, \"cost\": 0.003672576, \"time\": 74.36907055377961}, \"0e91cd07f9\": {\"quality\": 0.18333333333333332, \"cost\": 0.006350159999999999, \"time\": 76.38928816318511}, \"0ec672e7c8\": {\"quality\": 0.2098290598290598, \"cost\": 0.013312296999999999, \"time\": 90.61400089263915}, \"0ed243f788\": {\"quality\": 0.23205128205128206, \"cost\": 0.008812347, \"time\": 80.54674921035766}, \"0eeb372802\": {\"quality\": 0.23974358974358972, \"cost\": 0.013314724, \"time\": 94.53299105167389}, \"0effe9b1dc\": {\"quality\": 0.15918803418803418, \"cost\": 0.003532296, \"time\": 72.70150463581085}, \"0f7faf684d\": {\"quality\": 0.10982905982905983, \"cost\": 0.011855149000000002, \"time\": 89.06640274524689}, \"0fcec544e3\": {\"quality\": 0.25982905982905985, \"cost\": 0.004687262999999999, \"time\": 83.44144034385681}, \"0ff126ebf8\": {\"quality\": 0.21752136752136753, \"cost\": 0.011823445, \"time\": 64.27427606582641}, \"112d9a3421\": {\"quality\": 0.2730769230769231, \"cost\": 0.010725950000000001, \"time\": 87.46276273727418}, \"114a097c53\": {\"quality\": 0.10982905982905983, \"cost\": 0.00542718, \"time\": 47.13481698036194}, \"116334cd72\": {\"quality\": 0.3175213675213675, \"cost\": 0.012438881999999998, \"time\": 93.07386040687561}, \"1175ee37e6\": {\"quality\": 0.19444444444444442, \"cost\": 0.001459101, \"time\": 30.085742378234862}, \"11a66478dc\": {\"quality\": 0.125, \"cost\": 0.00746376, \"time\": 63.90720744132996}, \"11bc996d48\": {\"quality\": 0.2098290598290598, \"cost\": 0.0065042699999999995, \"time\": 65.52873225212097}, \"11debf9fc0\": {\"quality\": 0.18482905982905984, \"cost\": 0.006505245, \"time\": 89.37637939453126}, \"123fb650fb\": {\"quality\": 0.21752136752136753, \"cost\": 0.008327412, \"time\": 88.56314113140107}, \"1274c21076\": {\"quality\": 0.125, \"cost\": 0.011184342, \"time\": 102.6374900341034}, \"127af50739\": {\"quality\": 
0.16538461538461538, \"cost\": 0.008449134, \"time\": 104.71919329166413}, \"12addbf5e2\": {\"quality\": 0.04444444444444444, \"cost\": 0.003276834, \"time\": 88.02170798778533}, \"133ee5023f\": {\"quality\": 0.1987179487179487, \"cost\": 0.005588716, \"time\": 53.45017175674438}, \"1368e1c78e\": {\"quality\": 0.16752136752136754, \"cost\": 0.0034759379999999996, \"time\": 84.43210818767548}, \"13a009fe0c\": {\"quality\": 0.21752136752136753, \"cost\": 0.012945674, \"time\": 98.5892866373062}, \"13da306f84\": {\"quality\": 0.10363247863247865, \"cost\": 0.0087151, \"time\": 77.5059399843216}, \"13f2b9c25b\": {\"quality\": 0.25982905982905985, \"cost\": 0.003494103, \"time\": 98.82596600055695}, \"13f75f9bd0\": {\"quality\": 0.2064102564102564, \"cost\": 0.007151064, \"time\": 70.01372668743133}, \"1404e0aa35\": {\"quality\": 0.2098290598290598, \"cost\": 0.007724231, \"time\": 94.99440484046936}, \"140ededb41\": {\"quality\": 0.2098290598290598, \"cost\": 0.001514445, \"time\": 19.843628692626954}, \"142f3a7c70\": {\"quality\": 0.0, \"cost\": 0.004404129, \"time\": 77.21951529979705}, \"1468dddecc\": {\"quality\": 0.16944444444444445, \"cost\": 0.002386455, \"time\": 48.906515979766844}, \"14d19a01e2\": {\"quality\": 0.05641025641025641, \"cost\": 0.0063798959999999995, \"time\": 72.75729236602783}, \"1624bb5302\": {\"quality\": 0.18482905982905984, \"cost\": 0.001919664, \"time\": 66.44049925804138}, \"1636e0833b\": {\"quality\": 0.025, \"cost\": 0.002752764, \"time\": 74.27101521492004}, \"1658296f3a\": {\"quality\": 0.21752136752136753, \"cost\": 0.005375874, \"time\": 64.17133255004883}, \"16f351273f\": {\"quality\": 0.17307692307692307, \"cost\": 0.01028627, \"time\": 67.0687807559967}, \"171e6ae293\": {\"quality\": 0.21752136752136753, \"cost\": 0.007912404000000001, \"time\": 62.13353226184845}, \"176da24f53\": {\"quality\": 0.015384615384615385, \"cost\": 0.0038477320000000004, \"time\": 42.74407241344451}, \"179379555f\": {\"quality\": 0.1987179487179487, 
\"cost\": 0.006129615, \"time\": 88.43889818191528}, \"17c928174f\": {\"quality\": 0.1987179487179487, \"cost\": 0.0028421130000000003, \"time\": 75.59559867382049}, \"181c91d1be\": {\"quality\": 0.21752136752136753, \"cost\": 0.012095439999999999, \"time\": 67.77491972446441}, \"18368684cd\": {\"quality\": 0.05641025641025641, \"cost\": 0.0059048880000000005, \"time\": 90.52595224380494}, \"183743e76e\": {\"quality\": 0.0, \"cost\": 0.005899368, \"time\": 74.72685542106629}, \"18f55750b0\": {\"quality\": 0.0, \"cost\": 0.008405136, \"time\": 124.15524208545685}, \"19563b057d\": {\"quality\": 0.1987179487179487, \"cost\": 0.011058744, \"time\": 94.49535017013551}, \"1957127275\": {\"quality\": 0.29252136752136754, \"cost\": 0.008193474000000001, \"time\": 92.91692547798156}, \"197bb53f10\": {\"quality\": 0.2098290598290598, \"cost\": 0.006091916999999999, \"time\": 79.83643505573272}, \"199fd1fbf2\": {\"quality\": 0.058333333333333334, \"cost\": 0.00121965, \"time\": 48.699701046943666}, \"19b40e0271\": {\"quality\": 0.3175213675213675, \"cost\": 0.008674995000000001, \"time\": 116.94037146568297}, \"19e3db7fe7\": {\"quality\": 0.025, \"cost\": 0.01204651, \"time\": 117.68800120353698}, \"1ad856985f\": {\"quality\": 0.19444444444444442, \"cost\": 0.008808637000000001, \"time\": 115.93177556991577}, \"1adec2dca2\": {\"quality\": 0.0, \"cost\": 0.013760218000000001, \"time\": 125.3757052898407}, \"1b04a2b184\": {\"quality\": 0.09252136752136753, \"cost\": 0.013170014, \"time\": 111.55746636390685}, \"1b28439bd7\": {\"quality\": 0.21752136752136753, \"cost\": 0.010848586, \"time\": 84.68317291736602}, \"1beb2fac62\": {\"quality\": 0.3175213675213675, \"cost\": 0.013538090000000001, \"time\": 73.24731080532074}, \"1c347e4d91\": {\"quality\": 0.11752136752136753, \"cost\": 0.003958569, \"time\": 81.87744581699371}, \"1c3882926e\": {\"quality\": 0.06538461538461539, \"cost\": 0.006650534999999999, \"time\": 102.02758514881134}, \"1cc6d9efb6\": {\"quality\": 
0.21752136752136753, \"cost\": 0.011642267999999999, \"time\": 73.37450892925263}, \"1ce3d77039\": {\"quality\": 0.1987179487179487, \"cost\": 0.000591721, \"time\": 33.82067358493805}, \"1ce99cf2c8\": {\"quality\": 0.19252136752136753, \"cost\": 0.007676729, \"time\": 109.94239032268524}, \"1d26090364\": {\"quality\": 0.08141025641025641, \"cost\": 0.004802969999999999, \"time\": 37.3655684709549}, \"1d87f97e62\": {\"quality\": 0.25085470085470085, \"cost\": 0.0074821539999999995, \"time\": 69.90373904705048}, \"1da2369719\": {\"quality\": 0.1737179487179487, \"cost\": 0.003804339, \"time\": 75.42430078983307}, \"1e18e60895\": {\"quality\": 0.1952991452991453, \"cost\": 0.000637105, \"time\": 36.58538358211517}, \"1e1bf7e88b\": {\"quality\": 0.2064102564102564, \"cost\": 0.012300497, \"time\": 91.10763094425201}, \"1e8b3521f8\": {\"quality\": 0.23205128205128206, \"cost\": 0.003055887, \"time\": 85.00319654941559}, \"1f5e8c9e9a\": {\"quality\": 0.06538461538461539, \"cost\": 0.01254384, \"time\": 110.50039112567902}, \"2018bef45f\": {\"quality\": 0.13974358974358975, \"cost\": 0.0073448359999999996, \"time\": 90.09606876373292}, \"2066966577\": {\"quality\": 0.19444444444444442, \"cost\": 0.011873376, \"time\": 119.55739791393279}, \"2075ff1d04\": {\"quality\": 0.08333333333333334, \"cost\": 0.0009685659999999999, \"time\": 70.29870116710663}, \"2080b60a57\": {\"quality\": 0.04444444444444444, \"cost\": 0.005072787, \"time\": 116.32027242183685}, \"208a98f514\": {\"quality\": 0.21752136752136753, \"cost\": 0.009998995, \"time\": 118.32991974353791}, \"20e10af7d4\": {\"quality\": 0.2098290598290598, \"cost\": 0.005652002999999999, \"time\": 116.42957108020784}, \"20e2c0b057\": {\"quality\": 0.10982905982905983, \"cost\": 0.007542789, \"time\": 113.53230805397033}, \"211b89b4cd\": {\"quality\": 0.1952991452991453, \"cost\": 0.005615415, \"time\": 63.006171917915346}, \"2153174e2d\": {\"quality\": 0.11474358974358972, \"cost\": 0.007281448, \"time\": 
118.94957902431489}, \"21b2b8ebd1\": {\"quality\": 0.21752136752136753, \"cost\": 0.007063268000000001, \"time\": 76.91416993141175}, \"21b2df8512\": {\"quality\": 0.19444444444444442, \"cost\": 0.00326791, \"time\": 69.47645666599274}, \"21bed16a7d\": {\"quality\": 0.07307692307692308, \"cost\": 0.0016491099999999999, \"time\": 30.10742793083191}, \"227246dff8\": {\"quality\": 0.1876068376068376, \"cost\": 0.0006464099999999999, \"time\": 42.72886557579041}, \"227c30d349\": {\"quality\": 0.1, \"cost\": 0.005918874, \"time\": 103.58872454166413}, \"228687831a\": {\"quality\": 0.04722222222222222, \"cost\": 0.005076029999999999, \"time\": 67.8051172733307}, \"23075b2a6e\": {\"quality\": 0.05, \"cost\": 0.0024716549999999997, \"time\": 97.33722229003905}, \"23566f15ab\": {\"quality\": 0.1876068376068376, \"cost\": 0.00044987499999999997, \"time\": 40.82834756374359}, \"2370cebb10\": {\"quality\": 0.21752136752136753, \"cost\": 0.004052682, \"time\": 106.21984114646912}, \"2386b03c4c\": {\"quality\": 0.1987179487179487, \"cost\": 0.006683786999999999, \"time\": 110.57667918205262}, \"24957f3a43\": {\"quality\": 0.1987179487179487, \"cost\": 0.004216121999999999, \"time\": 70.75154151916504}, \"24c122de4e\": {\"quality\": 0.061111111111111116, \"cost\": 0.0059287260000000005, \"time\": 78.56412732601166}, \"24f76747b9\": {\"quality\": 0.14871794871794872, \"cost\": 0.003334974, \"time\": 96.20667114257813}, \"2609bfd616\": {\"quality\": 0.08482905982905983, \"cost\": 0.009335312, \"time\": 101.31035015583038}, \"260ab3e966\": {\"quality\": 0.18482905982905984, \"cost\": 0.006512652000000001, \"time\": 107.31265118122101}, \"2728c8eb6a\": {\"quality\": 0.05918803418803419, \"cost\": 0.009020646, \"time\": 82.33452692031861}, \"27971eaaf5\": {\"quality\": 0.1876068376068376, \"cost\": 0.005893562, \"time\": 85.50352709293365}, \"27ba0964b2\": {\"quality\": 0.18205128205128207, \"cost\": 0.0012502439999999997, \"time\": 50.85352149009705}, \"27daa50458\": {\"quality\": 
0.21752136752136753, \"cost\": 0.005911454, \"time\": 49.84150557518005}, \"28369b2421\": {\"quality\": 0.3175213675213675, \"cost\": 0.008581950000000001, \"time\": 82.63782026767731}, \"28421e6d62\": {\"quality\": 0.05, \"cost\": 0.0071309500000000005, \"time\": 77.62054443359375}, \"2848c42f91\": {\"quality\": 0.0, \"cost\": 0.004609089, \"time\": 92.82525777816772}, \"28a638bb6e\": {\"quality\": 0.04444444444444444, \"cost\": 0.0037765230000000004, \"time\": 103.16399810314178}, \"290947fe5a\": {\"quality\": 0.08482905982905983, \"cost\": 0.005273313, \"time\": 115.58293502330781}, \"2936c3e43e\": {\"quality\": 0.13205128205128205, \"cost\": 0.011067623, \"time\": 92.91213045120239}, \"293ec5edca\": {\"quality\": 0.2987179487179487, \"cost\": 0.006866393, \"time\": 77.98297226428986}, \"294e541235\": {\"quality\": 0.0, \"cost\": 0.0024907139999999998, \"time\": 55.42697856426239}, \"295ed5e759\": {\"quality\": 0.023076923076923078, \"cost\": 0.007770515999999999, \"time\": 73.042240858078}, \"2960431101\": {\"quality\": 0.21752136752136753, \"cost\": 0.007103678, \"time\": 72.23796293735504}, \"29892d8468\": {\"quality\": 0.0, \"cost\": 0.011991246, \"time\": 90.30782704353332}, \"299a0aeb65\": {\"quality\": 0.21752136752136753, \"cost\": 0.008716849999999998, \"time\": 87.09911065101625}, \"29bf3c0a3b\": {\"quality\": 0.0952991452991453, \"cost\": 0.008317653000000001, \"time\": 95.62355568408967}, \"29c8c693e2\": {\"quality\": 0.19252136752136753, \"cost\": 0.009049301000000001, \"time\": 97.66356644630432}, \"2a7d15f4a7\": {\"quality\": 0.1987179487179487, \"cost\": 0.0007805049999999999, \"time\": 26.81472017765045}, \"2ae24e0124\": {\"quality\": 0.2098290598290598, \"cost\": 0.006997367999999999, \"time\": 100.80345997810363}, \"2b5679d248\": {\"quality\": 0.23482905982905983, \"cost\": 0.012146391000000001, \"time\": 73.05133624076844}, \"2b5ab72a55\": {\"quality\": 0.1, \"cost\": 0.00048774399999999997, \"time\": 87.95751609802247}, \"2b82a67eb1\": 
{\"quality\": 0.2098290598290598, \"cost\": 0.002022681, \"time\": 73.1611754655838}, \"2bcbffdf85\": {\"quality\": 0.1, \"cost\": 0.003530234999999999, \"time\": 89.18944058418273}, \"2bd39ee744\": {\"quality\": 0.2098290598290598, \"cost\": 0.014582621000000002, \"time\": 59.03191387653351}, \"2bf38d797f\": {\"quality\": 0.2064102564102564, \"cost\": 0.0075413270000000004, \"time\": 54.856262516975406}, \"2c1640adf7\": {\"quality\": 0.18482905982905984, \"cost\": 0.0019162139999999999, \"time\": 77.58103239536285}, \"2c5cf9eb26\": {\"quality\": 0.125, \"cost\": 0.006593481, \"time\": 71.76690304279327}, \"2c87313a93\": {\"quality\": 0.11752136752136753, \"cost\": 0.012109324000000001, \"time\": 136.4559998512268}, \"2c9a9f94c4\": {\"quality\": 0.06538461538461539, \"cost\": 0.011352702, \"time\": 108.1334749698639}, \"2d3bbc2d23\": {\"quality\": 0.1987179487179487, \"cost\": 0.006806840999999999, \"time\": 94.72236230373383}, \"2d7f1dbd4b\": {\"quality\": 0.1987179487179487, \"cost\": 0.003597009, \"time\": 112.8651005268097}, \"2de113167b\": {\"quality\": 0.11752136752136753, \"cost\": 0.011756008, \"time\": 101.33509397506714}, \"2de3eb2c19\": {\"quality\": 0.23974358974358972, \"cost\": 0.006698560000000001, \"time\": 58.83604803085328}, \"2e02b71061\": {\"quality\": 0.09444444444444444, \"cost\": 0.002093574, \"time\": 88.67284088134765}, \"2e30394ac6\": {\"quality\": 0.07307692307692308, \"cost\": 0.007492572, \"time\": 90.685981965065}, \"2e5d071f21\": {\"quality\": 0.12863247863247865, \"cost\": 0.0072971500000000005, \"time\": 123.94845788478851}, \"2e9c5cc9bf\": {\"quality\": 0.05, \"cost\": 0.005878296000000001, \"time\": 49.976054430007935}, \"2ec4bec1a3\": {\"quality\": 0.21474358974358973, \"cost\": 0.010779144000000001, \"time\": 124.55905365943909}, \"2f1573da80\": {\"quality\": 0.1987179487179487, \"cost\": 0.008802546, \"time\": 103.0501886844635}, \"2fc0cb3592\": {\"quality\": 0.06538461538461539, \"cost\": 0.00690309, \"time\": 
73.22720482349396}, \"2fd9cd426a\": {\"quality\": 0.21752136752136753, \"cost\": 0.01168472, \"time\": 97.46322917938232}, \"3019af79b3\": {\"quality\": 0.1, \"cost\": 0.001074796, \"time\": 51.414027357101446}, \"302c1d97fc\": {\"quality\": 0.1814102564102564, \"cost\": 0.01244514, \"time\": 70.44251236915588}, \"303b467574\": {\"quality\": 0.21752136752136753, \"cost\": 0.007695415999999999, \"time\": 104.79880075454712}, \"3058b1f1f8\": {\"quality\": 0.12863247863247865, \"cost\": 0.006238776, \"time\": 104.19706840515136}, \"30ae4cbe91\": {\"quality\": 0.19252136752136753, \"cost\": 0.000758614, \"time\": 64.8543310880661}, \"30e3ff1d17\": {\"quality\": 0.1841880341880342, \"cost\": 0.007638456, \"time\": 97.64294464588164}, \"30f20c8fe6\": {\"quality\": 0.1876068376068376, \"cost\": 0.006185881000000001, \"time\": 95.83135304450988}, \"316759d191\": {\"quality\": 0.06538461538461539, \"cost\": 0.01277288, \"time\": 106.51430850028991}, \"3177802176\": {\"quality\": 0.05, \"cost\": 0.000812176, \"time\": 68.97149279117585}, \"3184f977a8\": {\"quality\": 0.06944444444444445, \"cost\": 0.003731166, \"time\": 116.78110229969025}, \"3194e440cf\": {\"quality\": 0.1, \"cost\": 0.0021085889999999997, \"time\": 68.94484202861786}, \"3197ad4faf\": {\"quality\": 0.125, \"cost\": 0.0029357279999999994, \"time\": 104.7419487476349}, \"31a423a3bf\": {\"quality\": 0.0, \"cost\": 0.002416362, \"time\": 71.78247156143189}, \"321e17afbd\": {\"quality\": 0.1952991452991453, \"cost\": 0.009030642000000002, \"time\": 102.16506278514862}, \"32b101d807\": {\"quality\": 0.21752136752136753, \"cost\": 0.013004699000000002, \"time\": 107.63277230262756}, \"332a350ea2\": {\"quality\": 0.10363247863247865, \"cost\": 0.005886114, \"time\": 79.25630362033844}, \"33459cd29c\": {\"quality\": 0.02222222222222222, \"cost\": 0.009792765999999998, \"time\": 109.5698808670044}, \"33a187e74f\": {\"quality\": 0.1987179487179487, \"cost\": 0.0057371969999999994, \"time\": 108.5205159664154}, 
\"34026bb5cc\": {\"quality\": 0.2098290598290598, \"cost\": 0.003850629, \"time\": 79.99503300189971}, \"3511b5e1d0\": {\"quality\": 0.0, \"cost\": 0.007917534, \"time\": 106.0169960975647}, \"3513e54767\": {\"quality\": 0.12307692307692308, \"cost\": 0.00652464, \"time\": 56.468523144721985}, \"353f0cb1ac\": {\"quality\": 0.2064102564102564, \"cost\": 0.0027354140000000002, \"time\": 50.53772373199463}, \"3550bf88cb\": {\"quality\": 0.22863247863247865, \"cost\": 0.007747509, \"time\": 104.70380001068116}, \"35610fb420\": {\"quality\": 0.2098290598290598, \"cost\": 0.004438119, \"time\": 84.96952936649322}, \"357267e14b\": {\"quality\": 0.17222222222222222, \"cost\": 0.00107605, \"time\": 39.50561866760254}, \"36011c7606\": {\"quality\": 0.1987179487179487, \"cost\": 0.008097354, \"time\": 84.37511310577392}, \"362d480d6d\": {\"quality\": 0.22863247863247865, \"cost\": 0.010916724, \"time\": 110.61522128582001}, \"363209b6e7\": {\"quality\": 0.32863247863247863, \"cost\": 0.006868834000000001, \"time\": 83.50905873775483}, \"3637084f91\": {\"quality\": 0.3175213675213675, \"cost\": 0.007119692, \"time\": 82.10715711116791}, \"36b17c40f3\": {\"quality\": 0.03418803418803419, \"cost\": 0.004674881999999999, \"time\": 53.087517738342285}, \"36c66671ee\": {\"quality\": 0.12307692307692308, \"cost\": 0.006580928999999999, \"time\": 86.52281455993652}, \"372e8b5f4f\": {\"quality\": 0.058333333333333334, \"cost\": 0.001610232, \"time\": 70.54334411621093}, \"375ed248fe\": {\"quality\": 0.06752136752136753, \"cost\": 0.008873826000000001, \"time\": 86.2198546409607}, \"377b8b0bcc\": {\"quality\": 0.025, \"cost\": 0.0049981109999999995, \"time\": 106.38127601146698}, \"37bd28f2c9\": {\"quality\": 0.21752136752136753, \"cost\": 0.007961847, \"time\": 103.88883001804352}, \"38075bb01f\": {\"quality\": 0.21666666666666667, \"cost\": 0.003875973, \"time\": 75.78762099742889}, \"38567d6a43\": {\"quality\": 0.0, \"cost\": 0.004616319, \"time\": 89.08885374069214}, 
\"389a99ab21\": {\"quality\": 0.07371794871794872, \"cost\": 0.007963053, \"time\": 102.66287496089936}, \"389c54cbca\": {\"quality\": 0.0702991452991453, \"cost\": 0.005610244, \"time\": 48.15491397380829}, \"38ec11cf7b\": {\"quality\": 0.0, \"cost\": 0.007187318999999999, \"time\": 103.98532650470733}, \"3980f20caa\": {\"quality\": 0.04807692307692308, \"cost\": 0.014391898, \"time\": 100.08462851047516}, \"39cd4ca402\": {\"quality\": 0.025, \"cost\": 0.003120591, \"time\": 75.18385965824127}, \"3a34b24c41\": {\"quality\": 0.21752136752136753, \"cost\": 0.002965395, \"time\": 73.19122822284699}, \"3b2e8075ea\": {\"quality\": 0.2098290598290598, \"cost\": 0.003075393, \"time\": 70.68897068500519}, \"3b3676521a\": {\"quality\": 0.21752136752136753, \"cost\": 0.013715839000000002, \"time\": 107.15725564956665}, \"3b3a6bf087\": {\"quality\": 0.30982905982905984, \"cost\": 0.006470724000000001, \"time\": 74.07467935085296}, \"3b4bde0121\": {\"quality\": 0.0, \"cost\": 0.00302925, \"time\": 77.53029568195343}, \"3b57530a56\": {\"quality\": 0.2098290598290598, \"cost\": 0.002663507, \"time\": 48.332898473739625}, \"3c206c89f3\": {\"quality\": 0.21752136752136753, \"cost\": 0.008676515999999999, \"time\": 115.085169839859}, \"3cbab8082e\": {\"quality\": 0.09871794871794873, \"cost\": 0.008985135, \"time\": 116.89071977138519}, \"3d71c4dd2c\": {\"quality\": 0.10982905982905983, \"cost\": 0.006464912999999999, \"time\": 86.58996284008026}, \"3ea15ac20c\": {\"quality\": 0.11538461538461539, \"cost\": 0.007241556, \"time\": 125.68785231113434}, \"3f1a58aec9\": {\"quality\": 0.22863247863247865, \"cost\": 0.00189393, \"time\": 45.61551144123077}, \"3f2321bb08\": {\"quality\": 0.125, \"cost\": 0.000410382, \"time\": 33.271996068954465}, \"3f3ef494b0\": {\"quality\": 0.0, \"cost\": 0.001176912, \"time\": 50.03896124362946}, \"3f62c3fbfc\": {\"quality\": 0.21752136752136753, \"cost\": 0.004284207, \"time\": 95.22124736309053}, \"3f88dd99f7\": {\"quality\": 0.19252136752136753, 
\"cost\": 0.005681200000000001, \"time\": 63.5021466255188}, \"3fa747af9a\": {\"quality\": 0.15833333333333333, \"cost\": 0.002231574, \"time\": 96.99003422260284}, \"40104c813f\": {\"quality\": 0.0, \"cost\": 0.010208657000000001, \"time\": 125.99128649234771}, \"403b05da2d\": {\"quality\": 0.08141025641025641, \"cost\": 0.010538071999999999, \"time\": 94.6719701051712}, \"4043815a3e\": {\"quality\": 0.21752136752136753, \"cost\": 0.013145456, \"time\": 124.91901173591614}, \"409ff67607\": {\"quality\": 0.06944444444444445, \"cost\": 0.004053711, \"time\": 99.38009803295135}, \"412c065b83\": {\"quality\": 0.0, \"cost\": 0.008236578, \"time\": 92.44097802639007}, \"4171fbac5c\": {\"quality\": 0.19444444444444442, \"cost\": 0.009067686, \"time\": 114.00634713172913}, \"4191118787\": {\"quality\": 0.1987179487179487, \"cost\": 0.008734028999999999, \"time\": 109.63090167045593}, \"41d5b97871\": {\"quality\": 0.0, \"cost\": 0.002514906, \"time\": 62.46952087879181}, \"41d8845655\": {\"quality\": 0.2098290598290598, \"cost\": 0.010189181000000002, \"time\": 59.657727408409116}, \"41ee202cac\": {\"quality\": 0.06538461538461539, \"cost\": 0.000921442, \"time\": 54.925141382217404}, \"42082dcd0d\": {\"quality\": 0.1987179487179487, \"cost\": 0.007583567999999999, \"time\": 108.27059605121613}, \"42ddd48341\": {\"quality\": 0.22094017094017093, \"cost\": 0.014396616, \"time\": 104.56641755104064}, \"4361bc7ea7\": {\"quality\": 0.13333333333333333, \"cost\": 0.001473496, \"time\": 63.439108324050906}, \"43afdad250\": {\"quality\": 0.22863247863247865, \"cost\": 0.011869643, \"time\": 108.28341348171234}, \"43c3cf9cb8\": {\"quality\": 0.2064102564102564, \"cost\": 0.0068331120000000006, \"time\": 53.15135598182678}, \"43d24fb32a\": {\"quality\": 0.07307692307692308, \"cost\": 0.00976587, \"time\": 85.51874697208405}, \"440bc872de\": {\"quality\": 0.07222222222222222, \"cost\": 0.007188219, \"time\": 121.43468098640442}, \"44173a9aef\": {\"quality\": 0.30641025641025643, 
\"cost\": 0.01244934, \"time\": 115.21545617580415}, \"44d6af5523\": {\"quality\": 0.2098290598290598, \"cost\": 0.00476793, \"time\": 81.34744908809662}, \"44fe4e4e3e\": {\"quality\": 0.11752136752136753, \"cost\": 0.01136529, \"time\": 81.14654626846314}, \"4587a1500c\": {\"quality\": 0.3175213675213675, \"cost\": 0.006090974000000002, \"time\": 79.77526035308838}, \"45ef93b61e\": {\"quality\": 0.1987179487179487, \"cost\": 0.003150051, \"time\": 81.59605078697206}, \"461846a52d\": {\"quality\": 0.1814102564102564, \"cost\": 0.003777498, \"time\": 82.82821555137635}, \"462e6ff849\": {\"quality\": 0.11752136752136753, \"cost\": 0.010684958, \"time\": 113.33498125076294}, \"4630853d32\": {\"quality\": 0.06752136752136753, \"cost\": 0.008278823000000001, \"time\": 86.25940473079682}, \"46475b9e75\": {\"quality\": 0.2098290598290598, \"cost\": 0.0061740449999999995, \"time\": 87.75788187980652}, \"46654a1f32\": {\"quality\": 0.0, \"cost\": 0.011978384, \"time\": 67.87502360343933}, \"466a3036b2\": {\"quality\": 0.09252136752136753, \"cost\": 0.010652967, \"time\": 114.7115253686905}, \"466d4d16dd\": {\"quality\": 0.18482905982905984, \"cost\": 0.008396510000000001, \"time\": 86.67418849468231}, \"46a35022d8\": {\"quality\": 0.1952991452991453, \"cost\": 0.000766365, \"time\": 59.83990566730499}, \"46ed68152d\": {\"quality\": 0.18333333333333332, \"cost\": 0.01143953, \"time\": 92.87648718357087}, \"46edc488a4\": {\"quality\": 0.059829059829059825, \"cost\": 0.0012849899999999997, \"time\": 50.55697724819183}, \"476a12876c\": {\"quality\": 0.11752136752136753, \"cost\": 0.006154817999999999, \"time\": 81.70997877120972}, \"48043e2304\": {\"quality\": 0.2098290598290598, \"cost\": 0.009217511000000001, \"time\": 89.0722332715988}, \"488645cbd9\": {\"quality\": 0.2064102564102564, \"cost\": 0.001493856, \"time\": 48.69374096393585}, \"49009a3b57\": {\"quality\": 0.059829059829059825, \"cost\": 0.007598573999999999, \"time\": 78.47919590473174}, \"4909061216\": 
{\"quality\": 0.21752136752136753, \"cost\": 0.013011773, \"time\": 104.75487668514252}, \"49107972df\": {\"quality\": 0.025, \"cost\": 0.007639823999999999, \"time\": 118.11295173168182}, \"49731b1ccd\": {\"quality\": 0.0952991452991453, \"cost\": 0.006953804999999999, \"time\": 77.60816009044646}, \"498f146004\": {\"quality\": 0.11752136752136753, \"cost\": 0.008050870000000002, \"time\": 112.59319038391114}, \"49ad844bd2\": {\"quality\": 0.2098290598290598, \"cost\": 0.008512473, \"time\": 109.38159234523773}, \"49ca727e49\": {\"quality\": 0.2098290598290598, \"cost\": 0.000746866, \"time\": 34.221199584007266}, \"4a4a960a82\": {\"quality\": 0.13333333333333333, \"cost\": 0.007445242000000001, \"time\": 84.403648686409}, \"4a92372986\": {\"quality\": 0.16538461538461538, \"cost\": 0.00319335, \"time\": 83.45380291938781}, \"4aa7e8fde6\": {\"quality\": 0.10641025641025642, \"cost\": 0.0019865879999999996, \"time\": 83.08999395370483}, \"4aafd39d76\": {\"quality\": 0.14038461538461539, \"cost\": 0.005373513, \"time\": 82.9023279428482}, \"4ace1cfad1\": {\"quality\": 0.16538461538461538, \"cost\": 0.009508553999999999, \"time\": 106.34385075569153}, \"4ad1952206\": {\"quality\": 0.20213675213675214, \"cost\": 0.0085493, \"time\": 83.97655324935913}, \"4b59f40131\": {\"quality\": 0.04807692307692308, \"cost\": 0.006207440999999999, \"time\": 83.43397681713105}, \"4b86a1c038\": {\"quality\": 0.2064102564102564, \"cost\": 0.00792555, \"time\": 106.1481963634491}, \"4b92a26754\": {\"quality\": 0.0, \"cost\": 0.00047688399999999996, \"time\": 69.90886387825012}, \"4bc4528402\": {\"quality\": 0.16538461538461538, \"cost\": 0.004527411, \"time\": 81.34518160820008}, \"4c158a1a4a\": {\"quality\": 0.08205128205128205, \"cost\": 0.007535195999999999, \"time\": 78.38852503299714}, \"4c954323e3\": {\"quality\": 0.21752136752136753, \"cost\": 0.00974773, \"time\": 103.17564594745636}, \"4d8bcf8ae2\": {\"quality\": 0.125, \"cost\": 0.000539128, \"time\": 69.75875072479249}, 
\"4d91e8a27b\": {\"quality\": 0.21752136752136753, \"cost\": 0.010584353000000001, \"time\": 76.16667878627777}, \"4dd3635bc3\": {\"quality\": 0.2098290598290598, \"cost\": 0.012259316000000001, \"time\": 78.35238783359529}, \"4dd96bd18f\": {\"quality\": 0.21752136752136753, \"cost\": 0.011986314000000001, \"time\": 105.92028946876526}, \"4e298ee0d4\": {\"quality\": 0.1814102564102564, \"cost\": 0.008259033, \"time\": 95.37437601089476}, \"4e3443a0f9\": {\"quality\": 0.11752136752136753, \"cost\": 0.013054975, \"time\": 106.09084448814392}, \"4e4b9db2b8\": {\"quality\": 0.21752136752136753, \"cost\": 0.007140224000000001, \"time\": 72.97610201835633}, \"4e6509f614\": {\"quality\": 0.061111111111111116, \"cost\": 0.0053766, \"time\": 47.53030514717102}, \"4e6a83e751\": {\"quality\": 0.07371794871794872, \"cost\": 0.001089142, \"time\": 53.57680480480194}, \"4e8d8e527a\": {\"quality\": 0.14871794871794872, \"cost\": 0.0051247079999999995, \"time\": 85.9967652797699}, \"4ef333ab21\": {\"quality\": 0.19444444444444442, \"cost\": 0.001917522, \"time\": 79.47004499435425}, \"4f16545711\": {\"quality\": 0.06666666666666667, \"cost\": 0.005466131999999999, \"time\": 88.67296268939972}, \"4f8cca1195\": {\"quality\": 0.21752136752136753, \"cost\": 0.010168095, \"time\": 114.15254590511321}, \"500860eaa2\": {\"quality\": 0.0, \"cost\": 0.0022933619999999997, \"time\": 38.243334197998045}, \"50701b505e\": {\"quality\": 0.1987179487179487, \"cost\": 0.0064327049999999995, \"time\": 79.18819501399994}, \"50bc87e9cc\": {\"quality\": 0.1987179487179487, \"cost\": 0.005166006000000001, \"time\": 86.32211146354675}, \"50c03be77c\": {\"quality\": 0.10982905982905983, \"cost\": 0.00168543, \"time\": 94.39419853687286}, \"510375edad\": {\"quality\": 0.0, \"cost\": 0.0021525, \"time\": 67.02195911407472}, \"512fdb607c\": {\"quality\": 0.2098290598290598, \"cost\": 0.008402651, \"time\": 120.24980294704437}, \"51583a901c\": {\"quality\": 0.2098290598290598, \"cost\": 
0.010226283000000001, \"time\": 109.70203943252564}, \"521314dab6\": {\"quality\": 0.19252136752136753, \"cost\": 0.0064746199999999995, \"time\": 59.58354845046997}, \"5241bf401b\": {\"quality\": 0.07222222222222222, \"cost\": 0.006321017999999999, \"time\": 82.37689163684846}, \"526878b5eb\": {\"quality\": 0.07371794871794872, \"cost\": 0.008675337, \"time\": 115.97295072078705}, \"52c1cba6ce\": {\"quality\": 0.21752136752136753, \"cost\": 0.009671612999999999, \"time\": 108.82956576347351}, \"52e5d0f4fb\": {\"quality\": 0.2064102564102564, \"cost\": 0.001697778, \"time\": 88.22295272350311}, \"52f041a70e\": {\"quality\": 0.2098290598290598, \"cost\": 0.001067476, \"time\": 32.61202094554901}, \"5307496302\": {\"quality\": 0.2098290598290598, \"cost\": 0.005388657, \"time\": 86.62635931968688}, \"53869388bb\": {\"quality\": 0.061111111111111116, \"cost\": 0.0021395499999999996, \"time\": 39.578048658370975}, \"53d2932c4f\": {\"quality\": 0.2814102564102564, \"cost\": 0.007557482000000001, \"time\": 73.93080537319183}, \"5474247f91\": {\"quality\": 0.0, \"cost\": 0.008248308, \"time\": 80.75084838867187}, \"557d2cf7ba\": {\"quality\": 0.19444444444444442, \"cost\": 0.009174846, \"time\": 81.92483322620392}, \"559c7120c5\": {\"quality\": 0.14038461538461539, \"cost\": 0.009353153, \"time\": 113.30842261314393}, \"55c8aa8935\": {\"quality\": 0.04807692307692308, \"cost\": 0.01165586, \"time\": 114.18454928398131}, \"56a29a28c5\": {\"quality\": 0.04807692307692308, \"cost\": 0.004978499999999999, \"time\": 55.467578983306886}, \"56b39eb1d6\": {\"quality\": 0.21752136752136753, \"cost\": 0.009558101, \"time\": 131.33570635318756}, \"5703697dbd\": {\"quality\": 0.2098290598290598, \"cost\": 0.006108023999999999, \"time\": 79.37135038375854}, \"5718f2ed80\": {\"quality\": 0.10641025641025642, \"cost\": 0.003117972, \"time\": 91.63201060295106}, \"572a02a59a\": {\"quality\": 0.19444444444444442, \"cost\": 0.004321886999999999, \"time\": 100.45879077911377}, 
\"572c2df793\": {\"quality\": 0.04038461538461539, \"cost\": 0.004394319000000001, \"time\": 143.467453289032}, \"5750713a41\": {\"quality\": 0.2098290598290598, \"cost\": 0.006046722, \"time\": 92.31352968215941}, \"57757ef15e\": {\"quality\": 0.1876068376068376, \"cost\": 0.006326642, \"time\": 106.06430275440215}, \"5793d14bbe\": {\"quality\": 0.2098290598290598, \"cost\": 0.009510144000000002, \"time\": 139.83697164058685}, \"579a915ed2\": {\"quality\": 0.08333333333333334, \"cost\": 0.011355894000000002, \"time\": 109.32498943805695}, \"579c81bbe0\": {\"quality\": 0.025, \"cost\": 0.004824098, \"time\": 75.78087794780731}, \"57bed1722f\": {\"quality\": 0.025, \"cost\": 0.005056998, \"time\": 85.83169932365418}, \"585ba6d20b\": {\"quality\": 0.21752136752136753, \"cost\": 0.007866175, \"time\": 151.51093373298644}, \"589a1cea79\": {\"quality\": 0.05, \"cost\": 0.009551150000000001, \"time\": 151.7195835828781}, \"59006532b4\": {\"quality\": 0.16538461538461538, \"cost\": 0.007208474999999999, \"time\": 118.06772444248199}, \"59326c4e00\": {\"quality\": 0.015384615384615385, \"cost\": 0.005300058, \"time\": 72.70066883563996}, \"596f0ed542\": {\"quality\": 0.023076923076923078, \"cost\": 0.011558136, \"time\": 137.05188086032868}, \"5971ba4e0d\": {\"quality\": 0.21752136752136753, \"cost\": 0.0018305540000000003, \"time\": 43.38757050037384}, \"59e0117b7d\": {\"quality\": 0.21752136752136753, \"cost\": 0.006857488, \"time\": 139.82263526916503}, \"59f515d0da\": {\"quality\": 0.2064102564102564, \"cost\": 0.00388749, \"time\": 108.1351862192154}, \"59f887b67c\": {\"quality\": 0.0, \"cost\": 0.010208398, \"time\": 146.34244396686552}, \"5a22920db4\": {\"quality\": 0.1987179487179487, \"cost\": 0.0018006569999999998, \"time\": 75.78275735378266}, \"5a35020d45\": {\"quality\": 0.1987179487179487, \"cost\": 0.0060887910000000005, \"time\": 109.82949197292328}, \"5aa71bb88a\": {\"quality\": 0.12094017094017094, \"cost\": 0.000638436, \"time\": 67.15015056133271}, 
\"5b10fbdbe1\": {\"quality\": 0.21752136752136753, \"cost\": 0.013089312000000002, \"time\": 128.6064488887787}, \"5b4ad39a9e\": {\"quality\": 0.11752136752136753, \"cost\": 0.0032681159999999997, \"time\": 99.70189120769501}, \"5bade9eb85\": {\"quality\": 0.18482905982905984, \"cost\": 0.008712719999999998, \"time\": 120.3732186794281}, \"5be16744bf\": {\"quality\": 0.1952991452991453, \"cost\": 0.013749790000000001, \"time\": 91.71922521591186}, \"5c0db11303\": {\"quality\": 0.058333333333333334, \"cost\": 0.002397702, \"time\": 93.91862483024596}, \"5c53feccd9\": {\"quality\": 0.023076923076923078, \"cost\": 0.015926556, \"time\": 118.78805196285248}, \"5c77c7c2b2\": {\"quality\": 0.15, \"cost\": 0.008434941000000001, \"time\": 85.08626408576964}, \"5d072194b8\": {\"quality\": 0.11752136752136753, \"cost\": 0.007648864000000002, \"time\": 104.28334321975709}, \"5d79b50feb\": {\"quality\": 0.14038461538461539, \"cost\": 0.013589224, \"time\": 84.18065688610076}, \"5dc216cd6b\": {\"quality\": 0.10982905982905983, \"cost\": 0.010508788, \"time\": 83.96197824478149}, \"5dd68c1b8f\": {\"quality\": 0.16944444444444445, \"cost\": 0.001444968, \"time\": 54.59350550174713}, \"5de4a882c1\": {\"quality\": 0.21752136752136753, \"cost\": 0.008734155, \"time\": 75.31518497467042}, \"5deeeb223f\": {\"quality\": 0.24316239316239316, \"cost\": 0.005724273, \"time\": 115.48380098342895}, \"5e2f03b962\": {\"quality\": 0.1876068376068376, \"cost\": 0.0007302779999999999, \"time\": 48.90210340023041}, \"5ea2fab380\": {\"quality\": 0.05149572649572649, \"cost\": 0.0048052319999999996, \"time\": 47.63824257850647}, \"5eb3bb525b\": {\"quality\": 0.0702991452991453, \"cost\": 0.008967890000000001, \"time\": 96.21212074756622}, \"5ec3832817\": {\"quality\": 0.0, \"cost\": 0.008163805, \"time\": 148.1108601331711}, \"5f0199e07b\": {\"quality\": 0.025, \"cost\": 0.00068568, \"time\": 49.921340346336365}, \"5f37b3902b\": {\"quality\": 0.04038461538461539, \"cost\": 0.007223498, \"time\": 
77.24008927345275}, \"60cb623c53\": {\"quality\": 0.04871794871794872, \"cost\": 0.002396232, \"time\": 54.30748798847199}, \"612e546d71\": {\"quality\": 0.10705128205128206, \"cost\": 0.008148234, \"time\": 151.1273174762726}, \"6160bfb439\": {\"quality\": 0.11474358974358972, \"cost\": 0.010103339999999999, \"time\": 148.44622106552123}, \"6178f33808\": {\"quality\": 0.04722222222222222, \"cost\": 0.006706226999999999, \"time\": 153.97144429683686}, \"619b48dde9\": {\"quality\": 0.21752136752136753, \"cost\": 0.006427425999999999, \"time\": 120.5339899301529}, \"6268ac658c\": {\"quality\": 0.24316239316239316, \"cost\": 0.01069585, \"time\": 144.7754723072052}, \"628f34aace\": {\"quality\": 0.2098290598290598, \"cost\": 0.005102127, \"time\": 151.19322457313538}, \"630d1ecda0\": {\"quality\": 0.0, \"cost\": 0.00503412, \"time\": 158.05562288761138}, \"63a0aaebed\": {\"quality\": 0.15833333333333333, \"cost\": 0.001865547, \"time\": 88.02101895809173}, \"63f392465f\": {\"quality\": 0.18333333333333332, \"cost\": 0.0020362590000000003, \"time\": 114.81161072254181}, \"6527f214c3\": {\"quality\": 0.125, \"cost\": 0.0020534459999999996, \"time\": 107.1269181728363}, \"652c0f4bdf\": {\"quality\": 0.1, \"cost\": 0.013503214, \"time\": 140.5670464038849}, \"6533c85913\": {\"quality\": 0.10641025641025642, \"cost\": 0.009929774999999998, \"time\": 122.77552180290222}, \"65627426e0\": {\"quality\": 0.15, \"cost\": 0.000700312, \"time\": 55.69904806613922}, \"65801893b4\": {\"quality\": 0.025, \"cost\": 0.01106166, \"time\": 113.8573011636734}, \"65b76da9c6\": {\"quality\": 0.12222222222222223, \"cost\": 0.007336156, \"time\": 88.77893948554993}, \"65be1c1306\": {\"quality\": 0.06752136752136753, \"cost\": 0.001988206, \"time\": 34.18720245361328}, \"65e0216208\": {\"quality\": 0.05, \"cost\": 0.008013333, \"time\": 111.02614188194275}, \"65eee615d7\": {\"quality\": 0.1987179487179487, \"cost\": 0.000519715, \"time\": 32.74071106910706}, \"66277da52f\": {\"quality\": 
0.09871794871794873, \"cost\": 0.0036039389999999996, \"time\": 110.55973320007324}, \"66750c0934\": {\"quality\": 0.21752136752136753, \"cost\": 0.007508133, \"time\": 85.75274183750153}, \"66776ec181\": {\"quality\": 0.10641025641025642, \"cost\": 0.006989541, \"time\": 111.77394149303436}, \"67632141f6\": {\"quality\": 0.04038461538461539, \"cost\": 0.004542882, \"time\": 63.92582683563233}, \"677deb302a\": {\"quality\": 0.1814102564102564, \"cost\": 0.007409720999999999, \"time\": 111.17820315361024}, \"67868fcff6\": {\"quality\": 0.21752136752136753, \"cost\": 0.011676892000000001, \"time\": 81.84591567516327}, \"67aad9ea16\": {\"quality\": 0.2098290598290598, \"cost\": 0.007452915, \"time\": 126.7279718399048}, \"67bab6732d\": {\"quality\": 0.10555555555555557, \"cost\": 0.01121626, \"time\": 84.01423738002777}, \"67fe399cf1\": {\"quality\": 0.21752136752136753, \"cost\": 0.007087558, \"time\": 80.51065149307252}, \"6846bd8fb3\": {\"quality\": 0.08141025641025641, \"cost\": 0.006947139, \"time\": 85.74786493778228}, \"68583552fb\": {\"quality\": 0.0, \"cost\": 0.0053533439999999995, \"time\": 93.39369978904725}, \"689e327daf\": {\"quality\": 0.08333333333333334, \"cost\": 0.000799428, \"time\": 57.74340398311615}, \"69b3b67de6\": {\"quality\": 0.23974358974358972, \"cost\": 0.008200215, \"time\": 95.98227195739746}, \"69bf3f6ba0\": {\"quality\": 0.21752136752136753, \"cost\": 0.008827848, \"time\": 102.32003858089448}, \"69f90e610f\": {\"quality\": 0.22585470085470086, \"cost\": 0.011046368, \"time\": 121.27799794673919}, \"6a022c3f73\": {\"quality\": 0.21944444444444444, \"cost\": 0.004195854, \"time\": 99.68456645011902}, \"6a10c53ad8\": {\"quality\": 0.32863247863247863, \"cost\": 0.012164633000000001, \"time\": 119.7128321170807}, \"6a6348f69d\": {\"quality\": 0.025, \"cost\": 0.0010719339999999999, \"time\": 79.13316841125489}, \"6a8726145c\": {\"quality\": 0.1876068376068376, \"cost\": 0.0014679629999999996, \"time\": 77.99199786186219}, \"6a8a675442\": 
{\"quality\": 0.22094017094017093, \"cost\": 0.005341439999999999, \"time\": 90.32394058704377}, \"6aac59742a\": {\"quality\": 0.0, \"cost\": 0.009132285, \"time\": 137.44288988113402}, \"6ac193c88f\": {\"quality\": 0.25085470085470085, \"cost\": 0.006668448, \"time\": 121.86276133060456}, \"6ae9e9de0b\": {\"quality\": 0.21752136752136753, \"cost\": 0.0077003290000000005, \"time\": 125.54181215763091}, \"6b0c585f5c\": {\"quality\": 0.21752136752136753, \"cost\": 0.006663018, \"time\": 122.58973615169526}, \"6b3c16def2\": {\"quality\": 0.18482905982905984, \"cost\": 0.001533927, \"time\": 51.53218412399292}, \"6c05c47050\": {\"quality\": 0.1987179487179487, \"cost\": 0.003248111999999999, \"time\": 96.47181532382965}, \"6c3667811b\": {\"quality\": 0.21474358974358973, \"cost\": 0.005244696, \"time\": 89.6854020357132}, \"6cc813aa68\": {\"quality\": 0.14807692307692308, \"cost\": 0.005938046000000001, \"time\": 58.51538195610046}, \"6cd78cac7e\": {\"quality\": 0.09444444444444444, \"cost\": 0.007387408, \"time\": 101.36276533603669}, \"6d20c6ace0\": {\"quality\": 0.21752136752136753, \"cost\": 0.024144216, \"time\": 121.96203644275664}, \"6d67c56ba6\": {\"quality\": 0.17307692307692307, \"cost\": 0.009393948, \"time\": 150.0689915418625}, \"6db70dc3b6\": {\"quality\": 0.04038461538461539, \"cost\": 0.0015085300000000001, \"time\": 94.44009244441986}, \"6e0690f576\": {\"quality\": 0.0, \"cost\": 0.001440486, \"time\": 74.4156935930252}, \"6e3db7ec5e\": {\"quality\": 0.21752136752136753, \"cost\": 0.009058176, \"time\": 141.13843698501586}, \"6e62bbb47f\": {\"quality\": 0.15982905982905982, \"cost\": 0.0032870549999999997, \"time\": 139.53261284828187}, \"6e859bfae6\": {\"quality\": 0.05, \"cost\": 0.004340145, \"time\": 162.00480234622955}, \"6e93514f45\": {\"quality\": 0.0, \"cost\": 0.0047780819999999995, \"time\": 95.28718383312224}, \"6eae47102b\": {\"quality\": 0.21752136752136753, \"cost\": 0.008071342, \"time\": 93.64182848930359}, \"6ecf93c479\": {\"quality\": 
0.17307692307692307, \"cost\": 0.0028209449999999996, \"time\": 103.17148315906525}, \"6ef3b7127e\": {\"quality\": 0.21752136752136753, \"cost\": 0.006768256, \"time\": 68.10584411621093}, \"6f323f80c7\": {\"quality\": 0.0, \"cost\": 0.00037804, \"time\": 104.40542347431182}, \"6f60a05c33\": {\"quality\": 0.15833333333333333, \"cost\": 0.0037825470000000003, \"time\": 92.0291631937027}, \"6fbdd8b57c\": {\"quality\": 0.1, \"cost\": 0.002123712, \"time\": 95.29761900901795}, \"6fd6046c4b\": {\"quality\": 0.18333333333333332, \"cost\": 0.005121717, \"time\": 132.3336772441864}, \"6fe0b3f929\": {\"quality\": 0.0, \"cost\": 0.006223385999999999, \"time\": 68.93117754459381}, \"6ff4f667f8\": {\"quality\": 0.19252136752136753, \"cost\": 0.006652730000000001, \"time\": 87.94405431747437}, \"700474dfbd\": {\"quality\": 0.15833333333333333, \"cost\": 0.0028382069999999997, \"time\": 102.84161474704743}, \"700ab1d309\": {\"quality\": 0.023076923076923078, \"cost\": 0.017122988, \"time\": 117.11154954433441}, \"7040e83d52\": {\"quality\": 0.24166666666666664, \"cost\": 0.019616124999999998, \"time\": 101.78968648910522}, \"704209377f\": {\"quality\": 0.0876068376068376, \"cost\": 0.005003547, \"time\": 138.1696399450302}, \"70b666e371\": {\"quality\": 0.18482905982905984, \"cost\": 0.004856766, \"time\": 104.8691722393036}, \"70c850e039\": {\"quality\": 0.18333333333333332, \"cost\": 0.002363835, \"time\": 67.29307141304017}, \"7112a7e64c\": {\"quality\": 0.25085470085470085, \"cost\": 0.0058677550000000005, \"time\": 63.623961353302}, \"7114013f0c\": {\"quality\": 0.16538461538461538, \"cost\": 0.0022517069999999995, \"time\": 125.79093027114868}, \"715070d0ca\": {\"quality\": 0.025, \"cost\": 0.006732683999999999, \"time\": 135.16606471538546}, \"71b615468b\": {\"quality\": 0.09252136752136753, \"cost\": 0.010796074, \"time\": 94.77488939762115}, \"71ed893462\": {\"quality\": 0.19444444444444442, \"cost\": 0.011514972000000002, \"time\": 137.31028594970704}, \"722d41b2f8\": 
{\"quality\": 0.125, \"cost\": 0.006919008000000001, \"time\": 110.66710319519044}, \"723fd5589a\": {\"quality\": 0.125, \"cost\": 0.012675105000000002, \"time\": 105.89904611110688}, \"7250da0f41\": {\"quality\": 0.0, \"cost\": 0.0072467339999999995, \"time\": 74.25018970966339}, \"7260a96349\": {\"quality\": 0.10641025641025642, \"cost\": 0.006264144, \"time\": 160.2630994796753}, \"7347cf0308\": {\"quality\": 0.0, \"cost\": 0.003343845, \"time\": 72.2089759349823}, \"736e652158\": {\"quality\": 0.0, \"cost\": 0.005920191, \"time\": 146.5679278612137}, \"738364d6a2\": {\"quality\": 0.0, \"cost\": 0.009448194, \"time\": 159.01038644313812}, \"739b1f81dc\": {\"quality\": 0.1987179487179487, \"cost\": 0.0017973389999999998, \"time\": 73.72460424900055}, \"73c6240c29\": {\"quality\": 0.1, \"cost\": 0.005322242999999999, \"time\": 159.0038095474243}, \"73fc2767b9\": {\"quality\": 0.2064102564102564, \"cost\": 0.0037065989999999997, \"time\": 126.9765938282013}, \"7445d99939\": {\"quality\": 0.1814102564102564, \"cost\": 0.007021181999999999, \"time\": 111.21800615787507}, \"7466a5f424\": {\"quality\": 0.3175213675213675, \"cost\": 0.015386124, \"time\": 150.73878495693208}, \"74cc4b1bc4\": {\"quality\": 0.04038461538461539, \"cost\": 0.006254895, \"time\": 117.36351308822631}, \"74d7f64b8c\": {\"quality\": 0.0, \"cost\": 0.006549831000000001, \"time\": 151.8392071723938}, \"751869cbec\": {\"quality\": 0.015384615384615385, \"cost\": 0.007969879, \"time\": 134.38790624141694}, \"7524905580\": {\"quality\": 0.2098290598290598, \"cost\": 0.004221619, \"time\": 93.99037828445435}, \"752d9649f2\": {\"quality\": 0.25085470085470085, \"cost\": 0.01328176, \"time\": 121.08108768463134}, \"7558c9722d\": {\"quality\": 0.2064102564102564, \"cost\": 0.008134605999999999, \"time\": 120.57110471725464}, \"75ca9cd4f8\": {\"quality\": 0.1876068376068376, \"cost\": 0.0004974599999999999, \"time\": 68.25030062198638}, \"7604c0aa13\": {\"quality\": 0.015384615384615385, \"cost\": 
0.010546463999999998, \"time\": 90.40108380317687}, \"765dbc6ad5\": {\"quality\": 0.21752136752136753, \"cost\": 0.003435049, \"time\": 94.19224355220794}, \"7707e6e7e3\": {\"quality\": 0.11474358974358972, \"cost\": 0.010274625999999999, \"time\": 94.13086493015288}, \"7765576286\": {\"quality\": 0.2098290598290598, \"cost\": 0.008545683, \"time\": 127.77431318759919}, \"77983b6105\": {\"quality\": 0.0, \"cost\": 0.002284218, \"time\": 68.26882412433625}, \"77b5740025\": {\"quality\": 0.015384615384615385, \"cost\": 0.006498776, \"time\": 106.45389134883881}, \"77c02b00c1\": {\"quality\": 0.257051282051282, \"cost\": 0.010423473999999999, \"time\": 126.61633729934692}, \"77c6a9703a\": {\"quality\": 0.2098290598290598, \"cost\": 0.007362192, \"time\": 108.2735918521881}, \"77f293b737\": {\"quality\": 0.21752136752136753, \"cost\": 0.009682152999999999, \"time\": 136.6608712911606}, \"7801da66b9\": {\"quality\": 0.0, \"cost\": 0.004307142, \"time\": 92.96408789157867}, \"7862ea67cb\": {\"quality\": 0.15149572649572648, \"cost\": 0.0014619539999999997, \"time\": 74.10243089199066}, \"786e5d0af5\": {\"quality\": 0.22094017094017093, \"cost\": 0.009905354999999998, \"time\": 129.44383957386017}, \"795d119bc7\": {\"quality\": 0.0, \"cost\": 0.006895664000000001, \"time\": 81.85152049064637}, \"7989343d94\": {\"quality\": 0.11666666666666667, \"cost\": 0.0030767639999999996, \"time\": 104.29451589584352}, \"79fad58f07\": {\"quality\": 0.21752136752136753, \"cost\": 0.002502802, \"time\": 51.48524146080017}, \"7a207b42a8\": {\"quality\": 0.023076923076923078, \"cost\": 0.008630430000000001, \"time\": 131.3380735397339}, \"7a58d3472b\": {\"quality\": 0.06752136752136753, \"cost\": 0.012301829, \"time\": 96.64494693279266}, \"7a7cc658c8\": {\"quality\": 0.06944444444444445, \"cost\": 0.009218845, \"time\": 130.32710280418394}, \"7b024a2966\": {\"quality\": 0.14807692307692308, \"cost\": 0.009636592999999999, \"time\": 127.10254328250885}, \"7b6dc3702e\": {\"quality\": 
0.11752136752136753, \"cost\": 0.01032963, \"time\": 98.65728087425232}, \"7b6f44618e\": {\"quality\": 0.21752136752136753, \"cost\": 0.012669284, \"time\": 119.66786665916442}, \"7b74b23910\": {\"quality\": 0.06538461538461539, \"cost\": 0.012630904, \"time\": 93.18642947673797}, \"7b9cc96081\": {\"quality\": 0.14807692307692308, \"cost\": 0.000405328, \"time\": 79.7800312757492}, \"7c45c61d8d\": {\"quality\": 0.21752136752136753, \"cost\": 0.006698508, \"time\": 122.86696994304657}, \"7c89a2b69e\": {\"quality\": 0.025, \"cost\": 0.012249355, \"time\": 127.1623036623001}, \"7d44f0959d\": {\"quality\": 0.21752136752136753, \"cost\": 0.006413139, \"time\": 93.95755279064178}, \"7d60c38c5c\": {\"quality\": 0.2098290598290598, \"cost\": 0.0017982599999999999, \"time\": 60.89626235961914}, \"7d67e14414\": {\"quality\": 0.11752136752136753, \"cost\": 0.0037651139999999995, \"time\": 123.35564484596253}, \"7daf7ff182\": {\"quality\": 0.13974358974358975, \"cost\": 0.011279536, \"time\": 93.00705258846284}, \"7e0ad1c9c1\": {\"quality\": 0.07371794871794872, \"cost\": 0.00508167, \"time\": 85.57242858409882}, \"7ed07ad40a\": {\"quality\": 0.2098290598290598, \"cost\": 0.00644937, \"time\": 84.29863061904908}, \"7fa67a7656\": {\"quality\": 0.07307692307692308, \"cost\": 0.007675500000000001, \"time\": 69.2791738986969}, \"806881adcb\": {\"quality\": 0.2064102564102564, \"cost\": 0.008654110999999999, \"time\": 86.79695003032684}, \"80a1d9c2f3\": {\"quality\": 0.15833333333333333, \"cost\": 0.005246394, \"time\": 105.42429778575897}, \"80be7df955\": {\"quality\": 0.0, \"cost\": 0.0014326019999999998, \"time\": 58.2230907201767}, \"80bf60c422\": {\"quality\": 0.16538461538461538, \"cost\": 0.001969395, \"time\": 61.00985796451569}, \"81333c7a33\": {\"quality\": 0.1987179487179487, \"cost\": 0.018448848000000004, \"time\": 86.57931089401245}, \"813e75210b\": {\"quality\": 0.23632478632478632, \"cost\": 0.002293942, \"time\": 33.89890332221985}, \"815e7116df\": {\"quality\": 
0.041025641025641026, \"cost\": 0.00256722, \"time\": 102.2627355337143}, \"816068ff07\": {\"quality\": 0.15833333333333333, \"cost\": 0.0041238989999999994, \"time\": 127.76587257385255}, \"81a4f42fd9\": {\"quality\": 0.21752136752136753, \"cost\": 0.005694090000000001, \"time\": 59.030240178108215}, \"828ccea2d3\": {\"quality\": 0.04038461538461539, \"cost\": 0.008288664, \"time\": 109.34065868854523}, \"829df73946\": {\"quality\": 0.2098290598290598, \"cost\": 0.002660831, \"time\": 58.90988037586212}, \"831728b179\": {\"quality\": 0.0, \"cost\": 0.005039063999999999, \"time\": 71.040651845932}, \"831e8b8be5\": {\"quality\": 0.1987179487179487, \"cost\": 0.002362275, \"time\": 99.75399866104127}, \"8357183895\": {\"quality\": 0.0, \"cost\": 0.008517975, \"time\": 130.41329088211057}, \"8392a6083a\": {\"quality\": 0.2098290598290598, \"cost\": 0.013261482, \"time\": 129.61621696949004}, \"83b244c163\": {\"quality\": 0.2098290598290598, \"cost\": 0.0017275300000000001, \"time\": 63.67854707241058}, \"83c9e66ec6\": {\"quality\": 0.2237179487179487, \"cost\": 0.007498675, \"time\": 99.27389492988587}, \"842c0d1062\": {\"quality\": 0.20705128205128204, \"cost\": 0.005142987, \"time\": 166.0325870513916}, \"846bed2aa7\": {\"quality\": 0.24252136752136752, \"cost\": 0.006434445, \"time\": 98.26402361392975}, \"847fd49235\": {\"quality\": 0.17863247863247866, \"cost\": 0.004663961999999999, \"time\": 112.9008763551712}, \"84dc98be95\": {\"quality\": 0.19444444444444442, \"cost\": 0.007237588999999999, \"time\": 108.35145225524903}, \"8519bef585\": {\"quality\": 0.015384615384615385, \"cost\": 0.0024311339999999997, \"time\": 63.10895071029663}, \"8572c6af3a\": {\"quality\": 0.22863247863247865, \"cost\": 0.005362141000000001, \"time\": 159.52487354278566}, \"85c94a5505\": {\"quality\": 0.05, \"cost\": 0.0025202259999999995, \"time\": 34.404055738449095}, \"85e8eaed6e\": {\"quality\": 0.06538461538461539, \"cost\": 0.004890077999999999, \"time\": 158.11657013893125}, 
\"862183bfb9\": {\"quality\": 0.21752136752136753, \"cost\": 0.007488814999999999, \"time\": 120.0406931400299}, \"8668f65f05\": {\"quality\": 0.21752136752136753, \"cost\": 0.009651689000000001, \"time\": 144.70214662551882}, \"870e2f87b4\": {\"quality\": 0.21752136752136753, \"cost\": 0.013074957000000002, \"time\": 150.40512261390685}, \"87c1b31c82\": {\"quality\": 0.15982905982905982, \"cost\": 0.011328134, \"time\": 152.22725927829742}, \"88436e05a9\": {\"quality\": 0.09252136752136753, \"cost\": 0.010687336, \"time\": 153.95657515525818}, \"887ad124e1\": {\"quality\": 0.1987179487179487, \"cost\": 0.0030405569999999997, \"time\": 113.64234898090362}, \"8886cb3082\": {\"quality\": 0.2098290598290598, \"cost\": 0.009527662999999999, \"time\": 144.03201706409453}, \"8940398bf1\": {\"quality\": 0.04038461538461539, \"cost\": 0.001635585, \"time\": 86.64000837802887}, \"8941621423\": {\"quality\": 0.1737179487179487, \"cost\": 0.006337113, \"time\": 113.57799446582794}, \"8961e4d901\": {\"quality\": 0.0, \"cost\": 0.007177176, \"time\": 104.54208600521088}, \"8974aa89a0\": {\"quality\": 0.2098290598290598, \"cost\": 0.001461674, \"time\": 45.66577224731445}, \"89a289907e\": {\"quality\": 0.25085470085470085, \"cost\": 0.008718126999999999, \"time\": 113.69486925601959}, \"89a35a09b1\": {\"quality\": 0.21752136752136753, \"cost\": 0.007729346999999999, \"time\": 109.72029886245727}, \"89bc21961a\": {\"quality\": 0.18482905982905984, \"cost\": 0.000487962, \"time\": 54.24717800617218}, \"89fbefd150\": {\"quality\": 0.23974358974358972, \"cost\": 0.00825181, \"time\": 111.73369183540345}, \"8a37c82283\": {\"quality\": 0.23974358974358972, \"cost\": 0.007666522, \"time\": 83.14605205059051}, \"8aaadb8649\": {\"quality\": 0.025, \"cost\": 0.006027930000000001, \"time\": 66.97984294891357}, \"8acd758b7f\": {\"quality\": 0.2098290598290598, \"cost\": 0.004582077, \"time\": 86.44987049102784}, \"8b721bbc6f\": {\"quality\": 0.21752136752136753, \"cost\": 
0.007828845000000001, \"time\": 117.83376302719117}, \"8b90f4b639\": {\"quality\": 0.0, \"cost\": 0.00041072999999999994, \"time\": 49.55189027786255}, \"8bbbe0f52a\": {\"quality\": 0.1814102564102564, \"cost\": 0.006133866, \"time\": 83.5189700126648}, \"8bc184f385\": {\"quality\": 0.1702991452991453, \"cost\": 0.007205481, \"time\": 86.5741204738617}, \"8bf5c3eadc\": {\"quality\": 0.025, \"cost\": 0.006170141999999999, \"time\": 83.84268100261687}, \"8c195addc7\": {\"quality\": 0.16944444444444445, \"cost\": 0.01389775, \"time\": 90.39201626777648}, \"8c9881972c\": {\"quality\": 0.0626068376068376, \"cost\": 0.0028528439999999998, \"time\": 86.04524366855621}, \"8cf8b81d84\": {\"quality\": 0.125, \"cost\": 0.0008481899999999999, \"time\": 79.2455335855484}, \"8d79e03266\": {\"quality\": 0.21752136752136753, \"cost\": 0.008664971, \"time\": 108.97852082252503}, \"8d90814b94\": {\"quality\": 0.21752136752136753, \"cost\": 0.009054151, \"time\": 130.47983191013338}, \"8e33fac90f\": {\"quality\": 0.30982905982905984, \"cost\": 0.010998730000000002, \"time\": 130.90265057086944}, \"8e5842ccbd\": {\"quality\": 0.14038461538461539, \"cost\": 0.003239019, \"time\": 70.15636944770813}, \"8f4caddfe6\": {\"quality\": 0.21752136752136753, \"cost\": 0.014952506, \"time\": 105.19670691490174}, \"8f4edde3f0\": {\"quality\": 0.3175213675213675, \"cost\": 0.017418426, \"time\": 126.13828411102295}, \"900a58f984\": {\"quality\": 0.2064102564102564, \"cost\": 0.006433365999999999, \"time\": 98.91697227954865}, \"9025e2480f\": {\"quality\": 0.1987179487179487, \"cost\": 0.006619497, \"time\": 78.82729182243347}, \"9028588af4\": {\"quality\": 0.3175213675213675, \"cost\": 0.00501719, \"time\": 37.611924695968625}, \"9059fd80ad\": {\"quality\": 0.04038461538461539, \"cost\": 0.0018876219999999998, \"time\": 75.3997132062912}, \"90d5e40c1b\": {\"quality\": 0.0, \"cost\": 0.006230459999999999, \"time\": 77.66650586128236}, \"90d9a86a2a\": {\"quality\": 0.0, \"cost\": 0.001509345, 
\"time\": 48.08298280239106}, \"90ed9312e1\": {\"quality\": 0.16944444444444445, \"cost\": 0.008080153999999999, \"time\": 122.39040281772614}, \"90ff13783c\": {\"quality\": 0.24316239316239316, \"cost\": 0.010701109, \"time\": 137.6493337869644}, \"90ff8eb055\": {\"quality\": 0.0702991452991453, \"cost\": 0.009148667999999999, \"time\": 137.7864047050476}, \"9104e31369\": {\"quality\": 0.06944444444444445, \"cost\": 0.012645782999999999, \"time\": 140.74590210914613}, \"918983323f\": {\"quality\": 0.12307692307692308, \"cost\": 0.0011798999999999998, \"time\": 42.641357612609866}, \"91beb0cac1\": {\"quality\": 0.125, \"cost\": 0.000758454, \"time\": 83.41916146278382}, \"91c800af6b\": {\"quality\": 0.04038461538461539, \"cost\": 0.011276811, \"time\": 143.70840940475466}, \"91dd8884db\": {\"quality\": 0.20705128205128204, \"cost\": 0.005775448000000001, \"time\": 141.27923700809478}, \"924f128b3c\": {\"quality\": 0.21752136752136753, \"cost\": 0.013985312, \"time\": 127.0183181285858}, \"9288642e53\": {\"quality\": 0.1, \"cost\": 0.006031014000000001, \"time\": 71.42156167030335}, \"92c4137fb1\": {\"quality\": 0.25085470085470085, \"cost\": 0.006908012999999999, \"time\": 145.73949379920958}, \"92c9dcd43b\": {\"quality\": 0.21752136752136753, \"cost\": 0.008693151, \"time\": 141.44435067176818}, \"92f45e5cc7\": {\"quality\": 0.23974358974358972, \"cost\": 0.0006816839999999999, \"time\": 69.65455045700074}, \"93011c0821\": {\"quality\": 0.16752136752136754, \"cost\": 0.012118556999999999, \"time\": 129.1434654712677}, \"9303149ba4\": {\"quality\": 0.04807692307692308, \"cost\": 0.007609685, \"time\": 148.0766399145126}, \"933b4d17dd\": {\"quality\": 0.22863247863247865, \"cost\": 0.004478119000000001, \"time\": 98.24304263591766}, \"94010928c6\": {\"quality\": 0.1814102564102564, \"cost\": 0.006916301999999999, \"time\": 109.81094932556152}, \"9403809e44\": {\"quality\": 0.21752136752136753, \"cost\": 0.011799824, \"time\": 132.59149780273435}, \"943baaea0c\": 
{\"quality\": 0.17307692307692307, \"cost\": 0.011566795000000001, \"time\": 117.94117500782014}, \"94569f177a\": {\"quality\": 0.015384615384615385, \"cost\": 0.002410395, \"time\": 125.42731764316558}, \"9466542023\": {\"quality\": 0.03333333333333333, \"cost\": 0.0057622739999999995, \"time\": 136.34905714988707}, \"947e28ef2e\": {\"quality\": 0.06538461538461539, \"cost\": 0.0031674240000000003, \"time\": 124.9436069726944}, \"94ac356663\": {\"quality\": 0.1737179487179487, \"cost\": 0.007041248, \"time\": 104.0579169511795}, \"94dff9a424\": {\"quality\": 0.1987179487179487, \"cost\": 0.005677325, \"time\": 105.50971393585205}, \"9508356a2e\": {\"quality\": 0.14807692307692308, \"cost\": 0.006912872999999999, \"time\": 129.4876657009125}, \"956bdcc254\": {\"quality\": 0.2064102564102564, \"cost\": 0.015445604, \"time\": 125.50174486637115}, \"9594b0c783\": {\"quality\": 0.2098290598290598, \"cost\": 0.003143199, \"time\": 109.28865358829498}, \"964c671f18\": {\"quality\": 0.2098290598290598, \"cost\": 0.011780327, \"time\": 147.56322202682497}, \"9679fe2b69\": {\"quality\": 0.4098290598290598, \"cost\": 0.003103125, \"time\": 112.40209789276122}, \"968fc95038\": {\"quality\": 0.21752136752136753, \"cost\": 0.00356376, \"time\": 109.65150537490845}, \"96b487c724\": {\"quality\": 0.07371794871794872, \"cost\": 0.0006113920000000001, \"time\": 69.84607322216033}, \"96e85f9af4\": {\"quality\": 0.058333333333333334, \"cost\": 0.0037689600000000005, \"time\": 115.5232797384262}, \"96f87d6483\": {\"quality\": 0.19444444444444442, \"cost\": 0.012450801, \"time\": 146.1588816165924}, \"972c83b002\": {\"quality\": 0.14807692307692308, \"cost\": 0.007372287, \"time\": 112.55022113323211}, \"975bc44958\": {\"quality\": 0.24871794871794872, \"cost\": 0.0024022649999999998, \"time\": 111.18246881961822}, \"977a4d6b6b\": {\"quality\": 0.33974358974358976, \"cost\": 0.013285368000000002, \"time\": 138.42745580673215}, \"97ad4cd41a\": {\"quality\": 0.025, \"cost\": 
0.008055923000000001, \"time\": 147.88769648075103}, \"97bc30bd83\": {\"quality\": 0.19252136752136753, \"cost\": 0.010554706, \"time\": 107.51760149002075}, \"97e1d0db92\": {\"quality\": 0.06538461538461539, \"cost\": 0.004098666000000001, \"time\": 98.48549189567566}, \"981da9ba40\": {\"quality\": 0.09871794871794873, \"cost\": 0.008754687, \"time\": 150.4159719467163}, \"98c1ea89f3\": {\"quality\": 0.1987179487179487, \"cost\": 0.004756095, \"time\": 144.49644501209258}, \"98eca2c65c\": {\"quality\": 0.2098290598290598, \"cost\": 0.007576088, \"time\": 113.79488031864167}, \"98ecf1a157\": {\"quality\": 0.2098290598290598, \"cost\": 0.0006460169999999999, \"time\": 66.48974347114563}, \"9927dc270b\": {\"quality\": 0.22863247863247865, \"cost\": 0.008019299, \"time\": 142.80323901176453}, \"99546d91e4\": {\"quality\": 0.2098290598290598, \"cost\": 0.0023279909999999997, \"time\": 109.73371806144715}, \"99cb0ba736\": {\"quality\": 0.0, \"cost\": 0.001753854, \"time\": 80.39895787239075}, \"99e44ab9b2\": {\"quality\": 0.2064102564102564, \"cost\": 0.007417979999999999, \"time\": 142.21234567165374}, \"9a0145c9b5\": {\"quality\": 0.07371794871794872, \"cost\": 0.003074034, \"time\": 115.01807222366332}, \"9a57ea3f89\": {\"quality\": 0.1952991452991453, \"cost\": 0.007409801999999998, \"time\": 133.90688972473146}, \"9a8420a0b3\": {\"quality\": 0.0, \"cost\": 0.0068175779999999995, \"time\": 152.5392238378525}, \"9aa32e6c96\": {\"quality\": 0.21752136752136753, \"cost\": 0.01103525, \"time\": 141.08924725055692}, \"9aa4abfb50\": {\"quality\": 0.0, \"cost\": 0.0034209360000000003, \"time\": 89.89876456260681}, \"9ada932bf5\": {\"quality\": 0.21752136752136753, \"cost\": 0.008047750999999999, \"time\": 144.1339359998703}, \"9b6d4915f3\": {\"quality\": 0.0, \"cost\": 0.00463566, \"time\": 54.38807971477509}, \"9bae5bafc1\": {\"quality\": 0.18333333333333332, \"cost\": 0.013702148, \"time\": 98.45531895160676}, \"9c549db0a7\": {\"quality\": 0.0, \"cost\": 
0.010721709000000001, \"time\": 79.0128826379776}, \"9c595a2bc9\": {\"quality\": 0.025, \"cost\": 0.007724396, \"time\": 148.95668649673462}, \"9c85f8cfcb\": {\"quality\": 0.015384615384615385, \"cost\": 0.0023850119999999997, \"time\": 67.28472025394439}, \"9c8cc46e6c\": {\"quality\": 0.06538461538461539, \"cost\": 0.002001585, \"time\": 77.71750602722167}, \"9c97d35a30\": {\"quality\": 0.2098290598290598, \"cost\": 0.005077656, \"time\": 118.16123571395875}, \"9ca354a53e\": {\"quality\": 0.0, \"cost\": 0.003021384, \"time\": 101.3728716135025}, \"9ce2c3fd98\": {\"quality\": 0.0, \"cost\": 0.0047293560000000005, \"time\": 129.50395069122314}, \"9d18cd0737\": {\"quality\": 0.07307692307692308, \"cost\": 0.004835114999999999, \"time\": 82.27081375122071}, \"9d7142e7b4\": {\"quality\": 0.16944444444444445, \"cost\": 0.008467907, \"time\": 147.60730669498443}, \"9d778daa24\": {\"quality\": 0.03333333333333333, \"cost\": 0.007418613, \"time\": 153.25245223045349}, \"9e06360bc9\": {\"quality\": 0.2098290598290598, \"cost\": 0.008218711, \"time\": 113.20028014183045}, \"9fb157be35\": {\"quality\": 0.1, \"cost\": 0.008187586, \"time\": 116.65422949790954}, \"9fc44fdeb1\": {\"quality\": 0.0, \"cost\": 0.009764671999999999, \"time\": 163.913742351532}, \"9ffaa26d5a\": {\"quality\": 0.21752136752136753, \"cost\": 0.006876598000000001, \"time\": 107.02223331928253}, \"a041e7777a\": {\"quality\": 0.17649572649572648, \"cost\": 0.0054005519999999994, \"time\": 107.31747977733612}, \"a0b81be5b4\": {\"quality\": 0.21752136752136753, \"cost\": 0.002235789, \"time\": 84.43192894458771}, \"a0c85d260e\": {\"quality\": 0.21752136752136753, \"cost\": 0.013538011000000003, \"time\": 134.37746741771696}, \"a0dc9f50ac\": {\"quality\": 0.15833333333333333, \"cost\": 0.0058347719999999985, \"time\": 73.68754653930664}, \"a14c507393\": {\"quality\": 0.0, \"cost\": 0.0016790159999999998, \"time\": 85.27930269241332}, \"a1881eb481\": {\"quality\": 0.23974358974358972, \"cost\": 0.006974355, 
\"time\": 158.60932910442352}, \"a1bb32e6a1\": {\"quality\": 0.04038461538461539, \"cost\": 0.0019489619999999998, \"time\": 113.17418549060821}, \"a2347e8e9e\": {\"quality\": 0.08482905982905983, \"cost\": 0.003918969, \"time\": 112.96018514633178}, \"a2aa082d14\": {\"quality\": 0.0, \"cost\": 0.012548850000000002, \"time\": 97.43302309513092}, \"a2cd339ad9\": {\"quality\": 0.09252136752136753, \"cost\": 0.009654009, \"time\": 161.78442559242248}, \"a2fd03e6a5\": {\"quality\": 0.1876068376068376, \"cost\": 0.004650046, \"time\": 46.97034850120544}, \"a31e87d7cb\": {\"quality\": 0.2064102564102564, \"cost\": 0.00146718, \"time\": 73.75970520973206}, \"a344b2d79a\": {\"quality\": 0.2098290598290598, \"cost\": 0.008522825, \"time\": 167.94721865653992}, \"a3c0ea3342\": {\"quality\": 0.3175213675213675, \"cost\": 0.007620170000000001, \"time\": 166.26401495933533}, \"a456d75fef\": {\"quality\": 0.0, \"cost\": 0.003897582, \"time\": 149.68703374862673}, \"a457f6c300\": {\"quality\": 0.0, \"cost\": 0.007946964, \"time\": 158.9001291036606}, \"a4767e7679\": {\"quality\": 0.0, \"cost\": 0.002210106, \"time\": 111.21880123615264}, \"a47de025c8\": {\"quality\": 0.0, \"cost\": 0.003860679, \"time\": 106.15438146591185}, \"a4ad96343d\": {\"quality\": 0.23696581196581196, \"cost\": 0.009205054, \"time\": 161.23487601280215}, \"a515a9c8cc\": {\"quality\": 0.16944444444444445, \"cost\": 0.0021845939999999998, \"time\": 90.79483761787415}, \"a5949b76ec\": {\"quality\": 0.0, \"cost\": 0.0022742099999999996, \"time\": 77.40827825069428}, \"a5ae4dfe66\": {\"quality\": 0.1, \"cost\": 0.006415026, \"time\": 117.52768676280975}, \"a60dd076b8\": {\"quality\": 0.1564102564102564, \"cost\": 0.0024921179999999998, \"time\": 99.6796523809433}, \"a6297a6c56\": {\"quality\": 0.125, \"cost\": 0.0019724219999999997, \"time\": 83.12536749839782}, \"a6460dbb7c\": {\"quality\": 0.2098290598290598, \"cost\": 0.002657791, \"time\": 71.54174087047576}, \"a6796ed686\": {\"quality\": 
0.09444444444444444, \"cost\": 0.0032055929999999996, \"time\": 130.10447964668273}, \"a6d2b05ec8\": {\"quality\": 0.2098290598290598, \"cost\": 0.008059675, \"time\": 133.3959671497345}, \"a717c4c535\": {\"quality\": 0.11752136752136753, \"cost\": 0.012759294000000001, \"time\": 123.11250400543213}, \"a721cd9ebf\": {\"quality\": 0.21752136752136753, \"cost\": 0.006374966000000001, \"time\": 93.21334941387177}, \"a8090787b1\": {\"quality\": 0.2098290598290598, \"cost\": 0.004406745, \"time\": 124.60862724781036}, \"a854343d46\": {\"quality\": 0.225, \"cost\": 0.0063598100000000005, \"time\": 87.96014966964722}, \"a86b137d7f\": {\"quality\": 0.15, \"cost\": 0.0021869159999999997, \"time\": 58.69887585639954}, \"a88eb1493c\": {\"quality\": 0.0952991452991453, \"cost\": 0.004839059999999999, \"time\": 60.44982805252076}, \"a88fb984e3\": {\"quality\": 0.21752136752136753, \"cost\": 0.008328870999999998, \"time\": 131.90409784317018}, \"a94e2e5f57\": {\"quality\": 0.04807692307692308, \"cost\": 0.006114, \"time\": 102.40570263862611}, \"a95b4a6dd0\": {\"quality\": 0.0452991452991453, \"cost\": 0.008242788000000001, \"time\": 101.27096877098083}, \"a9621ea4e6\": {\"quality\": 0.22863247863247865, \"cost\": 0.00707055, \"time\": 100.2644911289215}, \"a96e22379d\": {\"quality\": 0.13482905982905985, \"cost\": 0.004674597, \"time\": 135.9614454984665}, \"a9721a0a50\": {\"quality\": 0.18482905982905984, \"cost\": 0.006856040000000001, \"time\": 74.35422718524933}, \"aa08180e36\": {\"quality\": 0.21752136752136753, \"cost\": 0.009903308, \"time\": 141.78917632102966}, \"aa38702a02\": {\"quality\": 0.058333333333333334, \"cost\": 0.004223736, \"time\": 108.32900211811065}, \"aa8187c023\": {\"quality\": 0.0, \"cost\": 0.006725136, \"time\": 115.54149260520936}, \"aadbfc418b\": {\"quality\": 0.04807692307692308, \"cost\": 0.003954492, \"time\": 115.65469679832458}, \"ab1c706436\": {\"quality\": 0.09871794871794873, \"cost\": 0.007816556999999998, \"time\": 157.3217898607254}, 
\"ab43b02cb0\": {\"quality\": 0.0, \"cost\": 0.010739148, \"time\": 88.10484538078308}, \"aba21780bc\": {\"quality\": 0.09594017094017093, \"cost\": 0.001212108, \"time\": 68.86885101795197}, \"abbca95f00\": {\"quality\": 0.1841880341880342, \"cost\": 0.00179952, \"time\": 77.29247143268586}, \"ac208e7a1d\": {\"quality\": 0.21752136752136753, \"cost\": 0.009940174000000001, \"time\": 106.78546745777129}, \"ac7fcf90e2\": {\"quality\": 0.059829059829059825, \"cost\": 0.008491644, \"time\": 163.5559736728668}, \"ac828ffe70\": {\"quality\": 0.025, \"cost\": 0.000761398, \"time\": 72.60451011657715}, \"ac9fdc1550\": {\"quality\": 0.2098290598290598, \"cost\": 0.012285289000000001, \"time\": 112.57146043777465}, \"ad328d5108\": {\"quality\": 0.1611111111111111, \"cost\": 0.001292802, \"time\": 134.6257879257202}, \"ad3efe44c3\": {\"quality\": 0.0, \"cost\": 0.002330118, \"time\": 83.25956127643585}, \"ad48432c22\": {\"quality\": 0.0, \"cost\": 0.009617565, \"time\": 165.95421760082246}, \"ad5187a390\": {\"quality\": 0.22863247863247865, \"cost\": 0.010695554, \"time\": 160.50709626674652}, \"ad6ebbba8d\": {\"quality\": 0.16752136752136754, \"cost\": 0.0075785209999999995, \"time\": 114.89612691402435}, \"ad90055ef6\": {\"quality\": 0.0, \"cost\": 0.003493998, \"time\": 103.62199125289916}, \"ad97c5cee6\": {\"quality\": 0.15149572649572648, \"cost\": 0.0022562339999999998, \"time\": 137.5054316520691}, \"adab1e0fb1\": {\"quality\": 0.3175213675213675, \"cost\": 0.00585581, \"time\": 77.00263237953186}, \"ae655ec593\": {\"quality\": 0.11666666666666667, \"cost\": 0.003635154, \"time\": 112.42764747142792}, \"ae94b172be\": {\"quality\": 0.1987179487179487, \"cost\": 0.007214128000000001, \"time\": 117.53457653522491}, \"af360c323c\": {\"quality\": 0.0, \"cost\": 0.0032059769999999996, \"time\": 116.28576538562774}, \"af90567194\": {\"quality\": 0.21752136752136753, \"cost\": 0.013109858999999998, \"time\": 144.84608309268953}, \"b0156bb6d2\": {\"quality\": 
0.25085470085470085, \"cost\": 0.011064174, \"time\": 144.33307461738588}, \"b03c31ca45\": {\"quality\": 0.09871794871794873, \"cost\": 0.007032179999999999, \"time\": 159.58066523075104}, \"b0530b98c3\": {\"quality\": 0.08333333333333334, \"cost\": 0.007914135, \"time\": 151.2534171819687}, \"b0948c05b6\": {\"quality\": 0.16538461538461538, \"cost\": 0.011687544000000001, \"time\": 77.90738432407379}, \"b18168b9c1\": {\"quality\": 0.1952991452991453, \"cost\": 0.007746042, \"time\": 102.36798930168152}, \"b1cf8d33e5\": {\"quality\": 0.0702991452991453, \"cost\": 0.007221741, \"time\": 96.63562579154969}, \"b1dcd7aa24\": {\"quality\": 0.2098290598290598, \"cost\": 0.009499193, \"time\": 142.64096357822416}, \"b214718d07\": {\"quality\": 0.16538461538461538, \"cost\": 0.002830191, \"time\": 105.70124731063842}, \"b28925e4b8\": {\"quality\": 0.06538461538461539, \"cost\": 0.011244171, \"time\": 117.42571413516998}, \"b2b057ba41\": {\"quality\": 0.21752136752136753, \"cost\": 0.0036646409999999997, \"time\": 104.08598182201385}, \"b2e063499d\": {\"quality\": 0.21752136752136753, \"cost\": 0.008730451, \"time\": 99.22217960357665}, \"b3369775dc\": {\"quality\": 0.2064102564102564, \"cost\": 0.012187132, \"time\": 128.45301840305328}, \"b35bf038c2\": {\"quality\": 0.14038461538461539, \"cost\": 0.0027645659999999996, \"time\": 118.8479066848755}, \"b363b25367\": {\"quality\": 0.10982905982905983, \"cost\": 0.00807849, \"time\": 150.05790014266967}, \"b3decd5c2f\": {\"quality\": 0.14722222222222223, \"cost\": 0.0054683579999999996, \"time\": 114.51114053726197}, \"b3f20b706d\": {\"quality\": 0.025, \"cost\": 0.002810052, \"time\": 90.74897117614745}, \"b4002173ee\": {\"quality\": 0.10641025641025642, \"cost\": 0.00532104, \"time\": 71.4259075164795}, \"b45fc30d81\": {\"quality\": 0.015384615384615385, \"cost\": 0.007027874999999999, \"time\": 148.3045286655426}, \"b4a259f6dd\": {\"quality\": 0.2098290598290598, \"cost\": 0.00846508, \"time\": 139.42276346683502}, 
\"b52cdb3c6d\": {\"quality\": 0.25085470085470085, \"cost\": 0.001953663, \"time\": 101.56977760791779}, \"b56c312eda\": {\"quality\": 0.21752136752136753, \"cost\": 0.008759253, \"time\": 135.296657371521}, \"b5a02bb8ab\": {\"quality\": 0.09252136752136753, \"cost\": 0.005621411999999999, \"time\": 126.08018836975097}, \"b5e2b41c1c\": {\"quality\": 0.1987179487179487, \"cost\": 0.004181511, \"time\": 97.78416235446929}, \"b64ddb14f9\": {\"quality\": 0.21752136752136753, \"cost\": 0.002377122, \"time\": 40.39987435340882}, \"b66118d5f2\": {\"quality\": 0.04038461538461539, \"cost\": 0.007001360999999999, \"time\": 141.72445845603943}, \"b67107a43e\": {\"quality\": 0.0, \"cost\": 0.005282744999999999, \"time\": 134.97200748920443}, \"b682a23b89\": {\"quality\": 0.2098290598290598, \"cost\": 0.004507057, \"time\": 94.8906276702881}, \"b690a1ddd6\": {\"quality\": 0.0, \"cost\": 0.004254936000000001, \"time\": 132.1497477531433}, \"b796b7ffd3\": {\"quality\": 0.21752136752136753, \"cost\": 0.008754787, \"time\": 134.97328844070432}, \"b7d0e8557f\": {\"quality\": 0.20982905982905983, \"cost\": 0.007772019, \"time\": 136.01008360385896}, \"b7f203a0bf\": {\"quality\": 0.09871794871794873, \"cost\": 0.0074011649999999995, \"time\": 142.89838807582856}, \"b81d5b2bd9\": {\"quality\": 0.04807692307692308, \"cost\": 0.008468160999999998, \"time\": 139.87127735614774}, \"b8343f05e1\": {\"quality\": 0.04722222222222222, \"cost\": 0.0071142839999999985, \"time\": 114.42319309711456}, \"b8ab3d2f25\": {\"quality\": 0.21752136752136753, \"cost\": 0.008466568, \"time\": 101.01810977458953}, \"b8b569172f\": {\"quality\": 0.10705128205128206, \"cost\": 0.010024281999999999, \"time\": 115.06630852222443}, \"b8b91e375d\": {\"quality\": 0.21752136752136753, \"cost\": 0.012645420000000001, \"time\": 146.17070028781893}, \"b8c685904d\": {\"quality\": 0.0, \"cost\": 0.0017556779999999999, \"time\": 98.82595324516296}, \"b8d1903276\": {\"quality\": 0.025, \"cost\": 0.004277349, \"time\": 
116.25438375473021}, \"b91e7fdb29\": {\"quality\": 0.0, \"cost\": 0.009623747999999998, \"time\": 172.33287427425387}, \"b9770c2261\": {\"quality\": 0.24871794871794872, \"cost\": 0.0018192599999999998, \"time\": 68.26051120758056}, \"b9bb1e6f8d\": {\"quality\": 0.21752136752136753, \"cost\": 0.012109797999999998, \"time\": 153.9570437669754}, \"b9da208432\": {\"quality\": 0.22094017094017093, \"cost\": 0.009226414999999998, \"time\": 152.71056313514708}, \"ba3223f6ac\": {\"quality\": 0.17307692307692307, \"cost\": 0.003906612, \"time\": 117.72781562805176}, \"bac3d23c31\": {\"quality\": 0.21752136752136753, \"cost\": 0.011714114000000001, \"time\": 149.72536618709563}, \"bb3ee18de1\": {\"quality\": 0.2064102564102564, \"cost\": 0.00687103, \"time\": 112.67253947257996}, \"bb6536b0ab\": {\"quality\": 0.12585470085470085, \"cost\": 0.010593946, \"time\": 116.58704767227172}, \"bb70f60bf1\": {\"quality\": 0.1952991452991453, \"cost\": 0.0029260320000000003, \"time\": 113.47788968086243}, \"bbba9dd6ae\": {\"quality\": 0.3175213675213675, \"cost\": 0.005866453000000001, \"time\": 81.5745332479477}, \"bbfba2f2ee\": {\"quality\": 0.0, \"cost\": 0.0043119479999999995, \"time\": 144.39048359394076}, \"bc3d02f753\": {\"quality\": 0.0, \"cost\": 0.002622999, \"time\": 100.0285652399063}, \"bc4c1fcc64\": {\"quality\": 0.21752136752136753, \"cost\": 0.005892321000000001, \"time\": 80.69622204303741}, \"bc60556255\": {\"quality\": 0.025, \"cost\": 0.000541426, \"time\": 90.43072290420533}, \"bcae7c2fc4\": {\"quality\": 0.0952991452991453, \"cost\": 0.006106959, \"time\": 118.90915808677673}, \"bcef42e3b0\": {\"quality\": 0.20705128205128204, \"cost\": 0.003036438, \"time\": 126.3029891014099}, \"bcf4bf7c35\": {\"quality\": 0.09252136752136753, \"cost\": 0.010682502, \"time\": 163.63021724224092}, \"bcfb273436\": {\"quality\": 0.2675213675213675, \"cost\": 0.005338022999999999, \"time\": 154.81484963893888}, \"bd99b2fb21\": {\"quality\": 0.21752136752136753, \"cost\": 
0.00784732, \"time\": 116.22147076129913}, \"bdf497196b\": {\"quality\": 0.21752136752136753, \"cost\": 0.00564954, \"time\": 81.09978458881378}, \"be2ae88f70\": {\"quality\": 0.0, \"cost\": 0.004892166, \"time\": 134.319625210762}, \"be4740f38f\": {\"quality\": 0.11752136752136753, \"cost\": 0.011724527999999998, \"time\": 111.63968887329102}, \"bed888d4dc\": {\"quality\": 0.12222222222222223, \"cost\": 0.007974548999999997, \"time\": 167.2181126832962}, \"bf2a5d2680\": {\"quality\": 0.08141025641025641, \"cost\": 0.005895564, \"time\": 116.7006804227829}, \"bf7b0a8dc1\": {\"quality\": 0.21752136752136753, \"cost\": 0.0072987239999999995, \"time\": 172.42680845260622}, \"bfed7670ed\": {\"quality\": 0.08482905982905983, \"cost\": 0.010263118, \"time\": 118.4290988445282}, \"c06b118e65\": {\"quality\": 0.20705128205128204, \"cost\": 0.005422905, \"time\": 111.81110198497773}, \"c08a5ad170\": {\"quality\": 0.19252136752136753, \"cost\": 0.008962954000000002, \"time\": 153.46751430034638}, \"c0d53a20de\": {\"quality\": 0.2098290598290598, \"cost\": 0.007020855, \"time\": 151.7084014415741}, \"c10e588987\": {\"quality\": 0.1814102564102564, \"cost\": 0.0070173779999999995, \"time\": 161.47480256557463}, \"c127509a7a\": {\"quality\": 0.30982905982905984, \"cost\": 0.012152782, \"time\": 114.87017834186554}, \"c13682c7c7\": {\"quality\": 0.21752136752136753, \"cost\": 0.011820473999999997, \"time\": 155.98555040359497}, \"c13d6e78e9\": {\"quality\": 0.2098290598290598, \"cost\": 0.015482613, \"time\": 135.19117062091829}, \"c145482664\": {\"quality\": 0.125, \"cost\": 0.011503632, \"time\": 157.4289677143097}, \"c14ff3144d\": {\"quality\": 0.15833333333333333, \"cost\": 0.001474128, \"time\": 73.17416946887971}, \"c186182658\": {\"quality\": 0.0702991452991453, \"cost\": 0.00651766, \"time\": 114.43175661563873}, \"c263c65d0a\": {\"quality\": 0.2098290598290598, \"cost\": 0.004897752, \"time\": 161.65783095359802}, \"c2949aa902\": {\"quality\": 0.2064102564102564, 
\"cost\": 0.006985964999999999, \"time\": 112.66284742355347}, \"c31c9d4d8c\": {\"quality\": 0.058333333333333334, \"cost\": 0.004036233, \"time\": 118.73210837841035}, \"c31e956b35\": {\"quality\": 0.10705128205128206, \"cost\": 0.008987883, \"time\": 156.68281931877135}, \"c339463a25\": {\"quality\": 0.11538461538461539, \"cost\": 0.0039407339999999996, \"time\": 142.9191876411438}, \"c36b525dde\": {\"quality\": 0.2098290598290598, \"cost\": 0.00295329, \"time\": 84.8311204433441}, \"c3d20f33bf\": {\"quality\": 0.15085470085470087, \"cost\": 0.009346874, \"time\": 155.06189365386962}, \"c3ec2cec59\": {\"quality\": 0.3175213675213675, \"cost\": 0.010893408, \"time\": 121.71417863368988}, \"c443a2c1fb\": {\"quality\": 0.09444444444444444, \"cost\": 0.008210058, \"time\": 158.09974863529203}, \"c4a80d19b3\": {\"quality\": 0.3175213675213675, \"cost\": 0.010017962999999998, \"time\": 152.05399761199953}, \"c4c2826afd\": {\"quality\": 0.14722222222222223, \"cost\": 0.008821047, \"time\": 150.96840312480924}, \"c4c94a5527\": {\"quality\": 0.14807692307692308, \"cost\": 0.007312788, \"time\": 118.44996173381804}, \"c4f3e7665d\": {\"quality\": 0.1, \"cost\": 0.004909806, \"time\": 122.77528305053711}, \"c58e9652b7\": {\"quality\": 0.03333333333333333, \"cost\": 0.004275492, \"time\": 163.37369763851166}, \"c5a16b834a\": {\"quality\": 0.24252136752136752, \"cost\": 0.009288249, \"time\": 145.43366010189055}, \"c5fbe2076f\": {\"quality\": 0.21752136752136753, \"cost\": 0.016178868, \"time\": 153.44052112102509}, \"c6198f364e\": {\"quality\": 0.21752136752136753, \"cost\": 0.008840571000000002, \"time\": 143.01735095977784}, \"c691570715\": {\"quality\": 0.025, \"cost\": 0.004222068, \"time\": 105.88032670021057}, \"c691a29c42\": {\"quality\": 0.06944444444444445, \"cost\": 0.014146588, \"time\": 136.24555165767669}, \"c6a339987c\": {\"quality\": 0.14807692307692308, \"cost\": 0.008871827000000002, \"time\": 124.18036108016967}, \"c6a4d256ce\": {\"quality\": 
0.11752136752136753, \"cost\": 0.011863845, \"time\": 131.42284343242645}, \"c76222087e\": {\"quality\": 0.10982905982905983, \"cost\": 0.007390348000000001, \"time\": 108.10401859283448}, \"c772ff3704\": {\"quality\": 0.10982905982905983, \"cost\": 0.008256624, \"time\": 122.45812864303589}, \"c7d4ff0c05\": {\"quality\": 0.22863247863247865, \"cost\": 0.006689247, \"time\": 135.3987812280655}, \"c823f7ab29\": {\"quality\": 0.19252136752136753, \"cost\": 0.006112934, \"time\": 110.04468183517456}, \"c82b926689\": {\"quality\": 0.21752136752136753, \"cost\": 0.011001155000000002, \"time\": 132.11898975372316}, \"c82f834e85\": {\"quality\": 0.21752136752136753, \"cost\": 0.007510131000000001, \"time\": 88.01416339874268}, \"c9320068f9\": {\"quality\": 0.11474358974358972, \"cost\": 0.008706341999999999, \"time\": 138.78494324684144}, \"c935a33384\": {\"quality\": 0.23974358974358972, \"cost\": 0.013361947999999998, \"time\": 128.84306230545045}, \"c99f3577c7\": {\"quality\": 0.09444444444444444, \"cost\": 0.012227506, \"time\": 131.7928378343582}, \"c9d32a0a82\": {\"quality\": 0.1, \"cost\": 0.00772979, \"time\": 150.54008300304412}, \"ca55c36c3f\": {\"quality\": 0.2098290598290598, \"cost\": 0.002033097, \"time\": 109.44059422016144}, \"caa7c0bd6b\": {\"quality\": 0.1737179487179487, \"cost\": 0.007166354999999999, \"time\": 108.27636570930481}, \"cac6b051e9\": {\"quality\": 0.2098290598290598, \"cost\": 0.007765482000000001, \"time\": 140.63682539463042}, \"cb19d631b2\": {\"quality\": 0.2098290598290598, \"cost\": 0.0074525500000000005, \"time\": 109.26569464206696}, \"cbb25fb322\": {\"quality\": 0.19444444444444442, \"cost\": 0.004088256, \"time\": 146.00942890644075}, \"cbb5eb0e74\": {\"quality\": 0.2098290598290598, \"cost\": 0.007899888, \"time\": 146.89739501476288}, \"cbc32cbeff\": {\"quality\": 0.2098290598290598, \"cost\": 0.008663157000000001, \"time\": 144.39441390037535}, \"cbd4461293\": {\"quality\": 0.09871794871794873, \"cost\": 0.0019465299999999997, 
\"time\": 53.78504421710968}, \"cbe2318045\": {\"quality\": 0.18482905982905984, \"cost\": 0.00824068, \"time\": 99.53602702617644}, \"cc20ebc768\": {\"quality\": 0.13974358974358975, \"cost\": 0.008763393999999999, \"time\": 151.47476081848146}, \"cc886fe337\": {\"quality\": 0.05, \"cost\": 0.006118446, \"time\": 104.02940430641175}, \"ccf72745c1\": {\"quality\": 0.21752136752136753, \"cost\": 0.00881063, \"time\": 104.33854112625122}, \"cd1d418732\": {\"quality\": 0.0, \"cost\": 0.005609648, \"time\": 51.80708644390106}, \"cd23c79db1\": {\"quality\": 0.0, \"cost\": 0.008759461999999999, \"time\": 130.19892246723174}, \"cd64fbfcd9\": {\"quality\": 0.02222222222222222, \"cost\": 0.006661377, \"time\": 132.06611769199372}, \"cda3e2a4e9\": {\"quality\": 0.19252136752136753, \"cost\": 0.003251412, \"time\": 130.11529834270476}, \"cdc9ce922f\": {\"quality\": 0.1876068376068376, \"cost\": 0.00361428, \"time\": 130.60064585208892}, \"cdf0df2f51\": {\"quality\": 0.16538461538461538, \"cost\": 0.005876955, \"time\": 170.43609290122987}, \"ce281875b4\": {\"quality\": 0.2098290598290598, \"cost\": 0.002271957, \"time\": 134.98369066715242}, \"ce4bc5f348\": {\"quality\": 0.04038461538461539, \"cost\": 0.0031947119999999997, \"time\": 97.7704703092575}, \"ce980cf86f\": {\"quality\": 0.125, \"cost\": 0.003918672, \"time\": 130.57520353794098}, \"cecca90dd2\": {\"quality\": 0.1987179487179487, \"cost\": 0.004469783999999999, \"time\": 136.02725315093994}, \"cece83de2d\": {\"quality\": 0.10982905982905983, \"cost\": 0.0060226260000000005, \"time\": 134.92675962448118}, \"cf51e0a888\": {\"quality\": 0.025, \"cost\": 0.007815034000000002, \"time\": 141.21634793281555}, \"cf9538faf0\": {\"quality\": 0.21752136752136753, \"cost\": 0.009526646, \"time\": 136.5719269990921}, \"cf9d2e224c\": {\"quality\": 0.11752136752136753, \"cost\": 0.005909796, \"time\": 82.14293384552002}, \"cfd36f3a8c\": {\"quality\": 0.2064102564102564, \"cost\": 0.015834602, \"time\": 133.97862401008604}, 
\"cffa29a6ef\": {\"quality\": 0.1987179487179487, \"cost\": 0.00053197, \"time\": 57.04082276821137}, \"d03596c3de\": {\"quality\": 0.23974358974358972, \"cost\": 0.007832440999999999, \"time\": 161.64306690692902}, \"d0f9633442\": {\"quality\": 0.0, \"cost\": 0.007817082, \"time\": 133.32561285495757}, \"d192872a51\": {\"quality\": 0.06944444444444445, \"cost\": 0.011960976000000002, \"time\": 151.55924673080443}, \"d1d953cac7\": {\"quality\": 0.023076923076923078, \"cost\": 0.0069616439999999995, \"time\": 129.82648131847384}, \"d2164c8c4c\": {\"quality\": 0.25085470085470085, \"cost\": 0.007942218000000001, \"time\": 106.56984751224518}, \"d216eab7d8\": {\"quality\": 0.23974358974358972, \"cost\": 0.008338906, \"time\": 110.89031755924225}, \"d266c19ac8\": {\"quality\": 0.18482905982905984, \"cost\": 0.005187306, \"time\": 137.3413758277893}, \"d2af24b59e\": {\"quality\": 0.19252136752136753, \"cost\": 0.008368941999999999, \"time\": 107.03122828006744}, \"d3a2d50bd7\": {\"quality\": 0.2098290598290598, \"cost\": 0.004220073, \"time\": 104.34660010337831}, \"d3db4cf84d\": {\"quality\": 0.0, \"cost\": 0.0043255260000000005, \"time\": 86.13068716526033}, \"d40174fb0b\": {\"quality\": 0.015384615384615385, \"cost\": 0.004637364, \"time\": 72.65830745697022}, \"d43fafa19e\": {\"quality\": 0.023076923076923078, \"cost\": 0.004984457999999999, \"time\": 69.79157807826996}, \"d48ead13da\": {\"quality\": 0.04807692307692308, \"cost\": 0.007356842000000001, \"time\": 108.29052205085753}, \"d5016f4538\": {\"quality\": 0.023076923076923078, \"cost\": 0.004924038, \"time\": 63.6337170124054}, \"d55e983189\": {\"quality\": 0.2098290598290598, \"cost\": 0.0023029619999999995, \"time\": 103.95245385169983}, \"d573c2a414\": {\"quality\": 0.04444444444444444, \"cost\": 0.003798615, \"time\": 146.4341695547104}, \"d58036ba66\": {\"quality\": 0.21752136752136753, \"cost\": 0.007977542, \"time\": 101.040083694458}, \"d59bacbfe0\": {\"quality\": 0.09252136752136753, \"cost\": 
0.009014214999999999, \"time\": 152.89927747249604}, \"d6040140b9\": {\"quality\": 0.21752136752136753, \"cost\": 0.013529567000000001, \"time\": 147.0174176454544}, \"d640edd7a7\": {\"quality\": 0.2064102564102564, \"cost\": 0.0030975000000000004, \"time\": 118.49127612113952}, \"d65185c1a4\": {\"quality\": 0.2098290598290598, \"cost\": 0.0048733019999999995, \"time\": 101.63841800689698}, \"d690b6d739\": {\"quality\": 0.13760683760683762, \"cost\": 0.0024723749999999997, \"time\": 115.86975026130676}, \"d6bd3b66ba\": {\"quality\": 0.0, \"cost\": 0.0037659959999999998, \"time\": 111.267901968956}, \"d6c4e48eeb\": {\"quality\": 0.21752136752136753, \"cost\": 0.008788099, \"time\": 116.89563345909119}, \"d6c60a5214\": {\"quality\": 0.17307692307692307, \"cost\": 0.0012564660000000001, \"time\": 69.47750086784363}, \"d6cbf265ee\": {\"quality\": 0.06538461538461539, \"cost\": 0.00160378, \"time\": 72.91289446353912}, \"d73a9aab4e\": {\"quality\": 0.04871794871794872, \"cost\": 0.0032188139999999995, \"time\": 79.5758284330368}, \"d752c30d07\": {\"quality\": 0.3175213675213675, \"cost\": 0.015617612, \"time\": 127.57117984294892}, \"d7c0972014\": {\"quality\": 0.19444444444444442, \"cost\": 0.0105163, \"time\": 80.39488637447357}, \"d7f6c0c9d4\": {\"quality\": 0.0, \"cost\": 0.002123865, \"time\": 94.49595766067506}, \"d813410e44\": {\"quality\": 0.04807692307692308, \"cost\": 0.002987295, \"time\": 133.61272318363189}, \"d87eb775da\": {\"quality\": 0.2098290598290598, \"cost\": 0.008013359000000001, \"time\": 119.17614867687226}, \"d8bab6c09b\": {\"quality\": 0.07307692307692308, \"cost\": 0.011752513, \"time\": 155.0189305305481}, \"d8bcac36e8\": {\"quality\": 0.025, \"cost\": 0.010723482, \"time\": 115.70462930202484}, \"d96677d8d4\": {\"quality\": 0.21752136752136753, \"cost\": 0.006440180999999999, \"time\": 108.42673063278198}, \"d9e2bb21a3\": {\"quality\": 0.19594017094017094, \"cost\": 0.005149365, \"time\": 107.66626167297363}, \"daaadadcc9\": {\"quality\": 
0.21752136752136753, \"cost\": 0.009858946, \"time\": 142.47570564746857}, \"daf855e065\": {\"quality\": 0.1987179487179487, \"cost\": 0.005835996000000001, \"time\": 153.1451871395111}, \"db6f7259cd\": {\"quality\": 0.17307692307692307, \"cost\": 0.007230853000000001, \"time\": 107.38680810928344}, \"db9060cd27\": {\"quality\": 0.025, \"cost\": 0.00183247, \"time\": 75.05918591022493}, \"dbce95a072\": {\"quality\": 0.10641025641025642, \"cost\": 0.0072439719999999996, \"time\": 107.68909630775451}, \"dbe7d818fa\": {\"quality\": 0.14444444444444443, \"cost\": 0.0020769660000000004, \"time\": 109.260413813591}, \"dc66bccb1c\": {\"quality\": 0.2098290598290598, \"cost\": 0.00518188, \"time\": 101.09860997200012}, \"dc90065dea\": {\"quality\": 0.11752136752136753, \"cost\": 0.010587499, \"time\": 108.61172733306884}, \"dc9a912501\": {\"quality\": 0.06538461538461539, \"cost\": 0.004132665, \"time\": 122.22055804729462}, \"dd0d70fedd\": {\"quality\": 0.1987179487179487, \"cost\": 0.0018010649999999997, \"time\": 72.4727225780487}, \"dd76899626\": {\"quality\": 0.0, \"cost\": 0.01104418, \"time\": 163.46480329036712}, \"dd9f5d1ba9\": {\"quality\": 0.1987179487179487, \"cost\": 0.003202473, \"time\": 168.22212414741517}, \"de1645053b\": {\"quality\": 0.21752136752136753, \"cost\": 0.009485828, \"time\": 156.01319019794465}, \"de18bf45e1\": {\"quality\": 0.03333333333333333, \"cost\": 0.001299202, \"time\": 79.47014775276185}, \"de1e56370f\": {\"quality\": 0.2098290598290598, \"cost\": 0.0039003330000000006, \"time\": 95.9440367937088}, \"deb84ddd06\": {\"quality\": 0.0, \"cost\": 0.006051836, \"time\": 89.6067531824112}, \"df2bb408cf\": {\"quality\": 0.125, \"cost\": 0.0037215599999999996, \"time\": 133.46323487758636}, \"df2ebe2c01\": {\"quality\": 0.1987179487179487, \"cost\": 0.0045562350000000005, \"time\": 174.54163272380828}, \"dfb8aebe38\": {\"quality\": 0.21752136752136753, \"cost\": 0.008408229, \"time\": 166.86121113300322}, \"dfce6153aa\": {\"quality\": 
0.08482905982905983, \"cost\": 0.00791217, \"time\": 165.6665771961212}, \"dfda94bd2a\": {\"quality\": 0.0, \"cost\": 0.0017028, \"time\": 85.28198800086975}, \"dff452a9ca\": {\"quality\": 0.0, \"cost\": 0.005131782, \"time\": 116.72450284957885}, \"e02f982a26\": {\"quality\": 0.2064102564102564, \"cost\": 0.00861959, \"time\": 133.95475313663482}, \"e06701b665\": {\"quality\": 0.07222222222222222, \"cost\": 0.005247648, \"time\": 155.51543612480162}, \"e18abd2ab0\": {\"quality\": 0.19252136752136753, \"cost\": 0.008510738, \"time\": 142.51286814212799}, \"e1e596ee1b\": {\"quality\": 0.06944444444444445, \"cost\": 0.009328152, \"time\": 170.3236501932144}, \"e20ba014a1\": {\"quality\": 0.25085470085470085, \"cost\": 0.010510256999999999, \"time\": 149.4374349117279}, \"e223700849\": {\"quality\": 0.10982905982905983, \"cost\": 0.002881368, \"time\": 120.50577642917634}, \"e26c7bfbdb\": {\"quality\": 0.08141025641025641, \"cost\": 0.007298063999999999, \"time\": 115.83573853969574}, \"e35e5f81a7\": {\"quality\": 0.0, \"cost\": 0.006075344999999999, \"time\": 117.27980465888976}, \"e3cdc0d870\": {\"quality\": 0.09444444444444444, \"cost\": 0.00339135, \"time\": 131.6535663843155}, \"e3df4cf041\": {\"quality\": 0.06538461538461539, \"cost\": 0.007586271, \"time\": 106.25668971538545}, \"e3eca9854c\": {\"quality\": 0.10705128205128206, \"cost\": 0.004585331999999999, \"time\": 159.58829066753387}, \"e47dc3abca\": {\"quality\": 0.09252136752136753, \"cost\": 0.011292871999999999, \"time\": 107.18091866970062}, \"e495ff601f\": {\"quality\": 0.14807692307692308, \"cost\": 0.007825082, \"time\": 156.01348607540132}, \"e4b9d4fb41\": {\"quality\": 0.14038461538461539, \"cost\": 0.007481999, \"time\": 112.45064301490783}, \"e510bda989\": {\"quality\": 0.2098290598290598, \"cost\": 0.005625576, \"time\": 35.45525462627411}, \"e515bc1935\": {\"quality\": 0.2098290598290598, \"cost\": 0.00642271, \"time\": 124.40159273147583}, \"e517cd2222\": {\"quality\": 0.1737179487179487, 
\"cost\": 0.0015429100000000002, \"time\": 54.5986225605011}, \"e51b01f418\": {\"quality\": 0.1952991452991453, \"cost\": 0.005621012999999999, \"time\": 119.1760358095169}, \"e520dfae5b\": {\"quality\": 0.2098290598290598, \"cost\": 0.005809304999999999, \"time\": 154.96505286693574}, \"e53906f84b\": {\"quality\": 0.17649572649572648, \"cost\": 0.003217785, \"time\": 119.95617558956147}, \"e53b349cce\": {\"quality\": 0.04807692307692308, \"cost\": 0.006779802, \"time\": 116.54631357192993}, \"e5401ed278\": {\"quality\": 0.1987179487179487, \"cost\": 0.009414028, \"time\": 154.65711226463316}, \"e56a16ca66\": {\"quality\": 0.3237179487179487, \"cost\": 0.001822221, \"time\": 82.61048684120178}, \"e5a2f72b30\": {\"quality\": 0.2098290598290598, \"cost\": 0.008565615, \"time\": 130.87860839366914}, \"e5a70a13ac\": {\"quality\": 0.19252136752136753, \"cost\": 0.007232057000000002, \"time\": 115.96633384227752}, \"e5fdeb4de9\": {\"quality\": 0.1987179487179487, \"cost\": 0.0035100910000000003, \"time\": 106.90362923145295}, \"e609601eee\": {\"quality\": 0.13333333333333333, \"cost\": 0.001958406, \"time\": 89.7593854188919}, \"e6f141cc8f\": {\"quality\": 0.1737179487179487, \"cost\": 0.007898352, \"time\": 159.6583716392517}, \"e7525c117a\": {\"quality\": 0.125, \"cost\": 0.003349254, \"time\": 129.90280907154084}, \"e7e94ab7a5\": {\"quality\": 0.1, \"cost\": 0.001623004, \"time\": 95.54465639591217}, \"e86bd256d6\": {\"quality\": 0.12863247863247865, \"cost\": 0.003572466, \"time\": 121.87137434482574}, \"e89283a4d9\": {\"quality\": 0.17222222222222222, \"cost\": 0.0017132999999999998, \"time\": 125.78474328517913}, \"e9befb80e0\": {\"quality\": 0.10982905982905983, \"cost\": 0.0064075799999999995, \"time\": 125.61113278865814}, \"ea38031fc1\": {\"quality\": 0.08482905982905983, \"cost\": 0.007648266000000001, \"time\": 158.89819700717925}, \"ea6ecc5653\": {\"quality\": 0.0626068376068376, \"cost\": 0.007534413, \"time\": 126.29475154876708}, \"ea8bcb3ae2\": 
{\"quality\": 0.16944444444444445, \"cost\": 0.01714652, \"time\": 120.57036209106445}, \"ea91a2e78b\": {\"quality\": 0.15, \"cost\": 0.005214324, \"time\": 129.3872970342636}, \"eac300f0d1\": {\"quality\": 0.1737179487179487, \"cost\": 0.008836863, \"time\": 174.77539343833922}, \"ebaa9b1297\": {\"quality\": 0.2098290598290598, \"cost\": 0.001620009, \"time\": 84.55729002952575}, \"ebbe8b6c4f\": {\"quality\": 0.0, \"cost\": 0.006925624, \"time\": 89.16193358898164}, \"ebdf3abff2\": {\"quality\": 0.10982905982905983, \"cost\": 0.013354845, \"time\": 123.62594261169434}, \"ebee1f2761\": {\"quality\": 0.22863247863247865, \"cost\": 0.004503177, \"time\": 143.60831434726714}, \"ec8844a5ae\": {\"quality\": 0.025, \"cost\": 0.00061108, \"time\": 93.01526029109955}, \"ecb5f78f37\": {\"quality\": 0.0, \"cost\": 0.010665879, \"time\": 184.15539872646332}, \"ece7ff5129\": {\"quality\": 0.09871794871794873, \"cost\": 0.006345035999999999, \"time\": 107.36751945018767}, \"ed60b8cac5\": {\"quality\": 0.025, \"cost\": 0.005388696, \"time\": 182.1943165063858}, \"ed6b5480a5\": {\"quality\": 0.0702991452991453, \"cost\": 0.004838202, \"time\": 76.91644642353057}, \"eda630dc85\": {\"quality\": 0.21752136752136753, \"cost\": 0.0075326360000000005, \"time\": 115.44470648765564}, \"edaaee5ed4\": {\"quality\": 0.07307692307692308, \"cost\": 0.006511264000000001, \"time\": 83.00507278442383}, \"edb2b764aa\": {\"quality\": 0.08482905982905983, \"cost\": 0.013305772, \"time\": 130.48116376399994}, \"edc52339db\": {\"quality\": 0.04444444444444444, \"cost\": 0.012820474000000002, \"time\": 177.77201774120329}, \"ee46042c5d\": {\"quality\": 0.14444444444444443, \"cost\": 0.011933044, \"time\": 173.3385812520981}, \"ee855899d8\": {\"quality\": 0.19444444444444442, \"cost\": 0.002078877, \"time\": 131.02240426540374}, \"eef12d478b\": {\"quality\": 0.06752136752136753, \"cost\": 0.005545968, \"time\": 137.4901770591736}, \"ef37b3e0be\": {\"quality\": 0.0, \"cost\": 0.008727486, \"time\": 
182.45494146347045}, \"ef43d497f1\": {\"quality\": 0.13974358974358975, \"cost\": 0.012247042, \"time\": 182.13725912570953}, \"ef4d4c4a62\": {\"quality\": 0.3175213675213675, \"cost\": 0.008570482, \"time\": 138.68434212207794}, \"f0655621af\": {\"quality\": 0.05641025641025641, \"cost\": 0.004645079999999999, \"time\": 51.96998543739319}, \"f076b4c9ae\": {\"quality\": 0.1987179487179487, \"cost\": 0.008021229, \"time\": 145.75664427280424}, \"f07881a734\": {\"quality\": 0.0, \"cost\": 0.0008367299999999999, \"time\": 121.20072700977326}, \"f0829510fc\": {\"quality\": 0.12585470085470085, \"cost\": 0.007723235999999999, \"time\": 174.65302150249482}, \"f11eddb4ed\": {\"quality\": 0.22863247863247865, \"cost\": 0.011729255, \"time\": 164.10010514259338}, \"f12622d3d7\": {\"quality\": 0.0, \"cost\": 0.008399868000000001, \"time\": 176.69494199752808}, \"f1408da253\": {\"quality\": 0.2098290598290598, \"cost\": 0.009230984999999999, \"time\": 164.57058210372924}, \"f1770e7d28\": {\"quality\": 0.2098290598290598, \"cost\": 0.003513861, \"time\": 159.09384961128234}, \"f18cf41929\": {\"quality\": 0.0, \"cost\": 0.003205404, \"time\": 109.51144468784332}, \"f1bda127f6\": {\"quality\": 0.21752136752136753, \"cost\": 0.012127632, \"time\": 119.53270018100739}, \"f2a2e91541\": {\"quality\": 0.1814102564102564, \"cost\": 0.008741034, \"time\": 164.1866457939148}, \"f2c04ed1c8\": {\"quality\": 0.04038461538461539, \"cost\": 0.004816446, \"time\": 76.6830270767212}, \"f2cf5db12d\": {\"quality\": 0.09252136752136753, \"cost\": 0.008174008, \"time\": 126.23963840007782}, \"f3c7a062f7\": {\"quality\": 0.18482905982905984, \"cost\": 0.003666486, \"time\": 122.12519326210023}, \"f42277df7f\": {\"quality\": 0.2098290598290598, \"cost\": 0.001627851, \"time\": 72.52343001365662}, \"f437481e3b\": {\"quality\": 0.21752136752136753, \"cost\": 0.011085886000000001, \"time\": 157.97356908321382}, \"f487340019\": {\"quality\": 0.21816239316239316, \"cost\": 0.0031079669999999997, 
\"time\": 131.89823365211487}, \"f497b83523\": {\"quality\": 0.015384615384615385, \"cost\": 0.007367326, \"time\": 130.5487345457077}, \"f4aa8ffdf5\": {\"quality\": 0.18333333333333332, \"cost\": 0.00532851, \"time\": 143.96070950031282}, \"f4ab4a73b6\": {\"quality\": 0.07307692307692308, \"cost\": 0.0073043940000000005, \"time\": 143.23261981010438}, \"f4beb148d0\": {\"quality\": 0.2098290598290598, \"cost\": 0.0023591489999999996, \"time\": 131.58125035762788}, \"f4deb72db6\": {\"quality\": 0.0, \"cost\": 0.007903169999999998, \"time\": 146.17728536129}, \"f50a47a0aa\": {\"quality\": 0.15833333333333333, \"cost\": 0.003946065, \"time\": 136.4745246887207}, \"f5b9a94dcc\": {\"quality\": 0.0, \"cost\": 0.002980708, \"time\": 114.85360951423645}, \"f5c27e7172\": {\"quality\": 0.015384615384615385, \"cost\": 0.003205683, \"time\": 136.12301714420317}, \"f5e53d963b\": {\"quality\": 0.07222222222222222, \"cost\": 0.001996362, \"time\": 87.49879748821257}, \"f5ecac6743\": {\"quality\": 0.23974358974358972, \"cost\": 0.008653056000000001, \"time\": 190.25079679489136}, \"f614235c15\": {\"quality\": 0.1, \"cost\": 0.001451158, \"time\": 103.29747097492219}, \"f627bff3a1\": {\"quality\": 0.08333333333333334, \"cost\": 0.002846523, \"time\": 154.48251981735228}, \"f70e54a9a0\": {\"quality\": 0.2098290598290598, \"cost\": 0.008556343000000001, \"time\": 184.88659045696258}, \"f74c5a862e\": {\"quality\": 0.0, \"cost\": 0.006900402, \"time\": 186.4667979478836}, \"f74ec023e4\": {\"quality\": 0.19252136752136753, \"cost\": 0.011753, \"time\": 179.87489347457887}, \"f783b2b34a\": {\"quality\": 0.1987179487179487, \"cost\": 0.00525324, \"time\": 161.1631241083145}, \"f7b048bd54\": {\"quality\": 0.19444444444444442, \"cost\": 0.008355386999999999, \"time\": 145.28592591285707}, \"f7c4df993e\": {\"quality\": 0.2098290598290598, \"cost\": 0.007574031, \"time\": 157.67136673927308}, \"f848813dca\": {\"quality\": 0.02222222222222222, \"cost\": 0.0046792710000000005, \"time\": 
184.49361929893493}, \"f854533145\": {\"quality\": 0.21752136752136753, \"cost\": 0.010477562999999999, \"time\": 188.20525524616244}, \"f8f946b5fb\": {\"quality\": 0.032692307692307694, \"cost\": 0.007302053999999999, \"time\": 199.7285678625107}, \"f912592c8d\": {\"quality\": 0.21752136752136753, \"cost\": 0.009682198999999999, \"time\": 187.25646502971648}, \"f93d9a2693\": {\"quality\": 0.10705128205128206, \"cost\": 0.006198954, \"time\": 124.54699866771698}, \"f99096d89c\": {\"quality\": 0.21752136752136753, \"cost\": 0.008768471, \"time\": 151.46780948638917}, \"f9e8e221f3\": {\"quality\": 0.06944444444444445, \"cost\": 0.006739169999999999, \"time\": 118.0668850183487}, \"fa38879eab\": {\"quality\": 0.17307692307692307, \"cost\": 0.013126675, \"time\": 162.19450600147246}, \"fa5b473f15\": {\"quality\": 0.0, \"cost\": 0.007688484000000001, \"time\": 179.80013384819028}, \"fa7882d46b\": {\"quality\": 0.21752136752136753, \"cost\": 0.011772778000000001, \"time\": 164.46603999137878}, \"fa906520d1\": {\"quality\": 0.2098290598290598, \"cost\": 0.012046285, \"time\": 162.0444113969803}, \"faabebaa30\": {\"quality\": 0.21752136752136753, \"cost\": 0.009497244000000002, \"time\": 166.34633650779722}, \"fb0339a7d0\": {\"quality\": 0.1952991452991453, \"cost\": 0.012292273000000001, \"time\": 170.6160633802414}, \"fb0e796cd3\": {\"quality\": 0.15149572649572648, \"cost\": 0.005093987999999999, \"time\": 133.95573093891142}, \"fb216ad6b3\": {\"quality\": 0.04038461538461539, \"cost\": 0.0032487699999999998, \"time\": 63.50213906764984}, \"fb29372712\": {\"quality\": 0.21752136752136753, \"cost\": 0.007952629, \"time\": 162.53553793430328}, \"fb6216880a\": {\"quality\": 0.0, \"cost\": 0.010096947, \"time\": 167.64952411651612}, \"fbd6c45271\": {\"quality\": 0.21752136752136753, \"cost\": 0.009907777, \"time\": 157.9287733793259}, \"fc1fd5bf54\": {\"quality\": 0.32863247863247863, \"cost\": 0.0051455070000000006, \"time\": 150.8760426044464}, \"fc4832696b\": 
{\"quality\": 0.032692307692307694, \"cost\": 0.001692516, \"time\": 101.06289932727813}, \"fccaadfcdb\": {\"quality\": 0.2098290598290598, \"cost\": 0.008330596999999999, \"time\": 168.6835078954697}, \"fcd3d2b250\": {\"quality\": 0.18482905982905984, \"cost\": 0.008048325, \"time\": 152.26023478507994}, \"fce38334b2\": {\"quality\": 0.21752136752136753, \"cost\": 0.01259885, \"time\": 138.37961835861205}, \"fd0709359e\": {\"quality\": 0.16538461538461538, \"cost\": 0.0006104649999999999, \"time\": 45.40659141540527}, \"fddccfbf94\": {\"quality\": 0.21752136752136753, \"cost\": 0.010504784000000001, \"time\": 108.93906075954438}, \"fe2b4d4d8b\": {\"quality\": 0.24316239316239316, \"cost\": 0.003949674, \"time\": 128.27600796222686}, \"fea4734c09\": {\"quality\": 0.0702991452991453, \"cost\": 0.0056253, \"time\": 73.18559403419495}, \"fef1ca27fa\": {\"quality\": 0.09252136752136753, \"cost\": 0.012806324, \"time\": 116.60554842948915}, \"ff11cb6a7a\": {\"quality\": 0.19252136752136753, \"cost\": 0.012347635999999999, \"time\": 110.44815881252289}, \"ff171e34e2\": {\"quality\": 0.0, \"cost\": 0.0037192830000000003, \"time\": 85.79805471897126}, \"ff1c958e21\": {\"quality\": 0.0, \"cost\": 0.002683884, \"time\": 74.0030293226242}, \"ff8e68049a\": {\"quality\": 0.10085470085470086, \"cost\": 0.006070932, \"time\": 85.60206592082977}}"
  },
  {
    "path": "abacus-research/biodex-priors.json",
    "content": "{\"00c93aec22\": {\"quality\": 0.06301075268817204, \"cost\": 0.011813586, \"time\": 87.30672872066498}, \"00f4acd0d3\": {\"quality\": 0.0, \"cost\": 0.0060519300000000005, \"time\": 43.473920822143555}, \"0121878170\": {\"quality\": 0.10301075268817204, \"cost\": 0.039456927, \"time\": 69.10648686885834}, \"01c2f973ad\": {\"quality\": 0.049677419354838714, \"cost\": 0.002479065, \"time\": 42.231227970123285}, \"01fca3c717\": {\"quality\": 0.0929032258064516, \"cost\": 0.029593058000000005, \"time\": 37.2765305519104}, \"02078988c1\": {\"quality\": 0.0, \"cost\": 0.007589632000000001, \"time\": 36.742701053619385}, \"021604dec1\": {\"quality\": 0.10301075268817204, \"cost\": 0.009616495, \"time\": 70.4432460308075}, \"0262668df7\": {\"quality\": 0.10301075268817204, \"cost\": 0.007743969, \"time\": 52.63532865047455}, \"0267c97b70\": {\"quality\": 0.0, \"cost\": 0.0035319149999999996, \"time\": 44.35502307415008}, \"02d6cdecdc\": {\"quality\": 0.10301075268817204, \"cost\": 0.019585369999999998, \"time\": 71.90730912685395}, \"033ca325e6\": {\"quality\": 0.03935483870967742, \"cost\": 0.0024435539999999997, \"time\": 32.108777427673346}, \"0364b5e990\": {\"quality\": 0.06258064516129032, \"cost\": 0.03804064, \"time\": 75.88355865478516}, \"0375ea52c9\": {\"quality\": 0.12301075268817205, \"cost\": 0.011332940000000001, \"time\": 23.44224610328674}, \"038a5f0a62\": {\"quality\": 0.10301075268817204, \"cost\": 0.006738629999999999, \"time\": 72.20387270450593}, \"039803b3b1\": {\"quality\": 0.12301075268817205, \"cost\": 0.029192550000000005, \"time\": 28.46143364906311}, \"03b972cb56\": {\"quality\": 0.10946236559139785, \"cost\": 0.024332390000000002, \"time\": 96.93496084213257}, \"042d933706\": {\"quality\": 0.11591397849462365, \"cost\": 0.008572104, \"time\": 87.77981555461884}, \"050b21ce37\": {\"quality\": 0.10623655913978494, \"cost\": 0.033586281999999995, \"time\": 91.73336846828461}, \"0524f42520\": {\"quality\": 0.10301075268817204, 
\"cost\": 0.04777091600000001, \"time\": 100.57431967258452}, \"05420351e5\": {\"quality\": 0.10301075268817204, \"cost\": 0.01269706, \"time\": 76.56351339817047}, \"057e332ab1\": {\"quality\": 0.12623655913978493, \"cost\": 0.03724580200000001, \"time\": 96.07672710418701}, \"0646f3f0fb\": {\"quality\": 0.12301075268817205, \"cost\": 0.03542941100000001, \"time\": 98.46862163543702}, \"06493715cc\": {\"quality\": 0.0, \"cost\": 0.005305185, \"time\": 88.10458414554597}, \"0659531b94\": {\"quality\": 0.12623655913978493, \"cost\": 0.031133855000000005, \"time\": 76.578067111969}, \"067ee6e91b\": {\"quality\": 0.10301075268817204, \"cost\": 0.007431704999999999, \"time\": 72.39745917320252}, \"0695f9b5fc\": {\"quality\": 0.020046082949308756, \"cost\": 0.008172494999999998, \"time\": 89.09118111133574}, \"06e94a0f2e\": {\"quality\": 0.10301075268817204, \"cost\": 0.008446952, \"time\": 43.131861782073976}, \"073ef31d23\": {\"quality\": 0.0929032258064516, \"cost\": 0.031532903, \"time\": 66.24685339927674}, \"078a7e545e\": {\"quality\": 0.10301075268817204, \"cost\": 0.031617475000000006, \"time\": 66.09703652858735}, \"079feb14a8\": {\"quality\": 0.11591397849462365, \"cost\": 0.008705902, \"time\": 48.95988223552703}, \"07a3a7daf7\": {\"quality\": 0.0, \"cost\": 0.007065148, \"time\": 43.999537682533266}, \"08127cd6dd\": {\"quality\": 0.11268817204301075, \"cost\": 0.013250458, \"time\": 49.02173194885254}, \"0833133620\": {\"quality\": 0.10623655913978494, \"cost\": 0.00606747, \"time\": 52.81210649013519}, \"087a2cabc4\": {\"quality\": 0.11591397849462365, \"cost\": 0.020192525000000003, \"time\": 94.1397991657257}, \"08bf8cc191\": {\"quality\": 0.0, \"cost\": 0.012546207, \"time\": 107.96688709259033}, \"08e1802287\": {\"quality\": 0.12946236559139784, \"cost\": 0.0034025700000000006, \"time\": 25.23361747264862}, \"08f7f63b30\": {\"quality\": 0.07290322580645162, \"cost\": 0.031714827, \"time\": 104.96963531970978}, \"090cd3ef31\": {\"quality\": 
0.10946236559139785, \"cost\": 0.00352441, \"time\": 23.527972793579103}, \"0947216ece\": {\"quality\": 0.11268817204301075, \"cost\": 0.014792812, \"time\": 80.49954354763031}, \"096d51f670\": {\"quality\": 0.08301075268817204, \"cost\": 0.009779318999999998, \"time\": 120.63433356285095}, \"09791c731b\": {\"quality\": 0.08623655913978495, \"cost\": 0.007008072, \"time\": 91.62384450435638}, \"0990c0d4f8\": {\"quality\": 0.10301075268817204, \"cost\": 0.007845877, \"time\": 85.36602926254272}, \"0a4c1bbb4a\": {\"quality\": 0.0, \"cost\": 0.0061861469999999995, \"time\": 90.79566612243653}, \"0ac969dde3\": {\"quality\": 0.10301075268817204, \"cost\": 0.010119635, \"time\": 114.36860978603363}, \"0b4ab72197\": {\"quality\": 0.08301075268817204, \"cost\": 0.010272084, \"time\": 85.533615732193}, \"0bf9d31691\": {\"quality\": 0.0, \"cost\": 0.04230826, \"time\": 110.75088348388672}, \"0c020b86a3\": {\"quality\": 0.06, \"cost\": 0.009179564, \"time\": 80.5141788482666}, \"0c6c7fe96a\": {\"quality\": 0.11591397849462365, \"cost\": 0.012506979000000001, \"time\": 104.92288007736207}, \"0c81c8996a\": {\"quality\": 0.03333333333333333, \"cost\": 0.006368825999999999, \"time\": 67.1365476846695}, \"0cd25da9fe\": {\"quality\": 0.1429032258064516, \"cost\": 0.037754836, \"time\": 88.54509041309356}, \"0cd78f33d8\": {\"quality\": 0.12301075268817205, \"cost\": 0.05730706500000001, \"time\": 70.51942982673646}, \"0cdc5954dd\": {\"quality\": 0.06623655913978495, \"cost\": 0.008238566000000001, \"time\": 70.26068692207338}, \"0d53dd53c1\": {\"quality\": 0.10301075268817204, \"cost\": 0.040789094, \"time\": 83.40459337234498}, \"0d8436af32\": {\"quality\": 0.10301075268817204, \"cost\": 0.011668765, \"time\": 70.86097333431243}, \"0e38896654\": {\"quality\": 0.12301075268817205, \"cost\": 0.011624445, \"time\": 69.54783916473389}, \"0ec672e7c8\": {\"quality\": 0.10301075268817204, \"cost\": 0.013338835, \"time\": 92.46337025165558}, \"0ed243f788\": {\"quality\": 
0.11591397849462365, \"cost\": 0.009052659000000001, \"time\": 82.04982092380524}, \"0eeb372802\": {\"quality\": 0.08489247311827958, \"cost\": 0.01393459, \"time\": 87.2686581134796}, \"0ef0becc1b\": {\"quality\": 0.14562980030721967, \"cost\": 0.028691548000000004, \"time\": 57.96495416164398}, \"0fefead197\": {\"quality\": 0.10623655913978494, \"cost\": 0.036967563, \"time\": 86.80253887176514}, \"0ff126ebf8\": {\"quality\": 0.10623655913978494, \"cost\": 0.012027738000000001, \"time\": 62.63344478607178}, \"10d1d4bdeb\": {\"quality\": 0.0, \"cost\": 0.034578758, \"time\": 84.60008835792542}, \"114a097c53\": {\"quality\": 0.02967741935483871, \"cost\": 0.00546162, \"time\": 32.99331085681915}, \"1175ee37e6\": {\"quality\": 0.06301075268817204, \"cost\": 0.0015683849999999998, \"time\": 27.241142559051514}, \"11bc996d48\": {\"quality\": 0.10301075268817204, \"cost\": 0.006689505, \"time\": 67.14279909133911}, \"11ded03305\": {\"quality\": 0.12301075268817205, \"cost\": 0.03211208700000001, \"time\": 71.28769545555114}, \"123fb650fb\": {\"quality\": 0.0704147465437788, \"cost\": 0.009157566, \"time\": 94.18252191543579}, \"132f6f3946\": {\"quality\": 0.0, \"cost\": 0.034120617000000006, \"time\": 80.40761358737946}, \"133ee5023f\": {\"quality\": 0.08301075268817204, \"cost\": 0.0056809600000000005, \"time\": 41.6411509513855}, \"13a009fe0c\": {\"quality\": 0.12290322580645162, \"cost\": 0.013064816, \"time\": 103.28004715442657}, \"13da306f84\": {\"quality\": 0.10301075268817204, \"cost\": 0.009303667999999998, \"time\": 75.9394079208374}, \"13e717e41e\": {\"quality\": 0.10301075268817204, \"cost\": 0.027711322000000004, \"time\": 46.58983483314514}, \"13f75f9bd0\": {\"quality\": 0.06301075268817204, \"cost\": 0.007269792000000001, \"time\": 78.83892266750337}, \"140ededb41\": {\"quality\": 0.07290322580645162, \"cost\": 0.0016495350000000002, \"time\": 27.533066105842593}, \"142e59c03f\": {\"quality\": 0.10301075268817204, \"cost\": 0.051444216, \"time\": 
89.78037304878235}, \"142f3a7c70\": {\"quality\": 0.0, \"cost\": 0.0048542970000000005, \"time\": 90.07974328994752}, \"1468dddecc\": {\"quality\": 0.12623655913978493, \"cost\": 0.0025156650000000003, \"time\": 58.17216382026672}, \"15af009a01\": {\"quality\": 0.09333333333333334, \"cost\": 0.03807909200000001, \"time\": 106.9638382911682}, \"15b80a55d3\": {\"quality\": 0.09333333333333334, \"cost\": 0.036063054, \"time\": 102.59992785453797}, \"1625e624c5\": {\"quality\": 0.0696774193548387, \"cost\": 0.02807447, \"time\": 56.73278093338013}, \"16cff1c1e9\": {\"quality\": 0.0, \"cost\": 0.031882792, \"time\": 110.76414783000945}, \"17407df027\": {\"quality\": 0.08623655913978495, \"cost\": 0.035592325999999994, \"time\": 101.71423263549804}, \"176da24f53\": {\"quality\": 0.014285714285714285, \"cost\": 0.004459658, \"time\": 50.037702202796936}, \"179379555f\": {\"quality\": 0.08946236559139785, \"cost\": 0.006388011, \"time\": 77.25874888896942}, \"181c91d1be\": {\"quality\": 0.10623655913978494, \"cost\": 0.012226980000000002, \"time\": 70.32721478939057}, \"183743e76e\": {\"quality\": 0.0, \"cost\": 0.00601107, \"time\": 54.00228130817413}, \"186b58c209\": {\"quality\": 0.10623655913978494, \"cost\": 0.03340172, \"time\": 77.13717110157013}, \"187eace9fe\": {\"quality\": 0.0, \"cost\": 0.031072944, \"time\": 83.48713932037353}, \"190ed2e1b6\": {\"quality\": 0.20301075268817204, \"cost\": 0.032216430000000004, \"time\": 60.7668753862381}, \"191aafe1a6\": {\"quality\": 0.10301075268817204, \"cost\": 0.028626901, \"time\": 69.7408289194107}, \"194919ad28\": {\"quality\": 0.032903225806451615, \"cost\": 0.032168366000000004, \"time\": 64.83868379592896}, \"197bb53f10\": {\"quality\": 0.09333333333333334, \"cost\": 0.0064794810000000005, \"time\": 85.31207880973815}, \"19ba7d0617\": {\"quality\": 0.12301075268817205, \"cost\": 0.0850125, \"time\": 65.46971595287323}, \"19e3db7fe7\": {\"quality\": 0.03333333333333333, \"cost\": 0.012209139999999999, \"time\": 
111.22203342914581}, \"1a08cb3f50\": {\"quality\": 0.0, \"cost\": 0.031574925000000004, \"time\": 48.5790287733078}, \"1a169179f6\": {\"quality\": 0.10301075268817204, \"cost\": 0.044995354, \"time\": 103.08906679153444}, \"1a71d61ac4\": {\"quality\": 0.08301075268817204, \"cost\": 0.02794963, \"time\": 76.25873317718506}, \"1ad856985f\": {\"quality\": 0.02258064516129032, \"cost\": 0.009150617, \"time\": 122.38001575469971}, \"1adec2dca2\": {\"quality\": 0.0, \"cost\": 0.01436658, \"time\": 146.7017418861389}, \"1b04a2b184\": {\"quality\": 0.11591397849462365, \"cost\": 0.013684314999999999, \"time\": 133.7267296552658}, \"1b28439bd7\": {\"quality\": 0.21591397849462365, \"cost\": 0.011476702000000002, \"time\": 100.34343218803406}, \"1b2c667b15\": {\"quality\": 0.19333333333333333, \"cost\": 0.033035388, \"time\": 89.06020038127899}, \"1b4511eada\": {\"quality\": 0.11456989247311827, \"cost\": 0.030632538, \"time\": 129.76828122138977}, \"1b7e6cad66\": {\"quality\": 0.10301075268817204, \"cost\": 0.028176118, \"time\": 68.36105287075043}, \"1beb2fac62\": {\"quality\": 0.10301075268817204, \"cost\": 0.013600296000000001, \"time\": 84.02366802692413}, \"1c347e4d91\": {\"quality\": 0.10301075268817204, \"cost\": 0.003799887, \"time\": 104.80383982658387}, \"1c35bf4be6\": {\"quality\": 0.10623655913978494, \"cost\": 0.03395996, \"time\": 116.73827676773071}, \"1c4bbf8f7e\": {\"quality\": 0.10301075268817204, \"cost\": 0.04330549100000001, \"time\": 104.88216242790222}, \"1c5f1341f6\": {\"quality\": 0.10623655913978494, \"cost\": 0.023331052, \"time\": 78.7630702495575}, \"1c71804bec\": {\"quality\": 0.08967741935483872, \"cost\": 0.019993577999999998, \"time\": 79.78316872119903}, \"1cc6d9efb6\": {\"quality\": 0.10301075268817204, \"cost\": 0.012043020000000002, \"time\": 78.92661263942719}, \"1ce3d77039\": {\"quality\": 0.10301075268817204, \"cost\": 0.000653472, \"time\": 41.55318109989166}, \"1d26090364\": {\"quality\": 0.12301075268817205, \"cost\": 0.004974435, 
\"time\": 56.773493957519534}, \"1d87f97e62\": {\"quality\": 0.10301075268817204, \"cost\": 0.007740711000000001, \"time\": 77.09953954219819}, \"1d90fb8ca6\": {\"quality\": 0.10623655913978494, \"cost\": 0.034693865000000004, \"time\": 72.57712695598602}, \"1da2369719\": {\"quality\": 0.032903225806451615, \"cost\": 0.004285659000000001, \"time\": 71.20325164794922}, \"1e18e60895\": {\"quality\": 0.12301075268817205, \"cost\": 0.0006631269999999999, \"time\": 39.11586356163025}, \"1e1bf7e88b\": {\"quality\": 0.10623655913978494, \"cost\": 0.012434028, \"time\": 72.40708291530609}, \"1e8b3521f8\": {\"quality\": 0.07729646697388634, \"cost\": 0.003250434, \"time\": 60.45711851119995}, \"1f412964ff\": {\"quality\": 0.12301075268817205, \"cost\": 0.029133057000000004, \"time\": 64.97693083286285}, \"1f72cfb78a\": {\"quality\": 0.21591397849462365, \"cost\": 0.020125592, \"time\": 99.64003388881683}, \"1fb5d170ad\": {\"quality\": 0.11268817204301075, \"cost\": 0.031407344000000004, \"time\": 107.89154133796691}, \"20180dd292\": {\"quality\": 0.03333333333333333, \"cost\": 0.035306141, \"time\": 115.35444395542144}, \"2018bef45f\": {\"quality\": 0.10301075268817204, \"cost\": 0.007217930000000001, \"time\": 68.45855422019959}, \"2075ff1d04\": {\"quality\": 0.06, \"cost\": 0.000997618, \"time\": 56.56468660831451}, \"208a98f514\": {\"quality\": 0.06623655913978495, \"cost\": 0.010588595, \"time\": 121.87733623981475}, \"20904e5c14\": {\"quality\": 0.08967741935483872, \"cost\": 0.031855074, \"time\": 83.47015318870544}, \"20afc3d539\": {\"quality\": 0.23591397849462367, \"cost\": 0.039295448, \"time\": 91.55873837471009}, \"20e10af7d4\": {\"quality\": 0.09591397849462366, \"cost\": 0.005751711, \"time\": 109.45092914104461}, \"20e2c0b057\": {\"quality\": 0.04, \"cost\": 0.008348217000000002, \"time\": 118.59764387607575}, \"211b89b4cd\": {\"quality\": 0.10301075268817204, \"cost\": 0.005769704999999999, \"time\": 68.7529235124588}, \"21386082aa\": {\"quality\": 
0.10946236559139785, \"cost\": 0.019664704, \"time\": 109.10838623046875}, \"21b2b8ebd1\": {\"quality\": 0.10623655913978494, \"cost\": 0.0073147049999999995, \"time\": 87.6013917684555}, \"21b78249a7\": {\"quality\": 0.10301075268817204, \"cost\": 0.04023389200000001, \"time\": 85.05358052253723}, \"21bed16a7d\": {\"quality\": 0.05290322580645161, \"cost\": 0.0014240299999999997, \"time\": 44.216261696815494}, \"2200d969d0\": {\"quality\": 0.10301075268817204, \"cost\": 0.042653065000000004, \"time\": 104.29301726818085}, \"220d008704\": {\"quality\": 0.02, \"cost\": 0.030364681, \"time\": 107.97830784320831}, \"2251d21392\": {\"quality\": 0.11591397849462365, \"cost\": 0.033143465, \"time\": 79.32752299308777}, \"23566f15ab\": {\"quality\": 0.08301075268817204, \"cost\": 0.0005060469999999999, \"time\": 43.435147762298584}, \"23a9506d36\": {\"quality\": 0.08301075268817204, \"cost\": 0.005874952, \"time\": 36.02558331489563}, \"24957f3a43\": {\"quality\": 0.08946236559139785, \"cost\": 0.004644396, \"time\": 53.15490672588348}, \"2529e2f8b0\": {\"quality\": 0.10301075268817204, \"cost\": 0.007180190000000001, \"time\": 28.930508136749268}, \"252f01ac5b\": {\"quality\": 0.12623655913978493, \"cost\": 0.04125042200000001, \"time\": 92.3930985212326}, \"25fadf0883\": {\"quality\": 0.10946236559139785, \"cost\": 0.03512302, \"time\": 116.36114256381988}, \"2609bfd616\": {\"quality\": 0.10946236559139785, \"cost\": 0.009596338, \"time\": 116.62334115505219}, \"2629f3e324\": {\"quality\": 0.12301075268817205, \"cost\": 0.033418977, \"time\": 125.81296608448028}, \"262e4298f9\": {\"quality\": 0.06301075268817204, \"cost\": 0.03408896, \"time\": 126.59840724468232}, \"26cc40d3bb\": {\"quality\": 0.06258064516129032, \"cost\": 0.035955086, \"time\": 118.60315663814545}, \"2728c8eb6a\": {\"quality\": 0.063963133640553, \"cost\": 0.008921946, \"time\": 100.46461434364319}, \"27bc52befa\": {\"quality\": 0.08623655913978495, \"cost\": 0.0216924, \"time\": 65.37463784217834}, 
\"27daa50458\": {\"quality\": 0.19978494623655915, \"cost\": 0.005977634, \"time\": 62.87260589599609}, \"2821795e69\": {\"quality\": 0.11591397849462365, \"cost\": 0.035850537, \"time\": 122.94275135993958}, \"28369b2421\": {\"quality\": 0.11591397849462365, \"cost\": 0.008891583000000002, \"time\": 96.52785930633544}, \"28421e6d62\": {\"quality\": 0.07967741935483871, \"cost\": 0.007445298000000001, \"time\": 90.94689693450928}, \"2936c3e43e\": {\"quality\": 0.10623655913978494, \"cost\": 0.012975128999999998, \"time\": 113.21974639892579}, \"293ec5edca\": {\"quality\": 0.10301075268817204, \"cost\": 0.006139281, \"time\": 76.99833080768585}, \"29409d0894\": {\"quality\": 0.08623655913978495, \"cost\": 0.005303836, \"time\": 53.642726635932924}, \"294258298a\": {\"quality\": 0.08301075268817204, \"cost\": 0.029248269, \"time\": 73.05858094692229}, \"294e541235\": {\"quality\": 0.014285714285714285, \"cost\": 0.002588631, \"time\": 46.93031387329101}, \"295ed5e759\": {\"quality\": 0.03333333333333333, \"cost\": 0.007936986, \"time\": 78.30404438972474}, \"2960431101\": {\"quality\": 0.07913978494623655, \"cost\": 0.007295059999999999, \"time\": 65.45906202793121}, \"29892d8468\": {\"quality\": 0.0, \"cost\": 0.012367010000000001, \"time\": 107.3028032541275}, \"299a0aeb65\": {\"quality\": 0.10301075268817204, \"cost\": 0.008972781999999999, \"time\": 99.65134017467498}, \"29ad99e3ed\": {\"quality\": 0.012903225806451613, \"cost\": 0.028833116000000002, \"time\": 76.06020951271057}, \"2a5edac2de\": {\"quality\": 0.10301075268817204, \"cost\": 0.040388614, \"time\": 98.854678440094}, \"2a7d15f4a7\": {\"quality\": 0.10301075268817204, \"cost\": 0.0007995049999999998, \"time\": 26.340464663505553}, \"2aa996de6a\": {\"quality\": 0.13591397849462367, \"cost\": 0.041571458000000006, \"time\": 101.75389010906218}, \"2ac4fb293f\": {\"quality\": 0.10301075268817204, \"cost\": 0.027978918000000002, \"time\": 79.7981543302536}, \"2afeff0083\": {\"quality\": 
0.13134408602150538, \"cost\": 0.034197613, \"time\": 108.6436208486557}, \"2b2bc9568b\": {\"quality\": 0.12301075268817205, \"cost\": 0.034515569, \"time\": 122.13092226982117}, \"2b5679d248\": {\"quality\": 0.07591397849462365, \"cost\": 0.012507200999999999, \"time\": 111.41059238910675}, \"2bcf54cda1\": {\"quality\": 0.05333333333333333, \"cost\": 0.03352914, \"time\": 122.20578529834748}, \"2bd39ee744\": {\"quality\": 0.10301075268817204, \"cost\": 0.016464735, \"time\": 93.0221663236618}, \"2bf38d797f\": {\"quality\": 0.09655913978494624, \"cost\": 0.007761932999999999, \"time\": 85.16239807605743}, \"2c4f4f304e\": {\"quality\": 0.12301075268817205, \"cost\": 0.004792675, \"time\": 50.257468938827515}, \"2c5cf9eb26\": {\"quality\": 0.0, \"cost\": 0.006811359, \"time\": 85.30586485862732}, \"2c9a9f94c4\": {\"quality\": 0.06623655913978495, \"cost\": 0.011060496, \"time\": 127.40647723674773}, \"2d3bbc2d23\": {\"quality\": 0.11333333333333334, \"cost\": 0.006915585, \"time\": 98.87370185852052}, \"2de113167b\": {\"quality\": 0.20623655913978495, \"cost\": 0.0120975, \"time\": 114.90267214775085}, \"2de3eb2c19\": {\"quality\": 0.08258064516129032, \"cost\": 0.006905800000000001, \"time\": 56.68419787883758}, \"2e30394ac6\": {\"quality\": 0.09333333333333334, \"cost\": 0.007497251999999999, \"time\": 99.04903969764709}, \"2e9c5cc9bf\": {\"quality\": 0.05290322580645161, \"cost\": 0.006026862, \"time\": 64.40614473819733}, \"2f1573da80\": {\"quality\": 0.0696774193548387, \"cost\": 0.009294684000000001, \"time\": 118.76816306114196}, \"2f39d78f34\": {\"quality\": 0.12623655913978493, \"cost\": 0.03424951, \"time\": 84.87492218017579}, \"2fc0cb3592\": {\"quality\": 0.07290322580645162, \"cost\": 0.007008576, \"time\": 93.76994898319245}, \"2fd9cd426a\": {\"quality\": 0.08301075268817204, \"cost\": 0.012578212000000002, \"time\": 125.21965701580046}, \"300924ebae\": {\"quality\": 0.10301075268817204, \"cost\": 0.034383670000000005, \"time\": 79.05412302017211}, 
\"3019af79b3\": {\"quality\": 0.05333333333333333, \"cost\": 0.000728678, \"time\": 50.62756032943726}, \"302c1d97fc\": {\"quality\": 0.032903225806451615, \"cost\": 0.012643149999999999, \"time\": 94.44454934597016}, \"30ae4cbe91\": {\"quality\": 0.04967741935483871, \"cost\": 0.0006998079999999999, \"time\": 55.2713984966278}, \"30c1f9ddf1\": {\"quality\": 0.02, \"cost\": 0.031684431000000006, \"time\": 81.2769385099411}, \"30cd375570\": {\"quality\": 0.10623655913978494, \"cost\": 0.034244302000000004, \"time\": 84.5775047302246}, \"3169782cbb\": {\"quality\": 0.12623655913978493, \"cost\": 0.030658732000000005, \"time\": 82.21524584293365}, \"3172fc459a\": {\"quality\": 0.20623655913978495, \"cost\": 0.039465366, \"time\": 106.1840931892395}, \"318499c14b\": {\"quality\": 0.06623655913978495, \"cost\": 0.03394029, \"time\": 109.55087904930114}, \"31a32be94d\": {\"quality\": 0.12623655913978493, \"cost\": 0.040973235, \"time\": 113.42904851436614}, \"32b101d807\": {\"quality\": 0.11134408602150538, \"cost\": 0.013450305, \"time\": 116.4323141336441}, \"32e2c7ad7f\": {\"quality\": 0.10623655913978494, \"cost\": 0.034506176, \"time\": 105.92563235759735}, \"33459cd29c\": {\"quality\": 0.05333333333333333, \"cost\": 0.009939888, \"time\": 108.05581707954407}, \"33a187e74f\": {\"quality\": 0.10301075268817204, \"cost\": 0.00615336, \"time\": 108.04435174465179}, \"33bab4f766\": {\"quality\": 0.10946236559139785, \"cost\": 0.030359016, \"time\": 94.6774396419525}, \"34922140da\": {\"quality\": 0.12623655913978493, \"cost\": 0.037889484, \"time\": 97.54083952903747}, \"3511b5e1d0\": {\"quality\": 0.0, \"cost\": 0.009048384, \"time\": 107.80388951301575}, \"3513311c2d\": {\"quality\": 0.10946236559139785, \"cost\": 0.030197856000000002, \"time\": 70.30446681976318}, \"3513e54767\": {\"quality\": 0.1529032258064516, \"cost\": 0.006643148000000001, \"time\": 51.87993569374085}, \"353f0cb1ac\": {\"quality\": 0.07591397849462365, \"cost\": 0.002595798, \"time\": 
43.97162518501281}, \"3550bf88cb\": {\"quality\": 0.10301075268817204, \"cost\": 0.008131596, \"time\": 101.78304505348206}, \"35610fb420\": {\"quality\": 0.10301075268817204, \"cost\": 0.004529697, \"time\": 86.15702996253967}, \"357267e14b\": {\"quality\": 0.10301075268817204, \"cost\": 0.001159102, \"time\": 43.801218032836914}, \"35baa5c3cc\": {\"quality\": 0.12623655913978493, \"cost\": 0.032426465, \"time\": 97.53744850158691}, \"3637084f91\": {\"quality\": 0.11913978494623656, \"cost\": 0.00731297, \"time\": 75.00282621383667}, \"368a497102\": {\"quality\": 0.08946236559139785, \"cost\": 0.023536822000000006, \"time\": 74.35990719795228}, \"36c66671ee\": {\"quality\": 0.049677419354838714, \"cost\": 0.0066839249999999985, \"time\": 80.50561456680299}, \"37456cb002\": {\"quality\": 0.10623655913978494, \"cost\": 0.018729613, \"time\": 104.19728450775146}, \"3746ea5c03\": {\"quality\": 0.08946236559139785, \"cost\": 0.025954912000000004, \"time\": 77.87212231159211}, \"375ed248fe\": {\"quality\": 0.09333333333333334, \"cost\": 0.009719407999999999, \"time\": 83.5635404586792}, \"377cdf9209\": {\"quality\": 0.10301075268817204, \"cost\": 0.015890171, \"time\": 106.57615406513214}, \"37d4d0f214\": {\"quality\": 0.09913978494623656, \"cost\": 0.035574514, \"time\": 106.92880241870881}, \"37ece7217f\": {\"quality\": 0.11729646697388633, \"cost\": 0.027270672000000003, \"time\": 49.32679312229156}, \"38075bb01f\": {\"quality\": 0.05333333333333333, \"cost\": 0.0043161810000000005, \"time\": 80.82736201286316}, \"3831d758b1\": {\"quality\": 0.0, \"cost\": 0.032444862000000005, \"time\": 82.06478433609009}, \"38567d6a43\": {\"quality\": 0.0, \"cost\": 0.004685955, \"time\": 89.93808691501617}, \"3875787727\": {\"quality\": 0.08301075268817204, \"cost\": 0.034211713000000005, \"time\": 105.70414273738861}, \"389c54cbca\": {\"quality\": 0.09655913978494624, \"cost\": 0.00570144, \"time\": 52.78039865493774}, \"3980f20caa\": {\"quality\": 0.05290322580645161, \"cost\": 
0.014630126000000002, \"time\": 109.6514874458313}, \"3997a836bd\": {\"quality\": 0.10301075268817204, \"cost\": 0.034291603000000004, \"time\": 108.07982413768768}, \"39ad76f8ce\": {\"quality\": 0.10301075268817204, \"cost\": 0.033542926, \"time\": 88.07275912761688}, \"39c0b7c171\": {\"quality\": 0.10301075268817204, \"cost\": 0.036100823000000004, \"time\": 108.34460818767548}, \"39cd4ca402\": {\"quality\": 0.02, \"cost\": 0.0034239629999999995, \"time\": 84.85748331546785}, \"3a32c98a53\": {\"quality\": 0.10623655913978494, \"cost\": 0.030208363000000002, \"time\": 82.875235414505}, \"3ac7fa4e46\": {\"quality\": 0.10301075268817204, \"cost\": 0.04524012, \"time\": 97.64954562187195}, \"3ad6dcf559\": {\"quality\": 0.0, \"cost\": 0.031851822, \"time\": 80.59675960540771}, \"3ae0de8663\": {\"quality\": 0.10301075268817204, \"cost\": 0.021463644000000004, \"time\": 97.35158727169036}, \"3b2e8075ea\": {\"quality\": 0.08623655913978495, \"cost\": 0.003255993, \"time\": 67.06967768669128}, \"3b3676521a\": {\"quality\": 0.10301075268817204, \"cost\": 0.013791060999999999, \"time\": 95.51536769866944}, \"3b57530a56\": {\"quality\": 0.10301075268817204, \"cost\": 0.002789373, \"time\": 44.8931683063507}, \"3b6fbfa11d\": {\"quality\": 0.10946236559139785, \"cost\": 0.016405658, \"time\": 70.66574103832244}, \"3b81215e7a\": {\"quality\": 0.11268817204301075, \"cost\": 0.034937565000000004, \"time\": 111.87461152076722}, \"3b9f8045d7\": {\"quality\": 0.11268817204301075, \"cost\": 0.044838478, \"time\": 104.71198470592498}, \"3c5857683c\": {\"quality\": 0.10946236559139785, \"cost\": 0.03896547, \"time\": 133.13998155593873}, \"3cbab8082e\": {\"quality\": 0.09333333333333334, \"cost\": 0.009450357, \"time\": 131.19907307624817}, \"3d21104666\": {\"quality\": 0.12301075268817205, \"cost\": 0.05427800000000001, \"time\": 58.033410573005675}, \"3d71c4dd2c\": {\"quality\": 0.10301075268817204, \"cost\": 0.006745917, \"time\": 99.85713205337524}, \"3d9e24215e\": {\"quality\": 
0.10301075268817204, \"cost\": 0.008923918, \"time\": 54.70108168125152}, \"3e7efee65a\": {\"quality\": 0.09268817204301075, \"cost\": 0.034440120000000005, \"time\": 125.9171290397644}, \"3ed0ad20ed\": {\"quality\": 0.12301075268817205, \"cost\": 0.038282694000000006, \"time\": 132.21433353424072}, \"3f1a58aec9\": {\"quality\": 0.22301075268817205, \"cost\": 0.0019911300000000002, \"time\": 48.89557406902313}, \"3f2b07cb78\": {\"quality\": 0.10946236559139785, \"cost\": 0.016589166000000002, \"time\": 105.48563659191132}, \"3f3ef494b0\": {\"quality\": 0.0, \"cost\": 0.001254132, \"time\": 57.60023159980774}, \"3f62c3fbfc\": {\"quality\": 0.0696774193548387, \"cost\": 0.0044856359999999994, \"time\": 95.2896116256714}, \"3f730d8bfe\": {\"quality\": 0.10301075268817204, \"cost\": 0.04548097400000001, \"time\": 94.35254197120668}, \"3f8d2ee81f\": {\"quality\": 0.10623655913978494, \"cost\": 0.02839726, \"time\": 88.55384962558746}, \"40104c813f\": {\"quality\": 0.00967741935483871, \"cost\": 0.010927247000000001, \"time\": 137.43559978008273}, \"403b05da2d\": {\"quality\": 0.02, \"cost\": 0.010782219999999999, \"time\": 89.91662650108339}, \"403f0726fa\": {\"quality\": 0.10301075268817204, \"cost\": 0.030024500000000003, \"time\": 53.93264591693878}, \"4098178354\": {\"quality\": 0.10623655913978494, \"cost\": 0.033845859000000006, \"time\": 113.72850313186646}, \"409ff67607\": {\"quality\": 0.06, \"cost\": 0.004197711, \"time\": 99.27511491775513}, \"40b3b6642c\": {\"quality\": 0.12301075268817205, \"cost\": 0.033175452, \"time\": 78.3308812379837}, \"412c065b83\": {\"quality\": 0.03333333333333333, \"cost\": 0.00797817, \"time\": 76.86819369792937}, \"4191118787\": {\"quality\": 0.10301075268817204, \"cost\": 0.009176240999999998, \"time\": 115.17749345302582}, \"41d5b97871\": {\"quality\": 0.0, \"cost\": 0.002632212, \"time\": 67.86801817417145}, \"41d8845655\": {\"quality\": 0.12301075268817205, \"cost\": 0.011058695, \"time\": 66.63209574222564}, \"41ee202cac\": 
{\"quality\": 0.04, \"cost\": 0.0015582279999999996, \"time\": 81.97960751056671}, \"41fe4aee55\": {\"quality\": 0.04, \"cost\": 0.033214178, \"time\": 89.46116530895233}, \"42430ea391\": {\"quality\": 0.10301075268817204, \"cost\": 0.037945212000000006, \"time\": 121.86067166328431}, \"42ddd48341\": {\"quality\": 0.10301075268817204, \"cost\": 0.015020527999999998, \"time\": 114.5060165643692}, \"42f1e19aa7\": {\"quality\": 0.11591397849462365, \"cost\": 0.04674747600000001, \"time\": 126.75622403621674}, \"430a2ab32f\": {\"quality\": 0.12301075268817205, \"cost\": 0.039291168, \"time\": 96.79555187225341}, \"4339427ad8\": {\"quality\": 0.08634408602150537, \"cost\": 0.03448633, \"time\": 124.03267226219177}, \"4361bc7ea7\": {\"quality\": 0.07333333333333333, \"cost\": 0.001487792, \"time\": 83.36836609840392}, \"43c3cf9cb8\": {\"quality\": 0.10301075268817204, \"cost\": 0.006907384000000001, \"time\": 57.54507291316986}, \"43d24fb32a\": {\"quality\": 0.10946236559139785, \"cost\": 0.010483464, \"time\": 98.36521954536438}, \"43e9b39e5c\": {\"quality\": 0.12946236559139784, \"cost\": 0.036367179, \"time\": 118.90290827751159}, \"44d6af5523\": {\"quality\": 0.10301075268817204, \"cost\": 0.0051817799999999995, \"time\": 97.96241414546967}, \"44f189d813\": {\"quality\": 0.10623655913978494, \"cost\": 0.020515107000000005, \"time\": 121.6331589460373}, \"450f45a187\": {\"quality\": 0.0, \"cost\": 0.007900044000000002, \"time\": 60.44815545082092}, \"453d0a5097\": {\"quality\": 0.06301075268817204, \"cost\": 0.016095624, \"time\": 92.88813369274139}, \"4547ef4c8e\": {\"quality\": 0.10301075268817204, \"cost\": 0.033263573000000005, \"time\": 89.95854828357696}, \"461846a52d\": {\"quality\": 0.10623655913978494, \"cost\": 0.003919338, \"time\": 85.1272670507431}, \"462e6ff849\": {\"quality\": 0.06761904761904762, \"cost\": 0.011236148000000001, \"time\": 113.67141807079315}, \"4630853d32\": {\"quality\": 0.08946236559139785, \"cost\": 0.008533141000000001, \"time\": 
86.52934875488282}, \"46475b9e75\": {\"quality\": 0.10301075268817204, \"cost\": 0.006695604000000001, \"time\": 94.7075261592865}, \"46654a1f32\": {\"quality\": 0.0, \"cost\": 0.012072104, \"time\": 64.53883934020996}, \"466a3036b2\": {\"quality\": 0.06301075268817204, \"cost\": 0.010685084999999999, \"time\": 106.71569502353668}, \"466d4d16dd\": {\"quality\": 0.08946236559139785, \"cost\": 0.008533694000000001, \"time\": 79.9828891992569}, \"46ed68152d\": {\"quality\": 0.07612903225806453, \"cost\": 0.012683225999999999, \"time\": 86.68546965122223}, \"476a12876c\": {\"quality\": 0.08946236559139785, \"cost\": 0.0062402939999999995, \"time\": 72.02064683437348}, \"4778401a7a\": {\"quality\": 0.10301075268817204, \"cost\": 0.016234258, \"time\": 93.26795687675477}, \"47f9115b26\": {\"quality\": 0.03333333333333333, \"cost\": 0.03462185200000001, \"time\": 101.22674057483673}, \"48043e2304\": {\"quality\": 0.10301075268817204, \"cost\": 0.009490545, \"time\": 83.147008061409}, \"487f30e740\": {\"quality\": 0.10623655913978494, \"cost\": 0.030731074000000004, \"time\": 72.48842267990112}, \"488645cbd9\": {\"quality\": 0.07290322580645162, \"cost\": 0.0015898319999999998, \"time\": 47.430080437660216}, \"48bf87f7fe\": {\"quality\": 0.11591397849462365, \"cost\": 0.041125237999999995, \"time\": 95.13780512809754}, \"4909061216\": {\"quality\": 0.08301075268817204, \"cost\": 0.013231131, \"time\": 110.22792162895203}, \"49731b1ccd\": {\"quality\": 0.08301075268817204, \"cost\": 0.007518614999999999, \"time\": 77.45786833763123}, \"49ad844bd2\": {\"quality\": 0.08301075268817204, \"cost\": 0.009246200999999999, \"time\": 125.66613755226135}, \"49ca727e49\": {\"quality\": 0.10301075268817204, \"cost\": 0.000910932, \"time\": 36.98862104415893}, \"4a23d8eff7\": {\"quality\": 0.012903225806451613, \"cost\": 0.033723964, \"time\": 133.58634560108186}, \"4a555da784\": {\"quality\": 0.12301075268817205, \"cost\": 0.030253567000000002, \"time\": 93.47855880260468}, 
\"4a5cea8b85\": {\"quality\": 0.10301075268817204, \"cost\": 0.037096130000000005, \"time\": 87.66496036052703}, \"4a767339bd\": {\"quality\": 0.10946236559139785, \"cost\": 0.034482308, \"time\": 119.38791158199311}, \"4aafd39d76\": {\"quality\": 0.0, \"cost\": 0.005941872, \"time\": 106.78513460159303}, \"4aca6e5216\": {\"quality\": 0.08623655913978495, \"cost\": 0.011614728, \"time\": 98.73750612735748}, \"4b18a647d6\": {\"quality\": 0.10623655913978494, \"cost\": 0.04434360500000001, \"time\": 133.19987111091615}, \"4bc4528402\": {\"quality\": 0.10623655913978494, \"cost\": 0.004866044999999999, \"time\": 105.87921042442322}, \"4c158a1a4a\": {\"quality\": 0.06301075268817204, \"cost\": 0.008219039999999999, \"time\": 107.04837877750397}, \"4c954323e3\": {\"quality\": 0.11591397849462365, \"cost\": 0.010109752999999999, \"time\": 127.91732285022735}, \"4d91e8a27b\": {\"quality\": 0.1761290322580645, \"cost\": 0.010757003000000001, \"time\": 101.67963089942933}, \"4dc185389a\": {\"quality\": 0.08290322580645162, \"cost\": 0.030485803000000006, \"time\": 113.32033824920654}, \"4dd3635bc3\": {\"quality\": 0.12623655913978493, \"cost\": 0.013448202, \"time\": 89.75064220428467}, \"4dd98ef398\": {\"quality\": 0.0, \"cost\": 0.030016080000000004, \"time\": 69.32772116661073}, \"4dfacd0007\": {\"quality\": 0.10623655913978494, \"cost\": 0.014845295, \"time\": 100.57037975788117}, \"4e298ee0d4\": {\"quality\": 0.07612903225806453, \"cost\": 0.008624916, \"time\": 97.4965528011322}, \"4e3443a0f9\": {\"quality\": 0.11591397849462365, \"cost\": 0.013369873000000001, \"time\": 110.12258355617523}, \"4e4b9db2b8\": {\"quality\": 0.116605222734255, \"cost\": 0.007327941000000001, \"time\": 75.85998649597168}, \"4e6509f614\": {\"quality\": 0.08301075268817204, \"cost\": 0.00546564, \"time\": 51.5486006975174}, \"4e6a83e751\": {\"quality\": 0.07290322580645162, \"cost\": 0.000857376, \"time\": 45.70523474216461}, \"4e79c8947f\": {\"quality\": 0.11134408602150539, \"cost\": 
0.00732088, \"time\": 43.410219454765326}, \"4e9504432b\": {\"quality\": 0.10623655913978494, \"cost\": 0.034706275, \"time\": 129.37581310272216}, \"4e962170dc\": {\"quality\": 0.02, \"cost\": 0.033169164, \"time\": 83.41308994293212}, \"4eb0826f21\": {\"quality\": 0.0, \"cost\": 0.027521320000000002, \"time\": 56.27700278759002}, \"4ed41bf2e4\": {\"quality\": 0.12623655913978493, \"cost\": 0.03861573, \"time\": 115.21599047183992}, \"4f78672528\": {\"quality\": 0.08301075268817204, \"cost\": 0.037576234, \"time\": 94.65013363361359}, \"4f8cca1195\": {\"quality\": 0.13019969278033794, \"cost\": 0.010837381, \"time\": 124.13806178569794}, \"500860eaa2\": {\"quality\": 0.0, \"cost\": 0.002532492, \"time\": 48.11276173591614}, \"50701b505e\": {\"quality\": 0.09333333333333332, \"cost\": 0.006791684999999999, \"time\": 99.08809175491334}, \"51583a901c\": {\"quality\": 0.14946236559139786, \"cost\": 0.010773869, \"time\": 124.03754835128784}, \"51aeaf9f3e\": {\"quality\": 0.10301075268817204, \"cost\": 0.008776062000000001, \"time\": 40.02036185264588}, \"520b52b64c\": {\"quality\": 0.10301075268817204, \"cost\": 0.0302065, \"time\": 60.96432254314423}, \"521314dab6\": {\"quality\": 0.20301075268817204, \"cost\": 0.006639104, \"time\": 69.5313068151474}, \"5226eb7ff6\": {\"quality\": 0.10946236559139785, \"cost\": 0.008551829, \"time\": 88.08115694522857}, \"526878b5eb\": {\"quality\": 0.05290322580645161, \"cost\": 0.009073641, \"time\": 127.54378099441527}, \"52c1cba6ce\": {\"quality\": 0.11591397849462365, \"cost\": 0.010276206999999999, \"time\": 117.55816102027893}, \"52f041a70e\": {\"quality\": 0.10301075268817204, \"cost\": 0.001223052, \"time\": 42.505560874938965}, \"533867574b\": {\"quality\": 0.12301075268817205, \"cost\": 0.031638318000000006, \"time\": 88.52275938987732}, \"53869388bb\": {\"quality\": 0.05655913978494624, \"cost\": 0.0018472099999999997, \"time\": 38.346792578697205}, \"53aefd41e4\": {\"quality\": 0.10301075268817204, \"cost\": 
0.022357914, \"time\": 79.89788761138917}, \"53d2932c4f\": {\"quality\": 0.12374807987711213, \"cost\": 0.007490942, \"time\": 79.30699775218963}, \"54375d3eba\": {\"quality\": 0.04967741935483871, \"cost\": 0.013380047999999999, \"time\": 77.51320762634276}, \"5474247f91\": {\"quality\": 0.0, \"cost\": 0.008527589999999998, \"time\": 80.76838719844818}, \"54993bc472\": {\"quality\": 0.13134408602150538, \"cost\": 0.027307650000000003, \"time\": 45.53245575428009}, \"55358f2285\": {\"quality\": 0.11591397849462365, \"cost\": 0.034442846, \"time\": 74.98782951831818}, \"5569b4f878\": {\"quality\": 0.10623655913978494, \"cost\": 0.004829285, \"time\": 52.80584018230438}, \"557d2cf7ba\": {\"quality\": 0.01935483870967742, \"cost\": 0.009244542000000001, \"time\": 81.70965638160706}, \"55c8aa8935\": {\"quality\": 0.07660522273425499, \"cost\": 0.012967038, \"time\": 127.87031970024108}, \"55e6bf8f14\": {\"quality\": 0.10623655913978494, \"cost\": 0.031587628000000006, \"time\": 88.87307133674622}, \"56a0660622\": {\"quality\": 0.07333333333333333, \"cost\": 0.02785145, \"time\": 61.1418399810791}, \"56a29a28c5\": {\"quality\": 0.04, \"cost\": 0.005067498, \"time\": 67.83964855670928}, \"56c4fd5056\": {\"quality\": 0.0, \"cost\": 0.05983196800000001, \"time\": 88.72949783802034}, \"5703697dbd\": {\"quality\": 0.09268817204301075, \"cost\": 0.006353454, \"time\": 94.9607982635498}, \"5718f2ed80\": {\"quality\": 0.10946236559139785, \"cost\": 0.003271383, \"time\": 89.07395238876343}, \"572a02a59a\": {\"quality\": 0.15623655913978496, \"cost\": 0.004805790000000001, \"time\": 95.1497330904007}, \"5750713a41\": {\"quality\": 0.10301075268817204, \"cost\": 0.006427176, \"time\": 93.20527319908142}, \"57757ef15e\": {\"quality\": 0.06301075268817204, \"cost\": 0.006945941999999999, \"time\": 92.32843782901764}, \"579c81bbe0\": {\"quality\": 0.0, \"cost\": 0.005599266, \"time\": 66.2362874507904}, \"57bed1722f\": {\"quality\": 0.0, \"cost\": 0.005544702, \"time\": 
79.84823746681214}, \"585ba6d20b\": {\"quality\": 0.10301075268817204, \"cost\": 0.007482849, \"time\": 135.71461656093598}, \"589267ac64\": {\"quality\": 0.043010752688172046, \"cost\": 0.0073127800000000005, \"time\": 65.76824603080749}, \"589a1cea79\": {\"quality\": 0.03612903225806452, \"cost\": 0.009729424, \"time\": 144.28782908916475}, \"58ca42839b\": {\"quality\": 0.10301075268817204, \"cost\": 0.026166294, \"time\": 107.9141107559204}, \"58dc373441\": {\"quality\": 0.10623655913978494, \"cost\": 0.031231837999999998, \"time\": 134.37797992229463}, \"59006532b4\": {\"quality\": 0.10301075268817204, \"cost\": 0.007373505, \"time\": 106.22529966831206}, \"59326c4e00\": {\"quality\": 0.0, \"cost\": 0.006372392, \"time\": 76.82031381130219}, \"593975c75b\": {\"quality\": 0.11268817204301075, \"cost\": 0.04324769700000001, \"time\": 137.53491501808168}, \"596b4f8694\": {\"quality\": 0.10301075268817204, \"cost\": 0.034689392, \"time\": 136.43740742206575}, \"5971ba4e0d\": {\"quality\": 0.10946236559139785, \"cost\": 0.00188677, \"time\": 45.389907240867615}, \"5996465c0a\": {\"quality\": 0.10623655913978494, \"cost\": 0.031329037000000004, \"time\": 141.25967452526095}, \"59d70b9f65\": {\"quality\": 0.10301075268817204, \"cost\": 0.03321219, \"time\": 110.12462825775147}, \"59f887b67c\": {\"quality\": 0.04, \"cost\": 0.010642958000000001, \"time\": 167.4949809551239}, \"5a22920db4\": {\"quality\": 0.10623655913978494, \"cost\": 0.0019284089999999999, \"time\": 76.70821943283082}, \"5a35020d45\": {\"quality\": 0.10623655913978494, \"cost\": 0.006820569, \"time\": 120.13344073295593}, \"5aa43da1fc\": {\"quality\": 0.11591397849462365, \"cost\": 0.028597824, \"time\": 106.1217529296875}, \"5ae0d88127\": {\"quality\": 0.0, \"cost\": 0.032261024000000006, \"time\": 121.66821670532227}, \"5b10fbdbe1\": {\"quality\": 0.11913978494623656, \"cost\": 0.014016424, \"time\": 136.450377035141}, \"5bade9eb85\": {\"quality\": 0.10623655913978494, \"cost\": 
0.009162800999999998, \"time\": 132.39995594024657}, \"5be16744bf\": {\"quality\": 0.09729646697388633, \"cost\": 0.01510615, \"time\": 108.84415483474731}, \"5c5055e252\": {\"quality\": 0.0696774193548387, \"cost\": 0.029808262000000006, \"time\": 102.84497191905976}, \"5c53feccd9\": {\"quality\": 0.09333333333333334, \"cost\": 0.016134796, \"time\": 133.5285586118698}, \"5c77c7c2b2\": {\"quality\": 0.08301075268817204, \"cost\": 0.009062296000000001, \"time\": 106.83322401046752}, \"5d298b5b48\": {\"quality\": 0.10623655913978494, \"cost\": 0.037192654000000006, \"time\": 126.51887106895447}, \"5d41515d2e\": {\"quality\": 0.10946236559139785, \"cost\": 0.027884363000000002, \"time\": 88.88220546245574}, \"5d4babc723\": {\"quality\": 0.12623655913978493, \"cost\": 0.033099258000000006, \"time\": 115.31343786716462}, \"5d79b50feb\": {\"quality\": 0.02258064516129032, \"cost\": 0.013859904000000003, \"time\": 97.84079582691191}, \"5dc216cd6b\": {\"quality\": 0.10946236559139785, \"cost\": 0.010709733999999999, \"time\": 78.82858054637909}, \"5dd68c1b8f\": {\"quality\": 0.10301075268817204, \"cost\": 0.0015060479999999998, \"time\": 49.43739776611328}, \"5de4a882c1\": {\"quality\": 0.10623655913978494, \"cost\": 0.008982037, \"time\": 78.31139621734619}, \"5e04e1c72d\": {\"quality\": 0.10623655913978494, \"cost\": 0.040715788, \"time\": 81.60498032569885}, \"5e923cee9e\": {\"quality\": 0.10301075268817204, \"cost\": 0.017039563, \"time\": 113.84931592941284}, \"5ea2fab380\": {\"quality\": 0.08301075268817204, \"cost\": 0.004995846, \"time\": 52.83791854381562}, \"5eb3bb525b\": {\"quality\": 0.05333333333333333, \"cost\": 0.009461896000000001, \"time\": 96.70504157543182}, \"5eea899380\": {\"quality\": 0.11591397849462365, \"cost\": 0.037967743000000005, \"time\": 113.0098082780838}, \"5f37b3902b\": {\"quality\": 0.06258064516129032, \"cost\": 0.007317896000000001, \"time\": 72.8305846452713}, \"5f9282df3c\": {\"quality\": 0.13946236559139785, \"cost\": 
0.018534208000000003, \"time\": 113.41562910079956}, \"6019884cf3\": {\"quality\": 0.10301075268817204, \"cost\": 0.035076015, \"time\": 124.4206553697586}, \"606352363e\": {\"quality\": 0.06301075268817204, \"cost\": 0.036100772, \"time\": 120.98173689842224}, \"608728f868\": {\"quality\": 0.08946236559139785, \"cost\": 0.033319759000000004, \"time\": 122.43799624443054}, \"60b9e936f1\": {\"quality\": 0.02, \"cost\": 0.027687884000000003, \"time\": 62.59779427051544}, \"60cb623c53\": {\"quality\": 0.0, \"cost\": 0.002459736, \"time\": 58.33065054416656}, \"6234de86b4\": {\"quality\": 0.0696774193548387, \"cost\": 0.04136126700000001, \"time\": 96.66275715827942}, \"62352c6854\": {\"quality\": 0.11591397849462365, \"cost\": 0.03903031800000001, \"time\": 117.9777235031128}, \"63a0aaebed\": {\"quality\": 0.03333333333333333, \"cost\": 0.002063613, \"time\": 61.364303207397455}, \"647dda686f\": {\"quality\": 0.11268817204301075, \"cost\": 0.016785803000000002, \"time\": 112.19475474357606}, \"6511b21ded\": {\"quality\": 0.10301075268817204, \"cost\": 0.013076242000000002, \"time\": 83.33274998664857}, \"652c0f4bdf\": {\"quality\": 0.02, \"cost\": 0.013811997999999999, \"time\": 122.77105476856232}, \"6533c85913\": {\"quality\": 0.056129032258064517, \"cost\": 0.010490979000000001, \"time\": 115.75537250041961}, \"65627426e0\": {\"quality\": 0.07333333333333333, \"cost\": 0.000720418, \"time\": 74.83012104034424}, \"65b76da9c6\": {\"quality\": 0.0696774193548387, \"cost\": 0.007407492, \"time\": 86.19350485801698}, \"65be1c1306\": {\"quality\": 0.08623655913978495, \"cost\": 0.001827702, \"time\": 35.30676467418671}, \"65e0216208\": {\"quality\": 0.0, \"cost\": 0.008113563, \"time\": 120.1523479938507}, \"65eee615d7\": {\"quality\": 0.12301075268817205, \"cost\": 0.00056168, \"time\": 33.19386031627655}, \"6623d7a5ac\": {\"quality\": 0.11591397849462365, \"cost\": 0.035060653000000004, \"time\": 117.17992658615111}, \"66750c0934\": {\"quality\": 0.10301075268817204, 
\"cost\": 0.007793503, \"time\": 92.1173137664795}, \"66776ec181\": {\"quality\": 0.09612903225806452, \"cost\": 0.007265367, \"time\": 135.96958916187288}, \"66e5ae0a21\": {\"quality\": 0.0, \"cost\": 0.021849754, \"time\": 69.76327166557311}, \"6750a8d7a7\": {\"quality\": 0.11591397849462365, \"cost\": 0.014042485, \"time\": 90.68010911941528}, \"67632141f6\": {\"quality\": 0.008333333333333333, \"cost\": 0.0051496020000000005, \"time\": 66.89973094463349}, \"67868fcff6\": {\"quality\": 0.11591397849462365, \"cost\": 0.01194172, \"time\": 93.14265236854553}, \"67bab6732d\": {\"quality\": 0.07612903225806453, \"cost\": 0.011601776, \"time\": 102.1346355676651}, \"67fe399cf1\": {\"quality\": 0.11913978494623656, \"cost\": 0.007208916, \"time\": 103.40100402832032}, \"6846bd8fb3\": {\"quality\": 0.09333333333333334, \"cost\": 0.007100624999999999, \"time\": 104.9791820526123}, \"68b4cc3e39\": {\"quality\": 0.12301075268817205, \"cost\": 0.045394822, \"time\": 143.03540830612184}, \"69a029ae36\": {\"quality\": 0.0, \"cost\": 0.028152226000000002, \"time\": 119.96623668670654}, \"69b3b67de6\": {\"quality\": 0.11591397849462365, \"cost\": 0.008464616999999999, \"time\": 114.51782112121583}, \"69bf3f6ba0\": {\"quality\": 0.08946236559139785, \"cost\": 0.009020988, \"time\": 113.73228361606598}, \"6a022c3f73\": {\"quality\": 0.06946236559139785, \"cost\": 0.004192794, \"time\": 110.7649359703064}, \"6a10c53ad8\": {\"quality\": 0.12697388632872503, \"cost\": 0.012509873000000001, \"time\": 133.7932172060013}, \"6a6348f69d\": {\"quality\": 0.04, \"cost\": 0.001041536, \"time\": 82.5766867876053}, \"6a74a11bee\": {\"quality\": 0.12623655913978493, \"cost\": 0.030354143, \"time\": 111.96903376579286}, \"6a90ec29dd\": {\"quality\": 0.11591397849462365, \"cost\": 0.01421163, \"time\": 101.9924022436142}, \"6aac59742a\": {\"quality\": 0.00967741935483871, \"cost\": 0.009429429, \"time\": 145.55764169692992}, \"6b0862c597\": {\"quality\": 0.10623655913978494, \"cost\": 
0.028999667000000007, \"time\": 90.02094576358795}, \"6b3c16def2\": {\"quality\": 0.10697388632872504, \"cost\": 0.001518831, \"time\": 61.66853258609771}, \"6b99b0e901\": {\"quality\": 0.20301075268817204, \"cost\": 0.01664123, \"time\": 88.09062700271608}, \"6b9b2b3515\": {\"quality\": 0.07333333333333333, \"cost\": 0.039732264, \"time\": 98.20316002368926}, \"6bcc02962b\": {\"quality\": 0.10946236559139785, \"cost\": 0.031133424, \"time\": 97.9912957429886}, \"6c1987a9e3\": {\"quality\": 0.12052227342549923, \"cost\": 0.034681694000000006, \"time\": 130.62333233356475}, \"6c50123ee1\": {\"quality\": 0.10301075268817204, \"cost\": 0.027581610000000003, \"time\": 51.59311645030975}, \"6c67c36480\": {\"quality\": 0.02, \"cost\": 0.034065069, \"time\": 137.40613181591033}, \"6c9b9f1363\": {\"quality\": 0.10623655913978494, \"cost\": 0.033907465, \"time\": 132.98066565990447}, \"6cc813aa68\": {\"quality\": 0.04258064516129032, \"cost\": 0.006083242000000001, \"time\": 64.82899880409241}, \"6d20c6ace0\": {\"quality\": 0.12301075268817205, \"cost\": 0.026338517999999998, \"time\": 129.11275482177734}, \"6d444fe21a\": {\"quality\": 0.08623655913978495, \"cost\": 0.011278744, \"time\": 98.86997501850128}, \"6db70dc3b6\": {\"quality\": 0.07333333333333333, \"cost\": 0.001547284, \"time\": 91.66993854045867}, \"6e0690f576\": {\"quality\": 0.02, \"cost\": 0.001789902, \"time\": 73.95127286911011}, \"6e06cc804f\": {\"quality\": 0.10301075268817204, \"cost\": 0.06912567600000001, \"time\": 111.87644176483155}, \"6e24048a2e\": {\"quality\": 0.10623655913978494, \"cost\": 0.020128832, \"time\": 86.35281929969787}, \"6e93514f45\": {\"quality\": 0.0, \"cost\": 0.005360688000000001, \"time\": 106.13040091991425}, \"6eae47102b\": {\"quality\": 0.10301075268817204, \"cost\": 0.008400768, \"time\": 124.15161423683166}, \"6ed4cae469\": {\"quality\": 0.10301075268817204, \"cost\": 0.007222452, \"time\": 52.79307363033295}, \"6ef3b7127e\": {\"quality\": 0.20946236559139786, \"cost\": 
0.006944608000000001, \"time\": 77.88400571346283}, \"6f60a05c33\": {\"quality\": 0.06623655913978495, \"cost\": 0.0043084709999999995, \"time\": 119.35040576457976}, \"6fe0b3f929\": {\"quality\": 0.0, \"cost\": 0.006453737999999999, \"time\": 100.1590167760849}, \"700ab1d309\": {\"quality\": 0.0, \"cost\": 0.017191588, \"time\": 130.72283561229705}, \"7040e83d52\": {\"quality\": 0.07333333333333333, \"cost\": 0.021663679, \"time\": 123.93634688854218}, \"7046765af8\": {\"quality\": 0.10301075268817204, \"cost\": 0.035322934, \"time\": 155.01265740394592}, \"70b7c92ce8\": {\"quality\": 0.11456989247311827, \"cost\": 0.030668973000000002, \"time\": 145.30480256080628}, \"70c850e039\": {\"quality\": 0.0696774193548387, \"cost\": 0.0025364249999999997, \"time\": 93.23688249588012}, \"7112a7e64c\": {\"quality\": 0.10946236559139785, \"cost\": 0.005931566000000001, \"time\": 86.47371196746826}, \"71b615468b\": {\"quality\": 0.06, \"cost\": 0.011086878000000001, \"time\": 121.78951773643495}, \"723fd5589a\": {\"quality\": 0.0, \"cost\": 0.013953277000000002, \"time\": 128.76966230869294}, \"7250da0f41\": {\"quality\": 0.0, \"cost\": 0.007366451999999999, \"time\": 91.1376808166504}, \"7274a50778\": {\"quality\": 0.10301075268817204, \"cost\": 0.033229607999999994, \"time\": 102.63909165859224}, \"72d022ce33\": {\"quality\": 0.10301075268817204, \"cost\": 0.04128840500000001, \"time\": 126.40671186447145}, \"7347cf0308\": {\"quality\": 0.0, \"cost\": 0.0036175589999999994, \"time\": 66.07082681655885}, \"736e652158\": {\"quality\": 0.0, \"cost\": 0.006011901, \"time\": 124.95026433467865}, \"739b1f81dc\": {\"quality\": 0.10301075268817204, \"cost\": 0.001929351, \"time\": 59.82662603855133}, \"742a5c0552\": {\"quality\": 0.10623655913978494, \"cost\": 0.028237606000000005, \"time\": 58.23579893112183}, \"742ec1b2e1\": {\"quality\": 0.049677419354838714, \"cost\": 0.029807655000000002, \"time\": 101.90871527194977}, \"7435fd54f8\": {\"quality\": 0.06946236559139785, 
\"cost\": 0.030427277000000003, \"time\": 121.13120923042297}, \"7445d99939\": {\"quality\": 0.00967741935483871, \"cost\": 0.0071708580000000004, \"time\": 93.88335957527161}, \"7466a5f424\": {\"quality\": 0.21591397849462365, \"cost\": 0.015802706, \"time\": 113.56059277057648}, \"74a0be215b\": {\"quality\": 0.10623655913978494, \"cost\": 0.033583158, \"time\": 125.14361772537231}, \"74d7f64b8c\": {\"quality\": 0.0, \"cost\": 0.006800163, \"time\": 146.70624086856844}, \"7524905580\": {\"quality\": 0.10623655913978494, \"cost\": 0.004893289, \"time\": 87.59031102657318}, \"752d9649f2\": {\"quality\": 0.11268817204301075, \"cost\": 0.013868582, \"time\": 121.90489645004273}, \"75d61c2cd0\": {\"quality\": 0.10301075268817204, \"cost\": 0.029877515, \"time\": 85.95328326225281}, \"7604c0aa13\": {\"quality\": 0.10301075268817204, \"cost\": 0.010774968000000001, \"time\": 84.55494666099548}, \"76c09db721\": {\"quality\": 0.10301075268817204, \"cost\": 0.016476853, \"time\": 132.14744505882265}, \"774f268b66\": {\"quality\": 0.10946236559139785, \"cost\": 0.038069169, \"time\": 123.8245237827301}, \"7765576286\": {\"quality\": 0.09333333333333334, \"cost\": 0.008933451, \"time\": 130.5573842048645}, \"77c02b00c1\": {\"quality\": 0.05655913978494624, \"cost\": 0.011193958, \"time\": 124.06667771339417}, \"7801da66b9\": {\"quality\": 0.0, \"cost\": 0.004612473000000001, \"time\": 94.23449354171753}, \"782d52674e\": {\"quality\": 0.10301075268817204, \"cost\": 0.034843558000000004, \"time\": 141.27741813659668}, \"786e5d0af5\": {\"quality\": 0.10946236559139785, \"cost\": 0.009984510999999998, \"time\": 130.25511062145233}, \"7878563d63\": {\"quality\": 0.10946236559139785, \"cost\": 0.009147046, \"time\": 73.60474834442138}, \"79e1ca9b3c\": {\"quality\": 0.0, \"cost\": 0.02973492, \"time\": 82.63413181304932}, \"79fad58f07\": {\"quality\": 0.12623655913978493, \"cost\": 0.002492818, \"time\": 58.73518245220184}, \"7a207b42a8\": {\"quality\": 0.0, \"cost\": 0.008606742, 
\"time\": 142.3234792470932}, \"7a2cdc546c\": {\"quality\": 0.10301075268817204, \"cost\": 0.032290785, \"time\": 102.24121663570403}, \"7a42a77788\": {\"quality\": 0.10047619047619048, \"cost\": 0.031140232, \"time\": 94.29256238937378}, \"7a58d3472b\": {\"quality\": 0.09333333333333334, \"cost\": 0.013635729000000001, \"time\": 102.35271875858307}, \"7b3937c1f1\": {\"quality\": 0.10301075268817204, \"cost\": 0.009220510000000001, \"time\": 54.74555864334106}, \"7b6f44618e\": {\"quality\": 0.06946236559139785, \"cost\": 0.013006586, \"time\": 130.8397382736206}, \"7c62576527\": {\"quality\": 0.22424731182795699, \"cost\": 0.023333732000000003, \"time\": 129.530002951622}, \"7c89a2b69e\": {\"quality\": 0.0, \"cost\": 0.012572941, \"time\": 131.1510176181793}, \"7c96c9712f\": {\"quality\": 0.12301075268817205, \"cost\": 0.02800367, \"time\": 71.56067564487458}, \"7ca066aa1c\": {\"quality\": 0.10301075268817204, \"cost\": 0.017135469, \"time\": 126.58104584217071}, \"7cb5591f27\": {\"quality\": 0.12301075268817205, \"cost\": 0.037874200000000004, \"time\": 99.9374593257904}, \"7cf56a7fbc\": {\"quality\": 0.0, \"cost\": 0.03754542600000001, \"time\": 81.54608149528502}, \"7d44f0959d\": {\"quality\": 0.11015360983102919, \"cost\": 0.006577314, \"time\": 95.78297588825225}, \"7d60c38c5c\": {\"quality\": 0.10301075268817204, \"cost\": 0.0019278, \"time\": 62.340004205703735}, \"7d9b4535ac\": {\"quality\": 0.12301075268817205, \"cost\": 0.030719909000000004, \"time\": 124.86560246944427}, \"7daf7ff182\": {\"quality\": 0.21591397849462365, \"cost\": 0.011567735999999999, \"time\": 86.85500297546386}, \"7dcedb3d02\": {\"quality\": 0.11591397849462365, \"cost\": 0.034740937, \"time\": 118.09863307476044}, \"7e22f12cd1\": {\"quality\": 0.10946236559139785, \"cost\": 0.012891924999999999, \"time\": 121.22575757503509}, \"7e53a50b13\": {\"quality\": 0.10946236559139785, \"cost\": 0.014215881, \"time\": 90.20128819942474}, \"7ed07ad40a\": {\"quality\": 0.10623655913978494, 
\"cost\": 0.006843809999999999, \"time\": 95.13763875961303}, \"7fa67a7656\": {\"quality\": 0.06623655913978495, \"cost\": 0.008441146000000002, \"time\": 80.71841621398926}, \"7fc6c84bdf\": {\"quality\": 0.10301075268817204, \"cost\": 0.014648633, \"time\": 94.4585831642151}, \"7ff8a779cc\": {\"quality\": 0.049677419354838714, \"cost\": 0.033345343, \"time\": 90.90201032161713}, \"801af99400\": {\"quality\": 0.12623655913978493, \"cost\": 0.031287835, \"time\": 92.26307544708251}, \"806881adcb\": {\"quality\": 0.10623655913978494, \"cost\": 0.009180413, \"time\": 103.7968935251236}, \"80ad4122e4\": {\"quality\": 0.10946236559139785, \"cost\": 0.033705702000000004, \"time\": 147.08289949893953}, \"80be7df955\": {\"quality\": 0.0, \"cost\": 0.001751958, \"time\": 72.68495740890503}, \"80bf60c422\": {\"quality\": 0.0, \"cost\": 0.002103153, \"time\": 69.09833533763886}, \"81333c7a33\": {\"quality\": 0.19333333333333333, \"cost\": 0.018656088, \"time\": 122.73940353393554}, \"813e75210b\": {\"quality\": 0.12623655913978493, \"cost\": 0.002453274, \"time\": 31.607514262199402}, \"81660ae8b2\": {\"quality\": 0.00967741935483871, \"cost\": 0.03510192400000001, \"time\": 159.472354388237}, \"816958b5d1\": {\"quality\": 0.10301075268817204, \"cost\": 0.033492955000000005, \"time\": 150.83300273418428}, \"81ab2ef3f4\": {\"quality\": 0.08301075268817204, \"cost\": 0.015009652000000002, \"time\": 122.83408730030061}, \"829df73946\": {\"quality\": 0.10623655913978494, \"cost\": 0.0027339649999999997, \"time\": 95.04270524978638}, \"82ea1bd1b9\": {\"quality\": 0.10301075268817204, \"cost\": 0.036581163, \"time\": 130.3844569683075}, \"8357183895\": {\"quality\": 0.0, \"cost\": 0.009140319000000001, \"time\": 168.7040484428406}, \"8392a6083a\": {\"quality\": 0.11591397849462365, \"cost\": 0.013724715999999998, \"time\": 170.04993345737458}, \"83aee532b7\": {\"quality\": 0.0, \"cost\": 0.032044618999999996, \"time\": 169.97980024814606}, \"83b26646c3\": {\"quality\": 
0.10946236559139785, \"cost\": 0.027026598000000002, \"time\": 126.73678522109986}, \"83c9e66ec6\": {\"quality\": 0.10623655913978494, \"cost\": 0.0077078630000000006, \"time\": 122.85937674045563}, \"847a0e5db5\": {\"quality\": 0.10301075268817204, \"cost\": 0.033220362, \"time\": 118.9956719636917}, \"847fd49235\": {\"quality\": 0.10623655913978494, \"cost\": 0.004587912, \"time\": 120.00769526958466}, \"849100224d\": {\"quality\": 0.11729646697388633, \"cost\": 0.040131096000000005, \"time\": 162.80983967781066}, \"84b91c37ab\": {\"quality\": 0.10301075268817204, \"cost\": 0.035888904, \"time\": 185.50609340667725}, \"8519bef585\": {\"quality\": 0.0, \"cost\": 0.002883114, \"time\": 83.83753573894501}, \"85c94a5505\": {\"quality\": 0.10623655913978494, \"cost\": 0.003060162, \"time\": 47.44887166023254}, \"85eda38404\": {\"quality\": 0.06301075268817204, \"cost\": 0.00743115, \"time\": 97.06213176250458}, \"862183bfb9\": {\"quality\": 0.11591397849462365, \"cost\": 0.007656215, \"time\": 150.60711097717285}, \"8631e49c94\": {\"quality\": 0.10623655913978494, \"cost\": 0.032397725, \"time\": 178.85168232917786}, \"8668f65f05\": {\"quality\": 0.10301075268817204, \"cost\": 0.010037819, \"time\": 170.7291125535965}, \"86bf6375af\": {\"quality\": 0.07290322580645162, \"cost\": 0.034898615, \"time\": 167.93282201290128}, \"870e2f87b4\": {\"quality\": 0.09333333333333334, \"cost\": 0.014351701000000001, \"time\": 174.38446729183198}, \"887ad124e1\": {\"quality\": 0.10623655913978494, \"cost\": 0.0033872489999999997, \"time\": 141.54452052116392}, \"8886cb3082\": {\"quality\": 0.10623655913978494, \"cost\": 0.010027301, \"time\": 168.62686750888824}, \"88e71efa9b\": {\"quality\": 0.11591397849462365, \"cost\": 0.024657334, \"time\": 163.67923073768617}, \"8941621423\": {\"quality\": 0.10623655913978494, \"cost\": 0.006795555, \"time\": 135.67179551124573}, \"8950a6efe0\": {\"quality\": 0.10946236559139785, \"cost\": 0.029011140000000005, \"time\": 120.09507937431336}, 
\"8961e4d901\": {\"quality\": 0.02, \"cost\": 0.007544093999999999, \"time\": 115.46372551918029}, \"8974aa89a0\": {\"quality\": 0.08301075268817204, \"cost\": 0.0016073620000000002, \"time\": 48.72573773860931}, \"89836d2020\": {\"quality\": 0.21591397849462365, \"cost\": 0.03316775000000001, \"time\": 93.58950910568237}, \"89a289907e\": {\"quality\": 0.10301075268817204, \"cost\": 0.009082531000000001, \"time\": 127.31396248340606}, \"89a35a09b1\": {\"quality\": 0.10623655913978494, \"cost\": 0.007982379000000001, \"time\": 124.18780100345612}, \"8a3a35c762\": {\"quality\": 0.10946236559139785, \"cost\": 0.03612732600000001, \"time\": 119.14178504943848}, \"8a50695d1f\": {\"quality\": 0.043010752688172046, \"cost\": 0.016531714000000003, \"time\": 100.70311589241028}, \"8ab351aa13\": {\"quality\": 0.0, \"cost\": 0.040063444000000004, \"time\": 105.603106880188}, \"8ac8b5773a\": {\"quality\": 0.0696774193548387, \"cost\": 0.033945373, \"time\": 124.25757415294648}, \"8acd758b7f\": {\"quality\": 0.10623655913978494, \"cost\": 0.004647693, \"time\": 101.40544395446777}, \"8b10891ea5\": {\"quality\": 0.11333333333333334, \"cost\": 0.038892921999999996, \"time\": 124.35444567203521}, \"8b721bbc6f\": {\"quality\": 0.10946236559139785, \"cost\": 0.008056791, \"time\": 122.71747550964355}, \"8b77535cce\": {\"quality\": 0.12623655913978493, \"cost\": 0.035407286999999996, \"time\": 130.12753999233246}, \"8bbbe0f52a\": {\"quality\": 0.09279569892473119, \"cost\": 0.006340248, \"time\": 96.97207028865814}, \"8bc184f385\": {\"quality\": 0.10623655913978494, \"cost\": 0.0073677989999999995, \"time\": 97.76727423667907}, \"8bf5c3eadc\": {\"quality\": 0.02, \"cost\": 0.006312036, \"time\": 82.65424501895905}, \"8bf80a50cb\": {\"quality\": 0.11591397849462365, \"cost\": 0.0346818, \"time\": 110.19822652339934}, \"8c274ca255\": {\"quality\": 0.12301075268817205, \"cost\": 0.036505015, \"time\": 115.35032222270965}, \"8d7594020b\": {\"quality\": 0.12623655913978493, \"cost\": 
0.029785446, \"time\": 79.58442108631134}, \"8d79e03266\": {\"quality\": 0.10301075268817204, \"cost\": 0.008962553, \"time\": 91.70296292304992}, \"8d90814b94\": {\"quality\": 0.21591397849462365, \"cost\": 0.009549217000000002, \"time\": 107.88517692089081}, \"8e1a01da19\": {\"quality\": 0.00967741935483871, \"cost\": 0.030469137000000007, \"time\": 110.28689606189728}, \"8e2498635d\": {\"quality\": 0.12946236559139784, \"cost\": 0.067304136, \"time\": 88.78618555068971}, \"8e5842ccbd\": {\"quality\": 0.12623655913978493, \"cost\": 0.0035327910000000004, \"time\": 61.36363306045533}, \"8e5daf241e\": {\"quality\": 0.10301075268817204, \"cost\": 0.029329670000000002, \"time\": 80.64679687023164}, \"8e9715ee01\": {\"quality\": 0.13268817204301075, \"cost\": 0.034420532000000004, \"time\": 73.95220372676849}, \"8e9b7300d4\": {\"quality\": 0.10301075268817204, \"cost\": 0.029812478000000003, \"time\": 81.88405811786652}, \"8f29fab8ac\": {\"quality\": 0.05333333333333333, \"cost\": 0.034482348, \"time\": 119.19576361179352}, \"8f44d89429\": {\"quality\": 0.20301075268817204, \"cost\": 0.034302998, \"time\": 86.89617729187012}, \"8f4caddfe6\": {\"quality\": 0.07935483870967741, \"cost\": 0.014937176, \"time\": 100.91416237354278}, \"8f4edde3f0\": {\"quality\": 0.10623655913978494, \"cost\": 0.017718742000000003, \"time\": 124.73474152088164}, \"8f9cefbc22\": {\"quality\": 0.10623655913978494, \"cost\": 0.028400606000000002, \"time\": 59.505778431892395}, \"9025e2480f\": {\"quality\": 0.11268817204301075, \"cost\": 0.006730606, \"time\": 74.51114990711213}, \"9028588af4\": {\"quality\": 0.10623655913978494, \"cost\": 0.00524569, \"time\": 45.83101971149445}, \"9059fd80ad\": {\"quality\": 0.00967741935483871, \"cost\": 0.0016962140000000001, \"time\": 79.82600400447845}, \"90d5e40c1b\": {\"quality\": 0.0, \"cost\": 0.006609534, \"time\": 97.84093914031982}, \"90d9a86a2a\": {\"quality\": 0.0, \"cost\": 0.0016498349999999997, \"time\": 50.78291637897492}, \"90ff13783c\": 
{\"quality\": 0.10301075268817204, \"cost\": 0.010615371000000002, \"time\": 137.4936174631119}, \"90ff8eb055\": {\"quality\": 0.029677419354838707, \"cost\": 0.010129776, \"time\": 135.72056851387023}, \"9104e31369\": {\"quality\": 0.09333333333333334, \"cost\": 0.013592545000000001, \"time\": 145.33199377059935}, \"918983323f\": {\"quality\": 0.00967741935483871, \"cost\": 0.0012531780000000002, \"time\": 41.7289758682251}, \"91928dfdd9\": {\"quality\": 0.10623655913978494, \"cost\": 0.029831955000000007, \"time\": 108.31667373180389}, \"91c800af6b\": {\"quality\": 0.08301075268817204, \"cost\": 0.011924733, \"time\": 149.60272026062012}, \"91e841cfd5\": {\"quality\": 0.11268817204301075, \"cost\": 0.035107227000000005, \"time\": 144.8135196208954}, \"9253901a1f\": {\"quality\": 0.0, \"cost\": 0.038044404000000004, \"time\": 144.24513857364656}, \"9288642e53\": {\"quality\": 0.02, \"cost\": 0.006087138000000001, \"time\": 72.39175655841828}, \"92ba9c5be3\": {\"quality\": 0.13591397849462367, \"cost\": 0.030314066, \"time\": 89.64683423042297}, \"92c9dcd43b\": {\"quality\": 0.10301075268817204, \"cost\": 0.009152825, \"time\": 133.89202611446382}, \"93011c0821\": {\"quality\": 0.03935483870967742, \"cost\": 0.012697265000000001, \"time\": 135.9506582260132}, \"933b4d17dd\": {\"quality\": 0.10301075268817204, \"cost\": 0.004906763, \"time\": 94.6856369972229}, \"9373267bdb\": {\"quality\": 0.12623655913978493, \"cost\": 0.03234980700000001, \"time\": 123.84810786247253}, \"94010928c6\": {\"quality\": 0.08623655913978495, \"cost\": 0.007078176, \"time\": 105.30463340282441}, \"9403809e44\": {\"quality\": 0.08301075268817204, \"cost\": 0.012096366, \"time\": 135.0477834701538}, \"940c88ddc5\": {\"quality\": 0.06946236559139785, \"cost\": 0.010277144, \"time\": 97.58399150371551}, \"948f4081ba\": {\"quality\": 0.10623655913978494, \"cost\": 0.015952197, \"time\": 132.39694006443023}, \"94ac356663\": {\"quality\": 0.10301075268817204, \"cost\": 0.007538082, \"time\": 
100.63510990142822}, \"94dff9a424\": {\"quality\": 0.10301075268817204, \"cost\": 0.0058817290000000005, \"time\": 97.47921371459961}, \"9508356a2e\": {\"quality\": 0.0, \"cost\": 0.0074193060000000005, \"time\": 119.1678344964981}, \"9539d0e28c\": {\"quality\": 0.11913978494623656, \"cost\": 0.016965942000000005, \"time\": 127.95936200618743}, \"956bdcc254\": {\"quality\": 0.10301075268817204, \"cost\": 0.016353198000000003, \"time\": 113.47469205856324}, \"957d0dafc5\": {\"quality\": 0.12623655913978493, \"cost\": 0.03451419800000001, \"time\": 99.19511275291444}, \"9594b0c783\": {\"quality\": 0.06946236559139785, \"cost\": 0.0033617670000000003, \"time\": 103.66688406467438}, \"95a7b80c2a\": {\"quality\": 0.10301075268817204, \"cost\": 0.012909755, \"time\": 96.70905361175537}, \"964c671f18\": {\"quality\": 0.21591397849462365, \"cost\": 0.012280253000000001, \"time\": 130.05689902305602}, \"9679fe2b69\": {\"quality\": 0.10301075268817204, \"cost\": 0.003347709, \"time\": 92.87826173305511}, \"968fc95038\": {\"quality\": 0.10301075268817204, \"cost\": 0.0036856379999999993, \"time\": 98.10628998279572}, \"96b487c724\": {\"quality\": 0.08946236559139785, \"cost\": 0.00053203, \"time\": 54.895061421394345}, \"96c30205f5\": {\"quality\": 0.10623655913978494, \"cost\": 0.031391626, \"time\": 97.40448112487793}, \"96f87d6483\": {\"quality\": 0.07333333333333333, \"cost\": 0.012635357, \"time\": 153.3388079404831}, \"972c83b002\": {\"quality\": 0.029677419354838707, \"cost\": 0.007475796, \"time\": 118.32212336063385}, \"977a4d6b6b\": {\"quality\": 0.13591397849462367, \"cost\": 0.01375324, \"time\": 138.46970500946043}, \"97bc30bd83\": {\"quality\": 0.07612903225806453, \"cost\": 0.010801506999999998, \"time\": 111.81023037433624}, \"980db5f95f\": {\"quality\": 0.10301075268817204, \"cost\": 0.03036672, \"time\": 113.69273748397828}, \"9836765d41\": {\"quality\": 0.12301075268817205, \"cost\": 0.032121622, \"time\": 104.68945379257201}, \"99569e3937\": {\"quality\": 
0.10623655913978494, \"cost\": 0.035286567000000005, \"time\": 141.43686013221742}, \"99ea16a9a6\": {\"quality\": 0.10623655913978494, \"cost\": 0.009821013, \"time\": 108.473588347435}, \"9a5b39370f\": {\"quality\": 0.0, \"cost\": 0.036045704, \"time\": 155.65953776836395}, \"9aa4abfb50\": {\"quality\": 0.0, \"cost\": 0.003652884, \"time\": 89.60838494300842}, \"9ad7a98c31\": {\"quality\": 0.12301075268817205, \"cost\": 0.03251525400000001, \"time\": 111.1855703830719}, \"9b3fb79bcb\": {\"quality\": 0.12301075268817205, \"cost\": 0.010473083, \"time\": 107.05149478912352}, \"9b6d4915f3\": {\"quality\": 0.0, \"cost\": 0.00475386, \"time\": 53.93124535083771}, \"9bae5bafc1\": {\"quality\": 0.09333333333333334, \"cost\": 0.013743354000000001, \"time\": 105.66356384754181}, \"9be8a5f317\": {\"quality\": 0.10623655913978494, \"cost\": 0.038140224, \"time\": 141.94463317394258}, \"9c549db0a7\": {\"quality\": 0.0, \"cost\": 0.011336479, \"time\": 83.88670737743377}, \"9c85f8cfcb\": {\"quality\": 0.0, \"cost\": 0.002950311, \"time\": 71.82748806476593}, \"9c8cc46e6c\": {\"quality\": 0.06623655913978495, \"cost\": 0.0019796609999999997, \"time\": 72.90286383628845}, \"9c97d35a30\": {\"quality\": 0.10301075268817204, \"cost\": 0.005516262, \"time\": 115.19557769298552}, \"9cbe7858a2\": {\"quality\": 0.11333333333333334, \"cost\": 0.034743496000000006, \"time\": 140.37097189426424}, \"9ce2c3fd98\": {\"quality\": 0.0, \"cost\": 0.004397076, \"time\": 118.44574811458588}, \"9d18cd0737\": {\"quality\": 0.06301075268817204, \"cost\": 0.004985955, \"time\": 77.21244201660156}, \"9e06360bc9\": {\"quality\": 0.10946236559139785, \"cost\": 0.008386245, \"time\": 112.56666424274445}, \"9f07a95e69\": {\"quality\": 0.12301075268817205, \"cost\": 0.039345112, \"time\": 142.53303418159484}, \"9fb157be35\": {\"quality\": 0.04, \"cost\": 0.009244232, \"time\": 117.06155586242676}, \"a04ac8e33a\": {\"quality\": 0.0, \"cost\": 0.030044402000000005, \"time\": 83.63084626197815}, 
\"a04bc6e116\": {\"quality\": 0.04967741935483871, \"cost\": 0.029843946, \"time\": 100.13187789916992}, \"a0b81be5b4\": {\"quality\": 0.10301075268817204, \"cost\": 0.0022555170000000003, \"time\": 77.33556809425355}, \"a0c85d260e\": {\"quality\": 0.10623655913978494, \"cost\": 0.013999365, \"time\": 147.74334559440612}, \"a0dc9f50ac\": {\"quality\": 0.012903225806451613, \"cost\": 0.006003528, \"time\": 77.70861837863922}, \"a18225b7b5\": {\"quality\": 0.10301075268817204, \"cost\": 0.033204910000000004, \"time\": 99.84152095317842}, \"a1d822289e\": {\"quality\": 0.10301075268817204, \"cost\": 0.03703391, \"time\": 105.16314966678618}, \"a25596c056\": {\"quality\": 0.06301075268817204, \"cost\": 0.034485563999999996, \"time\": 138.92013294696807}, \"a2811c7324\": {\"quality\": 0.10623655913978494, \"cost\": 0.009388681000000001, \"time\": 112.0317987203598}, \"a2aa082d14\": {\"quality\": 0.0, \"cost\": 0.014266204, \"time\": 101.78627176284789}, \"a2cd339ad9\": {\"quality\": 0.09333333333333334, \"cost\": 0.010370297, \"time\": 147.42609903812408}, \"a2fd03e6a5\": {\"quality\": 0.07655913978494623, \"cost\": 0.004336301999999999, \"time\": 36.67434537410736}, \"a31e87d7cb\": {\"quality\": 0.06301075268817204, \"cost\": 0.001581552, \"time\": 76.9124368429184}, \"a3e23c327b\": {\"quality\": 0.06946236559139785, \"cost\": 0.03564489000000001, \"time\": 150.25721719264985}, \"a457f6c300\": {\"quality\": 0.0, \"cost\": 0.007949861999999999, \"time\": 132.3079260349274}, \"a47de025c8\": {\"quality\": 0.0, \"cost\": 0.004268559, \"time\": 89.55218677520752}, \"a515a9c8cc\": {\"quality\": 0.05333333333333333, \"cost\": 0.002060598, \"time\": 76.20932059288026}, \"a5949b76ec\": {\"quality\": 0.0, \"cost\": 0.002426688, \"time\": 69.78927192687988}, \"a60dd076b8\": {\"quality\": 0.10623655913978494, \"cost\": 0.002494836, \"time\": 77.4125370502472}, \"a6297a6c56\": {\"quality\": 0.06623655913978495, \"cost\": 0.0024065099999999997, \"time\": 65.99698326587676}, 
\"a62b7555b9\": {\"quality\": 0.08258064516129032, \"cost\": 0.03560092200000001, \"time\": 131.32008333206176}, \"a63f48e8ca\": {\"quality\": 0.0, \"cost\": 0.029433226000000003, \"time\": 74.86844005584717}, \"a6460dbb7c\": {\"quality\": 0.10623655913978494, \"cost\": 0.002761025, \"time\": 64.81612529754639}, \"a66a4cf4b0\": {\"quality\": 0.11591397849462365, \"cost\": 0.03962680600000001, \"time\": 131.48869993686674}, \"a6e2d69222\": {\"quality\": 0.10301075268817204, \"cost\": 0.029019002, \"time\": 97.08205032348633}, \"a717c4c535\": {\"quality\": 0.05290322580645161, \"cost\": 0.013103212, \"time\": 130.28173251152037}, \"a76afe9960\": {\"quality\": 0.12301075268817205, \"cost\": 0.030138277, \"time\": 103.21392459869384}, \"a7a6353090\": {\"quality\": 0.10623655913978494, \"cost\": 0.031531417000000006, \"time\": 103.30408613681793}, \"a80f6535b1\": {\"quality\": 0.06301075268817204, \"cost\": 0.005650186, \"time\": 66.3761245250702}, \"a86b137d7f\": {\"quality\": 0.06946236559139785, \"cost\": 0.002503022, \"time\": 67.13813652992249}, \"a88eb1493c\": {\"quality\": 0.10301075268817204, \"cost\": 0.0049795349999999985, \"time\": 64.7448471069336}, \"a89c533d6c\": {\"quality\": 0.0, \"cost\": 0.027630250000000002, \"time\": 46.682720041275026}, \"a8d8264600\": {\"quality\": 0.1296774193548387, \"cost\": 0.035944667, \"time\": 153.82529056072235}, \"a8eb36b210\": {\"quality\": 0.10301075268817204, \"cost\": 0.023975060000000003, \"time\": 108.62396328449249}, \"a95b4a6dd0\": {\"quality\": 0.08301075268817204, \"cost\": 0.00891948, \"time\": 111.40819778442383}, \"a9621ea4e6\": {\"quality\": 0.043010752688172046, \"cost\": 0.00758955, \"time\": 111.94071905612945}, \"a9721a0a50\": {\"quality\": 0.11268817204301075, \"cost\": 0.006963254, \"time\": 83.20404796600342}, \"a972b02c61\": {\"quality\": 0.12301075268817205, \"cost\": 0.016982312, \"time\": 36.22950010299682}, \"a9c5c4e311\": {\"quality\": 0.11456989247311827, \"cost\": 0.009789541, \"time\": 
112.16675176620484}, \"a9d96670eb\": {\"quality\": 0.12623655913978493, \"cost\": 0.028751066000000002, \"time\": 111.07726712226868}, \"a9e8c974d3\": {\"quality\": 0.0, \"cost\": 0.03628516, \"time\": 154.63389499187468}, \"aa08180e36\": {\"quality\": 0.07591397849462365, \"cost\": 0.010442482000000001, \"time\": 149.7024934768677}, \"aa38702a02\": {\"quality\": 0.06623655913978495, \"cost\": 0.00453825, \"time\": 115.99718639850616}, \"aadbfc418b\": {\"quality\": 0.15301075268817205, \"cost\": 0.0039624659999999996, \"time\": 115.4608172416687}, \"aaeb8b0010\": {\"quality\": 0.03612903225806452, \"cost\": 0.033459904, \"time\": 146.18061661720276}, \"ab288ee7f2\": {\"quality\": 0.043010752688172046, \"cost\": 0.031835803, \"time\": 151.1489573955536}, \"ab43b02cb0\": {\"quality\": 0.0, \"cost\": 0.010940676, \"time\": 85.55018713474274}, \"aba1d612cc\": {\"quality\": 0.10301075268817204, \"cost\": 0.04242697200000001, \"time\": 129.53015527725222}, \"ac208e7a1d\": {\"quality\": 0.20623655913978495, \"cost\": 0.010061258, \"time\": 87.22576036453248}, \"ac2224adbe\": {\"quality\": 0.0, \"cost\": 0.026542246000000005, \"time\": 73.37120480537413}, \"ac828ffe70\": {\"quality\": 0.07655913978494625, \"cost\": 0.000756074, \"time\": 55.469612050056455}, \"ac9fdc1550\": {\"quality\": 0.10301075268817204, \"cost\": 0.013684735, \"time\": 102.56252949237823}, \"aca957ecff\": {\"quality\": 0.11456989247311827, \"cost\": 0.03768476100000001, \"time\": 152.26708533763883}, \"acfe1ed920\": {\"quality\": 0.09456989247311828, \"cost\": 0.005188744, \"time\": 57.178893375396726}, \"ad3efe44c3\": {\"quality\": 0.0, \"cost\": 0.002697534, \"time\": 72.25062220096589}, \"ad41c95a99\": {\"quality\": 0.10623655913978494, \"cost\": 0.025689668000000002, \"time\": 115.99265196323395}, \"ad48432c22\": {\"quality\": 0.0, \"cost\": 0.009775953, \"time\": 191.9520122528076}, \"ad6ebbba8d\": {\"quality\": 0.0, \"cost\": 0.007917252999999999, \"time\": 140.29350352287292}, \"ad90055ef6\": 
{\"quality\": 0.0, \"cost\": 0.003889926, \"time\": 112.45582339763641}, \"adab1e0fb1\": {\"quality\": 0.10301075268817204, \"cost\": 0.006015302, \"time\": 91.00434277057647}, \"ae655ec593\": {\"quality\": 0.08301075268817204, \"cost\": 0.004026768, \"time\": 141.7740585565567}, \"ae94b172be\": {\"quality\": 0.06301075268817204, \"cost\": 0.007697588, \"time\": 143.52500240802766}, \"aec9dc5873\": {\"quality\": 0.0, \"cost\": 0.027518260000000003, \"time\": 96.50754041671753}, \"af360c323c\": {\"quality\": 0.0, \"cost\": 0.0033090330000000003, \"time\": 149.9352635383606}, \"af90567194\": {\"quality\": 0.10623655913978494, \"cost\": 0.013733104999999999, \"time\": 181.40714287757874}, \"afe77d0f89\": {\"quality\": 0.11591397849462365, \"cost\": 0.03486629, \"time\": 177.79787800312042}, \"b0948c05b6\": {\"quality\": 0.09333333333333334, \"cost\": 0.01202388, \"time\": 110.92746062278746}, \"b0c4a6640b\": {\"quality\": 0.10301075268817204, \"cost\": 0.031655476, \"time\": 155.44975624084475}, \"b12caafd58\": {\"quality\": 0.06301075268817204, \"cost\": 0.03197457, \"time\": 155.3449794769287}, \"b18168b9c1\": {\"quality\": 0.08301075268817204, \"cost\": 0.0077949479999999995, \"time\": 119.46661601066589}, \"b1a7428a01\": {\"quality\": 0.12301075268817205, \"cost\": 0.034865056000000005, \"time\": 145.9925803422928}, \"b1acdebb48\": {\"quality\": 0.0929032258064516, \"cost\": 0.04751790600000001, \"time\": 124.27810626029968}, \"b1b06f4ee7\": {\"quality\": 0.0, \"cost\": 0.05763750000000001, \"time\": 95.25730528831483}, \"b1cf8d33e5\": {\"quality\": 0.08301075268817204, \"cost\": 0.007401675000000001, \"time\": 103.68885869979857}, \"b1e9ab6b1a\": {\"quality\": 0.10946236559139785, \"cost\": 0.038297786, \"time\": 146.62364199161527}, \"b2b057ba41\": {\"quality\": 0.10301075268817204, \"cost\": 0.00373947, \"time\": 102.88082673549653}, \"b2e063499d\": {\"quality\": 0.21591397849462365, \"cost\": 0.009003760999999999, \"time\": 103.52594914436341}, \"b33412410e\": 
{\"quality\": 0.0, \"cost\": 0.02474838, \"time\": 82.1103196144104}, \"b3369775dc\": {\"quality\": 0.10301075268817204, \"cost\": 0.012564327, \"time\": 147.59253644943237}, \"b3b9205f60\": {\"quality\": 0.07591397849462365, \"cost\": 0.020412146000000003, \"time\": 143.55707364082338}, \"b3c56f0b3c\": {\"quality\": 0.10623655913978494, \"cost\": 0.04188011999999999, \"time\": 104.512482714653}, \"b3f20b706d\": {\"quality\": 0.0, \"cost\": 0.0030317039999999996, \"time\": 98.11458349227905}, \"b4002173ee\": {\"quality\": 0.08301075268817204, \"cost\": 0.0054744, \"time\": 75.72368431091309}, \"b46d382384\": {\"quality\": 0.10301075268817204, \"cost\": 0.05967477, \"time\": 110.93994162082672}, \"b4b2482ef9\": {\"quality\": 0.07591397849462365, \"cost\": 0.033919934, \"time\": 114.34581046104432}, \"b4be043238\": {\"quality\": 0.12301075268817205, \"cost\": 0.04046709600000001, \"time\": 111.69091956615448}, \"b531bd0548\": {\"quality\": 0.10623655913978494, \"cost\": 0.042357856, \"time\": 138.67396595478056}, \"b56c312eda\": {\"quality\": 0.08301075268817204, \"cost\": 0.008939847, \"time\": 141.99621114730834}, \"b5e2b41c1c\": {\"quality\": 0.10623655913978494, \"cost\": 0.004302350999999999, \"time\": 110.59975845813752}, \"b61ce57a90\": {\"quality\": 0.11591397849462365, \"cost\": 0.020101276, \"time\": 140.674871301651}, \"b64ddb14f9\": {\"quality\": 0.10946236559139785, \"cost\": 0.0024013059999999998, \"time\": 41.2197571516037}, \"b67107a43e\": {\"quality\": 0.05333333333333333, \"cost\": 0.0056747790000000005, \"time\": 151.14496650695799}, \"b67720aa5c\": {\"quality\": 0.0, \"cost\": 0.029933376000000005, \"time\": 112.10192131996155}, \"b682a23b89\": {\"quality\": 0.10301075268817204, \"cost\": 0.004969453, \"time\": 109.97392725944519}, \"b69ef5add4\": {\"quality\": 0.04, \"cost\": 0.033153256000000006, \"time\": 127.61734595298768}, \"b796b7ffd3\": {\"quality\": 0.10946236559139785, \"cost\": 0.009056132000000001, \"time\": 156.37638986110687}, 
\"b7a0083dc4\": {\"quality\": 0.06, \"cost\": 0.029241969, \"time\": 111.82368907928466}, \"b7d0e8557f\": {\"quality\": 0.06623655913978495, \"cost\": 0.008024637, \"time\": 146.9499261379242}, \"b8317a3a8c\": {\"quality\": 0.10301075268817204, \"cost\": 0.040210949, \"time\": 126.69586458206176}, \"b8ab3d2f25\": {\"quality\": 0.11591397849462365, \"cost\": 0.008929705999999999, \"time\": 119.44116389751434}, \"b8b569172f\": {\"quality\": 0.10301075268817204, \"cost\": 0.010143565999999998, \"time\": 119.48254477977753}, \"b8f5ab44bb\": {\"quality\": 0.13591397849462367, \"cost\": 0.039590358000000006, \"time\": 144.63666880130768}, \"b91e7fdb29\": {\"quality\": 0.0, \"cost\": 0.009813995999999998, \"time\": 178.9033116340637}, \"b932beaaa6\": {\"quality\": 0.10301075268817204, \"cost\": 0.03316909800000001, \"time\": 101.8533546447754}, \"b9770c2261\": {\"quality\": 0.10301075268817204, \"cost\": 0.00194208, \"time\": 73.90803687572479}, \"b9bb1e6f8d\": {\"quality\": 0.06301075268817204, \"cost\": 0.012499610999999999, \"time\": 150.40043871402742}, \"b9d0e8740c\": {\"quality\": 0.10623655913978494, \"cost\": 0.034839880000000004, \"time\": 144.61329133510588}, \"b9da208432\": {\"quality\": 0.10301075268817204, \"cost\": 0.009705881, \"time\": 142.17393441200255}, \"ba3223f6ac\": {\"quality\": 0.04, \"cost\": 0.004067208, \"time\": 117.73131382465363}, \"bb13365175\": {\"quality\": 0.10946236559139785, \"cost\": 0.022399919, \"time\": 149.47980670928956}, \"bb1b3a4d29\": {\"quality\": 0.10301075268817204, \"cost\": 0.031956166, \"time\": 149.79417490959167}, \"bb6536b0ab\": {\"quality\": 0.06946236559139785, \"cost\": 0.010785974, \"time\": 102.99912581443786}, \"bbba9dd6ae\": {\"quality\": 0.11591397849462365, \"cost\": 0.005964354, \"time\": 77.64485285282134}, \"bbde69a1ae\": {\"quality\": 0.08967741935483872, \"cost\": 0.03738282200000001, \"time\": 164.79430203437806}, \"bc29a0c0fe\": {\"quality\": 0.11268817204301075, \"cost\": 0.033914098000000004, 
\"time\": 101.38525047302247}, \"bc3d02f753\": {\"quality\": 0.0, \"cost\": 0.0025910279999999996, \"time\": 93.81551551818848}, \"bc4c1fcc64\": {\"quality\": 0.20301075268817204, \"cost\": 0.0059535610000000004, \"time\": 67.91742668151855}, \"bd30d27f62\": {\"quality\": 0.12301075268817205, \"cost\": 0.062644566, \"time\": 121.11297037601472}, \"bd99b2fb21\": {\"quality\": 0.09333333333333334, \"cost\": 0.008204478, \"time\": 122.81714386940003}, \"bddc7d2a34\": {\"quality\": 0.10301075268817204, \"cost\": 0.03213537500000001, \"time\": 163.16414675712585}, \"be2ae88f70\": {\"quality\": 0.0, \"cost\": 0.004693302, \"time\": 137.78757257461547}, \"be4740f38f\": {\"quality\": 0.09655913978494624, \"cost\": 0.011966032000000001, \"time\": 123.94724879264831}, \"bec0c6a95f\": {\"quality\": 0.11134408602150538, \"cost\": 0.044889398, \"time\": 164.90446269512177}, \"bed888d4dc\": {\"quality\": 0.03612903225806452, \"cost\": 0.007983669, \"time\": 159.96076278686525}, \"bf45e407f6\": {\"quality\": 0.05290322580645161, \"cost\": 0.039748443999999994, \"time\": 127.84493451118469}, \"bf5550f320\": {\"quality\": 0.07333333333333333, \"cost\": 0.035326152, \"time\": 161.38907580375673}, \"bf87e58322\": {\"quality\": 0.10301075268817204, \"cost\": 0.02893854, \"time\": 130.29030165672305}, \"bfed7670ed\": {\"quality\": 0.08301075268817204, \"cost\": 0.010171252, \"time\": 128.07581346035005}, \"c0541e2220\": {\"quality\": 0.0, \"cost\": 0.037621554, \"time\": 99.32873375415802}, \"c0e10c0048\": {\"quality\": 0.0, \"cost\": 0.03382517, \"time\": 166.7884413957596}, \"c127509a7a\": {\"quality\": 0.12301075268817205, \"cost\": 0.013034976, \"time\": 122.95744087696076}, \"c13682c7c7\": {\"quality\": 0.11258064516129032, \"cost\": 0.012015295999999998, \"time\": 151.76094684600832}, \"c13d6e78e9\": {\"quality\": 0.12301075268817205, \"cost\": 0.017051258000000003, \"time\": 136.8413455247879}, \"c14ff3144d\": {\"quality\": 0.06623655913978495, \"cost\": 0.0015566399999999998, 
\"time\": 67.90232226848602}, \"c1e42ac47b\": {\"quality\": 0.10946236559139785, \"cost\": 0.039353246, \"time\": 155.5742819786072}, \"c2949aa902\": {\"quality\": 0.10301075268817204, \"cost\": 0.007100054999999999, \"time\": 123.65010952949524}, \"c31e956b35\": {\"quality\": 0.10301075268817204, \"cost\": 0.009654768000000001, \"time\": 160.06434454917905}, \"c36b525dde\": {\"quality\": 0.10623655913978494, \"cost\": 0.003221145, \"time\": 89.18764023780822}, \"c38326e2bd\": {\"quality\": 0.10946236559139785, \"cost\": 0.036419284, \"time\": 149.7904277563095}, \"c3ec2cec59\": {\"quality\": 0.11591397849462365, \"cost\": 0.012373450000000001, \"time\": 124.03460397720337}, \"c44720575f\": {\"quality\": 0.10623655913978494, \"cost\": 0.011866406, \"time\": 114.06694071292877}, \"c48ecefab6\": {\"quality\": 0.0, \"cost\": 0.031072968000000006, \"time\": 158.07063992023467}, \"c4a64eb40f\": {\"quality\": 0.10301075268817204, \"cost\": 0.038885108, \"time\": 112.12137053012847}, \"c4a80d19b3\": {\"quality\": 0.10301075268817204, \"cost\": 0.01095102, \"time\": 163.5285721540451}, \"c4c2826afd\": {\"quality\": 0.02258064516129032, \"cost\": 0.009101021, \"time\": 162.38105652332305}, \"c4c94a5527\": {\"quality\": 0.0, \"cost\": 0.0075986579999999995, \"time\": 122.43854382038117}, \"c4e75ee9ba\": {\"quality\": 0.14946236559139786, \"cost\": 0.06605840600000001, \"time\": 123.62291731834412}, \"c4f3e7665d\": {\"quality\": 0.04, \"cost\": 0.004755744000000001, \"time\": 128.4701939582825}, \"c5471bef57\": {\"quality\": 0.0, \"cost\": 0.034792454, \"time\": 174.61682531833648}, \"c54a408db7\": {\"quality\": 0.11729646697388633, \"cost\": 0.028817849000000003, \"time\": 119.10575988292695}, \"c59cc41335\": {\"quality\": 0.02, \"cost\": 0.029414305, \"time\": 127.53696141242982}, \"c5a0b065e0\": {\"quality\": 0.0, \"cost\": 0.011118656000000001, \"time\": 86.77302918434142}, \"c5a16b834a\": {\"quality\": 0.10623655913978494, \"cost\": 0.009676605, \"time\": 
163.57068514823914}, \"c5fbe2076f\": {\"quality\": 0.09333333333333334, \"cost\": 0.016622912, \"time\": 171.14971058368684}, \"c617370f6b\": {\"quality\": 0.11134408602150538, \"cost\": 0.028784580000000004, \"time\": 123.97298481464387}, \"c67f782c7f\": {\"quality\": 0.10301075268817204, \"cost\": 0.031540366, \"time\": 122.78840267658234}, \"c691a29c42\": {\"quality\": 0.0696774193548387, \"cost\": 0.015512881999999999, \"time\": 156.41156446933746}, \"c6a339987c\": {\"quality\": 0.0, \"cost\": 0.009196331000000002, \"time\": 160.24627866744996}, \"c772ff3704\": {\"quality\": 0.06, \"cost\": 0.008646966, \"time\": 149.87581686973573}, \"c7e3f348c2\": {\"quality\": 0.10946236559139785, \"cost\": 0.03753916200000001, \"time\": 149.64094378948212}, \"c823589ab6\": {\"quality\": 0.03333333333333333, \"cost\": 0.030336296, \"time\": 124.21261048316956}, \"c82f834e85\": {\"quality\": 0.11591397849462365, \"cost\": 0.007749193, \"time\": 110.63324115276336}, \"c85099881f\": {\"quality\": 0.0032258064516129032, \"cost\": 0.00896294, \"time\": 77.65277953147888}, \"c935a33384\": {\"quality\": 0.11591397849462365, \"cost\": 0.013818304, \"time\": 147.3745896100998}, \"ca3177461f\": {\"quality\": 0.07290322580645162, \"cost\": 0.032201930000000004, \"time\": 144.48511946201324}, \"caa7c0bd6b\": {\"quality\": 0.08623655913978495, \"cost\": 0.007444665, \"time\": 109.04434747695923}, \"cac6b051e9\": {\"quality\": 0.0696774193548387, \"cost\": 0.008080419, \"time\": 151.42419612407684}, \"cacb342f64\": {\"quality\": 0.10946236559139785, \"cost\": 0.04521175200000001, \"time\": 148.37158913612365}, \"cb9948679c\": {\"quality\": 0.03612903225806452, \"cost\": 0.031414252000000004, \"time\": 155.25814893245698}, \"cbb5eb0e74\": {\"quality\": 0.10623655913978494, \"cost\": 0.008005419, \"time\": 159.4877107143402}, \"cbc32cbeff\": {\"quality\": 0.07729646697388634, \"cost\": 0.009278367, \"time\": 176.14444167613982}, \"cbd4461293\": {\"quality\": 0.09655913978494624, \"cost\": 
0.0019236219999999998, \"time\": 56.308462142944336}, \"cbe2318045\": {\"quality\": 0.09602150537634407, \"cost\": 0.008356266000000001, \"time\": 120.74416449069977}, \"cc886fe337\": {\"quality\": 0.03333333333333333, \"cost\": 0.006418602, \"time\": 123.71861448287964}, \"cc9a6248a0\": {\"quality\": 0.08301075268817204, \"cost\": 0.027855250000000005, \"time\": 40.938735127449036}, \"ccb2335b3f\": {\"quality\": 0.08946236559139785, \"cost\": 0.03473691600000001, \"time\": 175.81531076431276}, \"ccdf03a55b\": {\"quality\": 0.04, \"cost\": 0.04126820400000002, \"time\": 147.74624931812286}, \"ccf72745c1\": {\"quality\": 0.11268817204301075, \"cost\": 0.010027356, \"time\": 131.06223764419556}, \"cd1d418732\": {\"quality\": 0.0, \"cost\": 0.005709352, \"time\": 72.77172949314118}, \"cd23c79db1\": {\"quality\": 0.049677419354838714, \"cost\": 0.009460574000000001, \"time\": 154.12733085155486}, \"cd64fbfcd9\": {\"quality\": 0.08623655913978495, \"cost\": 0.0068760869999999995, \"time\": 149.53058531284333}, \"cd85a01e81\": {\"quality\": 0.07333333333333333, \"cost\": 0.034051955, \"time\": 182.54483699798584}, \"ce4bc5f348\": {\"quality\": 0.0, \"cost\": 0.003405222, \"time\": 103.34760620594025}, \"ce980cf86f\": {\"quality\": 0.10301075268817204, \"cost\": 0.0039973019999999995, \"time\": 133.40581440925598}, \"ceae8b8bb9\": {\"quality\": 0.10946236559139785, \"cost\": 0.020052325000000003, \"time\": 167.7911945104599}, \"cecca90dd2\": {\"quality\": 0.10301075268817204, \"cost\": 0.004860323999999999, \"time\": 140.68442809581757}, \"cf9538faf0\": {\"quality\": 0.17052227342549925, \"cost\": 0.00948961, \"time\": 133.63166942596436}, \"cf9d2e224c\": {\"quality\": 0.06, \"cost\": 0.006049894, \"time\": 97.01887967586518}, \"cfd36f3a8c\": {\"quality\": 0.10301075268817204, \"cost\": 0.015497412000000002, \"time\": 141.9499899148941}, \"cffa29a6ef\": {\"quality\": 0.12301075268817205, \"cost\": 0.0005813769999999999, \"time\": 62.55750501155853}, \"d03596c3de\": 
{\"quality\": 0.10946236559139785, \"cost\": 0.007728069000000001, \"time\": 166.9729477405548}, \"d07b766487\": {\"quality\": 0.07935483870967741, \"cost\": 0.03461312800000001, \"time\": 162.30493545532227}, \"d0a0a66d75\": {\"quality\": 0.11268817204301075, \"cost\": 0.022896756000000004, \"time\": 80.08134040832519}, \"d0ce31134c\": {\"quality\": 0.05591397849462365, \"cost\": 0.038193279999999996, \"time\": 161.50320081710817}, \"d0f9633442\": {\"quality\": 0.0, \"cost\": 0.008286096, \"time\": 140.58964149951936}, \"d216eab7d8\": {\"quality\": 0.09268817204301075, \"cost\": 0.008950572, \"time\": 116.59772584438323}, \"d266c19ac8\": {\"quality\": 0.10301075268817204, \"cost\": 0.005751029999999999, \"time\": 161.90357491970062}, \"d26a70179a\": {\"quality\": 0.10623655913978494, \"cost\": 0.03584914900000001, \"time\": 172.63189315795898}, \"d2af24b59e\": {\"quality\": 0.06, \"cost\": 0.008735434, \"time\": 124.40694677829742}, \"d2f2dd5cd4\": {\"quality\": 0.13623655913978494, \"cost\": 0.023797876000000003, \"time\": 131.33156244754792}, \"d302278f85\": {\"quality\": 0.00967741935483871, \"cost\": 0.030764487000000004, \"time\": 169.61139817237853}, \"d37dcaea30\": {\"quality\": 0.10946236559139785, \"cost\": 0.019287263, \"time\": 164.99651217460632}, \"d3a2d50bd7\": {\"quality\": 0.07333333333333333, \"cost\": 0.004375113, \"time\": 134.56950817108154}, \"d3d4185487\": {\"quality\": 0.06322580645161291, \"cost\": 0.033443476, \"time\": 123.06616494655609}, \"d3db4cf84d\": {\"quality\": 0.0, \"cost\": 0.004757544, \"time\": 99.5310351371765}, \"d402233b53\": {\"quality\": 0.10946236559139785, \"cost\": 0.022382061, \"time\": 155.35112550258637}, \"d43fafa19e\": {\"quality\": 0.00967741935483871, \"cost\": 0.004969002, \"time\": 85.30777862071992}, \"d446a75eb7\": {\"quality\": 0.10946236559139785, \"cost\": 0.045008152, \"time\": 111.52242555618287}, \"d48ead13da\": {\"quality\": 0.02, \"cost\": 0.00741723, \"time\": 123.98356997966766}, \"d5016f4538\": 
{\"quality\": 0.0, \"cost\": 0.00508065, \"time\": 85.11343963146209}, \"d55a3613b0\": {\"quality\": 0.10623655913978494, \"cost\": 0.041770765, \"time\": 152.50331590175628}, \"d58036ba66\": {\"quality\": 0.10301075268817204, \"cost\": 0.008145371999999998, \"time\": 106.38735165596009}, \"d5a84c782e\": {\"quality\": 0.10623655913978494, \"cost\": 0.029261760000000005, \"time\": 104.74087455272675}, \"d5b2eef11c\": {\"quality\": 0.12301075268817205, \"cost\": 0.02737009, \"time\": 65.21177852153778}, \"d6040140b9\": {\"quality\": 0.10301075268817204, \"cost\": 0.014017109, \"time\": 153.3877902984619}, \"d65185c1a4\": {\"quality\": 0.10623655913978494, \"cost\": 0.00525975, \"time\": 117.94775750637055}, \"d667351f33\": {\"quality\": 0.10301075268817204, \"cost\": 0.035737883, \"time\": 160.19842824935913}, \"d6bd3b66ba\": {\"quality\": 0.04, \"cost\": 0.004024998, \"time\": 126.8981255054474}, \"d6c4e48eeb\": {\"quality\": 0.10946236559139785, \"cost\": 0.008899966, \"time\": 125.08437695503235}, \"d6cbf265ee\": {\"quality\": 0.0, \"cost\": 0.001772622, \"time\": 80.58833994865418}, \"d705447fd7\": {\"quality\": 0.02, \"cost\": 0.03513882900000001, \"time\": 158.08452146053315}, \"d73a9aab4e\": {\"quality\": 0.0, \"cost\": 0.003741966, \"time\": 86.37196826934814}, \"d752c30d07\": {\"quality\": 0.10301075268817204, \"cost\": 0.016376438, \"time\": 126.43599452972413}, \"d782682359\": {\"quality\": 0.12301075268817205, \"cost\": 0.035046282000000005, \"time\": 158.43969354629516}, \"d7c0972014\": {\"quality\": 0.07333333333333333, \"cost\": 0.010681736, \"time\": 93.32859888076783}, \"d867525748\": {\"quality\": 0.06301075268817204, \"cost\": 0.01181131, \"time\": 120.1785500049591}, \"d87eb775da\": {\"quality\": 0.10946236559139785, \"cost\": 0.008138371, \"time\": 109.30400257110595}, \"d8bab6c09b\": {\"quality\": 0.0696774193548387, \"cost\": 0.011994469, \"time\": 164.43309626579287}, \"d8bcac36e8\": {\"quality\": 0.03333333333333333, \"cost\": 
0.010924840000000002, \"time\": 119.18308181762694}, \"d8eadc0190\": {\"quality\": 0.10301075268817204, \"cost\": 0.039214723, \"time\": 115.4887844324112}, \"d96677d8d4\": {\"quality\": 0.10301075268817204, \"cost\": 0.006749594999999999, \"time\": 111.51266958713532}, \"d98f22270e\": {\"quality\": 0.20301075268817204, \"cost\": 0.034043790000000004, \"time\": 121.76555292606355}, \"d9e2bb21a3\": {\"quality\": 0.05290322580645161, \"cost\": 0.005608887, \"time\": 109.89146270751954}, \"da95deeb20\": {\"quality\": 0.10301075268817204, \"cost\": 0.027880385, \"time\": 84.15849900245667}, \"daaadadcc9\": {\"quality\": 0.11591397849462365, \"cost\": 0.010063653999999998, \"time\": 151.10103166103363}, \"daf855e065\": {\"quality\": 0.11591397849462365, \"cost\": 0.006424737, \"time\": 158.07599256038665}, \"db00594832\": {\"quality\": 0.12623655913978493, \"cost\": 0.037128378000000004, \"time\": 93.38327257633209}, \"db19e677c4\": {\"quality\": 0.12301075268817205, \"cost\": 0.039896044, \"time\": 130.70764796733857}, \"db3c035639\": {\"quality\": 0.11591397849462365, \"cost\": 0.036307984, \"time\": 152.2756203889847}, \"db41487005\": {\"quality\": 0.0, \"cost\": 0.029024420000000002, \"time\": 115.28090765476227}, \"db6a7482fd\": {\"quality\": 0.11333333333333334, \"cost\": 0.027559395, \"time\": 72.33171219825743}, \"db9060cd27\": {\"quality\": 0.0696774193548387, \"cost\": 0.001964664, \"time\": 64.76092283725738}, \"dbce95a072\": {\"quality\": 0.10301075268817204, \"cost\": 0.007502654000000001, \"time\": 103.14728391170502}, \"dc195abe5e\": {\"quality\": 0.10623655913978494, \"cost\": 0.030985093999999998, \"time\": 136.5469225883484}, \"dc3f4b7138\": {\"quality\": 0.02, \"cost\": 0.028686388000000007, \"time\": 102.1283097743988}, \"dc66bccb1c\": {\"quality\": 0.11456989247311827, \"cost\": 0.006545776, \"time\": 108.06170182228088}, \"dc90065dea\": {\"quality\": 0.10301075268817204, \"cost\": 0.010758259999999999, \"time\": 109.67339315414429}, \"dd0d70fedd\": 
{\"quality\": 0.10301075268817204, \"cost\": 0.0019252949999999998, \"time\": 62.47777860164643}, \"dddc76b3ca\": {\"quality\": 0.12301075268817205, \"cost\": 0.03317511000000001, \"time\": 121.71198806762695}, \"de18bf45e1\": {\"quality\": 0.016129032258064516, \"cost\": 0.001520226, \"time\": 81.24462885856629}, \"de1e56370f\": {\"quality\": 0.10301075268817204, \"cost\": 0.004404816000000001, \"time\": 107.52295508384705}, \"df2160ecc8\": {\"quality\": 0.10301075268817204, \"cost\": 0.03259985, \"time\": 121.08217024803162}, \"dfda94bd2a\": {\"quality\": 0.012903225806451613, \"cost\": 0.001747728, \"time\": 94.69696862697602}, \"dff452a9ca\": {\"quality\": 0.0, \"cost\": 0.005642568000000001, \"time\": 113.53185505867005}, \"e09f75d9d6\": {\"quality\": 0.12301075268817205, \"cost\": 0.039859539, \"time\": 121.64791254997255}, \"e0b6a99753\": {\"quality\": 0.10301075268817204, \"cost\": 0.008541190000000002, \"time\": 52.27649214267731}, \"e0cf6587a7\": {\"quality\": 0.07612903225806453, \"cost\": 0.035429448, \"time\": 154.11263673305513}, \"e0de4a5929\": {\"quality\": 0.11591397849462365, \"cost\": 0.03794565500000001, \"time\": 146.6949642419815}, \"e1356fb426\": {\"quality\": 0.10301075268817204, \"cost\": 0.03224261, \"time\": 113.18122699260712}, \"e20ba014a1\": {\"quality\": 0.10623655913978494, \"cost\": 0.010716353000000001, \"time\": 146.11545753479004}, \"e21806e3bc\": {\"quality\": 0.06946236559139785, \"cost\": 0.011148466, \"time\": 108.38995244503022}, \"e24e97564c\": {\"quality\": 0.10623655913978494, \"cost\": 0.034947423000000005, \"time\": 141.45279150009156}, \"e2673c1ec8\": {\"quality\": 0.10301075268817204, \"cost\": 0.035187995, \"time\": 133.8898230791092}, \"e26c7bfbdb\": {\"quality\": 0.08301075268817204, \"cost\": 0.007581762000000001, \"time\": 99.8463338136673}, \"e2f9980b06\": {\"quality\": 0.11591397849462365, \"cost\": 0.019417651, \"time\": 133.10050780773162}, \"e3445f7632\": {\"quality\": 0.10623655913978494, \"cost\": 
0.004820125, \"time\": 63.44422569274902}, \"e35e5f81a7\": {\"quality\": 0.0, \"cost\": 0.006810027, \"time\": 115.5727329492569}, \"e376ac53e7\": {\"quality\": 0.08946236559139785, \"cost\": 0.024290850000000003, \"time\": 110.75653586387634}, \"e3d8bb56da\": {\"quality\": 0.10301075268817204, \"cost\": 0.021544511000000002, \"time\": 116.27818155288696}, \"e3df4cf041\": {\"quality\": 0.0, \"cost\": 0.007794071, \"time\": 129.81809906959535}, \"e47dc3abca\": {\"quality\": 0.10623655913978494, \"cost\": 0.011515712, \"time\": 132.7535306930542}, \"e4b9d4fb41\": {\"quality\": 0.02, \"cost\": 0.007862585, \"time\": 139.48534836769102}, \"e510bda989\": {\"quality\": 0.13268817204301075, \"cost\": 0.005762856, \"time\": 33.74014439582825}, \"e517cd2222\": {\"quality\": 0.08761904761904762, \"cost\": 0.001565182, \"time\": 53.67046070098877}, \"e51b01f418\": {\"quality\": 0.07290322580645162, \"cost\": 0.005801265, \"time\": 138.57358028888703}, \"e520dfae5b\": {\"quality\": 0.10946236559139785, \"cost\": 0.006193371, \"time\": 176.72269680500028}, \"e521c9b7e4\": {\"quality\": 0.12301075268817205, \"cost\": 0.034744782, \"time\": 125.99484462738036}, \"e54097ad5d\": {\"quality\": 0.10301075268817204, \"cost\": 0.030318109000000006, \"time\": 143.14947426319122}, \"e56a16ca66\": {\"quality\": 0.10623655913978494, \"cost\": 0.0019251749999999997, \"time\": 100.41836931705475}, \"e5c4abf7ce\": {\"quality\": 0.10301075268817204, \"cost\": 0.03347102800000001, \"time\": 170.73657705783845}, \"e5d4689312\": {\"quality\": 0.0, \"cost\": 0.03450822800000001, \"time\": 180.7019939184189}, \"e62a7b27ae\": {\"quality\": 0.049677419354838714, \"cost\": 0.026221878, \"time\": 142.64135072231295}, \"e6a7aff3bc\": {\"quality\": 0.0, \"cost\": 0.032406869000000005, \"time\": 182.2341136932373}, \"e736999157\": {\"quality\": 0.04, \"cost\": 0.031729251, \"time\": 175.4835301399231}, \"e7517a8ce0\": {\"quality\": 0.12301075268817205, \"cost\": 0.034507195000000004, \"time\": 
118.30208179950714}, \"e7520ca5ac\": {\"quality\": 0.12301075268817205, \"cost\": 0.03356782800000001, \"time\": 164.09410247802734}, \"e7e94ab7a5\": {\"quality\": 0.0, \"cost\": 0.001703792, \"time\": 94.30426328182222}, \"e887ddf5cc\": {\"quality\": 0.10623655913978494, \"cost\": 0.041667262, \"time\": 169.6624465703964}, \"e94fb5a295\": {\"quality\": 0.0, \"cost\": 0.033384014000000004, \"time\": 134.82464711666108}, \"ea6ecc5653\": {\"quality\": 0.06301075268817204, \"cost\": 0.007951587, \"time\": 134.84457714557647}, \"ea8bcb3ae2\": {\"quality\": 0.10301075268817204, \"cost\": 0.017331892, \"time\": 134.0905973672867}, \"ebbe8b6c4f\": {\"quality\": 0.0, \"cost\": 0.007050224, \"time\": 106.51247243881225}, \"ebdf3abff2\": {\"quality\": 0.10301075268817204, \"cost\": 0.013816473000000001, \"time\": 136.48189253807067}, \"ec55dba809\": {\"quality\": 0.10301075268817204, \"cost\": 0.03585076400000001, \"time\": 154.6976619243622}, \"ecb5f78f37\": {\"quality\": 0.0, \"cost\": 0.011340265, \"time\": 174.9997961997986}, \"ecda3d74cb\": {\"quality\": 0.02, \"cost\": 0.032141616, \"time\": 116.44101283550262}, \"ece10c0388\": {\"quality\": 0.18301075268817205, \"cost\": 0.05883078800000001, \"time\": 113.18113870620726}, \"ed6b5480a5\": {\"quality\": 0.06946236559139785, \"cost\": 0.0049282679999999995, \"time\": 77.08623633384704}, \"eda630dc85\": {\"quality\": 0.10623655913978494, \"cost\": 0.007813983, \"time\": 120.82467505931854}, \"edaaee5ed4\": {\"quality\": 0.08946236559139785, \"cost\": 0.006681024000000001, \"time\": 74.40945067405701}, \"edb2b764aa\": {\"quality\": 0.02, \"cost\": 0.013618342, \"time\": 113.58856484889984}, \"edc52339db\": {\"quality\": 0.03333333333333333, \"cost\": 0.013122086000000002, \"time\": 151.4774285554886}, \"ede7071775\": {\"quality\": 0.12301075268817205, \"cost\": 0.059439179999999994, \"time\": 100.97825186252595}, \"ee46042c5d\": {\"quality\": 0.0, \"cost\": 0.01214812, \"time\": 152.4212220430374}, \"ee68c51f73\": 
{\"quality\": 0.12301075268817205, \"cost\": 0.030089499999999998, \"time\": 61.54816117286683}, \"ee7b726747\": {\"quality\": 0.11591397849462365, \"cost\": 0.017031875000000002, \"time\": 160.18388693332673}, \"eec5f32da9\": {\"quality\": 0.11015360983102919, \"cost\": 0.040522985000000004, \"time\": 149.7510376214981}, \"eed40e4378\": {\"quality\": 0.10623655913978494, \"cost\": 0.03209097500000001, \"time\": 152.84840388298034}, \"eef12d478b\": {\"quality\": 0.08258064516129032, \"cost\": 0.0049949880000000006, \"time\": 113.2755955696106}, \"ef37b3e0be\": {\"quality\": 0.0, \"cost\": 0.009670806, \"time\": 178.49481089115142}, \"ef43d497f1\": {\"quality\": 0.08301075268817204, \"cost\": 0.012532439999999999, \"time\": 159.2441192626953}, \"ef4d4c4a62\": {\"quality\": 0.08623655913978495, \"cost\": 0.00850348, \"time\": 126.41268122196198}, \"ef9a651425\": {\"quality\": 0.11591397849462365, \"cost\": 0.036512095, \"time\": 164.66946573257445}, \"f0655621af\": {\"quality\": 0.02967741935483871, \"cost\": 0.0047729999999999995, \"time\": 45.447355723381044}, \"f076b4c9ae\": {\"quality\": 0.10623655913978494, \"cost\": 0.007286825, \"time\": 118.35513505935668}, \"f11eddb4ed\": {\"quality\": 0.10301075268817204, \"cost\": 0.012131963000000003, \"time\": 151.4715585231781}, \"f1408da253\": {\"quality\": 0.10301075268817204, \"cost\": 0.009555081, \"time\": 155.64059100151061}, \"f18cf41929\": {\"quality\": 0.0, \"cost\": 0.003205638, \"time\": 99.03487899303437}, \"f1aa0b0b42\": {\"quality\": 0.10946236559139785, \"cost\": 0.025626688, \"time\": 115.38884313106537}, \"f1bda127f6\": {\"quality\": 0.10301075268817204, \"cost\": 0.012302958, \"time\": 102.7573136806488}, \"f1f373e58e\": {\"quality\": 0.10623655913978494, \"cost\": 0.039605126000000004, \"time\": 118.39543704986573}, \"f2a2e91541\": {\"quality\": 0.09660522273425498, \"cost\": 0.008657139000000001, \"time\": 145.9810749053955}, \"f2c04ed1c8\": {\"quality\": 0.10301075268817204, \"cost\": 
0.004977713999999999, \"time\": 68.23735935688019}, \"f2cf5db12d\": {\"quality\": 0.09268817204301075, \"cost\": 0.008181425999999999, \"time\": 120.30662734508515}, \"f366c0dd10\": {\"quality\": 0.00967741935483871, \"cost\": 0.029196041000000002, \"time\": 114.87482616901397}, \"f4303a5b4f\": {\"quality\": 0.10301075268817204, \"cost\": 0.041115134000000005, \"time\": 145.90202000141142}, \"f437481e3b\": {\"quality\": 0.21591397849462365, \"cost\": 0.012506865999999998, \"time\": 151.16193616390228}, \"f4bc6b63a7\": {\"quality\": 0.10301075268817204, \"cost\": 0.05883107500000001, \"time\": 110.05861134529113}, \"f4dc556633\": {\"quality\": 0.032903225806451615, \"cost\": 0.035164677000000005, \"time\": 165.65216555595396}, \"f4deb72db6\": {\"quality\": 0.0, \"cost\": 0.00779049, \"time\": 124.83286740779877}, \"f4ef2b9c33\": {\"quality\": 0.08623655913978495, \"cost\": 0.025671540000000003, \"time\": 110.72261786460876}, \"f566d6d6a1\": {\"quality\": 0.12301075268817205, \"cost\": 0.030296474, \"time\": 118.43880999088287}, \"f5b9a94dcc\": {\"quality\": 0.0, \"cost\": 0.00286785, \"time\": 96.0643517255783}, \"f5c27e7172\": {\"quality\": 0.029677419354838707, \"cost\": 0.003395451, \"time\": 131.81098673343658}, \"f5e53d963b\": {\"quality\": 0.11623655913978495, \"cost\": 0.0022498320000000002, \"time\": 82.10175185203552}, \"f614235c15\": {\"quality\": 0.0, \"cost\": 0.0015151140000000001, \"time\": 93.65864548683166}, \"f6546149e3\": {\"quality\": 0.10623655913978494, \"cost\": 0.034531665, \"time\": 126.04132678508759}, \"f74ec023e4\": {\"quality\": 0.11591397849462365, \"cost\": 0.011997618, \"time\": 167.2357141494751}, \"f7b048bd54\": {\"quality\": 0.10301075268817204, \"cost\": 0.008602385, \"time\": 132.47917804718017}, \"f7c4df993e\": {\"quality\": 0.10301075268817204, \"cost\": 0.007398474, \"time\": 140.5795171499252}, \"f854533145\": {\"quality\": 0.10301075268817204, \"cost\": 0.010546359, \"time\": 164.53556537628174}, \"f89b8a1930\": {\"quality\": 
0.10946236559139785, \"cost\": 0.016153075, \"time\": 165.33965291976926}, \"f93d9a2693\": {\"quality\": 0.06301075268817204, \"cost\": 0.006399356999999998, \"time\": 130.61231911182404}, \"f97d91a249\": {\"quality\": 0.13591397849462367, \"cost\": 0.029466680000000002, \"time\": 128.38061220645903}, \"f99096d89c\": {\"quality\": 0.11268817204301075, \"cost\": 0.009063081, \"time\": 164.117862033844}, \"f9e8e221f3\": {\"quality\": 0.07729646697388634, \"cost\": 0.007258314, \"time\": 137.73151681423187}, \"fa38879eab\": {\"quality\": 0.07333333333333333, \"cost\": 0.013395525000000002, \"time\": 171.87790231704713}, \"fa71111570\": {\"quality\": 0.09333333333333334, \"cost\": 0.033387521, \"time\": 163.68935594558718}, \"fa7882d46b\": {\"quality\": 0.11591397849462365, \"cost\": 0.01214556, \"time\": 168.27967350482942}, \"fa906520d1\": {\"quality\": 0.10301075268817204, \"cost\": 0.012420450999999999, \"time\": 173.23661334514617}, \"faabebaa30\": {\"quality\": 0.08301075268817204, \"cost\": 0.01007883, \"time\": 177.29549469947813}, \"fb0339a7d0\": {\"quality\": 0.10623655913978494, \"cost\": 0.012603655000000002, \"time\": 175.26443803310394}, \"fb216ad6b3\": {\"quality\": 0.08623655913978495, \"cost\": 0.0027521299999999998, \"time\": 48.80933380126953}, \"fb6216880a\": {\"quality\": 0.04, \"cost\": 0.010161849, \"time\": 176.34763963222503}, \"fba499b89d\": {\"quality\": 0.03333333333333333, \"cost\": 0.034680020000000006, \"time\": 176.12596268653868}, \"fbc010a368\": {\"quality\": 0.0696774193548387, \"cost\": 0.03236535, \"time\": 140.89571504592897}, \"fbc02e2e07\": {\"quality\": 0.10946236559139785, \"cost\": 0.032320148, \"time\": 134.5744981765747}, \"fbd6c45271\": {\"quality\": 0.11591397849462365, \"cost\": 0.010223524000000001, \"time\": 171.81036722660065}, \"fc0a156e16\": {\"quality\": 0.10623655913978494, \"cost\": 0.014270385, \"time\": 127.52731266021729}, \"fc1fd5bf54\": {\"quality\": 0.0701536098310292, \"cost\": 0.005700248999999999, 
\"time\": 163.72950367927552}, \"fc6967a75b\": {\"quality\": 0.0, \"cost\": 0.03203869000000001, \"time\": 176.3084951877594}, \"fc73c3b0fa\": {\"quality\": 0.12301075268817205, \"cost\": 0.011811562000000001, \"time\": 53.39196453094482}, \"fce38334b2\": {\"quality\": 0.03612903225806452, \"cost\": 0.013067388, \"time\": 160.21355810165403}, \"fce5fca128\": {\"quality\": 0.15946236559139787, \"cost\": 0.038042624000000004, \"time\": 158.27728819847107}, \"fd0709359e\": {\"quality\": 0.10301075268817204, \"cost\": 0.0006582199999999999, \"time\": 41.16247012615204}, \"fd1f809d64\": {\"quality\": 0.10301075268817204, \"cost\": 0.040954353, \"time\": 135.35224254131316}, \"fd2c994a9d\": {\"quality\": 0.10946236559139785, \"cost\": 0.031750883, \"time\": 155.9396536588669}, \"fddccfbf94\": {\"quality\": 0.20301075268817204, \"cost\": 0.010760402, \"time\": 121.92865846157073}, \"fe7fa741b4\": {\"quality\": 0.11134408602150538, \"cost\": 0.016085802000000003, \"time\": 126.83495450019836}, \"fe9e1fec71\": {\"quality\": 0.12301075268817205, \"cost\": 0.033332814, \"time\": 128.354682469368}, \"fea4734c09\": {\"quality\": 0.12301075268817205, \"cost\": 0.005711668, \"time\": 87.13299584388733}, \"fef1ca27fa\": {\"quality\": 0.07913978494623655, \"cost\": 0.013074829999999999, \"time\": 140.50196795463563}, \"ff11cb6a7a\": {\"quality\": 0.08301075268817204, \"cost\": 0.012798740000000001, \"time\": 135.25200283527374}, \"ff171e34e2\": {\"quality\": 0.0, \"cost\": 0.004113531, \"time\": 106.45833990573882}, \"ff1c958e21\": {\"quality\": 0.0, \"cost\": 0.0030646650000000003, \"time\": 87.37032148838043}, \"ff8df4ace9\": {\"quality\": 0.08, \"cost\": 0.032799690000000006, \"time\": 86.2204920053482}, \"ff8e68049a\": {\"quality\": 0.10301075268817204, \"cost\": 0.006420515999999999, \"time\": 71.03863010406494}}"
  },
  {
    "path": "abacus-research/biodex-revision-priors-maxquality.json",
    "content": "{\"00b02360ef\": {\"quality\": 0.21648684648684646, \"cost\": 0.040317356500000005, \"time\": 40.99749290347099}, \"025a41642e\": {\"quality\": 0.2323391192141192, \"cost\": 0.011047775499999999, \"time\": 38.92555040717125}, \"03c4bc9bb0\": {\"quality\": 0.25034264346764346, \"cost\": 0.011205506, \"time\": 22.393556386232376}, \"04ed954ae9\": {\"quality\": 0.25250624375624375, \"cost\": 0.04878718750000001, \"time\": 20.034630668163302}, \"059dee8af9\": {\"quality\": 0.2151956376956377, \"cost\": 0.00895401925, \"time\": 31.44526737332344}, \"05fde83b5e\": {\"quality\": 0.20102897102897105, \"cost\": 0.007020948000000001, \"time\": 27.97608198523521}, \"06b62ff472\": {\"quality\": 0.26966144966144967, \"cost\": 0.005215445, \"time\": 14.247572779655457}, \"0730a1221a\": {\"quality\": 0.25292291042291043, \"cost\": 0.027662595, \"time\": 33.58496024608612}, \"07aec969dd\": {\"quality\": 0.2148887223887224, \"cost\": 0.0008298215000000001, \"time\": 43.21092504262924}, \"0a11f0ae2e\": {\"quality\": 0.24079150016650014, \"cost\": 0.010170454500000002, \"time\": 20.595997565984725}, \"0aeb79b24c\": {\"quality\": 0.14777319902319902, \"cost\": 0.0058596425, \"time\": 7.766067373752594}, \"0b3eafcb23\": {\"quality\": 0.1913265900765901, \"cost\": 0.0013463715, \"time\": 19.126238137483597}, \"0bc5bbe9f8\": {\"quality\": 0.19658452658452658, \"cost\": 0.0314150555, \"time\": 42.02523586153984}, \"0c328ddadf\": {\"quality\": 0.1950774919524919, \"cost\": 0.007442179, \"time\": 22.943194967508315}, \"0db4d3e16d\": {\"quality\": 0.18679285991785993, \"cost\": 0.0032097465, \"time\": 19.68699924945831}, \"0df97ab2f2\": {\"quality\": 0.22131729381729381, \"cost\": 0.008563199, \"time\": 43.75024158954621}, \"0e0162da5c\": {\"quality\": 0.21570068820068822, \"cost\": 0.027131577500000004, \"time\": 33.71510918140412}, \"0e0216352f\": {\"quality\": 0.16231865356865358, \"cost\": 0.009209249, \"time\": 33.987769949436185}, \"0ecd21b93a\": {\"quality\": 
0.2128450715950716, \"cost\": 0.027050046, \"time\": 27.799335622787474}, \"0ff2bde030\": {\"quality\": 0.2117928599178599, \"cost\": 0.033819734000000004, \"time\": 19.09183721542358}, \"1059f21a6c\": {\"quality\": 0.21909507159507158, \"cost\": 0.0006545234999999999, \"time\": 13.647678703069687}, \"112b665311\": {\"quality\": 0.21884552947052946, \"cost\": 0.004884652, \"time\": 22.208240193128585}, \"11db1dbc4f\": {\"quality\": 0.1719506882006882, \"cost\": 0.003233838, \"time\": 22.85480940937996}, \"13145c10ec\": {\"quality\": 0.23070159007659008, \"cost\": 0.0313856595, \"time\": 35.557817763090135}, \"147915874a\": {\"quality\": 0.2280723443223443, \"cost\": 0.00055762025, \"time\": 34.335060054063796}, \"16e1e2aca6\": {\"quality\": 0.25766650016650017, \"cost\": 0.005274602500000001, \"time\": 13.559284949302672}, \"1914dfad94\": {\"quality\": 0.2624284049284049, \"cost\": 0.02536484, \"time\": 25.639233177900316}, \"1ab9dcd7d6\": {\"quality\": 0.2867339604839605, \"cost\": 0.0010822935, \"time\": 18.924210703372957}, \"1dc749411d\": {\"quality\": 0.2199178599178599, \"cost\": 0.010029340000000001, \"time\": 14.274628734588624}, \"1dd84b0308\": {\"quality\": 0.2286371961371961, \"cost\": 0.027319404999999998, \"time\": 25.991028892993928}, \"1dd9d9bef2\": {\"quality\": 0.2226475607725608, \"cost\": 0.007434026250000001, \"time\": 31.69112855195999}, \"1f0ba92b8b\": {\"quality\": 0.22614052614052613, \"cost\": 0.0175957475, \"time\": 36.91505531668663}, \"1f1b5a019d\": {\"quality\": 0.2593331668331669, \"cost\": 0.033523670000000005, \"time\": 26.298819702863693}, \"1f41a35dd3\": {\"quality\": 0.14541666666666667, \"cost\": 0.004300389, \"time\": 17.449908739328386}, \"212c3fccbf\": {\"quality\": 0.2939645770895771, \"cost\": 0.0327319625, \"time\": 27.72252732515335}, \"219d51cd03\": {\"quality\": 0.1703336247086247, \"cost\": 0.0009408769999999999, \"time\": 18.91644185781479}, \"238ed27a56\": {\"quality\": 0.24007659007659007, \"cost\": 0.0295427895, 
\"time\": 31.712434768676758}, \"23f3d97d47\": {\"quality\": 0.23146457708957707, \"cost\": 0.02700624675, \"time\": 32.10481671690941}, \"253547264d\": {\"quality\": 0.1917432567432567, \"cost\": 0.026901585, \"time\": 23.115290719270703}, \"27e14a0973\": {\"quality\": 0.23250624375624376, \"cost\": 0.033178285, \"time\": 36.78056996464729}, \"29960739d3\": {\"quality\": 0.221773643023643, \"cost\": 0.032106853000000005, \"time\": 29.176937651634216}, \"2b0a8d5614\": {\"quality\": 0.2043530081030081, \"cost\": 0.022987428499999997, \"time\": 20.08997523188591}, \"2ccd104a68\": {\"quality\": 0.20102897102897105, \"cost\": 0.0184998355, \"time\": 45.19519582390785}, \"2d1b0cd5d9\": {\"quality\": 0.22214008214008213, \"cost\": 0.0059833375, \"time\": 18.541752833127973}, \"2dcffa3d30\": {\"quality\": 0.27596042846042845, \"cost\": 0.009521012499999999, \"time\": 26.920003366470336}, \"2ebe033e6a\": {\"quality\": 0.21273532023532027, \"cost\": 0.0073467315, \"time\": 21.433203184604643}, \"31e8ef9410\": {\"quality\": 0.10839285714285715, \"cost\": 0.0007289739999999999, \"time\": 63.23130738735199}, \"32a628c971\": {\"quality\": 0.25402500277500273, \"cost\": 0.037233726, \"time\": 26.05542423725128}, \"336e1a4fdc\": {\"quality\": 0.1680573593073593, \"cost\": 0.031870938, \"time\": 38.903397005796435}, \"33e0fdf4e1\": {\"quality\": 0.2249284049284049, \"cost\": 0.005128618, \"time\": 22.448986053466797}, \"38a60c768a\": {\"quality\": 0.23622648185148185, \"cost\": 0.027102593, \"time\": 38.203392094373704}, \"39bb54fbfb\": {\"quality\": 0.248528971028971, \"cost\": 0.03168909524999999, \"time\": 49.03234923481941}, \"3ace395562\": {\"quality\": 0.20436230436230435, \"cost\": 0.025813554500000002, \"time\": 24.070003932714464}, \"3bbe08bf63\": {\"quality\": 0.2427335164835165, \"cost\": 0.033579119000000004, \"time\": 43.73568968772888}, \"3caf8d77c5\": {\"quality\": 0.25665397102897103, \"cost\": 0.0361807135, \"time\": 33.51353464722634}, \"3d25abc0cb\": 
{\"quality\": 0.2298871961371961, \"cost\": 0.008877388500000001, \"time\": 38.62544178366661}, \"3e19290ef5\": {\"quality\": 0.2251261932511933, \"cost\": 0.026017763000000006, \"time\": 31.622546792030334}, \"3e8e089a02\": {\"quality\": 0.232903971028971, \"cost\": 0.0243675525, \"time\": 12.379487264156342}, \"3f1771caee\": {\"quality\": 0.1291142884892885, \"cost\": 0.003903147, \"time\": 22.171902561187743}, \"3f7beb53c4\": {\"quality\": 0.26968038905538905, \"cost\": 0.00175834875, \"time\": 17.588406985998155}, \"4081f36f1d\": {\"quality\": 0.20352897102897105, \"cost\": 0.009008155, \"time\": 35.7961721599102}, \"410edf5f69\": {\"quality\": 0.24408452658452662, \"cost\": 0.033720045500000004, \"time\": 39.4614034473896}, \"44ec78f301\": {\"quality\": 0.2112887112887113, \"cost\": 0.0312310875, \"time\": 28.79513317346573}, \"44eeb11408\": {\"quality\": 0.23811230436230435, \"cost\": 0.027474213500000004, \"time\": 33.9330511868}, \"45623e6def\": {\"quality\": 0.27609841547341546, \"cost\": 0.0343063295, \"time\": 33.21763527989388}, \"470749179f\": {\"quality\": 0.22933316683316685, \"cost\": 0.0050279755, \"time\": 24.99732105731964}, \"47ea56eba0\": {\"quality\": 0.19954240204240203, \"cost\": 0.018521998, \"time\": 30.92959639430046}, \"49fc8b3768\": {\"quality\": 0.16649433899433902, \"cost\": 0.0028119045000000002, \"time\": 17.623043191432952}, \"4a60adbf47\": {\"quality\": 0.18713078588078588, \"cost\": 0.013557406999999999, \"time\": 28.078510254621506}, \"4ba372d292\": {\"quality\": 0.24674478299478303, \"cost\": 0.007506609250000001, \"time\": 32.822478234767914}, \"4bf38b1922\": {\"quality\": 0.19991785991785993, \"cost\": 0.031763846750000005, \"time\": 40.722243314981455}, \"4c415b4b86\": {\"quality\": 0.1526658757908758, \"cost\": 0.025909743, \"time\": 14.397501182556152}, \"4d1fc31b49\": {\"quality\": 0.2615845265845266, \"cost\": 0.024915399, \"time\": 18.815959566831587}, \"4d8c84f53a\": {\"quality\": 0.2150567488067488, \"cost\": 
0.0271872135, \"time\": 31.158079880476}, \"4df2245b9b\": {\"quality\": 0.14843878343878342, \"cost\": 0.0011828115, \"time\": 5.0212810933589935}, \"4e9274d39b\": {\"quality\": 0.2178261322011322, \"cost\": 0.0059426750000000006, \"time\": 16.452285903692246}, \"4f87ed7c2e\": {\"quality\": 0.23269563769563767, \"cost\": 0.0090589945, \"time\": 26.12176838517189}, \"51a340bebc\": {\"quality\": 0.23375624375624376, \"cost\": 0.032705517, \"time\": 28.51812062859535}, \"52dee74e25\": {\"quality\": 0.2215728021978022, \"cost\": 0.032394368, \"time\": 31.705753284692765}, \"533fddb3bd\": {\"quality\": 0.13879745254745254, \"cost\": 0.0016485569999999997, \"time\": 9.408051878213882}, \"5358cb5855\": {\"quality\": 0.18664467476967478, \"cost\": 0.027746372000000002, \"time\": 23.547580116987227}, \"53b6c3e00f\": {\"quality\": 0.2590950715950716, \"cost\": 0.000778896, \"time\": 12.53750283718109}, \"5400c14640\": {\"quality\": 0.2697979104229104, \"cost\": 0.029639904999999994, \"time\": 41.52563037276268}, \"55e373a7e2\": {\"quality\": 0.16850885225885226, \"cost\": 0.0406613025, \"time\": 22.814987272024155}, \"5652070ecb\": {\"quality\": 0.17858849483849482, \"cost\": 0.0075221129999999995, \"time\": 32.25966700911522}, \"56c5cbc75a\": {\"quality\": 0.23478951603951606, \"cost\": 0.000632159, \"time\": 38.25331119894982}, \"577b168d21\": {\"quality\": 0.2182725607725608, \"cost\": 0.00705232, \"time\": 13.326130753755569}, \"57cabf0e4b\": {\"quality\": 0.22299325674325673, \"cost\": 0.02575083125, \"time\": 24.347850561141968}, \"58dbfc0499\": {\"quality\": 0.2620895770895771, \"cost\": 0.055131649000000005, \"time\": 28.987264293432233}, \"5a5739c3d2\": {\"quality\": 0.20117597680097682, \"cost\": 0.007228886499999999, \"time\": 25.403514659404756}, \"5b12979e52\": {\"quality\": 0.18208513708513707, \"cost\": 0.0038426475, \"time\": 22.1737478017807}, \"5b384da665\": {\"quality\": 0.20099920912420913, \"cost\": 0.014891756499999999, \"time\": 28.11651138663292}, 
\"5b837baf2e\": {\"quality\": 0.23083957708957709, \"cost\": 0.04059063350000001, \"time\": 38.7564038336277}, \"5c8acd9a75\": {\"quality\": 0.1964952408702409, \"cost\": 0.027179780499999997, \"time\": 31.81220202445984}, \"5da68874af\": {\"quality\": 0.1768232461982462, \"cost\": 0.00942769325, \"time\": 30.90108055472374}, \"5e304846b6\": {\"quality\": 0.13062375124875125, \"cost\": 0.00801055675, \"time\": 21.377113670110703}, \"5e608d2a8b\": {\"quality\": 0.160135212010212, \"cost\": 0.0008863050000000001, \"time\": 21.419854152202603}, \"5e6ad9fb8a\": {\"quality\": 0.19750624375624376, \"cost\": 0.025521787500000004, \"time\": 20.569793158769606}, \"5ec5b185c9\": {\"quality\": 0.18449037074037078, \"cost\": 0.002478051, \"time\": 34.97374950647354}, \"608e332141\": {\"quality\": 0.24616785991785994, \"cost\": 0.03769494400000001, \"time\": 25.56632845401764}, \"610753ffe5\": {\"quality\": 0.2546202408702409, \"cost\": 0.0301900835, \"time\": 22.38964868783951}, \"614e59c7e7\": {\"quality\": 0.22919427794427794, \"cost\": 0.0017370554999999999, \"time\": 16.915371453762056}, \"62320710f0\": {\"quality\": 0.25123730436230435, \"cost\": 0.03658336050000001, \"time\": 31.835360258817673}, \"649df27d73\": {\"quality\": 0.23370664058164056, \"cost\": 0.033310952000000005, \"time\": 22.02554016113281}, \"64f956ba72\": {\"quality\": 0.2440845265845266, \"cost\": 0.028164125, \"time\": 15.42359745502472}, \"65657a3d04\": {\"quality\": 0.21516650016650019, \"cost\": 0.011299091, \"time\": 48.3486190378666}, \"6604abbd43\": {\"quality\": 0.24808316683316683, \"cost\": 0.026956945250000003, \"time\": 36.97830517292023}, \"67050fa89b\": {\"quality\": 0.18341144966144968, \"cost\": 0.0313915525, \"time\": 39.6457058608532}, \"67f05c5985\": {\"quality\": 0.22079150016650018, \"cost\": 0.02690715625, \"time\": 22.91822772026062}, \"67fe27bce0\": {\"quality\": 0.24436230436230436, \"cost\": 0.041374972, \"time\": 33.75530356168747}, \"698d1b1d0e\": {\"quality\": 
0.25665397102897103, \"cost\": 0.04129188500000001, \"time\": 50.97259179949761}, \"69e8aa9f73\": {\"quality\": 0.1903450715950716, \"cost\": 0.00740056, \"time\": 34.38110066056252}, \"6b240a8971\": {\"quality\": 0.21965180652680652, \"cost\": 0.0090931325, \"time\": 28.12567190527916}, \"6b56430005\": {\"quality\": 0.23343968531468534, \"cost\": 0.027798901, \"time\": 43.157285338640214}, \"6b611b7193\": {\"quality\": 0.2807205294705295, \"cost\": 0.06928643750000002, \"time\": 21.161213809251784}, \"6b77ef93d3\": {\"quality\": 0.1493911643911644, \"cost\": 0.0015440895, \"time\": 12.580836659669876}, \"6beefce17b\": {\"quality\": 0.27009552947052945, \"cost\": 0.03215786600000001, \"time\": 23.689218693971632}, \"6cb7c9a802\": {\"quality\": 0.27246836496836496, \"cost\": 0.0316063565, \"time\": 41.612704527378085}, \"6cba1da5b3\": {\"quality\": 0.18273684648684646, \"cost\": 0.013787275500000001, \"time\": 24.630589830875394}, \"6dea54bf99\": {\"quality\": 0.2606321456321456, \"cost\": 0.004957138, \"time\": 21.090058833360672}, \"6eda6f7780\": {\"quality\": 0.25468038905538903, \"cost\": 0.011272949, \"time\": 26.375605469942094}, \"708f3ca5da\": {\"quality\": 0.12009552947052946, \"cost\": 0.0015416722499999999, \"time\": 13.809926038980484}, \"72d75a5f01\": {\"quality\": 0.23014513264513264, \"cost\": 0.0153489635, \"time\": 28.418202906847}, \"72d99dee93\": {\"quality\": 0.2350756882006882, \"cost\": 0.029338768500000008, \"time\": 44.55438640117645}, \"73272e5bd3\": {\"quality\": 0.2751956376956377, \"cost\": 0.0333353575, \"time\": 29.94201866388321}, \"74608570f5\": {\"quality\": 0.11613782051282051, \"cost\": 0.0005538545, \"time\": 27.12369663715363}, \"7531840182\": {\"quality\": 0.1604120879120879, \"cost\": 0.0019984210000000002, \"time\": 58.24117745161056}, \"77ba656673\": {\"quality\": 0.18522602397602397, \"cost\": 0.008878578250000001, \"time\": 24.636592674255372}, \"77fe41f02f\": {\"quality\": 0.1971118464868465, \"cost\": 0.04380676650000001, 
\"time\": 36.06331704854965}, \"7857e95756\": {\"quality\": 0.2174590687090687, \"cost\": 0.014965390000000002, \"time\": 26.90245589017868}, \"78984e94f0\": {\"quality\": 0.21357017982017984, \"cost\": 0.007025722, \"time\": 20.596780753135683}, \"7986633543\": {\"quality\": 0.20898684648684648, \"cost\": 0.00747764175, \"time\": 33.76270450353623}, \"7bdbc32b57\": {\"quality\": 0.27164314851814847, \"cost\": 0.04712992375, \"time\": 27.709715336561203}, \"7c89cff787\": {\"quality\": 0.15383470695970697, \"cost\": 0.0016299584999999999, \"time\": 19.46316229701042}, \"7dc51ec191\": {\"quality\": 0.2680617993117993, \"cost\": 0.041678240500000005, \"time\": 52.64214360117913}, \"7e4a6245ac\": {\"quality\": 0.19536401098901102, \"cost\": 0.00056989225, \"time\": 64.66312664747238}, \"7ebc55ebcf\": {\"quality\": 0.26186230436230434, \"cost\": 0.0089994125, \"time\": 49.139087688922885}, \"7f14114a48\": {\"quality\": 0.20424325674325672, \"cost\": 0.0130888145, \"time\": 24.788931846618652}, \"7fb587503e\": {\"quality\": 0.2053228021978022, \"cost\": 0.009328153000000002, \"time\": 19.42089232802391}, \"7ff82b6f8d\": {\"quality\": 0.20713474025974027, \"cost\": 0.029847356000000005, \"time\": 21.508908247947694}, \"806a5ef096\": {\"quality\": 0.22686230436230437, \"cost\": 0.024261581, \"time\": 14.832676833868026}, \"8156d78e42\": {\"quality\": 0.1876665001665002, \"cost\": 0.007510368999999999, \"time\": 39.953574234247206}, \"815cfb848c\": {\"quality\": 0.24188124375624376, \"cost\": 0.02595560125, \"time\": 23.38996571302414}, \"820b42e0b1\": {\"quality\": 0.22338078588078586, \"cost\": 0.0027539085000000004, \"time\": 21.940480226278304}, \"82a733626e\": {\"quality\": 0.24082063769563766, \"cost\": 0.029802983, \"time\": 24.7150899887085}, \"83d6300ced\": {\"quality\": 0.22958957708957708, \"cost\": 0.009135905500000001, \"time\": 44.96203254461288}, \"844c822d56\": {\"quality\": 0.19127851315351316, \"cost\": 0.0269840145, \"time\": 23.020123237371443}, 
\"84f9f20f75\": {\"quality\": 0.2272988122988123, \"cost\": 0.02394343, \"time\": 14.9634037733078}, \"8525986a00\": {\"quality\": 0.2595895770895771, \"cost\": 0.03125997600000001, \"time\": 45.89759058356285}, \"85db012139\": {\"quality\": 0.28459047896547895, \"cost\": 0.04838826250000001, \"time\": 21.917299997806552}, \"85ddc8a71e\": {\"quality\": 0.19811230436230437, \"cost\": 0.00595452, \"time\": 21.147695326805113}, \"8767267bed\": {\"quality\": 0.1887950244200244, \"cost\": 0.0064469125000000006, \"time\": 7.573464637994766}, \"882238f677\": {\"quality\": 0.1552930402930403, \"cost\": 0.0032291700000000004, \"time\": 25.413773769140242}, \"88f78bdeaf\": {\"quality\": 0.12028311965811966, \"cost\": 0.0005936184999999999, \"time\": 24.23035447001457}, \"8af1605300\": {\"quality\": 0.21607017982017981, \"cost\": 0.007401837500000002, \"time\": 26.804660773277284}, \"8c09cba5e2\": {\"quality\": 0.21450119325119324, \"cost\": 0.02589227425, \"time\": 26.046354007720947}, \"8e5d966cf6\": {\"quality\": 0.15911401098901098, \"cost\": 0.0016278529999999999, \"time\": 76.43126168251038}, \"8eacbc7240\": {\"quality\": 0.2039868464868465, \"cost\": 0.0132405425, \"time\": 26.23775848150253}, \"8fc917c575\": {\"quality\": 0.2181123043623044, \"cost\": 0.0075467905, \"time\": 27.82010214924812}, \"8fe43ec148\": {\"quality\": 0.24137862137862137, \"cost\": 0.0031192755, \"time\": 28.713243955373763}, \"900de152da\": {\"quality\": 0.2329229104229104, \"cost\": 0.028014499999999998, \"time\": 11.612605232000352}, \"901ab65b55\": {\"quality\": 0.2703345265845266, \"cost\": 0.027957687500000005, \"time\": 9.624877899885178}, \"918182aa18\": {\"quality\": 0.19238629426129428, \"cost\": 0.002521713, \"time\": 45.21123977303505}, \"91ac37a21a\": {\"quality\": 0.140364010989011, \"cost\": 0.0072078985000000005, \"time\": 59.984852832555774}, \"9252afa4ec\": {\"quality\": 0.20398532023532023, \"cost\": 0.026106901500000005, \"time\": 17.68542433977127}, \"939b0c3ed9\": 
{\"quality\": 0.2067141192141192, \"cost\": 0.009995370500000001, \"time\": 20.511219489574433}, \"93a705cb0d\": {\"quality\": 0.23775578588078589, \"cost\": 0.0107520785, \"time\": 34.44544678330421}, \"94091e2968\": {\"quality\": 0.2271605477855478, \"cost\": 0.011070436, \"time\": 29.48041752576828}, \"945a9565fa\": {\"quality\": 0.19553356365856367, \"cost\": 0.005568486, \"time\": 9.69417832493782}, \"946baa620a\": {\"quality\": 0.2275567488067488, \"cost\": 0.025270525000000002, \"time\": 17.454011952877046}, \"950ed976dd\": {\"quality\": 0.1091089466089466, \"cost\": 0.00364847625, \"time\": 27.579946410655978}, \"955689ee8f\": {\"quality\": 0.18228708791208792, \"cost\": 0.001921847, \"time\": 29.050129276514056}, \"95c9b1b1dd\": {\"quality\": 0.21440018315018317, \"cost\": 0.031808489, \"time\": 30.75392867922783}, \"95ebba630d\": {\"quality\": 0.16167443667443665, \"cost\": 0.0045665385, \"time\": 21.370410311222074}, \"966282f059\": {\"quality\": 0.16554410866910865, \"cost\": 0.023086932000000004, \"time\": 19.98903277516365}, \"96c401772b\": {\"quality\": 0.24861825674325677, \"cost\": 0.0028384205, \"time\": 21.471487557888032}, \"9718d880f6\": {\"quality\": 0.1984595265845266, \"cost\": 0.0316786545, \"time\": 50.695869612693784}, \"971ae914cb\": {\"quality\": 0.2652461427461428, \"cost\": 0.025022515000000002, \"time\": 14.41015980243683}, \"97566e70dd\": {\"quality\": 0.23030386280386278, \"cost\": 0.0172863905, \"time\": 32.660637366771695}, \"97c882ca1f\": {\"quality\": 0.22630674880674884, \"cost\": 0.036967587999999996, \"time\": 26.217366951704026}, \"98e28fbb93\": {\"quality\": 0.1839868464868465, \"cost\": 0.0100201785, \"time\": 10.730712401866914}, \"992ab25916\": {\"quality\": 0.23011779886779885, \"cost\": 0.00176559, \"time\": 17.568689429759978}, \"9a271dff29\": {\"quality\": 0.19818986568986569, \"cost\": 0.0093025505, \"time\": 29.283438795804976}, \"9aa6fd8b03\": {\"quality\": 0.20840992340992343, \"cost\": 0.0059763937500000005, 
\"time\": 14.063047587871552}, \"9b12b7d441\": {\"quality\": 0.26338078588078584, \"cost\": 0.01091874575, \"time\": 34.633186584711076}, \"9bf31a2127\": {\"quality\": 0.20186230436230435, \"cost\": 0.0335034275, \"time\": 53.72036165595055}, \"9c4ad5aed3\": {\"quality\": 0.21813124375624376, \"cost\": 0.025749503750000007, \"time\": 27.629801028966902}, \"9cbb34ee5f\": {\"quality\": 0.19157370407370405, \"cost\": 0.0024265880000000004, \"time\": 20.018360090255737}, \"9e16d7ba9c\": {\"quality\": 0.2621788628038628, \"cost\": 0.032505795500000004, \"time\": 34.68335065841675}, \"9e1a7dd196\": {\"quality\": 0.19222097347097347, \"cost\": 0.003654598499999999, \"time\": 18.135820919275282}, \"9e6775ac33\": {\"quality\": 0.1184478021978022, \"cost\": 0.0016337835, \"time\": 15.268957149982452}, \"9e8674b6c7\": {\"quality\": 0.2206123043623044, \"cost\": 0.033377518, \"time\": 39.97827153205871}, \"9ea12d8d80\": {\"quality\": 0.15865613553113553, \"cost\": 0.0066404640000000004, \"time\": 11.089978641271593}, \"a232f1dfb6\": {\"quality\": 0.23243922743922746, \"cost\": 0.0017390729999999998, \"time\": 17.556494808197023}, \"a3b7cb6c33\": {\"quality\": 0.22440018315018312, \"cost\": 0.0092955765, \"time\": 25.77978389263153}, \"a72f24f687\": {\"quality\": 0.22096764346764347, \"cost\": 0.008759131000000002, \"time\": 17.248067778348922}, \"a8aac20fd9\": {\"quality\": 0.24810057997558, \"cost\": 0.028210719500000002, \"time\": 20.938565105199814}, \"a9e811d4a7\": {\"quality\": 0.2364456376956377, \"cost\": 0.024409971, \"time\": 15.286300283670425}, \"aa1fc47a86\": {\"quality\": 0.20214008214008214, \"cost\": 0.02572883575, \"time\": 27.581201010942458}, \"aa67b102e7\": {\"quality\": 0.23873730436230436, \"cost\": 0.030387160750000003, \"time\": 30.50166518688202}, \"ad453a813f\": {\"quality\": 0.21407370407370407, \"cost\": 0.003257968, \"time\": 14.862909889221193}, \"add25d67a0\": {\"quality\": 0.21374875124875123, \"cost\": 0.028111017000000002, \"time\": 
24.08863323330879}, \"adf6ae1ba7\": {\"quality\": 0.16509552947052947, \"cost\": 0.0030274415000000002, \"time\": 9.59496791958809}, \"af57afe626\": {\"quality\": 0.238234681984682, \"cost\": 0.0028139635, \"time\": 19.57368974685669}, \"b07ae35700\": {\"quality\": 0.220353465978466, \"cost\": 0.003183216, \"time\": 24.758755880594254}, \"b09741c8f7\": {\"quality\": 0.1432112332112332, \"cost\": 0.0005155585, \"time\": 36.59446266293526}, \"b1b81c0847\": {\"quality\": 0.2774873043623044, \"cost\": 0.024767306250000003, \"time\": 17.89712796807289}, \"b38fbeda99\": {\"quality\": 0.19523684648684647, \"cost\": 0.00822793925, \"time\": 30.255630522966385}, \"b3b92b1835\": {\"quality\": 0.26213078588078587, \"cost\": 0.04786767, \"time\": 23.704857051372528}, \"b3bba3eee2\": {\"quality\": 0.1911576617826618, \"cost\": 0.009401211, \"time\": 32.60036996603012}, \"b40727bd68\": {\"quality\": 0.2542234154734155, \"cost\": 0.025418039000000003, \"time\": 17.466598081588742}, \"b6a0f78896\": {\"quality\": 0.23056179931179932, \"cost\": 0.008894759499999998, \"time\": 32.4968527674675}, \"b84e122880\": {\"quality\": 0.24589008214008212, \"cost\": 0.008891399000000001, \"time\": 39.84640188813209}, \"b8c6c44a39\": {\"quality\": 0.24308316683316683, \"cost\": 0.029864208750000003, \"time\": 21.440666025877}, \"bb9ed9dffe\": {\"quality\": 0.15225163725163726, \"cost\": 0.0015893925, \"time\": 19.59902848601341}, \"bbc50a0411\": {\"quality\": 0.1945796564546564, \"cost\": 0.0031702530000000005, \"time\": 33.99009187817573}, \"bbd0498616\": {\"quality\": 0.22329150016650018, \"cost\": 0.05419105850000001, \"time\": 20.79185708165169}, \"bd018ec9d7\": {\"quality\": 0.20710851648351647, \"cost\": 0.0006870085, \"time\": 58.566940271854406}, \"be11a8a86e\": {\"quality\": 0.22035589410589412, \"cost\": 0.014606052500000001, \"time\": 33.792596000432965}, \"bfa93a259c\": {\"quality\": 0.25311230436230436, \"cost\": 0.02601113875, \"time\": 29.902825361490248}, \"c127942d8c\": 
{\"quality\": 0.2601658757908758, \"cost\": 0.0032753834999999995, \"time\": 25.132321882247922}, \"c14216d744\": {\"quality\": 0.23213078588078587, \"cost\": 0.01049051375, \"time\": 26.48539803624153}, \"c2f0bc6921\": {\"quality\": 0.15319229381729382, \"cost\": 0.0006038535000000001, \"time\": 16.494534188508986}, \"c440b67f31\": {\"quality\": 0.19924325674325674, \"cost\": 0.030812481500000002, \"time\": 25.34757208228111}, \"caa29fbe8c\": {\"quality\": 0.229778971028971, \"cost\": 0.025535755, \"time\": 26.289914256334306}, \"cd46331132\": {\"quality\": 0.2289372433122433, \"cost\": 0.0059953365, \"time\": 16.957564985752107}, \"cdd01242f5\": {\"quality\": 0.23936230436230438, \"cost\": 0.030000593500000002, \"time\": 23.897997051477432}, \"cdf27f19d3\": {\"quality\": 0.21152472527472527, \"cost\": 0.00597143625, \"time\": 21.69409868121147}, \"ce7d236454\": {\"quality\": 0.21988629426129425, \"cost\": 0.0319162425, \"time\": 34.63264524936676}, \"d117c8e23b\": {\"quality\": 0.24436230436230436, \"cost\": 0.04203393050000001, \"time\": 37.38888673782348}, \"d27a5ad38a\": {\"quality\": 0.19329240204240206, \"cost\": 0.025790743000000005, \"time\": 19.247011250257493}, \"d287ceaae7\": {\"quality\": 0.26063124375624375, \"cost\": 0.036845655000000005, \"time\": 33.75713464021683}, \"d29027747a\": {\"quality\": 0.12741758241758241, \"cost\": 0.0004609135, \"time\": 47.16611239314079}, \"d2d7780d31\": {\"quality\": 0.2296599234099234, \"cost\": 0.033220648000000005, \"time\": 29.299119901657104}, \"d566a94f3a\": {\"quality\": 0.2568127011877012, \"cost\": 0.023108875, \"time\": 7.455539703369141}, \"d64dd29abb\": {\"quality\": 0.2582205294705294, \"cost\": 0.039288317, \"time\": 35.80952010750771}, \"d75495aed3\": {\"quality\": 0.24019563769563768, \"cost\": 0.0183352845, \"time\": 36.634383380413055}, \"d93337a87d\": {\"quality\": 0.24102897102897103, \"cost\": 0.010006869500000001, \"time\": 21.931237626075745}, \"de08b84b51\": {\"quality\": 0.23914557664557662, 
\"cost\": 0.005503559999999999, \"time\": 15.377897375822068}, \"de39f4f113\": {\"quality\": 0.19986825674325676, \"cost\": 0.047524011, \"time\": 18.598653775453567}, \"de83615ac5\": {\"quality\": 0.27044275169275167, \"cost\": 0.00756211075, \"time\": 28.75660304427147}, \"dfa1e5267d\": {\"quality\": 0.2558807858807859, \"cost\": 0.0013494974999999998, \"time\": 9.088204801082611}, \"dfedfba6df\": {\"quality\": 0.2490950715950716, \"cost\": 0.004502494499999999, \"time\": 26.490580713748933}, \"e15d59fc9d\": {\"quality\": 0.2783206376956377, \"cost\": 0.011371385500000001, \"time\": 52.64762927293778}, \"e540f66a9d\": {\"quality\": 0.1645820845820846, \"cost\": 0.025751229500000007, \"time\": 19.154958713054658}, \"e5c7c9fce2\": {\"quality\": 0.24218129093129093, \"cost\": 0.0258564275, \"time\": 28.940241599082945}, \"e73beb7df9\": {\"quality\": 0.1834839466089466, \"cost\": 0.02164974225, \"time\": 23.38083545565605}, \"e939aaa8b3\": {\"quality\": 0.21797341547341548, \"cost\": 0.028107724500000004, \"time\": 34.55525960922241}, \"e9e55ba5db\": {\"quality\": 0.1835701798201798, \"cost\": 0.02538775525, \"time\": 25.480451303720475}, \"ea817a074f\": {\"quality\": 0.1976956376956377, \"cost\": 0.031266095, \"time\": 24.096565490961076}, \"eca3d11e40\": {\"quality\": 0.1920706376956377, \"cost\": 0.0259032935, \"time\": 21.720754396915435}, \"ed945b7e4a\": {\"quality\": 0.21056270118770118, \"cost\": 0.03181901100000001, \"time\": 23.55217906832695}, \"ee4ddebe70\": {\"quality\": 0.21718129093129093, \"cost\": 0.007574288250000001, \"time\": 33.07409880757332}, \"ef62e4b003\": {\"quality\": 0.1578144078144078, \"cost\": 0.0220901955, \"time\": 19.725072503089905}, \"efe8e44ba4\": {\"quality\": 0.25713078588078586, \"cost\": 0.00423088875, \"time\": 25.126209509372714}, \"efebd8ca8a\": {\"quality\": 0.12043345543345543, \"cost\": 0.006324273500000001, \"time\": 43.206120282411575}, \"f0a1a27ee9\": {\"quality\": 0.25917291042291035, \"cost\": 0.009095101500000001, 
\"time\": 33.561097860336304}, \"f1c116068c\": {\"quality\": 0.23332244144744144, \"cost\": 0.00174076875, \"time\": 18.031138664484025}, \"f2c938846b\": {\"quality\": 0.262278971028971, \"cost\": 0.025320526250000003, \"time\": 19.31320353746414}, \"f462950863\": {\"quality\": 0.14478951603951604, \"cost\": 0.0007027954999999999, \"time\": 15.553800004720689}, \"f53c8d7a08\": {\"quality\": 0.23630674880674882, \"cost\": 0.0337485395, \"time\": 42.32889723777771}, \"f570083655\": {\"quality\": 0.21157523032523032, \"cost\": 0.014973397000000003, \"time\": 23.602320194244385}, \"f5f303dab2\": {\"quality\": 0.2371788628038628, \"cost\": 0.032701334, \"time\": 22.13268209695816}, \"f70f78f75a\": {\"quality\": 0.27127698690198687, \"cost\": 0.016446131000000003, \"time\": 12.967699444293977}, \"f767c00a27\": {\"quality\": 0.18858911921411922, \"cost\": 0.007980581, \"time\": 24.888762706518172}, \"f9cf3c3d99\": {\"quality\": 0.23966144966144964, \"cost\": 0.014676935000000002, \"time\": 24.44706709384918}, \"f9e84d96b6\": {\"quality\": 0.234640984015984, \"cost\": 0.0256412165, \"time\": 28.137858784198762}, \"faa5cc0998\": {\"quality\": 0.23488719613719616, \"cost\": 0.0153379235, \"time\": 29.529147928953172}, \"faa744b8b7\": {\"quality\": 0.1879650210900211, \"cost\": 0.008215464999999998, \"time\": 26.75429188609123}, \"fabf1b0f34\": {\"quality\": 0.2536892274392274, \"cost\": 0.0017171129999999997, \"time\": 13.97360804080963}, \"fb53f9a5e6\": {\"quality\": 0.22727897102897102, \"cost\": 0.031424654, \"time\": 43.32508450746536}, \"fe3b87fd92\": {\"quality\": 0.2024367993117993, \"cost\": 0.003581219, \"time\": 22.43371703028679}}"
  },
  {
    "path": "abacus-research/biodex-revision-priors-mincost.json",
    "content": "{\"00b02360ef\": {\"quality\": 0.25047369297369293, \"cost\": 0.038505449000000004, \"time\": 33.970101940631864}, \"025a41642e\": {\"quality\": 0.2696782384282384, \"cost\": 0.010557185, \"time\": 31.072653269767763}, \"03c4bc9bb0\": {\"quality\": 0.2581852869352869, \"cost\": 0.010541456000000001, \"time\": 16.739641118049622}, \"04ed954ae9\": {\"quality\": 0.2958458208458209, \"cost\": 0.046920197000000004, \"time\": 19.630667555332185}, \"059dee8af9\": {\"quality\": 0.23122460872460873, \"cost\": 0.008803890499999998, \"time\": 29.826552951335906}, \"05fde83b5e\": {\"quality\": 0.20872460872460877, \"cost\": 0.006782864000000001, \"time\": 30.33404096364975}, \"06b62ff472\": {\"quality\": 0.2968228993228993, \"cost\": 0.005058610000000001, \"time\": 14.343733203411102}, \"0730a1221a\": {\"quality\": 0.26417915417915416, \"cost\": 0.026461371, \"time\": 30.643629491329193}, \"07aec969dd\": {\"quality\": 0.22977744477744477, \"cost\": 0.000723774, \"time\": 38.15098943710327}, \"0a11f0ae2e\": {\"quality\": 0.26241633366633366, \"cost\": 0.009693431000000002, \"time\": 17.0743106007576}, \"0aeb79b24c\": {\"quality\": 0.19304639804639806, \"cost\": 0.0058567, \"time\": 8.773907899856567}, \"0b3eafcb23\": {\"quality\": 0.16931984681984683, \"cost\": 0.0012721839999999998, \"time\": 16.44918472766876}, \"0bc5bbe9f8\": {\"quality\": 0.2640023865023865, \"cost\": 0.030501102000000002, \"time\": 42.8920240521431}, \"0c328ddadf\": {\"quality\": 0.21182165057165053, \"cost\": 0.007213844999999999, \"time\": 25.052031695842743}, \"0db4d3e16d\": {\"quality\": 0.24025238650238653, \"cost\": 0.003130782000000001, \"time\": 18.73849000930786}, \"0df97ab2f2\": {\"quality\": 0.2226345876345876, \"cost\": 0.0073693990000000004, \"time\": 38.507711839675906}, \"0e0162da5c\": {\"quality\": 0.2489013764013764, \"cost\": 0.026232925000000008, \"time\": 28.133001935482024}, \"0e0216352f\": {\"quality\": 0.17297064047064048, \"cost\": 0.008939957, \"time\": 
34.948954832553866}, \"0ecd21b93a\": {\"quality\": 0.20069014319014317, \"cost\": 0.025977679000000004, \"time\": 28.50129954814911}, \"0ff2bde030\": {\"quality\": 0.2610857198357198, \"cost\": 0.031667818, \"time\": 16.528453254699706}, \"1059f21a6c\": {\"quality\": 0.2190234765234765, \"cost\": 0.000557161, \"time\": 9.67062486410141}, \"112b665311\": {\"quality\": 0.2551910589410589, \"cost\": 0.004582068, \"time\": 23.5158061504364}, \"11db1dbc4f\": {\"quality\": 0.23640137640137643, \"cost\": 0.003211713, \"time\": 21.993061244487762}, \"13145c10ec\": {\"quality\": 0.3172365134865135, \"cost\": 0.030289498000000005, \"time\": 31.429670691490173}, \"147915874a\": {\"quality\": 0.19364468864468862, \"cost\": 0.00046279649999999997, \"time\": 40.85908321142197}, \"16e1e2aca6\": {\"quality\": 0.219499666999667, \"cost\": 0.005164350000000001, \"time\": 12.553999137878417}, \"1914dfad94\": {\"quality\": 0.24569014319014312, \"cost\": 0.024254144999999998, \"time\": 19.75359899997711}, \"1ab9dcd7d6\": {\"quality\": 0.29846792096792096, \"cost\": 0.001003366, \"time\": 21.74862095117569}, \"1dc749411d\": {\"quality\": 0.20150238650238647, \"cost\": 0.009444109000000001, \"time\": 15.746719026565552}, \"1dd84b0308\": {\"quality\": 0.2539410589410589, \"cost\": 0.026399116, \"time\": 27.934893941879274}, \"1dd9d9bef2\": {\"quality\": 0.2544617882117882, \"cost\": 0.007334834500000001, \"time\": 31.84605222940445}, \"1f0ba92b8b\": {\"quality\": 0.27228105228105226, \"cost\": 0.016661861, \"time\": 33.20399481058121}, \"1f1b5a019d\": {\"quality\": 0.2561663336663337, \"cost\": 0.03240448, \"time\": 22.484670639038086}, \"1f41a35dd3\": {\"quality\": 0.07833333333333332, \"cost\": 0.004150375499999999, \"time\": 15.255917310714722}, \"212c3fccbf\": {\"quality\": 0.3162624875124875, \"cost\": 0.030542124999999996, \"time\": 23.602639937400816}, \"219d51cd03\": {\"quality\": 0.1531672494172494, \"cost\": 0.0008238629999999999, \"time\": 15.20914809703827}, \"238ed27a56\": 
{\"quality\": 0.28265318015318014, \"cost\": 0.028584483000000004, \"time\": 29.401537466049195}, \"23f3d97d47\": {\"quality\": 0.28292915417915415, \"cost\": 0.026455834000000004, \"time\": 34.8084576010704}, \"253547264d\": {\"quality\": 0.20348651348651345, \"cost\": 0.0261750685, \"time\": 20.272440803050994}, \"27e14a0973\": {\"quality\": 0.2700124875124875, \"cost\": 0.031140450000000004, \"time\": 27.74119987487793}, \"29960739d3\": {\"quality\": 0.18854728604728602, \"cost\": 0.030525630000000005, \"time\": 26.206378316879274}, \"2b0a8d5614\": {\"quality\": 0.2262060162060162, \"cost\": 0.021956766, \"time\": 16.901934444904327}, \"2ccd104a68\": {\"quality\": 0.23872460872460874, \"cost\": 0.017867876999999997, \"time\": 46.65757336616516}, \"2d1b0cd5d9\": {\"quality\": 0.2476134976134976, \"cost\": 0.005795422000000001, \"time\": 17.555127108097075}, \"2dcffa3d30\": {\"quality\": 0.3144208569208569, \"cost\": 0.008584119999999999, \"time\": 24.954598808288573}, \"2ebe033e6a\": {\"quality\": 0.2554706404706405, \"cost\": 0.007171497000000001, \"time\": 16.872591602802277}, \"31e8ef9410\": {\"quality\": 0.06845238095238096, \"cost\": 0.0007121749999999999, \"time\": 75.84835036993027}, \"32a628c971\": {\"quality\": 0.31888333888333886, \"cost\": 0.035293712000000005, \"time\": 27.239106237888336}, \"336e1a4fdc\": {\"quality\": 0.22028138528138527, \"cost\": 0.030691956000000003, \"time\": 36.1869512796402}, \"33e0fdf4e1\": {\"quality\": 0.2365234765234765, \"cost\": 0.005039102, \"time\": 23.91675148010254}, \"38a60c768a\": {\"quality\": 0.2449529637029637, \"cost\": 0.025963075000000002, \"time\": 34.96920952796936}, \"39bb54fbfb\": {\"quality\": 0.29789127539127536, \"cost\": 0.030766864999999997, \"time\": 45.52991482019424}, \"3ace395562\": {\"quality\": 0.24872460872460872, \"cost\": 0.024912788, \"time\": 23.518352210521698}, \"3bbe08bf63\": {\"quality\": 0.33796703296703295, \"cost\": 0.032394053000000006, \"time\": 37.81355751752854}, \"3caf8d77c5\": 
{\"quality\": 0.3083079420579421, \"cost\": 0.03448324800000001, \"time\": 29.443913543224333}, \"3d25abc0cb\": {\"quality\": 0.2672743922743922, \"cost\": 0.008724563000000001, \"time\": 37.292464637756346}, \"3e19290ef5\": {\"quality\": 0.29275238650238655, \"cost\": 0.024963852000000005, \"time\": 22.140427541732787}, \"3e8e089a02\": {\"quality\": 0.280807942057942, \"cost\": 0.0233879175, \"time\": 11.965235376358033}, \"3f1771caee\": {\"quality\": 0.143228576978577, \"cost\": 0.003751839, \"time\": 24.353824079036713}, \"3f7beb53c4\": {\"quality\": 0.29019411144411145, \"cost\": 0.00168912, \"time\": 15.684711122512818}, \"4081f36f1d\": {\"quality\": 0.2203912753912754, \"cost\": 0.008877949000000001, \"time\": 35.81259698867798}, \"410edf5f69\": {\"quality\": 0.2731690531690532, \"cost\": 0.032445931, \"time\": 40.782276201248166}, \"44ec78f301\": {\"quality\": 0.2575774225774226, \"cost\": 0.030435683499999998, \"time\": 29.96491810083389}, \"44eeb11408\": {\"quality\": 0.27455794205794204, \"cost\": 0.026113869, \"time\": 22.3088561296463}, \"45623e6def\": {\"quality\": 0.34303016428016425, \"cost\": 0.032769938, \"time\": 30.375658011436464}, \"470749179f\": {\"quality\": 0.26199966699966704, \"cost\": 0.004939885999999999, \"time\": 24.02986795902252}, \"47ea56eba0\": {\"quality\": 0.2065848040848041, \"cost\": 0.017984692000000004, \"time\": 29.315977609157564}, \"49fc8b3768\": {\"quality\": 0.16132201132201132, \"cost\": 0.002801354, \"time\": 18.934259843826293}, \"4a60adbf47\": {\"quality\": 0.17342823842823843, \"cost\": 0.013229327499999999, \"time\": 26.285319995880126}, \"4ba372d292\": {\"quality\": 0.26265623265623267, \"cost\": 0.0073457165, \"time\": 27.0092391371727}, \"4bf38b1922\": {\"quality\": 0.24983571983571987, \"cost\": 0.030574272000000006, \"time\": 43.60012021064758}, \"4c415b4b86\": {\"quality\": 0.15533175158175158, \"cost\": 0.024658497000000005, \"time\": 14.953267228603362}, \"4d1fc31b49\": {\"quality\": 0.3281690531690532, 
\"cost\": 0.023400054000000003, \"time\": 16.318808138370514}, \"4d8c84f53a\": {\"quality\": 0.2701134976134976, \"cost\": 0.026254961000000004, \"time\": 29.027335906028746}, \"4df2245b9b\": {\"quality\": 0.12687756687756688, \"cost\": 0.001181844, \"time\": 4.539927685260773}, \"4e9274d39b\": {\"quality\": 0.2406522644022644, \"cost\": 0.005825131000000001, \"time\": 15.826024615764618}, \"4f87ed7c2e\": {\"quality\": 0.2662246087246087, \"cost\": 0.008736922999999999, \"time\": 27.13904390335083}, \"51a340bebc\": {\"quality\": 0.24584582084582088, \"cost\": 0.0313118765, \"time\": 28.96012043952942}, \"52dee74e25\": {\"quality\": 0.1881456043956044, \"cost\": 0.031453754, \"time\": 34.28346945047379}, \"533fddb3bd\": {\"quality\": 0.13842823842823843, \"cost\": 0.00165954, \"time\": 9.557414150238037}, \"5358cb5855\": {\"quality\": 0.18078934953934955, \"cost\": 0.026390755999999998, \"time\": 21.654051101207735}, \"53b6c3e00f\": {\"quality\": 0.25569014319014316, \"cost\": 0.000717601, \"time\": 10.53179224729538}, \"5400c14640\": {\"quality\": 0.30376248751248747, \"cost\": 0.028650208999999992, \"time\": 38.553849005699156}, \"55e373a7e2\": {\"quality\": 0.12201770451770451, \"cost\": 0.039114879000000005, \"time\": 18.304805862903596}, \"5652070ecb\": {\"quality\": 0.21717698967698967, \"cost\": 0.007322918, \"time\": 32.560013604164126}, \"56c5cbc75a\": {\"quality\": 0.21707903207903206, \"cost\": 0.000547569, \"time\": 38.314506363868716}, \"577b168d21\": {\"quality\": 0.22071178821178825, \"cost\": 0.006765484, \"time\": 12.956210255622864}, \"57cabf0e4b\": {\"quality\": 0.21348651348651346, \"cost\": 0.024673236, \"time\": 20.83515272140503}, \"58dbfc0499\": {\"quality\": 0.3025124875124875, \"cost\": 0.053043769000000004, \"time\": 33.01721404790878}, \"5a5739c3d2\": {\"quality\": 0.16318528693528694, \"cost\": 0.006033024, \"time\": 30.727768671512603}, \"5b12979e52\": {\"quality\": 0.15083694083694082, \"cost\": 0.0036988935, \"time\": 
18.811239528656007}, \"5b384da665\": {\"quality\": 0.24699841824841826, \"cost\": 0.014450417999999998, \"time\": 25.65202077627182}, \"5b837baf2e\": {\"quality\": 0.2600124875124875, \"cost\": 0.03879646400000001, \"time\": 38.47186998128891}, \"5c8acd9a75\": {\"quality\": 0.2188238150738151, \"cost\": 0.026283571999999998, \"time\": 32.48901460170746}, \"5da68874af\": {\"quality\": 0.14364649239649238, \"cost\": 0.008979280499999999, \"time\": 33.84627612829208}, \"5e304846b6\": {\"quality\": 0.11458083583083582, \"cost\": 0.007788067, \"time\": 22.096365022659302}, \"5e608d2a8b\": {\"quality\": 0.15110375735375733, \"cost\": 0.0008234890000000002, \"time\": 19.88741739988327}, \"5e6ad9fb8a\": {\"quality\": 0.24834582084582085, \"cost\": 0.024505677000000003, \"time\": 15.4741956949234}, \"5ec5b185c9\": {\"quality\": 0.2114807414807415, \"cost\": 0.0023043549999999997, \"time\": 37.4191596031189}, \"608e332141\": {\"quality\": 0.34233571983571986, \"cost\": 0.035189383000000005, \"time\": 22.62248021364212}, \"610753ffe5\": {\"quality\": 0.31424048174048175, \"cost\": 0.029135328, \"time\": 19.57549432516098}, \"614e59c7e7\": {\"quality\": 0.2492218892218892, \"cost\": 0.001694424, \"time\": 18.55128355026245}, \"62320710f0\": {\"quality\": 0.3199746087246087, \"cost\": 0.03502582600000001, \"time\": 31.618607234954833}, \"649df27d73\": {\"quality\": 0.2782466144966145, \"cost\": 0.032545105, \"time\": 19.415873074531554}, \"64f956ba72\": {\"quality\": 0.2931690531690532, \"cost\": 0.026772999999999998, \"time\": 9.635773408412934}, \"65657a3d04\": {\"quality\": 0.20616633366633366, \"cost\": 0.011125794, \"time\": 52.96317781209946}, \"6604abbd43\": {\"quality\": 0.30366633366633367, \"cost\": 0.025960788500000005, \"time\": 24.2287339925766}, \"67050fa89b\": {\"quality\": 0.25265623265623266, \"cost\": 0.030506207, \"time\": 39.747187566757205}, \"67f05c5985\": {\"quality\": 0.24658300033300035, \"cost\": 0.02588163, \"time\": 20.159594106674195}, 
\"67fe27bce0\": {\"quality\": 0.3328912753912754, \"cost\": 0.039865689, \"time\": 32.06505537033081}, \"698d1b1d0e\": {\"quality\": 0.31830794205794205, \"cost\": 0.03946355800000001, \"time\": 51.54699250459671}, \"69e8aa9f73\": {\"quality\": 0.18985680985680986, \"cost\": 0.007278381, \"time\": 38.63639385700226}, \"6b240a8971\": {\"quality\": 0.20597027972027973, \"cost\": 0.008727093000000002, \"time\": 28.28091263771057}, \"6b56430005\": {\"quality\": 0.3260460372960373, \"cost\": 0.026531200000000005, \"time\": 41.742786502838136}, \"6b611b7193\": {\"quality\": 0.2922743922743923, \"cost\": 0.066505375, \"time\": 17.20961241722107}, \"6b77ef93d3\": {\"quality\": 0.17878232878232875, \"cost\": 0.001498626, \"time\": 14.678290736675262}, \"6beefce17b\": {\"quality\": 0.30852439227439227, \"cost\": 0.030691372000000005, \"time\": 24.38176097869873}, \"6cb7c9a802\": {\"quality\": 0.33327006327006325, \"cost\": 0.030604636500000004, \"time\": 39.88295543193817}, \"6cba1da5b3\": {\"quality\": 0.15214035964035963, \"cost\": 0.013090929000000001, \"time\": 21.816087329387663}, \"6dea54bf99\": {\"quality\": 0.2879309579309579, \"cost\": 0.004796466000000001, \"time\": 23.340252709388732}, \"6eda6f7780\": {\"quality\": 0.2093607781107781, \"cost\": 0.010549649000000001, \"time\": 36.39340627193451}, \"708f3ca5da\": {\"quality\": 0.13352439227439225, \"cost\": 0.0014636204999999998, \"time\": 14.394513404369354}, \"72d75a5f01\": {\"quality\": 0.22862359862359863, \"cost\": 0.014409670999999999, \"time\": 24.5260990858078}, \"72d99dee93\": {\"quality\": 0.2751513764013764, \"cost\": 0.028088187000000008, \"time\": 41.42688899040222}, \"73272e5bd3\": {\"quality\": 0.3387246087246087, \"cost\": 0.030827000000000004, \"time\": 27.08615951538086}, \"74608570f5\": {\"quality\": 0.11727564102564103, \"cost\": 0.000531192, \"time\": 25.665570771694185}, \"7531840182\": {\"quality\": 0.1608241758241758, \"cost\": 0.0018236030000000005, \"time\": 65.24554077386856}, 
\"77ba656673\": {\"quality\": 0.15628538128538128, \"cost\": 0.008854519500000001, \"time\": 28.12923115491867}, \"77fe41f02f\": {\"quality\": 0.15589035964035963, \"cost\": 0.042701628000000005, \"time\": 38.7138086438179}, \"7857e95756\": {\"quality\": 0.20908480408480407, \"cost\": 0.014196858000000001, \"time\": 26.021716463565827}, \"78984e94f0\": {\"quality\": 0.19880702630702632, \"cost\": 0.0067921480000000005, \"time\": 24.860538172721864}, \"7986633543\": {\"quality\": 0.25714035964035964, \"cost\": 0.0073338084999999996, \"time\": 35.33250515460968}, \"7bdbc32b57\": {\"quality\": 0.3116196303696303, \"cost\": 0.0451492925, \"time\": 26.672986245155336}, \"7c89cff787\": {\"quality\": 0.1501694139194139, \"cost\": 0.0016053209999999998, \"time\": 20.911856007575988}, \"7dc51ec191\": {\"quality\": 0.34112359862359865, \"cost\": 0.039412432000000004, \"time\": 45.49512637853623}, \"7e4a6245ac\": {\"quality\": 0.13822802197802198, \"cost\": 0.0004819685, \"time\": 76.24876043796539}, \"7ebc55ebcf\": {\"quality\": 0.3203912753912754, \"cost\": 0.008825814999999997, \"time\": 52.15610808134079}, \"7f14114a48\": {\"quality\": 0.21598651348651346, \"cost\": 0.012735542499999999, \"time\": 17.204772758483887}, \"7fb587503e\": {\"quality\": 0.22814560439560436, \"cost\": 0.009006149000000001, \"time\": 18.705357003211976}, \"7ff82b6f8d\": {\"quality\": 0.22926948051948054, \"cost\": 0.028896988000000002, \"time\": 20.79049743413925}, \"806a5ef096\": {\"quality\": 0.27122460872460874, \"cost\": 0.023206115, \"time\": 11.138013732433318}, \"8156d78e42\": {\"quality\": 0.251999666999667, \"cost\": 0.0073474729999999985, \"time\": 36.957821094989775}, \"815cfb848c\": {\"quality\": 0.2520958208458209, \"cost\": 0.0247796475, \"time\": 23.402184796333312}, \"820b42e0b1\": {\"quality\": 0.2400949050949051, \"cost\": 0.002655385, \"time\": 22.863080990314483}, \"82a733626e\": {\"quality\": 0.30914127539127534, \"cost\": 0.028937054, \"time\": 23.70889393091202}, 
\"83d6300ced\": {\"quality\": 0.2575124875124875, \"cost\": 0.008893523, \"time\": 39.3106644153595}, \"844c822d56\": {\"quality\": 0.15672369297369299, \"cost\": 0.02642007, \"time\": 22.442726397514342}, \"84f9f20f75\": {\"quality\": 0.2695976245976246, \"cost\": 0.023199555000000004, \"time\": 15.404870545864105}, \"8525986a00\": {\"quality\": 0.2975124875124875, \"cost\": 0.030195346000000005, \"time\": 38.38539198637009}, \"85db012139\": {\"quality\": 0.2975142912642913, \"cost\": 0.04670653, \"time\": 23.935909843444826}, \"85ddc8a71e\": {\"quality\": 0.2362246087246087, \"cost\": 0.005867745, \"time\": 22.264061617851258}, \"8767267bed\": {\"quality\": 0.1742567155067155, \"cost\": 0.006601124, \"time\": 7.172838580608368}, \"882238f677\": {\"quality\": 0.14891941391941393, \"cost\": 0.003255827, \"time\": 27.542255055904388}, \"88f78bdeaf\": {\"quality\": 0.09306623931623932, \"cost\": 0.000593519, \"time\": 27.451544547080992}, \"8af1605300\": {\"quality\": 0.29047369297369297, \"cost\": 0.007263364000000003, \"time\": 27.19462056159973}, \"8c09cba5e2\": {\"quality\": 0.24566905316905316, \"cost\": 0.0249850575, \"time\": 30.331667792797088}, \"8e5d966cf6\": {\"quality\": 0.13822802197802198, \"cost\": 0.001409651, \"time\": 68.11073198318482}, \"8eacbc7240\": {\"quality\": 0.21547369297369298, \"cost\": 0.012992563000000002, \"time\": 23.104503071308137}, \"8fc917c575\": {\"quality\": 0.24955794205794207, \"cost\": 0.007428123, \"time\": 26.517943835258485}, \"8fe43ec148\": {\"quality\": 0.25275724275724276, \"cost\": 0.0030687225, \"time\": 27.1856760263443}, \"900de152da\": {\"quality\": 0.28334582084582083, \"cost\": 0.026874375, \"time\": 8.219515359401703}, \"901ab65b55\": {\"quality\": 0.34566905316905316, \"cost\": 0.027035125000000004, \"time\": 9.139500224590302}, \"918182aa18\": {\"quality\": 0.18560592185592187, \"cost\": 0.002369203, \"time\": 56.10349173545838}, \"91ac37a21a\": {\"quality\": 0.128228021978022, \"cost\": 0.006092536000000001, 
\"time\": 74.77904181480407}, \"9252afa4ec\": {\"quality\": 0.2821373071373071, \"cost\": 0.024976994000000002, \"time\": 19.884121739864348}, \"939b0c3ed9\": {\"quality\": 0.24926157176157174, \"cost\": 0.009163022, \"time\": 20.210047280788423}, \"93a705cb0d\": {\"quality\": 0.25301157176157174, \"cost\": 0.01034815, \"time\": 28.735098361968994}, \"94091e2968\": {\"quality\": 0.20182109557109557, \"cost\": 0.010538336, \"time\": 30.044746005535124}, \"945a9565fa\": {\"quality\": 0.177733793983794, \"cost\": 0.005453404, \"time\": 10.252689445018769}, \"946baa620a\": {\"quality\": 0.2701134976134976, \"cost\": 0.024427862, \"time\": 11.070768821239472}, \"950ed976dd\": {\"quality\": 0.09488455988455988, \"cost\": 0.0034412205, \"time\": 20.066544604301452}, \"955689ee8f\": {\"quality\": 0.17957417582417584, \"cost\": 0.0017661469999999998, \"time\": 37.72812922000885}, \"95c9b1b1dd\": {\"quality\": 0.2563003663003663, \"cost\": 0.030597902, \"time\": 27.93373385667801}, \"95ebba630d\": {\"quality\": 0.19334887334887335, \"cost\": 0.004341687, \"time\": 18.660895478725433}, \"966282f059\": {\"quality\": 0.16358821733821732, \"cost\": 0.022551696000000003, \"time\": 21.573746180534364}, \"96c401772b\": {\"quality\": 0.2422365134865135, \"cost\": 0.0026645590000000004, \"time\": 17.938199400901794}, \"9718d880f6\": {\"quality\": 0.20941905316905318, \"cost\": 0.030509630000000003, \"time\": 42.891600167751314}, \"971ae914cb\": {\"quality\": 0.3079922854922855, \"cost\": 0.024098346, \"time\": 11.75359138250351}, \"97566e70dd\": {\"quality\": 0.21477439227439227, \"cost\": 0.016637812999999998, \"time\": 31.738031923770905}, \"97c882ca1f\": {\"quality\": 0.2701134976134976, \"cost\": 0.034874281, \"time\": 22.407317924499512}, \"98e28fbb93\": {\"quality\": 0.18297369297369298, \"cost\": 0.009208929000000001, \"time\": 11.046243464946746}, \"992ab25916\": {\"quality\": 0.24106893106893104, \"cost\": 0.001710615, \"time\": 18.1801389336586}, \"9a271dff29\": 
{\"quality\": 0.20304639804639804, \"cost\": 0.009020588999999999, \"time\": 25.38304090499878}, \"9aa6fd8b03\": {\"quality\": 0.2443198468198468, \"cost\": 0.0058449355000000005, \"time\": 14.622885823249817}, \"9b12b7d441\": {\"quality\": 0.3142615717615717, \"cost\": 0.010453449, \"time\": 24.59365568161011}, \"9bf31a2127\": {\"quality\": 0.2737246087246087, \"cost\": 0.032263425, \"time\": 41.504128229618075}, \"9c4ad5aed3\": {\"quality\": 0.2762624875124875, \"cost\": 0.024755695000000005, \"time\": 21.714357328414916}, \"9cbb34ee5f\": {\"quality\": 0.23064740814740814, \"cost\": 0.0023012260000000004, \"time\": 23.29413582086563}, \"9e16d7ba9c\": {\"quality\": 0.25935772560772563, \"cost\": 0.031369863000000005, \"time\": 33.015732765197754}, \"9e1a7dd196\": {\"quality\": 0.19444194694194694, \"cost\": 0.0035900009999999998, \"time\": 15.451323175430298}, \"9e6775ac33\": {\"quality\": 0.06856227106227106, \"cost\": 0.0017025660000000002, \"time\": 15.161614441871643}, \"9e8674b6c7\": {\"quality\": 0.25872460872460873, \"cost\": 0.032513841, \"time\": 38.995942330360414}, \"9ea12d8d80\": {\"quality\": 0.14647893772893772, \"cost\": 0.006858684000000001, \"time\": 12.53976230621338}, \"a232f1dfb6\": {\"quality\": 0.2482117882117882, \"cost\": 0.0017046329999999999, \"time\": 17.978844976425172}, \"a3b7cb6c33\": {\"quality\": 0.24213369963369963, \"cost\": 0.00900874, \"time\": 19.766272163391115}, \"a72f24f687\": {\"quality\": 0.21526862026862026, \"cost\": 0.007625456000000003, \"time\": 14.816489315032959}, \"a8aac20fd9\": {\"quality\": 0.33620115995115996, \"cost\": 0.027009404000000004, \"time\": 20.422273874282837}, \"a9e811d4a7\": {\"quality\": 0.27455794205794204, \"cost\": 0.023571902, \"time\": 17.510642659664153}, \"aa1fc47a86\": {\"quality\": 0.23011349761349761, \"cost\": 0.024848436999999998, \"time\": 23.83184322118759}, \"aa67b102e7\": {\"quality\": 0.29247460872460873, \"cost\": 0.029215488, \"time\": 25.6765993475914}, \"ad453a813f\": 
{\"quality\": 0.2581474081474081, \"cost\": 0.0031871219999999997, \"time\": 17.01520048379898}, \"add25d67a0\": {\"quality\": 0.19999750249750248, \"cost\": 0.027308536, \"time\": 21.267851161956788}, \"adf6ae1ba7\": {\"quality\": 0.15185772560772562, \"cost\": 0.0030511870000000003, \"time\": 10.442331075668335}, \"af57afe626\": {\"quality\": 0.25730269730269734, \"cost\": 0.0027442350000000003, \"time\": 20.559476172924043}, \"b07ae35700\": {\"quality\": 0.25820693195693195, \"cost\": 0.0031182555, \"time\": 21.726053476333618}, \"b09741c8f7\": {\"quality\": 0.13642246642246642, \"cost\": 0.00039643100000000004, \"time\": 36.347797584533694}, \"b1b81c0847\": {\"quality\": 0.35997460872460874, \"cost\": 0.0234363775, \"time\": 12.63291654586792}, \"b38fbeda99\": {\"quality\": 0.18547369297369298, \"cost\": 0.008016031, \"time\": 29.251738607883453}, \"b3b92b1835\": {\"quality\": 0.24926157176157174, \"cost\": 0.04590919, \"time\": 22.62844548225403}, \"b3bba3eee2\": {\"quality\": 0.19231532356532358, \"cost\": 0.007909536, \"time\": 42.60630009174347}, \"b40727bd68\": {\"quality\": 0.2634468309468309, \"cost\": 0.024455202000000006, \"time\": 20.662753903865813}, \"b6a0f78896\": {\"quality\": 0.264456931956932, \"cost\": 0.008735882999999998, \"time\": 34.55237965583801}, \"b84e122880\": {\"quality\": 0.2734468309468309, \"cost\": 0.008659572, \"time\": 40.34617894887924}, \"b8c6c44a39\": {\"quality\": 0.30366633366633367, \"cost\": 0.0289365195, \"time\": 22.390970599651336}, \"bb9ed9dffe\": {\"quality\": 0.16283660783660783, \"cost\": 0.001536843, \"time\": 17.358729672431945}, \"bbc50a0411\": {\"quality\": 0.22165931290931287, \"cost\": 0.0031182930000000003, \"time\": 28.581140315532686}, \"bbd0498616\": {\"quality\": 0.22491633366633365, \"cost\": 0.051756142000000005, \"time\": 17.691377997398376}, \"bd018ec9d7\": {\"quality\": 0.19921703296703297, \"cost\": 0.0005695385, \"time\": 72.71849246025086}, \"be11a8a86e\": {\"quality\": 0.2582117882117882, 
\"cost\": 0.013947379000000001, \"time\": 26.179900193214415}, \"bfa93a259c\": {\"quality\": 0.3037246087246087, \"cost\": 0.0248712575, \"time\": 27.526713395118712}, \"c127942d8c\": {\"quality\": 0.2636650849150849, \"cost\": 0.0032525069999999995, \"time\": 24.242635214328764}, \"c14216d744\": {\"quality\": 0.24926157176157174, \"cost\": 0.010124539499999998, \"time\": 25.00144135951996}, \"c2f0bc6921\": {\"quality\": 0.14888458763458762, \"cost\": 0.000605392, \"time\": 14.55265941619873}, \"c440b67f31\": {\"quality\": 0.21598651348651346, \"cost\": 0.02981187, \"time\": 24.81428416967392}, \"caa29fbe8c\": {\"quality\": 0.27789127539127534, \"cost\": 0.0247254, \"time\": 26.996189880371094}, \"cd46331132\": {\"quality\": 0.30287448662448657, \"cost\": 0.005876232, \"time\": 16.816848158836365}, \"cdd01242f5\": {\"quality\": 0.3245579420579421, \"cost\": 0.029167213999999997, \"time\": 18.141655015945435}, \"cdf27f19d3\": {\"quality\": 0.2780494505494505, \"cost\": 0.005842002000000001, \"time\": 15.971444475650788}, \"ce7d236454\": {\"quality\": 0.28977258852258847, \"cost\": 0.030690639, \"time\": 30.128969633579253}, \"d117c8e23b\": {\"quality\": 0.3412246087246087, \"cost\": 0.03989662200000001, \"time\": 39.21010599136353}, \"d27a5ad38a\": {\"quality\": 0.1940848040848041, \"cost\": 0.024636320000000003, \"time\": 19.60495171546936}, \"d287ceaae7\": {\"quality\": 0.3262624875124875, \"cost\": 0.035333185, \"time\": 32.71745250225067}, \"d29027747a\": {\"quality\": 0.0815018315018315, \"cost\": 0.000420386, \"time\": 37.273941445350644}, \"d2d7780d31\": {\"quality\": 0.2843198468198468, \"cost\": 0.031180686000000006, \"time\": 28.442743134498595}, \"d566a94f3a\": {\"quality\": 0.30195873570873577, \"cost\": 0.02176175, \"time\": 8.928415286540986}, \"d64dd29abb\": {\"quality\": 0.3114410589410589, \"cost\": 0.03748003400000001, \"time\": 31.35522733926773}, \"d75495aed3\": {\"quality\": 0.24872460872460872, \"cost\": 0.017243897, \"time\": 
37.48555475473404}, \"d93337a87d\": {\"quality\": 0.2603912753912754, \"cost\": 0.009152621000000001, \"time\": 21.659664070606233}, \"de08b84b51\": {\"quality\": 0.2466244866244866, \"cost\": 0.005220604999999999, \"time\": 13.204281628131866}, \"de39f4f113\": {\"quality\": 0.1780698468198468, \"cost\": 0.045369664999999997, \"time\": 15.614853060245514}, \"de83615ac5\": {\"quality\": 0.32421883671883667, \"cost\": 0.0074163139999999985, \"time\": 25.368940019607543}, \"dfa1e5267d\": {\"quality\": 0.24926157176157174, \"cost\": 0.001261335, \"time\": 7.206225287914276}, \"dfedfba6df\": {\"quality\": 0.24569014319014312, \"cost\": 0.0042948735, \"time\": 20.24753314256668}, \"e15d59fc9d\": {\"quality\": 0.33080794205794206, \"cost\": 0.011172813, \"time\": 53.19109081029892}, \"e540f66a9d\": {\"quality\": 0.14333083583083583, \"cost\": 0.024458602000000006, \"time\": 16.43147575855255}, \"e5c7c9fce2\": {\"quality\": 0.28936258186258185, \"cost\": 0.024852987500000003, \"time\": 23.036948883533476}, \"e73beb7df9\": {\"quality\": 0.1644678932178932, \"cost\": 0.0206705125, \"time\": 19.65618509054184}, \"e939aaa8b3\": {\"quality\": 0.24094683094683092, \"cost\": 0.026475653000000005, \"time\": 27.854652655124664}, \"e9e55ba5db\": {\"quality\": 0.16214035964035964, \"cost\": 0.024552966000000002, \"time\": 22.01356840133667}, \"ea817a074f\": {\"quality\": 0.2803912753912754, \"cost\": 0.030159274000000003, \"time\": 26.384180903434753}, \"eca3d11e40\": {\"quality\": 0.25997460872460876, \"cost\": 0.024926519, \"time\": 22.995232653617858}, \"ed945b7e4a\": {\"quality\": 0.28362540237540235, \"cost\": 0.030488778000000005, \"time\": 23.277513599395753}, \"ee4ddebe70\": {\"quality\": 0.2726959151959152, \"cost\": 0.007441470000000001, \"time\": 33.65448969602585}, \"ef62e4b003\": {\"quality\": 0.15812881562881562, \"cost\": 0.020466967000000003, \"time\": 19.30031136274338}, \"efe8e44ba4\": {\"quality\": 0.2592615717615717, \"cost\": 0.0040250775, \"time\": 
21.48128048181534}, \"efebd8ca8a\": {\"quality\": 0.11086691086691088, \"cost\": 0.004719911, \"time\": 28.901436603069307}, \"f0a1a27ee9\": {\"quality\": 0.25251248751248745, \"cost\": 0.008780909, \"time\": 30.412869799137116}, \"f1c116068c\": {\"quality\": 0.2574782162282162, \"cost\": 0.001708935, \"time\": 17.96944682598114}, \"f2c938846b\": {\"quality\": 0.32955794205794203, \"cost\": 0.024549196500000002, \"time\": 19.827334237098693}, \"f462950863\": {\"quality\": 0.18207903207903206, \"cost\": 0.0006345659999999999, \"time\": 19.180297017097473}, \"f53c8d7a08\": {\"quality\": 0.27761349761349763, \"cost\": 0.032353539, \"time\": 41.01129903793335}, \"f570083655\": {\"quality\": 0.25315046065046065, \"cost\": 0.014203856, \"time\": 25.355476438999176}, \"f5f303dab2\": {\"quality\": 0.26935772560772564, \"cost\": 0.031210957999999994, \"time\": 26.234438502788542}, \"f70f78f75a\": {\"quality\": 0.31005397380397376, \"cost\": 0.015462705999999998, \"time\": 15.054353976249695}, \"f767c00a27\": {\"quality\": 0.1513449050949051, \"cost\": 0.007778000499999998, \"time\": 23.562188839912416}, \"f9cf3c3d99\": {\"quality\": 0.30015623265623265, \"cost\": 0.013173157, \"time\": 23.90029693841934}, \"f9e84d96b6\": {\"quality\": 0.2451153013653013, \"cost\": 0.024730744000000002, \"time\": 21.731143033504488}, \"faa5cc0998\": {\"quality\": 0.27644105894105897, \"cost\": 0.014485047000000001, \"time\": 30.353227543830872}, \"faa744b8b7\": {\"quality\": 0.17509670884670886, \"cost\": 0.007972195999999999, \"time\": 26.635566544532775}, \"fabf1b0f34\": {\"quality\": 0.3082117882117882, \"cost\": 0.0016653059999999996, \"time\": 13.257695925235748}, \"fb53f9a5e6\": {\"quality\": 0.312057942057942, \"cost\": 0.030517666500000006, \"time\": 38.16447761058807}, \"fe3b87fd92\": {\"quality\": 0.2373735986235986, \"cost\": 0.003384599, \"time\": 24.418578135967255}}"
  },
  {
    "path": "abacus-research/cheap-priors-cascades.json",
    "content": "{\"0005c18b69\": {\"quality\": 0.5422666666666667, \"cost\": 1.62e-06, \"time\": 0.026}, \"009df798a3\": {\"quality\": 0.470225, \"cost\": 1.59e-06, \"time\": 0.027999999999999997}, \"00c93aec22\": {\"quality\": 0.426725, \"cost\": 3.1199999999999998e-06, \"time\": 0.030900000000000004}, \"00e1fecc4c\": {\"quality\": 0.48889999999999995, \"cost\": 2.24e-06, \"time\": 0.026199999999999998}, \"00f4acd0d3\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"01413aa72d\": {\"quality\": 0.44705, \"cost\": 2.3600000000000003e-06, \"time\": 0.0293}, \"01c2f973ad\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"02078988c1\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.031}, \"021604dec1\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"02410c662e\": {\"quality\": 0.4013, \"cost\": 1.8e-06, \"time\": 0.0299}, \"0262668df7\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"0267c97b70\": {\"quality\": 0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.0308}, \"02ae38e4aa\": {\"quality\": 0.45935000000000004, \"cost\": 1.59e-06, \"time\": 0.0247}, \"02f49fe0fd\": {\"quality\": 0.365, \"cost\": 2.4e-07, \"time\": 0.0128}, \"030756558c\": {\"quality\": 0.45363333333333333, \"cost\": 9.9e-07, \"time\": 0.0226}, \"033ca325e6\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"038a5f0a62\": {\"quality\": 0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.030799999999999998}, \"041b5af43d\": {\"quality\": 0.52195, \"cost\": 3.830000000000001e-06, \"time\": 0.0413}, \"042d933706\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"04397effa0\": {\"quality\": 0.46777500000000005, \"cost\": 2.43e-06, \"time\": 0.0333}, \"0539e0b42d\": {\"quality\": 0.5050250000000001, \"cost\": 2.7500000000000004e-06, \"time\": 0.0365}, \"0554568b86\": {\"quality\": 0.4744, \"cost\": 
2.24e-06, \"time\": 0.0229}, \"06493715cc\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"067ee6e91b\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"068b66f00d\": {\"quality\": 0.4630666666666667, \"cost\": 2e-06, \"time\": 0.026699999999999998}, \"0695f9b5fc\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"073ed5b301\": {\"quality\": 0.39890000000000003, \"cost\": 1.6799999999999998e-06, \"time\": 0.020200000000000003}, \"079feb14a8\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.0263}, \"07a3a7daf7\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"08127cd6dd\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"0833133620\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"089565077c\": {\"quality\": 0.365, \"cost\": 1.2e-07, \"time\": 0.0064}, \"08bf8cc191\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"08e1802287\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"090cd3ef31\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"0944a921e8\": {\"quality\": 0.48890000000000006, \"cost\": 2.24e-06, \"time\": 0.026199999999999998}, \"0947216ece\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.0304}, \"096d51f670\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"09791c731b\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"0990c0d4f8\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"0a128688c1\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.0229}, \"0a4c1bbb4a\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.018299999999999997}, \"0ac969dde3\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.040799999999999996}, \"0af1efab0e\": {\"quality\": 0.4021666666666666, \"cost\": 8.4e-07, \"time\": 
0.0149}, \"0b1ed7ff58\": {\"quality\": 0.5244, \"cost\": 2.99e-06, \"time\": 0.036}, \"0b3dc2e896\": {\"quality\": 0.45555, \"cost\": 2.6e-06, \"time\": 0.025500000000000002}, \"0b43e94f3f\": {\"quality\": 0.5121, \"cost\": 1.88e-06, \"time\": 0.0203}, \"0b4ab72197\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"0be862a0dc\": {\"quality\": 0.4762, \"cost\": 2.07e-06, \"time\": 0.027399999999999997}, \"0bf3129ae8\": {\"quality\": 0.4794666666666667, \"cost\": 1.23e-06, \"time\": 0.0221}, \"0c020b86a3\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.0224}, \"0c6c7fe96a\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"0c81c8996a\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"0cdc5954dd\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.031}, \"0d25188bf7\": {\"quality\": 0.4098, \"cost\": 2.04e-06, \"time\": 0.0261}, \"0d9d767ae5\": {\"quality\": 0.5611333333333334, \"cost\": 3.6400000000000003e-06, \"time\": 0.0342}, \"0e36342fe7\": {\"quality\": 0.39885000000000004, \"cost\": 1.3199999999999999e-06, \"time\": 0.0176}, \"0e7e862290\": {\"quality\": 0.4021666666666666, \"cost\": 8.4e-07, \"time\": 0.0149}, \"0e91cd07f9\": {\"quality\": 0.4762, \"cost\": 2.07e-06, \"time\": 0.027399999999999997}, \"0ec672e7c8\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.037500000000000006}, \"0ed243f788\": {\"quality\": 0.543775, \"cost\": 3.23e-06, \"time\": 0.0355}, \"0eeb372802\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"0effe9b1dc\": {\"quality\": 0.40216666666666673, \"cost\": 8.4e-07, \"time\": 0.0149}, \"0f7faf684d\": {\"quality\": 0.52195, \"cost\": 3.83e-06, \"time\": 0.0413}, \"0fcec544e3\": {\"quality\": 0.5064500000000001, \"cost\": 1.9799999999999997e-06, \"time\": 0.0286}, \"0ff126ebf8\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, 
\"112d9a3421\": {\"quality\": 0.455475, \"cost\": 3.2000000000000003e-06, \"time\": 0.0379}, \"114a097c53\": {\"quality\": 0.43270000000000003, \"cost\": 2.4e-06, \"time\": 0.0224}, \"116334cd72\": {\"quality\": 0.5290250000000001, \"cost\": 4.84e-06, \"time\": 0.045399999999999996}, \"1175ee37e6\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, \"11a66478dc\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"11bc996d48\": {\"quality\": 0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.030799999999999998}, \"11debf9fc0\": {\"quality\": 0.467775, \"cost\": 2.43e-06, \"time\": 0.033299999999999996}, \"123fb650fb\": {\"quality\": 0.429175, \"cost\": 2.2799999999999998e-06, \"time\": 0.0256}, \"1274c21076\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.0341}, \"127af50739\": {\"quality\": 0.44705, \"cost\": 2.3600000000000003e-06, \"time\": 0.0293}, \"12addbf5e2\": {\"quality\": 0.3989, \"cost\": 1.6799999999999998e-06, \"time\": 0.0202}, \"133ee5023f\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"1368e1c78e\": {\"quality\": 0.4021666666666666, \"cost\": 8.4e-07, \"time\": 0.0149}, \"13a009fe0c\": {\"quality\": 0.5399750000000001, \"cost\": 4.24e-06, \"time\": 0.0363}, \"13da306f84\": {\"quality\": 0.4969666666666666, \"cost\": 3.32e-06, \"time\": 0.0277}, \"13f2b9c25b\": {\"quality\": 0.517325, \"cost\": 1.98e-06, \"time\": 0.0319}, \"13f75f9bd0\": {\"quality\": 0.4392333333333333, \"cost\": 1.92e-06, \"time\": 0.023}, \"1404e0aa35\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"140ededb41\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"142f3a7c70\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"1468dddecc\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"14d19a01e2\": {\"quality\": 0.4134, \"cost\": 1.6799999999999998e-06, \"time\": 0.0235}, \"1624bb5302\": {\"quality\": 0.4166666666666667, \"cost\": 
8.4e-07, \"time\": 0.0182}, \"1636e0833b\": {\"quality\": 0.37633333333333335, \"cost\": 6e-07, \"time\": 0.0154}, \"1658296f3a\": {\"quality\": 0.41340000000000005, \"cost\": 1.6799999999999998e-06, \"time\": 0.0235}, \"16f351273f\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"171e6ae293\": {\"quality\": 0.5611333333333334, \"cost\": 3.64e-06, \"time\": 0.034199999999999994}, \"176da24f53\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"179379555f\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"17c928174f\": {\"quality\": 0.5422666666666667, \"cost\": 1.6200000000000002e-06, \"time\": 0.026}, \"181c91d1be\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.030999999999999996}, \"18368684cd\": {\"quality\": 0.4134, \"cost\": 1.6799999999999998e-06, \"time\": 0.0235}, \"183743e76e\": {\"quality\": 0.43596666666666667, \"cost\": 2.76e-06, \"time\": 0.0283}, \"18f55750b0\": {\"quality\": 0.39892500000000003, \"cost\": 2.04e-06, \"time\": 0.0228}, \"19563b057d\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.0229}, \"1957127275\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.0229}, \"197bb53f10\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"199fd1fbf2\": {\"quality\": 0.40375, \"cost\": 4.8e-07, \"time\": 0.0123}, \"19b40e0271\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"19e3db7fe7\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"1ad856985f\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"1adec2dca2\": {\"quality\": 0.47247500000000003, \"cost\": 3.68e-06, \"time\": 0.0303}, \"1b04a2b184\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.037500000000000006}, \"1b28439bd7\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.0304}, \"1beb2fac62\": {\"quality\": 0.4969666666666666, \"cost\": 3.32e-06, \"time\": 0.0277}, 
\"1c347e4d91\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.021599999999999998}, \"1c3882926e\": {\"quality\": 0.4484, \"cost\": 2.19e-06, \"time\": 0.033800000000000004}, \"1cc6d9efb6\": {\"quality\": 0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.03899999999999999}, \"1ce3d77039\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"1ce99cf2c8\": {\"quality\": 0.5050250000000001, \"cost\": 2.7500000000000004e-06, \"time\": 0.0365}, \"1d26090364\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"1d87f97e62\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"1da2369719\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"1e18e60895\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"1e1bf7e88b\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"1e8b3521f8\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"1f5e8c9e9a\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.034100000000000005}, \"2018bef45f\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.025699999999999997}, \"2066966577\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.0341}, \"2075ff1d04\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"2080b60a57\": {\"quality\": 0.448475, \"cost\": 1.5899999999999998e-06, \"time\": 0.021400000000000002}, \"208a98f514\": {\"quality\": 0.5800000000000001, \"cost\": 3.62e-06, \"time\": 0.03609999999999999}, \"20e10af7d4\": {\"quality\": 0.47872499999999996, \"cost\": 1.8299999999999998e-06, \"time\": 0.0242}, \"20e2c0b057\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"211b89b4cd\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"2153174e2d\": {\"quality\": 0.45792499999999997, \"cost\": 2.3600000000000003e-06, \"time\": 0.0326}, \"21b2b8ebd1\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, 
\"time\": 0.029599999999999998}, \"21b2df8512\": {\"quality\": 0.48889999999999995, \"cost\": 2.24e-06, \"time\": 0.0262}, \"21bed16a7d\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"227246dff8\": {\"quality\": 0.40375, \"cost\": 4.8e-07, \"time\": 0.0123}, \"227c30d349\": {\"quality\": 0.38766666666666666, \"cost\": 8.4e-07, \"time\": 0.011600000000000001}, \"228687831a\": {\"quality\": 0.4101333333333333, \"cost\": 2.52e-06, \"time\": 0.0288}, \"23075b2a6e\": {\"quality\": 0.4649666666666667, \"cost\": 1.23e-06, \"time\": 0.018799999999999997}, \"23566f15ab\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"2370cebb10\": {\"quality\": 0.45935000000000004, \"cost\": 1.59e-06, \"time\": 0.0247}, \"2386b03c4c\": {\"quality\": 0.4484, \"cost\": 2.19e-06, \"time\": 0.033800000000000004}, \"24957f3a43\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.0197}, \"24c122de4e\": {\"quality\": 0.4101333333333334, \"cost\": 2.5199999999999996e-06, \"time\": 0.0288}, \"24f76747b9\": {\"quality\": 0.4649666666666667, \"cost\": 1.23e-06, \"time\": 0.018799999999999997}, \"2609bfd616\": {\"quality\": 0.4858, \"cost\": 2.84e-06, \"time\": 0.0283}, \"260ab3e966\": {\"quality\": 0.420675, \"cost\": 2.04e-06, \"time\": 0.0294}, \"2728c8eb6a\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"27971eaaf5\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.0229}, \"27ba0964b2\": {\"quality\": 0.40375, \"cost\": 4.8e-07, \"time\": 0.0123}, \"27daa50458\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"28369b2421\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.0263}, \"28421e6d62\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.022399999999999996}, \"2848c42f91\": {\"quality\": 0.4649666666666667, \"cost\": 1.2299999999999999e-06, \"time\": 0.0188}, \"28a638bb6e\": {\"quality\": 0.45935000000000004, \"cost\": 
1.59e-06, \"time\": 0.0247}, \"290947fe5a\": {\"quality\": 0.45935, \"cost\": 1.5899999999999998e-06, \"time\": 0.0247}, \"2936c3e43e\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.0375}, \"293ec5edca\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"294e541235\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 0.0124}, \"295ed5e759\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.0197}, \"2960431101\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"29892d8468\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"299a0aeb65\": {\"quality\": 0.590875, \"cost\": 3.62e-06, \"time\": 0.0394}, \"29bf3c0a3b\": {\"quality\": 0.45690000000000003, \"cost\": 2.43e-06, \"time\": 0.03}, \"29c8c693e2\": {\"quality\": 0.52195, \"cost\": 3.830000000000001e-06, \"time\": 0.041299999999999996}, \"2a7d15f4a7\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"2ae24e0124\": {\"quality\": 0.514875, \"cost\": 2.82e-06, \"time\": 0.0372}, \"2b5679d248\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"2b5ab72a55\": {\"quality\": 0.365, \"cost\": 1.2e-07, \"time\": 0.0064}, \"2b82a67eb1\": {\"quality\": 0.45363333333333333, \"cost\": 9.9e-07, \"time\": 0.0226}, \"2bcbffdf85\": {\"quality\": 0.45935000000000004, \"cost\": 1.59e-06, \"time\": 0.0247}, \"2bd39ee744\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"2bf38d797f\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"2c1640adf7\": {\"quality\": 0.4794666666666667, \"cost\": 1.23e-06, \"time\": 0.0221}, \"2c5cf9eb26\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.023599999999999996}, \"2c87313a93\": {\"quality\": 0.47240000000000004, \"cost\": 4.28e-06, \"time\": 0.0427}, \"2c9a9f94c4\": {\"quality\": 0.426725, \"cost\": 3.1199999999999998e-06, 
\"time\": 0.030900000000000004}, \"2d3bbc2d23\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"2d7f1dbd4b\": {\"quality\": 0.470225, \"cost\": 1.59e-06, \"time\": 0.027999999999999997}, \"2de113167b\": {\"quality\": 0.494225, \"cost\": 3.68e-06, \"time\": 0.0369}, \"2de3eb2c19\": {\"quality\": 0.6592, \"cost\": 3.52e-06, \"time\": 0.0278}, \"2e02b71061\": {\"quality\": 0.4166666666666667, \"cost\": 8.4e-07, \"time\": 0.0182}, \"2e30394ac6\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.011099999999999999}, \"2e5d071f21\": {\"quality\": 0.45792499999999997, \"cost\": 2.3600000000000003e-06, \"time\": 0.0326}, \"2e9c5cc9bf\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"2ec4bec1a3\": {\"quality\": 0.455475, \"cost\": 3.2000000000000003e-06, \"time\": 0.0379}, \"2f1573da80\": {\"quality\": 0.429175, \"cost\": 2.2799999999999998e-06, \"time\": 0.0256}, \"2fc0cb3592\": {\"quality\": 0.43596666666666667, \"cost\": 2.76e-06, \"time\": 0.0283}, \"2fd9cd426a\": {\"quality\": 0.48335, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"3019af79b3\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"302c1d97fc\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027699999999999995}, \"303b467574\": {\"quality\": 0.5244, \"cost\": 2.99e-06, \"time\": 0.036}, \"3058b1f1f8\": {\"quality\": 0.4013, \"cost\": 1.8e-06, \"time\": 0.0299}, \"30ae4cbe91\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"30e3ff1d17\": {\"quality\": 0.47729999999999995, \"cost\": 2.6e-06, \"time\": 0.0321}, \"30f20c8fe6\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"316759d191\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.034100000000000005}, \"3177802176\": {\"quality\": 0.365, \"cost\": 1.2e-07, \"time\": 0.0064}, \"3184f977a8\": {\"quality\": 0.39287500000000003, \"cost\": 9.6e-07, \"time\": 0.0213}, \"3194e440cf\": {\"quality\": 0.4762, \"cost\": 
2.07e-06, \"time\": 0.027399999999999997}, \"3197ad4faf\": {\"quality\": 0.37633333333333335, \"cost\": 6e-07, \"time\": 0.0154}, \"31a423a3bf\": {\"quality\": 0.382, \"cost\": 4.8e-07, \"time\": 0.009000000000000001}, \"321e17afbd\": {\"quality\": 0.46642500000000003, \"cost\": 2.6e-06, \"time\": 0.0288}, \"32b101d807\": {\"quality\": 0.538875, \"cost\": 4.9100000000000004e-06, \"time\": 0.0461}, \"332a350ea2\": {\"quality\": 0.4134, \"cost\": 1.6799999999999998e-06, \"time\": 0.0235}, \"33459cd29c\": {\"quality\": 0.47492500000000004, \"cost\": 2.84e-06, \"time\": 0.025}, \"33a187e74f\": {\"quality\": 0.525825, \"cost\": 2.22e-06, \"time\": 0.0281}, \"34026bb5cc\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"3511b5e1d0\": {\"quality\": 0.39899999999999997, \"cost\": 1.08e-06, \"time\": 0.0078}, \"3513e54767\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"353f0cb1ac\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"3550bf88cb\": {\"quality\": 0.53425, \"cost\": 3.06e-06, \"time\": 0.036699999999999997}, \"35610fb420\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"357267e14b\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"36011c7606\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"362d480d6d\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.034100000000000005}, \"363209b6e7\": {\"quality\": 0.48890000000000006, \"cost\": 2.24e-06, \"time\": 0.026199999999999998}, \"3637084f91\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.025699999999999997}, \"36b17c40f3\": {\"quality\": 0.39885000000000004, \"cost\": 1.3199999999999999e-06, \"time\": 0.0176}, \"36c66671ee\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.023599999999999996}, \"372e8b5f4f\": {\"quality\": 0.4134, \"cost\": 1.6799999999999998e-06, \"time\": 0.0235}, 
\"375ed248fe\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.025699999999999997}, \"377b8b0bcc\": {\"quality\": 0.45935, \"cost\": 1.5899999999999998e-06, \"time\": 0.0247}, \"37bd28f2c9\": {\"quality\": 0.5244, \"cost\": 2.99e-06, \"time\": 0.036}, \"38075bb01f\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"38567d6a43\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"389a99ab21\": {\"quality\": 0.46532500000000004, \"cost\": 3.27e-06, \"time\": 0.0386}, \"389c54cbca\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"38ec11cf7b\": {\"quality\": 0.45690000000000003, \"cost\": 2.43e-06, \"time\": 0.03}, \"3980f20caa\": {\"quality\": 0.48090000000000005, \"cost\": 4.52e-06, \"time\": 0.038900000000000004}, \"39cd4ca402\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.018299999999999997}, \"3a34b24c41\": {\"quality\": 0.4762, \"cost\": 2.07e-06, \"time\": 0.0274}, \"3b2e8075ea\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"3b3676521a\": {\"quality\": 0.5955, \"cost\": 5.47e-06, \"time\": 0.048799999999999996}, \"3b3a6bf087\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"3b4bde0121\": {\"quality\": 0.38766666666666666, \"cost\": 8.4e-07, \"time\": 0.011600000000000001}, \"3b57530a56\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"3c206c89f3\": {\"quality\": 0.47485, \"cost\": 3.44e-06, \"time\": 0.037399999999999996}, \"3cbab8082e\": {\"quality\": 0.523375, \"cost\": 3.06e-06, \"time\": 0.0334}, \"3d71c4dd2c\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.026899999999999997}, \"3ea15ac20c\": {\"quality\": 0.390425, \"cost\": 1.8e-06, \"time\": 0.026600000000000002}, \"3f1a58aec9\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"3f2321bb08\": {\"quality\": 0.365, \"cost\": 1.2e-07, \"time\": 0.0064}, \"3f3ef494b0\": {\"quality\": 
0.43596666666666667, \"cost\": 2.76e-06, \"time\": 0.0283}, \"3f62c3fbfc\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"3f88dd99f7\": {\"quality\": 0.5121, \"cost\": 1.88e-06, \"time\": 0.0203}, \"3fa747af9a\": {\"quality\": 0.3908333333333333, \"cost\": 6e-07, \"time\": 0.0187}, \"40104c813f\": {\"quality\": 0.522025, \"cost\": 3.23e-06, \"time\": 0.028899999999999995}, \"403b05da2d\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027699999999999995}, \"4043815a3e\": {\"quality\": 0.5206, \"cost\": 4.000000000000001e-06, \"time\": 0.0368}, \"409ff67607\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.018299999999999997}, \"412c065b83\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"4171fbac5c\": {\"quality\": 0.47485, \"cost\": 3.44e-06, \"time\": 0.037399999999999996}, \"4191118787\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"41d5b97871\": {\"quality\": 0.43923333333333336, \"cost\": 1.92e-06, \"time\": 0.023}, \"41d8845655\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"41ee202cac\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"42082dcd0d\": {\"quality\": 0.45690000000000003, \"cost\": 2.43e-06, \"time\": 0.030000000000000002}, \"42ddd48341\": {\"quality\": 0.537525, \"cost\": 5.0800000000000005e-06, \"time\": 0.0416}, \"4361bc7ea7\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"43afdad250\": {\"quality\": 0.52195, \"cost\": 3.83e-06, \"time\": 0.0413}, \"43c3cf9cb8\": {\"quality\": 0.6592, \"cost\": 3.52e-06, \"time\": 0.0278}, \"43d24fb32a\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027700000000000002}, \"440bc872de\": {\"quality\": 0.45690000000000003, \"cost\": 2.4299999999999996e-06, \"time\": 0.03}, \"44173a9aef\": {\"quality\": 0.5290250000000001, \"cost\": 4.84e-06, \"time\": 0.045399999999999996}, \"44d6af5523\": 
{\"quality\": 0.6309, \"cost\": 2.25e-06, \"time\": 0.0294}, \"44fe4e4e3e\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"4587a1500c\": {\"quality\": 0.4630666666666667, \"cost\": 2e-06, \"time\": 0.026699999999999998}, \"45ef93b61e\": {\"quality\": 0.4762, \"cost\": 2.07e-06, \"time\": 0.0274}, \"461846a52d\": {\"quality\": 0.42799999999999994, \"cost\": 1.08e-06, \"time\": 0.0144}, \"462e6ff849\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"4630853d32\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"46475b9e75\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"46654a1f32\": {\"quality\": 0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.039}, \"466a3036b2\": {\"quality\": 0.47382500000000005, \"cost\": 3.51e-06, \"time\": 0.0348}, \"466d4d16dd\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.022399999999999996}, \"46a35022d8\": {\"quality\": 0.49795, \"cost\": 8.7e-07, \"time\": 0.0162}, \"46ed68152d\": {\"quality\": 0.5002333333333333, \"cost\": 2.48e-06, \"time\": 0.0224}, \"46edc488a4\": {\"quality\": 0.39885000000000004, \"cost\": 1.3199999999999999e-06, \"time\": 0.0176}, \"476a12876c\": {\"quality\": 0.4392333333333333, \"cost\": 1.92e-06, \"time\": 0.023}, \"48043e2304\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"488645cbd9\": {\"quality\": 0.4425, \"cost\": 7.2e-07, \"time\": 0.0118}, \"49009a3b57\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.022899999999999997}, \"4909061216\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.037500000000000006}, \"49107972df\": {\"quality\": 0.390425, \"cost\": 1.8e-06, \"time\": 0.026600000000000002}, \"49731b1ccd\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"498f146004\": {\"quality\": 0.5715, \"cost\": 3.3800000000000002e-06, \"time\": 0.0399}, \"49ad844bd2\": 
{\"quality\": 0.4847, \"cost\": 3.51e-06, \"time\": 0.0381}, \"49ca727e49\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"4a4a960a82\": {\"quality\": 0.48889999999999995, \"cost\": 2.24e-06, \"time\": 0.026199999999999998}, \"4a92372986\": {\"quality\": 0.40216666666666673, \"cost\": 8.4e-07, \"time\": 0.0149}, \"4aa7e8fde6\": {\"quality\": 0.45363333333333333, \"cost\": 9.9e-07, \"time\": 0.022600000000000002}, \"4aafd39d76\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"4ace1cfad1\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.0341}, \"4ad1952206\": {\"quality\": 0.48889999999999995, \"cost\": 2.24e-06, \"time\": 0.0262}, \"4b59f40131\": {\"quality\": 0.4762, \"cost\": 2.07e-06, \"time\": 0.027399999999999997}, \"4b86a1c038\": {\"quality\": 0.46642500000000003, \"cost\": 2.6e-06, \"time\": 0.0288}, \"4b92a26754\": {\"quality\": 0.365, \"cost\": 1.2e-07, \"time\": 0.0064}, \"4bc4528402\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"4c158a1a4a\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.0197}, \"4c954323e3\": {\"quality\": 0.5800000000000001, \"cost\": 3.62e-06, \"time\": 0.03609999999999999}, \"4d8bcf8ae2\": {\"quality\": 0.365, \"cost\": 1.2e-07, \"time\": 0.0064}, \"4d91e8a27b\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"4dd3635bc3\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"4dd96bd18f\": {\"quality\": 0.47240000000000004, \"cost\": 4.28e-06, \"time\": 0.0427}, \"4e298ee0d4\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.029500000000000002}, \"4e3443a0f9\": {\"quality\": 0.538875, \"cost\": 4.9100000000000004e-06, \"time\": 0.0461}, \"4e4b9db2b8\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"4e6509f614\": {\"quality\": 0.43270000000000003, \"cost\": 2.4e-06, \"time\": 0.0224}, \"4e6a83e751\": {\"quality\": 0.4425, \"cost\": 
3.6e-07, \"time\": 0.0059}, \"4e8d8e527a\": {\"quality\": 0.38756666666666667, \"cost\": 1.4399999999999998e-06, \"time\": 0.024}, \"4ef333ab21\": {\"quality\": 0.4166666666666667, \"cost\": 8.4e-07, \"time\": 0.0182}, \"4f16545711\": {\"quality\": 0.39890000000000003, \"cost\": 1.6799999999999998e-06, \"time\": 0.020200000000000003}, \"4f8cca1195\": {\"quality\": 0.59795, \"cost\": 4.63e-06, \"time\": 0.0435}, \"500860eaa2\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"50701b505e\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.026899999999999997}, \"50bc87e9cc\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.0229}, \"50c03be77c\": {\"quality\": 0.3908333333333333, \"cost\": 6e-07, \"time\": 0.0187}, \"510375edad\": {\"quality\": 0.382, \"cost\": 4.8e-07, \"time\": 0.009000000000000001}, \"512fdb607c\": {\"quality\": 0.52195, \"cost\": 3.830000000000001e-06, \"time\": 0.0413}, \"51583a901c\": {\"quality\": 0.59795, \"cost\": 4.63e-06, \"time\": 0.0435}, \"521314dab6\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"5241bf401b\": {\"quality\": 0.41340000000000005, \"cost\": 1.6799999999999998e-06, \"time\": 0.0235}, \"526878b5eb\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"52c1cba6ce\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.040799999999999996}, \"52e5d0f4fb\": {\"quality\": 0.3908333333333333, \"cost\": 6e-07, \"time\": 0.0187}, \"52f041a70e\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"5307496302\": {\"quality\": 0.4762, \"cost\": 2.0699999999999997e-06, \"time\": 0.0274}, \"53869388bb\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"53d2932c4f\": {\"quality\": 0.5114666666666666, \"cost\": 3.32e-06, \"time\": 0.031}, \"5474247f91\": {\"quality\": 0.48573333333333335, \"cost\": 2.48e-06, \"time\": 0.0191}, \"557d2cf7ba\": {\"quality\": 0.48573333333333335, \"cost\": 2.48e-06, 
\"time\": 0.0191}, \"559c7120c5\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"55c8aa8935\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"56a29a28c5\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"56b39eb1d6\": {\"quality\": 0.5785750000000001, \"cost\": 4.39e-06, \"time\": 0.044}, \"5703697dbd\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"5718f2ed80\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"572a02a59a\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"572c2df793\": {\"quality\": 0.439975, \"cost\": 1.3499999999999998e-06, \"time\": 0.0252}, \"5750713a41\": {\"quality\": 0.4392333333333333, \"cost\": 1.92e-06, \"time\": 0.023}, \"57757ef15e\": {\"quality\": 0.5002333333333333, \"cost\": 2.48e-06, \"time\": 0.0224}, \"5793d14bbe\": {\"quality\": 0.5314749999999999, \"cost\": 4.000000000000001e-06, \"time\": 0.0401}, \"579a915ed2\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"579c81bbe0\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"57bed1722f\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"585ba6d20b\": {\"quality\": 0.5328999999999999, \"cost\": 3.23e-06, \"time\": 0.0322}, \"589a1cea79\": {\"quality\": 0.47492500000000004, \"cost\": 2.84e-06, \"time\": 0.025}, \"59006532b4\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"59326c4e00\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"596f0ed542\": {\"quality\": 0.47485000000000005, \"cost\": 3.44e-06, \"time\": 0.0374}, \"5971ba4e0d\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"59e0117b7d\": {\"quality\": 0.46642500000000003, \"cost\": 2.6e-06, \"time\": 0.0288}, \"59f515d0da\": {\"quality\": 0.48889999999999995, \"cost\": 2.24e-06, \"time\": 0.0262}, \"59f887b67c\": {\"quality\": 
0.47492500000000004, \"cost\": 2.84e-06, \"time\": 0.025}, \"5a22920db4\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, \"5a35020d45\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"5aa71bb88a\": {\"quality\": 0.40375, \"cost\": 4.8e-07, \"time\": 0.0123}, \"5b10fbdbe1\": {\"quality\": 0.5399750000000001, \"cost\": 4.24e-06, \"time\": 0.0363}, \"5b4ad39a9e\": {\"quality\": 0.48889999999999995, \"cost\": 2.24e-06, \"time\": 0.0262}, \"5bade9eb85\": {\"quality\": 0.523375, \"cost\": 3.06e-06, \"time\": 0.0334}, \"5be16744bf\": {\"quality\": 0.4969666666666666, \"cost\": 3.32e-06, \"time\": 0.0277}, \"5c0db11303\": {\"quality\": 0.37633333333333335, \"cost\": 6e-07, \"time\": 0.0154}, \"5c53feccd9\": {\"quality\": 0.48090000000000005, \"cost\": 4.52e-06, \"time\": 0.038900000000000004}, \"5c77c7c2b2\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"5d072194b8\": {\"quality\": 0.47729999999999995, \"cost\": 2.6e-06, \"time\": 0.0321}, \"5d79b50feb\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.030399999999999996}, \"5dc216cd6b\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.030999999999999996}, \"5dd68c1b8f\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, \"5de4a882c1\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"5deeeb223f\": {\"quality\": 0.5244, \"cost\": 2.99e-06, \"time\": 0.036}, \"5e2f03b962\": {\"quality\": 0.40375, \"cost\": 4.8e-07, \"time\": 0.0123}, \"5ea2fab380\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"5eb3bb525b\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.022399999999999996}, \"5ec3832817\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"5f0199e07b\": {\"quality\": 0.382, \"cost\": 4.8e-07, \"time\": 0.009000000000000001}, \"5f37b3902b\": {\"quality\": 0.4969666666666667, \"cost\": 
3.32e-06, \"time\": 0.027699999999999995}, \"60cb623c53\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"612e546d71\": {\"quality\": 0.40980000000000005, \"cost\": 2.04e-06, \"time\": 0.0261}, \"6160bfb439\": {\"quality\": 0.40735000000000005, \"cost\": 2.88e-06, \"time\": 0.031400000000000004}, \"6178f33808\": {\"quality\": 0.4484, \"cost\": 2.19e-06, \"time\": 0.033800000000000004}, \"619b48dde9\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"6268ac658c\": {\"quality\": 0.5206, \"cost\": 4.000000000000001e-06, \"time\": 0.0368}, \"628f34aace\": {\"quality\": 0.5064500000000001, \"cost\": 1.9799999999999997e-06, \"time\": 0.0286}, \"630d1ecda0\": {\"quality\": 0.40137500000000004, \"cost\": 1.2e-06, \"time\": 0.0175}, \"63a0aaebed\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 0.0124}, \"63f392465f\": {\"quality\": 0.45363333333333333, \"cost\": 9.9e-07, \"time\": 0.0226}, \"6527f214c3\": {\"quality\": 0.4021666666666666, \"cost\": 8.4e-07, \"time\": 0.0149}, \"652c0f4bdf\": {\"quality\": 0.47247500000000003, \"cost\": 3.68e-06, \"time\": 0.0303}, \"6533c85913\": {\"quality\": 0.47382500000000005, \"cost\": 3.51e-06, \"time\": 0.0348}, \"65627426e0\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"65801893b4\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.0341}, \"65b76da9c6\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.022399999999999996}, \"65be1c1306\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"65e0216208\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"65eee615d7\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"66277da52f\": {\"quality\": 0.470225, \"cost\": 1.59e-06, \"time\": 0.027999999999999997}, \"66750c0934\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"66776ec181\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 
0.0295}, \"67632141f6\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"677deb302a\": {\"quality\": 0.467775, \"cost\": 2.43e-06, \"time\": 0.0333}, \"67868fcff6\": {\"quality\": 0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.03899999999999999}, \"67aad9ea16\": {\"quality\": 0.45690000000000003, \"cost\": 2.4299999999999996e-06, \"time\": 0.03}, \"67bab6732d\": {\"quality\": 0.5082000000000001, \"cost\": 4.16e-06, \"time\": 0.0363}, \"67fe399cf1\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.025699999999999997}, \"6846bd8fb3\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"68583552fb\": {\"quality\": 0.39890000000000003, \"cost\": 1.6799999999999998e-06, \"time\": 0.020200000000000003}, \"689e327daf\": {\"quality\": 0.382, \"cost\": 4.8e-07, \"time\": 0.009000000000000001}, \"69b3b67de6\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"69bf3f6ba0\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.0224}, \"69f90e610f\": {\"quality\": 0.52195, \"cost\": 3.830000000000001e-06, \"time\": 0.0413}, \"6a022c3f73\": {\"quality\": 0.42799999999999994, \"cost\": 1.08e-06, \"time\": 0.0144}, \"6a10c53ad8\": {\"quality\": 0.588425, \"cost\": 4.46e-06, \"time\": 0.044700000000000004}, \"6a6348f69d\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"6a8726145c\": {\"quality\": 0.4762, \"cost\": 2.0699999999999997e-06, \"time\": 0.0274}, \"6a8a675442\": {\"quality\": 0.4762, \"cost\": 2.0699999999999997e-06, \"time\": 0.0274}, \"6aac59742a\": {\"quality\": 0.46540000000000004, \"cost\": 2.67e-06, \"time\": 0.0262}, \"6ac193c88f\": {\"quality\": 0.420675, \"cost\": 2.04e-06, \"time\": 0.0294}, \"6ae9e9de0b\": {\"quality\": 0.5244, \"cost\": 2.99e-06, \"time\": 0.036}, \"6b0c585f5c\": {\"quality\": 0.420675, \"cost\": 2.04e-06, \"time\": 0.0294}, \"6b3c16def2\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, 
\"6c05c47050\": {\"quality\": 0.4649666666666667, \"cost\": 1.23e-06, \"time\": 0.0188}, \"6c3667811b\": {\"quality\": 0.41340000000000005, \"cost\": 1.6799999999999998e-06, \"time\": 0.0235}, \"6cc813aa68\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"6cd78cac7e\": {\"quality\": 0.48890000000000006, \"cost\": 2.24e-06, \"time\": 0.026199999999999998}, \"6d20c6ace0\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.0304}, \"6d67c56ba6\": {\"quality\": 0.466425, \"cost\": 2.6e-06, \"time\": 0.0288}, \"6db70dc3b6\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"6e0690f576\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"6e3db7ec5e\": {\"quality\": 0.47485, \"cost\": 3.44e-06, \"time\": 0.037399999999999996}, \"6e62bbb47f\": {\"quality\": 0.45085000000000003, \"cost\": 1.35e-06, \"time\": 0.028499999999999998}, \"6e859bfae6\": {\"quality\": 0.439975, \"cost\": 1.3499999999999998e-06, \"time\": 0.0252}, \"6e93514f45\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"6eae47102b\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.0263}, \"6ecf93c479\": {\"quality\": 0.47946666666666665, \"cost\": 1.2299999999999999e-06, \"time\": 0.022099999999999998}, \"6ef3b7127e\": {\"quality\": 0.6592, \"cost\": 3.52e-06, \"time\": 0.0278}, \"6f323f80c7\": {\"quality\": 0.365, \"cost\": 1.2e-07, \"time\": 0.0064}, \"6f60a05c33\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"6fbdd8b57c\": {\"quality\": 0.4021666666666666, \"cost\": 8.4e-07, \"time\": 0.0149}, \"6fd6046c4b\": {\"quality\": 0.45935, \"cost\": 1.5899999999999998e-06, \"time\": 0.0247}, \"6fe0b3f929\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"6ff4f667f8\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"700474dfbd\": {\"quality\": 0.47946666666666665, \"cost\": 
1.2299999999999999e-06, \"time\": 0.022099999999999998}, \"700ab1d309\": {\"quality\": 0.48573333333333335, \"cost\": 2.48e-06, \"time\": 0.0191}, \"7040e83d52\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"704209377f\": {\"quality\": 0.46777500000000005, \"cost\": 2.43e-06, \"time\": 0.033299999999999996}, \"70b666e371\": {\"quality\": 0.39890000000000003, \"cost\": 1.68e-06, \"time\": 0.020200000000000003}, \"70c850e039\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"7112a7e64c\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"7114013f0c\": {\"quality\": 0.45363333333333333, \"cost\": 9.9e-07, \"time\": 0.0226}, \"715070d0ca\": {\"quality\": 0.4098, \"cost\": 2.04e-06, \"time\": 0.026099999999999998}, \"71b615468b\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027700000000000002}, \"71ed893462\": {\"quality\": 0.47485000000000005, \"cost\": 3.44e-06, \"time\": 0.0374}, \"722d41b2f8\": {\"quality\": 0.3989, \"cost\": 1.6799999999999998e-06, \"time\": 0.0202}, \"723fd5589a\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"7250da0f41\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027700000000000002}, \"7260a96349\": {\"quality\": 0.4013, \"cost\": 1.8e-06, \"time\": 0.0299}, \"7347cf0308\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"736e652158\": {\"quality\": 0.46785, \"cost\": 1.8299999999999998e-06, \"time\": 0.020900000000000002}, \"738364d6a2\": {\"quality\": 0.45555, \"cost\": 2.6e-06, \"time\": 0.025500000000000002}, \"739b1f81dc\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, \"73c6240c29\": {\"quality\": 0.448475, \"cost\": 1.5899999999999998e-06, \"time\": 0.021400000000000002}, \"73fc2767b9\": {\"quality\": 0.4649666666666667, \"cost\": 1.2299999999999999e-06, \"time\": 0.0188}, \"7445d99939\": {\"quality\": 0.43596666666666667, 
\"cost\": 2.76e-06, \"time\": 0.0283}, \"7466a5f424\": {\"quality\": 0.537525, \"cost\": 5.0800000000000005e-06, \"time\": 0.0416}, \"74cc4b1bc4\": {\"quality\": 0.4762, \"cost\": 2.07e-06, \"time\": 0.027399999999999997}, \"74d7f64b8c\": {\"quality\": 0.46785, \"cost\": 1.8299999999999998e-06, \"time\": 0.020900000000000002}, \"751869cbec\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"7524905580\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"752d9649f2\": {\"quality\": 0.5484, \"cost\": 5.0800000000000005e-06, \"time\": 0.044899999999999995}, \"7558c9722d\": {\"quality\": 0.47485, \"cost\": 3.44e-06, \"time\": 0.0374}, \"75ca9cd4f8\": {\"quality\": 0.365, \"cost\": 2.4e-07, \"time\": 0.0128}, \"7604c0aa13\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027699999999999995}, \"765dbc6ad5\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"7707e6e7e3\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"7765576286\": {\"quality\": 0.4847, \"cost\": 3.51e-06, \"time\": 0.0381}, \"77983b6105\": {\"quality\": 0.382, \"cost\": 4.8e-07, \"time\": 0.009000000000000001}, \"77b5740025\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.022899999999999997}, \"77c02b00c1\": {\"quality\": 0.48335, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"77c6a9703a\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"77f293b737\": {\"quality\": 0.5785750000000001, \"cost\": 4.39e-06, \"time\": 0.044}, \"7801da66b9\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"7862ea67cb\": {\"quality\": 0.39885000000000004, \"cost\": 1.3199999999999999e-06, \"time\": 0.0176}, \"786e5d0af5\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"795d119bc7\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"7989343d94\": {\"quality\": 0.4166666666666667, 
\"cost\": 8.4e-07, \"time\": 0.0182}, \"79fad58f07\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"7a207b42a8\": {\"quality\": 0.4183, \"cost\": 2.2799999999999998e-06, \"time\": 0.0223}, \"7a58d3472b\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"7a7cc658c8\": {\"quality\": 0.52195, \"cost\": 3.830000000000001e-06, \"time\": 0.041299999999999996}, \"7b024a2966\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"7b6dc3702e\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"7b6f44618e\": {\"quality\": 0.4917750000000001, \"cost\": 4.52e-06, \"time\": 0.0422}, \"7b74b23910\": {\"quality\": 0.5611333333333334, \"cost\": 3.64e-06, \"time\": 0.034199999999999994}, \"7b9cc96081\": {\"quality\": 0.365, \"cost\": 1.2e-07, \"time\": 0.0064}, \"7c45c61d8d\": {\"quality\": 0.467775, \"cost\": 2.43e-06, \"time\": 0.033299999999999996}, \"7c89a2b69e\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"7d44f0959d\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"7d60c38c5c\": {\"quality\": 0.6309, \"cost\": 1.5e-06, \"time\": 0.0196}, \"7d67e14414\": {\"quality\": 0.517325, \"cost\": 1.98e-06, \"time\": 0.0319}, \"7daf7ff182\": {\"quality\": 0.5082000000000001, \"cost\": 4.16e-06, \"time\": 0.0363}, \"7e0ad1c9c1\": {\"quality\": 0.38756666666666667, \"cost\": 1.4399999999999998e-06, \"time\": 0.024}, \"7ed07ad40a\": {\"quality\": 0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.030799999999999998}, \"7fa67a7656\": {\"quality\": 0.5114666666666666, \"cost\": 3.32e-06, \"time\": 0.031}, \"806881adcb\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"80a1d9c2f3\": {\"quality\": 0.4021666666666666, \"cost\": 8.4e-07, \"time\": 0.0149}, \"80be7df955\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"80bf60c422\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 
0.0124}, \"81333c7a33\": {\"quality\": 0.6592, \"cost\": 5.28e-06, \"time\": 0.0417}, \"813e75210b\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"815e7116df\": {\"quality\": 0.37633333333333335, \"cost\": 6e-07, \"time\": 0.0154}, \"816068ff07\": {\"quality\": 0.439975, \"cost\": 1.3499999999999998e-06, \"time\": 0.0252}, \"81a4f42fd9\": {\"quality\": 0.5121, \"cost\": 1.88e-06, \"time\": 0.0203}, \"828ccea2d3\": {\"quality\": 0.48889999999999995, \"cost\": 2.24e-06, \"time\": 0.026199999999999998}, \"829df73946\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"831728b179\": {\"quality\": 0.4101333333333334, \"cost\": 2.5199999999999996e-06, \"time\": 0.0288}, \"831e8b8be5\": {\"quality\": 0.4794666666666667, \"cost\": 1.23e-06, \"time\": 0.022099999999999998}, \"8357183895\": {\"quality\": 0.46540000000000004, \"cost\": 2.67e-06, \"time\": 0.0262}, \"8392a6083a\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"83b244c163\": {\"quality\": 0.5121, \"cost\": 1.88e-06, \"time\": 0.0203}, \"83c9e66ec6\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"842c0d1062\": {\"quality\": 0.45935000000000004, \"cost\": 1.5899999999999998e-06, \"time\": 0.0247}, \"846bed2aa7\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"847fd49235\": {\"quality\": 0.42799999999999994, \"cost\": 1.08e-06, \"time\": 0.0144}, \"84dc98be95\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"8519bef585\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"8572c6af3a\": {\"quality\": 0.5244, \"cost\": 2.99e-06, \"time\": 0.036}, \"85c94a5505\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"85e8eaed6e\": {\"quality\": 0.41225, \"cost\": 1.2e-06, \"time\": 0.0208}, \"862183bfb9\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, 
\"8668f65f05\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"870e2f87b4\": {\"quality\": 0.587075, \"cost\": 4.63e-06, \"time\": 0.0402}, \"87c1b31c82\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.0341}, \"88436e05a9\": {\"quality\": 0.455475, \"cost\": 3.2000000000000003e-06, \"time\": 0.0379}, \"887ad124e1\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.021599999999999998}, \"8886cb3082\": {\"quality\": 0.5328999999999999, \"cost\": 3.23e-06, \"time\": 0.0322}, \"8940398bf1\": {\"quality\": 0.49795, \"cost\": 8.7e-07, \"time\": 0.0162}, \"8941621423\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"8961e4d901\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"8974aa89a0\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"89a289907e\": {\"quality\": 0.590875, \"cost\": 3.62e-06, \"time\": 0.0394}, \"89a35a09b1\": {\"quality\": 0.48714999999999997, \"cost\": 2.67e-06, \"time\": 0.0328}, \"89bc21961a\": {\"quality\": 0.365, \"cost\": 2.4e-07, \"time\": 0.0128}, \"89fbefd150\": {\"quality\": 0.47485, \"cost\": 3.44e-06, \"time\": 0.0374}, \"8a37c82283\": {\"quality\": 0.5611333333333334, \"cost\": 3.64e-06, \"time\": 0.034199999999999994}, \"8aaadb8649\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"8acd758b7f\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"8b721bbc6f\": {\"quality\": 0.48714999999999997, \"cost\": 2.67e-06, \"time\": 0.0328}, \"8b90f4b639\": {\"quality\": 0.4101333333333334, \"cost\": 2.5199999999999996e-06, \"time\": 0.0288}, \"8bbbe0f52a\": {\"quality\": 0.4392333333333333, \"cost\": 1.92e-06, \"time\": 0.023}, \"8bc184f385\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.026899999999999997}, \"8bf5c3eadc\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"8c195addc7\": {\"quality\": 0.5611333333333334, 
\"cost\": 3.64e-06, \"time\": 0.034199999999999994}, \"8c9881972c\": {\"quality\": 0.4134, \"cost\": 1.68e-06, \"time\": 0.0235}, \"8cf8b81d84\": {\"quality\": 0.38756666666666667, \"cost\": 1.4399999999999998e-06, \"time\": 0.024}, \"8d79e03266\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"8d90814b94\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"8e33fac90f\": {\"quality\": 0.47485, \"cost\": 3.44e-06, \"time\": 0.0374}, \"8e5842ccbd\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.026899999999999997}, \"8f4caddfe6\": {\"quality\": 0.5869666666666667, \"cost\": 3.88e-06, \"time\": 0.033699999999999994}, \"8f4edde3f0\": {\"quality\": 0.4857333333333333, \"cost\": 2.48e-06, \"time\": 0.0191}, \"900a58f984\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"9025e2480f\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"9028588af4\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"9059fd80ad\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"90d5e40c1b\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"90d9a86a2a\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"90ed9312e1\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.0229}, \"90ff13783c\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.0375}, \"90ff8eb055\": {\"quality\": 0.426725, \"cost\": 3.1199999999999998e-06, \"time\": 0.030900000000000004}, \"9104e31369\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"918983323f\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"91beb0cac1\": {\"quality\": 0.382, \"cost\": 4.8e-07, \"time\": 0.009000000000000001}, \"91c800af6b\": {\"quality\": 0.47382500000000005, \"cost\": 3.51e-06, \"time\": 0.0348}, \"91dd8884db\": {\"quality\": 0.46642500000000003, \"cost\": 2.6e-06, 
\"time\": 0.0288}, \"924f128b3c\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.0229}, \"9288642e53\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"92c4137fb1\": {\"quality\": 0.46777500000000005, \"cost\": 2.43e-06, \"time\": 0.0333}, \"92c9dcd43b\": {\"quality\": 0.543775, \"cost\": 3.23e-06, \"time\": 0.0355}, \"92f45e5cc7\": {\"quality\": 0.49795, \"cost\": 8.7e-07, \"time\": 0.0162}, \"93011c0821\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"9303149ba4\": {\"quality\": 0.5050250000000001, \"cost\": 2.7500000000000004e-06, \"time\": 0.0365}, \"933b4d17dd\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"94010928c6\": {\"quality\": 0.42799999999999994, \"cost\": 1.08e-06, \"time\": 0.0144}, \"9403809e44\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"943baaea0c\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"94569f177a\": {\"quality\": 0.4649666666666667, \"cost\": 1.23e-06, \"time\": 0.018799999999999997}, \"9466542023\": {\"quality\": 0.3876666666666666, \"cost\": 8.4e-07, \"time\": 0.0116}, \"947e28ef2e\": {\"quality\": 0.4166666666666667, \"cost\": 8.4e-07, \"time\": 0.0182}, \"94ac356663\": {\"quality\": 0.5114666666666666, \"cost\": 3.32e-06, \"time\": 0.031}, \"94dff9a424\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"9508356a2e\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"956bdcc254\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.0224}, \"9594b0c783\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.021599999999999998}, \"964c671f18\": {\"quality\": 0.587075, \"cost\": 4.63e-06, \"time\": 0.0402}, \"9679fe2b69\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.021599999999999998}, \"968fc95038\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.0216}, \"96b487c724\": {\"quality\": 0.4425, \"cost\": 
3.6e-07, \"time\": 0.0059}, \"96e85f9af4\": {\"quality\": 0.38766666666666666, \"cost\": 8.4e-07, \"time\": 0.011600000000000001}, \"96f87d6483\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"972c83b002\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.023599999999999996}, \"975bc44958\": {\"quality\": 0.4794666666666667, \"cost\": 1.23e-06, \"time\": 0.022099999999999998}, \"977a4d6b6b\": {\"quality\": 0.5484, \"cost\": 5.0800000000000005e-06, \"time\": 0.044899999999999995}, \"97ad4cd41a\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"97bc30bd83\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"97e1d0db92\": {\"quality\": 0.39890000000000003, \"cost\": 1.68e-06, \"time\": 0.020200000000000003}, \"981da9ba40\": {\"quality\": 0.45690000000000003, \"cost\": 2.43e-06, \"time\": 0.03}, \"98c1ea89f3\": {\"quality\": 0.45935000000000004, \"cost\": 1.5899999999999998e-06, \"time\": 0.0247}, \"98eca2c65c\": {\"quality\": 0.4630666666666667, \"cost\": 2e-06, \"time\": 0.0267}, \"98ecf1a157\": {\"quality\": 0.49795, \"cost\": 8.7e-07, \"time\": 0.0162}, \"9927dc270b\": {\"quality\": 0.5244, \"cost\": 2.99e-06, \"time\": 0.036}, \"99546d91e4\": {\"quality\": 0.5422666666666667, \"cost\": 1.6200000000000002e-06, \"time\": 0.026}, \"99cb0ba736\": {\"quality\": 0.4134, \"cost\": 1.68e-06, \"time\": 0.0235}, \"99e44ab9b2\": {\"quality\": 0.40980000000000005, \"cost\": 2.04e-06, \"time\": 0.026099999999999998}, \"9a0145c9b5\": {\"quality\": 0.4794666666666667, \"cost\": 1.23e-06, \"time\": 0.022099999999999998}, \"9a57ea3f89\": {\"quality\": 0.40980000000000005, \"cost\": 2.04e-06, \"time\": 0.0261}, \"9a8420a0b3\": {\"quality\": 0.4098, \"cost\": 2.04e-06, \"time\": 0.026099999999999998}, \"9aa32e6c96\": {\"quality\": 0.47485, \"cost\": 3.44e-06, \"time\": 0.0374}, \"9aa4abfb50\": {\"quality\": 0.39899999999999997, \"cost\": 7.2e-07, \"time\": 0.0052}, \"9ada932bf5\": {\"quality\": 0.5244, 
\"cost\": 2.99e-06, \"time\": 0.036}, \"9b6d4915f3\": {\"quality\": 0.43270000000000003, \"cost\": 3.6e-06, \"time\": 0.0336}, \"9bae5bafc1\": {\"quality\": 0.5869666666666667, \"cost\": 3.88e-06, \"time\": 0.033699999999999994}, \"9c549db0a7\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"9c595a2bc9\": {\"quality\": 0.466425, \"cost\": 2.6e-06, \"time\": 0.0288}, \"9c85f8cfcb\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 0.0124}, \"9c8cc46e6c\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 0.0124}, \"9c97d35a30\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"9ca354a53e\": {\"quality\": 0.39890000000000003, \"cost\": 1.68e-06, \"time\": 0.020200000000000003}, \"9ce2c3fd98\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.0111}, \"9d18cd0737\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"9d7142e7b4\": {\"quality\": 0.5244, \"cost\": 2.99e-06, \"time\": 0.036}, \"9d778daa24\": {\"quality\": 0.467775, \"cost\": 2.43e-06, \"time\": 0.0333}, \"9e06360bc9\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"9fb157be35\": {\"quality\": 0.48573333333333335, \"cost\": 2.48e-06, \"time\": 0.0191}, \"9fc44fdeb1\": {\"quality\": 0.45555, \"cost\": 2.6e-06, \"time\": 0.025500000000000002}, \"9ffaa26d5a\": {\"quality\": 0.48890000000000006, \"cost\": 2.24e-06, \"time\": 0.026199999999999998}, \"a041e7777a\": {\"quality\": 0.41340000000000005, \"cost\": 1.6799999999999998e-06, \"time\": 0.0235}, \"a0b81be5b4\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"a0c85d260e\": {\"quality\": 0.5955, \"cost\": 5.47e-06, \"time\": 0.048799999999999996}, \"a0dc9f50ac\": {\"quality\": 0.43596666666666667, \"cost\": 2.76e-06, \"time\": 0.0283}, \"a14c507393\": {\"quality\": 0.4134, \"cost\": 1.68e-06, \"time\": 0.0235}, \"a1881eb481\": {\"quality\": 0.46777500000000005, \"cost\": 2.43e-06, \"time\": 0.0333}, 
\"a1bb32e6a1\": {\"quality\": 0.4021666666666666, \"cost\": 8.4e-07, \"time\": 0.0149}, \"a2347e8e9e\": {\"quality\": 0.4794666666666667, \"cost\": 1.23e-06, \"time\": 0.022099999999999998}, \"a2aa082d14\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027700000000000002}, \"a2cd339ad9\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.040799999999999996}, \"a2fd03e6a5\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"a31e87d7cb\": {\"quality\": 0.4425, \"cost\": 7.2e-07, \"time\": 0.0118}, \"a344b2d79a\": {\"quality\": 0.52195, \"cost\": 3.830000000000001e-06, \"time\": 0.0413}, \"a3c0ea3342\": {\"quality\": 0.47729999999999995, \"cost\": 2.6e-06, \"time\": 0.0321}, \"a456d75fef\": {\"quality\": 0.38766666666666666, \"cost\": 8.4e-07, \"time\": 0.011600000000000001}, \"a457f6c300\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.0111}, \"a4767e7679\": {\"quality\": 0.39890000000000003, \"cost\": 1.68e-06, \"time\": 0.020200000000000003}, \"a47de025c8\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"a4ad96343d\": {\"quality\": 0.46642500000000003, \"cost\": 2.6e-06, \"time\": 0.0288}, \"a515a9c8cc\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"a5949b76ec\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"a5ae4dfe66\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.022899999999999997}, \"a60dd076b8\": {\"quality\": 0.4392333333333333, \"cost\": 1.92e-06, \"time\": 0.023}, \"a6297a6c56\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"a6460dbb7c\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"a6796ed686\": {\"quality\": 0.45085000000000003, \"cost\": 1.35e-06, \"time\": 0.028499999999999998}, \"a6d2b05ec8\": {\"quality\": 0.5715, \"cost\": 3.3800000000000002e-06, \"time\": 0.0399}, \"a717c4c535\": {\"quality\": 0.4917750000000001, \"cost\": 4.52e-06, 
\"time\": 0.0422}, \"a721cd9ebf\": {\"quality\": 0.48889999999999995, \"cost\": 2.24e-06, \"time\": 0.026199999999999998}, \"a8090787b1\": {\"quality\": 0.45935000000000004, \"cost\": 1.5899999999999998e-06, \"time\": 0.0247}, \"a854343d46\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.022899999999999997}, \"a86b137d7f\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"a88eb1493c\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"a88fb984e3\": {\"quality\": 0.52195, \"cost\": 3.830000000000001e-06, \"time\": 0.0413}, \"a94e2e5f57\": {\"quality\": 0.4630666666666667, \"cost\": 2e-06, \"time\": 0.026699999999999998}, \"a95b4a6dd0\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"a9621ea4e6\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.0197}, \"a96e22379d\": {\"quality\": 0.46777500000000005, \"cost\": 2.43e-06, \"time\": 0.033299999999999996}, \"a9721a0a50\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.030999999999999996}, \"aa08180e36\": {\"quality\": 0.4858, \"cost\": 2.84e-06, \"time\": 0.0283}, \"aa38702a02\": {\"quality\": 0.42799999999999994, \"cost\": 1.08e-06, \"time\": 0.0144}, \"aa8187c023\": {\"quality\": 0.39890000000000003, \"cost\": 1.6799999999999998e-06, \"time\": 0.020200000000000003}, \"aadbfc418b\": {\"quality\": 0.44250000000000006, \"cost\": 1.08e-06, \"time\": 0.0177}, \"ab1c706436\": {\"quality\": 0.46532500000000004, \"cost\": 3.27e-06, \"time\": 0.0386}, \"ab43b02cb0\": {\"quality\": 0.5082000000000001, \"cost\": 4.16e-06, \"time\": 0.0363}, \"aba21780bc\": {\"quality\": 0.39885000000000004, \"cost\": 1.3199999999999999e-06, \"time\": 0.0176}, \"abbca95f00\": {\"quality\": 0.5121, \"cost\": 1.88e-06, \"time\": 0.0203}, \"ac208e7a1d\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.0304}, \"ac7fcf90e2\": {\"quality\": 0.4098, \"cost\": 2.04e-06, \"time\": 0.0261}, 
\"ac828ffe70\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"ac9fdc1550\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"ad328d5108\": {\"quality\": 0.365, \"cost\": 3.5999999999999994e-07, \"time\": 0.019200000000000002}, \"ad3efe44c3\": {\"quality\": 0.39899999999999997, \"cost\": 7.2e-07, \"time\": 0.0052}, \"ad48432c22\": {\"quality\": 0.46540000000000004, \"cost\": 2.67e-06, \"time\": 0.0262}, \"ad5187a390\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.034100000000000005}, \"ad6ebbba8d\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"ad90055ef6\": {\"quality\": 0.39899999999999997, \"cost\": 7.2e-07, \"time\": 0.0052}, \"ad97c5cee6\": {\"quality\": 0.3908333333333333, \"cost\": 6e-07, \"time\": 0.0187}, \"adab1e0fb1\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"ae655ec593\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"ae94b172be\": {\"quality\": 0.5002333333333333, \"cost\": 2.48e-06, \"time\": 0.0224}, \"af360c323c\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.018299999999999997}, \"af90567194\": {\"quality\": 0.587075, \"cost\": 4.63e-06, \"time\": 0.0402}, \"b0156bb6d2\": {\"quality\": 0.47485, \"cost\": 3.44e-06, \"time\": 0.0374}, \"b03c31ca45\": {\"quality\": 0.514875, \"cost\": 2.82e-06, \"time\": 0.0372}, \"b0530b98c3\": {\"quality\": 0.46532500000000004, \"cost\": 3.27e-06, \"time\": 0.0386}, \"b0948c05b6\": {\"quality\": 0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.03899999999999999}, \"b18168b9c1\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"b1cf8d33e5\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"b1dcd7aa24\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"b214718d07\": {\"quality\": 0.4649666666666667, \"cost\": 1.23e-06, \"time\": 0.0188}, 
\"b28925e4b8\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"b2b057ba41\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"b2e063499d\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"b3369775dc\": {\"quality\": 0.588425, \"cost\": 4.46e-06, \"time\": 0.044700000000000004}, \"b35bf038c2\": {\"quality\": 0.37633333333333335, \"cost\": 6e-07, \"time\": 0.0154}, \"b363b25367\": {\"quality\": 0.40980000000000005, \"cost\": 2.04e-06, \"time\": 0.0261}, \"b3decd5c2f\": {\"quality\": 0.38756666666666667, \"cost\": 1.4399999999999998e-06, \"time\": 0.024}, \"b3f20b706d\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"b4002173ee\": {\"quality\": 0.43270000000000003, \"cost\": 2.4e-06, \"time\": 0.0224}, \"b45fc30d81\": {\"quality\": 0.45690000000000003, \"cost\": 2.43e-06, \"time\": 0.03}, \"b4a259f6dd\": {\"quality\": 0.44705, \"cost\": 2.3600000000000003e-06, \"time\": 0.0293}, \"b52cdb3c6d\": {\"quality\": 0.4794666666666667, \"cost\": 1.23e-06, \"time\": 0.0221}, \"b56c312eda\": {\"quality\": 0.4847, \"cost\": 3.51e-06, \"time\": 0.0381}, \"b5a02bb8ab\": {\"quality\": 0.40980000000000005, \"cost\": 2.04e-06, \"time\": 0.026099999999999998}, \"b5e2b41c1c\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"b64ddb14f9\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"b66118d5f2\": {\"quality\": 0.45690000000000003, \"cost\": 2.43e-06, \"time\": 0.03}, \"b67107a43e\": {\"quality\": 0.46785, \"cost\": 1.8299999999999998e-06, \"time\": 0.020900000000000002}, \"b682a23b89\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"b690a1ddd6\": {\"quality\": 0.40137500000000004, \"cost\": 1.2e-06, \"time\": 0.0175}, \"b796b7ffd3\": {\"quality\": 0.590875, \"cost\": 3.62e-06, \"time\": 0.0394}, \"b7d0e8557f\": {\"quality\": 0.48714999999999997, \"cost\": 
2.67e-06, \"time\": 0.0328}, \"b7f203a0bf\": {\"quality\": 0.467775, \"cost\": 2.43e-06, \"time\": 0.0333}, \"b81d5b2bd9\": {\"quality\": 0.5244, \"cost\": 2.99e-06, \"time\": 0.036}, \"b8343f05e1\": {\"quality\": 0.3989, \"cost\": 1.6799999999999998e-06, \"time\": 0.0202}, \"b8ab3d2f25\": {\"quality\": 0.5869666666666667, \"cost\": 3.88e-06, \"time\": 0.033699999999999994}, \"b8b569172f\": {\"quality\": 0.4969666666666666, \"cost\": 3.32e-06, \"time\": 0.0277}, \"b8b91e375d\": {\"quality\": 0.5290250000000001, \"cost\": 4.84e-06, \"time\": 0.045399999999999996}, \"b8c685904d\": {\"quality\": 0.4762, \"cost\": 2.07e-06, \"time\": 0.0274}, \"b8d1903276\": {\"quality\": 0.4649666666666667, \"cost\": 1.23e-06, \"time\": 0.018799999999999997}, \"b91e7fdb29\": {\"quality\": 0.4183, \"cost\": 2.2799999999999998e-06, \"time\": 0.0223}, \"b9770c2261\": {\"quality\": 0.6309, \"cost\": 1.5e-06, \"time\": 0.0196}, \"b9bb1e6f8d\": {\"quality\": 0.588425, \"cost\": 4.46e-06, \"time\": 0.044700000000000004}, \"b9da208432\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"ba3223f6ac\": {\"quality\": 0.428, \"cost\": 1.08e-06, \"time\": 0.0144}, \"bac3d23c31\": {\"quality\": 0.5206, \"cost\": 4.000000000000001e-06, \"time\": 0.0368}, \"bb3ee18de1\": {\"quality\": 0.48890000000000006, \"cost\": 2.24e-06, \"time\": 0.026199999999999998}, \"bb6536b0ab\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.030999999999999996}, \"bb70f60bf1\": {\"quality\": 0.40216666666666673, \"cost\": 8.4e-07, \"time\": 0.0149}, \"bbba9dd6ae\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"bbfba2f2ee\": {\"quality\": 0.38766666666666666, \"cost\": 8.4e-07, \"time\": 0.011600000000000001}, \"bc3d02f753\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"bc4c1fcc64\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"bc60556255\": {\"quality\": 0.365, \"cost\": 1.2e-07, \"time\": 0.0064}, \"bcae7c2fc4\": 
{\"quality\": 0.4762, \"cost\": 2.0699999999999997e-06, \"time\": 0.0274}, \"bcef42e3b0\": {\"quality\": 0.4166666666666667, \"cost\": 8.4e-07, \"time\": 0.0182}, \"bcf4bf7c35\": {\"quality\": 0.40735000000000005, \"cost\": 2.88e-06, \"time\": 0.031400000000000004}, \"bcfb273436\": {\"quality\": 0.5244, \"cost\": 2.99e-06, \"time\": 0.036}, \"bd99b2fb21\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.031}, \"bdf497196b\": {\"quality\": 0.5121, \"cost\": 1.88e-06, \"time\": 0.0203}, \"be2ae88f70\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.0111}, \"be4740f38f\": {\"quality\": 0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.03899999999999999}, \"bed888d4dc\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"bf2a5d2680\": {\"quality\": 0.4134, \"cost\": 1.6799999999999998e-06, \"time\": 0.0235}, \"bf7b0a8dc1\": {\"quality\": 0.45792499999999997, \"cost\": 2.3600000000000003e-06, \"time\": 0.0326}, \"bfed7670ed\": {\"quality\": 0.4969666666666666, \"cost\": 3.32e-06, \"time\": 0.0277}, \"c06b118e65\": {\"quality\": 0.4762, \"cost\": 2.0699999999999997e-06, \"time\": 0.0274}, \"c08a5ad170\": {\"quality\": 0.466425, \"cost\": 2.6e-06, \"time\": 0.0288}, \"c0d53a20de\": {\"quality\": 0.514875, \"cost\": 2.82e-06, \"time\": 0.0372}, \"c10e588987\": {\"quality\": 0.40980000000000005, \"cost\": 2.04e-06, \"time\": 0.026099999999999998}, \"c127509a7a\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"c13682c7c7\": {\"quality\": 0.494225, \"cost\": 3.68e-06, \"time\": 0.0369}, \"c13d6e78e9\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"c145482664\": {\"quality\": 0.47485000000000005, \"cost\": 3.44e-06, \"time\": 0.0374}, \"c14ff3144d\": {\"quality\": 0.4425, \"cost\": 7.2e-07, \"time\": 0.0118}, \"c186182658\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"c263c65d0a\": {\"quality\": 0.5064500000000001, \"cost\": 
1.9799999999999997e-06, \"time\": 0.0286}, \"c2949aa902\": {\"quality\": 0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.030799999999999998}, \"c31c9d4d8c\": {\"quality\": 0.4649666666666667, \"cost\": 1.2299999999999999e-06, \"time\": 0.0188}, \"c31e956b35\": {\"quality\": 0.523375, \"cost\": 3.06e-06, \"time\": 0.0334}, \"c339463a25\": {\"quality\": 0.37633333333333335, \"cost\": 6e-07, \"time\": 0.0154}, \"c36b525dde\": {\"quality\": 0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.030799999999999998}, \"c3d20f33bf\": {\"quality\": 0.5314749999999999, \"cost\": 4.000000000000001e-06, \"time\": 0.0401}, \"c3ec2cec59\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.0304}, \"c443a2c1fb\": {\"quality\": 0.4098, \"cost\": 2.04e-06, \"time\": 0.0261}, \"c4a80d19b3\": {\"quality\": 0.5800000000000001, \"cost\": 3.62e-06, \"time\": 0.03609999999999999}, \"c4c2826afd\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"c4c94a5527\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"c4f3e7665d\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.0111}, \"c58e9652b7\": {\"quality\": 0.39287500000000003, \"cost\": 9.6e-07, \"time\": 0.0213}, \"c5a16b834a\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"c5fbe2076f\": {\"quality\": 0.537525, \"cost\": 5.0800000000000005e-06, \"time\": 0.0416}, \"c6198f364e\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"c691570715\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.0229}, \"c691a29c42\": {\"quality\": 0.48090000000000005, \"cost\": 4.52e-06, \"time\": 0.038900000000000004}, \"c6a339987c\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"c6a4d256ce\": {\"quality\": 0.52195, \"cost\": 3.83e-06, \"time\": 0.0413}, \"c76222087e\": {\"quality\": 0.4630666666666667, \"cost\": 2e-06, \"time\": 0.026699999999999998}, \"c772ff3704\": {\"quality\": 0.429175, \"cost\": 
2.2799999999999998e-06, \"time\": 0.0256}, \"c7d4ff0c05\": {\"quality\": 0.467775, \"cost\": 2.43e-06, \"time\": 0.033299999999999996}, \"c823f7ab29\": {\"quality\": 0.4630666666666667, \"cost\": 2e-06, \"time\": 0.026699999999999998}, \"c82b926689\": {\"quality\": 0.52195, \"cost\": 3.830000000000001e-06, \"time\": 0.0413}, \"c82f834e85\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"c9320068f9\": {\"quality\": 0.40735000000000005, \"cost\": 2.88e-06, \"time\": 0.031400000000000004}, \"c935a33384\": {\"quality\": 0.5484, \"cost\": 5.0800000000000005e-06, \"time\": 0.044899999999999995}, \"c99f3577c7\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.034100000000000005}, \"c9d32a0a82\": {\"quality\": 0.466425, \"cost\": 2.6e-06, \"time\": 0.0288}, \"ca55c36c3f\": {\"quality\": 0.4794666666666667, \"cost\": 1.23e-06, \"time\": 0.0221}, \"caa7c0bd6b\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"cac6b051e9\": {\"quality\": 0.53425, \"cost\": 3.06e-06, \"time\": 0.036699999999999997}, \"cb19d631b2\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"cbb25fb322\": {\"quality\": 0.39287500000000003, \"cost\": 9.6e-07, \"time\": 0.0213}, \"cbb5eb0e74\": {\"quality\": 0.53425, \"cost\": 3.06e-06, \"time\": 0.036699999999999997}, \"cbc32cbeff\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.029500000000000002}, \"cbd4461293\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"cbe2318045\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.025699999999999997}, \"cc20ebc768\": {\"quality\": 0.47485, \"cost\": 3.44e-06, \"time\": 0.0374}, \"cc886fe337\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"ccf72745c1\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027700000000000002}, \"cd1d418732\": {\"quality\": 0.5082000000000001, \"cost\": 4.16e-06, 
\"time\": 0.0363}, \"cd23c79db1\": {\"quality\": 0.48573333333333335, \"cost\": 2.48e-06, \"time\": 0.0191}, \"cd64fbfcd9\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.023599999999999996}, \"cda3e2a4e9\": {\"quality\": 0.4134, \"cost\": 1.68e-06, \"time\": 0.0235}, \"cdc9ce922f\": {\"quality\": 0.4649666666666667, \"cost\": 1.23e-06, \"time\": 0.0188}, \"cdf0df2f51\": {\"quality\": 0.45690000000000003, \"cost\": 2.4299999999999996e-06, \"time\": 0.03}, \"ce281875b4\": {\"quality\": 0.4794666666666667, \"cost\": 1.23e-06, \"time\": 0.022099999999999998}, \"ce4bc5f348\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"ce980cf86f\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"cecca90dd2\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"cece83de2d\": {\"quality\": 0.4101333333333334, \"cost\": 2.5199999999999996e-06, \"time\": 0.0288}, \"cf51e0a888\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.0229}, \"cf9538faf0\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.0257}, \"cf9d2e224c\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"cfd36f3a8c\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.0263}, \"cffa29a6ef\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"d03596c3de\": {\"quality\": 0.5328999999999999, \"cost\": 3.23e-06, \"time\": 0.0322}, \"d0f9633442\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"d192872a51\": {\"quality\": 0.47240000000000004, \"cost\": 4.28e-06, \"time\": 0.0427}, \"d1d953cac7\": {\"quality\": 0.3989, \"cost\": 1.6799999999999998e-06, \"time\": 0.0202}, \"d2164c8c4c\": {\"quality\": 0.5611333333333334, \"cost\": 3.64e-06, \"time\": 0.034199999999999994}, \"d216eab7d8\": {\"quality\": 0.5869666666666667, \"cost\": 3.88e-06, \"time\": 0.033699999999999994}, \"d266c19ac8\": {\"quality\": 
0.525825, \"cost\": 2.22e-06, \"time\": 0.0281}, \"d2af24b59e\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.0224}, \"d3a2d50bd7\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"d3db4cf84d\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"d40174fb0b\": {\"quality\": 0.39885000000000004, \"cost\": 1.3199999999999999e-06, \"time\": 0.0176}, \"d43fafa19e\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"d48ead13da\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.022399999999999996}, \"d5016f4538\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"d55e983189\": {\"quality\": 0.5422666666666667, \"cost\": 1.6200000000000002e-06, \"time\": 0.026}, \"d573c2a414\": {\"quality\": 0.45935000000000004, \"cost\": 1.59e-06, \"time\": 0.0247}, \"d58036ba66\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.0224}, \"d59bacbfe0\": {\"quality\": 0.52195, \"cost\": 3.830000000000001e-06, \"time\": 0.041299999999999996}, \"d6040140b9\": {\"quality\": 0.5955, \"cost\": 5.47e-06, \"time\": 0.048799999999999996}, \"d640edd7a7\": {\"quality\": 0.4794666666666667, \"cost\": 1.23e-06, \"time\": 0.0221}, \"d65185c1a4\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"d690b6d739\": {\"quality\": 0.4649666666666667, \"cost\": 1.23e-06, \"time\": 0.018799999999999997}, \"d6bd3b66ba\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.0111}, \"d6c4e48eeb\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.0263}, \"d6c60a5214\": {\"quality\": 0.40375, \"cost\": 4.8e-07, \"time\": 0.0123}, \"d6cbf265ee\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"d73a9aab4e\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"d752c30d07\": {\"quality\": 0.5002333333333333, \"cost\": 2.48e-06, \"time\": 0.0224}, 
\"d7c0972014\": {\"quality\": 0.5082, \"cost\": 4.16e-06, \"time\": 0.0363}, \"d7f6c0c9d4\": {\"quality\": 0.4762, \"cost\": 2.07e-06, \"time\": 0.0274}, \"d813410e44\": {\"quality\": 0.47946666666666665, \"cost\": 1.2299999999999999e-06, \"time\": 0.022099999999999998}, \"d87eb775da\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"d8bab6c09b\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"d8bcac36e8\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027699999999999995}, \"d96677d8d4\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.026899999999999997}, \"d9e2bb21a3\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.021599999999999998}, \"daaadadcc9\": {\"quality\": 0.4858, \"cost\": 2.84e-06, \"time\": 0.0283}, \"daf855e065\": {\"quality\": 0.525825, \"cost\": 2.22e-06, \"time\": 0.0281}, \"db6f7259cd\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"db9060cd27\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"dbce95a072\": {\"quality\": 0.5114666666666666, \"cost\": 3.32e-06, \"time\": 0.031}, \"dbe7d818fa\": {\"quality\": 0.3908333333333333, \"cost\": 6e-07, \"time\": 0.0187}, \"dc66bccb1c\": {\"quality\": 0.5002333333333333, \"cost\": 2.48e-06, \"time\": 0.0224}, \"dc90065dea\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"dc9a912501\": {\"quality\": 0.4649666666666667, \"cost\": 1.2299999999999999e-06, \"time\": 0.0188}, \"dd0d70fedd\": {\"quality\": 0.6309, \"cost\": 1.5e-06, \"time\": 0.0196}, \"dd76899626\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.0341}, \"dd9f5d1ba9\": {\"quality\": 0.45085000000000003, \"cost\": 1.35e-06, \"time\": 0.028499999999999998}, \"de1645053b\": {\"quality\": 0.5314749999999999, \"cost\": 4.000000000000001e-06, \"time\": 0.0401}, \"de18bf45e1\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"de1e56370f\": {\"quality\": 
0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"deb84ddd06\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"df2bb408cf\": {\"quality\": 0.40216666666666673, \"cost\": 8.4e-07, \"time\": 0.0149}, \"df2ebe2c01\": {\"quality\": 0.45935000000000004, \"cost\": 1.59e-06, \"time\": 0.0247}, \"dfb8aebe38\": {\"quality\": 0.5244, \"cost\": 2.99e-06, \"time\": 0.036}, \"dfce6153aa\": {\"quality\": 0.418225, \"cost\": 2.88e-06, \"time\": 0.0347}, \"dfda94bd2a\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"dff452a9ca\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"e02f982a26\": {\"quality\": 0.48889999999999995, \"cost\": 2.24e-06, \"time\": 0.0262}, \"e06701b665\": {\"quality\": 0.4649666666666667, \"cost\": 1.2299999999999999e-06, \"time\": 0.0188}, \"e18abd2ab0\": {\"quality\": 0.4744, \"cost\": 2.24e-06, \"time\": 0.022899999999999997}, \"e1e596ee1b\": {\"quality\": 0.46642500000000003, \"cost\": 2.6e-06, \"time\": 0.0288}, \"e20ba014a1\": {\"quality\": 0.59795, \"cost\": 4.63e-06, \"time\": 0.0435}, \"e223700849\": {\"quality\": 0.4134, \"cost\": 1.68e-06, \"time\": 0.0235}, \"e26c7bfbdb\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.0197}, \"e35e5f81a7\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"e3cdc0d870\": {\"quality\": 0.4021666666666666, \"cost\": 8.4e-07, \"time\": 0.0149}, \"e3df4cf041\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"e3eca9854c\": {\"quality\": 0.41225, \"cost\": 1.2e-06, \"time\": 0.0208}, \"e47dc3abca\": {\"quality\": 0.5082000000000001, \"cost\": 4.16e-06, \"time\": 0.0363}, \"e495ff601f\": {\"quality\": 0.466425, \"cost\": 2.6e-06, \"time\": 0.0288}, \"e4b9d4fb41\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"e510bda989\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"e515bc1935\": 
{\"quality\": 0.48889999999999995, \"cost\": 2.24e-06, \"time\": 0.026199999999999998}, \"e517cd2222\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"e51b01f418\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"e520dfae5b\": {\"quality\": 0.47872499999999996, \"cost\": 1.8299999999999998e-06, \"time\": 0.0242}, \"e53906f84b\": {\"quality\": 0.4649666666666667, \"cost\": 1.23e-06, \"time\": 0.0188}, \"e53b349cce\": {\"quality\": 0.3989, \"cost\": 1.6799999999999998e-06, \"time\": 0.0202}, \"e5401ed278\": {\"quality\": 0.463975, \"cost\": 3.44e-06, \"time\": 0.034100000000000005}, \"e56a16ca66\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, \"e5a2f72b30\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"e5a70a13ac\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"e5fdeb4de9\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"e609601eee\": {\"quality\": 0.39890000000000003, \"cost\": 1.6799999999999998e-06, \"time\": 0.020200000000000003}, \"e6f141cc8f\": {\"quality\": 0.45690000000000003, \"cost\": 2.43e-06, \"time\": 0.030000000000000002}, \"e7525c117a\": {\"quality\": 0.4021666666666666, \"cost\": 8.4e-07, \"time\": 0.0149}, \"e7e94ab7a5\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"e86bd256d6\": {\"quality\": 0.4021666666666666, \"cost\": 8.4e-07, \"time\": 0.0149}, \"e89283a4d9\": {\"quality\": 0.3908333333333333, \"cost\": 6e-07, \"time\": 0.0187}, \"e9befb80e0\": {\"quality\": 0.4101333333333334, \"cost\": 2.5199999999999996e-06, \"time\": 0.0288}, \"ea38031fc1\": {\"quality\": 0.418225, \"cost\": 2.88e-06, \"time\": 0.0347}, \"ea6ecc5653\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"ea8bcb3ae2\": {\"quality\": 0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.03899999999999999}, \"ea91a2e78b\": {\"quality\": 0.39890000000000003, 
\"cost\": 1.68e-06, \"time\": 0.020200000000000003}, \"eac300f0d1\": {\"quality\": 0.45690000000000003, \"cost\": 2.43e-06, \"time\": 0.03}, \"ebaa9b1297\": {\"quality\": 0.49795, \"cost\": 8.7e-07, \"time\": 0.0162}, \"ebbe8b6c4f\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.031}, \"ebdf3abff2\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"ebee1f2761\": {\"quality\": 0.46777500000000005, \"cost\": 2.43e-06, \"time\": 0.033299999999999996}, \"ec8844a5ae\": {\"quality\": 0.365, \"cost\": 1.2e-07, \"time\": 0.0064}, \"ecb5f78f37\": {\"quality\": 0.522025, \"cost\": 3.23e-06, \"time\": 0.028899999999999995}, \"ece7ff5129\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"ed60b8cac5\": {\"quality\": 0.40137500000000004, \"cost\": 1.2e-06, \"time\": 0.0175}, \"ed6b5480a5\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"eda630dc85\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"edaaee5ed4\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"edb2b764aa\": {\"quality\": 0.5869666666666666, \"cost\": 3.88e-06, \"time\": 0.0337}, \"edc52339db\": {\"quality\": 0.47247500000000003, \"cost\": 3.68e-06, \"time\": 0.0303}, \"ee46042c5d\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"ee855899d8\": {\"quality\": 0.45363333333333333, \"cost\": 9.9e-07, \"time\": 0.0226}, \"eef12d478b\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"ef37b3e0be\": {\"quality\": 0.4183, \"cost\": 2.2799999999999998e-06, \"time\": 0.0223}, \"ef43d497f1\": {\"quality\": 0.48335, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"ef4d4c4a62\": {\"quality\": 0.5869666666666667, \"cost\": 3.88e-06, \"time\": 0.033699999999999994}, \"f0655621af\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"f076b4c9ae\": {\"quality\": 
0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"f07881a734\": {\"quality\": 0.38756666666666667, \"cost\": 1.44e-06, \"time\": 0.024}, \"f0829510fc\": {\"quality\": 0.418225, \"cost\": 2.88e-06, \"time\": 0.0347}, \"f11eddb4ed\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"f12622d3d7\": {\"quality\": 0.39892500000000003, \"cost\": 2.04e-06, \"time\": 0.0228}, \"f1408da253\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"f1770e7d28\": {\"quality\": 0.517325, \"cost\": 1.98e-06, \"time\": 0.0319}, \"f18cf41929\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"f1bda127f6\": {\"quality\": 0.5114666666666666, \"cost\": 3.32e-06, \"time\": 0.031}, \"f2a2e91541\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.029500000000000002}, \"f2c04ed1c8\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"f2cf5db12d\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.031}, \"f3c7a062f7\": {\"quality\": 0.5422666666666667, \"cost\": 1.6200000000000002e-06, \"time\": 0.026}, \"f42277df7f\": {\"quality\": 0.49795, \"cost\": 8.7e-07, \"time\": 0.0162}, \"f437481e3b\": {\"quality\": 0.5399750000000001, \"cost\": 4.24e-06, \"time\": 0.0363}, \"f487340019\": {\"quality\": 0.4762, \"cost\": 2.07e-06, \"time\": 0.0274}, \"f497b83523\": {\"quality\": 0.5517, \"cost\": 2.6300000000000002e-06, \"time\": 0.0301}, \"f4aa8ffdf5\": {\"quality\": 0.40216666666666673, \"cost\": 8.4e-07, \"time\": 0.0149}, \"f4ab4a73b6\": {\"quality\": 0.48563333333333336, \"cost\": 3.08e-06, \"time\": 0.0315}, \"f4beb148d0\": {\"quality\": 0.5422666666666667, \"cost\": 1.6200000000000002e-06, \"time\": 0.026}, \"f4deb72db6\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"f50a47a0aa\": {\"quality\": 0.47946666666666665, \"cost\": 1.2299999999999999e-06, \"time\": 0.022099999999999998}, \"f5b9a94dcc\": 
{\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"f5c27e7172\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.018299999999999997}, \"f5e53d963b\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"f5ecac6743\": {\"quality\": 0.466425, \"cost\": 2.6e-06, \"time\": 0.0288}, \"f614235c15\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"f627bff3a1\": {\"quality\": 0.47946666666666665, \"cost\": 1.2299999999999999e-06, \"time\": 0.022099999999999998}, \"f70e54a9a0\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"f74c5a862e\": {\"quality\": 0.4098, \"cost\": 2.04e-06, \"time\": 0.026099999999999998}, \"f74ec023e4\": {\"quality\": 0.494225, \"cost\": 3.68e-06, \"time\": 0.0369}, \"f783b2b34a\": {\"quality\": 0.4649666666666667, \"cost\": 1.23e-06, \"time\": 0.0188}, \"f7b048bd54\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"f7c4df993e\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"f848813dca\": {\"quality\": 0.448475, \"cost\": 1.5899999999999998e-06, \"time\": 0.021400000000000002}, \"f854533145\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"f8f946b5fb\": {\"quality\": 0.390425, \"cost\": 1.8e-06, \"time\": 0.026600000000000002}, \"f912592c8d\": {\"quality\": 0.5785750000000001, \"cost\": 4.39e-06, \"time\": 0.044}, \"f93d9a2693\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"f99096d89c\": {\"quality\": 0.543775, \"cost\": 3.23e-06, \"time\": 0.0355}, \"f9e8e221f3\": {\"quality\": 0.43596666666666667, \"cost\": 2.76e-06, \"time\": 0.0283}, \"fa38879eab\": {\"quality\": 0.538875, \"cost\": 4.9100000000000004e-06, \"time\": 0.0461}, \"fa5b473f15\": {\"quality\": 0.39892500000000003, \"cost\": 2.04e-06, \"time\": 0.0228}, \"fa7882d46b\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"fa906520d1\": {\"quality\": 0.5413250000000001, 
\"cost\": 4.07e-06, \"time\": 0.0408}, \"faabebaa30\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"fb0339a7d0\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.0375}, \"fb0e796cd3\": {\"quality\": 0.38756666666666667, \"cost\": 1.4399999999999998e-06, \"time\": 0.024}, \"fb216ad6b3\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"fb29372712\": {\"quality\": 0.5715, \"cost\": 3.3800000000000002e-06, \"time\": 0.0399}, \"fb6216880a\": {\"quality\": 0.522025, \"cost\": 3.23e-06, \"time\": 0.028899999999999995}, \"fbd6c45271\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"fc1fd5bf54\": {\"quality\": 0.47872499999999996, \"cost\": 1.8299999999999998e-06, \"time\": 0.0242}, \"fc4832696b\": {\"quality\": 0.41340000000000005, \"cost\": 1.6799999999999998e-06, \"time\": 0.0235}, \"fccaadfcdb\": {\"quality\": 0.513525, \"cost\": 2.99e-06, \"time\": 0.0327}, \"fcd3d2b250\": {\"quality\": 0.45690000000000003, \"cost\": 2.43e-06, \"time\": 0.030000000000000002}, \"fce38334b2\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"fd0709359e\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"fddccfbf94\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.030999999999999996}, \"fe2b4d4d8b\": {\"quality\": 0.41225, \"cost\": 1.2e-06, \"time\": 0.0208}, \"fea4734c09\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"fef1ca27fa\": {\"quality\": 0.4917750000000001, \"cost\": 4.52e-06, \"time\": 0.0422}, \"ff11cb6a7a\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"ff171e34e2\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.023599999999999996}, \"ff1c958e21\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 0.0124}, \"ff8e68049a\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}}"
  },
  {
    "path": "abacus-research/cheap-priors.json",
    "content": "{\"00c93aec22\": {\"quality\": 0.426725, \"cost\": 3.1199999999999998e-06, \"time\": 0.030900000000000004}, \"00f4acd0d3\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"0121878170\": {\"quality\": 0.630875, \"cost\": 2.611e-05, \"time\": 0.0282}, \"01c2f973ad\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"01fca3c717\": {\"quality\": 0.5406666666666666, \"cost\": 1.4060000000000001e-05, \"time\": 0.024999999999999998}, \"02078988c1\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.031}, \"021604dec1\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"0262668df7\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"0267c97b70\": {\"quality\": 0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.0308}, \"02d6cdecdc\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.038900000000000004}, \"033ca325e6\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"0364b5e990\": {\"quality\": 0.5594250000000001, \"cost\": 1.582e-05, \"time\": 0.0356}, \"0375ea52c9\": {\"quality\": 0.7468, \"cost\": 1.25e-05, \"time\": 0.0079}, \"038a5f0a62\": {\"quality\": 0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.030799999999999998}, \"039803b3b1\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.0289}, \"03b972cb56\": {\"quality\": 0.5594250000000001, \"cost\": 1.582e-05, \"time\": 0.03560000000000001}, \"042d933706\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"050b21ce37\": {\"quality\": 0.5161250000000001, \"cost\": 1.4420000000000001e-05, \"time\": 0.0309}, \"0524f42520\": {\"quality\": 0.63795, \"cost\": 2.712e-05, \"time\": 0.032299999999999995}, \"05420351e5\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"057e332ab1\": {\"quality\": 0.5594250000000001, \"cost\": 1.582e-05, \"time\": 0.0356}, \"0646f3f0fb\": 
{\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.034199999999999994}, \"06493715cc\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"0659531b94\": {\"quality\": 0.6695333333333333, \"cost\": 1.4000000000000001e-05, \"time\": 0.0275}, \"067ee6e91b\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"0695f9b5fc\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"06e94a0f2e\": {\"quality\": 0.7468, \"cost\": 1.25e-05, \"time\": 0.0079}, \"073ef31d23\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.026199999999999998}, \"078a7e545e\": {\"quality\": 0.6019, \"cost\": 1.436e-05, \"time\": 0.0301}, \"079feb14a8\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.0263}, \"07a3a7daf7\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"08127cd6dd\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"0833133620\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"087a2cabc4\": {\"quality\": 0.6174000000000001, \"cost\": 1.6210000000000002e-05, \"time\": 0.042800000000000005}, \"08bf8cc191\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"08e1802287\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"08f7f63b30\": {\"quality\": 0.563225, \"cost\": 1.481e-05, \"time\": 0.0348}, \"090cd3ef31\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"0947216ece\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.0304}, \"096d51f670\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"09791c731b\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"0990c0d4f8\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"0a4c1bbb4a\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.018299999999999997}, \"0ac969dde3\": {\"quality\": 0.5413250000000001, \"cost\": 
4.07e-06, \"time\": 0.040799999999999996}, \"0b4ab72197\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"0bf9d31691\": {\"quality\": 0.5149333333333334, \"cost\": 1.322e-05, \"time\": 0.0131}, \"0c020b86a3\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.0224}, \"0c6c7fe96a\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"0c81c8996a\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"0cd25da9fe\": {\"quality\": 0.5594250000000001, \"cost\": 1.582e-05, \"time\": 0.0356}, \"0cd78f33d8\": {\"quality\": 0.7081666666666666, \"cost\": 2.5750000000000002e-05, \"time\": 0.0256}, \"0cdc5954dd\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.031}, \"0d53dd53c1\": {\"quality\": 0.5922000000000001, \"cost\": 2.6560000000000003e-05, \"time\": 0.0329}, \"0d8436af32\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.0289}, \"0e38896654\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.0289}, \"0ec672e7c8\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.037500000000000006}, \"0ed243f788\": {\"quality\": 0.543775, \"cost\": 3.23e-06, \"time\": 0.0355}, \"0eeb372802\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"0ef0becc1b\": {\"quality\": 0.6067333333333333, \"cost\": 1.361e-05, \"time\": 0.0236}, \"0fefead197\": {\"quality\": 0.6417499999999999, \"cost\": 2.611e-05, \"time\": 0.0315}, \"0ff126ebf8\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"10d1d4bdeb\": {\"quality\": 0.561875, \"cost\": 1.498e-05, \"time\": 0.0303}, \"114a097c53\": {\"quality\": 0.43270000000000003, \"cost\": 2.4e-06, \"time\": 0.0224}, \"1175ee37e6\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, \"11bc996d48\": {\"quality\": 0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.030799999999999998}, \"11ded03305\": 
{\"quality\": 0.6067333333333332, \"cost\": 1.361e-05, \"time\": 0.0236}, \"123fb650fb\": {\"quality\": 0.429175, \"cost\": 2.2799999999999998e-06, \"time\": 0.0256}, \"132f6f3946\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"133ee5023f\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"13a009fe0c\": {\"quality\": 0.5399750000000001, \"cost\": 4.24e-06, \"time\": 0.0363}, \"13da306f84\": {\"quality\": 0.4969666666666666, \"cost\": 3.32e-06, \"time\": 0.0277}, \"13e717e41e\": {\"quality\": 0.59465, \"cost\": 1.286e-05, \"time\": 0.0138}, \"13f75f9bd0\": {\"quality\": 0.4392333333333333, \"cost\": 1.92e-06, \"time\": 0.023}, \"140ededb41\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"142e59c03f\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.024399999999999998}, \"142f3a7c70\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"1468dddecc\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"15af009a01\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.038900000000000004}, \"15b80a55d3\": {\"quality\": 0.561875, \"cost\": 1.498e-05, \"time\": 0.030299999999999997}, \"1625e624c5\": {\"quality\": 0.58975, \"cost\": 1.3700000000000001e-05, \"time\": 0.0191}, \"16cff1c1e9\": {\"quality\": 0.49682499999999996, \"cost\": 1.358e-05, \"time\": 0.019000000000000003}, \"17407df027\": {\"quality\": 0.5028, \"cost\": 1.526e-05, \"time\": 0.0329}, \"176da24f53\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"179379555f\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"181c91d1be\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.030999999999999996}, \"183743e76e\": {\"quality\": 0.43596666666666667, \"cost\": 2.76e-06, \"time\": 0.0283}, \"186b58c209\": {\"quality\": 0.5161250000000001, \"cost\": 1.4420000000000001e-05, \"time\": 0.0309}, \"187eace9fe\": {\"quality\": 
0.5149333333333334, \"cost\": 1.322e-05, \"time\": 0.0131}, \"190ed2e1b6\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.0289}, \"191aafe1a6\": {\"quality\": 0.6067333333333333, \"cost\": 1.361e-05, \"time\": 0.0236}, \"194919ad28\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.021699999999999997}, \"197bb53f10\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"19ba7d0617\": {\"quality\": 0.7468, \"cost\": 3.7500000000000003e-05, \"time\": 0.023700000000000002}, \"19e3db7fe7\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"1a08cb3f50\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.028900000000000002}, \"1a169179f6\": {\"quality\": 0.646375, \"cost\": 2.7960000000000003e-05, \"time\": 0.040900000000000006}, \"1a71d61ac4\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.0217}, \"1ad856985f\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"1adec2dca2\": {\"quality\": 0.47247500000000003, \"cost\": 3.68e-06, \"time\": 0.0303}, \"1b04a2b184\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.037500000000000006}, \"1b28439bd7\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.0304}, \"1b2c667b15\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"1b4511eada\": {\"quality\": 0.612775, \"cost\": 1.436e-05, \"time\": 0.0334}, \"1b7e6cad66\": {\"quality\": 0.7030000000000001, \"cost\": 1.426e-05, \"time\": 0.0218}, \"1beb2fac62\": {\"quality\": 0.4969666666666666, \"cost\": 3.32e-06, \"time\": 0.0277}, \"1c347e4d91\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.021599999999999998}, \"1c35bf4be6\": {\"quality\": 0.610325, \"cost\": 1.52e-05, \"time\": 0.0387}, \"1c4bbf8f7e\": {\"quality\": 0.630875, \"cost\": 2.611e-05, \"time\": 0.0282}, \"1c5f1341f6\": {\"quality\": 0.6067333333333333, \"cost\": 1.361e-05, \"time\": 
0.0236}, \"1c71804bec\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.0244}, \"1cc6d9efb6\": {\"quality\": 0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.03899999999999999}, \"1ce3d77039\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"1d26090364\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"1d87f97e62\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"1d90fb8ca6\": {\"quality\": 0.7081666666666666, \"cost\": 2.5750000000000002e-05, \"time\": 0.0256}, \"1da2369719\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"1e18e60895\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"1e1bf7e88b\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"1e8b3521f8\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"1f412964ff\": {\"quality\": 0.6067333333333332, \"cost\": 1.361e-05, \"time\": 0.0236}, \"1f72cfb78a\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.038900000000000004}, \"1fb5d170ad\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.0262}, \"20180dd292\": {\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.0342}, \"2018bef45f\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.025699999999999997}, \"2075ff1d04\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"208a98f514\": {\"quality\": 0.5800000000000001, \"cost\": 3.62e-06, \"time\": 0.03609999999999999}, \"20904e5c14\": {\"quality\": 0.5439333333333334, \"cost\": 1.322e-05, \"time\": 0.0197}, \"20afc3d539\": {\"quality\": 0.7176, \"cost\": 2.676e-05, \"time\": 0.0297}, \"20e10af7d4\": {\"quality\": 0.47872499999999996, \"cost\": 1.8299999999999998e-06, \"time\": 0.0242}, \"20e2c0b057\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"211b89b4cd\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"21386082aa\": {\"quality\": 
0.561875, \"cost\": 1.498e-05, \"time\": 0.0303}, \"21b2b8ebd1\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"21b78249a7\": {\"quality\": 0.6308666666666666, \"cost\": 2.536e-05, \"time\": 0.0184}, \"21bed16a7d\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"2200d969d0\": {\"quality\": 0.695925, \"cost\": 2.751e-05, \"time\": 0.03950000000000001}, \"220d008704\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.0262}, \"2251d21392\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"23566f15ab\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"23a9506d36\": {\"quality\": 0.7468, \"cost\": 1.25e-05, \"time\": 0.0079}, \"24957f3a43\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.0197}, \"2529e2f8b0\": {\"quality\": 0.7468, \"cost\": 1.25e-05, \"time\": 0.0079}, \"252f01ac5b\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.024399999999999998}, \"25fadf0883\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.038900000000000004}, \"2609bfd616\": {\"quality\": 0.4858, \"cost\": 2.84e-06, \"time\": 0.0283}, \"2629f3e324\": {\"quality\": 0.5632250000000001, \"cost\": 1.481e-05, \"time\": 0.0348}, \"262e4298f9\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.0276}, \"26cc40d3bb\": {\"quality\": 0.5509999999999999, \"cost\": 1.498e-05, \"time\": 0.026999999999999996}, \"2728c8eb6a\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"27bc52befa\": {\"quality\": 0.5374, \"cost\": 1.49e-05, \"time\": 0.0303}, \"27daa50458\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"2821795e69\": {\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.034199999999999994}, \"28369b2421\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.0263}, \"28421e6d62\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, 
\"time\": 0.022399999999999996}, \"2936c3e43e\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.0375}, \"293ec5edca\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"29409d0894\": {\"quality\": 0.59465, \"cost\": 1.286e-05, \"time\": 0.0138}, \"294258298a\": {\"quality\": 0.6067333333333332, \"cost\": 1.361e-05, \"time\": 0.0236}, \"294e541235\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 0.0124}, \"295ed5e759\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.0197}, \"2960431101\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"29892d8468\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"299a0aeb65\": {\"quality\": 0.590875, \"cost\": 3.62e-06, \"time\": 0.0394}, \"29ad99e3ed\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.016399999999999998}, \"2a5edac2de\": {\"quality\": 0.5922000000000001, \"cost\": 2.6560000000000003e-05, \"time\": 0.0329}, \"2a7d15f4a7\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"2aa996de6a\": {\"quality\": 0.5837749999999999, \"cost\": 2.572e-05, \"time\": 0.024300000000000002}, \"2ac4fb293f\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"2afeff0083\": {\"quality\": 0.563225, \"cost\": 1.481e-05, \"time\": 0.034800000000000005}, \"2b2bc9568b\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"2b5679d248\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"2bcf54cda1\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.0276}, \"2bd39ee744\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"2bf38d797f\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"2c4f4f304e\": {\"quality\": 0.68885, \"cost\": 1.325e-05, \"time\": 0.0177}, \"2c5cf9eb26\": {\"quality\": 0.4875333333333334, \"cost\": 
2.31e-06, \"time\": 0.023599999999999996}, \"2c9a9f94c4\": {\"quality\": 0.426725, \"cost\": 3.1199999999999998e-06, \"time\": 0.030900000000000004}, \"2d3bbc2d23\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"2de113167b\": {\"quality\": 0.494225, \"cost\": 3.68e-06, \"time\": 0.0369}, \"2de3eb2c19\": {\"quality\": 0.6592, \"cost\": 3.52e-06, \"time\": 0.0278}, \"2e30394ac6\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.011099999999999999}, \"2e9c5cc9bf\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"2f1573da80\": {\"quality\": 0.429175, \"cost\": 2.2799999999999998e-06, \"time\": 0.0256}, \"2f39d78f34\": {\"quality\": 0.6453666666666668, \"cost\": 2.536e-05, \"time\": 0.0217}, \"2fc0cb3592\": {\"quality\": 0.43596666666666667, \"cost\": 2.76e-06, \"time\": 0.0283}, \"2fd9cd426a\": {\"quality\": 0.48335, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"300924ebae\": {\"quality\": 0.6308666666666666, \"cost\": 2.536e-05, \"time\": 0.0184}, \"3019af79b3\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"302c1d97fc\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027699999999999995}, \"30ae4cbe91\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"30c1f9ddf1\": {\"quality\": 0.6067333333333333, \"cost\": 1.361e-05, \"time\": 0.0236}, \"30cd375570\": {\"quality\": 0.6129000000000001, \"cost\": 1.546e-05, \"time\": 0.033}, \"3169782cbb\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.0164}, \"3172fc459a\": {\"quality\": 0.6244750000000001, \"cost\": 1.722e-05, \"time\": 0.0469}, \"318499c14b\": {\"quality\": 0.610325, \"cost\": 1.52e-05, \"time\": 0.0387}, \"31a32be94d\": {\"quality\": 0.6393, \"cost\": 2.695e-05, \"time\": 0.0368}, \"32b101d807\": {\"quality\": 0.538875, \"cost\": 4.9100000000000004e-06, \"time\": 0.0461}, \"32e2c7ad7f\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.0375}, \"33459cd29c\": 
{\"quality\": 0.47492500000000004, \"cost\": 2.84e-06, \"time\": 0.025}, \"33a187e74f\": {\"quality\": 0.525825, \"cost\": 2.22e-06, \"time\": 0.0281}, \"33bab4f766\": {\"quality\": 0.612775, \"cost\": 1.436e-05, \"time\": 0.0334}, \"34922140da\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.038900000000000004}, \"3511b5e1d0\": {\"quality\": 0.39899999999999997, \"cost\": 1.08e-06, \"time\": 0.0078}, \"3513311c2d\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"3513e54767\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"353f0cb1ac\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"3550bf88cb\": {\"quality\": 0.53425, \"cost\": 3.06e-06, \"time\": 0.036699999999999997}, \"35610fb420\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"357267e14b\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"35baa5c3cc\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.0375}, \"3637084f91\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.025699999999999997}, \"368a497102\": {\"quality\": 0.5406666666666666, \"cost\": 1.4060000000000001e-05, \"time\": 0.025}, \"36c66671ee\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.023599999999999996}, \"37456cb002\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"3746ea5c03\": {\"quality\": 0.5406666666666666, \"cost\": 1.4060000000000001e-05, \"time\": 0.024999999999999998}, \"375ed248fe\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.025699999999999997}, \"377cdf9209\": {\"quality\": 0.563225, \"cost\": 1.481e-05, \"time\": 0.0348}, \"37d4d0f214\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.0389}, \"37ece7217f\": {\"quality\": 0.59465, \"cost\": 1.286e-05, \"time\": 0.0138}, \"38075bb01f\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"3831d758b1\": 
{\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.021699999999999997}, \"38567d6a43\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"3875787727\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"389c54cbca\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"3980f20caa\": {\"quality\": 0.48090000000000005, \"cost\": 4.52e-06, \"time\": 0.038900000000000004}, \"3997a836bd\": {\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.034199999999999994}, \"39ad76f8ce\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.024399999999999998}, \"39c0b7c171\": {\"quality\": 0.6174000000000001, \"cost\": 1.621e-05, \"time\": 0.042800000000000005}, \"39cd4ca402\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.018299999999999997}, \"3a32c98a53\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"3ac7fa4e46\": {\"quality\": 0.646375, \"cost\": 2.7960000000000003e-05, \"time\": 0.040900000000000006}, \"3ad6dcf559\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"3ae0de8663\": {\"quality\": 0.5594250000000001, \"cost\": 1.582e-05, \"time\": 0.03560000000000001}, \"3b2e8075ea\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"3b3676521a\": {\"quality\": 0.5955, \"cost\": 5.47e-06, \"time\": 0.048799999999999996}, \"3b57530a56\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"3b6fbfa11d\": {\"quality\": 0.6129000000000001, \"cost\": 1.546e-05, \"time\": 0.033}, \"3b81215e7a\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.037500000000000006}, \"3b9f8045d7\": {\"quality\": 0.646375, \"cost\": 2.7960000000000003e-05, \"time\": 0.040900000000000006}, \"3c5857683c\": {\"quality\": 0.5678500000000001, \"cost\": 1.666e-05, \"time\": 0.0442}, \"3cbab8082e\": {\"quality\": 0.523375, \"cost\": 3.06e-06, \"time\": 0.0334}, \"3d21104666\": {\"quality\": 
0.6421, \"cost\": 2.62e-05, \"time\": 0.027}, \"3d71c4dd2c\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.026899999999999997}, \"3d9e24215e\": {\"quality\": 0.7030000000000001, \"cost\": 1.426e-05, \"time\": 0.0218}, \"3e7efee65a\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.0375}, \"3ed0ad20ed\": {\"quality\": 0.5594250000000001, \"cost\": 1.582e-05, \"time\": 0.0356}, \"3f1a58aec9\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"3f2b07cb78\": {\"quality\": 0.6129000000000001, \"cost\": 1.546e-05, \"time\": 0.033}, \"3f3ef494b0\": {\"quality\": 0.43596666666666667, \"cost\": 2.76e-06, \"time\": 0.0283}, \"3f62c3fbfc\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"3f730d8bfe\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.0244}, \"3f8d2ee81f\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"40104c813f\": {\"quality\": 0.522025, \"cost\": 3.23e-06, \"time\": 0.028899999999999995}, \"403b05da2d\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027699999999999995}, \"403f0726fa\": {\"quality\": 0.7468, \"cost\": 2.5e-05, \"time\": 0.0158}, \"4098178354\": {\"quality\": 0.563225, \"cost\": 1.481e-05, \"time\": 0.034800000000000005}, \"409ff67607\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.018299999999999997}, \"40b3b6642c\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"412c065b83\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"4191118787\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"41d5b97871\": {\"quality\": 0.43923333333333336, \"cost\": 1.92e-06, \"time\": 0.023}, \"41d8845655\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"41ee202cac\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"41fe4aee55\": {\"quality\": 0.6016666666666667, 
\"cost\": 1.462e-05, \"time\": 0.024399999999999998}, \"42430ea391\": {\"quality\": 0.559425, \"cost\": 1.582e-05, \"time\": 0.0356}, \"42ddd48341\": {\"quality\": 0.537525, \"cost\": 5.0800000000000005e-06, \"time\": 0.0416}, \"42f1e19aa7\": {\"quality\": 0.63795, \"cost\": 2.712e-05, \"time\": 0.032299999999999995}, \"430a2ab32f\": {\"quality\": 0.7176, \"cost\": 2.676e-05, \"time\": 0.0297}, \"4339427ad8\": {\"quality\": 0.513675, \"cost\": 1.526e-05, \"time\": 0.0362}, \"4361bc7ea7\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"43c3cf9cb8\": {\"quality\": 0.6592, \"cost\": 3.52e-06, \"time\": 0.0278}, \"43d24fb32a\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027700000000000002}, \"43e9b39e5c\": {\"quality\": 0.6740250000000001, \"cost\": 1.677e-05, \"time\": 0.0455}, \"44d6af5523\": {\"quality\": 0.6309, \"cost\": 2.25e-06, \"time\": 0.0294}, \"44f189d813\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"450f45a187\": {\"quality\": 0.5729, \"cost\": 1.286e-05, \"time\": 0.0105}, \"453d0a5097\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.0217}, \"4547ef4c8e\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"461846a52d\": {\"quality\": 0.42799999999999994, \"cost\": 1.08e-06, \"time\": 0.0144}, \"462e6ff849\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"4630853d32\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"46475b9e75\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"46654a1f32\": {\"quality\": 0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.039}, \"466a3036b2\": {\"quality\": 0.47382500000000005, \"cost\": 3.51e-06, \"time\": 0.0348}, \"466d4d16dd\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.022399999999999996}, \"46ed68152d\": {\"quality\": 0.5002333333333333, \"cost\": 2.48e-06, \"time\": 0.0224}, \"476a12876c\": 
{\"quality\": 0.4392333333333333, \"cost\": 1.92e-06, \"time\": 0.023}, \"4778401a7a\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.027600000000000003}, \"47f9115b26\": {\"quality\": 0.561875, \"cost\": 1.498e-05, \"time\": 0.0303}, \"48043e2304\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"487f30e740\": {\"quality\": 0.6129, \"cost\": 1.546e-05, \"time\": 0.033}, \"488645cbd9\": {\"quality\": 0.4425, \"cost\": 7.2e-07, \"time\": 0.0118}, \"48bf87f7fe\": {\"quality\": 0.648825, \"cost\": 2.712e-05, \"time\": 0.0356}, \"4909061216\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.037500000000000006}, \"49731b1ccd\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"49ad844bd2\": {\"quality\": 0.4847, \"cost\": 3.51e-06, \"time\": 0.0381}, \"49ca727e49\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"4a23d8eff7\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.0276}, \"4a555da784\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"4a5cea8b85\": {\"quality\": 0.6421, \"cost\": 2.62e-05, \"time\": 0.027}, \"4a767339bd\": {\"quality\": 0.57275, \"cost\": 1.498e-05, \"time\": 0.0336}, \"4aafd39d76\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"4aca6e5216\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.0164}, \"4b18a647d6\": {\"quality\": 0.630875, \"cost\": 2.611e-05, \"time\": 0.0282}, \"4bc4528402\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"4c158a1a4a\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.0197}, \"4c954323e3\": {\"quality\": 0.5800000000000001, \"cost\": 3.62e-06, \"time\": 0.03609999999999999}, \"4d91e8a27b\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"4dc185389a\": {\"quality\": 0.5656749999999999, \"cost\": 1.397e-05, \"time\": 0.0295}, 
\"4dd3635bc3\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"4dd98ef398\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.021699999999999997}, \"4dfacd0007\": {\"quality\": 0.563225, \"cost\": 1.481e-05, \"time\": 0.0348}, \"4e298ee0d4\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.029500000000000002}, \"4e3443a0f9\": {\"quality\": 0.538875, \"cost\": 4.9100000000000004e-06, \"time\": 0.0461}, \"4e4b9db2b8\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"4e6509f614\": {\"quality\": 0.43270000000000003, \"cost\": 2.4e-06, \"time\": 0.0224}, \"4e6a83e751\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"4e79c8947f\": {\"quality\": 0.58975, \"cost\": 1.3700000000000001e-05, \"time\": 0.0191}, \"4e9504432b\": {\"quality\": 0.560775, \"cost\": 1.565e-05, \"time\": 0.040100000000000004}, \"4e962170dc\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.024399999999999998}, \"4eb0826f21\": {\"quality\": 0.5729, \"cost\": 1.286e-05, \"time\": 0.0105}, \"4ed41bf2e4\": {\"quality\": 0.5837749999999999, \"cost\": 2.572e-05, \"time\": 0.024300000000000002}, \"4f78672528\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.0164}, \"4f8cca1195\": {\"quality\": 0.59795, \"cost\": 4.63e-06, \"time\": 0.0435}, \"500860eaa2\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"50701b505e\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.026899999999999997}, \"51583a901c\": {\"quality\": 0.59795, \"cost\": 4.63e-06, \"time\": 0.0435}, \"51aeaf9f3e\": {\"quality\": 0.7468, \"cost\": 1.25e-05, \"time\": 0.0079}, \"520b52b64c\": {\"quality\": 0.7468, \"cost\": 2.5e-05, \"time\": 0.0158}, \"521314dab6\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"5226eb7ff6\": {\"quality\": 0.6067333333333332, \"cost\": 1.361e-05, \"time\": 0.0236}, 
\"526878b5eb\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"52c1cba6ce\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.040799999999999996}, \"52f041a70e\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"533867574b\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.0217}, \"53869388bb\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"53aefd41e4\": {\"quality\": 0.5439333333333334, \"cost\": 1.322e-05, \"time\": 0.019700000000000002}, \"53d2932c4f\": {\"quality\": 0.5114666666666666, \"cost\": 3.32e-06, \"time\": 0.031}, \"54375d3eba\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.0217}, \"5474247f91\": {\"quality\": 0.48573333333333335, \"cost\": 2.48e-06, \"time\": 0.0191}, \"54993bc472\": {\"quality\": 0.59465, \"cost\": 1.286e-05, \"time\": 0.0138}, \"55358f2285\": {\"quality\": 0.6884, \"cost\": 1.602e-05, \"time\": 0.035699999999999996}, \"5569b4f878\": {\"quality\": 0.68885, \"cost\": 1.325e-05, \"time\": 0.0177}, \"557d2cf7ba\": {\"quality\": 0.48573333333333335, \"cost\": 2.48e-06, \"time\": 0.0191}, \"55c8aa8935\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"55e6bf8f14\": {\"quality\": 0.5406666666666666, \"cost\": 1.4060000000000001e-05, \"time\": 0.024999999999999998}, \"56a0660622\": {\"quality\": 0.58975, \"cost\": 1.3700000000000001e-05, \"time\": 0.0191}, \"56a29a28c5\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"56c4fd5056\": {\"quality\": 0.6308666666666667, \"cost\": 2.536e-05, \"time\": 0.0184}, \"5703697dbd\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"5718f2ed80\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"572a02a59a\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"5750713a41\": {\"quality\": 0.4392333333333333, \"cost\": 1.92e-06, 
\"time\": 0.023}, \"57757ef15e\": {\"quality\": 0.5002333333333333, \"cost\": 2.48e-06, \"time\": 0.0224}, \"579c81bbe0\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"57bed1722f\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"585ba6d20b\": {\"quality\": 0.5328999999999999, \"cost\": 3.23e-06, \"time\": 0.0322}, \"589267ac64\": {\"quality\": 0.58975, \"cost\": 1.3700000000000001e-05, \"time\": 0.0191}, \"589a1cea79\": {\"quality\": 0.47492500000000004, \"cost\": 2.84e-06, \"time\": 0.025}, \"58ca42839b\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.0164}, \"58dc373441\": {\"quality\": 0.5076999999999999, \"cost\": 1.358e-05, \"time\": 0.0223}, \"59006532b4\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"59326c4e00\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"593975c75b\": {\"quality\": 0.695925, \"cost\": 2.751e-05, \"time\": 0.03950000000000001}, \"596b4f8694\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.0276}, \"5971ba4e0d\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"5996465c0a\": {\"quality\": 0.563225, \"cost\": 1.481e-05, \"time\": 0.0348}, \"59d70b9f65\": {\"quality\": 0.6695333333333333, \"cost\": 1.4e-05, \"time\": 0.0275}, \"59f887b67c\": {\"quality\": 0.47492500000000004, \"cost\": 2.84e-06, \"time\": 0.025}, \"5a22920db4\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, \"5a35020d45\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"5aa43da1fc\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.0244}, \"5ae0d88127\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.016399999999999998}, \"5b10fbdbe1\": {\"quality\": 0.5399750000000001, \"cost\": 4.24e-06, \"time\": 0.0363}, \"5bade9eb85\": {\"quality\": 0.523375, \"cost\": 3.06e-06, \"time\": 0.0334}, \"5be16744bf\": {\"quality\": 0.4969666666666666, \"cost\": 
3.32e-06, \"time\": 0.0277}, \"5c5055e252\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.0164}, \"5c53feccd9\": {\"quality\": 0.48090000000000005, \"cost\": 4.52e-06, \"time\": 0.038900000000000004}, \"5c77c7c2b2\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"5d298b5b48\": {\"quality\": 0.559425, \"cost\": 1.582e-05, \"time\": 0.0356}, \"5d41515d2e\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"5d4babc723\": {\"quality\": 0.5618749999999999, \"cost\": 1.498e-05, \"time\": 0.0303}, \"5d79b50feb\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.030399999999999996}, \"5dc216cd6b\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.030999999999999996}, \"5dd68c1b8f\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, \"5de4a882c1\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"5e04e1c72d\": {\"quality\": 0.6453666666666668, \"cost\": 2.536e-05, \"time\": 0.0217}, \"5e923cee9e\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"5ea2fab380\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"5eb3bb525b\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.022399999999999996}, \"5eea899380\": {\"quality\": 0.6174000000000001, \"cost\": 1.6210000000000002e-05, \"time\": 0.042800000000000005}, \"5f37b3902b\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027699999999999995}, \"5f9282df3c\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.027600000000000003}, \"6019884cf3\": {\"quality\": 0.560775, \"cost\": 1.565e-05, \"time\": 0.040100000000000004}, \"606352363e\": {\"quality\": 0.5028, \"cost\": 1.526e-05, \"time\": 0.0329}, \"608728f868\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"60b9e936f1\": {\"quality\": 0.5729, \"cost\": 1.286e-05, \"time\": 0.0105}, 
\"60cb623c53\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"6234de86b4\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"62352c6854\": {\"quality\": 0.5678500000000001, \"cost\": 1.666e-05, \"time\": 0.0442}, \"63a0aaebed\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 0.0124}, \"647dda686f\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.0375}, \"6511b21ded\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"652c0f4bdf\": {\"quality\": 0.47247500000000003, \"cost\": 3.68e-06, \"time\": 0.0303}, \"6533c85913\": {\"quality\": 0.47382500000000005, \"cost\": 3.51e-06, \"time\": 0.0348}, \"65627426e0\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"65b76da9c6\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.022399999999999996}, \"65be1c1306\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"65e0216208\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"65eee615d7\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"6623d7a5ac\": {\"quality\": 0.66695, \"cost\": 1.576e-05, \"time\": 0.041400000000000006}, \"66750c0934\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"66776ec181\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"66e5ae0a21\": {\"quality\": 0.5406666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.025}, \"6750a8d7a7\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"67632141f6\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"67868fcff6\": {\"quality\": 0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.03899999999999999}, \"67bab6732d\": {\"quality\": 0.5082000000000001, \"cost\": 4.16e-06, \"time\": 0.0363}, \"67fe399cf1\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.025699999999999997}, \"6846bd8fb3\": 
{\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"68b4cc3e39\": {\"quality\": 0.581325, \"cost\": 2.6560000000000003e-05, \"time\": 0.0296}, \"69a029ae36\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.024399999999999998}, \"69b3b67de6\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"69bf3f6ba0\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.0224}, \"6a022c3f73\": {\"quality\": 0.42799999999999994, \"cost\": 1.08e-06, \"time\": 0.0144}, \"6a10c53ad8\": {\"quality\": 0.588425, \"cost\": 4.46e-06, \"time\": 0.044700000000000004}, \"6a6348f69d\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"6a74a11bee\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"6a90ec29dd\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.0244}, \"6aac59742a\": {\"quality\": 0.46540000000000004, \"cost\": 2.67e-06, \"time\": 0.0262}, \"6b0862c597\": {\"quality\": 0.6067333333333332, \"cost\": 1.361e-05, \"time\": 0.0236}, \"6b3c16def2\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, \"6b99b0e901\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.0244}, \"6b9b2b3515\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"6bcc02962b\": {\"quality\": 0.6884, \"cost\": 1.602e-05, \"time\": 0.035699999999999996}, \"6c1987a9e3\": {\"quality\": 0.57275, \"cost\": 1.498e-05, \"time\": 0.0336}, \"6c50123ee1\": {\"quality\": 0.68885, \"cost\": 1.325e-05, \"time\": 0.0177}, \"6c67c36480\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"6c9b9f1363\": {\"quality\": 0.610325, \"cost\": 1.52e-05, \"time\": 0.0387}, \"6cc813aa68\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"6d20c6ace0\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.0304}, \"6d444fe21a\": {\"quality\": 0.5406666666666666, \"cost\": 1.4060000000000001e-05, \"time\": 0.025}, 
\"6db70dc3b6\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"6e0690f576\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"6e06cc804f\": {\"quality\": 0.6308666666666666, \"cost\": 2.536e-05, \"time\": 0.0184}, \"6e24048a2e\": {\"quality\": 0.5406666666666666, \"cost\": 1.4060000000000001e-05, \"time\": 0.025}, \"6e93514f45\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"6eae47102b\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.0263}, \"6ed4cae469\": {\"quality\": 0.7468, \"cost\": 1.25e-05, \"time\": 0.0079}, \"6ef3b7127e\": {\"quality\": 0.6592, \"cost\": 3.52e-06, \"time\": 0.0278}, \"6f60a05c33\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"6fe0b3f929\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"700ab1d309\": {\"quality\": 0.48573333333333335, \"cost\": 2.48e-06, \"time\": 0.0191}, \"7040e83d52\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"7046765af8\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"70b7c92ce8\": {\"quality\": 0.5656749999999999, \"cost\": 1.397e-05, \"time\": 0.0295}, \"70c850e039\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"7112a7e64c\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"71b615468b\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027700000000000002}, \"723fd5589a\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"7250da0f41\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027700000000000002}, \"7274a50778\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"72d022ce33\": {\"quality\": 0.6393, \"cost\": 2.695e-05, \"time\": 0.0368}, \"7347cf0308\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, 
\"736e652158\": {\"quality\": 0.46785, \"cost\": 1.8299999999999998e-06, \"time\": 0.020900000000000002}, \"739b1f81dc\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, \"742a5c0552\": {\"quality\": 0.7030000000000001, \"cost\": 1.426e-05, \"time\": 0.0218}, \"742ec1b2e1\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.028900000000000002}, \"7435fd54f8\": {\"quality\": 0.5656749999999999, \"cost\": 1.397e-05, \"time\": 0.0295}, \"7445d99939\": {\"quality\": 0.43596666666666667, \"cost\": 2.76e-06, \"time\": 0.0283}, \"7466a5f424\": {\"quality\": 0.537525, \"cost\": 5.0800000000000005e-06, \"time\": 0.0416}, \"74a0be215b\": {\"quality\": 0.5161250000000001, \"cost\": 1.4420000000000001e-05, \"time\": 0.0309}, \"74d7f64b8c\": {\"quality\": 0.46785, \"cost\": 1.8299999999999998e-06, \"time\": 0.020900000000000002}, \"7524905580\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"752d9649f2\": {\"quality\": 0.5484, \"cost\": 5.0800000000000005e-06, \"time\": 0.044899999999999995}, \"75d61c2cd0\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.028900000000000002}, \"7604c0aa13\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027699999999999995}, \"76c09db721\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.0375}, \"774f268b66\": {\"quality\": 0.6417499999999999, \"cost\": 2.611e-05, \"time\": 0.0315}, \"7765576286\": {\"quality\": 0.4847, \"cost\": 3.51e-06, \"time\": 0.0381}, \"77c02b00c1\": {\"quality\": 0.48335, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"7801da66b9\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"782d52674e\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"786e5d0af5\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"7878563d63\": {\"quality\": 0.7030000000000001, \"cost\": 1.426e-05, \"time\": 0.0218}, \"79e1ca9b3c\": {\"quality\": 0.6034666666666667, 
\"cost\": 1.445e-05, \"time\": 0.028900000000000002}, \"79fad58f07\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"7a207b42a8\": {\"quality\": 0.4183, \"cost\": 2.2799999999999998e-06, \"time\": 0.0223}, \"7a2cdc546c\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.0289}, \"7a42a77788\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.0217}, \"7a58d3472b\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"7b3937c1f1\": {\"quality\": 0.7030000000000001, \"cost\": 1.426e-05, \"time\": 0.0218}, \"7b6f44618e\": {\"quality\": 0.4917750000000001, \"cost\": 4.52e-06, \"time\": 0.0422}, \"7c62576527\": {\"quality\": 0.561875, \"cost\": 1.498e-05, \"time\": 0.0303}, \"7c89a2b69e\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"7c96c9712f\": {\"quality\": 0.58975, \"cost\": 1.3700000000000001e-05, \"time\": 0.0191}, \"7ca066aa1c\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.0262}, \"7cb5591f27\": {\"quality\": 0.6421, \"cost\": 2.62e-05, \"time\": 0.027}, \"7cf56a7fbc\": {\"quality\": 0.6129000000000001, \"cost\": 1.546e-05, \"time\": 0.033}, \"7d44f0959d\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"7d60c38c5c\": {\"quality\": 0.6309, \"cost\": 1.5e-06, \"time\": 0.0196}, \"7d9b4535ac\": {\"quality\": 0.612775, \"cost\": 1.436e-05, \"time\": 0.0334}, \"7daf7ff182\": {\"quality\": 0.5082000000000001, \"cost\": 4.16e-06, \"time\": 0.0363}, \"7dcedb3d02\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.037500000000000006}, \"7e22f12cd1\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.0262}, \"7e53a50b13\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"7ed07ad40a\": {\"quality\": 0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.030799999999999998}, \"7fa67a7656\": {\"quality\": 0.5114666666666666, \"cost\": 3.32e-06, \"time\": 0.031}, \"7fc6c84bdf\": 
{\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"7ff8a779cc\": {\"quality\": 0.6067333333333332, \"cost\": 1.361e-05, \"time\": 0.0236}, \"801af99400\": {\"quality\": 0.6695333333333333, \"cost\": 1.4e-05, \"time\": 0.0275}, \"806881adcb\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"80ad4122e4\": {\"quality\": 0.5618749999999999, \"cost\": 1.498e-05, \"time\": 0.0303}, \"80be7df955\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"80bf60c422\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 0.0124}, \"81333c7a33\": {\"quality\": 0.6592, \"cost\": 5.28e-06, \"time\": 0.0417}, \"813e75210b\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"81660ae8b2\": {\"quality\": 0.494375, \"cost\": 1.4420000000000001e-05, \"time\": 0.024300000000000002}, \"816958b5d1\": {\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.034199999999999994}, \"81ab2ef3f4\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.0164}, \"829df73946\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"82ea1bd1b9\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"8357183895\": {\"quality\": 0.46540000000000004, \"cost\": 2.67e-06, \"time\": 0.0262}, \"8392a6083a\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"83aee532b7\": {\"quality\": 0.543925, \"cost\": 1.397e-05, \"time\": 0.022899999999999997}, \"83b26646c3\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"83c9e66ec6\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"847a0e5db5\": {\"quality\": 0.6129, \"cost\": 1.546e-05, \"time\": 0.033}, \"847fd49235\": {\"quality\": 0.42799999999999994, \"cost\": 1.08e-06, \"time\": 0.0144}, \"849100224d\": {\"quality\": 0.5837749999999999, \"cost\": 2.572e-05, \"time\": 0.024300000000000002}, \"84b91c37ab\": 
{\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.034199999999999994}, \"8519bef585\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"85c94a5505\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"85eda38404\": {\"quality\": 0.58975, \"cost\": 1.3700000000000001e-05, \"time\": 0.0191}, \"862183bfb9\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"8631e49c94\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.026199999999999998}, \"8668f65f05\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"86bf6375af\": {\"quality\": 0.560775, \"cost\": 1.565e-05, \"time\": 0.040100000000000004}, \"870e2f87b4\": {\"quality\": 0.587075, \"cost\": 4.63e-06, \"time\": 0.0402}, \"887ad124e1\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.021599999999999998}, \"8886cb3082\": {\"quality\": 0.5328999999999999, \"cost\": 3.23e-06, \"time\": 0.0322}, \"88e71efa9b\": {\"quality\": 0.5594250000000001, \"cost\": 1.582e-05, \"time\": 0.03560000000000001}, \"8941621423\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"8950a6efe0\": {\"quality\": 0.5439333333333334, \"cost\": 1.322e-05, \"time\": 0.0197}, \"8961e4d901\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"8974aa89a0\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"89836d2020\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"89a289907e\": {\"quality\": 0.590875, \"cost\": 3.62e-06, \"time\": 0.0394}, \"89a35a09b1\": {\"quality\": 0.48714999999999997, \"cost\": 2.67e-06, \"time\": 0.0328}, \"8a3a35c762\": {\"quality\": 0.626925, \"cost\": 1.6380000000000002e-05, \"time\": 0.0416}, \"8a50695d1f\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.0217}, \"8ab351aa13\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.024399999999999998}, 
\"8ac8b5773a\": {\"quality\": 0.563225, \"cost\": 1.481e-05, \"time\": 0.034800000000000005}, \"8acd758b7f\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"8b10891ea5\": {\"quality\": 0.5678500000000001, \"cost\": 1.666e-05, \"time\": 0.0442}, \"8b721bbc6f\": {\"quality\": 0.48714999999999997, \"cost\": 2.67e-06, \"time\": 0.0328}, \"8b77535cce\": {\"quality\": 0.6174, \"cost\": 1.6210000000000002e-05, \"time\": 0.042800000000000005}, \"8bbbe0f52a\": {\"quality\": 0.4392333333333333, \"cost\": 1.92e-06, \"time\": 0.023}, \"8bc184f385\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.026899999999999997}, \"8bf5c3eadc\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"8bf80a50cb\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.0375}, \"8c274ca255\": {\"quality\": 0.6740250000000001, \"cost\": 1.677e-05, \"time\": 0.0455}, \"8d7594020b\": {\"quality\": 0.5406666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.025}, \"8d79e03266\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"8d90814b94\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"8e1a01da19\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.0262}, \"8e2498635d\": {\"quality\": 0.7176, \"cost\": 2.676e-05, \"time\": 0.0297}, \"8e5842ccbd\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.026899999999999997}, \"8e5daf241e\": {\"quality\": 0.6695333333333333, \"cost\": 1.4e-05, \"time\": 0.0275}, \"8e9715ee01\": {\"quality\": 0.6453666666666668, \"cost\": 2.536e-05, \"time\": 0.0217}, \"8e9b7300d4\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"8f29fab8ac\": {\"quality\": 0.513675, \"cost\": 1.526e-05, \"time\": 0.0362}, \"8f44d89429\": {\"quality\": 0.6884, \"cost\": 1.602e-05, \"time\": 0.035699999999999996}, \"8f4caddfe6\": {\"quality\": 0.5869666666666667, \"cost\": 3.88e-06, 
\"time\": 0.033699999999999994}, \"8f4edde3f0\": {\"quality\": 0.4857333333333333, \"cost\": 2.48e-06, \"time\": 0.0191}, \"8f9cefbc22\": {\"quality\": 0.7030000000000001, \"cost\": 1.426e-05, \"time\": 0.0218}, \"9025e2480f\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"9028588af4\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"9059fd80ad\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"90d5e40c1b\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"90d9a86a2a\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"90ff13783c\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.0375}, \"90ff8eb055\": {\"quality\": 0.426725, \"cost\": 3.1199999999999998e-06, \"time\": 0.030900000000000004}, \"9104e31369\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"918983323f\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"91928dfdd9\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.028900000000000002}, \"91c800af6b\": {\"quality\": 0.47382500000000005, \"cost\": 3.51e-06, \"time\": 0.0348}, \"91e841cfd5\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.037500000000000006}, \"9253901a1f\": {\"quality\": 0.5594250000000001, \"cost\": 1.582e-05, \"time\": 0.0356}, \"9288642e53\": {\"quality\": 0.5291, \"cost\": 2.12e-06, \"time\": 0.0165}, \"92ba9c5be3\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"92c9dcd43b\": {\"quality\": 0.543775, \"cost\": 3.23e-06, \"time\": 0.0355}, \"93011c0821\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"933b4d17dd\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"9373267bdb\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.0375}, \"94010928c6\": {\"quality\": 0.42799999999999994, \"cost\": 1.08e-06, \"time\": 0.0144}, \"9403809e44\": {\"quality\": 
0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"940c88ddc5\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.0164}, \"948f4081ba\": {\"quality\": 0.563225, \"cost\": 1.481e-05, \"time\": 0.0348}, \"94ac356663\": {\"quality\": 0.5114666666666666, \"cost\": 3.32e-06, \"time\": 0.031}, \"94dff9a424\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"9508356a2e\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"9539d0e28c\": {\"quality\": 0.561875, \"cost\": 1.498e-05, \"time\": 0.0303}, \"956bdcc254\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.0224}, \"957d0dafc5\": {\"quality\": 0.6884, \"cost\": 1.602e-05, \"time\": 0.035699999999999996}, \"9594b0c783\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.021599999999999998}, \"95a7b80c2a\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.0289}, \"964c671f18\": {\"quality\": 0.587075, \"cost\": 4.63e-06, \"time\": 0.0402}, \"9679fe2b69\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.021599999999999998}, \"968fc95038\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.0216}, \"96b487c724\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"96c30205f5\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.024399999999999998}, \"96f87d6483\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"972c83b002\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.023599999999999996}, \"977a4d6b6b\": {\"quality\": 0.5484, \"cost\": 5.0800000000000005e-06, \"time\": 0.044899999999999995}, \"97bc30bd83\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"980db5f95f\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.0164}, \"9836765d41\": {\"quality\": 0.5406666666666666, \"cost\": 1.4060000000000001e-05, \"time\": 0.024999999999999998}, \"99569e3937\": {\"quality\": 0.6174, \"cost\": 
1.6210000000000002e-05, \"time\": 0.042800000000000005}, \"99ea16a9a6\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"9a5b39370f\": {\"quality\": 0.5509999999999999, \"cost\": 1.498e-05, \"time\": 0.026999999999999996}, \"9aa4abfb50\": {\"quality\": 0.39899999999999997, \"cost\": 7.2e-07, \"time\": 0.0052}, \"9ad7a98c31\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"9b3fb79bcb\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"9b6d4915f3\": {\"quality\": 0.43270000000000003, \"cost\": 3.6e-06, \"time\": 0.0336}, \"9bae5bafc1\": {\"quality\": 0.5869666666666667, \"cost\": 3.88e-06, \"time\": 0.033699999999999994}, \"9be8a5f317\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.038900000000000004}, \"9c549db0a7\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"9c85f8cfcb\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 0.0124}, \"9c8cc46e6c\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 0.0124}, \"9c97d35a30\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"9cbe7858a2\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.0276}, \"9ce2c3fd98\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.0111}, \"9d18cd0737\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"9e06360bc9\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"9f07a95e69\": {\"quality\": 0.61605, \"cost\": 1.6380000000000002e-05, \"time\": 0.0383}, \"9fb157be35\": {\"quality\": 0.48573333333333335, \"cost\": 2.48e-06, \"time\": 0.0191}, \"a04ac8e33a\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.0217}, \"a04bc6e116\": {\"quality\": 0.5406666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.025}, \"a0b81be5b4\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"a0c85d260e\": 
{\"quality\": 0.5955, \"cost\": 5.47e-06, \"time\": 0.048799999999999996}, \"a0dc9f50ac\": {\"quality\": 0.43596666666666667, \"cost\": 2.76e-06, \"time\": 0.0283}, \"a18225b7b5\": {\"quality\": 0.6884, \"cost\": 1.602e-05, \"time\": 0.035699999999999996}, \"a1d822289e\": {\"quality\": 0.6421, \"cost\": 2.62e-05, \"time\": 0.027}, \"a25596c056\": {\"quality\": 0.513675, \"cost\": 1.526e-05, \"time\": 0.0362}, \"a2811c7324\": {\"quality\": 0.6067333333333332, \"cost\": 1.361e-05, \"time\": 0.0236}, \"a2aa082d14\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027700000000000002}, \"a2cd339ad9\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.040799999999999996}, \"a2fd03e6a5\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"a31e87d7cb\": {\"quality\": 0.4425, \"cost\": 7.2e-07, \"time\": 0.0118}, \"a3e23c327b\": {\"quality\": 0.561875, \"cost\": 1.498e-05, \"time\": 0.030299999999999997}, \"a457f6c300\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.0111}, \"a47de025c8\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"a515a9c8cc\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"a5949b76ec\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"a60dd076b8\": {\"quality\": 0.4392333333333333, \"cost\": 1.92e-06, \"time\": 0.023}, \"a6297a6c56\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"a62b7555b9\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.0389}, \"a63f48e8ca\": {\"quality\": 0.5406666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.025}, \"a6460dbb7c\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"a66a4cf4b0\": {\"quality\": 0.61605, \"cost\": 1.6380000000000002e-05, \"time\": 0.0383}, \"a6e2d69222\": {\"quality\": 0.6129000000000001, \"cost\": 1.546e-05, \"time\": 0.033}, \"a717c4c535\": {\"quality\": 0.4917750000000001, \"cost\": 
4.52e-06, \"time\": 0.0422}, \"a76afe9960\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"a7a6353090\": {\"quality\": 0.6067333333333332, \"cost\": 1.361e-05, \"time\": 0.0236}, \"a80f6535b1\": {\"quality\": 0.59465, \"cost\": 1.286e-05, \"time\": 0.0138}, \"a86b137d7f\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"a88eb1493c\": {\"quality\": 0.5318, \"cost\": 1.95e-06, \"time\": 0.020999999999999998}, \"a89c533d6c\": {\"quality\": 0.5374, \"cost\": 1.4900000000000001e-05, \"time\": 0.0303}, \"a8d8264600\": {\"quality\": 0.6174000000000001, \"cost\": 1.621e-05, \"time\": 0.042800000000000005}, \"a8eb36b210\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.0289}, \"a95b4a6dd0\": {\"quality\": 0.42146666666666666, \"cost\": 2.76e-06, \"time\": 0.025}, \"a9621ea4e6\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.0197}, \"a9721a0a50\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.030999999999999996}, \"a972b02c61\": {\"quality\": 0.7468, \"cost\": 1.25e-05, \"time\": 0.0079}, \"a9c5c4e311\": {\"quality\": 0.6067333333333332, \"cost\": 1.361e-05, \"time\": 0.0236}, \"a9d96670eb\": {\"quality\": 0.5439333333333334, \"cost\": 1.322e-05, \"time\": 0.0197}, \"a9e8c974d3\": {\"quality\": 0.5509999999999999, \"cost\": 1.498e-05, \"time\": 0.026999999999999996}, \"aa08180e36\": {\"quality\": 0.4858, \"cost\": 2.84e-06, \"time\": 0.0283}, \"aa38702a02\": {\"quality\": 0.42799999999999994, \"cost\": 1.08e-06, \"time\": 0.0144}, \"aadbfc418b\": {\"quality\": 0.44250000000000006, \"cost\": 1.08e-06, \"time\": 0.0177}, \"aaeb8b0010\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.0276}, \"ab288ee7f2\": {\"quality\": 0.563225, \"cost\": 1.481e-05, \"time\": 0.0348}, \"ab43b02cb0\": {\"quality\": 0.5082000000000001, \"cost\": 4.16e-06, \"time\": 0.0363}, \"aba1d612cc\": {\"quality\": 0.648825, \"cost\": 2.712e-05, \"time\": 
0.0356}, \"ac208e7a1d\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.0304}, \"ac2224adbe\": {\"quality\": 0.6129000000000001, \"cost\": 1.546e-05, \"time\": 0.033}, \"ac828ffe70\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"ac9fdc1550\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"aca957ecff\": {\"quality\": 0.6417499999999999, \"cost\": 2.611e-05, \"time\": 0.0315}, \"acfe1ed920\": {\"quality\": 0.59465, \"cost\": 1.286e-05, \"time\": 0.0138}, \"ad3efe44c3\": {\"quality\": 0.39899999999999997, \"cost\": 7.2e-07, \"time\": 0.0052}, \"ad41c95a99\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"ad48432c22\": {\"quality\": 0.46540000000000004, \"cost\": 2.67e-06, \"time\": 0.0262}, \"ad6ebbba8d\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"ad90055ef6\": {\"quality\": 0.39899999999999997, \"cost\": 7.2e-07, \"time\": 0.0052}, \"adab1e0fb1\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"ae655ec593\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"ae94b172be\": {\"quality\": 0.5002333333333333, \"cost\": 2.48e-06, \"time\": 0.0224}, \"aec9dc5873\": {\"quality\": 0.5729, \"cost\": 1.286e-05, \"time\": 0.0105}, \"af360c323c\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.018299999999999997}, \"af90567194\": {\"quality\": 0.587075, \"cost\": 4.63e-06, \"time\": 0.0402}, \"afe77d0f89\": {\"quality\": 0.66695, \"cost\": 1.576e-05, \"time\": 0.041400000000000006}, \"b0948c05b6\": {\"quality\": 0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.03899999999999999}, \"b0c4a6640b\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.0262}, \"b12caafd58\": {\"quality\": 0.5076999999999999, \"cost\": 1.358e-05, \"time\": 0.0223}, \"b18168b9c1\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, 
\"b1a7428a01\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.038900000000000004}, \"b1acdebb48\": {\"quality\": 0.6884, \"cost\": 1.602e-05, \"time\": 0.035699999999999996}, \"b1b06f4ee7\": {\"quality\": 0.6421, \"cost\": 2.62e-05, \"time\": 0.027000000000000003}, \"b1cf8d33e5\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"b1e9ab6b1a\": {\"quality\": 0.559425, \"cost\": 1.582e-05, \"time\": 0.0356}, \"b2b057ba41\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"b2e063499d\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"b33412410e\": {\"quality\": 0.5374, \"cost\": 1.4900000000000001e-05, \"time\": 0.0303}, \"b3369775dc\": {\"quality\": 0.588425, \"cost\": 4.46e-06, \"time\": 0.044700000000000004}, \"b3b9205f60\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.027600000000000003}, \"b3c56f0b3c\": {\"quality\": 0.6421, \"cost\": 2.62e-05, \"time\": 0.027}, \"b3f20b706d\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"b4002173ee\": {\"quality\": 0.43270000000000003, \"cost\": 2.4e-06, \"time\": 0.0224}, \"b46d382384\": {\"quality\": 0.6453666666666668, \"cost\": 2.536e-05, \"time\": 0.0217}, \"b4b2482ef9\": {\"quality\": 0.6129000000000001, \"cost\": 1.546e-05, \"time\": 0.033}, \"b4be043238\": {\"quality\": 0.6308666666666666, \"cost\": 2.536e-05, \"time\": 0.0184}, \"b531bd0548\": {\"quality\": 0.581325, \"cost\": 2.6560000000000003e-05, \"time\": 0.0296}, \"b56c312eda\": {\"quality\": 0.4847, \"cost\": 3.51e-06, \"time\": 0.0381}, \"b5e2b41c1c\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"b61ce57a90\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.038900000000000004}, \"b64ddb14f9\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"b67107a43e\": {\"quality\": 0.46785, \"cost\": 1.8299999999999998e-06, \"time\": 0.020900000000000002}, 
\"b67720aa5c\": {\"quality\": 0.5149333333333334, \"cost\": 1.322e-05, \"time\": 0.0131}, \"b682a23b89\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"b69ef5add4\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.0164}, \"b796b7ffd3\": {\"quality\": 0.590875, \"cost\": 3.62e-06, \"time\": 0.0394}, \"b7a0083dc4\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"b7d0e8557f\": {\"quality\": 0.48714999999999997, \"cost\": 2.67e-06, \"time\": 0.0328}, \"b8317a3a8c\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"b8ab3d2f25\": {\"quality\": 0.5869666666666667, \"cost\": 3.88e-06, \"time\": 0.033699999999999994}, \"b8b569172f\": {\"quality\": 0.4969666666666666, \"cost\": 3.32e-06, \"time\": 0.0277}, \"b8f5ab44bb\": {\"quality\": 0.6244750000000001, \"cost\": 1.722e-05, \"time\": 0.0469}, \"b91e7fdb29\": {\"quality\": 0.4183, \"cost\": 2.2799999999999998e-06, \"time\": 0.0223}, \"b932beaaa6\": {\"quality\": 0.6129, \"cost\": 1.546e-05, \"time\": 0.033}, \"b9770c2261\": {\"quality\": 0.6309, \"cost\": 1.5e-06, \"time\": 0.0196}, \"b9bb1e6f8d\": {\"quality\": 0.588425, \"cost\": 4.46e-06, \"time\": 0.044700000000000004}, \"b9d0e8740c\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.038900000000000004}, \"b9da208432\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"ba3223f6ac\": {\"quality\": 0.428, \"cost\": 1.08e-06, \"time\": 0.0144}, \"bb13365175\": {\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.034199999999999994}, \"bb1b3a4d29\": {\"quality\": 0.6019, \"cost\": 1.436e-05, \"time\": 0.0301}, \"bb6536b0ab\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.030999999999999996}, \"bbba9dd6ae\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"bbde69a1ae\": {\"quality\": 0.5028, \"cost\": 1.526e-05, \"time\": 0.0329}, \"bc29a0c0fe\": {\"quality\": 0.6453666666666668, \"cost\": 
2.536e-05, \"time\": 0.0217}, \"bc3d02f753\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"bc4c1fcc64\": {\"quality\": 0.64505, \"cost\": 2.51e-06, \"time\": 0.0237}, \"bd30d27f62\": {\"quality\": 0.6308666666666666, \"cost\": 2.536e-05, \"time\": 0.0184}, \"bd99b2fb21\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.031}, \"bddc7d2a34\": {\"quality\": 0.6019, \"cost\": 1.436e-05, \"time\": 0.0301}, \"be2ae88f70\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.0111}, \"be4740f38f\": {\"quality\": 0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.03899999999999999}, \"bec0c6a95f\": {\"quality\": 0.581325, \"cost\": 2.6560000000000003e-05, \"time\": 0.0296}, \"bed888d4dc\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"bf45e407f6\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"bf5550f320\": {\"quality\": 0.5703, \"cost\": 1.582e-05, \"time\": 0.0389}, \"bf87e58322\": {\"quality\": 0.6695333333333333, \"cost\": 1.4e-05, \"time\": 0.0275}, \"bfed7670ed\": {\"quality\": 0.4969666666666666, \"cost\": 3.32e-06, \"time\": 0.0277}, \"c0541e2220\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.0217}, \"c0e10c0048\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.0276}, \"c127509a7a\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"c13682c7c7\": {\"quality\": 0.494225, \"cost\": 3.68e-06, \"time\": 0.0369}, \"c13d6e78e9\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"c14ff3144d\": {\"quality\": 0.4425, \"cost\": 7.2e-07, \"time\": 0.0118}, \"c1e42ac47b\": {\"quality\": 0.6244750000000001, \"cost\": 1.722e-05, \"time\": 0.0469}, \"c2949aa902\": {\"quality\": 0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.030799999999999998}, \"c31e956b35\": {\"quality\": 0.523375, \"cost\": 3.06e-06, \"time\": 0.0334}, \"c36b525dde\": {\"quality\": 
0.5648333333333334, \"cost\": 2.7e-06, \"time\": 0.030799999999999998}, \"c38326e2bd\": {\"quality\": 0.626925, \"cost\": 1.6380000000000002e-05, \"time\": 0.0416}, \"c3ec2cec59\": {\"quality\": 0.5724666666666667, \"cost\": 3.88e-06, \"time\": 0.0304}, \"c44720575f\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"c48ecefab6\": {\"quality\": 0.49682499999999996, \"cost\": 1.358e-05, \"time\": 0.019000000000000003}, \"c4a64eb40f\": {\"quality\": 0.7176, \"cost\": 2.676e-05, \"time\": 0.0297}, \"c4a80d19b3\": {\"quality\": 0.5800000000000001, \"cost\": 3.62e-06, \"time\": 0.03609999999999999}, \"c4c2826afd\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"c4c94a5527\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"c4e75ee9ba\": {\"quality\": 0.7176, \"cost\": 2.676e-05, \"time\": 0.0297}, \"c4f3e7665d\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.0111}, \"c5471bef57\": {\"quality\": 0.494375, \"cost\": 1.4420000000000001e-05, \"time\": 0.024300000000000002}, \"c54a408db7\": {\"quality\": 0.6067333333333333, \"cost\": 1.361e-05, \"time\": 0.0236}, \"c59cc41335\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"c5a0b065e0\": {\"quality\": 0.5729, \"cost\": 1.286e-05, \"time\": 0.0105}, \"c5a16b834a\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"c5fbe2076f\": {\"quality\": 0.537525, \"cost\": 5.0800000000000005e-06, \"time\": 0.0416}, \"c617370f6b\": {\"quality\": 0.5439333333333334, \"cost\": 1.322e-05, \"time\": 0.0197}, \"c67f782c7f\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.024399999999999998}, \"c691a29c42\": {\"quality\": 0.48090000000000005, \"cost\": 4.52e-06, \"time\": 0.038900000000000004}, \"c6a339987c\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"c772ff3704\": {\"quality\": 0.429175, \"cost\": 2.2799999999999998e-06, \"time\": 0.0256}, \"c7e3f348c2\": 
{\"quality\": 0.61605, \"cost\": 1.6380000000000002e-05, \"time\": 0.0383}, \"c823589ab6\": {\"quality\": 0.5149333333333334, \"cost\": 1.322e-05, \"time\": 0.0131}, \"c82f834e85\": {\"quality\": 0.5775333333333333, \"cost\": 2.87e-06, \"time\": 0.029599999999999998}, \"c85099881f\": {\"quality\": 0.5729, \"cost\": 1.286e-05, \"time\": 0.0105}, \"c935a33384\": {\"quality\": 0.5484, \"cost\": 5.0800000000000005e-06, \"time\": 0.044899999999999995}, \"ca3177461f\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.0276}, \"caa7c0bd6b\": {\"quality\": 0.49876666666666675, \"cost\": 3.15e-06, \"time\": 0.0322}, \"cac6b051e9\": {\"quality\": 0.53425, \"cost\": 3.06e-06, \"time\": 0.036699999999999997}, \"cacb342f64\": {\"quality\": 0.63795, \"cost\": 2.712e-05, \"time\": 0.032299999999999995}, \"cb9948679c\": {\"quality\": 0.5076999999999999, \"cost\": 1.358e-05, \"time\": 0.0223}, \"cbb5eb0e74\": {\"quality\": 0.53425, \"cost\": 3.06e-06, \"time\": 0.036699999999999997}, \"cbc32cbeff\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.029500000000000002}, \"cbd4461293\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"cbe2318045\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.025699999999999997}, \"cc886fe337\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"cc9a6248a0\": {\"quality\": 0.7468, \"cost\": 1.25e-05, \"time\": 0.0079}, \"ccb2335b3f\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"ccdf03a55b\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.0164}, \"ccf72745c1\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027700000000000002}, \"cd1d418732\": {\"quality\": 0.5082000000000001, \"cost\": 4.16e-06, \"time\": 0.0363}, \"cd23c79db1\": {\"quality\": 0.48573333333333335, \"cost\": 2.48e-06, \"time\": 0.0191}, \"cd64fbfcd9\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 
0.023599999999999996}, \"cd85a01e81\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"ce4bc5f348\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"ce980cf86f\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.0183}, \"ceae8b8bb9\": {\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.034199999999999994}, \"cecca90dd2\": {\"quality\": 0.5536, \"cost\": 1.86e-06, \"time\": 0.022199999999999998}, \"cf9538faf0\": {\"quality\": 0.5147333333333334, \"cost\": 2.48e-06, \"time\": 0.0257}, \"cf9d2e224c\": {\"quality\": 0.5508500000000001, \"cost\": 2.12e-06, \"time\": 0.019799999999999998}, \"cfd36f3a8c\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.0263}, \"cffa29a6ef\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"d03596c3de\": {\"quality\": 0.5328999999999999, \"cost\": 3.23e-06, \"time\": 0.0322}, \"d07b766487\": {\"quality\": 0.57275, \"cost\": 1.498e-05, \"time\": 0.0336}, \"d0a0a66d75\": {\"quality\": 0.6129000000000001, \"cost\": 1.546e-05, \"time\": 0.033}, \"d0ce31134c\": {\"quality\": 0.5594250000000001, \"cost\": 1.582e-05, \"time\": 0.0356}, \"d0f9633442\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"d216eab7d8\": {\"quality\": 0.5869666666666667, \"cost\": 3.88e-06, \"time\": 0.033699999999999994}, \"d266c19ac8\": {\"quality\": 0.525825, \"cost\": 2.22e-06, \"time\": 0.0281}, \"d26a70179a\": {\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.034199999999999994}, \"d2af24b59e\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.0224}, \"d2f2dd5cd4\": {\"quality\": 0.5439333333333334, \"cost\": 1.322e-05, \"time\": 0.0197}, \"d302278f85\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.0262}, \"d37dcaea30\": {\"quality\": 0.6174000000000001, \"cost\": 1.6210000000000002e-05, \"time\": 0.042800000000000005}, \"d3a2d50bd7\": {\"quality\": 0.49079999999999996, \"cost\": 1.47e-06, 
\"time\": 0.0183}, \"d3d4185487\": {\"quality\": 0.6016666666666667, \"cost\": 1.462e-05, \"time\": 0.024399999999999998}, \"d3db4cf84d\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"d402233b53\": {\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.034199999999999994}, \"d43fafa19e\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"d446a75eb7\": {\"quality\": 0.7176, \"cost\": 2.676e-05, \"time\": 0.0297}, \"d48ead13da\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.022399999999999996}, \"d5016f4538\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"d55a3613b0\": {\"quality\": 0.695925, \"cost\": 2.751e-05, \"time\": 0.03950000000000001}, \"d58036ba66\": {\"quality\": 0.5002333333333334, \"cost\": 2.48e-06, \"time\": 0.0224}, \"d5a84c782e\": {\"quality\": 0.6695333333333333, \"cost\": 1.4e-05, \"time\": 0.0275}, \"d5b2eef11c\": {\"quality\": 0.68885, \"cost\": 1.325e-05, \"time\": 0.0177}, \"d6040140b9\": {\"quality\": 0.5955, \"cost\": 5.47e-06, \"time\": 0.048799999999999996}, \"d65185c1a4\": {\"quality\": 0.5680999999999999, \"cost\": 1.86e-06, \"time\": 0.0255}, \"d667351f33\": {\"quality\": 0.6174000000000001, \"cost\": 1.621e-05, \"time\": 0.042800000000000005}, \"d6bd3b66ba\": {\"quality\": 0.4135, \"cost\": 1.08e-06, \"time\": 0.0111}, \"d6c4e48eeb\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.0263}, \"d6cbf265ee\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"d705447fd7\": {\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.0342}, \"d73a9aab4e\": {\"quality\": 0.41585, \"cost\": 1.5599999999999999e-06, \"time\": 0.0138}, \"d752c30d07\": {\"quality\": 0.5002333333333333, \"cost\": 2.48e-06, \"time\": 0.0224}, \"d782682359\": {\"quality\": 0.5618749999999999, \"cost\": 1.498e-05, \"time\": 0.0303}, \"d7c0972014\": {\"quality\": 0.5082, \"cost\": 4.16e-06, \"time\": 
0.0363}, \"d867525748\": {\"quality\": 0.5406666666666666, \"cost\": 1.4060000000000001e-05, \"time\": 0.025}, \"d87eb775da\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"d8bab6c09b\": {\"quality\": 0.53045, \"cost\": 4.07e-06, \"time\": 0.0375}, \"d8bcac36e8\": {\"quality\": 0.4969666666666667, \"cost\": 3.32e-06, \"time\": 0.027699999999999995}, \"d8eadc0190\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"d96677d8d4\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.026899999999999997}, \"d98f22270e\": {\"quality\": 0.6129000000000001, \"cost\": 1.546e-05, \"time\": 0.033}, \"d9e2bb21a3\": {\"quality\": 0.5053, \"cost\": 1.47e-06, \"time\": 0.021599999999999998}, \"da95deeb20\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.0289}, \"daaadadcc9\": {\"quality\": 0.4858, \"cost\": 2.84e-06, \"time\": 0.0283}, \"daf855e065\": {\"quality\": 0.525825, \"cost\": 2.22e-06, \"time\": 0.0281}, \"db00594832\": {\"quality\": 0.6129, \"cost\": 1.546e-05, \"time\": 0.033}, \"db19e677c4\": {\"quality\": 0.5149333333333334, \"cost\": 1.322e-05, \"time\": 0.0131}, \"db3c035639\": {\"quality\": 0.626925, \"cost\": 1.6380000000000002e-05, \"time\": 0.0416}, \"db41487005\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.016399999999999998}, \"db6a7482fd\": {\"quality\": 0.68885, \"cost\": 1.325e-05, \"time\": 0.0177}, \"db9060cd27\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"dbce95a072\": {\"quality\": 0.5114666666666666, \"cost\": 3.32e-06, \"time\": 0.031}, \"dc195abe5e\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.0262}, \"dc3f4b7138\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.016399999999999998}, \"dc66bccb1c\": {\"quality\": 0.5002333333333333, \"cost\": 2.48e-06, \"time\": 0.0224}, \"dc90065dea\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"dd0d70fedd\": {\"quality\": 
0.6309, \"cost\": 1.5e-06, \"time\": 0.0196}, \"dddc76b3ca\": {\"quality\": 0.5374, \"cost\": 1.4900000000000001e-05, \"time\": 0.0303}, \"de18bf45e1\": {\"quality\": 0.4425, \"cost\": 3.6e-07, \"time\": 0.0059}, \"de1e56370f\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"df2160ecc8\": {\"quality\": 0.5374, \"cost\": 1.4900000000000001e-05, \"time\": 0.0303}, \"dfda94bd2a\": {\"quality\": 0.42074999999999996, \"cost\": 7.2e-07, \"time\": 0.0085}, \"dff452a9ca\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"e09f75d9d6\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"e0b6a99753\": {\"quality\": 0.7468, \"cost\": 1.25e-05, \"time\": 0.0079}, \"e0cf6587a7\": {\"quality\": 0.50525, \"cost\": 1.4420000000000001e-05, \"time\": 0.0276}, \"e0de4a5929\": {\"quality\": 0.6174000000000001, \"cost\": 1.6210000000000002e-05, \"time\": 0.042800000000000005}, \"e1356fb426\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.0289}, \"e20ba014a1\": {\"quality\": 0.59795, \"cost\": 4.63e-06, \"time\": 0.0435}, \"e21806e3bc\": {\"quality\": 0.5406666666666666, \"cost\": 1.4060000000000001e-05, \"time\": 0.025}, \"e24e97564c\": {\"quality\": 0.66695, \"cost\": 1.576e-05, \"time\": 0.041400000000000006}, \"e2673c1ec8\": {\"quality\": 0.6174, \"cost\": 1.6210000000000002e-05, \"time\": 0.042800000000000005}, \"e26c7bfbdb\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.0197}, \"e2f9980b06\": {\"quality\": 0.6174000000000001, \"cost\": 1.6210000000000002e-05, \"time\": 0.042800000000000005}, \"e3445f7632\": {\"quality\": 0.68885, \"cost\": 1.325e-05, \"time\": 0.0177}, \"e35e5f81a7\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"e376ac53e7\": {\"quality\": 0.5294333333333333, \"cost\": 1.322e-05, \"time\": 0.016399999999999998}, \"e3d8bb56da\": {\"quality\": 0.6067333333333332, \"cost\": 1.361e-05, \"time\": 0.0236}, 
\"e3df4cf041\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"e47dc3abca\": {\"quality\": 0.5082000000000001, \"cost\": 4.16e-06, \"time\": 0.0363}, \"e4b9d4fb41\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"e510bda989\": {\"quality\": 0.6592, \"cost\": 1.76e-06, \"time\": 0.0139}, \"e517cd2222\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"e51b01f418\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"e520dfae5b\": {\"quality\": 0.47872499999999996, \"cost\": 1.8299999999999998e-06, \"time\": 0.0242}, \"e521c9b7e4\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.0217}, \"e54097ad5d\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"e56a16ca66\": {\"quality\": 0.5367, \"cost\": 1.11e-06, \"time\": 0.0157}, \"e5c4abf7ce\": {\"quality\": 0.5632250000000001, \"cost\": 1.481e-05, \"time\": 0.0348}, \"e5d4689312\": {\"quality\": 0.494375, \"cost\": 1.4420000000000001e-05, \"time\": 0.024300000000000002}, \"e62a7b27ae\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.021699999999999997}, \"e6a7aff3bc\": {\"quality\": 0.543925, \"cost\": 1.397e-05, \"time\": 0.022899999999999997}, \"e736999157\": {\"quality\": 0.543925, \"cost\": 1.397e-05, \"time\": 0.022899999999999997}, \"e7517a8ce0\": {\"quality\": 0.7081666666666666, \"cost\": 2.5750000000000002e-05, \"time\": 0.0256}, \"e7520ca5ac\": {\"quality\": 0.5632250000000001, \"cost\": 1.481e-05, \"time\": 0.0348}, \"e7e94ab7a5\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"e887ddf5cc\": {\"quality\": 0.648825, \"cost\": 2.712e-05, \"time\": 0.0356}, \"e94fb5a295\": {\"quality\": 0.5149333333333334, \"cost\": 1.322e-05, \"time\": 0.0131}, \"ea6ecc5653\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.0236}, \"ea8bcb3ae2\": {\"quality\": 
0.5837, \"cost\": 4.7200000000000005e-06, \"time\": 0.03899999999999999}, \"ebbe8b6c4f\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.031}, \"ebdf3abff2\": {\"quality\": 0.6497666666666667, \"cost\": 4.27e-06, \"time\": 0.037599999999999995}, \"ec55dba809\": {\"quality\": 0.561875, \"cost\": 1.498e-05, \"time\": 0.030299999999999997}, \"ecb5f78f37\": {\"quality\": 0.522025, \"cost\": 3.23e-06, \"time\": 0.028899999999999995}, \"ecda3d74cb\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.021699999999999997}, \"ece10c0388\": {\"quality\": 0.6453666666666668, \"cost\": 2.536e-05, \"time\": 0.0217}, \"ed6b5480a5\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"eda630dc85\": {\"quality\": 0.6403333333333333, \"cost\": 3.26e-06, \"time\": 0.0335}, \"edaaee5ed4\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"edb2b764aa\": {\"quality\": 0.5869666666666666, \"cost\": 3.88e-06, \"time\": 0.0337}, \"edc52339db\": {\"quality\": 0.47247500000000003, \"cost\": 3.68e-06, \"time\": 0.0303}, \"ede7071775\": {\"quality\": 0.7081666666666667, \"cost\": 2.5750000000000002e-05, \"time\": 0.0256}, \"ee46042c5d\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"ee68c51f73\": {\"quality\": 0.7468, \"cost\": 2.5e-05, \"time\": 0.0158}, \"ee7b726747\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.0375}, \"eec5f32da9\": {\"quality\": 0.6393, \"cost\": 2.695e-05, \"time\": 0.0368}, \"eed40e4378\": {\"quality\": 0.61985, \"cost\": 1.537e-05, \"time\": 0.0375}, \"eef12d478b\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"ef37b3e0be\": {\"quality\": 0.4183, \"cost\": 2.2799999999999998e-06, \"time\": 0.0223}, \"ef43d497f1\": {\"quality\": 0.48335, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"ef4d4c4a62\": {\"quality\": 0.5869666666666667, \"cost\": 3.88e-06, \"time\": 
0.033699999999999994}, \"ef9a651425\": {\"quality\": 0.6740250000000001, \"cost\": 1.677e-05, \"time\": 0.0455}, \"f0655621af\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"f076b4c9ae\": {\"quality\": 0.5630333333333334, \"cost\": 2.87e-06, \"time\": 0.026299999999999997}, \"f11eddb4ed\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"f1408da253\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.0295}, \"f18cf41929\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}, \"f1aa0b0b42\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"f1bda127f6\": {\"quality\": 0.5114666666666666, \"cost\": 3.32e-06, \"time\": 0.031}, \"f1f373e58e\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"f2a2e91541\": {\"quality\": 0.476275, \"cost\": 2.67e-06, \"time\": 0.029500000000000002}, \"f2c04ed1c8\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"f2cf5db12d\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.031}, \"f366c0dd10\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"f4303a5b4f\": {\"quality\": 0.5922000000000001, \"cost\": 2.6560000000000003e-05, \"time\": 0.0329}, \"f437481e3b\": {\"quality\": 0.5399750000000001, \"cost\": 4.24e-06, \"time\": 0.0363}, \"f4bc6b63a7\": {\"quality\": 0.7081666666666666, \"cost\": 2.5750000000000002e-05, \"time\": 0.0256}, \"f4dc556633\": {\"quality\": 0.608975, \"cost\": 1.537e-05, \"time\": 0.0342}, \"f4deb72db6\": {\"quality\": 0.4102333333333333, \"cost\": 1.92e-06, \"time\": 0.016399999999999998}, \"f4ef2b9c33\": {\"quality\": 0.6034666666666667, \"cost\": 1.445e-05, \"time\": 0.0289}, \"f566d6d6a1\": {\"quality\": 0.6161666666666666, \"cost\": 1.462e-05, \"time\": 0.0277}, \"f5b9a94dcc\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"f5c27e7172\": {\"quality\": 
0.49079999999999996, \"cost\": 1.47e-06, \"time\": 0.018299999999999997}, \"f5e53d963b\": {\"quality\": 0.4376, \"cost\": 1.5599999999999999e-06, \"time\": 0.0171}, \"f614235c15\": {\"quality\": 0.39899999999999997, \"cost\": 3.6e-07, \"time\": 0.0026}, \"f6546149e3\": {\"quality\": 0.7081666666666666, \"cost\": 2.5750000000000002e-05, \"time\": 0.0256}, \"f74ec023e4\": {\"quality\": 0.494225, \"cost\": 3.68e-06, \"time\": 0.0369}, \"f7b048bd54\": {\"quality\": 0.5742666666666667, \"cost\": 3.71e-06, \"time\": 0.0349}, \"f7c4df993e\": {\"quality\": 0.4763, \"cost\": 1.47e-06, \"time\": 0.015}, \"f854533145\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"f89b8a1930\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.0262}, \"f93d9a2693\": {\"quality\": 0.5020333333333333, \"cost\": 2.31e-06, \"time\": 0.0269}, \"f97d91a249\": {\"quality\": 0.5406666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.025}, \"f99096d89c\": {\"quality\": 0.543775, \"cost\": 3.23e-06, \"time\": 0.0355}, \"f9e8e221f3\": {\"quality\": 0.43596666666666667, \"cost\": 2.76e-06, \"time\": 0.0283}, \"fa38879eab\": {\"quality\": 0.538875, \"cost\": 4.9100000000000004e-06, \"time\": 0.0461}, \"fa71111570\": {\"quality\": 0.55235, \"cost\": 1.481e-05, \"time\": 0.0315}, \"fa7882d46b\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"fa906520d1\": {\"quality\": 0.5413250000000001, \"cost\": 4.07e-06, \"time\": 0.0408}, \"faabebaa30\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"fb0339a7d0\": {\"quality\": 0.5304500000000001, \"cost\": 4.07e-06, \"time\": 0.0375}, \"fb216ad6b3\": {\"quality\": 0.43270000000000003, \"cost\": 1.2e-06, \"time\": 0.0112}, \"fb6216880a\": {\"quality\": 0.522025, \"cost\": 3.23e-06, \"time\": 0.028899999999999995}, \"fba499b89d\": {\"quality\": 0.561875, \"cost\": 1.498e-05, \"time\": 0.0303}, \"fbc010a368\": {\"quality\": 0.5261666666666667, \"cost\": 1.4060000000000001e-05, \"time\": 0.0217}, 
\"fbc02e2e07\": {\"quality\": 0.5406666666666666, \"cost\": 1.4060000000000001e-05, \"time\": 0.024999999999999998}, \"fbd6c45271\": {\"quality\": 0.5329, \"cost\": 3.23e-06, \"time\": 0.0322}, \"fc0a156e16\": {\"quality\": 0.6789666666666667, \"cost\": 1.501e-05, \"time\": 0.0316}, \"fc1fd5bf54\": {\"quality\": 0.47872499999999996, \"cost\": 1.8299999999999998e-06, \"time\": 0.0242}, \"fc6967a75b\": {\"quality\": 0.49682499999999996, \"cost\": 1.358e-05, \"time\": 0.019000000000000003}, \"fc73c3b0fa\": {\"quality\": 0.7468, \"cost\": 1.25e-05, \"time\": 0.0079}, \"fce38334b2\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.033600000000000005}, \"fce5fca128\": {\"quality\": 0.6174000000000001, \"cost\": 1.6210000000000002e-05, \"time\": 0.042800000000000005}, \"fd0709359e\": {\"quality\": 0.6309, \"cost\": 7.5e-07, \"time\": 0.0098}, \"fd1f809d64\": {\"quality\": 0.5922333333333333, \"cost\": 1.361e-05, \"time\": 0.0203}, \"fd2c994a9d\": {\"quality\": 0.5548, \"cost\": 1.397e-05, \"time\": 0.026199999999999998}, \"fddccfbf94\": {\"quality\": 0.5114666666666667, \"cost\": 3.32e-06, \"time\": 0.030999999999999996}, \"fe7fa741b4\": {\"quality\": 0.6129000000000001, \"cost\": 1.546e-05, \"time\": 0.033}, \"fe9e1fec71\": {\"quality\": 0.6129, \"cost\": 1.546e-05, \"time\": 0.033}, \"fea4734c09\": {\"quality\": 0.54595, \"cost\": 2.96e-06, \"time\": 0.025099999999999997}, \"fef1ca27fa\": {\"quality\": 0.4917750000000001, \"cost\": 4.52e-06, \"time\": 0.0422}, \"ff11cb6a7a\": {\"quality\": 0.48335000000000006, \"cost\": 3.68e-06, \"time\": 0.0336}, \"ff171e34e2\": {\"quality\": 0.4875333333333334, \"cost\": 2.31e-06, \"time\": 0.023599999999999996}, \"ff1c958e21\": {\"quality\": 0.51495, \"cost\": 1.11e-06, \"time\": 0.0124}, \"ff8df4ace9\": {\"quality\": 0.5374, \"cost\": 1.4900000000000001e-05, \"time\": 0.0303}, \"ff8e68049a\": {\"quality\": 0.42473333333333335, \"cost\": 1.92e-06, \"time\": 0.019700000000000002}}"
  },
  {
    "path": "abacus-research/cuad-demo.py",
    "content": "import argparse\nimport json\nimport os\nimport string\n\nimport numpy as np\nimport pandas as pd\nfrom cuad_data_loader import load_cuad_data\n\nimport palimpzest as pz\nfrom palimpzest.constants import Model\n\nCUAD_CATEGORIES = [\n    {\n        \"Category\": \"Document Name\",\n        \"Description\": \"The name of the contract\",\n        \"Answer Format\": \"Contract Name\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Parties\",\n        \"Description\": \"The two or more parties who signed the contract\",\n        \"Answer Format\": \"Entity or individual names\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Agreement Date\",\n        \"Description\": \"The date of the contract\",\n        \"Answer Format\": \"Date (mm/dd/yyyy)\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Effective Date\",\n        \"Description\": \"The date when the contract is effective\\u00a0\",\n        \"Answer Format\": \"Date (mm/dd/yyyy)\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Expiration Date\",\n        \"Description\": \"On what date will the contract's initial term expire?\",\n        \"Answer Format\": \"Date (mm/dd/yyyy) / Perpetual\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Renewal Term\",\n        \"Description\": \"What is the renewal term after the initial term expires? 
This includes automatic extensions and unilateral extensions with prior notice.\",\n        \"Answer Format\": \"[Successive] number of years/months / Perpetual\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Notice Period to Terminate Renewal\",\n        \"Description\": \"What is the notice period required to terminate renewal?\",\n        \"Answer Format\": \"Number of days/months/year(s)\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Governing Law\",\n        \"Description\": \"Which state/country's law governs the interpretation of the contract?\",\n        \"Answer Format\": \"Name of a US State / non-US Province, Country\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Most Favored Nation\",\n        \"Description\": \"Is there a clause that if a third party gets better terms on the licensing or sale of technology/goods/services described in the contract, the buyer of such technology/goods/services under the contract shall be entitled to those better terms?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Non-Compete\",\n        \"Description\": \"Is there a restriction on the ability of a party to compete with the counterparty or operate in a certain geography or business or technology sector?\\u00a0\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"Exclusivity\",\n        \"Description\": \"Is there an exclusive dealing\\u00a0 commitment with the counterparty? 
This includes a commitment to procure all \\u201crequirements\\u201d from one party of certain technology, goods, or services or a prohibition on licensing or selling technology, goods or services to third parties, or a prohibition on\\u00a0 collaborating or working with other parties), whether during the contract or\\u00a0 after the contract ends (or both).\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"No-Solicit of Customers\",\n        \"Description\": \"Is a party restricted from contracting or soliciting customers or partners of the counterparty, whether during the contract or after the contract ends (or both)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"Competitive Restriction Exception\",\n        \"Description\": \"This category includes the exceptions or carveouts to Non-Compete, Exclusivity and No-Solicit of Customers above.\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"No-Solicit of Employees\",\n        \"Description\": \"Is there a restriction on a party\\u2019s soliciting or hiring employees and/or contractors from the\\u00a0 counterparty, whether during the contract or after the contract ends (or both)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Non-Disparagement\",\n        \"Description\": \"Is there a requirement on a party not to disparage the counterparty?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Termination for Convenience\",\n        \"Description\": \"Can a party terminate this\\u00a0 contract without cause (solely by giving a notice and allowing a waiting\\u00a0 period to expire)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        
\"Category\": \"Rofr/Rofo/Rofn\",\n        \"Description\": \"Is there a clause granting one party a right of first refusal, right of first offer or right of first negotiation to purchase, license, market, or distribute equity interest, technology, assets, products or services?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Change of Control\",\n        \"Description\": \"Does one party have the right to terminate or is consent or notice required of the counterparty if such party undergoes a change of control, such as a merger, stock sale, transfer of all or substantially all of its assets or business, or assignment by operation of law?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 3\",\n    },\n    {\n        \"Category\": \"Anti-Assignment\",\n        \"Description\": \"Is consent or notice required of a party if the contract is assigned to a third party?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 3\",\n    },\n    {\n        \"Category\": \"Revenue/Profit Sharing\",\n        \"Description\": \"Is one party required to share revenue or profit with the counterparty for any technology, goods, or\\u00a0services?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Price Restrictions\",\n        \"Description\": \"Is there a restriction on the\\u00a0 ability of a party to raise or reduce prices of technology, goods, or\\u00a0 services provided?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Minimum Commitment\",\n        \"Description\": \"Is there a minimum order size or minimum amount or units per-time period that one party must buy from the counterparty under the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Volume 
Restriction\",\n        \"Description\": \"Is there a fee increase or consent requirement, etc. if one party\\u2019s use of the product/services exceeds certain threshold?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"IP Ownership Assignment\",\n        \"Description\": \"Does intellectual property created\\u00a0 by one party become the property of the counterparty, either per the terms of the contract or upon the occurrence of certain events?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Joint IP Ownership\",\n        \"Description\": \"Is there any clause providing for joint or shared ownership of intellectual property between the parties to the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"License Grant\",\n        \"Description\": \"Does the contract contain a license granted by one party to its counterparty?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Non-Transferable License\",\n        \"Description\": \"Does the contract limit the ability of a party to transfer the license being granted to a third party?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Affiliate License-Licensor\",\n        \"Description\": \"Does the contract contain a license grant by affiliates of the licensor or that includes intellectual property of affiliates of the licensor?\\u00a0\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Affiliate License-Licensee\",\n        \"Description\": \"Does the contract contain a license grant to a licensee (incl. 
sublicensor) and the affiliates of such licensee/sublicensor?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Unlimited/All-You-Can-Eat-License\",\n        \"Description\": \"Is there a clause granting one party an \\u201centerprise,\\u201d \\u201call you can eat\\u201d or unlimited usage license?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Irrevocable or Perpetual License\",\n        \"Description\": \"Does the contract contain a\\u00a0 license grant that is irrevocable or perpetual?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Source Code Escrow\",\n        \"Description\": \"Is one party required to deposit its source code into escrow with a third party, which can be released to the counterparty upon the occurrence of certain events (bankruptcy,\\u00a0 insolvency, etc.)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Post-Termination Services\",\n        \"Description\": \"Is a party subject to obligations after the termination or expiration of a contract, including any post-termination transition, payment, transfer of IP, wind-down, last-buy, or similar commitments?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 5\",\n    },\n    {\n        \"Category\": \"Audit Rights\",\n        \"Description\": \"Does a party have the right to\\u00a0 audit the books, records, or physical locations of the counterparty to ensure compliance with the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 5\",\n    },\n    {\n        \"Category\": \"Uncapped Liability\",\n        \"Description\": \"Is a party\\u2019s liability uncapped upon the breach of its obligation in the contract? 
This also includes uncap liability for a particular type of breach such as IP infringement or breach of confidentiality obligation.\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 6\",\n    },\n    {\n        \"Category\": \"Cap on Liability\",\n        \"Description\": \"Does the contract include a cap on liability upon the breach of a party\\u2019s obligation? This includes time limitation for the counterparty to bring claims or maximum amount for recovery.\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 6\",\n    },\n    {\n        \"Category\": \"Liquidated Damages\",\n        \"Description\": \"Does the contract contain a clause that would award either party liquidated damages for breach or a fee upon the termination of a contract (termination fee)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Warranty Duration\",\n        \"Description\": \"What is the duration of any\\u00a0 warranty against defects or errors in technology, products, or services\\u00a0 provided under the contract?\",\n        \"Answer Format\": \"Number of months or years\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Insurance\",\n        \"Description\": \"Is there a requirement for insurance that must be maintained by one party for the benefit of the counterparty?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Covenant Not to Sue\",\n        \"Description\": \"Is a party restricted from contesting the validity of the counterparty\\u2019s ownership of intellectual property or otherwise bringing a claim against the counterparty for matters unrelated to the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Third Party Beneficiary\",\n        \"Description\": \"Is there a non-contracting party who is a 
beneficiary to some or all of the clauses in the contract and therefore can enforce its rights against a contracting party?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n]\n\nNUM_FIELDS_TO_EXTRACT_PER_CONTRACT = 41\n\n# 0.15 is used in the Doc-ETL paper. It should be 0.5 for the actual benchmark.\nIOU_THRESH = 0.15\n\ndef get_label_df(num_contracts: int = 1, seed: int=42) -> pd.DataFrame:\n    dataset = load_cuad_data(split=\"test\")\n\n    # get the set of unique contract titles; to ensure the order of the contracts is\n    # preserved, we use a list rather than using python's set()\n    contract_titles = []\n    for row in dataset:\n        if row[\"title\"] not in contract_titles:\n            contract_titles.append(row[\"title\"])\n\n    # shuffle the contracts for the given seed\n    rng = np.random.default_rng(seed=seed)\n    rng.shuffle(contract_titles)\n\n    # get the first num_contracts\n    contract_titles = contract_titles[:num_contracts]\n\n    # construct the dataset one contract at a time\n    final_label_dataset = []\n    for title in contract_titles:\n        # get the rows for this contract\n        contract_rows = [row for row in dataset if row[\"title\"] == title]\n\n        # construct the contract; we get the contract_id and contract text from the first row\n        contract = {\n            \"contract_id\": contract_rows[0][\"id\"],\n            \"title\": title,\n            \"contract\": contract_rows[0][\"context\"],\n        }\n\n        # add the labels\n        category_names = list(map(lambda category: category[\"Category\"], CUAD_CATEGORIES))\n        contract.update({category_name: [] for category_name in category_names})\n        for row in contract_rows:\n            category_name = row[\"id\"].split(\"__\")[-1].split(\"_\")[0].strip()\n            category_name = category_name.replace(\" For \", \" for \")\n            category_name = category_name.replace(\" Of \", \" of \")\n            
category_name = category_name.replace(\" On \", \" on \")\n            category_name = category_name.replace(\" Or \", \" or \")\n            category_name = category_name.replace(\" To \", \" to \")\n            category_name = category_name.replace(\"Ip\", \"IP\")\n            assert category_name in category_names, f\"Unknown category {category_name}\"\n\n            # Extract text from answers list (handles both old and new format)\n            answer_texts = []\n            if isinstance(row[\"answers\"], list):\n                answer_texts = [ans[\"text\"] for ans in row[\"answers\"]] if row[\"answers\"] else []\n            else:\n                answer_texts = row[\"answers\"].get(\"text\", [])\n            contract[category_name].extend(answer_texts)\n\n        # add the contract to the dataset\n        final_label_dataset.append(contract)\n\n    return pd.DataFrame(final_label_dataset)\n\n\n#  Return the Jaccard similarity between two strings\ndef get_jaccard(label, pred):\n    remove_tokens = [c for c in string.punctuation if c != \"/\"]\n    for token in remove_tokens:\n        label = label.replace(token, \"\")\n        pred = pred.replace(token, \"\")\n    label = label.lower()\n    pred = pred.lower()\n    label = label.replace(\"/\", \" \")\n    pred = pred.replace(\"/\", \" \")\n\n    label_words = set(label.split(\" \"))\n    pred_words = set(pred.split(\" \"))\n\n    intersection = label_words.intersection(pred_words)\n    union = label_words.union(pred_words)\n    jaccard = len(intersection) / len(union)\n    return jaccard\n\n\n# Find the number of true positives, false positives, and false negatives for each entry\n# (one field extracted from each contract) by comparing the labels and predictions.\n# Labels and preds are lists of strings\ndef evaluate_entry(labels, preds, substr_ok):\n    tp, fp, fn = 0, 0, 0\n\n    # jaccard similarity expects strings\n    # TODO: This is a hack, ideally, the return type of the preds should be known\n    for 
idx, pred in enumerate(preds):\n        if not isinstance(pred, str):\n            print(f\"Expected string, but got {pred}\")\n            preds[idx] = str(pred)\n\n    # first check if labels is empty\n    if len(labels) == 0:\n        if len(preds) > 0:\n            fp += len(preds)  # false positive for each one\n    else:\n        for ans in labels:\n            assert len(ans) > 0\n            # check if there is a match\n            match_found = False\n            for pred in preds:\n                if substr_ok:\n                    is_match = get_jaccard(ans, pred) >= IOU_THRESH or ans in pred\n                else:\n                    is_match = get_jaccard(ans, pred) >= IOU_THRESH\n                if is_match:\n                    match_found = True\n\n            if match_found:\n                tp += 1\n            else:\n                fn += 1\n\n        # now also get any fps by looping through preds\n        for pred in preds:\n            # Check if there's a match. if so, don't count (don't want to double count based on the above)\n            # but if there's no match, then this is a false positive.\n            # (Note: we get the true positives in the above loop instead of this loop so that we don't double count\n            # multiple predictions that are matched with the same answer.)\n            match_found = False\n            for ans in labels:\n                assert len(ans) > 0\n                if substr_ok:\n                    is_match = get_jaccard(ans, pred) >= IOU_THRESH or ans in pred\n                else:\n                    is_match = get_jaccard(ans, pred) >= IOU_THRESH\n                if is_match:\n                    match_found = True\n\n            if not match_found:\n                fp += 1\n\n    return tp, fp, fn\n\n\ndef handle_empty_preds(preds):\n    if preds is None or (  # noqa: SIM114\n        isinstance(preds, str) and (preds == \"\" or preds == \" \" or preds == \"null\" or preds == \"None\")\n    ):\n    
    return []\n    elif isinstance(preds, float) and np.isnan(preds):\n        return []\n    if not isinstance(preds, (list, np.ndarray)):\n        return [preds]\n    return preds\n\n\nclass CUADValidator(pz.Validator):\n    def __init__(self, num_contracts: int = 1, seed: int=42):\n        super().__init__()\n        self.num_contracts = num_contracts\n        self.seed = seed\n\n        # get clean names for the categories\n        self.category_names = list(map(lambda category: category[\"Category\"], CUAD_CATEGORIES))\n\n        # compute mapping from contract_id --> label\n        self.contract_id_to_label = self._compute_contract_id_to_labels()\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        tps, fps, fns = 0, 0, 0\n        for field in fields:\n            preds = handle_empty_preds(output.get(field))\n            labels = self.contract_id_to_label[input_record[\"contract_id\"]][field]\n            entry_tp, entry_fp, entry_fn = evaluate_entry(labels, preds, substr_ok=True) if field == \"Parties\" else evaluate_entry(labels, preds, substr_ok=False)\n            tps += entry_tp\n            fps += entry_fp\n            fns += entry_fn\n        precision = tps / (tps + fps) if tps + fps > 0 else 0.0\n        recall = tps / (tps + fns) if tps + fns > 0 else 0.0\n        f1 = 2 * precision * recall / (precision + recall) if precision + recall > 0 else 0.0\n\n        return f1\n\n    def _compute_contract_id_to_labels(self):\n        # load full train dataset\n        dataset = load_cuad_data(split=\"train\")\n\n        # get the set of unique contract titles; to ensure the order of the contracts is\n        # preserved, we use a list rather than using python's set()\n        contract_titles = []\n        for row in dataset:\n            if row[\"title\"] not in contract_titles:\n                contract_titles.append(row[\"title\"])\n\n        # shuffle the contracts for the given seed\n        rng = 
np.random.default_rng(seed=self.seed)\n        rng.shuffle(contract_titles)\n\n        # get the first num_contracts\n        contract_titles = contract_titles[:self.num_contracts]\n\n        # construct the mapping from contract_id to labels\n        contract_id_to_labels = {}\n        for title in contract_titles:\n            # get the rows for this contract\n            contract_rows = [row for row in dataset if row[\"title\"] == title]\n\n            # get the contract_id from the first row\n            contract_id = contract_rows[0][\"id\"]\n\n            # get the labels\n            labels = {category: [] for category in self.category_names}\n            for row in contract_rows:\n                category_name = row[\"id\"].split(\"__\")[-1].split(\"_\")[0].strip()\n                category_name = category_name.replace(\" For \", \" for \")\n                category_name = category_name.replace(\" Of \", \" of \")\n                category_name = category_name.replace(\" On \", \" on \")\n                category_name = category_name.replace(\" Or \", \" or \")\n                category_name = category_name.replace(\" To \", \" to \")\n                category_name = category_name.replace(\"Ip\", \"IP\")\n                assert category_name in self.category_names, f\"Unknown category {category_name}\"\n\n                # Extract text from answers list (handles both old and new format)\n                answer_texts = []\n                if isinstance(row[\"answers\"], list):\n                    answer_texts = [ans[\"text\"] for ans in row[\"answers\"]] if row[\"answers\"] else []\n                else:\n                    answer_texts = row[\"answers\"].get(\"text\", [])\n                labels[category_name].extend(answer_texts)\n\n            # update the dictionary\n            contract_id_to_labels[contract_id] = labels\n\n        return contract_id_to_labels\n\n\nclass CUADDataset(pz.IterDataset):\n    def __init__(self, num_contracts: int = 1, 
split: str = \"train\", seed: int=42):\n        self.num_contracts = num_contracts\n        self.split = split\n        self.seed = seed\n\n        input_cols = [\n            {\"name\": \"contract_id\", \"type\": str, \"desc\": \"The id of the contract to be analyzed\"},\n            {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the contract to be analyzed\"},\n            {\"name\": \"contract\", \"type\": str, \"desc\": \"The content of the contract to be analyzed\"},\n        ]\n        super().__init__(id=\"cuad\", schema=input_cols)\n\n        # convert the dataset into a list of dictionaries where each row is for a single contract\n        dataset = load_cuad_data(split=split)\n        self.dataset = self._construct_dataset(dataset, num_contracts, seed)\n\n\n    def _construct_dataset(self, dataset, num_contracts, seed: int=42):\n        # get the set of unique contract titles; to ensure the order of the contracts is\n        # preserved, we use a list rather than using python's set()\n        contract_titles = []\n        for row in dataset:\n            if row[\"title\"] not in contract_titles:\n                contract_titles.append(row[\"title\"])\n\n        # shuffle the contracts for the given seed\n        rng = np.random.default_rng(seed=seed)\n        rng.shuffle(contract_titles)\n\n        # get the first num_contracts\n        contract_titles = contract_titles[:num_contracts]\n\n        # construct the dataset one contract at a time\n        new_dataset = []\n        for title in contract_titles:\n            # get the rows for this contract\n            contract_rows = [row for row in dataset if row[\"title\"] == title]\n\n            # construct the contract; we get the contract_id and contract text from the first row\n            contract = {\n                \"contract_id\": contract_rows[0][\"id\"],\n                \"title\": title,\n                \"contract\": contract_rows[0][\"context\"],\n            }\n\n   
         # add the rows to the dataset\n            new_dataset.append(contract)\n\n        return new_dataset\n\n    def __len__(self):\n        return self.num_contracts\n\n    def __getitem__(self, idx: int):\n        return self.dataset[idx]\n\n# Compute the precision and recall for the entire dataset.\n# Each row in the dataframes should correspond to a contract.\n# The columns should be the extracted fields (categories in CUAD_CATEGORIES).\ndef compute_precision_recall(label_df, preds_df):\n    tp, fp, fn = 0, 0, 0\n\n    label_df = label_df.sort_values(\"contract_id\").reset_index(drop=True)\n    preds_df = preds_df.sort_values(\"contract_id\").reset_index(drop=True)\n\n    assert label_df.shape == preds_df.shape, (\n        f\"Label and prediction dataframes have different shapes, label shape: {label_df.shape} vs preds shape {preds_df.shape}\"\n    )\n\n    categories = [category[\"Category\"] for category in CUAD_CATEGORIES]\n\n    for label_row, pred_row in zip(label_df.iterrows(), preds_df.iterrows()):\n        assert label_row[1][\"contract_id\"] == pred_row[1][\"contract_id\"], (\n            f\"IDs do not match. 
label id: {label_row[1]['contract_id']} vs pred id: {pred_row[1]['contract_id']}\"\n        )\n        for category in categories:\n            substr_ok = \"Parties\" in category\n\n            labels = label_row[1][category]\n            assert isinstance(labels, list)\n\n            preds = pred_row[1][category]\n            preds = handle_empty_preds(preds)\n\n            entry_tp, entry_fp, entry_fn = evaluate_entry(labels, preds, substr_ok)\n            tp += entry_tp\n            fp += entry_fp\n            fn += entry_fn\n\n    precision = tp / (tp + fp) if tp + fp > 0 else np.nan\n    recall = tp / (tp + fn) if tp + fn > 0 else np.nan\n\n    return precision, recall\n\n\ndef parse_arguments():\n    parser = argparse.ArgumentParser(description=\"Run CUAD demo\")\n    parser.add_argument(\"--mode\", type=str, help=\"one-convert or separate-converts\", default=\"one-convert\")\n    parser.add_argument(\"--test\", type=str, help=\"test time compute active or inactive\", default=\"active\")\n    parser.add_argument(\"--constrained\", default=False, action=\"store_true\", help=\"Use constrained objective\")\n    parser.add_argument(\"--gpt4-mini-only\", default=False, action=\"store_true\", help=\"Use only GPT-4o-mini\")\n    parser.add_argument(\n        \"--sentinel-execution-strategy\",\n        default=\"mab\",\n        type=str,\n        help=\"The engine to use. One of mab or random\",\n    )\n    parser.add_argument(\n        \"--execution-strategy\",\n        default=\"parallel\",\n        type=str,\n        help=\"The plan executor to use. One of sequential, pipelined, parallel\",\n    )\n    parser.add_argument(\n        \"--optimizer-strategy\",\n        default=\"pareto\",\n        type=str,\n        help=\"The optimizer to use. 
One of pareto or greedy\",\n    )\n    parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\n        \"--seed\",\n        default=42,\n        type=int,\n        help=\"Seed used to initialize RNG for MAB sampling algorithm\",\n    )\n    parser.add_argument(\n        \"--k\",\n        default=10,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--j\",\n        default=3,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--sample-budget\",\n        default=100,\n        type=int,\n        help=\"Total sample budget in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--exp-name\",\n        default=None,\n        type=str,\n        help=\"The experiment name.\",\n    )\n    parser.add_argument(\n        \"--priors-file\",\n        default=None,\n        type=str,\n        help=\"A file with a dictionary mapping physical operator ids to prior belief on their performance\",\n    )\n    parser.add_argument(\n        \"--quality\",\n        default=None,\n        type=float,\n        help=\"Quality threshold\",\n    )\n    parser.add_argument(\n        \"--policy\",\n        default=\"maxquality\",\n        type=str,\n        help=\"One of 'maxquality', 'mincost', 'minlatency', 'mincostatfixedquality', 'minlatencyatfixedquality' (the last two require --quality)\",\n    )\n    return parser.parse_args()\n\n\ndef build_cuad_query(dataset, mode):\n    assert mode in [\"one-convert\", \"separate-converts\"]\n\n    if mode == \"one-convert\":\n        cols = []\n        for category in CUAD_CATEGORIES:\n            desc = (\n                f\"Extract the text spans (if they exist) from the contract corresponding to: {category['Description']}. If no spans exist, return an empty list. 
Quote text spans verbatim (do not summarize or paraphrase).\"\n            )\n            cols.append({\"name\": category[\"Category\"], \"type\": list[str], \"desc\": desc})\n\n        desc = \"Extract the text spans (if they exist) from the contract.\"\n        dataset = dataset.sem_map(cols, depends_on=[\"contract\"])\n    elif mode == \"separate-converts\":\n        for category in CUAD_CATEGORIES:\n            desc = (\n                f\"Extract the text spans (if they exist) from the contract corresponding to: {category['Description']}. If no spans exist, return an empty list. Quote text spans verbatim (do not summarize or paraphrase).\"\n            )\n            dataset = dataset.sem_map(\n                [{\"name\": category[\"Category\"], \"type\": list[str], \"desc\": desc}],\n                depends_on=[\"contract\"],\n            )\n\n    return dataset\n\n\ndef main():\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    args = parse_arguments()\n\n    # create directory for profiling data\n    os.makedirs(\"opt-profiling-data\", exist_ok=True)\n\n    # create validator for CUAD\n    validator = CUADValidator(num_contracts=25, seed=args.seed)\n\n    # create datasets for CUAD\n    dataset = CUADDataset(split=\"test\", num_contracts=100, seed=args.seed)\n    train_dataset = CUADDataset(split=\"train\", num_contracts=25, seed=args.seed)\n    train_dataset = {train_dataset.id: train_dataset}\n    print(\"Created datasets\")\n\n    # build and run the CUAD query\n    query = build_cuad_query(dataset, args.mode)\n    print(\"Built query; Starting query execution\")\n\n    # set the optimization policy; constraint set to 25% percentile from unconstrained plans\n    policy = pz.MaxQuality() if not args.constrained else pz.MaxQualityAtFixedCost(max_cost=2.759)\n    if 
args.policy == \"mincost\":\n        policy = pz.MinCost()\n    elif args.policy == \"minlatency\":\n        policy = pz.MinTime()\n    elif args.quality is not None and args.policy == \"mincostatfixedquality\":\n        policy = pz.MinCostAtFixedQuality(min_quality=args.quality)\n    elif args.quality is not None and args.policy == \"minlatencyatfixedquality\":\n        policy = pz.MinTimeAtFixedQuality(min_quality=args.quality)\n    print(f\"USING POLICY: {policy}\")\n\n    # set models\n    models = [Model.GPT_4o_MINI] if args.gpt4_mini_only else [\n        Model.GPT_4o,\n        Model.GPT_4o_MINI,\n        Model.LLAMA3_1_8B,\n        Model.LLAMA3_3_70B,\n        # Model.MIXTRAL, # NOTE: only available in tag `abacus-paper-experiments`\n        Model.DEEPSEEK_R1_DISTILL_QWEN_1_5B,\n    ]\n\n    sentinel_strategy = args.sentinel_execution_strategy\n    optimizer_strategy = args.optimizer_strategy\n    execution_strategy = args.execution_strategy\n    seed = args.seed\n    k = args.k\n    j = args.j\n    sample_budget = args.sample_budget\n    exp_name = (\n        f\"cuad-final-{sentinel_strategy}-k{k}-j{j}-budget{sample_budget}-seed{seed}\"\n        if args.exp_name is None\n        else args.exp_name\n    )\n    priors = None\n    if args.priors_file is not None:\n        with open(args.priors_file) as f:\n            priors = json.load(f)\n\n    config = pz.QueryProcessorConfig(\n        policy=policy,\n        verbose=False,\n        optimizer_strategy=optimizer_strategy,\n        sentinel_execution_strategy=sentinel_strategy,\n        execution_strategy=execution_strategy,\n        max_workers=64,\n        available_models=models,\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=True,\n        k=k,\n        j=j,\n        sample_budget=sample_budget,\n        seed=seed,\n        exp_name=exp_name,\n        priors=priors,\n        dont_use_priors=(priors is None),\n 
   )\n\n    print(f\"EXPERIMENT NAME: {exp_name}\")\n    data_record_collection = query.optimize_and_run(config=config, train_dataset=train_dataset, validator=validator)\n    print(\"Query execution completed\")\n\n    # save statistics\n    execution_stats_dict = data_record_collection.execution_stats.to_json()\n    with open(f\"opt-profiling-data/{exp_name}-stats.json\", \"w\") as f:\n        json.dump(execution_stats_dict, f)\n\n    pred_df = data_record_collection.to_df()\n    label_df = get_label_df(num_contracts=100, seed=seed)\n    # pred_df.to_csv(f\"{exp_name}-pred.csv\", index=False)\n    # label_df.to_csv(f\"{exp_name}-label.csv\", index=False)\n    final_plan_id = list(data_record_collection.execution_stats.plan_stats.keys())[0]\n    final_plan_str = data_record_collection.execution_stats.plan_strs[final_plan_id]\n\n    prec, recall = compute_precision_recall(label_df, pred_df)\n    f1 = 2 * (prec * recall) / (prec + recall) if prec + recall > 0 else 0.0\n    stats_dict = {\n        \"precision\": prec,\n        \"recall\": recall,\n        \"f1\": f1,\n        \"optimization_time\": data_record_collection.execution_stats.optimization_time,\n        \"optimization_cost\": data_record_collection.execution_stats.optimization_cost,\n        \"plan_execution_time\": data_record_collection.execution_stats.plan_execution_time,\n        \"plan_execution_cost\": data_record_collection.execution_stats.plan_execution_cost,\n        \"total_execution_time\": data_record_collection.execution_stats.total_execution_time,\n        \"total_execution_cost\": data_record_collection.execution_stats.total_execution_cost,\n        \"plan_str\": final_plan_str,\n    }\n    with open(f\"opt-profiling-data/{exp_name}-metrics.json\", \"w\") as f:\n        json.dump(stats_dict, f)\n\n    print(f\"Precision: {prec:.3f}, Recall: {recall:.3f}, F1: {f1:.3f}\")\n    print(f\"Optimization time: {data_record_collection.execution_stats.optimization_time}\")\n    print(f\"Optimization 
cost: {data_record_collection.execution_stats.optimization_cost}\")\n    print(f\"Plan Exec. time: {data_record_collection.execution_stats.plan_execution_time}\")\n    print(f\"Plan Exec. cost: {data_record_collection.execution_stats.plan_execution_cost}\")\n    print(f\"Total Execution time: {data_record_collection.execution_stats.total_execution_time}\")\n    print(f\"Total Execution Cost: {data_record_collection.execution_stats.total_execution_cost}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "abacus-research/cuad-max-quality-at-cost.py",
    "content": "import argparse\nimport json\nimport os\nimport string\n\nimport numpy as np\nimport pandas as pd\nfrom cuad_data_loader import load_cuad_data\n\nimport palimpzest as pz\nfrom palimpzest.constants import Model\nfrom palimpzest.policy import MaxQuality, MaxQualityAtFixedCost\n\nCUAD_CATEGORIES = [\n    {\n        \"Category\": \"Document Name\",\n        \"Description\": \"The name of the contract\",\n        \"Answer Format\": \"Contract Name\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Parties\",\n        \"Description\": \"The two or more parties who signed the contract\",\n        \"Answer Format\": \"Entity or individual names\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Agreement Date\",\n        \"Description\": \"The date of the contract\",\n        \"Answer Format\": \"Date (mm/dd/yyyy)\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Effective Date\",\n        \"Description\": \"The date when the contract is effective\\u00a0\",\n        \"Answer Format\": \"Date (mm/dd/yyyy)\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Expiration Date\",\n        \"Description\": \"On what date will the contract's initial term expire?\",\n        \"Answer Format\": \"Date (mm/dd/yyyy) / Perpetual\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Renewal Term\",\n        \"Description\": \"What is the renewal term after the initial term expires? 
This includes automatic extensions and unilateral extensions with prior notice.\",\n        \"Answer Format\": \"[Successive] number of years/months / Perpetual\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Notice Period to Terminate Renewal\",\n        \"Description\": \"What is the notice period required to terminate renewal?\",\n        \"Answer Format\": \"Number of days/months/year(s)\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Governing Law\",\n        \"Description\": \"Which state/country's law governs the interpretation of the contract?\",\n        \"Answer Format\": \"Name of a US State / non-US Province, Country\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Most Favored Nation\",\n        \"Description\": \"Is there a clause that if a third party gets better terms on the licensing or sale of technology/goods/services described in the contract, the buyer of such technology/goods/services under the contract shall be entitled to those better terms?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Non-Compete\",\n        \"Description\": \"Is there a restriction on the ability of a party to compete with the counterparty or operate in a certain geography or business or technology sector?\\u00a0\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"Exclusivity\",\n        \"Description\": \"Is there an exclusive dealing\\u00a0 commitment with the counterparty? 
This includes a commitment to procure all \\u201crequirements\\u201d from one party of certain technology, goods, or services or a prohibition on licensing or selling technology, goods or services to third parties, or a prohibition on\\u00a0 collaborating or working with other parties), whether during the contract or\\u00a0 after the contract ends (or both).\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"No-Solicit of Customers\",\n        \"Description\": \"Is a party restricted from contracting or soliciting customers or partners of the counterparty, whether during the contract or after the contract ends (or both)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"Competitive Restriction Exception\",\n        \"Description\": \"This category includes the exceptions or carveouts to Non-Compete, Exclusivity and No-Solicit of Customers above.\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"No-Solicit of Employees\",\n        \"Description\": \"Is there a restriction on a party\\u2019s soliciting or hiring employees and/or contractors from the\\u00a0 counterparty, whether during the contract or after the contract ends (or both)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Non-Disparagement\",\n        \"Description\": \"Is there a requirement on a party not to disparage the counterparty?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Termination for Convenience\",\n        \"Description\": \"Can a party terminate this\\u00a0 contract without cause (solely by giving a notice and allowing a waiting\\u00a0 period to expire)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        
\"Category\": \"Rofr/Rofo/Rofn\",\n        \"Description\": \"Is there a clause granting one party a right of first refusal, right of first offer or right of first negotiation to purchase, license, market, or distribute equity interest, technology, assets, products or services?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Change of Control\",\n        \"Description\": \"Does one party have the right to terminate or is consent or notice required of the counterparty if such party undergoes a change of control, such as a merger, stock sale, transfer of all or substantially all of its assets or business, or assignment by operation of law?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 3\",\n    },\n    {\n        \"Category\": \"Anti-Assignment\",\n        \"Description\": \"Is consent or notice required of a party if the contract is assigned to a third party?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 3\",\n    },\n    {\n        \"Category\": \"Revenue/Profit Sharing\",\n        \"Description\": \"Is one party required to share revenue or profit with the counterparty for any technology, goods, or\\u00a0services?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Price Restrictions\",\n        \"Description\": \"Is there a restriction on the\\u00a0 ability of a party to raise or reduce prices of technology, goods, or\\u00a0 services provided?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Minimum Commitment\",\n        \"Description\": \"Is there a minimum order size or minimum amount or units per-time period that one party must buy from the counterparty under the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Volume 
Restriction\",\n        \"Description\": \"Is there a fee increase or consent requirement, etc. if one party\\u2019s use of the product/services exceeds certain threshold?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"IP Ownership Assignment\",\n        \"Description\": \"Does intellectual property created\\u00a0 by one party become the property of the counterparty, either per the terms of the contract or upon the occurrence of certain events?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Joint IP Ownership\",\n        \"Description\": \"Is there any clause providing for joint or shared ownership of intellectual property between the parties to the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"License Grant\",\n        \"Description\": \"Does the contract contain a license granted by one party to its counterparty?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Non-Transferable License\",\n        \"Description\": \"Does the contract limit the ability of a party to transfer the license being granted to a third party?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Affiliate License-Licensor\",\n        \"Description\": \"Does the contract contain a license grant by affiliates of the licensor or that includes intellectual property of affiliates of the licensor?\\u00a0\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Affiliate License-Licensee\",\n        \"Description\": \"Does the contract contain a license grant to a licensee (incl. 
sublicensor) and the affiliates of such licensee/sublicensor?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Unlimited/All-You-Can-Eat-License\",\n        \"Description\": \"Is there a clause granting one party an \\u201centerprise,\\u201d \\u201call you can eat\\u201d or unlimited usage license?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Irrevocable or Perpetual License\",\n        \"Description\": \"Does the contract contain a\\u00a0 license grant that is irrevocable or perpetual?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Source Code Escrow\",\n        \"Description\": \"Is one party required to deposit its source code into escrow with a third party, which can be released to the counterparty upon the occurrence of certain events (bankruptcy,\\u00a0 insolvency, etc.)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Post-Termination Services\",\n        \"Description\": \"Is a party subject to obligations after the termination or expiration of a contract, including any post-termination transition, payment, transfer of IP, wind-down, last-buy, or similar commitments?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 5\",\n    },\n    {\n        \"Category\": \"Audit Rights\",\n        \"Description\": \"Does a party have the right to\\u00a0 audit the books, records, or physical locations of the counterparty to ensure compliance with the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 5\",\n    },\n    {\n        \"Category\": \"Uncapped Liability\",\n        \"Description\": \"Is a party\\u2019s liability uncapped upon the breach of its obligation in the contract? 
This also includes uncap liability for a particular type of breach such as IP infringement or breach of confidentiality obligation.\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 6\",\n    },\n    {\n        \"Category\": \"Cap on Liability\",\n        \"Description\": \"Does the contract include a cap on liability upon the breach of a party\\u2019s obligation? This includes time limitation for the counterparty to bring claims or maximum amount for recovery.\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 6\",\n    },\n    {\n        \"Category\": \"Liquidated Damages\",\n        \"Description\": \"Does the contract contain a clause that would award either party liquidated damages for breach or a fee upon the termination of a contract (termination fee)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Warranty Duration\",\n        \"Description\": \"What is the duration of any\\u00a0 warranty against defects or errors in technology, products, or services\\u00a0 provided under the contract?\",\n        \"Answer Format\": \"Number of months or years\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Insurance\",\n        \"Description\": \"Is there a requirement for insurance that must be maintained by one party for the benefit of the counterparty?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Covenant Not to Sue\",\n        \"Description\": \"Is a party restricted from contesting the validity of the counterparty\\u2019s ownership of intellectual property or otherwise bringing a claim against the counterparty for matters unrelated to the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Third Party Beneficiary\",\n        \"Description\": \"Is there a non-contracting party who is a 
beneficiary to some or all of the clauses in the contract and therefore can enforce its rights against a contracting party?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n]\n\nNUM_FIELDS_TO_EXTRACT_PER_CONTRACT = 41\n\n# 0.15 is used in the Doc-ETL paper. It should be 0.5 for the actual benchmark.\nIOU_THRESH = 0.15\n\ndef get_label_df(num_contracts: int = 1, seed: int=42) -> pd.DataFrame:\n    dataset = load_cuad_data(split=\"test\")\n\n    # get the set of unique contract titles; to ensure the order of the contracts is\n    # preserved, we use a list rather than using python's set()\n    contract_titles = []\n    for row in dataset:\n        if row[\"title\"] not in contract_titles:\n            contract_titles.append(row[\"title\"])\n\n    # shuffle the contracts for the given seed\n    rng = np.random.default_rng(seed=seed)\n    rng.shuffle(contract_titles)\n\n    # get the first num_contracts\n    contract_titles = contract_titles[:num_contracts]\n\n    # construct the dataset one contract at a time\n    final_label_dataset = []\n    for title in contract_titles:\n        # get the rows for this contract\n        contract_rows = [row for row in dataset if row[\"title\"] == title]\n\n        # construct the contract; we get the contract_id and contract text from the first row\n        contract = {\n            \"contract_id\": contract_rows[0][\"id\"],\n            \"title\": title,\n            \"contract\": contract_rows[0][\"context\"],\n        }\n\n        # add the labels\n        category_names = list(map(lambda category: category[\"Category\"], CUAD_CATEGORIES))\n        # initialize an empty label list for every category so that extend() below does not raise a KeyError\n        for name in category_names:\n            contract[name] = []\n        for row in contract_rows:\n            category_name = row[\"id\"].split(\"__\")[-1].split(\"_\")[0].strip()\n            category_name = category_name.replace(\" For \", \" for \")\n            category_name = category_name.replace(\" Of \", \" of \")\n            category_name = category_name.replace(\" On \", \" on \")\n            category_name 
= category_name.replace(\" Or \", \" or \")\n            category_name = category_name.replace(\" To \", \" to \")\n            category_name = category_name.replace(\"Ip\", \"IP\")\n            assert category_name in category_names, f\"Unknown category {category_name}\"\n\n            # Extract text from answers list (handles both old and new format)\n            answer_texts = []\n            if isinstance(row[\"answers\"], list):\n                answer_texts = [ans[\"text\"] for ans in row[\"answers\"]] if row[\"answers\"] else []\n            else:\n                answer_texts = row[\"answers\"].get(\"text\", [])\n            contract[category_name].extend(answer_texts)\n\n        # add the contract to the dataset\n        final_label_dataset.append(contract)\n\n    return pd.DataFrame(final_label_dataset)\n\n\n#  Return the Jaccard similarity between two strings\ndef get_jaccard(label, pred):\n    remove_tokens = [c for c in string.punctuation if c != \"/\"]\n    for token in remove_tokens:\n        label = label.replace(token, \"\")\n        pred = pred.replace(token, \"\")\n    label = label.lower()\n    pred = pred.lower()\n    label = label.replace(\"/\", \" \")\n    pred = pred.replace(\"/\", \" \")\n\n    label_words = set(label.split(\" \"))\n    pred_words = set(pred.split(\" \"))\n\n    intersection = label_words.intersection(pred_words)\n    union = label_words.union(pred_words)\n    jaccard = len(intersection) / len(union)\n    return jaccard\n\n\n# Find the number of true positives, false positives, and false negatives for each entry\n# (one field extracted from each contract) by comparing the labels and predictions.\n# Labels and preds are lists of strings\ndef evaluate_entry(labels, preds, substr_ok):\n    tp, fp, fn = 0, 0, 0\n\n    # jaccard similarity expects strings\n    # TODO: This is a hack, ideally, the return type of the preds should be known\n    for idx, pred in enumerate(preds):\n        if not isinstance(pred, str):\n            
print(f\"Expected string, but got {pred}\")\n            preds[idx] = str(pred)\n\n    # first check if labels is empty\n    if len(labels) == 0:\n        if len(preds) > 0:\n            fp += len(preds)  # false positive for each one\n    else:\n        for ans in labels:\n            assert len(ans) > 0\n            # check if there is a match\n            match_found = False\n            for pred in preds:\n                if substr_ok:\n                    is_match = get_jaccard(ans, pred) >= IOU_THRESH or ans in pred\n                else:\n                    is_match = get_jaccard(ans, pred) >= IOU_THRESH\n                if is_match:\n                    match_found = True\n\n            if match_found:\n                tp += 1\n            else:\n                fn += 1\n\n        # now also get any fps by looping through preds\n        for pred in preds:\n            # Check if there's a match. if so, don't count (don't want to double count based on the above)\n            # but if there's no match, then this is a false positive.\n            # (Note: we get the true positives in the above loop instead of this loop so that we don't double count\n            # multiple predictions that are matched with the same answer.)\n            match_found = False\n            for ans in labels:\n                assert len(ans) > 0\n                if substr_ok:\n                    is_match = get_jaccard(ans, pred) >= IOU_THRESH or ans in pred\n                else:\n                    is_match = get_jaccard(ans, pred) >= IOU_THRESH\n                if is_match:\n                    match_found = True\n\n            if not match_found:\n                fp += 1\n\n    return tp, fp, fn\n\n\ndef handle_empty_preds(preds):\n    if preds is None or (  # noqa: SIM114\n        isinstance(preds, str) and (preds == \"\" or preds == \" \" or preds == \"null\" or preds == \"None\")\n    ):\n        return []\n    elif isinstance(preds, float) and np.isnan(preds):\n        
return []\n    if not isinstance(preds, (list, np.ndarray)):\n        return [preds]\n    return preds\n\n\nclass CUADValidator(pz.Validator):\n    def __init__(self, num_contracts: int = 1, seed: int=42):\n        super().__init__()\n        self.num_contracts = num_contracts\n        self.seed = seed\n\n        # get clean names for the categories\n        self.category_names = list(map(lambda category: category[\"Category\"], CUAD_CATEGORIES))\n\n        # compute mapping from contract_id --> label\n        self.contract_id_to_label = self._compute_contract_id_to_labels()\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        scores = []\n        for field in fields:\n            preds = handle_empty_preds(output.get(field))\n            labels = self.contract_id_to_label[input_record[\"contract_id\"]][field]\n            entry_tp, _, entry_fn = evaluate_entry(labels, preds, substr_ok=True) if field == \"Parties\" else evaluate_entry(labels, preds, substr_ok=False)\n            score = None\n            if len(labels) > 0:  # noqa: SIM108\n                score = entry_tp / (entry_tp + entry_fn)\n            else:\n                score = 1.0 if len(preds) == 0 else 0.0\n            scores.append(score)\n        return np.mean(scores)\n\n    def _compute_contract_id_to_labels(self):\n        # load full train dataset\n        dataset = load_cuad_data(split=\"train\")\n\n        # get the set of unique contract titles; to ensure the order of the contracts is\n        # preserved, we use a list rather than using python's set()\n        contract_titles = []\n        for row in dataset:\n            if row[\"title\"] not in contract_titles:\n                contract_titles.append(row[\"title\"])\n\n        # shuffle the contracts for the given seed\n        rng = np.random.default_rng(seed=self.seed)\n        rng.shuffle(contract_titles)\n\n        # get the first num_contracts\n        contract_titles = 
contract_titles[:self.num_contracts]\n\n        # construct the mapping from contract_id to labels\n        contract_id_to_labels = {}\n        for title in contract_titles:\n            # get the rows for this contract\n            contract_rows = [row for row in dataset if row[\"title\"] == title]\n\n            # get the contract_id from the first row\n            contract_id = contract_rows[0][\"id\"]\n\n            # get the labels\n            labels = {category: [] for category in self.category_names}\n            for row in contract_rows:\n                category_name = row[\"id\"].split(\"__\")[-1].split(\"_\")[0].strip()\n                category_name = category_name.replace(\" For \", \" for \")\n                category_name = category_name.replace(\" Of \", \" of \")\n                category_name = category_name.replace(\" On \", \" on \")\n                category_name = category_name.replace(\" Or \", \" or \")\n                category_name = category_name.replace(\" To \", \" to \")\n                category_name = category_name.replace(\"Ip\", \"IP\")\n                assert category_name in self.category_names, f\"Unknown category {category_name}\"\n\n                # Extract text from answers list (handles both old and new format)\n                answer_texts = []\n                if isinstance(row[\"answers\"], list):\n                    answer_texts = [ans[\"text\"] for ans in row[\"answers\"]] if row[\"answers\"] else []\n                else:\n                    answer_texts = row[\"answers\"].get(\"text\", [])\n                labels[category_name].extend(answer_texts)\n\n            # update the dictionary\n            contract_id_to_labels[contract_id] = labels\n\n        return contract_id_to_labels\n\n\nclass CUADDataset(pz.IterDataset):\n    def __init__(self, num_contracts: int = 1, split: str = \"train\", seed: int=42):\n        self.num_contracts = num_contracts\n        self.split = split\n        self.seed = seed\n\n        
input_cols = [\n            {\"name\": \"contract_id\", \"type\": str, \"desc\": \"The id of the contract to be analyzed\"},\n            {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the contract to be analyzed\"},\n            {\"name\": \"contract\", \"type\": str, \"desc\": \"The content of the contract to be analyzed\"},\n        ]\n        super().__init__(id=\"cuad\", schema=input_cols)\n\n        # Load dataset from local files\n        dataset = load_cuad_data(split=split)\n        self.dataset = self._construct_dataset(dataset, num_contracts, seed)\n\n    def _construct_dataset(self, dataset, num_contracts, seed: int=42):\n        # get the set of unique contract titles; to ensure the order of the contracts is\n        # preserved, we use a list rather than using python's set()\n        contract_titles = []\n        for row in dataset:\n            if row[\"title\"] not in contract_titles:\n                contract_titles.append(row[\"title\"])\n\n        # shuffle the contracts for the given seed\n        rng = np.random.default_rng(seed=seed)\n        rng.shuffle(contract_titles)\n\n        # get the first num_contracts\n        contract_titles = contract_titles[:num_contracts]\n\n        # construct the dataset one contract at a time\n        new_dataset = []\n        for title in contract_titles:\n            # get the rows for this contract\n            contract_rows = [row for row in dataset if row[\"title\"] == title]\n\n            # construct the contract; we get the contract_id and contract text from the first row\n            contract = {\n                \"contract_id\": contract_rows[0][\"id\"],\n                \"title\": title,\n                \"contract\": contract_rows[0][\"context\"],\n            }\n\n            # add the rows to the dataset\n            new_dataset.append(contract)\n\n        return new_dataset\n\n    def __len__(self):\n        return self.num_contracts\n\n    def __getitem__(self, idx: 
int):\n        return self.dataset[idx]\n\n# Compute the precision and recall for the entire dataset.\n# Each row in the dataframes should correspond to a contract.\n# The columns should be the extracted fields (categories in CUAD_CATEGORIES).\ndef compute_precision_recall(label_df, preds_df):\n    tp, fp, fn = 0, 0, 0\n\n    label_df = label_df.sort_values(\"contract_id\").reset_index(drop=True)\n    preds_df = preds_df.sort_values(\"contract_id\").reset_index(drop=True)\n\n    assert label_df.shape == preds_df.shape, (\n        f\"Label and prediction dataframes have different shapes, label shape: {label_df.shape} vs preds shape {preds_df.shape}\"\n    )\n\n    categories = [category[\"Category\"] for category in CUAD_CATEGORIES]\n\n    for label_row, pred_row in zip(label_df.iterrows(), preds_df.iterrows()):\n        assert label_row[1][\"contract_id\"] == pred_row[1][\"contract_id\"], (\n            f\"IDs do not match. label id: {label_row[1]['contract_id']} vs pred id: {pred_row[1]['contract_id']}\"\n        )\n        for category in categories:\n            substr_ok = \"Parties\" in category\n\n            labels = label_row[1][category]\n            assert isinstance(labels, list)\n\n            preds = pred_row[1][category]\n            preds = handle_empty_preds(preds)\n\n            entry_tp, entry_fp, entry_fn = evaluate_entry(labels, preds, substr_ok)\n            tp += entry_tp\n            fp += entry_fp\n            fn += entry_fn\n\n    precision = tp / (tp + fp) if tp + fp > 0 else np.nan\n    recall = tp / (tp + fn) if tp + fn > 0 else np.nan\n\n    return precision, recall\n\n\ndef parse_arguments():\n    parser = argparse.ArgumentParser(description=\"Run CUAD demo\")\n    parser.add_argument(\"--mode\", type=str, help=\"one-convert or separate-converts\", default=\"one-convert\")\n    parser.add_argument(\"--test\", type=str, help=\"test time compute active or inactive\", default=\"active\")\n    parser.add_argument(\"--constrained\", 
default=False, action=\"store_true\", help=\"Use constrained objective\")\n    parser.add_argument(\n        \"--sentinel-execution-strategy\",\n        default=\"mab\",\n        type=str,\n        help=\"The engine to use. One of mab or random\",\n    )\n    parser.add_argument(\n        \"--optimizer-strategy\",\n        default=\"pareto\",\n        type=str,\n        help=\"The optimizer to use. One of pareto or greedy\",\n    )\n    parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\n        \"--seed\",\n        default=42,\n        type=int,\n        help=\"Seed used to initialize RNG for MAB sampling algorithm\",\n    )\n    parser.add_argument(\n        \"--k\",\n        default=10,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--j\",\n        default=3,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--sample-budget\",\n        default=100,\n        type=int,\n        help=\"Total sample budget in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--exp-name\",\n        default=None,\n        type=str,\n        help=\"The experiment name.\",\n    )\n    parser.add_argument(\n        \"--priors-file\",\n        default=None,\n        type=str,\n        help=\"A file with a dictionary mapping physical operator ids to prior belief on their performance\",\n    )\n    parser.add_argument(\n        \"--cost\",\n        default=1.0,\n        type=float,\n        help=\"The cost budget for the optimization\",\n    )\n    return parser.parse_args()\n\n\ndef build_cuad_query(dataset, mode):\n    assert mode in [\"one-convert\", \"separate-converts\"]\n\n    if mode == \"one-convert\":\n        cols = []\n        for category in 
CUAD_CATEGORIES:\n            desc = (\n                f\"Extract the text spans (if they exist) from the contract corresponding to {category['Description']}\"\n            )\n            cols.append({\"name\": category[\"Category\"], \"type\": list[str], \"desc\": desc})\n\n        desc = \"Extract the text spans (if they exist) from the contract.\"\n        dataset = dataset.sem_map(cols, desc=desc, depends_on=[\"contract\"])\n    elif mode == \"separate-converts\":\n        for category in CUAD_CATEGORIES:\n            desc = (\n                f\"Extract the text spans (if they exist) from the contract corresponding to {category['Description']}\"\n            )\n            dataset = dataset.sem_map(\n                [{\"name\": category[\"Category\"], \"type\": list[str], \"desc\": desc}],\n                desc=category[\"Description\"],\n                depends_on=[\"contract\"],\n            )\n\n    return dataset\n\n\ndef main():\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    args = parse_arguments()\n\n    # create directory for profiling data\n    os.makedirs(\"max-quality-at-cost-data\", exist_ok=True)\n\n    # create validator for CUAD\n    validator = CUADValidator(num_contracts=25, seed=args.seed)\n\n    # create datasets for CUAD\n    dataset = CUADDataset(split=\"test\", num_contracts=100, seed=args.seed)\n    train_dataset = CUADDataset(split=\"train\", num_contracts=25, seed=args.seed)\n    train_dataset = {train_dataset.id: train_dataset}\n    print(\"Created datasets\")\n\n    # Build and run the CUAD query\n    query = build_cuad_query(dataset, args.mode)\n    print(\"Built query; Starting query execution\")\n\n    # set the optimization policy; constraint set to 25% percentile from unconstrained plans\n    policy = 
MaxQualityAtFixedCost(max_cost=args.cost) if args.cost < 999 else MaxQuality()\n    print(f\"USING POLICY: {policy}\")\n\n    sentinel_strategy = args.sentinel_execution_strategy\n    optimizer_strategy = args.optimizer_strategy\n    seed = args.seed\n    k = args.k\n    j = args.j\n    sample_budget = args.sample_budget\n    exp_name = (\n        f\"cuad-strategy-{optimizer_strategy}-k{k}-j{j}-budget{sample_budget}-seed{seed}\"\n        if args.exp_name is None\n        else args.exp_name\n    )\n    priors = None\n    if args.priors_file is not None:\n        with open(args.priors_file) as f:\n            priors = json.load(f)\n\n    config = pz.QueryProcessorConfig(\n        policy=policy,\n        verbose=False,\n        optimizer_strategy=optimizer_strategy,\n        sentinel_execution_strategy=sentinel_strategy,\n        execution_strategy=\"parallel\",\n        max_workers=64,\n        available_models=[\n            Model.GPT_4o,\n            Model.GPT_4o_MINI,\n            Model.LLAMA3_1_8B,\n            Model.LLAMA3_3_70B,\n            # Model.MIXTRAL, # NOTE: only available in tag `abacus-paper-experiments`\n            Model.DEEPSEEK_R1_DISTILL_QWEN_1_5B,\n        ],\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=True,\n        seed=seed,\n        k=k,\n        j=j,\n        sample_budget=sample_budget,\n        exp_name=exp_name,\n        priors=priors,\n    )\n\n    print(f\"EXPERIMENT NAME: {exp_name}\")\n    data_record_collection = query.optimize_and_run(config=config, train_dataset=train_dataset, validator=validator)\n    print(\"Query execution completed\")\n\n    # save statistics\n    execution_stats_dict = data_record_collection.execution_stats.to_json()\n    with open(f\"max-quality-at-cost-data/{exp_name}-stats.json\", \"w\") as f:\n        json.dump(execution_stats_dict, f)\n\n    pred_df = data_record_collection.to_df()\n    label_df = 
get_label_df(num_contracts=100, seed=args.seed)\n    # pred_df.to_csv(f\"{exp_name}-pred.csv\", index=False)\n    # label_df.to_csv(f\"{exp_name}-label.csv\", index=False)\n    final_plan_id = list(data_record_collection.execution_stats.plan_stats.keys())[0]\n    final_plan_str = data_record_collection.execution_stats.plan_strs[final_plan_id]\n\n    prec, recall = compute_precision_recall(label_df, pred_df)\n    f1 = 2 * (prec * recall) / (prec + recall) if prec + recall > 0 else 0.0\n    stats_dict = {\n        \"precision\": prec,\n        \"recall\": recall,\n        \"f1\": f1,\n        \"optimization_time\": data_record_collection.execution_stats.optimization_time,\n        \"optimization_cost\": data_record_collection.execution_stats.optimization_cost,\n        \"plan_execution_time\": data_record_collection.execution_stats.plan_execution_time,\n        \"plan_execution_cost\": data_record_collection.execution_stats.plan_execution_cost,\n        \"total_execution_time\": data_record_collection.execution_stats.total_execution_time,\n        \"total_execution_cost\": data_record_collection.execution_stats.total_execution_cost,\n        \"plan_str\": final_plan_str,\n    }\n    with open(f\"max-quality-at-cost-data/{exp_name}-metrics.json\", \"w\") as f:\n        json.dump(stats_dict, f)\n\n    print(f\"Precision: {prec:.3f}, Recall: {recall:.3f}, F1: {f1:.3f}\")\n    print(f\"Optimization time: {data_record_collection.execution_stats.optimization_time}\")\n    print(f\"Optimization cost: {data_record_collection.execution_stats.optimization_cost}\")\n    print(f\"Plan Exec. time: {data_record_collection.execution_stats.plan_execution_time}\")\n    print(f\"Plan Exec. 
cost: {data_record_collection.execution_stats.plan_execution_cost}\")\n    print(f\"Total Execution time: {data_record_collection.execution_stats.total_execution_time}\")\n    print(f\"Total Execution Cost: {data_record_collection.execution_stats.total_execution_cost}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "abacus-research/cuad-priors.json",
    "content": "{\"00c93aec22\": {\"quality\": 0.5304878048780488, \"cost\": 0.01609626, \"time\": 121.91315126419067}, \"00f4acd0d3\": {\"quality\": 0.6195121951219512, \"cost\": 0.007308299999999999, \"time\": 62.89620461463928}, \"0121878170\": {\"quality\": 0.7871196283391406, \"cost\": 0.084214564, \"time\": 137.42955293655396}, \"01c2f973ad\": {\"quality\": 0.6369918699186992, \"cost\": 0.00766284, \"time\": 88.04302344322204}, \"01fca3c717\": {\"quality\": 0.6376887340301974, \"cost\": 0.067807144, \"time\": 56.765127992630006}, \"02078988c1\": {\"quality\": 0.6195121951219512, \"cost\": 0.010957832, \"time\": 58.52112374305725}, \"021604dec1\": {\"quality\": 0.32056910569105695, \"cost\": 0.014393556000000002, \"time\": 102.96775641441346}, \"0262668df7\": {\"quality\": 0.7521835075493613, \"cost\": 0.015505927999999999, \"time\": 110.59786329269409}, \"0267c97b70\": {\"quality\": 0.6195121951219512, \"cost\": 0.007206299999999999, \"time\": 78.4086401939392}, \"02d6cdecdc\": {\"quality\": 0.17902439024390246, \"cost\": 0.043331628, \"time\": 118.95345692634582}, \"033ca325e6\": {\"quality\": 0.6308943089430895, \"cost\": 0.006504208, \"time\": 47.695338296890256}, \"0364b5e990\": {\"quality\": 0.6195121951219512, \"cost\": 0.069325568, \"time\": 129.42635221481322}, \"0375ea52c9\": {\"quality\": 0.6893495934959348, \"cost\": 0.026699588000000003, \"time\": 43.96490340232849}, \"038a5f0a62\": {\"quality\": 0.6491056910569106, \"cost\": 0.01088205, \"time\": 142.37771649360656}, \"039803b3b1\": {\"quality\": 0.8145180023228804, \"cost\": 0.06754355, \"time\": 77.90279107093811}, \"03b972cb56\": {\"quality\": 0.20228803716608595, \"cost\": 0.04939144400000001, \"time\": 117.83176441192627}, \"042d933706\": {\"quality\": 0.6789430894308943, \"cost\": 0.011810856, \"time\": 96.84674339294433}, \"050b21ce37\": {\"quality\": 0.6331707317073171, \"cost\": 0.063467788, \"time\": 138.06499099731445}, \"0524f42520\": {\"quality\": 0.5098606271777004, \"cost\": 
0.100158564, \"time\": 119.39450807571411}, \"05420351e5\": {\"quality\": 0.28818815331010456, \"cost\": 0.03700374, \"time\": 99.57810640335083}, \"057e332ab1\": {\"quality\": 0.5822299651567945, \"cost\": 0.071891988, \"time\": 137.52664232254028}, \"0646f3f0fb\": {\"quality\": 0.4296747967479675, \"cost\": 0.06800707800000001, \"time\": 122.81284818649291}, \"06493715cc\": {\"quality\": 0.6146341463414634, \"cost\": 0.009369455999999998, \"time\": 101.42031364440918}, \"0659531b94\": {\"quality\": 0.7977235772357723, \"cost\": 0.06570914000000001, \"time\": 88.527423286438}, \"067ee6e91b\": {\"quality\": 0.7303135888501743, \"cost\": 0.012997799999999999, \"time\": 91.4061038017273}, \"0695f9b5fc\": {\"quality\": 0.6195121951219512, \"cost\": 0.012743448, \"time\": 123.27798008918762}, \"06e94a0f2e\": {\"quality\": 0.6381300813008131, \"cost\": 0.022528364000000002, \"time\": 70.015203332901}, \"073ef31d23\": {\"quality\": 0.6711730545876887, \"cost\": 0.06321265200000001, \"time\": 125.18851342201233}, \"078a7e545e\": {\"quality\": 0.5136469221835075, \"cost\": 0.06231631000000001, \"time\": 139.78958024978638}, \"079feb14a8\": {\"quality\": 0.5636585365853659, \"cost\": 0.012793146, \"time\": 83.61773014068604}, \"07a3a7daf7\": {\"quality\": 0.6195121951219512, \"cost\": 0.00968809, \"time\": 64.83957290649414}, \"08127cd6dd\": {\"quality\": 0.6773170731707316, \"cost\": 0.018527102, \"time\": 74.52640342712402}, \"0833133620\": {\"quality\": 0.783089430894309, \"cost\": 0.011579322, \"time\": 103.93566818237305}, \"087a2cabc4\": {\"quality\": 0.7773054587688735, \"cost\": 0.041329712000000005, \"time\": 149.253262424469}, \"08bf8cc191\": {\"quality\": 0.6195121951219512, \"cost\": 0.018227304, \"time\": 146.171551656723}, \"08e1802287\": {\"quality\": 0.40414634146341466, \"cost\": 0.061857968, \"time\": 639.4221799850463}, \"08f7f63b30\": {\"quality\": 0.6039024390243902, \"cost\": 0.06393643200000002, \"time\": 150.91687684059144}, \"090cd3ef31\": 
{\"quality\": 0.5126016260162601, \"cost\": 0.024500464, \"time\": 242.8270983695984}, \"0947216ece\": {\"quality\": 0.6116260162601627, \"cost\": 0.022374364, \"time\": 86.5027738571167}, \"096d51f670\": {\"quality\": 0.6195121951219512, \"cost\": 0.015167303999999998, \"time\": 168.75218114852905}, \"09791c731b\": {\"quality\": 0.6195121951219512, \"cost\": 0.010591104, \"time\": 98.68934841156006}, \"0990c0d4f8\": {\"quality\": 0.7328455284552845, \"cost\": 0.015548904000000002, \"time\": 126.22301592826844}, \"0a4c1bbb4a\": {\"quality\": 0.6195121951219512, \"cost\": 0.01167957, \"time\": 112.43891968727112}, \"0ac969dde3\": {\"quality\": 0.532520325203252, \"cost\": 0.019075388000000002, \"time\": 182.81917290687562}, \"0b4ab72197\": {\"quality\": 0.6195121951219512, \"cost\": 0.01298358, \"time\": 88.19813923835754}, \"0bf9d31691\": {\"quality\": 0.6195121951219512, \"cost\": 0.06748344, \"time\": 112.7701108455658}, \"0c020b86a3\": {\"quality\": 0.48138211382113827, \"cost\": 0.012613356000000003, \"time\": 94.54495902061463}, \"0c6c7fe96a\": {\"quality\": 0.6439024390243903, \"cost\": 0.017969835999999996, \"time\": 175.61341876983641}, \"0c81c8996a\": {\"quality\": 0.6195121951219512, \"cost\": 0.008283972, \"time\": 123.77774615287781}, \"0cd25da9fe\": {\"quality\": 0.41504065040650406, \"cost\": 0.071906784, \"time\": 151.1712607383728}, \"0cd78f33d8\": {\"quality\": 0.6969337979094077, \"cost\": 0.12822476, \"time\": 110.73728203773499}, \"0cdc5954dd\": {\"quality\": 0.39105691056910574, \"cost\": 0.013535116, \"time\": 110.69660997390747}, \"0d53dd53c1\": {\"quality\": 0.4902439024390244, \"cost\": 0.08892936000000001, \"time\": 129.61446180343628}, \"0d8436af32\": {\"quality\": 0.6047386759581882, \"cost\": 0.03048576, \"time\": 119.35665726661682}, \"0e38896654\": {\"quality\": 0.7874680603948896, \"cost\": 0.029542839999999997, \"time\": 118.93085074424744}, \"0ec672e7c8\": {\"quality\": 0.3153774680603949, \"cost\": 0.018613362, \"time\": 
156.979243183136}, \"0ed243f788\": {\"quality\": 0.6195121951219512, \"cost\": 0.014807712, \"time\": 171.58208327293397}, \"0eeb372802\": {\"quality\": 0.6195121951219512, \"cost\": 0.017926556, \"time\": 135.83013033866882}, \"0ef0becc1b\": {\"quality\": 0.3480371660859466, \"cost\": 0.058053114, \"time\": 112.96433029174804}, \"0fefead197\": {\"quality\": 0.7968176538908246, \"cost\": 0.09321153600000001, \"time\": 177.5354432106018}, \"0ff126ebf8\": {\"quality\": 0.3920325203252033, \"cost\": 0.01680763, \"time\": 116.37514338493347}, \"10d1d4bdeb\": {\"quality\": 0.6195121951219512, \"cost\": 0.066024524, \"time\": 129.0803412914276}, \"114a097c53\": {\"quality\": 0.6195121951219512, \"cost\": 0.00815448, \"time\": 68.65160498619079}, \"1175ee37e6\": {\"quality\": 0.5414634146341463, \"cost\": 0.002796924, \"time\": 66.07583198547363}, \"11bc996d48\": {\"quality\": 0.7575261324041811, \"cost\": 0.011037809999999999, \"time\": 160.15645961761476}, \"11ded03305\": {\"quality\": 0.7726016260162601, \"cost\": 0.07187866600000001, \"time\": 108.60114121437073}, \"123fb650fb\": {\"quality\": 0.6260162601626017, \"cost\": 0.012730068, \"time\": 158.32172050476075}, \"132f6f3946\": {\"quality\": 0.6478048780487804, \"cost\": 0.06792896600000001, \"time\": 142.56146936416627}, \"133ee5023f\": {\"quality\": 0.6323577235772359, \"cost\": 0.009341408, \"time\": 76.29307880401612}, \"13a009fe0c\": {\"quality\": 0.6134146341463416, \"cost\": 0.020489088, \"time\": 159.91688990592957}, \"13da306f84\": {\"quality\": 0.504390243902439, \"cost\": 0.013949744, \"time\": 125.56137013435364}, \"13e717e41e\": {\"quality\": 0.5087804878048781, \"cost\": 0.055634039999999996, \"time\": 89.58166871070861}, \"13f75f9bd0\": {\"quality\": 0.3702439024390244, \"cost\": 0.009800003999999998, \"time\": 140.3795708656311}, \"140ededb41\": {\"quality\": 0.7700813008130082, \"cost\": 0.0035910299999999994, \"time\": 54.85756406784058}, \"142e59c03f\": {\"quality\": 0.39747967479674795, 
\"cost\": 0.080664288, \"time\": 125.91337852478027}, \"142f3a7c70\": {\"quality\": 0.6195121951219512, \"cost\": 0.009178488000000002, \"time\": 142.1152557849884}, \"1468dddecc\": {\"quality\": 0.6699186991869919, \"cost\": 0.007540319999999999, \"time\": 112.98228130340576}, \"15af009a01\": {\"quality\": 0.47317073170731705, \"cost\": 0.07114547600000001, \"time\": 165.90874967575073}, \"15b80a55d3\": {\"quality\": 0.5414634146341464, \"cost\": 0.06812305999999999, \"time\": 156.45735163688659}, \"1625e624c5\": {\"quality\": 0.56931475029036, \"cost\": 0.05795948000000001, \"time\": 88.15128273963929}, \"16cff1c1e9\": {\"quality\": 0.6195121951219512, \"cost\": 0.06384869200000001, \"time\": 149.38432698249818}, \"17407df027\": {\"quality\": 0.5050406504065041, \"cost\": 0.06935714000000001, \"time\": 157.59272437095643}, \"176da24f53\": {\"quality\": 0.6195121951219512, \"cost\": 0.006307396000000002, \"time\": 48.773886156082156}, \"179379555f\": {\"quality\": 0.43966318234610924, \"cost\": 0.008765568, \"time\": 131.12133555412294}, \"181c91d1be\": {\"quality\": 0.4508943089430894, \"cost\": 0.016041972, \"time\": 104.77644038200378}, \"183743e76e\": {\"quality\": 0.6195121951219512, \"cost\": 0.007336812, \"time\": 88.6048484802246}, \"186b58c209\": {\"quality\": 0.47317073170731705, \"cost\": 0.064688372, \"time\": 136.70242128372192}, \"187eace9fe\": {\"quality\": 0.6195121951219512, \"cost\": 0.061498704, \"time\": 111.12849621772766}, \"190ed2e1b6\": {\"quality\": 0.4953193960511033, \"cost\": 0.06357867000000002, \"time\": 139.53685355186462}, \"191aafe1a6\": {\"quality\": 0.6285017421602788, \"cost\": 0.05959545200000001, \"time\": 133.747340965271}, \"194919ad28\": {\"quality\": 0.6195121951219512, \"cost\": 0.061897668, \"time\": 120.82026715278626}, \"197bb53f10\": {\"quality\": 0.4398373983739837, \"cost\": 0.012262326, \"time\": 128.21201944351196}, \"19ba7d0617\": {\"quality\": 0.8144715447154471, \"cost\": 0.18371900000000002, \"time\": 
127.1558424949646}, \"19e3db7fe7\": {\"quality\": 0.6195121951219512, \"cost\": 0.016522608, \"time\": 157.01711511611938}, \"1a08cb3f50\": {\"quality\": 0.6195121951219512, \"cost\": 0.06344755, \"time\": 113.1626263141632}, \"1a169179f6\": {\"quality\": 0.3619976771196284, \"cost\": 0.096862584, \"time\": 173.04007925987244}, \"1a71d61ac4\": {\"quality\": 0.6055284552845528, \"cost\": 0.044680528000000004, \"time\": 134.11820168495177}, \"1ad856985f\": {\"quality\": 0.6195121951219512, \"cost\": 0.015116656000000001, \"time\": 213.2139684677124}, \"1adec2dca2\": {\"quality\": 0.6195121951219512, \"cost\": 0.018743704, \"time\": 189.0085569858551}, \"1b04a2b184\": {\"quality\": 0.27695702671312433, \"cost\": 0.018901547999999997, \"time\": 190.3927951812744}, \"1b28439bd7\": {\"quality\": 0.595040650406504, \"cost\": 0.018163196000000003, \"time\": 130.19455728530883}, \"1b2c667b15\": {\"quality\": 0.6146341463414634, \"cost\": 0.06703916, \"time\": 141.4092381477356}, \"1b4511eada\": {\"quality\": 0.5601277584204414, \"cost\": 0.06213515000000001, \"time\": 197.098247051239}, \"1b7e6cad66\": {\"quality\": 0.6678048780487805, \"cost\": 0.061792784, \"time\": 95.13608679771423}, \"1beb2fac62\": {\"quality\": 0.5772357723577235, \"cost\": 0.023681768000000002, \"time\": 99.76408967971801}, \"1c347e4d91\": {\"quality\": 0.5130894308943089, \"cost\": 0.005374038, \"time\": 119.8269275188446}, \"1c35bf4be6\": {\"quality\": 0.6613124274099883, \"cost\": 0.06615356000000001, \"time\": 178.00547165870665}, \"1c4bbf8f7e\": {\"quality\": 0.812137049941928, \"cost\": 0.09307272000000003, \"time\": 165.35304036140442}, \"1c5f1341f6\": {\"quality\": 0.6847967479674797, \"cost\": 0.035318894, \"time\": 106.12908926010132}, \"1c71804bec\": {\"quality\": 0.5271544715447154, \"cost\": 0.037727392, \"time\": 100.24818325042725}, \"1cc6d9efb6\": {\"quality\": 0.4643902439024391, \"cost\": 0.018275512, \"time\": 88.59051547050476}, \"1ce3d77039\": {\"quality\": 0.6304878048780488, 
\"cost\": 0.0014104099999999997, \"time\": 62.92718415260315}, \"1d26090364\": {\"quality\": 0.5578048780487805, \"cost\": 0.00619749, \"time\": 73.83342995643616}, \"1d87f97e62\": {\"quality\": 0.6894192799070848, \"cost\": 0.013183738, \"time\": 122.42380046844482}, \"1d90fb8ca6\": {\"quality\": 0.7902903600464576, \"cost\": 0.08424590000000001, \"time\": 134.87732191085814}, \"1da2369719\": {\"quality\": 0.676492450638792, \"cost\": 0.010263576, \"time\": 102.77953281402588}, \"1e18e60895\": {\"quality\": 0.6342276422764228, \"cost\": 0.0014448139999999998, \"time\": 62.875070858001706}, \"1e1bf7e88b\": {\"quality\": 0.38325203252032525, \"cost\": 0.01685045, \"time\": 99.01278185844421}, \"1e8b3521f8\": {\"quality\": 0.6149709639953542, \"cost\": 0.006865926, \"time\": 112.37140879631042}, \"1f412964ff\": {\"quality\": 0.6195121951219512, \"cost\": 0.059373496000000005, \"time\": 141.1672589302063}, \"1f72cfb78a\": {\"quality\": 0.2634494773519164, \"cost\": 0.04392928, \"time\": 141.55319528579713}, \"1fb5d170ad\": {\"quality\": 0.4796399535423926, \"cost\": 0.061984526000000005, \"time\": 130.7815794467926}, \"20180dd292\": {\"quality\": 0.6195121951219512, \"cost\": 0.06752007200000001, \"time\": 180.62230010032653}, \"2018bef45f\": {\"quality\": 0.624390243902439, \"cost\": 0.010508668, \"time\": 100.28150401115417}, \"2075ff1d04\": {\"quality\": 0.3623577235772358, \"cost\": 0.015088508000000004, \"time\": 1278.50945892334}, \"208a98f514\": {\"quality\": 0.775156794425087, \"cost\": 0.017165512, \"time\": 166.07545523643495}, \"20904e5c14\": {\"quality\": 0.476260162601626, \"cost\": 0.072447476, \"time\": 98.36213431358337}, \"20afc3d539\": {\"quality\": 0.5903716608594658, \"cost\": 0.084462272, \"time\": 125.81502628326416}, \"20e10af7d4\": {\"quality\": 0.6369918699186992, \"cost\": 0.011088839999999999, \"time\": 154.89131064414977}, \"20e2c0b057\": {\"quality\": 0.7137398373983739, \"cost\": 0.014822027999999998, \"time\": 160.84442710876465}, 
\"211b89b4cd\": {\"quality\": 0.3874680603948897, \"cost\": 0.0074940599999999994, \"time\": 110.43994216918945}, \"21386082aa\": {\"quality\": 0.44699186991869916, \"cost\": 0.04453856, \"time\": 132.42854318618774}, \"21b2b8ebd1\": {\"quality\": 0.4067479674796748, \"cost\": 0.010719648000000002, \"time\": 121.89121770858765}, \"21b78249a7\": {\"quality\": 0.40636469221835075, \"cost\": 0.08786833200000001, \"time\": 114.88702239990235}, \"21bed16a7d\": {\"quality\": 0.42804878048780487, \"cost\": 0.021284448, \"time\": 757.0631816387177}, \"2200d969d0\": {\"quality\": 0.729616724738676, \"cost\": 0.103071816, \"time\": 184.93275413513183}, \"220d008704\": {\"quality\": 0.6195121951219512, \"cost\": 0.061679156, \"time\": 194.95589275360106}, \"2251d21392\": {\"quality\": 0.3285365853658536, \"cost\": 0.063983622, \"time\": 125.72053866386413}, \"23566f15ab\": {\"quality\": 0.6269918699186992, \"cost\": 0.0011016140000000001, \"time\": 68.13801875114441}, \"23a9506d36\": {\"quality\": 0.6296747967479674, \"cost\": 0.04247636400000001, \"time\": 287.74875631332395}, \"24957f3a43\": {\"quality\": 0.36911730545876886, \"cost\": 0.007789104, \"time\": 66.6147982120514}, \"2529e2f8b0\": {\"quality\": 0.6348780487804878, \"cost\": 0.018974088000000004, \"time\": 46.99674005508423}, \"252f01ac5b\": {\"quality\": 0.7063414634146341, \"cost\": 0.09021934000000001, \"time\": 138.44833331108094}, \"25fadf0883\": {\"quality\": 0.6009756097560977, \"cost\": 0.069262768, \"time\": 195.04961967468262}, \"2609bfd616\": {\"quality\": 0.4904878048780488, \"cost\": 0.014383092, \"time\": 138.67839546203612}, \"2629f3e324\": {\"quality\": 0.3373867595818815, \"cost\": 0.06496914000000001, \"time\": 199.1607180118561}, \"262e4298f9\": {\"quality\": 0.5845528455284553, \"cost\": 0.06646106, \"time\": 168.6952573776245}, \"26cc40d3bb\": {\"quality\": 0.6195121951219512, \"cost\": 0.066186748, \"time\": 191.06331505775452}, \"2728c8eb6a\": {\"quality\": 0.6195121951219512, \"cost\": 
0.013364808, \"time\": 136.0991479873657}, \"27bc52befa\": {\"quality\": 0.7133333333333334, \"cost\": 0.035852640000000005, \"time\": 102.54774498939514}, \"27daa50458\": {\"quality\": 0.5414634146341464, \"cost\": 0.008815032, \"time\": 95.5854898929596}, \"2821795e69\": {\"quality\": 0.47317073170731716, \"cost\": 0.06960079, \"time\": 199.99654774665834}, \"28369b2421\": {\"quality\": 0.5271544715447154, \"cost\": 0.012585374, \"time\": 139.39593086242675}, \"28421e6d62\": {\"quality\": 0.6195121951219512, \"cost\": 0.010284108, \"time\": 126.62169485092163}, \"2936c3e43e\": {\"quality\": 0.7311382113821139, \"cost\": 0.022435948, \"time\": 190.63684725761414}, \"293ec5edca\": {\"quality\": 0.8191753774680605, \"cost\": 0.014760236, \"time\": 130.9795027732849}, \"29409d0894\": {\"quality\": 0.30975609756097555, \"cost\": 0.023374712000000002, \"time\": 67.99601030349731}, \"294258298a\": {\"quality\": 0.6918466898954704, \"cost\": 0.060563396000000005, \"time\": 171.09867010116577}, \"294e541235\": {\"quality\": 0.6195121951219512, \"cost\": 0.004267548, \"time\": 63.70019145011902}, \"295ed5e759\": {\"quality\": 0.2658536585365854, \"cost\": 0.00976596, \"time\": 102.0877944469452}, \"2960431101\": {\"quality\": 0.40162601626016264, \"cost\": 0.010855686, \"time\": 106.73301725387573}, \"29892d8468\": {\"quality\": 0.6195121951219512, \"cost\": 0.016309508, \"time\": 152.2814826965332}, \"299a0aeb65\": {\"quality\": 0.6601509872241579, \"cost\": 0.014925237999999999, \"time\": 216.44449853897095}, \"29ad99e3ed\": {\"quality\": 0.6195121951219512, \"cost\": 0.06017612800000001, \"time\": 140.98456511497497}, \"2a5edac2de\": {\"quality\": 0.4360046457607433, \"cost\": 0.093795708, \"time\": 186.12790670394898}, \"2a7d15f4a7\": {\"quality\": 0.6637398373983739, \"cost\": 0.0017069580000000002, \"time\": 66.5458746433258}, \"2aa996de6a\": {\"quality\": 0.40952380952380957, \"cost\": 0.09028132800000001, \"time\": 171.12979340553284}, \"2ac4fb293f\": {\"quality\": 
0.6234146341463415, \"cost\": 0.039522680000000004, \"time\": 134.96180925369262}, \"2afeff0083\": {\"quality\": 0.5430894308943089, \"cost\": 0.065989844, \"time\": 233.06356620788574}, \"2b2bc9568b\": {\"quality\": 0.6195121951219512, \"cost\": 0.071159096, \"time\": 242.5479420185089}, \"2b5679d248\": {\"quality\": 0.6195121951219512, \"cost\": 0.018697384, \"time\": 247.21016097068787}, \"2bcf54cda1\": {\"quality\": 0.5373983739837398, \"cost\": 0.06900906000000001, \"time\": 200.87765913009645}, \"2bd39ee744\": {\"quality\": 0.7727642276422764, \"cost\": 0.029763757999999994, \"time\": 142.61127648353576}, \"2bf38d797f\": {\"quality\": 0.6145528455284553, \"cost\": 0.015432928000000002, \"time\": 154.4491545200348}, \"2c4f4f304e\": {\"quality\": 0.5686875725900116, \"cost\": 0.02265774, \"time\": 104.49429354667663}, \"2c5cf9eb26\": {\"quality\": 0.6195121951219512, \"cost\": 0.010875623999999999, \"time\": 208.2448058128357}, \"2c9a9f94c4\": {\"quality\": 0.6195121951219512, \"cost\": 0.016278144, \"time\": 211.14152693748474}, \"2d3bbc2d23\": {\"quality\": 0.5347386759581882, \"cost\": 0.011645886, \"time\": 142.9862838745117}, \"2de113167b\": {\"quality\": 0.3548780487804878, \"cost\": 0.01615302, \"time\": 215.62943921089172}, \"2de3eb2c19\": {\"quality\": 0.6495934959349594, \"cost\": 0.012367168000000001, \"time\": 84.4677538394928}, \"2e30394ac6\": {\"quality\": 0.42853658536585365, \"cost\": 0.013055616, \"time\": 162.57028393745424}, \"2e9c5cc9bf\": {\"quality\": 0.6195121951219512, \"cost\": 0.008993688000000001, \"time\": 123.94604053497315}, \"2f1573da80\": {\"quality\": 0.628048780487805, \"cost\": 0.012240792, \"time\": 185.1746757030487}, \"2f39d78f34\": {\"quality\": 0.6814169570267131, \"cost\": 0.08897421200000001, \"time\": 171.90278725624086}, \"2fc0cb3592\": {\"quality\": 0.3300813008130082, \"cost\": 0.011238624, \"time\": 162.5598623752594}, \"2fd9cd426a\": {\"quality\": 0.3750406504065041, \"cost\": 0.01883974, \"time\": 
179.53315086364745}, \"300924ebae\": {\"quality\": 0.2581533101045296, \"cost\": 0.08039779600000001, \"time\": 140.4485348701477}, \"3019af79b3\": {\"quality\": 0.5535423925667828, \"cost\": 0.014683848, \"time\": 1308.0872743606567}, \"302c1d97fc\": {\"quality\": 0.6195121951219512, \"cost\": 0.01627172, \"time\": 136.45579180717468}, \"30ae4cbe91\": {\"quality\": 0.37552845528455286, \"cost\": 0.006890768, \"time\": 750.692762708664}, \"30c1f9ddf1\": {\"quality\": 0.6262369337979093, \"cost\": 0.071551416, \"time\": 112.11581921577454}, \"30cd375570\": {\"quality\": 0.6195121951219512, \"cost\": 0.06602648800000002, \"time\": 118.63089265823365}, \"3169782cbb\": {\"quality\": 0.4402439024390244, \"cost\": 0.060568003999999995, \"time\": 103.64021511077881}, \"3172fc459a\": {\"quality\": 0.4475609756097561, \"cost\": 0.073885504, \"time\": 161.98054652214051}, \"318499c14b\": {\"quality\": 0.4873054587688733, \"cost\": 0.06648116000000001, \"time\": 187.55820865631102}, \"31a32be94d\": {\"quality\": 0.8294541231126598, \"cost\": 0.08899002, \"time\": 188.3829703807831}, \"32b101d807\": {\"quality\": 0.6408594657375145, \"cost\": 0.023361424, \"time\": 210.3520890712738}, \"32e2c7ad7f\": {\"quality\": 0.317479674796748, \"cost\": 0.06465559000000001, \"time\": 176.4482195854187}, \"33459cd29c\": {\"quality\": 0.6195121951219512, \"cost\": 0.014899176000000004, \"time\": 156.23946170806886}, \"33a187e74f\": {\"quality\": 0.6709756097560975, \"cost\": 0.01123047, \"time\": 201.9312825202942}, \"33bab4f766\": {\"quality\": 0.7291869918699188, \"cost\": 0.063704096, \"time\": 215.9541217803955}, \"34922140da\": {\"quality\": 0.5666666666666667, \"cost\": 0.06817234800000001, \"time\": 192.9766547679901}, \"3511b5e1d0\": {\"quality\": 0.6195121951219512, \"cost\": 0.013343184, \"time\": 129.7560082912445}, \"3513311c2d\": {\"quality\": 0.48170731707317077, \"cost\": 0.064084452, \"time\": 141.37073793411255}, \"3513e54767\": {\"quality\": 0.6195121951219512, \"cost\": 
0.010438112, \"time\": 88.52075481414795}, \"353f0cb1ac\": {\"quality\": 0.6175609756097561, \"cost\": 0.006175324, \"time\": 85.6404734134674}, \"3550bf88cb\": {\"quality\": 0.6732752613240418, \"cost\": 0.012484512, \"time\": 185.91844930648804}, \"35610fb420\": {\"quality\": 0.7706271777003485, \"cost\": 0.009480828, \"time\": 151.24167275428772}, \"357267e14b\": {\"quality\": 0.40682926829268296, \"cost\": 0.017491424, \"time\": 768.3776504993439}, \"35baa5c3cc\": {\"quality\": 0.7191056910569106, \"cost\": 0.07052012, \"time\": 176.32511940002442}, \"3637084f91\": {\"quality\": 0.5414634146341464, \"cost\": 0.010371784, \"time\": 99.66692337989807}, \"368a497102\": {\"quality\": 0.5840650406504064, \"cost\": 0.039775324, \"time\": 107.50217499732972}, \"36c66671ee\": {\"quality\": 0.6195121951219512, \"cost\": 0.010492836, \"time\": 127.38456826210022}, \"37456cb002\": {\"quality\": 0.800801393728223, \"cost\": 0.036142828, \"time\": 157.18044147491455}, \"3746ea5c03\": {\"quality\": 0.45056910569105685, \"cost\": 0.03366273200000001, \"time\": 102.54732451438903}, \"375ed248fe\": {\"quality\": 0.33772357723577234, \"cost\": 0.013014735999999999, \"time\": 85.65126795768738}, \"377cdf9209\": {\"quality\": 0.7822648083623693, \"cost\": 0.036440860000000005, \"time\": 189.91213726997375}, \"37d4d0f214\": {\"quality\": 0.6195121951219512, \"cost\": 0.071572196, \"time\": 152.68034386634827}, \"37ece7217f\": {\"quality\": 0.47804878048780486, \"cost\": 0.05685350800000001, \"time\": 76.75442943572997}, \"38075bb01f\": {\"quality\": 0.6195121951219512, \"cost\": 0.010187640000000001, \"time\": 148.4516242980957}, \"3831d758b1\": {\"quality\": 0.6195121951219512, \"cost\": 0.063214564, \"time\": 129.3372209072113}, \"38567d6a43\": {\"quality\": 0.6195121951219512, \"cost\": 0.008923896, \"time\": 136.575146484375}, \"3875787727\": {\"quality\": 0.6195121951219512, \"cost\": 0.06695744000000001, \"time\": 189.5042601108551}, \"389c54cbca\": {\"quality\": 
0.633739837398374, \"cost\": 0.010009344, \"time\": 94.1163019657135}, \"3980f20caa\": {\"quality\": 0.5065040650406505, \"cost\": 0.020949968, \"time\": 150.0350682735443}, \"3997a836bd\": {\"quality\": 0.7323577235772357, \"cost\": 0.069346092, \"time\": 181.2115756034851}, \"39ad76f8ce\": {\"quality\": 0.5396747967479676, \"cost\": 0.06732360800000001, \"time\": 117.92112016677856}, \"39c0b7c171\": {\"quality\": 0.4475609756097561, \"cost\": 0.07215601600000002, \"time\": 221.94310250282288}, \"39cd4ca402\": {\"quality\": 0.6195121951219512, \"cost\": 0.007245383999999999, \"time\": 139.3380250453949}, \"3a32c98a53\": {\"quality\": 0.7996399535423925, \"cost\": 0.066284192, \"time\": 151.66175842285156}, \"3ac7fa4e46\": {\"quality\": 0.6183972125435541, \"cost\": 0.09699730400000002, \"time\": 154.47008938789367}, \"3ad6dcf559\": {\"quality\": 0.6195121951219512, \"cost\": 0.071971364, \"time\": 124.51828866004944}, \"3ae0de8663\": {\"quality\": 0.2573170731707317, \"cost\": 0.043973896, \"time\": 145.02047486305236}, \"3b2e8075ea\": {\"quality\": 0.7909523809523809, \"cost\": 0.007301412, \"time\": 154.77439031600952}, \"3b3676521a\": {\"quality\": 0.7468989547038328, \"cost\": 0.025589519999999998, \"time\": 245.33448405265807}, \"3b57530a56\": {\"quality\": 0.7408943089430895, \"cost\": 0.009371704, \"time\": 124.3612250328064}, \"3b6fbfa11d\": {\"quality\": 0.4241695702671313, \"cost\": 0.03699336, \"time\": 136.01204528808594}, \"3b81215e7a\": {\"quality\": 0.7074099883855982, \"cost\": 0.069720068, \"time\": 194.1675964832306}, \"3b9f8045d7\": {\"quality\": 0.5806620209059233, \"cost\": 0.09104832400000001, \"time\": 150.99677429199218}, \"3c5857683c\": {\"quality\": 0.5024390243902439, \"cost\": 0.07306957600000001, \"time\": 161.98392362594603}, \"3cbab8082e\": {\"quality\": 0.6873751451800232, \"cost\": 0.014340899999999998, \"time\": 174.04833178520204}, \"3d21104666\": {\"quality\": 0.24378629500580723, \"cost\": 0.126616, \"time\": 
80.24543118476868}, \"3d71c4dd2c\": {\"quality\": 0.6786062717770036, \"cost\": 0.010625964, \"time\": 140.9857957839966}, \"3d9e24215e\": {\"quality\": 0.3852845528455284, \"cost\": 0.028569556000000003, \"time\": 57.36162734031677}, \"3e7efee65a\": {\"quality\": 0.5565040650406504, \"cost\": 0.066932276, \"time\": 163.1447666168213}, \"3ed0ad20ed\": {\"quality\": 0.32991869918699185, \"cost\": 0.071716344, \"time\": 165.24500761032104}, \"3f1a58aec9\": {\"quality\": 0.533739837398374, \"cost\": 0.013489895999999998, \"time\": 331.1659511566162}, \"3f2b07cb78\": {\"quality\": 0.19126596980255517, \"cost\": 0.041601888, \"time\": 124.82682638168335}, \"3f3ef494b0\": {\"quality\": 0.6195121951219512, \"cost\": 0.0015848999999999998, \"time\": 51.92140054702759}, \"3f62c3fbfc\": {\"quality\": 0.19586527293844366, \"cost\": 0.0071477519999999985, \"time\": 118.16850218772888}, \"3f730d8bfe\": {\"quality\": 0.7800813008130081, \"cost\": 0.083953044, \"time\": 111.95527267456055}, \"3f8d2ee81f\": {\"quality\": 0.6478048780487805, \"cost\": 0.042814124, \"time\": 129.17971215248107}, \"40104c813f\": {\"quality\": 0.6195121951219512, \"cost\": 0.016629716, \"time\": 190.76988382339476}, \"403b05da2d\": {\"quality\": 0.6195121951219512, \"cost\": 0.013792372, \"time\": 143.9212529182434}, \"403f0726fa\": {\"quality\": 0.44464576074332174, \"cost\": 0.07387800000000001, \"time\": 62.18464155197144}, \"4098178354\": {\"quality\": 0.6802322880371661, \"cost\": 0.06734463600000001, \"time\": 214.58685207366943}, \"409ff67607\": {\"quality\": 0.6195121951219512, \"cost\": 0.0056238419999999996, \"time\": 99.12217197418212}, \"40b3b6642c\": {\"quality\": 0.4744831591173055, \"cost\": 0.06449838599999999, \"time\": 133.02915487289428}, \"412c065b83\": {\"quality\": 0.6195121951219512, \"cost\": 0.009992927999999998, \"time\": 171.42633333206177}, \"4191118787\": {\"quality\": 0.6195121951219512, \"cost\": 0.014096268, \"time\": 191.99022674560547}, \"41d5b97871\": {\"quality\": 
0.6195121951219512, \"cost\": 0.003545856, \"time\": 92.14297785758973}, \"41d8845655\": {\"quality\": 0.7837282229965157, \"cost\": 0.021811158000000004, \"time\": 109.06256651878357}, \"41ee202cac\": {\"quality\": 0.5507317073170731, \"cost\": 0.014144588000000003, \"time\": 1669.106618309021}, \"41fe4aee55\": {\"quality\": 0.6195121951219512, \"cost\": 0.06379543600000001, \"time\": 147.32905130386354}, \"42430ea391\": {\"quality\": 0.39349593495934954, \"cost\": 0.073369736, \"time\": 190.46762056350707}, \"42ddd48341\": {\"quality\": 0.4267479674796748, \"cost\": 0.023891804, \"time\": 166.09094681739808}, \"42f1e19aa7\": {\"quality\": 0.2668060394889663, \"cost\": 0.10482261600000001, \"time\": 180.36561794281005}, \"430a2ab32f\": {\"quality\": 0.49148664343786297, \"cost\": 0.08972675600000002, \"time\": 123.45088205337524}, \"4339427ad8\": {\"quality\": 0.6195121951219512, \"cost\": 0.06641424400000001, \"time\": 176.55676898956298}, \"4361bc7ea7\": {\"quality\": 0.433472706155633, \"cost\": 0.018289859999999998, \"time\": 1338.038490152359}, \"43c3cf9cb8\": {\"quality\": 0.6104065040650406, \"cost\": 0.011720016, \"time\": 66.8409245967865}, \"43d24fb32a\": {\"quality\": 0.3804878048780488, \"cost\": 0.016379803999999998, \"time\": 121.97038044929505}, \"43e9b39e5c\": {\"quality\": 0.7054587688734031, \"cost\": 0.07738095200000002, \"time\": 196.7220965385437}, \"44d6af5523\": {\"quality\": 0.6940650406504065, \"cost\": 0.010872959999999997, \"time\": 159.60600094795228}, \"44f189d813\": {\"quality\": 0.7999303135888501, \"cost\": 0.038299172000000006, \"time\": 177.3291199207306}, \"450f45a187\": {\"quality\": 0.6195121951219512, \"cost\": 0.016291588000000003, \"time\": 72.91690697669983}, \"453d0a5097\": {\"quality\": 0.39195121951219514, \"cost\": 0.029831792000000003, \"time\": 120.01631045341492}, \"4547ef4c8e\": {\"quality\": 0.4353658536585366, \"cost\": 0.065468638, \"time\": 134.60169172286987}, \"461846a52d\": {\"quality\": 0.6195121951219512, 
\"cost\": 0.006258096, \"time\": 108.38983693122864}, \"462e6ff849\": {\"quality\": 0.4353658536585366, \"cost\": 0.016916924, \"time\": 133.2640913963318}, \"4630853d32\": {\"quality\": 0.6390243902439025, \"cost\": 0.01612172, \"time\": 131.57029628753662}, \"46475b9e75\": {\"quality\": 0.7159001161440186, \"cost\": 0.011104722, \"time\": 122.19585280418396}, \"46654a1f32\": {\"quality\": 0.6195121951219512, \"cost\": 0.016634112, \"time\": 77.13833050727844}, \"466a3036b2\": {\"quality\": 0.6498257839721253, \"cost\": 0.018471648, \"time\": 207.47669353485108}, \"466d4d16dd\": {\"quality\": 0.5902439024390244, \"cost\": 0.011955879999999999, \"time\": 118.00577793121337}, \"46ed68152d\": {\"quality\": 0.5575493612078979, \"cost\": 0.016878404, \"time\": 116.81743607521057}, \"476a12876c\": {\"quality\": 0.6195121951219512, \"cost\": 0.007925976, \"time\": 132.9657735824585}, \"4778401a7a\": {\"quality\": 0.6119047619047618, \"cost\": 0.036857304, \"time\": 163.85823788642884}, \"47f9115b26\": {\"quality\": 0.6195121951219512, \"cost\": 0.06548748400000001, \"time\": 177.22612433433534}, \"48043e2304\": {\"quality\": 0.4546341463414635, \"cost\": 0.013231672000000003, \"time\": 139.5205948829651}, \"487f30e740\": {\"quality\": 0.6455284552845528, \"cost\": 0.04562078000000001, \"time\": 155.9966438770294}, \"488645cbd9\": {\"quality\": 0.6195121951219512, \"cost\": 0.0027527399999999996, \"time\": 95.60826187133789}, \"48bf87f7fe\": {\"quality\": 0.4546922183507549, \"cost\": 0.095429448, \"time\": 165.5019425392151}, \"4909061216\": {\"quality\": 0.2585365853658536, \"cost\": 0.018523724, \"time\": 212.3366045475006}, \"49731b1ccd\": {\"quality\": 0.2308246225319396, \"cost\": 0.010751964, \"time\": 158.96827306747437}, \"49ad844bd2\": {\"quality\": 0.6195121951219512, \"cost\": 0.015603239999999999, \"time\": 223.71964612007142}, \"49ca727e49\": {\"quality\": 0.653739837398374, \"cost\": 0.0016870099999999999, \"time\": 64.0728322505951}, \"4a23d8eff7\": 
{\"quality\": 0.6195121951219512, \"cost\": 0.06434071200000001, \"time\": 202.87527737617492}, \"4a555da784\": {\"quality\": 0.6274796747967479, \"cost\": 0.06717144000000001, \"time\": 190.02644109725952}, \"4a5cea8b85\": {\"quality\": 0.25497096399535424, \"cost\": 0.08279934000000001, \"time\": 153.70768852233886}, \"4a767339bd\": {\"quality\": 0.6195121951219512, \"cost\": 0.065784288, \"time\": 185.730606508255}, \"4aafd39d76\": {\"quality\": 0.6487804878048781, \"cost\": 0.011711388, \"time\": 195.04439516067504}, \"4aca6e5216\": {\"quality\": 0.4877235772357723, \"cost\": 0.029520160000000007, \"time\": 136.51992263793946}, \"4b18a647d6\": {\"quality\": 0.7812195121951219, \"cost\": 0.09365719600000001, \"time\": 201.93038654327393}, \"4bc4528402\": {\"quality\": 0.6845528455284553, \"cost\": 0.00974208, \"time\": 175.54990649223328}, \"4c158a1a4a\": {\"quality\": 0.47682926829268296, \"cost\": 0.010284996, \"time\": 130.92566409111024}, \"4c954323e3\": {\"quality\": 0.6668757259001161, \"cost\": 0.016642032, \"time\": 211.33878846168517}, \"4d91e8a27b\": {\"quality\": 0.38890824622531933, \"cost\": 0.014601055999999998, \"time\": 149.77316427230835}, \"4dc185389a\": {\"quality\": 0.6353658536585366, \"cost\": 0.063264392, \"time\": 190.5721879005432}, \"4dd3635bc3\": {\"quality\": 0.6648548199767712, \"cost\": 0.026166204, \"time\": 156.0182451725006}, \"4dd98ef398\": {\"quality\": 0.48878048780487804, \"cost\": 0.06819976000000001, \"time\": 104.27807059288025}, \"4dfacd0007\": {\"quality\": 0.8142973286875724, \"cost\": 0.037908216, \"time\": 182.91077904701234}, \"4e298ee0d4\": {\"quality\": 0.5714634146341463, \"cost\": 0.012921647999999999, \"time\": 178.78159699440002}, \"4e3443a0f9\": {\"quality\": 0.6566550522648084, \"cost\": 0.022017672, \"time\": 196.2869851589203}, \"4e4b9db2b8\": {\"quality\": 0.3020325203252033, \"cost\": 0.010630452000000002, \"time\": 134.6365571498871}, \"4e6509f614\": {\"quality\": 0.5430894308943089, \"cost\": 0.0084714, 
\"time\": 103.83811821937562}, \"4e6a83e751\": {\"quality\": 0.4271544715447154, \"cost\": 0.014789456000000001, \"time\": 1661.4491950511933}, \"4e79c8947f\": {\"quality\": 0.6573170731707317, \"cost\": 0.016321300000000004, \"time\": 72.1683876991272}, \"4e9504432b\": {\"quality\": 0.667479674796748, \"cost\": 0.07073244000000001, \"time\": 194.78891081809996}, \"4e962170dc\": {\"quality\": 0.6195121951219512, \"cost\": 0.06384445600000002, \"time\": 125.06455044746399}, \"4eb0826f21\": {\"quality\": 0.6195121951219512, \"cost\": 0.05706447600000001, \"time\": 84.61785073280335}, \"4ed41bf2e4\": {\"quality\": 0.5291056910569105, \"cost\": 0.079910268, \"time\": 152.21971316337584}, \"4f78672528\": {\"quality\": 0.7267711962833914, \"cost\": 0.073045852, \"time\": 149.74157452583313}, \"4f8cca1195\": {\"quality\": 0.7037514518002322, \"cost\": 0.021906119999999998, \"time\": 222.39478406906127}, \"500860eaa2\": {\"quality\": 0.6195121951219512, \"cost\": 0.14549248799999998, \"time\": 2286.8156877040865}, \"50701b505e\": {\"quality\": 0.6195121951219512, \"cost\": 0.01093236, \"time\": 183.55579237937928}, \"51583a901c\": {\"quality\": 0.5827526132404182, \"cost\": 0.020865352, \"time\": 207.0611351966858}, \"51aeaf9f3e\": {\"quality\": 0.6361788617886179, \"cost\": 0.02071614, \"time\": 60.56362895965576}, \"520b52b64c\": {\"quality\": 0.6005458768873403, \"cost\": 0.07544150000000002, \"time\": 98.03264541625977}, \"521314dab6\": {\"quality\": 0.5414634146341464, \"cost\": 0.011220016000000001, \"time\": 111.50055747032165}, \"5226eb7ff6\": {\"quality\": 0.7410569105691057, \"cost\": 0.028896044, \"time\": 154.04090991020203}, \"526878b5eb\": {\"quality\": 0.47317073170731705, \"cost\": 0.016668683999999996, \"time\": 211.7058870792389}, \"52c1cba6ce\": {\"quality\": 0.6341463414634146, \"cost\": 0.01895658, \"time\": 214.20635576248168}, \"52f041a70e\": {\"quality\": 0.715691056910569, \"cost\": 0.00206075, \"time\": 69.41744899749756}, \"533867574b\": 
{\"quality\": 0.6173054587688733, \"cost\": 0.06281091200000001, \"time\": 141.25713596343994}, \"53869388bb\": {\"quality\": 0.35284552845528455, \"cost\": 0.011996687999999997, \"time\": 385.71526923179624}, \"53aefd41e4\": {\"quality\": 0.5982229965156796, \"cost\": 0.03475750400000001, \"time\": 124.53529238700867}, \"53d2932c4f\": {\"quality\": 0.5304065040650406, \"cost\": 0.012760976000000002, \"time\": 143.55552654266359}, \"54375d3eba\": {\"quality\": 0.6189430894308943, \"cost\": 0.024152808000000005, \"time\": 112.89141573905945}, \"5474247f91\": {\"quality\": 0.6195121951219512, \"cost\": 0.012371476000000001, \"time\": 125.63727469444275}, \"54993bc472\": {\"quality\": 0.6341463414634146, \"cost\": 0.057053648, \"time\": 83.04090037345887}, \"55358f2285\": {\"quality\": 0.6589430894308943, \"cost\": 0.06846545600000001, \"time\": 137.55791845321656}, \"5569b4f878\": {\"quality\": 0.7922648083623692, \"cost\": 0.021984700000000003, \"time\": 105.12538561820983}, \"557d2cf7ba\": {\"quality\": 0.6195121951219512, \"cost\": 0.012524256000000001, \"time\": 145.9854365348816}, \"55c8aa8935\": {\"quality\": 0.40569105691056906, \"cost\": 0.018154683999999997, \"time\": 185.0385425567627}, \"55e6bf8f14\": {\"quality\": 0.4878048780487805, \"cost\": 0.06334899600000002, \"time\": 175.02917833328246}, \"56a0660622\": {\"quality\": 0.5082926829268293, \"cost\": 0.05803248000000001, \"time\": 128.3964723110199}, \"56a29a28c5\": {\"quality\": 0.6195121951219512, \"cost\": 0.006270012, \"time\": 96.01242618560791}, \"56c4fd5056\": {\"quality\": 0.6195121951219512, \"cost\": 0.128267136, \"time\": 190.17179284095764}, \"5703697dbd\": {\"quality\": 0.13241579558652733, \"cost\": 0.009410826, \"time\": 197.99875745773315}, \"5718f2ed80\": {\"quality\": 0.7897909407665504, \"cost\": 0.0071524019999999995, \"time\": 187.86544318199157}, \"572a02a59a\": {\"quality\": 0.6716260162601626, \"cost\": 0.006410382000000001, \"time\": 171.54228215217591}, \"5750713a41\": 
{\"quality\": 0.5406504065040652, \"cost\": 0.008179103999999998, \"time\": 157.33118991851808}, \"57757ef15e\": {\"quality\": 0.5353658536585366, \"cost\": 0.011508492, \"time\": 147.7572299003601}, \"579c81bbe0\": {\"quality\": 0.6146341463414634, \"cost\": 0.007194356000000001, \"time\": 109.05034627914429}, \"57bed1722f\": {\"quality\": 0.6195121951219512, \"cost\": 0.00790296, \"time\": 105.67013120651245}, \"585ba6d20b\": {\"quality\": 0.6342973286875726, \"cost\": 0.016752543999999998, \"time\": 243.40559344291688}, \"589267ac64\": {\"quality\": 0.5423112659698026, \"cost\": 0.01900224, \"time\": 108.88611903190613}, \"589a1cea79\": {\"quality\": 0.6195121951219512, \"cost\": 0.014395692000000002, \"time\": 192.94557881355286}, \"58ca42839b\": {\"quality\": 0.7067479674796748, \"cost\": 0.042082800000000004, \"time\": 164.78914923667907}, \"58dc373441\": {\"quality\": 0.48292682926829267, \"cost\": 0.062875472, \"time\": 206.20198302268983}, \"59006532b4\": {\"quality\": 0.4398373983739837, \"cost\": 0.01421172, \"time\": 209.91840863227844}, \"59326c4e00\": {\"quality\": 0.6195121951219512, \"cost\": 0.009057812, \"time\": 105.74859399795533}, \"593975c75b\": {\"quality\": 0.688780487804878, \"cost\": 0.09570490400000001, \"time\": 229.09487490653993}, \"596b4f8694\": {\"quality\": 0.4878048780487805, \"cost\": 0.066254572, \"time\": 204.71900162696838}, \"5971ba4e0d\": {\"quality\": 0.46154471544715453, \"cost\": 0.02107208, \"time\": 666.5095714569092}, \"5996465c0a\": {\"quality\": 0.7142276422764228, \"cost\": 0.0669193, \"time\": 276.37003965377806}, \"59d70b9f65\": {\"quality\": 0.7408943089430894, \"cost\": 0.06718589000000001, \"time\": 181.75046825408936}, \"59f887b67c\": {\"quality\": 0.6195121951219512, \"cost\": 0.015060804, \"time\": 185.60658679008483}, \"5a22920db4\": {\"quality\": 0.7369802555168409, \"cost\": 0.004994448, \"time\": 107.91197681427002}, \"5a35020d45\": {\"quality\": 0.6195121951219512, \"cost\": 0.012432768, \"time\": 
174.93387031555176}, \"5aa43da1fc\": {\"quality\": 0.6634959349593497, \"cost\": 0.04908599600000001, \"time\": 142.87173461914062}, \"5ae0d88127\": {\"quality\": 0.6195121951219512, \"cost\": 0.07555794, \"time\": 205.7282470703125}, \"5b10fbdbe1\": {\"quality\": 0.443089430894309, \"cost\": 0.022101736000000004, \"time\": 206.7174467563629}, \"5bade9eb85\": {\"quality\": 0.686225319396051, \"cost\": 0.014263044000000002, \"time\": 240.30284028053285}, \"5be16744bf\": {\"quality\": 0.40822299651567945, \"cost\": 0.020666648000000003, \"time\": 183.62090783119203}, \"5c5055e252\": {\"quality\": 0.6341463414634146, \"cost\": 0.057907604, \"time\": 170.79225449562074}, \"5c53feccd9\": {\"quality\": 0.535191637630662, \"cost\": 0.022622808, \"time\": 214.88010306358336}, \"5c77c7c2b2\": {\"quality\": 0.486829268292683, \"cost\": 0.01199347, \"time\": 164.7836329460144}, \"5d298b5b48\": {\"quality\": 0.5081300813008129, \"cost\": 0.07153200800000001, \"time\": 211.4806851863861}, \"5d41515d2e\": {\"quality\": 0.6988037166085945, \"cost\": 0.041189079999999996, \"time\": 152.06495885849}, \"5d4babc723\": {\"quality\": 0.2809291521486643, \"cost\": 0.06604162400000001, \"time\": 172.66046080589294}, \"5d79b50feb\": {\"quality\": 0.3548780487804878, \"cost\": 0.018818415999999998, \"time\": 109.70083494186402}, \"5dc216cd6b\": {\"quality\": 0.6195121951219512, \"cost\": 0.014456452, \"time\": 143.22672152519226}, \"5dd68c1b8f\": {\"quality\": 0.6670731707317074, \"cost\": 0.002938692, \"time\": 95.08386464118958}, \"5de4a882c1\": {\"quality\": 0.3775609756097561, \"cost\": 0.018079487999999998, \"time\": 163.3632432937622}, \"5e04e1c72d\": {\"quality\": 0.5864227642276423, \"cost\": 0.060815928000000005, \"time\": 139.02201280593872}, \"5e923cee9e\": {\"quality\": 0.7142740998838559, \"cost\": 0.037600364, \"time\": 188.52763104438782}, \"5ea2fab380\": {\"quality\": 0.5060975609756098, \"cost\": 0.006009479999999999, \"time\": 89.62659049034119}, \"5eb3bb525b\": 
{\"quality\": 0.6526829268292682, \"cost\": 0.012761808000000003, \"time\": 132.20405316352844}, \"5eea899380\": {\"quality\": 0.4015679442508711, \"cost\": 0.07097477600000002, \"time\": 210.64328713417052}, \"5f37b3902b\": {\"quality\": 0.6195121951219512, \"cost\": 0.010051332, \"time\": 109.50351824760438}, \"5f9282df3c\": {\"quality\": 0.37855981416957035, \"cost\": 0.040870088000000006, \"time\": 185.25703506469728}, \"6019884cf3\": {\"quality\": 0.698931475029036, \"cost\": 0.06932, \"time\": 240.45240106582642}, \"606352363e\": {\"quality\": 0.19362369337979093, \"cost\": 0.070268456, \"time\": 234.0612476825714}, \"608728f868\": {\"quality\": 0.6, \"cost\": 0.068669716, \"time\": 218.80884642601012}, \"60b9e936f1\": {\"quality\": 0.6195121951219512, \"cost\": 0.056154416000000006, \"time\": 87.54008474349976}, \"60cb623c53\": {\"quality\": 0.46341463414634154, \"cost\": 0.004160484000000001, \"time\": 67.07470922470092}, \"6234de86b4\": {\"quality\": 0.680650406504065, \"cost\": 0.08330867000000002, \"time\": 161.33356328010558}, \"62352c6854\": {\"quality\": 0.347479674796748, \"cost\": 0.07185688, \"time\": 218.85872540473937}, \"63a0aaebed\": {\"quality\": 0.6195121951219512, \"cost\": 0.005112576000000001, \"time\": 118.10645513534546}, \"647dda686f\": {\"quality\": 0.740650406504065, \"cost\": 0.038015924, \"time\": 197.1805561542511}, \"6511b21ded\": {\"quality\": 0.5281068524970964, \"cost\": 0.036544824, \"time\": 129.3101086616516}, \"652c0f4bdf\": {\"quality\": 0.6195121951219512, \"cost\": 0.01854406, \"time\": 200.11599526405334}, \"6533c85913\": {\"quality\": 0.671869918699187, \"cost\": 0.016586988, \"time\": 216.59043788909912}, \"65627426e0\": {\"quality\": 0.5750058072009292, \"cost\": 0.011773232000000003, \"time\": 1798.7506415367127}, \"65b76da9c6\": {\"quality\": 0.6195121951219512, \"cost\": 0.01047374, \"time\": 144.71517310142517}, \"65be1c1306\": {\"quality\": 0.4783739837398374, \"cost\": 0.044860399999999995, \"time\": 
1590.2891113758087}, \"65e0216208\": {\"quality\": 0.6195121951219512, \"cost\": 0.01263888, \"time\": 234.11023540496825}, \"65eee615d7\": {\"quality\": 0.6295934959349594, \"cost\": 0.0011287079999999998, \"time\": 67.5095160961151}, \"6623d7a5ac\": {\"quality\": 0.7706504065040651, \"cost\": 0.067642702, \"time\": 238.91902923583984}, \"66750c0934\": {\"quality\": 0.7448896631823461, \"cost\": 0.013174562, \"time\": 184.41564054489135}, \"66776ec181\": {\"quality\": 0.5121951219512195, \"cost\": 0.014528544, \"time\": 218.20506610870362}, \"66e5ae0a21\": {\"quality\": 0.6195121951219512, \"cost\": 0.0327125, \"time\": 125.57788677215576}, \"6750a8d7a7\": {\"quality\": 0.7970847851335656, \"cost\": 0.03363056, \"time\": 169.0002564907074}, \"67632141f6\": {\"quality\": 0.6195121951219512, \"cost\": 0.007475844000000001, \"time\": 102.3034242630005}, \"67868fcff6\": {\"quality\": 0.44991869918699184, \"cost\": 0.019169336000000002, \"time\": 185.75657877922058}, \"67bab6732d\": {\"quality\": 0.38455284552845526, \"cost\": 0.016392688, \"time\": 159.32311158180238}, \"67fe399cf1\": {\"quality\": 0.624390243902439, \"cost\": 0.010455704, \"time\": 154.35488867759705}, \"6846bd8fb3\": {\"quality\": 0.3334146341463414, \"cost\": 0.009431328, \"time\": 157.63436169624327}, \"68b4cc3e39\": {\"quality\": 0.44522648083623695, \"cost\": 0.09505392000000001, \"time\": 195.20148849487305}, \"69a029ae36\": {\"quality\": 0.6195121951219512, \"cost\": 0.042185764, \"time\": 178.9181830883026}, \"69b3b67de6\": {\"quality\": 0.5991869918699188, \"cost\": 0.016261864, \"time\": 182.98831782341003}, \"69bf3f6ba0\": {\"quality\": 0.5317073170731708, \"cost\": 0.012870332000000002, \"time\": 144.30875706672668}, \"6a022c3f73\": {\"quality\": 0.6390243902439025, \"cost\": 0.006610608, \"time\": 128.81844487190247}, \"6a10c53ad8\": {\"quality\": 0.7610220673635307, \"cost\": 0.018649594, \"time\": 219.1147134780884}, \"6a6348f69d\": {\"quality\": 0.4230894308943089, \"cost\": 
0.014753328, \"time\": 1653.7405955314637}, \"6a74a11bee\": {\"quality\": 0.5999303135888503, \"cost\": 0.06506421600000001, \"time\": 158.56066660881044}, \"6a90ec29dd\": {\"quality\": 0.26048780487804873, \"cost\": 0.032533988, \"time\": 121.64445943832398}, \"6aac59742a\": {\"quality\": 0.6195121951219512, \"cost\": 0.015229139999999999, \"time\": 210.34569411277772}, \"6b0862c597\": {\"quality\": 0.6967479674796748, \"cost\": 0.061550596, \"time\": 178.11355109214782}, \"6b3c16def2\": {\"quality\": 0.46341463414634154, \"cost\": 0.002938128, \"time\": 90.25655012130737}, \"6b99b0e901\": {\"quality\": 0.5002439024390244, \"cost\": 0.036148564, \"time\": 119.39121503829956}, \"6b9b2b3515\": {\"quality\": 0.3779326364692218, \"cost\": 0.088494424, \"time\": 215.23190116882324}, \"6bcc02962b\": {\"quality\": 0.6769105691056911, \"cost\": 0.050208152000000006, \"time\": 122.08708581924438}, \"6c1987a9e3\": {\"quality\": 0.5414634146341464, \"cost\": 0.06623874400000002, \"time\": 188.0838879108429}, \"6c50123ee1\": {\"quality\": 0.5729268292682927, \"cost\": 0.05634123, \"time\": 96.05054454803467}, \"6c67c36480\": {\"quality\": 0.6195121951219512, \"cost\": 0.066650968, \"time\": 282.4478175640106}, \"6c9b9f1363\": {\"quality\": 0.5295121951219512, \"cost\": 0.06645499, \"time\": 267.66957621574403}, \"6cc813aa68\": {\"quality\": 0.6195121951219512, \"cost\": 0.008967600000000001, \"time\": 110.58158135414124}, \"6d20c6ace0\": {\"quality\": 0.5440301974448316, \"cost\": 0.043662036, \"time\": 164.7413963794708}, \"6d444fe21a\": {\"quality\": 0.43720092915214864, \"cost\": 0.028377092000000003, \"time\": 181.5676950931549}, \"6db70dc3b6\": {\"quality\": 0.5349593495934959, \"cost\": 0.018471248000000003, \"time\": 1839.0920428276063}, \"6e0690f576\": {\"quality\": 0.6195121951219512, \"cost\": 0.003072384, \"time\": 123.15826468467712}, \"6e06cc804f\": {\"quality\": 0.7730894308943089, \"cost\": 0.122657024, \"time\": 161.19403052330017}, \"6e24048a2e\": 
{\"quality\": 0.6522764227642277, \"cost\": 0.032542016, \"time\": 137.39676880836487}, \"6e93514f45\": {\"quality\": 0.6195121951219512, \"cost\": 0.008316468, \"time\": 125.78651990890503}, \"6eae47102b\": {\"quality\": 0.4591869918699187, \"cost\": 0.012840492, \"time\": 157.67578744888306}, \"6ed4cae469\": {\"quality\": 0.6234959349593496, \"cost\": 0.021061364000000003, \"time\": 89.84902305603028}, \"6ef3b7127e\": {\"quality\": 0.5697560975609757, \"cost\": 0.012670767999999999, \"time\": 80.09802565574645}, \"6f60a05c33\": {\"quality\": 0.6195121951219512, \"cost\": 0.010392396, \"time\": 162.94261040687562}, \"6fe0b3f929\": {\"quality\": 0.6195121951219512, \"cost\": 0.0075620999999999996, \"time\": 134.45373644828797}, \"700ab1d309\": {\"quality\": 0.6195121951219512, \"cost\": 0.02821224, \"time\": 147.49021158218383}, \"7040e83d52\": {\"quality\": 0.5623693379790942, \"cost\": 0.040416854, \"time\": 170.36972126960754}, \"7046765af8\": {\"quality\": 0.200801393728223, \"cost\": 0.06678788599999999, \"time\": 205.17822675704957}, \"70b7c92ce8\": {\"quality\": 0.6439024390243903, \"cost\": 0.06304620400000002, \"time\": 216.9712851047516}, \"70c850e039\": {\"quality\": 0.6195121951219512, \"cost\": 0.00682044, \"time\": 111.72890758514404}, \"7112a7e64c\": {\"quality\": 0.5082113821138211, \"cost\": 0.008551890000000001, \"time\": 102.93057270050049}, \"71b615468b\": {\"quality\": 0.31138211382113823, \"cost\": 0.016398592, \"time\": 133.36204581260682}, \"723fd5589a\": {\"quality\": 0.6195121951219512, \"cost\": 0.025608394, \"time\": 148.1404324531555}, \"7250da0f41\": {\"quality\": 0.6195121951219512, \"cost\": 0.010252872, \"time\": 102.41908440589904}, \"7274a50778\": {\"quality\": 0.5536585365853659, \"cost\": 0.06315958, \"time\": 127.02501587867737}, \"72d022ce33\": {\"quality\": 0.8057259001161441, \"cost\": 0.08692716000000002, \"time\": 196.42877383232116}, \"7347cf0308\": {\"quality\": 0.6195121951219512, \"cost\": 0.007518113999999999, 
\"time\": 112.80636467933655}, \"736e652158\": {\"quality\": 0.6195121951219512, \"cost\": 0.010933332, \"time\": 202.134419298172}, \"739b1f81dc\": {\"quality\": 0.7352845528455284, \"cost\": 0.005161632, \"time\": 102.21428322792053}, \"742a5c0552\": {\"quality\": 0.6585365853658537, \"cost\": 0.05980691200000001, \"time\": 95.91068015098571}, \"742ec1b2e1\": {\"quality\": 0.691869918699187, \"cost\": 0.06350464000000001, \"time\": 189.14654784202577}, \"7435fd54f8\": {\"quality\": 0.6567131242741, \"cost\": 0.06201505600000001, \"time\": 203.98798232078553}, \"7445d99939\": {\"quality\": 0.4034843205574912, \"cost\": 0.011135484, \"time\": 153.3177831172943}, \"7466a5f424\": {\"quality\": 0.4263414634146342, \"cost\": 0.0244955, \"time\": 218.84586930274963}, \"74a0be215b\": {\"quality\": 0.48292682926829267, \"cost\": 0.06485306800000001, \"time\": 214.73199005126952}, \"74d7f64b8c\": {\"quality\": 0.6195121951219512, \"cost\": 0.011531088000000002, \"time\": 211.1984474658966}, \"7524905580\": {\"quality\": 0.7694889663182346, \"cost\": 0.012599868, \"time\": 157.22432436943055}, \"752d9649f2\": {\"quality\": 0.5467479674796748, \"cost\": 0.022367612000000002, \"time\": 204.80810337066652}, \"75d61c2cd0\": {\"quality\": 0.7049128919860628, \"cost\": 0.062316, \"time\": 160.31273555755615}, \"7604c0aa13\": {\"quality\": 0.6195121951219512, \"cost\": 0.0142076, \"time\": 142.32078042030335}, \"76c09db721\": {\"quality\": 0.7820441347270615, \"cost\": 0.04096148000000001, \"time\": 236.8196849822998}, \"774f268b66\": {\"quality\": 0.6810569105691057, \"cost\": 0.08915512, \"time\": 201.10393743515016}, \"7765576286\": {\"quality\": 0.567479674796748, \"cost\": 0.016253124, \"time\": 222.17675213813783}, \"77c02b00c1\": {\"quality\": 0.5561091753774681, \"cost\": 0.017938308, \"time\": 201.10154690742493}, \"7801da66b9\": {\"quality\": 0.6195121951219512, \"cost\": 0.00746667, \"time\": 114.64372763633727}, \"782d52674e\": {\"quality\": 0.3201974448315911, 
\"cost\": 0.06483101399999999, \"time\": 193.24506635665892}, \"786e5d0af5\": {\"quality\": 0.6195121951219512, \"cost\": 0.016454096, \"time\": 197.87756876945497}, \"7878563d63\": {\"quality\": 0.1632171893147503, \"cost\": 0.028198652, \"time\": 84.60407543182373}, \"79e1ca9b3c\": {\"quality\": 0.6195121951219512, \"cost\": 0.06867538, \"time\": 110.48155570030212}, \"79fad58f07\": {\"quality\": 0.46674796747967484, \"cost\": 0.029654968000000004, \"time\": 679.6173066616059}, \"7a207b42a8\": {\"quality\": 0.6195121951219512, \"cost\": 0.012325835999999998, \"time\": 163.009268951416}, \"7a2cdc546c\": {\"quality\": 0.2631707317073171, \"cost\": 0.06310833, \"time\": 160.65065565109253}, \"7a42a77788\": {\"quality\": 0.6195121951219512, \"cost\": 0.06019363200000001, \"time\": 138.42030897140503}, \"7a58d3472b\": {\"quality\": 0.5959349593495935, \"cost\": 0.025425438, \"time\": 147.28676762580872}, \"7b3937c1f1\": {\"quality\": 0.39544715447154466, \"cost\": 0.02868438, \"time\": 89.45209522247315}, \"7b6f44618e\": {\"quality\": 0.5699186991869919, \"cost\": 0.018827976000000003, \"time\": 179.5670045375824}, \"7c62576527\": {\"quality\": 0.5152845528455284, \"cost\": 0.042298876, \"time\": 167.9848653316498}, \"7c89a2b69e\": {\"quality\": 0.6195121951219512, \"cost\": 0.019231344, \"time\": 252.72414374351501}, \"7c96c9712f\": {\"quality\": 0.6536585365853659, \"cost\": 0.06122004000000001, \"time\": 127.95261840820312}, \"7ca066aa1c\": {\"quality\": 0.7818002322880371, \"cost\": 0.040026332000000005, \"time\": 187.7031367778778}, \"7cb5591f27\": {\"quality\": 0.3416840882694541, \"cost\": 0.0826827, \"time\": 138.52483682632447}, \"7cf56a7fbc\": {\"quality\": 0.6195121951219512, \"cost\": 0.08315224400000001, \"time\": 106.8193588256836}, \"7d44f0959d\": {\"quality\": 0.46680603948896626, \"cost\": 0.011511138, \"time\": 133.2327687740326}, \"7d60c38c5c\": {\"quality\": 0.7220441347270616, \"cost\": 0.00538134, \"time\": 131.21526570320128}, \"7d9b4535ac\": 
{\"quality\": 0.6769802555168408, \"cost\": 0.063389518, \"time\": 211.52207164764405}, \"7daf7ff182\": {\"quality\": 0.5284552845528456, \"cost\": 0.017232744, \"time\": 170.8186996459961}, \"7dcedb3d02\": {\"quality\": 0.6593960511033682, \"cost\": 0.06829533600000001, \"time\": 231.27069878578186}, \"7e22f12cd1\": {\"quality\": 0.5735540069686411, \"cost\": 0.03302558, \"time\": 186.45001711845399}, \"7e53a50b13\": {\"quality\": 0.4565737514518003, \"cost\": 0.033876876, \"time\": 151.98724694252013}, \"7ed07ad40a\": {\"quality\": 0.6177700348432056, \"cost\": 0.0116127, \"time\": 211.14939546585083}, \"7fa67a7656\": {\"quality\": 0.5077235772357723, \"cost\": 0.01125696, \"time\": 107.8444951057434}, \"7fc6c84bdf\": {\"quality\": 0.7865969802555168, \"cost\": 0.03407424, \"time\": 176.73210568428038}, \"7ff8a779cc\": {\"quality\": 0.7043089430894309, \"cost\": 0.06711379800000002, \"time\": 168.2764458656311}, \"801af99400\": {\"quality\": 0.7181765389082462, \"cost\": 0.07170897000000001, \"time\": 184.03885922431945}, \"806881adcb\": {\"quality\": 0.6209175377468059, \"cost\": 0.012441478, \"time\": 130.4481972694397}, \"80ad4122e4\": {\"quality\": 0.6211382113821139, \"cost\": 0.07175158800000002, \"time\": 199.85183753967286}, \"80be7df955\": {\"quality\": 0.6195121951219512, \"cost\": 0.0029887200000000003, \"time\": 99.58167638778687}, \"80bf60c422\": {\"quality\": 0.6195121951219512, \"cost\": 0.005357952000000001, \"time\": 138.18060364723206}, \"81333c7a33\": {\"quality\": 0.5926829268292683, \"cost\": 0.027174752000000003, \"time\": 143.90170640945433}, \"813e75210b\": {\"quality\": 0.3377816492450639, \"cost\": 0.05362793600000001, \"time\": 1273.6602407932282}, \"81660ae8b2\": {\"quality\": 0.6195121951219512, \"cost\": 0.06746300400000001, \"time\": 194.1754295349121}, \"816958b5d1\": {\"quality\": 0.6963414634146342, \"cost\": 0.07060810400000002, \"time\": 230.40107922554017}, \"81ab2ef3f4\": {\"quality\": 0.3821138211382114, \"cost\": 
0.030218416000000005, \"time\": 135.14831619262696}, \"829df73946\": {\"quality\": 0.6964227642276423, \"cost\": 0.008790552, \"time\": 111.97322883605958}, \"82ea1bd1b9\": {\"quality\": 0.8085365853658537, \"cost\": 0.069294098, \"time\": 143.64819779396058}, \"8357183895\": {\"quality\": 0.6195121951219512, \"cost\": 0.014406096000000002, \"time\": 247.72333025932312}, \"8392a6083a\": {\"quality\": 0.6195121951219512, \"cost\": 0.017490875999999995, \"time\": 201.15325956344606}, \"83aee532b7\": {\"quality\": 0.6195121951219512, \"cost\": 0.064240072, \"time\": 249.5134038925171}, \"83b26646c3\": {\"quality\": 0.6652845528455285, \"cost\": 0.04101521600000001, \"time\": 139.2657244682312}, \"83c9e66ec6\": {\"quality\": 0.6439024390243903, \"cost\": 0.012396760000000001, \"time\": 179.99957365989684}, \"847a0e5db5\": {\"quality\": 0.6909756097560976, \"cost\": 0.06664256800000001, \"time\": 183.08391642570496}, \"847fd49235\": {\"quality\": 0.3338327526132404, \"cost\": 0.006645311999999999, \"time\": 167.98347339630126}, \"849100224d\": {\"quality\": 0.3345528455284553, \"cost\": 0.08884810000000001, \"time\": 220.43858489990234}, \"84b91c37ab\": {\"quality\": 0.24628339140534267, \"cost\": 0.069210264, \"time\": 236.29133720397948}, \"8519bef585\": {\"quality\": 0.6195121951219512, \"cost\": 0.005065452, \"time\": 122.15238361358642}, \"85c94a5505\": {\"quality\": 0.4611382113821138, \"cost\": 0.056482279999999996, \"time\": 1274.952497625351}, \"85eda38404\": {\"quality\": 0.6357142857142857, \"cost\": 0.01686636, \"time\": 103.26268811225891}, \"862183bfb9\": {\"quality\": 0.6951219512195121, \"cost\": 0.012958452, \"time\": 162.91549973487855}, \"8631e49c94\": {\"quality\": 0.6439024390243903, \"cost\": 0.06574006400000001, \"time\": 227.21288151741027}, \"8668f65f05\": {\"quality\": 0.6195121951219512, \"cost\": 0.01596044, \"time\": 191.07359557151796}, \"86bf6375af\": {\"quality\": 0.6560975609756097, \"cost\": 0.06964888, \"time\": 260.4157979488373}, 
\"870e2f87b4\": {\"quality\": 0.7660046457607433, \"cost\": 0.025822204000000005, \"time\": 201.3596661567688}, \"887ad124e1\": {\"quality\": 0.709349593495935, \"cost\": 0.0070279439999999995, \"time\": 150.77690081596376}, \"8886cb3082\": {\"quality\": 0.6834843205574914, \"cost\": 0.019317688000000003, \"time\": 211.99706134796142}, \"88e71efa9b\": {\"quality\": 0.28095238095238095, \"cost\": 0.046789276000000005, \"time\": 186.75326628684996}, \"8941621423\": {\"quality\": 0.6786991869918699, \"cost\": 0.012530784, \"time\": 167.57245116233827}, \"8950a6efe0\": {\"quality\": 0.48292682926829267, \"cost\": 0.05844699200000001, \"time\": 136.73901495933532}, \"8961e4d901\": {\"quality\": 0.6195121951219512, \"cost\": 0.010754568, \"time\": 156.90380158424378}, \"8974aa89a0\": {\"quality\": 0.5727642276422764, \"cost\": 0.01088668, \"time\": 377.91359758377075}, \"89836d2020\": {\"quality\": 0.6195121951219512, \"cost\": 0.06319070400000001, \"time\": 132.40995478630066}, \"89a289907e\": {\"quality\": 0.7753426248548199, \"cost\": 0.014910682, \"time\": 212.81768522262573}, \"89a35a09b1\": {\"quality\": 0.525609756097561, \"cost\": 0.011930976, \"time\": 194.43942880630493}, \"8a3a35c762\": {\"quality\": 0.6390243902439025, \"cost\": 0.07371796400000001, \"time\": 188.05857005119324}, \"8a50695d1f\": {\"quality\": 0.5045644599303136, \"cost\": 0.027424088000000003, \"time\": 111.09706916809083}, \"8ab351aa13\": {\"quality\": 0.6526829268292682, \"cost\": 0.089840192, \"time\": 137.60196342468262}, \"8ac8b5773a\": {\"quality\": 0.6292682926829268, \"cost\": 0.06544508800000001, \"time\": 216.46768646240236}, \"8acd758b7f\": {\"quality\": 0.6600929152148665, \"cost\": 0.008952419999999999, \"time\": 151.0333396911621}, \"8b10891ea5\": {\"quality\": 0.4276422764227642, \"cost\": 0.07350195200000001, \"time\": 188.96431422233582}, \"8b721bbc6f\": {\"quality\": 0.6195121951219512, \"cost\": 0.012898752, \"time\": 215.82190022468566}, \"8b77535cce\": {\"quality\": 
0.7979558652729384, \"cost\": 0.07287564, \"time\": 215.97581152915956}, \"8bbbe0f52a\": {\"quality\": 0.6195121951219512, \"cost\": 0.008104248, \"time\": 149.25933980941772}, \"8bc184f385\": {\"quality\": 0.37040650406504066, \"cost\": 0.008849639999999999, \"time\": 130.85588617324828}, \"8bf5c3eadc\": {\"quality\": 0.6195121951219512, \"cost\": 0.008037444, \"time\": 130.57925519943237}, \"8bf80a50cb\": {\"quality\": 0.45910569105691057, \"cost\": 0.06549705000000001, \"time\": 191.59308562278747}, \"8c274ca255\": {\"quality\": 0.7277932636469222, \"cost\": 0.07598581600000001, \"time\": 203.60676689147948}, \"8d7594020b\": {\"quality\": 0.6195121951219512, \"cost\": 0.06030525600000001, \"time\": 142.03003664016722}, \"8d79e03266\": {\"quality\": 0.728513356562137, \"cost\": 0.019196192, \"time\": 166.50991864204406}, \"8d90814b94\": {\"quality\": 0.35382113821138217, \"cost\": 0.019129876, \"time\": 218.55778388977052}, \"8e1a01da19\": {\"quality\": 0.6195121951219512, \"cost\": 0.06321022000000001, \"time\": 206.71381578445434}, \"8e2498635d\": {\"quality\": 0.8135772357723576, \"cost\": 0.142084164, \"time\": 153.71730332374574}, \"8e5842ccbd\": {\"quality\": 0.7329268292682927, \"cost\": 0.007325802000000001, \"time\": 112.95453724861144}, \"8e5daf241e\": {\"quality\": 0.6923228803716608, \"cost\": 0.060446860000000005, \"time\": 186.30223965644836}, \"8e9715ee01\": {\"quality\": 0.5639837398373984, \"cost\": 0.08655280400000002, \"time\": 139.08190727233887}, \"8e9b7300d4\": {\"quality\": 0.4639837398373984, \"cost\": 0.06058211200000001, \"time\": 150.15705633163452}, \"8f29fab8ac\": {\"quality\": 0.4739140534262486, \"cost\": 0.06764498000000001, \"time\": 200.2176959514618}, \"8f44d89429\": {\"quality\": 0.7065040650406504, \"cost\": 0.069996624, \"time\": 185.83366765975953}, \"8f4caddfe6\": {\"quality\": 0.5421138211382114, \"cost\": 0.021058936, \"time\": 161.7523732662201}, \"8f4edde3f0\": {\"quality\": 0.5726480836236936, \"cost\": 0.03185762, 
\"time\": 159.63181648254394}, \"8f9cefbc22\": {\"quality\": 0.6585365853658537, \"cost\": 0.05867756800000001, \"time\": 105.53310813903809}, \"9025e2480f\": {\"quality\": 0.6595934959349593, \"cost\": 0.00965967, \"time\": 128.92424964904785}, \"9028588af4\": {\"quality\": 0.35292682926829266, \"cost\": 0.09341379200000001, \"time\": 1090.4150963783263}, \"9059fd80ad\": {\"quality\": 0.5902439024390244, \"cost\": 0.027686471999999997, \"time\": 2047.5434857845307}, \"90d5e40c1b\": {\"quality\": 0.6195121951219512, \"cost\": 0.007589304, \"time\": 120.88176670074463}, \"90d9a86a2a\": {\"quality\": 0.6195121951219512, \"cost\": 0.0036013499999999997, \"time\": 98.5603612422943}, \"90ff13783c\": {\"quality\": 0.6255284552845529, \"cost\": 0.018824976, \"time\": 219.693346452713}, \"90ff8eb055\": {\"quality\": 0.5428571428571429, \"cost\": 0.015549384, \"time\": 206.51184477806092}, \"9104e31369\": {\"quality\": 0.6640650406504065, \"cost\": 0.022935780000000003, \"time\": 229.26985802650452}, \"918983323f\": {\"quality\": 0.5085365853658537, \"cost\": 0.045187920000000006, \"time\": 1981.811208820343}, \"91928dfdd9\": {\"quality\": 0.6747967479674797, \"cost\": 0.06523348, \"time\": 182.1119598388672}, \"91c800af6b\": {\"quality\": 0.6161904761904762, \"cost\": 0.018671568000000003, \"time\": 212.29187331199645}, \"91e841cfd5\": {\"quality\": 0.6195121951219512, \"cost\": 0.067784204, \"time\": 200.92469301223755}, \"9253901a1f\": {\"quality\": 0.6195121951219512, \"cost\": 0.07052953600000002, \"time\": 216.47048025131227}, \"9288642e53\": {\"quality\": 0.6195121951219512, \"cost\": 0.008932916, \"time\": 94.32965250015259}, \"92ba9c5be3\": {\"quality\": 0.567479674796748, \"cost\": 0.064902632, \"time\": 138.75575132369994}, \"92c9dcd43b\": {\"quality\": 0.6195121951219512, \"cost\": 0.014990692, \"time\": 237.96368551254272}, \"93011c0821\": {\"quality\": 0.6195121951219512, \"cost\": 0.019036675999999995, \"time\": 267.35766806602476}, \"933b4d17dd\": 
{\"quality\": 0.5723577235772358, \"cost\": 0.012143372, \"time\": 172.0639699459076}, \"9373267bdb\": {\"quality\": 0.5410452961672474, \"cost\": 0.06831846000000001, \"time\": 258.8440625667572}, \"94010928c6\": {\"quality\": 0.25325203252032524, \"cost\": 0.011834136, \"time\": 131.6460223197937}, \"9403809e44\": {\"quality\": 0.4588617886178861, \"cost\": 0.016092682, \"time\": 231.8062086582184}, \"940c88ddc5\": {\"quality\": 0.5727874564459932, \"cost\": 0.029609188, \"time\": 126.90794463157654}, \"948f4081ba\": {\"quality\": 0.6792102206736355, \"cost\": 0.039854912000000006, \"time\": 248.46457772254945}, \"94ac356663\": {\"quality\": 0.5626016260162602, \"cost\": 0.012098179999999998, \"time\": 162.48227286338806}, \"94dff9a424\": {\"quality\": 0.7495934959349594, \"cost\": 0.013244731999999999, \"time\": 187.24048733711243}, \"9508356a2e\": {\"quality\": 0.6195121951219512, \"cost\": 0.012070511999999999, \"time\": 184.9062297821045}, \"9539d0e28c\": {\"quality\": 0.37120789779326363, \"cost\": 0.04079825200000001, \"time\": 198.3246217250824}, \"956bdcc254\": {\"quality\": 0.12377468060394889, \"cost\": 0.026780572000000002, \"time\": 157.82376141548156}, \"957d0dafc5\": {\"quality\": 0.5816260162601626, \"cost\": 0.067579808, \"time\": 153.21978812217714}, \"9594b0c783\": {\"quality\": 0.6195121951219512, \"cost\": 0.006768972, \"time\": 171.20207090377806}, \"95a7b80c2a\": {\"quality\": 0.797979094076655, \"cost\": 0.02950122, \"time\": 181.7308371067047}, \"964c671f18\": {\"quality\": 0.7772822299651567, \"cost\": 0.021085212, \"time\": 212.14437518119811}, \"9679fe2b69\": {\"quality\": 0.6195121951219512, \"cost\": 0.007065155999999999, \"time\": 184.30773782730103}, \"968fc95038\": {\"quality\": 0.4525319396051103, \"cost\": 0.005385312, \"time\": 147.3559940338135}, \"96b487c724\": {\"quality\": 0.440650406504065, \"cost\": 0.0038495599999999993, \"time\": 780.6981408596039}, \"96c30205f5\": {\"quality\": 0.5457723577235772, \"cost\": 0.064298388, 
\"time\": 146.8725693702698}, \"96f87d6483\": {\"quality\": 0.7184436701509872, \"cost\": 0.021446996000000003, \"time\": 236.88481707572936}, \"972c83b002\": {\"quality\": 0.6195121951219512, \"cost\": 0.008979192, \"time\": 162.82133150100708}, \"977a4d6b6b\": {\"quality\": 0.28170731707317076, \"cost\": 0.021416136000000002, \"time\": 220.9775969028473}, \"97bc30bd83\": {\"quality\": 0.21211382113821137, \"cost\": 0.015064284, \"time\": 209.8210876464844}, \"980db5f95f\": {\"quality\": 0.6791869918699187, \"cost\": 0.059587308000000005, \"time\": 168.43710894584655}, \"9836765d41\": {\"quality\": 0.6195121951219512, \"cost\": 0.06211643600000001, \"time\": 171.59594984054564}, \"99569e3937\": {\"quality\": 0.6777932636469222, \"cost\": 0.07203462400000002, \"time\": 259.5382764339447}, \"99ea16a9a6\": {\"quality\": 0.815528455284553, \"cost\": 0.034362648, \"time\": 175.2437418460846}, \"9a5b39370f\": {\"quality\": 0.6195121951219512, \"cost\": 0.068652448, \"time\": 237.78598909378053}, \"9aa4abfb50\": {\"quality\": 0.6195121951219512, \"cost\": 0.005156892000000001, \"time\": 114.41352844238281}, \"9ad7a98c31\": {\"quality\": 0.7987108013937282, \"cost\": 0.07103688, \"time\": 182.73348231315612}, \"9b3fb79bcb\": {\"quality\": 0.7849477351916376, \"cost\": 0.028800140000000002, \"time\": 167.58384928703308}, \"9b6d4915f3\": {\"quality\": 0.6195121951219512, \"cost\": 0.0058737600000000004, \"time\": 96.19086871147155}, \"9bae5bafc1\": {\"quality\": 0.4828222996515679, \"cost\": 0.020766276000000004, \"time\": 144.2882854938507}, \"9be8a5f317\": {\"quality\": 0.6097560975609756, \"cost\": 0.070895624, \"time\": 241.7383065700531}, \"9c549db0a7\": {\"quality\": 0.6195121951219512, \"cost\": 0.021638197999999997, \"time\": 138.09213070869447}, \"9c85f8cfcb\": {\"quality\": 0.6195121951219512, \"cost\": 0.004881552, \"time\": 85.23935956954956}, \"9c8cc46e6c\": {\"quality\": 0.6195121951219512, \"cost\": 0.004920108, \"time\": 113.78448638916015}, \"9c97d35a30\": 
{\"quality\": 0.8032520325203251, \"cost\": 0.011073209999999998, \"time\": 181.72330527305604}, \"9cbe7858a2\": {\"quality\": 0.6195121951219512, \"cost\": 0.06554662800000002, \"time\": 227.9875654697418}, \"9ce2c3fd98\": {\"quality\": 0.6195121951219512, \"cost\": 0.006856308, \"time\": 155.26185359954835}, \"9d18cd0737\": {\"quality\": 0.4644018583042973, \"cost\": 0.006854579999999999, \"time\": 124.20924863815307}, \"9e06360bc9\": {\"quality\": 0.46796747967479674, \"cost\": 0.011727056, \"time\": 160.3625663280487}, \"9f07a95e69\": {\"quality\": 0.5894308943089431, \"cost\": 0.073255632, \"time\": 234.6284239768982}, \"9fb157be35\": {\"quality\": 0.6195121951219512, \"cost\": 0.012696056, \"time\": 159.5446131706238}, \"a04ac8e33a\": {\"quality\": 0.6195121951219512, \"cost\": 0.06885804400000001, \"time\": 157.56720504760742}, \"a04bc6e116\": {\"quality\": 0.5231707317073171, \"cost\": 0.06362432400000001, \"time\": 166.92331409454346}, \"a0b81be5b4\": {\"quality\": 0.3871660859465737, \"cost\": 0.00347016, \"time\": 123.85612168312073}, \"a0c85d260e\": {\"quality\": 0.69602787456446, \"cost\": 0.024224592, \"time\": 235.78984532356262}, \"a0dc9f50ac\": {\"quality\": 0.18065040650406505, \"cost\": 0.007600488, \"time\": 130.9164544582367}, \"a18225b7b5\": {\"quality\": 0.6352032520325204, \"cost\": 0.050534568, \"time\": 152.35800580978395}, \"a1d822289e\": {\"quality\": 0.4399303135888502, \"cost\": 0.08478274000000001, \"time\": 163.93071274757386}, \"a25596c056\": {\"quality\": 0.6195121951219512, \"cost\": 0.067823316, \"time\": 216.80869884490966}, \"a2811c7324\": {\"quality\": 0.6701277584204414, \"cost\": 0.028778920000000003, \"time\": 168.49099044799806}, \"a2aa082d14\": {\"quality\": 0.6195121951219512, \"cost\": 0.022896492, \"time\": 119.0820493221283}, \"a2cd339ad9\": {\"quality\": 0.6406504065040651, \"cost\": 0.018992632, \"time\": 243.06863555908203}, \"a2fd03e6a5\": {\"quality\": 0.47276422764227644, \"cost\": 0.07033531999999999, \"time\": 
1132.2907946109772}, \"a31e87d7cb\": {\"quality\": 0.6329268292682927, \"cost\": 0.00273726, \"time\": 105.99368834495544}, \"a3e23c327b\": {\"quality\": 0.5317073170731708, \"cost\": 0.068050688, \"time\": 204.56698241233826}, \"a457f6c300\": {\"quality\": 0.6195121951219512, \"cost\": 0.012840227999999999, \"time\": 168.6292993545532}, \"a47de025c8\": {\"quality\": 0.6195121951219512, \"cost\": 0.007961357999999998, \"time\": 138.18772959709167}, \"a515a9c8cc\": {\"quality\": 0.6253658536585366, \"cost\": 0.005416404, \"time\": 126.60180377960205}, \"a5949b76ec\": {\"quality\": 0.6195121951219512, \"cost\": 0.003754764, \"time\": 81.27935070991516}, \"a60dd076b8\": {\"quality\": 0.405609756097561, \"cost\": 0.0034407720000000004, \"time\": 118.2785505771637}, \"a6297a6c56\": {\"quality\": 0.6195121951219512, \"cost\": 0.005110776, \"time\": 118.47536420822144}, \"a62b7555b9\": {\"quality\": 0.4455284552845528, \"cost\": 0.07002202800000001, \"time\": 229.59565677642823}, \"a63f48e8ca\": {\"quality\": 0.6195121951219512, \"cost\": 0.069574668, \"time\": 132.71091380119324}, \"a6460dbb7c\": {\"quality\": 0.7212078977932637, \"cost\": 0.009528016, \"time\": 130.55586276054382}, \"a66a4cf4b0\": {\"quality\": 0.567479674796748, \"cost\": 0.07641536000000002, \"time\": 202.29455227851867}, \"a6e2d69222\": {\"quality\": 0.4239024390243902, \"cost\": 0.045957064, \"time\": 178.1903151988983}, \"a717c4c535\": {\"quality\": 0.6195121951219512, \"cost\": 0.019736044, \"time\": 234.4358515739441}, \"a76afe9960\": {\"quality\": 0.5696051103368177, \"cost\": 0.060384544000000005, \"time\": 172.19008646011352}, \"a7a6353090\": {\"quality\": 0.4487224157955866, \"cost\": 0.066870862, \"time\": 166.1868350982666}, \"a80f6535b1\": {\"quality\": 0.5308130081300814, \"cost\": 0.020093776, \"time\": 97.89270992279053}, \"a86b137d7f\": {\"quality\": 0.6815447154471544, \"cost\": 0.006451132, \"time\": 93.94514923095703}, \"a88eb1493c\": {\"quality\": 0.41745644599303144, \"cost\": 
0.0066783, \"time\": 116.80834813117981}, \"a89c533d6c\": {\"quality\": 0.6195121951219512, \"cost\": 0.06598649999999999, \"time\": 83.00586166381837}, \"a8d8264600\": {\"quality\": 0.4808943089430895, \"cost\": 0.071912272, \"time\": 251.74223246574402}, \"a8eb36b210\": {\"quality\": 0.4832520325203252, \"cost\": 0.03806308, \"time\": 194.64342403411865}, \"a95b4a6dd0\": {\"quality\": 0.6195121951219512, \"cost\": 0.012172187999999999, \"time\": 172.00175199508666}, \"a9621ea4e6\": {\"quality\": 0.47317073170731705, \"cost\": 0.010803696, \"time\": 194.9391739845276}, \"a9721a0a50\": {\"quality\": 0.6624390243902439, \"cost\": 0.009840604, \"time\": 122.32440161705017}, \"a972b02c61\": {\"quality\": 0.7189430894308944, \"cost\": 0.03160314, \"time\": 67.33741846084595}, \"a9c5c4e311\": {\"quality\": 0.7951335656213705, \"cost\": 0.028237476, \"time\": 182.70986766815184}, \"a9d96670eb\": {\"quality\": 0.6195121951219512, \"cost\": 0.05767817200000001, \"time\": 169.8078365802765}, \"a9e8c974d3\": {\"quality\": 0.6195121951219512, \"cost\": 0.06961316799999999, \"time\": 225.65897822380066}, \"aa08180e36\": {\"quality\": 0.6146341463414634, \"cost\": 0.014778952000000001, \"time\": 221.92620844841002}, \"aa38702a02\": {\"quality\": 0.6146341463414634, \"cost\": 0.006915852, \"time\": 164.99627866744996}, \"aadbfc418b\": {\"quality\": 0.45479674796747965, \"cost\": 0.00576612, \"time\": 161.90984616279601}, \"aaeb8b0010\": {\"quality\": 0.6195121951219512, \"cost\": 0.06474566000000001, \"time\": 209.02622961997986}, \"ab288ee7f2\": {\"quality\": 0.5373983739837398, \"cost\": 0.06573288000000001, \"time\": 226.1878888130188}, \"ab43b02cb0\": {\"quality\": 0.6195121951219512, \"cost\": 0.01377632, \"time\": 126.55581045150757}, \"aba1d612cc\": {\"quality\": 0.5083391405342625, \"cost\": 0.09728845600000001, \"time\": 219.89660325050355}, \"ac208e7a1d\": {\"quality\": 0.4333333333333334, \"cost\": 0.01616966, \"time\": 142.93958106040955}, \"ac2224adbe\": 
{\"quality\": 0.6195121951219512, \"cost\": 0.039715708, \"time\": 125.78687748908996}, \"ac828ffe70\": {\"quality\": 0.5008130081300812, \"cost\": 0.003011208, \"time\": 454.9158252716064}, \"ac9fdc1550\": {\"quality\": 0.5360162601626015, \"cost\": 0.026754198, \"time\": 173.65110726356505}, \"aca957ecff\": {\"quality\": 0.8072590011614402, \"cost\": 0.08830083200000001, \"time\": 230.63441123962403}, \"acfe1ed920\": {\"quality\": 0.35609756097560974, \"cost\": 0.02273798, \"time\": 88.96396307945251}, \"ad3efe44c3\": {\"quality\": 0.6195121951219512, \"cost\": 0.004642056, \"time\": 92.58378911018372}, \"ad41c95a99\": {\"quality\": 0.3482113821138212, \"cost\": 0.045380508, \"time\": 154.1578179359436}, \"ad48432c22\": {\"quality\": 0.6569105691056911, \"cost\": 0.015225708, \"time\": 247.30540533065795}, \"ad6ebbba8d\": {\"quality\": 0.6195121951219512, \"cost\": 0.012972292, \"time\": 209.87320923805237}, \"ad90055ef6\": {\"quality\": 0.6195121951219512, \"cost\": 0.00457182, \"time\": 93.97476525306702}, \"adab1e0fb1\": {\"quality\": 0.46829268292682935, \"cost\": 0.008422604000000002, \"time\": 115.4403871536255}, \"ae655ec593\": {\"quality\": 0.35599303135888505, \"cost\": 0.006649944, \"time\": 167.62679462432862}, \"ae94b172be\": {\"quality\": 0.5552613240418119, \"cost\": 0.013280692, \"time\": 162.66643962860107}, \"aec9dc5873\": {\"quality\": 0.6195121951219512, \"cost\": 0.055165792000000005, \"time\": 132.68569073677062}, \"af360c323c\": {\"quality\": 0.6195121951219512, \"cost\": 0.006864167999999999, \"time\": 195.27329888343812}, \"af90567194\": {\"quality\": 0.7732404181184669, \"cost\": 0.02378152, \"time\": 264.66726565361023}, \"afe77d0f89\": {\"quality\": 0.768513356562137, \"cost\": 0.069933622, \"time\": 251.59823207855226}, \"b0948c05b6\": {\"quality\": 0.6380487804878049, \"cost\": 0.017833904, \"time\": 149.21853432655334}, \"b0c4a6640b\": {\"quality\": 0.4317653890824623, \"cost\": 0.063453446, \"time\": 221.18421225547792}, 
\"b12caafd58\": {\"quality\": 0.46829268292682935, \"cost\": 0.06446362400000001, \"time\": 217.29487829208375}, \"b18168b9c1\": {\"quality\": 0.3608130081300814, \"cost\": 0.010323167999999999, \"time\": 189.31807408332824}, \"b1a7428a01\": {\"quality\": 0.21260162601626015, \"cost\": 0.06952381600000002, \"time\": 206.16540212631224}, \"b1acdebb48\": {\"quality\": 0.6821138211382114, \"cost\": 0.102856868, \"time\": 183.48106842041017}, \"b1b06f4ee7\": {\"quality\": 0.6195121951219512, \"cost\": 0.12662600000000002, \"time\": 142.81967635154723}, \"b1cf8d33e5\": {\"quality\": 0.3186178861788618, \"cost\": 0.011143266, \"time\": 175.05015845298766}, \"b1e9ab6b1a\": {\"quality\": 0.5263414634146341, \"cost\": 0.07167981200000001, \"time\": 206.59926490783693}, \"b2b057ba41\": {\"quality\": 0.4799186991869918, \"cost\": 0.0052038779999999995, \"time\": 142.83305773735046}, \"b2e063499d\": {\"quality\": 0.6911498257839722, \"cost\": 0.018540096, \"time\": 176.64938292503356}, \"b33412410e\": {\"quality\": 0.6195121951219512, \"cost\": 0.03299608, \"time\": 125.82604823112487}, \"b3369775dc\": {\"quality\": 0.49406504065040646, \"cost\": 0.017937791999999998, \"time\": 202.59222974777222}, \"b3b9205f60\": {\"quality\": 0.20133565621370497, \"cost\": 0.03923410000000001, \"time\": 184.9780117034912}, \"b3c56f0b3c\": {\"quality\": 0.594959349593496, \"cost\": 0.06256806000000001, \"time\": 168.73735642433167}, \"b3f20b706d\": {\"quality\": 0.6195121951219512, \"cost\": 0.0037520640000000003, \"time\": 106.53766713142394}, \"b4002173ee\": {\"quality\": 0.5188617886178861, \"cost\": 0.007600319999999999, \"time\": 109.14190998077393}, \"b46d382384\": {\"quality\": 0.7555284552845528, \"cost\": 0.12484462, \"time\": 146.6869523525238}, \"b4b2482ef9\": {\"quality\": 0.4543554006968641, \"cost\": 0.06673528000000001, \"time\": 163.60684385299683}, \"b4be043238\": {\"quality\": 0.3460394889663182, \"cost\": 0.08778872000000001, \"time\": 150.07519998550416}, \"b531bd0548\": 
{\"quality\": 0.3177584204413472, \"cost\": 0.08969832, \"time\": 202.44645614624022}, \"b56c312eda\": {\"quality\": 0.5277351916376307, \"cost\": 0.016686852, \"time\": 249.4414801120758}, \"b5e2b41c1c\": {\"quality\": 0.551684088269454, \"cost\": 0.009034968, \"time\": 169.00915660858155}, \"b61ce57a90\": {\"quality\": 0.36904761904761907, \"cost\": 0.039441384, \"time\": 192.418292427063}, \"b64ddb14f9\": {\"quality\": 0.3847967479674797, \"cost\": 0.02944528, \"time\": 824.0557322502136}, \"b67107a43e\": {\"quality\": 0.6195121951219512, \"cost\": 0.01083792, \"time\": 248.4702594280243}, \"b67720aa5c\": {\"quality\": 0.6195121951219512, \"cost\": 0.05931710800000001, \"time\": 146.00394463539124}, \"b682a23b89\": {\"quality\": 0.7337746806039489, \"cost\": 0.013234963999999998, \"time\": 213.58269662857055}, \"b69ef5add4\": {\"quality\": 0.6584204413472706, \"cost\": 0.07431707200000001, \"time\": 161.14033164978028}, \"b796b7ffd3\": {\"quality\": 0.3386178861788618, \"cost\": 0.014401015999999999, \"time\": 259.1940643310547}, \"b7a0083dc4\": {\"quality\": 0.6195121951219512, \"cost\": 0.061847876, \"time\": 214.91841106414796}, \"b7d0e8557f\": {\"quality\": 0.5686411149825784, \"cost\": 0.012575807999999999, \"time\": 247.72326970100403}, \"b8317a3a8c\": {\"quality\": 0.7867131242740998, \"cost\": 0.06423970400000001, \"time\": 181.56251420974732}, \"b8ab3d2f25\": {\"quality\": 0.5626016260162602, \"cost\": 0.015544804, \"time\": 184.08619875907897}, \"b8b569172f\": {\"quality\": 0.5490243902439025, \"cost\": 0.015828372, \"time\": 178.33178467750548}, \"b8f5ab44bb\": {\"quality\": 0.634959349593496, \"cost\": 0.07826424000000001, \"time\": 262.68980021476744}, \"b91e7fdb29\": {\"quality\": 0.6195121951219512, \"cost\": 0.012287124, \"time\": 227.29503026008607}, \"b932beaaa6\": {\"quality\": 0.544959349593496, \"cost\": 0.06720848000000001, \"time\": 186.7761948108673}, \"b9770c2261\": {\"quality\": 0.5645644599303136, \"cost\": 0.00541122, \"time\": 
150.7996078491211}, \"b9bb1e6f8d\": {\"quality\": 0.6998373983739838, \"cost\": 0.018532544, \"time\": 244.14531807899476}, \"b9d0e8740c\": {\"quality\": 0.6134146341463415, \"cost\": 0.07111140800000001, \"time\": 214.80163626670839}, \"b9da208432\": {\"quality\": 0.641869918699187, \"cost\": 0.019034844000000002, \"time\": 232.56544876098633}, \"ba3223f6ac\": {\"quality\": 0.6195121951219512, \"cost\": 0.005706648, \"time\": 151.19898543357849}, \"bb13365175\": {\"quality\": 0.7763995354239257, \"cost\": 0.042266292000000004, \"time\": 224.2499403476715}, \"bb1b3a4d29\": {\"quality\": 0.8029384436701509, \"cost\": 0.064898518, \"time\": 256.5067723274231}, \"bb6536b0ab\": {\"quality\": 0.6195121951219512, \"cost\": 0.013670112000000002, \"time\": 159.04916138648986}, \"bbba9dd6ae\": {\"quality\": 0.5152032520325204, \"cost\": 0.008736126, \"time\": 125.51340088844299}, \"bbde69a1ae\": {\"quality\": 0.47317073170731705, \"cost\": 0.07075216, \"time\": 211.4701558113098}, \"bc29a0c0fe\": {\"quality\": 0.6923809523809524, \"cost\": 0.08504816000000001, \"time\": 152.53999376296997}, \"bc3d02f753\": {\"quality\": 0.6195121951219512, \"cost\": 0.0033908220000000004, \"time\": 108.17700786590576}, \"bc4c1fcc64\": {\"quality\": 0.25837398373983744, \"cost\": 0.008965720000000002, \"time\": 110.68981971740723}, \"bd30d27f62\": {\"quality\": 0.8004529616724738, \"cost\": 0.13683402800000002, \"time\": 179.2394115447998}, \"bd99b2fb21\": {\"quality\": 0.6292682926829268, \"cost\": 0.013529848, \"time\": 165.33139224052428}, \"bddc7d2a34\": {\"quality\": 0.7716144018583042, \"cost\": 0.06608894800000001, \"time\": 242.8191273212433}, \"be2ae88f70\": {\"quality\": 0.6195121951219512, \"cost\": 0.0070768079999999995, \"time\": 175.22385454177856}, \"be4740f38f\": {\"quality\": 0.5178048780487805, \"cost\": 0.018557584000000002, \"time\": 177.69371061325074}, \"bec0c6a95f\": {\"quality\": 0.31897793263646923, \"cost\": 0.08918554800000002, \"time\": 219.01660284996032}, 
\"bed888d4dc\": {\"quality\": 0.6195121951219512, \"cost\": 0.012997296, \"time\": 392.78799629211426}, \"bf45e407f6\": {\"quality\": 0.7567479674796748, \"cost\": 0.085539992, \"time\": 158.9132725715637}, \"bf5550f320\": {\"quality\": 0.6613821138211382, \"cost\": 0.070068868, \"time\": 233.04684481620788}, \"bf87e58322\": {\"quality\": 0.6940650406504065, \"cost\": 0.06185992, \"time\": 188.9820168018341}, \"bfed7670ed\": {\"quality\": 0.645609756097561, \"cost\": 0.014454772, \"time\": 144.87740097045898}, \"c0541e2220\": {\"quality\": 0.6195121951219512, \"cost\": 0.06049530400000001, \"time\": 134.89153051376343}, \"c0e10c0048\": {\"quality\": 0.6195121951219512, \"cost\": 0.06394020800000001, \"time\": 253.89145169258117}, \"c127509a7a\": {\"quality\": 0.5921951219512195, \"cost\": 0.02507367, \"time\": 171.8149597644806}, \"c13682c7c7\": {\"quality\": 0.6195121951219512, \"cost\": 0.016085024, \"time\": 252.46734075546266}, \"c13d6e78e9\": {\"quality\": 0.4615098722415795, \"cost\": 0.026846708000000004, \"time\": 158.4395161151886}, \"c14ff3144d\": {\"quality\": 0.6380487804878049, \"cost\": 0.002752848, \"time\": 124.53351817131042}, \"c1e42ac47b\": {\"quality\": 0.5833333333333334, \"cost\": 0.07545873600000001, \"time\": 234.8793641090393}, \"c2949aa902\": {\"quality\": 0.6728455284552846, \"cost\": 0.00916251, \"time\": 212.7397180557251}, \"c31e956b35\": {\"quality\": 0.6621718931475028, \"cost\": 0.014615412, \"time\": 238.76983637809752}, \"c36b525dde\": {\"quality\": 0.781869918699187, \"cost\": 0.00692364, \"time\": 160.97053718566895}, \"c38326e2bd\": {\"quality\": 0.4835772357723577, \"cost\": 0.07338262, \"time\": 231.5795979499817}, \"c3ec2cec59\": {\"quality\": 0.6329268292682928, \"cost\": 0.017786080000000003, \"time\": 169.43426485061644}, \"c44720575f\": {\"quality\": 0.5554355400696864, \"cost\": 0.035053448, \"time\": 162.69704794883728}, \"c48ecefab6\": {\"quality\": 0.6195121951219512, \"cost\": 0.06189563200000001, \"time\": 
231.7577772140503}, \"c4a64eb40f\": {\"quality\": 0.5757375145180024, \"cost\": 0.087534268, \"time\": 171.7744441986084}, \"c4a80d19b3\": {\"quality\": 0.667862950058072, \"cost\": 0.017699555999999998, \"time\": 262.5382396697998}, \"c4c2826afd\": {\"quality\": 0.6195121951219512, \"cost\": 0.014996676, \"time\": 243.1779758453369}, \"c4c94a5527\": {\"quality\": 0.6195121951219512, \"cost\": 0.009552324000000001, \"time\": 171.8789219379425}, \"c4e75ee9ba\": {\"quality\": 0.6092334494773519, \"cost\": 0.14632424400000002, \"time\": 163.68729362487792}, \"c4f3e7665d\": {\"quality\": 0.6195121951219512, \"cost\": 0.006700248000000001, \"time\": 171.2033597946167}, \"c5471bef57\": {\"quality\": 0.6195121951219512, \"cost\": 0.06709224, \"time\": 360.6734181404114}, \"c54a408db7\": {\"quality\": 0.4271544715447154, \"cost\": 0.061197178000000005, \"time\": 205.84643664360047}, \"c59cc41335\": {\"quality\": 0.6195121951219512, \"cost\": 0.060401755999999994, \"time\": 227.63298444747926}, \"c5a0b065e0\": {\"quality\": 0.6585365853658537, \"cost\": 0.020608076000000003, \"time\": 100.40361194610595}, \"c5a16b834a\": {\"quality\": 0.6756097560975609, \"cost\": 0.020331156, \"time\": 377.4331964969635}, \"c5fbe2076f\": {\"quality\": 0.2873170731707317, \"cost\": 0.024696848, \"time\": 235.4668863296509}, \"c617370f6b\": {\"quality\": 0.6195121951219512, \"cost\": 0.05902978000000001, \"time\": 201.3127824783325}, \"c67f782c7f\": {\"quality\": 0.6860975609756098, \"cost\": 0.064923248, \"time\": 186.35708870887757}, \"c691a29c42\": {\"quality\": 0.5146341463414634, \"cost\": 0.02263088, \"time\": 342.815619802475}, \"c6a339987c\": {\"quality\": 0.6195121951219512, \"cost\": 0.014977467999999999, \"time\": 256.19790320396424}, \"c772ff3704\": {\"quality\": 0.5390243902439025, \"cost\": 0.012446268, \"time\": 256.34659972190855}, \"c7e3f348c2\": {\"quality\": 0.7056910569105692, \"cost\": 0.07413946400000002, \"time\": 226.76761050224303}, \"c823589ab6\": {\"quality\": 
0.6195121951219512, \"cost\": 0.060011588000000005, \"time\": 169.34995718002318}, \"c82f834e85\": {\"quality\": 0.6195121951219512, \"cost\": 0.013014976000000001, \"time\": 360.72568283081057}, \"c85099881f\": {\"quality\": 0.6214634146341463, \"cost\": 0.019014336, \"time\": 91.83121838569642}, \"c935a33384\": {\"quality\": 0.47804878048780486, \"cost\": 0.021866696, \"time\": 201.4007071018219}, \"ca3177461f\": {\"quality\": 0.6390243902439025, \"cost\": 0.064102784, \"time\": 260.361168384552}, \"caa7c0bd6b\": {\"quality\": 0.614308943089431, \"cost\": 0.013427039999999998, \"time\": 201.50019497871398}, \"cac6b051e9\": {\"quality\": 0.6777119628339141, \"cost\": 0.012200891999999998, \"time\": 235.3216923236847}, \"cacb342f64\": {\"quality\": 0.3350406504065041, \"cost\": 0.09074681200000001, \"time\": 199.62914476394653}, \"cb9948679c\": {\"quality\": 0.6195121951219512, \"cost\": 0.060963451999999994, \"time\": 209.66846399307252}, \"cbb5eb0e74\": {\"quality\": 0.36753774680603957, \"cost\": 0.012385512, \"time\": 269.31641936302185}, \"cbc32cbeff\": {\"quality\": 0.09656213704994195, \"cost\": 0.012335669999999998, \"time\": 223.05522747039794}, \"cbd4461293\": {\"quality\": 0.4478861788617886, \"cost\": 0.037316864, \"time\": 1704.694543504715}, \"cbe2318045\": {\"quality\": 0.35097560975609754, \"cost\": 0.011905024, \"time\": 143.2994128704071}, \"cc886fe337\": {\"quality\": 0.6195121951219512, \"cost\": 0.008570712000000001, \"time\": 189.0461087703705}, \"cc9a6248a0\": {\"quality\": 0.8031823461091754, \"cost\": 0.06325050000000002, \"time\": 65.81817102432251}, \"ccb2335b3f\": {\"quality\": 0.30666666666666675, \"cost\": 0.066565, \"time\": 270.24665660858153}, \"ccdf03a55b\": {\"quality\": 0.554239256678281, \"cost\": 0.066687724, \"time\": 194.36894397735597}, \"ccf72745c1\": {\"quality\": 0.5723577235772358, \"cost\": 0.015334892, \"time\": 180.34684844017028}, \"cd1d418732\": {\"quality\": 0.6195121951219512, \"cost\": 0.007930735999999999, 
\"time\": 90.1477038860321}, \"cd23c79db1\": {\"quality\": 0.6195121951219512, \"cost\": 0.013173464000000001, \"time\": 219.91192269325256}, \"cd64fbfcd9\": {\"quality\": 0.6552845528455286, \"cost\": 0.010721148, \"time\": 215.86649370193481}, \"cd85a01e81\": {\"quality\": 0.6195121951219512, \"cost\": 0.066680636, \"time\": 309.61991782188414}, \"ce4bc5f348\": {\"quality\": 0.4926829268292683, \"cost\": 0.004973436, \"time\": 122.72588214874267}, \"ce980cf86f\": {\"quality\": 0.3678281068524971, \"cost\": 0.006999414, \"time\": 186.7628839492798}, \"ceae8b8bb9\": {\"quality\": 0.569965156794425, \"cost\": 0.040919136, \"time\": 278.74190187454224}, \"cecca90dd2\": {\"quality\": 0.7816376306620209, \"cost\": 0.009244559999999999, \"time\": 204.2295413017273}, \"cf9538faf0\": {\"quality\": 0.5823577235772358, \"cost\": 0.013121356, \"time\": 168.41699748039247}, \"cf9d2e224c\": {\"quality\": 0.5317073170731708, \"cost\": 0.008401064, \"time\": 129.19066014289857}, \"cfd36f3a8c\": {\"quality\": 0.5922764227642275, \"cost\": 0.026429034, \"time\": 197.97676906585693}, \"cffa29a6ef\": {\"quality\": 0.6403252032520326, \"cost\": 0.0011394740000000001, \"time\": 84.72708497047424}, \"d03596c3de\": {\"quality\": 0.6584552845528455, \"cost\": 0.017459960000000004, \"time\": 243.87012338638306}, \"d07b766487\": {\"quality\": 0.5414634146341464, \"cost\": 0.06639410400000001, \"time\": 248.754083442688}, \"d0a0a66d75\": {\"quality\": 0.4197560975609756, \"cost\": 0.037255556, \"time\": 155.2362714290619}, \"d0ce31134c\": {\"quality\": 0.6195121951219512, \"cost\": 0.07035569600000001, \"time\": 244.35629963874817}, \"d0f9633442\": {\"quality\": 0.6195121951219512, \"cost\": 0.010673088, \"time\": 174.68734726905822}, \"d216eab7d8\": {\"quality\": 0.3921138211382114, \"cost\": 0.015362728, \"time\": 177.72364864349365}, \"d266c19ac8\": {\"quality\": 0.5590592334494773, \"cost\": 0.010259069999999999, \"time\": 249.80076293945314}, \"d26a70179a\": {\"quality\": 
0.6736701509872242, \"cost\": 0.072242652, \"time\": 281.9195426940918}, \"d2af24b59e\": {\"quality\": 0.37926829268292683, \"cost\": 0.012534632, \"time\": 162.15444717407226}, \"d2f2dd5cd4\": {\"quality\": 0.6195121951219512, \"cost\": 0.033032736, \"time\": 161.3020932674408}, \"d302278f85\": {\"quality\": 0.6195121951219512, \"cost\": 0.065508068, \"time\": 281.0260838031769}, \"d37dcaea30\": {\"quality\": 0.5957026713124275, \"cost\": 0.042292012000000004, \"time\": 254.16097497940063}, \"d3a2d50bd7\": {\"quality\": 0.5393263646922184, \"cost\": 0.008568708, \"time\": 180.23848376274108}, \"d3d4185487\": {\"quality\": 0.6195121951219512, \"cost\": 0.063731828, \"time\": 186.93449816703796}, \"d3db4cf84d\": {\"quality\": 0.6195121951219512, \"cost\": 0.007814196, \"time\": 126.7787591457367}, \"d402233b53\": {\"quality\": 0.7804761904761903, \"cost\": 0.04559879200000001, \"time\": 268.7110698223114}, \"d43fafa19e\": {\"quality\": 0.6195121951219512, \"cost\": 0.006752928, \"time\": 137.8223207473755}, \"d446a75eb7\": {\"quality\": 0.508432055749129, \"cost\": 0.07295704800000001, \"time\": 211.27834362983702}, \"d48ead13da\": {\"quality\": 0.6195121951219512, \"cost\": 0.010855976, \"time\": 194.97514882087708}, \"d5016f4538\": {\"quality\": 0.6195121951219512, \"cost\": 0.006847740000000001, \"time\": 154.1272620677948}, \"d55a3613b0\": {\"quality\": 0.7033565621370499, \"cost\": 0.09393361200000001, \"time\": 280.48901586532594}, \"d58036ba66\": {\"quality\": 0.6341463414634146, \"cost\": 0.012329896000000002, \"time\": 182.65342464447022}, \"d5a84c782e\": {\"quality\": 0.7512311265969802, \"cost\": 0.06159429000000001, \"time\": 238.61238617897033}, \"d5b2eef11c\": {\"quality\": 0.6305458768873403, \"cost\": 0.059769270000000006, \"time\": 148.21854801177977}, \"d6040140b9\": {\"quality\": 0.6772357723577236, \"cost\": 0.025150488000000006, \"time\": 293.03985538482664}, \"d65185c1a4\": {\"quality\": 0.5714750290360046, \"cost\": 0.011127215999999999, 
\"time\": 211.66472873687744}, \"d667351f33\": {\"quality\": 0.6195121951219512, \"cost\": 0.07187473600000001, \"time\": 284.58656883239746}, \"d6bd3b66ba\": {\"quality\": 0.6195121951219512, \"cost\": 0.0062973000000000005, \"time\": 179.81442737579346}, \"d6c4e48eeb\": {\"quality\": 0.3821138211382114, \"cost\": 0.013097992, \"time\": 185.88441767692566}, \"d6cbf265ee\": {\"quality\": 0.553170731707317, \"cost\": 0.027157003999999995, \"time\": 2382.298054933548}, \"d705447fd7\": {\"quality\": 0.6195121951219512, \"cost\": 0.06927182000000001, \"time\": 274.04458661079406}, \"d73a9aab4e\": {\"quality\": 0.6195121951219512, \"cost\": 0.00601782, \"time\": 119.8721248626709}, \"d752c30d07\": {\"quality\": 0.6491869918699187, \"cost\": 0.029268951999999997, \"time\": 187.99695973396302}, \"d782682359\": {\"quality\": 0.48292682926829267, \"cost\": 0.070581532, \"time\": 240.62954745292663}, \"d7c0972014\": {\"quality\": 0.46747967479674796, \"cost\": 0.014235999999999999, \"time\": 146.19986548423768}, \"d867525748\": {\"quality\": 0.2914634146341463, \"cost\": 0.031817048, \"time\": 176.6470585823059}, \"d87eb775da\": {\"quality\": 0.5716260162601626, \"cost\": 0.011791225999999998, \"time\": 171.5166805744171}, \"d8bab6c09b\": {\"quality\": 0.5560975609756098, \"cost\": 0.0201636, \"time\": 262.4358974933624}, \"d8bcac36e8\": {\"quality\": 0.6195121951219512, \"cost\": 0.014129348, \"time\": 178.82634310722352}, \"d8eadc0190\": {\"quality\": 0.6644715447154471, \"cost\": 0.08632676000000002, \"time\": 188.2232988357544}, \"d96677d8d4\": {\"quality\": 0.7195005807200929, \"cost\": 0.010380108, \"time\": 202.3421525001526}, \"d98f22270e\": {\"quality\": 0.46382113821138216, \"cost\": 0.06577788, \"time\": 195.36387667655944}, \"d9e2bb21a3\": {\"quality\": 0.5120789779326365, \"cost\": 0.011266182, \"time\": 204.83109889030456}, \"da95deeb20\": {\"quality\": 0.3967479674796748, \"cost\": 0.05949913, \"time\": 146.4172206401825}, \"daaadadcc9\": {\"quality\": 
0.6085365853658538, \"cost\": 0.014349220000000001, \"time\": 236.59604306221007}, \"daf855e065\": {\"quality\": 0.7283739837398373, \"cost\": 0.01084773, \"time\": 254.38643417358398}, \"db00594832\": {\"quality\": 0.7669570267131242, \"cost\": 0.082393492, \"time\": 131.31271381378173}, \"db19e677c4\": {\"quality\": 0.816341463414634, \"cost\": 0.08040591200000001, \"time\": 186.9021366119385}, \"db3c035639\": {\"quality\": 0.6329268292682927, \"cost\": 0.07513583600000001, \"time\": 251.00368614196776}, \"db41487005\": {\"quality\": 0.6195121951219512, \"cost\": 0.05802746000000001, \"time\": 196.43507223129274}, \"db6a7482fd\": {\"quality\": 0.6803716608594658, \"cost\": 0.056941420000000006, \"time\": 138.47659482955933}, \"db9060cd27\": {\"quality\": 0.42195121951219516, \"cost\": 0.027881287999999997, \"time\": 1896.3412520885468}, \"dbce95a072\": {\"quality\": 0.5060975609756098, \"cost\": 0.012663032000000001, \"time\": 179.2968252182007}, \"dc195abe5e\": {\"quality\": 0.31232288037166084, \"cost\": 0.062047202, \"time\": 256.8396565437317}, \"dc3f4b7138\": {\"quality\": 0.6195121951219512, \"cost\": 0.05797826800000001, \"time\": 188.94316940307618}, \"dc66bccb1c\": {\"quality\": 0.3678048780487805, \"cost\": 0.01096708, \"time\": 168.7502061367035}, \"dc90065dea\": {\"quality\": 0.36016260162601627, \"cost\": 0.014289246, \"time\": 199.27637152671815}, \"dd0d70fedd\": {\"quality\": 0.7897677119628339, \"cost\": 0.005333369999999999, \"time\": 160.19438481330872}, \"dddc76b3ca\": {\"quality\": 0.32487804878048776, \"cost\": 0.06413708, \"time\": 203.57222208976745}, \"de18bf45e1\": {\"quality\": 0.5365040650406504, \"cost\": 0.009448855999999999, \"time\": 966.3698208808898}, \"de1e56370f\": {\"quality\": 0.6371544715447155, \"cost\": 0.007415147999999999, \"time\": 141.48761720657347}, \"df2160ecc8\": {\"quality\": 0.4975609756097561, \"cost\": 0.06514112000000001, \"time\": 184.16321535110472}, \"dfda94bd2a\": {\"quality\": 0.6195121951219512, \"cost\": 
0.0029004839999999996, \"time\": 111.35115175247192}, \"dff452a9ca\": {\"quality\": 0.6195121951219512, \"cost\": 0.00818748, \"time\": 128.0037736415863}, \"e09f75d9d6\": {\"quality\": 0.5954123112659697, \"cost\": 0.07994701400000001, \"time\": 174.90588331222534}, \"e0b6a99753\": {\"quality\": 0.6808943089430894, \"cost\": 0.022907588, \"time\": 75.86744494438172}, \"e0cf6587a7\": {\"quality\": 0.38739837398373983, \"cost\": 0.06783974000000001, \"time\": 224.5930316925049}, \"e0de4a5929\": {\"quality\": 0.28844367015098726, \"cost\": 0.070857196, \"time\": 237.39370694160462}, \"e1356fb426\": {\"quality\": 0.563054587688734, \"cost\": 0.06065425, \"time\": 185.1619038105011}, \"e20ba014a1\": {\"quality\": 0.6341463414634146, \"cost\": 0.022406492, \"time\": 231.54046459197997}, \"e21806e3bc\": {\"quality\": 0.6686991869918699, \"cost\": 0.030775240000000002, \"time\": 153.30845246315002}, \"e24e97564c\": {\"quality\": 0.5788269454123112, \"cost\": 0.065976868, \"time\": 234.19773464202882}, \"e2673c1ec8\": {\"quality\": 0.5588617886178862, \"cost\": 0.07093568800000001, \"time\": 268.0965113162994}, \"e26c7bfbdb\": {\"quality\": 0.526829268292683, \"cost\": 0.010329528, \"time\": 162.32453393936157}, \"e2f9980b06\": {\"quality\": 0.8120905923344948, \"cost\": 0.041606056, \"time\": 251.1425283432007}, \"e3445f7632\": {\"quality\": 0.7865737514518002, \"cost\": 0.02273798, \"time\": 124.58966102600098}, \"e35e5f81a7\": {\"quality\": 0.6195121951219512, \"cost\": 0.012546798000000001, \"time\": 182.4151572227478}, \"e376ac53e7\": {\"quality\": 0.5323577235772358, \"cost\": 0.031615856, \"time\": 169.9604299068451}, \"e3d8bb56da\": {\"quality\": 0.7227758420441346, \"cost\": 0.03464479, \"time\": 173.92777223587035}, \"e3df4cf041\": {\"quality\": 0.6195121951219512, \"cost\": 0.013172260000000002, \"time\": 228.6481415748596}, \"e47dc3abca\": {\"quality\": 0.5421602787456445, \"cost\": 0.01662284, \"time\": 196.10405125617982}, \"e4b9d4fb41\": {\"quality\": 
0.6195121951219512, \"cost\": 0.013018407999999999, \"time\": 231.83416152000427}, \"e510bda989\": {\"quality\": 0.4415447154471545, \"cost\": 0.12802820799999998, \"time\": 1276.7408336162566}, \"e517cd2222\": {\"quality\": 0.3045528455284553, \"cost\": 0.020794063999999998, \"time\": 1299.5594879627229}, \"e51b01f418\": {\"quality\": 0.6195121951219512, \"cost\": 0.011492568, \"time\": 202.7894808292389}, \"e520dfae5b\": {\"quality\": 0.6439024390243903, \"cost\": 0.011328468, \"time\": 263.53743448257444}, \"e521c9b7e4\": {\"quality\": 0.7932520325203252, \"cost\": 0.06690654, \"time\": 156.73078799247742}, \"e54097ad5d\": {\"quality\": 0.5923344947735192, \"cost\": 0.062158128, \"time\": 209.91472067832947}, \"e56a16ca66\": {\"quality\": 0.4848315911730546, \"cost\": 0.004748472, \"time\": 137.46008925437928}, \"e5c4abf7ce\": {\"quality\": 0.35598141695702673, \"cost\": 0.06409228000000002, \"time\": 256.3565216064453}, \"e5d4689312\": {\"quality\": 0.6195121951219512, \"cost\": 0.06653434, \"time\": 235.4001616001129}, \"e62a7b27ae\": {\"quality\": 0.6195121951219512, \"cost\": 0.035626552, \"time\": 195.45834450721742}, \"e6a7aff3bc\": {\"quality\": 0.6195121951219512, \"cost\": 0.06874828800000002, \"time\": 295.3631275177002}, \"e736999157\": {\"quality\": 0.6195121951219512, \"cost\": 0.06437495600000001, \"time\": 288.5091665744782}, \"e7517a8ce0\": {\"quality\": 0.8125435540069686, \"cost\": 0.08348902, \"time\": 200.60669808387757}, \"e7520ca5ac\": {\"quality\": 0.36312427409988385, \"cost\": 0.06479258000000002, \"time\": 261.0504195690155}, \"e7e94ab7a5\": {\"quality\": 0.591869918699187, \"cost\": 0.019925159999999997, \"time\": 2488.357094335556}, \"e887ddf5cc\": {\"quality\": 0.48739837398373986, \"cost\": 0.09377338400000002, \"time\": 253.3760078907013}, \"e94fb5a295\": {\"quality\": 0.6195121951219512, \"cost\": 0.07366761200000001, \"time\": 223.6755922794342}, \"ea6ecc5653\": {\"quality\": 0.6166666666666667, \"cost\": 0.010067135999999999, 
\"time\": 190.92216954231262}, \"ea8bcb3ae2\": {\"quality\": 0.45772357723577234, \"cost\": 0.022977280000000003, \"time\": 212.54974527359008}, \"ebbe8b6c4f\": {\"quality\": 0.6195121951219512, \"cost\": 0.009877384, \"time\": 160.62470216751097}, \"ebdf3abff2\": {\"quality\": 0.686260162601626, \"cost\": 0.019338396, \"time\": 179.08990707397462}, \"ec55dba809\": {\"quality\": 0.4276422764227642, \"cost\": 0.068014448, \"time\": 230.82128653526306}, \"ecb5f78f37\": {\"quality\": 0.6195121951219512, \"cost\": 0.017395611999999998, \"time\": 249.2554892539978}, \"ecda3d74cb\": {\"quality\": 0.6195121951219512, \"cost\": 0.061945256000000004, \"time\": 200.0551459789276}, \"ece10c0388\": {\"quality\": 0.7655168408826946, \"cost\": 0.129639792, \"time\": 172.15098700523376}, \"ed6b5480a5\": {\"quality\": 0.47195121951219515, \"cost\": 0.006612492, \"time\": 117.65485558509826}, \"eda630dc85\": {\"quality\": 0.6290127758420442, \"cost\": 0.013249908000000001, \"time\": 202.01231064796448}, \"edaaee5ed4\": {\"quality\": 0.5414634146341464, \"cost\": 0.011020607999999998, \"time\": 124.82250752449036}, \"edb2b764aa\": {\"quality\": 0.6329268292682928, \"cost\": 0.018810900000000002, \"time\": 160.87249393463134}, \"edc52339db\": {\"quality\": 0.6195121951219512, \"cost\": 0.0179609, \"time\": 246.9092752456665}, \"ede7071775\": {\"quality\": 0.7035423925667827, \"cost\": 0.13207611000000002, \"time\": 200.41918315887452}, \"ee46042c5d\": {\"quality\": 0.6195121951219512, \"cost\": 0.016286580000000002, \"time\": 244.87669949531556}, \"ee68c51f73\": {\"quality\": 0.6029500580720093, \"cost\": 0.07541250000000001, \"time\": 109.02409811019898}, \"ee7b726747\": {\"quality\": 0.7833797909407665, \"cost\": 0.039892712, \"time\": 251.92134594917297}, \"eec5f32da9\": {\"quality\": 0.81595818815331, \"cost\": 0.0879383, \"time\": 286.01010518074037}, \"eed40e4378\": {\"quality\": 0.6613821138211382, \"cost\": 0.06929234, \"time\": 282.0774456501007}, \"eef12d478b\": 
{\"quality\": 0.46260162601626015, \"cost\": 0.009616956, \"time\": 204.5144693851471}, \"ef37b3e0be\": {\"quality\": 0.6195121951219512, \"cost\": 0.012494243999999998, \"time\": 257.47455110549924}, \"ef43d497f1\": {\"quality\": 0.40569105691056906, \"cost\": 0.016996992, \"time\": 246.4581404685974}, \"ef4d4c4a62\": {\"quality\": 0.4484552845528455, \"cost\": 0.015539336, \"time\": 182.2000419616699}, \"ef9a651425\": {\"quality\": 0.7040650406504065, \"cost\": 0.07579404, \"time\": 250.92651562690736}, \"f0655621af\": {\"quality\": 0.4265853658536585, \"cost\": 0.15761675999999997, \"time\": 2013.6334864139558}, \"f076b4c9ae\": {\"quality\": 0.7834262485481996, \"cost\": 0.015863708, \"time\": 187.7628330230713}, \"f11eddb4ed\": {\"quality\": 0.20284552845528453, \"cost\": 0.016428986, \"time\": 237.75844135284424}, \"f1408da253\": {\"quality\": 0.6967479674796748, \"cost\": 0.014442047999999999, \"time\": 249.8467248916626}, \"f18cf41929\": {\"quality\": 0.6195121951219512, \"cost\": 0.003961836, \"time\": 117.34907512664795}, \"f1aa0b0b42\": {\"quality\": 0.5720325203252032, \"cost\": 0.043219422, \"time\": 146.36151003837585}, \"f1bda127f6\": {\"quality\": 0.46178861788617886, \"cost\": 0.017929852000000003, \"time\": 197.9198618412018}, \"f1f373e58e\": {\"quality\": 0.7831010452961673, \"cost\": 0.09002725200000002, \"time\": 185.47337765693663}, \"f2a2e91541\": {\"quality\": 0.4859117305458769, \"cost\": 0.012855611999999999, \"time\": 232.05357608795165}, \"f2c04ed1c8\": {\"quality\": 0.405609756097561, \"cost\": 0.0067632479999999995, \"time\": 126.54541292190552}, \"f2cf5db12d\": {\"quality\": 0.19959349593495934, \"cost\": 0.013177552, \"time\": 168.4030219078064}, \"f366c0dd10\": {\"quality\": 0.6195121951219512, \"cost\": 0.06272198, \"time\": 196.8466109275818}, \"f4303a5b4f\": {\"quality\": 0.6103252032520325, \"cost\": 0.09301452800000001, \"time\": 231.81033272743224}, \"f437481e3b\": {\"quality\": 0.5909756097560975, \"cost\": 0.019717612, 
\"time\": 227.29155945777893}, \"f4bc6b63a7\": {\"quality\": 0.571788617886179, \"cost\": 0.12058799000000002, \"time\": 197.6338578224182}, \"f4dc556633\": {\"quality\": 0.6195121951219512, \"cost\": 0.06797158, \"time\": 289.321187877655}, \"f4deb72db6\": {\"quality\": 0.6195121951219512, \"cost\": 0.0105891, \"time\": 200.89887857437134}, \"f4ef2b9c33\": {\"quality\": 0.5439837398373983, \"cost\": 0.03817632000000001, \"time\": 213.8931649684906}, \"f566d6d6a1\": {\"quality\": 0.4036585365853659, \"cost\": 0.062300763999999995, \"time\": 201.07639598846436}, \"f5b9a94dcc\": {\"quality\": 0.6146341463414634, \"cost\": 0.04594065199999999, \"time\": 1847.3314466953277}, \"f5c27e7172\": {\"quality\": 0.6195121951219512, \"cost\": 0.007075463999999999, \"time\": 232.92631516456603}, \"f5e53d963b\": {\"quality\": 0.6195121951219512, \"cost\": 0.004926876000000001, \"time\": 146.53346581459044}, \"f614235c15\": {\"quality\": 0.546829268292683, \"cost\": 0.019139696, \"time\": 1690.8127262592316}, \"f6546149e3\": {\"quality\": 0.7115214866434378, \"cost\": 0.08226788, \"time\": 214.23944363594055}, \"f74ec023e4\": {\"quality\": 0.6195121951219512, \"cost\": 0.016337988, \"time\": 300.67309379577637}, \"f7b048bd54\": {\"quality\": 0.46829268292682935, \"cost\": 0.016687760000000003, \"time\": 214.77183957099913}, \"f7c4df993e\": {\"quality\": 0.6378513356562137, \"cost\": 0.01224465, \"time\": 182.30367503166198}, \"f854533145\": {\"quality\": 0.6585365853658537, \"cost\": 0.017361696, \"time\": 270.88364191055297}, \"f89b8a1930\": {\"quality\": 0.8136236933797909, \"cost\": 0.040442136000000004, \"time\": 276.7925349235535}, \"f93d9a2693\": {\"quality\": 0.37878048780487805, \"cost\": 0.008193323999999998, \"time\": 204.3297384738922}, \"f97d91a249\": {\"quality\": 0.3878048780487805, \"cost\": 0.06316157600000001, \"time\": 184.88821873664855}, \"f99096d89c\": {\"quality\": 0.6195121951219512, \"cost\": 0.014407992, \"time\": 267.10197510719297}, \"f9e8e221f3\": 
{\"quality\": 0.39916376306620205, \"cost\": 0.010630103999999998, \"time\": 213.8131926059723}, \"fa38879eab\": {\"quality\": 0.6504065040650406, \"cost\": 0.021828032000000004, \"time\": 295.17917666435244}, \"fa71111570\": {\"quality\": 0.5853658536585366, \"cost\": 0.069471068, \"time\": 280.6139543056488}, \"fa7882d46b\": {\"quality\": 0.34783972125435536, \"cost\": 0.016001058000000002, \"time\": 247.1370331764221}, \"fa906520d1\": {\"quality\": 0.6655981416957026, \"cost\": 0.018073107999999997, \"time\": 259.80645098686216}, \"faabebaa30\": {\"quality\": 0.22723577235772358, \"cost\": 0.015047418000000002, \"time\": 238.21279168128967}, \"fb0339a7d0\": {\"quality\": 0.7656794425087108, \"cost\": 0.021738743999999997, \"time\": 273.6737798213959}, \"fb216ad6b3\": {\"quality\": 0.3592682926829268, \"cost\": 0.05636392800000001, \"time\": 978.5727415084839}, \"fb6216880a\": {\"quality\": 0.6195121951219512, \"cost\": 0.016447008, \"time\": 279.96212286949157}, \"fba499b89d\": {\"quality\": 0.6195121951219512, \"cost\": 0.06753407200000001, \"time\": 276.17343015670775}, \"fbc010a368\": {\"quality\": 0.6195121951219512, \"cost\": 0.06443590400000002, \"time\": 217.46284394264222}, \"fbc02e2e07\": {\"quality\": 0.6195121951219512, \"cost\": 0.061001164, \"time\": 259.36248960494993}, \"fbd6c45271\": {\"quality\": 0.24020905923344946, \"cost\": 0.015250424000000002, \"time\": 285.5677849769592}, \"fc0a156e16\": {\"quality\": 0.7864692218350754, \"cost\": 0.034151164, \"time\": 244.54232273101806}, \"fc1fd5bf54\": {\"quality\": 0.6195121951219512, \"cost\": 0.010393596, \"time\": 302.0887975215912}, \"fc6967a75b\": {\"quality\": 0.6195121951219512, \"cost\": 0.06284471200000001, \"time\": 293.1112622261047}, \"fc73c3b0fa\": {\"quality\": 0.6592682926829267, \"cost\": 0.025536140000000002, \"time\": 98.14458861351014}, \"fce38334b2\": {\"quality\": 0.6195121951219512, \"cost\": 0.01724038, \"time\": 267.3305795669556}, \"fce5fca128\": {\"quality\": 
0.4260278745644599, \"cost\": 0.066997868, \"time\": 262.8612917900085}, \"fd0709359e\": {\"quality\": 0.6446341463414634, \"cost\": 0.001408038, \"time\": 79.83381242752075}, \"fd1f809d64\": {\"quality\": 0.8097560975609757, \"cost\": 0.07385033799999999, \"time\": 213.83500204086303}, \"fd2c994a9d\": {\"quality\": 0.6390243902439025, \"cost\": 0.06503392399999999, \"time\": 284.80486459732055}, \"fddccfbf94\": {\"quality\": 0.46829268292682935, \"cost\": 0.014656296000000003, \"time\": 217.74681878089905}, \"fe7fa741b4\": {\"quality\": 0.48540069686411147, \"cost\": 0.034931979999999994, \"time\": 209.350274848938}, \"fe9e1fec71\": {\"quality\": 0.5036585365853659, \"cost\": 0.067148512, \"time\": 212.58801732063293}, \"fea4734c09\": {\"quality\": 0.4943902439024391, \"cost\": 0.008797415999999999, \"time\": 129.66075100898743}, \"fef1ca27fa\": {\"quality\": 0.4926829268292683, \"cost\": 0.020025176, \"time\": 248.60985913276673}, \"ff11cb6a7a\": {\"quality\": 0.39739837398373984, \"cost\": 0.0197844, \"time\": 228.27161712646483}, \"ff171e34e2\": {\"quality\": 0.6195121951219512, \"cost\": 0.007519518, \"time\": 156.71968116760254}, \"ff1c958e21\": {\"quality\": 0.6341463414634146, \"cost\": 0.0048062339999999995, \"time\": 124.33939328193665}, \"ff8df4ace9\": {\"quality\": 0.5845528455284553, \"cost\": 0.0622996, \"time\": 187.53624334335328}, \"ff8e68049a\": {\"quality\": 0.28821138211382114, \"cost\": 0.010620372, \"time\": 159.36370429992675}}"
  },
  {
    "path": "abacus-research/cuad_data_loader.py",
    "content": "\"\"\"\nShared CUAD data loading utilities to replace HuggingFace datasets.\nAll CUAD scripts should import from this module.\n\"\"\"\n\nimport json\nimport os\n\nimport numpy as np\n\n# Default data directory\nDEFAULT_DATA_DIR = \"cuad-data\"\n\ndef load_cuad_data(split=\"test\", data_dir=None):\n    \"\"\"\n    Load CUAD dataset from local JSON files.\n    \n    Args:\n        split: \"train\" or \"test\"\n        data_dir: Directory containing CUAD JSON files (default: \"cuad-data\")\n    \n    Returns:\n        List of dictionaries with CUAD data in flat format\n    \"\"\"\n    if data_dir is None:\n        data_dir = DEFAULT_DATA_DIR\n    \n    if split == \"train\":\n        file_path = os.path.join(data_dir, \"train_separate_questions.json\")\n    else:\n        file_path = os.path.join(data_dir, \"test.json\")\n    \n    if not os.path.exists(file_path):\n        raise FileNotFoundError(\n            f\"CUAD data file not found at {file_path}. \"\n            f\"Please run 'python setup_cuad_data.py' first to download the data.\"\n        )\n    \n    with open(file_path) as f:\n        raw_data = json.load(f)\n\n    # Convert to flat format\n    dataset = []\n    for article in raw_data[\"data\"]:\n        title = article.get(\"title\", \"\").strip()\n        for paragraph in article[\"paragraphs\"]:\n            context = paragraph[\"context\"].strip()\n            for qa in paragraph[\"qas\"]:\n                dataset.append({\n                    \"id\": qa[\"id\"],\n                    \"title\": title,\n                    \"context\": context,\n                    \"question\": qa[\"question\"].strip(),\n                    \"answers\": qa.get(\"answers\", [])\n                })\n    \n    return dataset\n\n\ndef get_unique_contracts(dataset):\n    \"\"\"Get list of unique contract titles from dataset.\"\"\"\n    contract_titles = []\n    for row in dataset:\n        if row[\"title\"] not in contract_titles:\n            
contract_titles.append(row[\"title\"])\n    return contract_titles\n\n\ndef filter_by_contracts(dataset, contract_titles):\n    \"\"\"Filter dataset to only include specified contracts.\"\"\"\n    return [row for row in dataset if row[\"title\"] in contract_titles]\n\n\ndef sample_contracts(dataset, num_contracts, seed=42):\n    \"\"\"\n    Sample a subset of contracts from the dataset.\n    \n    Args:\n        dataset: CUAD dataset\n        num_contracts: Number of contracts to sample\n        seed: Random seed for reproducibility\n    \n    Returns:\n        Tuple of (filtered dataset with only the sampled contracts, list of sampled contract titles)\n    \"\"\"\n    contract_titles = get_unique_contracts(dataset)\n    \n    # Shuffle and sample\n    rng = np.random.default_rng(seed=seed)\n    rng.shuffle(contract_titles)\n    sampled_titles = contract_titles[:num_contracts]\n    \n    return filter_by_contracts(dataset, sampled_titles), sampled_titles"
  },
  {
    "path": "abacus-research/download_embeddings_and_mmqa.sh",
    "content": "#!/bin/bash\n\nwget -nc https://palimpzest-workloads.s3.us-east-1.amazonaws.com/abacus-data.tar.gz\ntar -xzf abacus-data.tar.gz\n\n"
  },
  {
    "path": "abacus-research/helper-scripts/biodex-gen-index.py",
    "content": "import os\nimport time\n\nimport chromadb\nimport chromadb.utils.embedding_functions as embedding_functions\nimport numpy as np\nfrom openai import OpenAI\nfrom tqdm import tqdm\n\n# NOTE: this script is meant to be run from the root of the repository\nif __name__ == \"__main__\":\n    # initialize openai client\n    openai_client = OpenAI()\n\n    # load reaction terms\n    reaction_terms = []\n    with open(\"testdata/reaction_terms.txt\") as f:\n        for line in f:\n            reaction_terms.append(line.strip())\n\n    # create directory for embeddings\n    os.makedirs(\"testdata/reaction-term-embeddings/\", exist_ok=True)\n\n    # generate embeddings in batches of 1000 at a time\n    indices = np.linspace(0, len(reaction_terms), len(reaction_terms)//1000, dtype=int)\n    total_embeds = len(indices)\n    print(f\"Generating {total_embeds} embeddings...\")\n    gen_indices = []\n    for iter_idx, start_idx in tqdm(enumerate(indices), total=total_embeds):\n        # check if embedding needs to be computed\n        end_idx = indices[iter_idx + 1] if iter_idx + 1 < len(indices) else None\n        filename = f\"testdata/reaction-term-embeddings/{start_idx}_{end_idx}.npy\"\n        if end_idx is not None and not os.path.exists(filename):\n            # generate embeddings\n            batch = reaction_terms[start_idx:end_idx]\n            resp = openai_client.embeddings.create(input=batch, model=\"text-embedding-3-small\")\n            embeddings = [item.embedding for item in resp.data]\n\n            # save embeddings to disk\n            with open(filename, \"wb\") as f:\n                np.save(f, np.array(embeddings))\n\n            gen_indices.append((start_idx, end_idx))\n            time.sleep(1)\n    print(\"Done generating embeddings.\")\n\n    # initialize chroma client\n    chroma_client = chromadb.PersistentClient(\".chroma-biodex\")\n\n    # initialize embedding function\n    openai_ef = embedding_functions.OpenAIEmbeddingFunction(\n   
     api_key=os.environ[\"OPENAI_API_KEY\"],\n        model_name=\"text-embedding-3-small\"\n    )\n\n    # create a collection\n    collection = chroma_client.get_or_create_collection(\n        name=\"biodex-reaction-terms\",\n        embedding_function=openai_ef,\n        metadata={\"hnsw:space\": \"cosine\"},\n    )\n\n    # insert documents in batches\n    total_inserts = len(gen_indices)\n    print(f\"Inserting {total_inserts} batches into the collection...\")\n    for start_idx, end_idx in tqdm(gen_indices, total=total_inserts):\n        embeddings = np.load(f\"testdata/reaction-term-embeddings/{start_idx}_{end_idx}.npy\")\n        collection.add(\n            documents=reaction_terms[start_idx:end_idx],\n            embeddings=embeddings.tolist(),\n            ids=[f\"id{idx}\" for idx in range(start_idx, end_idx)]\n        )\n"
  },
  {
    "path": "abacus-research/helper-scripts/generate-prior-stats-biodex-first-convert.py",
    "content": "\"\"\"\nNOTE: this script worked with the tag `abacus-paper-experiments` but is no longer compatible with the main branch.\n\"\"\"\nimport argparse\nimport json\nimport os\nimport time\n\nimport datasets\n\n# from ragatouille import RAGPretrainedModel\nimport palimpzest as pz\nfrom palimpzest.constants import Model\n\nbiodex_entry_cols = [\n    {\"name\": \"pmid\", \"type\": str, \"desc\": \"The PubMed ID of the medical paper\"},\n    {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the medical paper\"},\n    {\"name\": \"abstract\", \"type\": str, \"desc\": \"The abstract of the medical paper\"},\n    {\"name\": \"fulltext\", \"type\": str, \"desc\": \"The full text of the medical paper, which contains information relevant for creating a drug safety report.\"},\n]\n\nbiodex_reactions_cols = [\n    {\"name\": \"reactions\", \"type\": list[str], \"desc\": \"The list of all medical conditions experienced by the patient as discussed in the report. Try to provide as many relevant medical conditions as possible.\"},\n]\n\nclass BiodexDataset(pz.IterDataset):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        split: str = \"test\",\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__(id=f\"biodex-{split}\", schema=biodex_entry_cols)\n\n        self.dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=split).to_pandas()\n        if shuffle:\n            self.dataset = self.dataset.sample(n=num_samples, random_state=seed).to_dict(orient=\"records\")\n        else:\n            self.dataset = self.dataset.to_dict(orient=\"records\")[:num_samples]\n\n        self.rp_at_k = rp_at_k\n        self.num_samples = num_samples\n        self.shuffle = shuffle\n        self.seed = seed\n        self.split = split\n\n    def compute_label(self, entry: dict) -> dict:\n        \"\"\"Compute the label for a BioDEX report given its entry in the 
dataset.\"\"\"\n        reactions_lst = [\n            reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n            for reaction in entry[\"reactions\"].split(\",\")\n        ]\n        label_dict = {\"reactions\": reactions_lst}\n        return label_dict\n\n    @staticmethod\n    def term_recall(preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # normalize terms in each list\n            pred_terms = set([\n                term.strip()\n                for pred in preds\n                for term in pred.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n            target_terms = ([\n                term.strip()\n                for target in targets\n                for term in target.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n\n            # compute term recall and return\n            intersect = pred_terms.intersection(target_terms)\n            term_recall = len(intersect) / len(target_terms)\n\n            return term_recall\n\n        except Exception:\n            os.makedirs(\"term-recall-eval-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"term-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        # get entry\n        entry = self.dataset[idx]\n\n        # get input fields\n        pmid = entry[\"pmid\"]\n        title = entry[\"title\"]\n        abstract = entry[\"abstract\"]\n        fulltext = entry[\"fulltext\"]\n\n        # create item with fields\n        item = {\"fields\": {}, \"labels\": {}, \"score_fn\": {}}\n        item[\"fields\"][\"pmid\"] = pmid\n        item[\"fields\"][\"title\"] = title\n        item[\"fields\"][\"abstract\"] = abstract\n        item[\"fields\"][\"fulltext\"] = fulltext\n\n    
    if self.split == \"train\":\n            # add label info\n            item[\"labels\"] = self.compute_label(entry)\n\n            # add scoring functions for list fields\n            item[\"score_fn\"][\"reactions\"] = BiodexDataset.term_recall\n\n        return item\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\"--progress\", default=True, action=\"store_true\", help=\"Print progress output\")\n    args = parser.parse_args()\n\n    # create directory for profiling data\n    os.makedirs(\"priors-data\", exist_ok=True)\n\n    verbose = args.verbose\n    progress = args.progress\n    seed = 123 # NOTE: unique to cascades run\n    execution_strategy = \"parallel\"\n    sentinel_execution_strategy = \"all\"\n    optimizer_strategy = \"pareto\"\n    exp_name = f\"biodex-priors-{optimizer_strategy}-seed{seed}-cascades\" # NOTE: unique to cascades run\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # create data source\n    dataset = BiodexDataset(\n        split=\"test\",\n        num_samples=1,\n        shuffle=True,\n        seed=seed,\n    )\n\n    # create validation data source\n    train_dataset = BiodexDataset(\n        split=\"train\",\n        num_samples=5,\n        shuffle=True,\n        seed=seed,\n    )\n\n    # construct plan\n    plan = dataset\n    plan = plan.sem_add_columns(biodex_reactions_cols)\n\n    # only use final op quality\n    use_final_op_quality = True\n\n    # execute pz plan\n    config = pz.QueryProcessorConfig(\n        optimizer_strategy=optimizer_strategy,\n        sentinel_execution_strategy=sentinel_execution_strategy,\n       
 execution_strategy=execution_strategy,\n        use_final_op_quality=use_final_op_quality,\n        max_workers=64,\n        verbose=verbose,\n        available_models=[ # NOTE: unique to cascades run\n            # Model.GPT_4o,\n            Model.GPT_4o_MINI,\n            Model.LLAMA3_2_3B,\n            Model.LLAMA3_1_8B,\n            Model.LLAMA3_3_70B,\n            Model.LLAMA3_2_90B_V,\n            # Model.MIXTRAL, # NOTE: only available in tag `abacus-paper-experiments`\n            # Model.DEEPSEEK_V3,\n            Model.DEEPSEEK_R1_DISTILL_QWEN_1_5B,\n        ],\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=progress,\n        k=-1,\n        j=-1,\n        sample_budget=5*1014,\n        seed=seed,\n        exp_name=exp_name,\n    )\n\n    data_record_collection = plan.run(config=config, train_dataset=train_dataset, validator=pz.Validator())\n\n    print(data_record_collection.to_df())\n    data_record_collection.to_df().to_csv(f\"priors-data/{exp_name}-output.csv\", index=False)\n\n    # create filepaths for records and stats\n    records_path = f\"priors-data/{exp_name}-records.json\"\n    stats_path = f\"priors-data/{exp_name}-profiling.json\"\n\n    # save record outputs\n    record_jsons = []\n    for record in data_record_collection:\n        record_dict = record.to_dict()\n        record_dict = {\n            k: v\n            for k, v in record_dict.items()\n            if k in [\"pmid\", \"reactions\"]\n        }\n        record_jsons.append(record_dict)\n\n    with open(records_path, \"w\") as f:\n        json.dump(record_jsons, f)\n\n    # save statistics\n    execution_stats_dict = data_record_collection.execution_stats.to_json()\n    with open(stats_path, \"w\") as f:\n        json.dump(execution_stats_dict, f)\n"
  },
  {
    "path": "abacus-research/helper-scripts/generate-prior-stats-biodex.py",
    "content": "\"\"\"\nNOTE: this script worked with the tag `abacus-paper-experiments` but is no longer compatible with the main branch.\n\"\"\"\nimport argparse\nimport json\nimport os\nimport time\nfrom functools import partial\n\nimport chromadb\nimport datasets\nfrom chromadb.utils.embedding_functions.openai_embedding_function import OpenAIEmbeddingFunction\n\n# from ragatouille import RAGPretrainedModel\nimport palimpzest as pz\nfrom palimpzest.constants import Model\n\nbiodex_entry_cols = [\n    {\"name\": \"pmid\", \"type\": str, \"desc\": \"The PubMed ID of the medical paper\"},\n    {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the medical paper\"},\n    {\"name\": \"abstract\", \"type\": str, \"desc\": \"The abstract of the medical paper\"},\n    {\"name\": \"fulltext\", \"type\": str, \"desc\": \"The full text of the medical paper, which contains information relevant for creating a drug safety report.\"},\n    {\"name\": \"reactions\", \"type\": list[str], \"desc\": \"The list of all medical conditions experienced by the patient as discussed in the report. Try to provide as many relevant medical conditions as possible.\"},\n]\n\nbiodex_reaction_labels_cols = [\n    {\"name\": \"reaction_labels\", \"type\": list[str], \"desc\": \"Official terms for medical conditions listed in `reactions`\"},\n]\n\nbiodex_ranked_reactions_labels_cols = [\n    {\"name\": \"ranked_reaction_labels\", \"type\": list[str], \"desc\": \"The ranked list of medical conditions experienced by the patient. The most relevant label occurs first in the list. 
Be sure to rank ALL of the inputs.\"},\n]\n\n\nclass BiodexDataset(pz.IterDataset):\n    def __init__(\n        self,\n        rp_at_k: int = 5,\n        num_samples: int = 5,\n        split: str = \"test\",\n    ):\n        super().__init__(id=f\"biodex-{split}\", schema=biodex_entry_cols)\n\n        if split == \"test\":\n            self.dataset = datasets.load_dataset(\"BioDEX/BioDEX-Reactions\", split=split).to_pandas().to_dict(orient=\"records\")[:num_samples]\n        else:\n            with open('priors-data/source-idx-to-record-state-cascades.json') as f: # NOTE: unique to cascades run\n                self.source_idx_to_record_state = json.load(f)\n                self.dataset = [\n                    self.source_idx_to_record_state[str(idx)]\n                    for idx in range(5)\n                ]\n\n        self.rp_at_k = rp_at_k\n        self.num_samples = num_samples\n        self.split = split\n\n    def compute_label(self, entry: dict) -> dict:\n        \"\"\"Compute the label for a BioDEX report given its entry in the dataset.\"\"\"\n        reactions_lst = [\n            reaction.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\")\n            for reaction in json.dumps(entry[\"reactions\"]).split(\",\")\n        ]\n        label_dict = {\n            \"reaction_labels\": reactions_lst,\n            \"ranked_reaction_labels\": reactions_lst,\n        }\n        return label_dict\n\n    @staticmethod\n    def rank_precision_at_k(preds: list | None, targets: list, k: int):\n        if preds is None:\n            return 0.0\n\n        try:\n            # lower-case each list\n            preds = [pred.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for pred in preds]\n            targets = set([target.strip().lower().replace(\"'\", \"\").replace(\"^\", \"\") for target in targets])\n\n            # compute rank-precision at k\n            rn = len(targets)\n            denom = min(k, rn)\n            total = 0.0\n            for i 
in range(k):\n                total += preds[i] in targets if i < len(preds) else 0.0\n\n            return total / denom\n\n        except Exception:\n            os.makedirs(\"rp@k-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"rp@k-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    @staticmethod\n    def term_recall(preds: list | None, targets: list):\n        if preds is None:\n            return 0.0\n\n        try:\n            # normalize terms in each list\n            pred_terms = set([\n                term.strip()\n                for pred in preds\n                for term in pred.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n            target_terms = ([\n                term.strip()\n                for target in targets\n                for term in target.lower().replace(\"'\", \"\").replace(\"^\", \"\").split(\" \")\n            ])\n\n            # compute term recall and return\n            intersect = pred_terms.intersection(target_terms)\n            term_recall = len(intersect) / len(target_terms)\n\n            return term_recall\n\n        except Exception:\n            os.makedirs(\"term-recall-eval-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"term-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        # get entry\n        entry = self.dataset[idx]\n\n        # get input fields\n        pmid = entry[\"pmid\"]\n        title = entry[\"title\"]\n        abstract = entry[\"abstract\"]\n        fulltext = entry[\"fulltext\"]\n        reactions = entry[\"reactions\"]\n\n        # create item with fields\n        item = {\"fields\": {}, \"labels\": {}, \"score_fn\": {}}\n        item[\"fields\"][\"pmid\"] = pmid\n        
item[\"fields\"][\"title\"] = title\n        item[\"fields\"][\"abstract\"] = abstract\n        item[\"fields\"][\"fulltext\"] = fulltext\n        item[\"fields\"][\"reactions\"] = json.dumps(reactions)\n\n        if self.split == \"train\":\n            # add label info\n            item[\"labels\"] = self.compute_label(entry)\n\n            # add scoring functions for list fields\n            rank_precision_at_k = partial(BiodexDataset.rank_precision_at_k, k=self.rp_at_k)\n            item[\"score_fn\"][\"reaction_labels\"] = BiodexDataset.term_recall\n            item[\"score_fn\"][\"ranked_reaction_labels\"] = rank_precision_at_k\n\n        return item\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\"--progress\", default=True, action=\"store_true\", help=\"Print progress output\")\n    args = parser.parse_args()\n\n    # create directory for profiling data\n    os.makedirs(\"priors-data\", exist_ok=True)\n\n    verbose = args.verbose\n    progress = args.progress\n    seed = 123 # NOTE: unique to cascades run\n    execution_strategy = \"parallel\"\n    sentinel_execution_strategy = \"all\"\n    optimizer_strategy = \"pareto\"\n    exp_name = f\"biodex-priors-{optimizer_strategy}-seed{seed}-second-convert-cascades\" # NOTE: unique to cascades run\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # create data source\n    dataset = BiodexDataset(\n        split=\"test\",\n        num_samples=1,\n    )\n\n    # create validation data source\n    train_dataset = BiodexDataset(\n        split=\"train\",\n        num_samples=5,\n    )\n\n    # load index 
[text-embedding-3-small]\n    chroma_client = chromadb.PersistentClient(\".chroma-biodex\")\n    openai_ef = OpenAIEmbeddingFunction(\n        api_key=os.environ[\"OPENAI_API_KEY\"],\n        model_name=\"text-embedding-3-small\",\n    )\n    index = chroma_client.get_collection(\"biodex-reaction-terms\", embedding_function=openai_ef)\n\n    def search_func(index: chromadb.Collection, query: list[list[float]], k: int) -> dict[str, list[str]]:\n        # execute query with embeddings\n        results = index.query(query, n_results=5)\n\n        # get list of result terms with their cosine similarity scores\n        final_results = []\n        for query_docs, query_distances in zip(results[\"documents\"], results[\"distances\"]):\n            for doc, dist in zip(query_docs, query_distances):\n                cosine_similarity = 1 - dist\n                final_results.append({\"content\": doc, \"similarity\": cosine_similarity})\n\n        # sort the results by similarity score\n        sorted_results = sorted(final_results, key=lambda result: result[\"similarity\"], reverse=True)\n\n        # remove duplicates\n        sorted_results_set = set()\n        final_sorted_results = []\n        for result in sorted_results:\n            if result[\"content\"] not in sorted_results_set:\n                sorted_results_set.add(result[\"content\"])\n                final_sorted_results.append(result[\"content\"])\n\n        # return the top-k most similar results, keyed by the output attribute name\n        return {\"reaction_labels\": final_sorted_results[:k]}\n\n    # construct plan\n    plan = dataset.sem_topk(\n        index=index,\n        search_func=search_func,\n        search_attr=\"reactions\",\n        output_attrs=biodex_reaction_labels_cols,\n    )\n    plan = plan.sem_add_columns(biodex_ranked_reactions_labels_cols, depends_on=[\"title\", \"abstract\", \"fulltext\", \"reaction_labels\"])\n\n    # only use final op quality\n    use_final_op_quality = True\n\n    # execute pz plan\n    config 
= pz.QueryProcessorConfig(\n        optimizer_strategy=optimizer_strategy,\n        sentinel_execution_strategy=sentinel_execution_strategy,\n        execution_strategy=execution_strategy,\n        use_final_op_quality=use_final_op_quality,\n        max_workers=64,\n        verbose=verbose,\n        available_models=[ # NOTE: unique to cascades run\n            # Model.GPT_4o,\n            Model.GPT_4o_MINI,\n            Model.LLAMA3_2_3B,\n            Model.LLAMA3_1_8B,\n            Model.LLAMA3_3_70B,\n            Model.LLAMA3_2_90B_V,\n            # Model.MIXTRAL, # NOTE: only available in tag `abacus-paper-experiments`\n            # Model.DEEPSEEK_V3,\n            Model.DEEPSEEK_R1_DISTILL_QWEN_1_5B,\n        ],\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=progress,\n        k=-1,\n        j=-1,\n        sample_budget=5*1014 + 5*7,\n        seed=seed,\n        exp_name=exp_name,\n    )\n\n    data_record_collection = plan.optimize_and_run(config=config, train_dataset=train_dataset, validator=pz.Validator())\n\n    print(data_record_collection.to_df())\n    data_record_collection.to_df().to_csv(f\"priors-data/{exp_name}-output.csv\", index=False)\n\n    # create filepaths for records and stats\n    records_path = f\"priors-data/{exp_name}-records.json\"\n    stats_path = f\"priors-data/{exp_name}-profiling.json\"\n\n    # save record outputs\n    record_jsons = []\n    for record in data_record_collection:\n        record_dict = record.to_dict()\n        record_dict = {\n            k: v\n            for k, v in record_dict.items()\n            if k in [\"pmid\", \"reactions\"]\n        }\n        record_jsons.append(record_dict)\n\n    with open(records_path, \"w\") as f:\n        json.dump(record_jsons, f)\n\n    # save statistics\n    execution_stats_dict = data_record_collection.execution_stats.to_json()\n    with open(stats_path, \"w\") as f:\n        
json.dump(execution_stats_dict, f)\n"
  },
  {
    "path": "abacus-research/helper-scripts/generate-prior-stats-cuad.py",
    "content": "\"\"\"\nNOTE: this script worked with the tag `abacus-paper-experiments` but is no longer compatible with the main branch.\n\"\"\"\nimport argparse\nimport json\nimport os\nimport string\nfrom functools import partial\n\nimport datasets\nimport numpy as np\nimport pandas as pd\n\nimport palimpzest as pz\nfrom palimpzest.constants import Model\n\ncuad_categories = [\n    {\n        \"Category\": \"Document Name\",\n        \"Description\": \"The name of the contract\",\n        \"Answer Format\": \"Contract Name\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Parties\",\n        \"Description\": \"The two or more parties who signed the contract\",\n        \"Answer Format\": \"Entity or individual names\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Agreement Date\",\n        \"Description\": \"The date of the contract\",\n        \"Answer Format\": \"Date (mm/dd/yyyy)\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Effective Date\",\n        \"Description\": \"The date when the contract is effective\\u00a0\",\n        \"Answer Format\": \"Date (mm/dd/yyyy)\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Expiration Date\",\n        \"Description\": \"On what date will the contract's initial term expire?\",\n        \"Answer Format\": \"Date (mm/dd/yyyy) / Perpetual\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Renewal Term\",\n        \"Description\": \"What is the renewal term after the initial term expires? 
This includes automatic extensions and unilateral extensions with prior notice.\",\n        \"Answer Format\": \"[Successive] number of years/months / Perpetual\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Notice Period to Terminate Renewal\",\n        \"Description\": \"What is the notice period required to terminate renewal?\",\n        \"Answer Format\": \"Number of days/months/year(s)\",\n        \"Group\": \"Group: 1\",\n    },\n    {\n        \"Category\": \"Governing Law\",\n        \"Description\": \"Which state/country's law governs the interpretation of the contract?\",\n        \"Answer Format\": \"Name of a US State / non-US Province, Country\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Most Favored Nation\",\n        \"Description\": \"Is there a clause that if a third party gets better terms on the licensing or sale of technology/goods/services described in the contract, the buyer of such technology/goods/services under the contract shall be entitled to those better terms?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Non-Compete\",\n        \"Description\": \"Is there a restriction on the ability of a party to compete with the counterparty or operate in a certain geography or business or technology sector?\\u00a0\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"Exclusivity\",\n        \"Description\": \"Is there an exclusive dealing\\u00a0 commitment with the counterparty? 
This includes a commitment to procure all \\u201crequirements\\u201d from one party of certain technology, goods, or services or a prohibition on licensing or selling technology, goods or services to third parties, or a prohibition on\\u00a0 collaborating or working with other parties), whether during the contract or\\u00a0 after the contract ends (or both).\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"No-Solicit of Customers\",\n        \"Description\": \"Is a party restricted from contracting or soliciting customers or partners of the counterparty, whether during the contract or after the contract ends (or both)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"Competitive Restriction Exception\",\n        \"Description\": \"This category includes the exceptions or carveouts to Non-Compete, Exclusivity and No-Solicit of Customers above.\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 2\",\n    },\n    {\n        \"Category\": \"No-Solicit of Employees\",\n        \"Description\": \"Is there a restriction on a party\\u2019s soliciting or hiring employees and/or contractors from the\\u00a0 counterparty, whether during the contract or after the contract ends (or both)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Non-Disparagement\",\n        \"Description\": \"Is there a requirement on a party not to disparage the counterparty?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Termination for Convenience\",\n        \"Description\": \"Can a party terminate this\\u00a0 contract without cause (solely by giving a notice and allowing a waiting\\u00a0 period to expire)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        
\"Category\": \"Rofr/Rofo/Rofn\",\n        \"Description\": \"Is there a clause granting one party a right of first refusal, right of first offer or right of first negotiation to purchase, license, market, or distribute equity interest, technology, assets, products or services?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Change of Control\",\n        \"Description\": \"Does one party have the right to terminate or is consent or notice required of the counterparty if such party undergoes a change of control, such as a merger, stock sale, transfer of all or substantially all of its assets or business, or assignment by operation of law?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 3\",\n    },\n    {\n        \"Category\": \"Anti-Assignment\",\n        \"Description\": \"Is consent or notice required of a party if the contract is assigned to a third party?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 3\",\n    },\n    {\n        \"Category\": \"Revenue/Profit Sharing\",\n        \"Description\": \"Is one party required to share revenue or profit with the counterparty for any technology, goods, or\\u00a0services?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Price Restrictions\",\n        \"Description\": \"Is there a restriction on the\\u00a0 ability of a party to raise or reduce prices of technology, goods, or\\u00a0 services provided?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Minimum Commitment\",\n        \"Description\": \"Is there a minimum order size or minimum amount or units per-time period that one party must buy from the counterparty under the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Volume 
Restriction\",\n        \"Description\": \"Is there a fee increase or consent requirement, etc. if one party\\u2019s use of the product/services exceeds certain threshold?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"IP Ownership Assignment\",\n        \"Description\": \"Does intellectual property created\\u00a0 by one party become the property of the counterparty, either per the terms of the contract or upon the occurrence of certain events?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Joint IP Ownership\",\n        \"Description\": \"Is there any clause providing for joint or shared ownership of intellectual property between the parties to the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"License Grant\",\n        \"Description\": \"Does the contract contain a license granted by one party to its counterparty?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Non-Transferable License\",\n        \"Description\": \"Does the contract limit the ability of a party to transfer the license being granted to a third party?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Affiliate License-Licensor\",\n        \"Description\": \"Does the contract contain a license grant by affiliates of the licensor or that includes intellectual property of affiliates of the licensor?\\u00a0\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Affiliate License-Licensee\",\n        \"Description\": \"Does the contract contain a license grant to a licensee (incl. 
sublicensor) and the affiliates of such licensee/sublicensor?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Unlimited/All-You-Can-Eat-License\",\n        \"Description\": \"Is there a clause granting one party an \\u201centerprise,\\u201d \\u201call you can eat\\u201d or unlimited usage license?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Irrevocable or Perpetual License\",\n        \"Description\": \"Does the contract contain a\\u00a0 license grant that is irrevocable or perpetual?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 4\",\n    },\n    {\n        \"Category\": \"Source Code Escrow\",\n        \"Description\": \"Is one party required to deposit its source code into escrow with a third party, which can be released to the counterparty upon the occurrence of certain events (bankruptcy,\\u00a0 insolvency, etc.)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Post-Termination Services\",\n        \"Description\": \"Is a party subject to obligations after the termination or expiration of a contract, including any post-termination transition, payment, transfer of IP, wind-down, last-buy, or similar commitments?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 5\",\n    },\n    {\n        \"Category\": \"Audit Rights\",\n        \"Description\": \"Does a party have the right to\\u00a0 audit the books, records, or physical locations of the counterparty to ensure compliance with the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 5\",\n    },\n    {\n        \"Category\": \"Uncapped Liability\",\n        \"Description\": \"Is a party\\u2019s liability uncapped upon the breach of its obligation in the contract? 
This also includes uncap liability for a particular type of breach such as IP infringement or breach of confidentiality obligation.\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 6\",\n    },\n    {\n        \"Category\": \"Cap on Liability\",\n        \"Description\": \"Does the contract include a cap on liability upon the breach of a party\\u2019s obligation? This includes time limitation for the counterparty to bring claims or maximum amount for recovery.\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: 6\",\n    },\n    {\n        \"Category\": \"Liquidated Damages\",\n        \"Description\": \"Does the contract contain a clause that would award either party liquidated damages for breach or a fee upon the termination of a contract (termination fee)?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Warranty Duration\",\n        \"Description\": \"What is the duration of any\\u00a0 warranty against defects or errors in technology, products, or services\\u00a0 provided under the contract?\",\n        \"Answer Format\": \"Number of months or years\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Insurance\",\n        \"Description\": \"Is there a requirement for insurance that must be maintained by one party for the benefit of the counterparty?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Covenant Not to Sue\",\n        \"Description\": \"Is a party restricted from contesting the validity of the counterparty\\u2019s ownership of intellectual property or otherwise bringing a claim against the counterparty for matters unrelated to the contract?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n    {\n        \"Category\": \"Third Party Beneficiary\",\n        \"Description\": \"Is there a non-contracting party who is a 
beneficiary to some or all of the clauses in the contract and therefore can enforce its rights against a contracting party?\",\n        \"Answer Format\": \"Yes/No\",\n        \"Group\": \"Group: -\",\n    },\n]\n\nNUM_FIELDS_TO_EXTRACT_PER_CONTRACT = 41\n\n# 0.15 is used in the Doc-ETL paper. It should be 0.5 for the actual benchmark.\nIOU_THRESH = 0.15\n\n\n#  Return the Jaccard similarity between two strings\ndef get_jaccard(label, pred):\n    remove_tokens = [c for c in string.punctuation if c != \"/\"]\n    for token in remove_tokens:\n        label = label.replace(token, \"\")\n        pred = pred.replace(token, \"\")\n    label = label.lower()\n    pred = pred.lower()\n    label = label.replace(\"/\", \" \")\n    pred = pred.replace(\"/\", \" \")\n\n    label_words = set(label.split(\" \"))\n    pred_words = set(pred.split(\" \"))\n\n    intersection = label_words.intersection(pred_words)\n    union = label_words.union(pred_words)\n    jaccard = len(intersection) / len(union)\n    return jaccard\n\n\n# Find the number of true positives, false positives, and false negatives for each entry\n# (one field extracted from each contract) by comparing the labels and predictions.\n# Labels and preds are lists of strings\ndef evaluate_entry(labels, preds, substr_ok):\n    tp, fp, fn = 0, 0, 0\n\n    # jaccard similarity expects strings\n    # TODO: This is a hack, ideally, the return type of the preds should be known\n    for idx, pred in enumerate(preds):\n        if not isinstance(pred, str):\n            print(f\"Expected string, but got {pred}\")\n            preds[idx] = str(pred)\n\n    # first check if labels is empty\n    if len(labels) == 0:\n        if len(preds) > 0:\n            fp += len(preds)  # false positive for each one\n    else:\n        for ans in labels:\n            assert len(ans) > 0\n            # check if there is a match\n            match_found = False\n            for pred in preds:\n                if substr_ok:\n                    
is_match = get_jaccard(ans, pred) >= IOU_THRESH or ans in pred\n                else:\n                    is_match = get_jaccard(ans, pred) >= IOU_THRESH\n                if is_match:\n                    match_found = True\n\n            if match_found:\n                tp += 1\n            else:\n                fn += 1\n\n        # now also get any fps by looping through preds\n        for pred in preds:\n            # Check if there's a match. if so, don't count (don't want to double count based on the above)\n            # but if there's no match, then this is a false positive.\n            # (Note: we get the true positives in the above loop instead of this loop so that we don't double count\n            # multiple predictions that are matched with the same answer.)\n            match_found = False\n            for ans in labels:\n                assert len(ans) > 0\n                if substr_ok:\n                    is_match = get_jaccard(ans, pred) >= IOU_THRESH or ans in pred\n                else:\n                    is_match = get_jaccard(ans, pred) >= IOU_THRESH\n                if is_match:\n                    match_found = True\n\n            if not match_found:\n                fp += 1\n\n    return tp, fp, fn\n\n\n# TODO(Siva): This is a temporary fix to handle the case where the preds are empty.\ndef handle_empty_preds(preds):\n    if preds is None or (  # noqa: SIM114\n        isinstance(preds, str) and (preds == \"\" or preds == \" \" or preds == \"null\" or preds == \"None\")\n    ):\n        return []\n    elif isinstance(preds, float) and np.isnan(preds):\n        return []\n    if not isinstance(preds, (list, np.ndarray)):\n        return [preds]\n    return preds\n\n\n# Compute the precision and recall for the entire dataset.\n# Each row in the dataframes should correspond to a contract.\n# The columns should be the extracted fields (categories in cuad_categories).\ndef compute_precision_recall(label_df, preds_df):\n    tp, fp, fn = 0, 0, 
0\n\n    label_df = label_df.sort_values(\"contract_id\").reset_index(drop=True)\n    preds_df = preds_df.sort_values(\"contract_id\").reset_index(drop=True)\n\n    assert label_df.shape == preds_df.shape, (\n        f\"Label and prediction dataframes have different shapes, label shape: {label_df.shape} vs preds shape {preds_df.shape}\"\n    )\n\n    categories = [category[\"Category\"] for category in cuad_categories]\n\n    for label_row, pred_row in zip(label_df.iterrows(), preds_df.iterrows()):\n        assert label_row[1][\"contract_id\"] == pred_row[1][\"contract_id\"], (\n            f\"IDs do not match. label id: {label_row[1]['contract_id']} vs pred id: {pred_row[1]['contract_id']}\"\n        )\n        for category in categories:\n            substr_ok = \"Parties\" in category\n\n            labels = label_row[1][category]\n            assert isinstance(labels, list)\n\n            preds = pred_row[1][category]\n            preds = handle_empty_preds(preds)\n\n            entry_tp, entry_fp, entry_fn = evaluate_entry(labels, preds, substr_ok)\n            tp += entry_tp\n            fp += entry_fp\n            fn += entry_fn\n\n    precision = tp / (tp + fp) if tp + fp > 0 else np.nan\n    recall = tp / (tp + fn) if tp + fn > 0 else np.nan\n\n    return precision, recall\n\nclass CUADDataset(pz.IterDataset):\n    def __init__(self, num_contracts: int = 1, split: str = \"train\", seed: int=42):\n        self.num_contracts = num_contracts\n        self.split = split\n        self.seed = seed\n\n        input_cols = [\n            {\"name\": \"contract_id\", \"type\": str, \"desc\": \"The id of the contract to be analyzed\"},\n            {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the contract to be analyzed\"},\n            {\"name\": \"contract\", \"type\": str, \"desc\": \"The content of the contract to be analyzed\"},\n        ]\n        super().__init__(id=f\"cuad-{split}\", schema=input_cols)\n\n        # convert the 
dataset into a list of dictionaries where each row is for a single contract\n        include_labels = split == \"train\"\n        dataset = datasets.load_dataset(\"theatticusproject/cuad-qa\")[split]\n        self.dataset = self._construct_dataset(dataset, num_contracts, seed, include_labels)\n\n\n    def _construct_dataset(self, dataset, num_contracts, seed: int=42, include_labels: bool=False):\n        # get the set of unique contract titles; to ensure the order of the contracts is\n        # preserved, we use a list rather than using python's set()\n        contract_titles = []\n        for row in dataset:\n            if row[\"title\"] not in contract_titles:\n                contract_titles.append(row[\"title\"])\n\n        # shuffle the contracts for the given seed\n        rng = np.random.default_rng(seed=seed)\n        rng.shuffle(contract_titles)\n\n        # get the first num_contracts\n        contract_titles = contract_titles[:num_contracts]\n\n        # construct the dataset one contract at a time\n        new_dataset = []\n        for title in contract_titles:\n            # get the rows for this contract\n            contract_rows = [row for row in dataset if row[\"title\"] == title]\n\n            # construct the contract; we get the contract_id and contract text from the first row\n            contract = {\n                \"contract_id\": contract_rows[0][\"id\"],\n                \"title\": title,\n                \"contract\": contract_rows[0][\"context\"],\n            }\n\n            # for train / validation data, add the labels\n            if include_labels:\n                contract = {\"fields\": contract}\n\n                # add the labels\n                category_names = list(map(lambda category: category[\"Category\"], cuad_categories))\n                contract[\"labels\"] = {category: [] for category in category_names}\n                contract[\"score_fn\"] = {category: None for category in category_names}\n                for row 
in contract_rows:\n                    category_name = row[\"id\"].split(\"__\")[-1].split(\"_\")[0].strip()\n                    category_name = category_name.replace(\" For \", \" for \")\n                    category_name = category_name.replace(\" Of \", \" of \")\n                    category_name = category_name.replace(\" On \", \" on \")\n                    category_name = category_name.replace(\" Or \", \" or \")\n                    category_name = category_name.replace(\" To \", \" to \")\n                    category_name = category_name.replace(\"Ip\", \"IP\")\n                    assert category_name in category_names, f\"Unknown category {category_name}\"\n                    contract[\"labels\"][category_name].extend(row[\"answers\"][\"text\"])\n\n                    def score_fn(preds, labels, category_name):\n                        preds = handle_empty_preds(preds)\n                        # substring matches are only accepted for the \"Parties\" category\n                        entry_tp, _, entry_fn = evaluate_entry(labels, preds, substr_ok=(category_name == \"Parties\"))\n                        score = None\n                        if len(labels) > 0:  # noqa: SIM108\n                            score = entry_tp / (entry_tp + entry_fn)\n                        else:\n                            score = 1.0 if len(preds) == 0 else 0.0\n\n                        return score\n\n                    contract[\"score_fn\"][category_name] = partial(score_fn, category_name=category_name)\n\n            # add the contract to the dataset\n            new_dataset.append(contract)\n\n        return new_dataset\n\n    def __len__(self):\n        # the split may contain fewer than num_contracts contracts\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        return self.dataset[idx]\n\n    def get_label_df(self):\n        full_dataset = datasets.load_dataset(\"theatticusproject/cuad-qa\")[self.split]\n        label_dataset = self._construct_dataset(full_dataset, self.num_contracts, self.seed, True)\n        final_label_dataset = 
[]\n        for entry in label_dataset:\n            row = {}\n            row[\"contract_id\"] = entry[\"fields\"][\"contract_id\"]\n            row[\"title\"] = entry[\"fields\"][\"title\"]\n            row[\"contract\"] = entry[\"fields\"][\"contract\"]\n            row = {**row, **entry[\"labels\"]}\n            final_label_dataset.append(row)\n\n        return pd.DataFrame(final_label_dataset)\n\n\ndef parse_arguments():\n    parser = argparse.ArgumentParser(description=\"Run CUAD demo\")\n    parser.add_argument(\"--mode\", type=str, help=\"one-convert or separate-converts\", default=\"one-convert\")\n    parser.add_argument(\"--test\", type=str, help=\"test time compute active or inactive\", default=\"active\")\n    parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    return parser.parse_args()\n\n\ndef build_cuad_query(dataset, mode):\n    assert mode in [\"one-convert\", \"separate-converts\"]\n\n    if mode == \"one-convert\":\n        cols = []\n        for category in cuad_categories:\n            desc = (\n                f\"Extract the text spans (if they exist) from the contract corresponding to {category['Description']}\"\n            )\n            cols.append({\"name\": category[\"Category\"], \"type\": list[str], \"desc\": desc})\n\n        desc = \"Extract the text spans (if they exist) from the contract.\"\n        dataset = dataset.sem_add_columns(cols, desc=desc, depends_on=[\"contract\"])\n    elif mode == \"separate-converts\":\n        for category in cuad_categories:\n            desc = (\n                f\"Extract the text spans (if they exist) from the contract corresponding to {category['Description']}\"\n            )\n            dataset = dataset.sem_add_columns(\n                [{\"name\": category[\"Category\"], \"type\": list[str], \"desc\": desc}],\n                desc=category[\"Description\"],\n                depends_on=[\"contract\"],\n            )\n\n    return 
dataset\n\n\ndef main():\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    args = parse_arguments()\n\n    # create directory for profiling data\n    os.makedirs(\"opt-profiling-data\", exist_ok=True)\n\n    # Create a data reader for the CUAD dataset\n    dataset = CUADDataset(split=\"test\", num_contracts=1)\n    train_dataset = CUADDataset(split=\"train\", num_contracts=5)\n    print(\"Created data reader\")\n\n    # Build and run the CUAD query\n    query = build_cuad_query(dataset, args.mode)\n    print(\"Built query; Starting query execution\")\n\n    execution_strategy = \"parallel\"\n    sentinel_execution_strategy = \"all\"\n    optimizer_strategy = \"pareto\"\n    seed = 0\n    exp_name = f\"cuad-priors-{optimizer_strategy}-seed{seed}\"\n    config = pz.QueryProcessorConfig(\n        verbose=False,\n        optimizer_strategy=optimizer_strategy,\n        sentinel_execution_strategy=sentinel_execution_strategy,\n        execution_strategy=execution_strategy,\n        max_workers=64,\n        available_models=[\n            Model.GPT_4o,\n            Model.GPT_4o_MINI,\n            # Model.LLAMA3_2_3B,\n            Model.LLAMA3_1_8B,\n            Model.LLAMA3_3_70B,\n            # Model.LLAMA3_2_90B_V,\n            # Model.MIXTRAL, # NOTE: only available in tag `abacus-paper-experiments`\n            # Model.DEEPSEEK_V3,\n            Model.DEEPSEEK_R1_DISTILL_QWEN_1_5B,\n        ],\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=True,\n        k=-1,\n        j=-1,\n        sample_budget=1014*5,\n        seed=seed,\n        exp_name=exp_name,\n    )\n    data_record_collection = query.optimize_and_run(config=config, train_dataset=train_dataset, validator=pz.Validator())\n    
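print(\"Query execution completed\")\n\n    # save statistics; create the output directory first, since only\n    # \"opt-profiling-data\" is created earlier in main()\n    os.makedirs(\"priors-data\", exist_ok=True)\n    execution_stats_dict = data_record_collection.execution_stats.to_json()\n    with open(f\"priors-data/{exp_name}-stats.json\", \"w\") as f:\n        json.dump(execution_stats_dict, f)\n\n\nif __name__ == \"__main__\":\n    main()\n"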
  },
  {
    "path": "abacus-research/helper-scripts/mmqa-baseline.py",
    "content": "import argparse\nimport json\nimport os\nimport string\nimport time\n\nimport numpy as np\nfrom openai import OpenAI\n\nfrom palimpzest.constants import Cardinality, Model\nfrom palimpzest.query.generators.generators import get_json_from_answer\n\n\ndef f1(preds: list | None, targets: list):\n    if preds is None or len(targets) == 0:\n        return 0.0\n\n    tp, fp, fn = 0, 0, 0\n    try:\n        # compute recall of retrieved ids and return\n        preds = [str(pred).lower() for pred in preds]\n        targets = [str(target).lower() for target in targets]\n\n        remove_tokens = [c for c in string.punctuation if c != \"/\"]\n        for token in remove_tokens:\n            preds = [pred.replace(token, \"\") for pred in preds]\n            targets = [target.replace(token, \"\") for target in targets]\n\n        for pred in preds:\n            if pred in targets:\n                tp += 1\n            else:\n                fp += 1\n        for target in targets:\n            if target not in preds:\n                fn += 1\n\n        # compute overall f1 score and return\n        recall = tp / (tp + fn) if (tp + fn) > 0 else 0\n        precision = tp / (tp + fp) if (tp + fp) > 0 else 0\n        f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0\n\n        return f1\n\n    except Exception:\n        os.makedirs(\"mmqa-recall-eval-errors\", exist_ok=True)\n        ts = time.time()\n        with open(f\"mmqa-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n            f.write(str(preds))\n        return 0.0\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    parser.add_argument(\n        \"--seed\",\n        default=42,\n        type=int,\n        help=\"Seed used to initialize RNG for MAB sampling algorithm\",\n    )\n    args = parser.parse_args()\n\n    # create directory for profiling data\n    
os.makedirs(\"opt-profiling-data\", exist_ok=True)\n    seed = args.seed\n    print(f\"Running with seed: {seed}\")\n\n    # start time for processing\n    start_time = time.time()\n\n    # read the appropriate dataset\n    dataset = []\n    with open(\"data/MMQA_dev.jsonl\") as f:\n        for line in f:\n            dict_line = json.loads(line)\n            if \"image\" in dict_line[\"metadata\"][\"modalities\"] and len(dict_line[\"metadata\"][\"modalities\"]) > 1:\n                dataset.append(dict_line)\n\n    # shuffle the questions for the given seed\n    rng = np.random.default_rng(seed=seed)\n    rng.shuffle(dataset)\n\n    # trim to number of samples\n    dataset = dataset[:100]\n\n    # construct the prompt\n    prompt = \"\"\"You are an intelligent AI assistant designed to answer questions. Please answer the following question to the best of your ability based on your prior knowledge.\n    Return your answer as a JSON list of strings.\n    Do not include any additional context or an explanation in your answer, simply list the entities asked for by the question:\n\nQUESTION: {question}\n\nANSWER: \n\"\"\"\n\n    # iterate over the dataset and generate answers\n    model_name = \"gpt-4o-mini-2024-07-18\"\n    preds, total_cost = [], 0.0\n    for idx, entry in enumerate(dataset):\n        print(f\"Processing entry {idx}\")\n        formatted_prompt = prompt.format(question=entry[\"question\"])\n        client = OpenAI()\n        payload = {\n            \"model\": model_name,\n            \"temperature\": 0.0,\n            \"messages\": [{\"role\": \"user\", \"content\": formatted_prompt}],\n        }\n        completion = client.chat.completions.create(**payload)\n\n        # compute total cost\n        model = Model(model_name)\n        usd_per_input_token = model.get_usd_per_input_token()\n        usd_per_output_token = model.get_usd_per_output_token()\n        input_tokens = completion.usage.prompt_tokens\n        output_tokens = 
completion.usage.completion_tokens\n        total_cost += input_tokens * usd_per_input_token + output_tokens * usd_per_output_token\n\n        # extract answer\n        completion_text = completion.choices[0].message.content\n        try:\n            answer = get_json_from_answer(completion_text, Model.GPT_4o_MINI, Cardinality.ONE_TO_MANY)\n        except Exception:\n            answer = [completion_text]\n        preds.append(answer)\n\n    # get total time\n    total_time = time.time() - start_time\n\n    # score the output\n    scores = []\n    for pred, entry in zip(preds, dataset):\n        answers = entry[\"answers\"]\n        answer = [ans[\"answer\"] for ans in answers]\n        f1_score = f1(pred, answer)\n        scores.append(f1_score)\n\n    # create final stats dict\n    stats = {}\n    stats[\"total_time\"] = total_time\n    stats[\"total_cost\"] = total_cost\n    stats[\"f1\"] = np.mean(scores)\n    with open(f'opt-profiling-data/mmqa-baseline-seed-{seed}-stats.json', 'w') as f:\n        json.dump(stats, f)\n    print(stats)\n"
  },
  {
    "path": "abacus-research/helper-scripts/mmqa-gen-image-index.py",
    "content": "import json\nimport os\n\nimport chromadb\nimport chromadb.utils.embedding_functions as embedding_functions\nimport numpy as np\nfrom PIL import Image\nfrom sentence_transformers import SentenceTransformer\nfrom tqdm import tqdm\n\nCORRUPTED_IMAGE_IDS = [\n    \"17ae0616ac745e70781203267f3a382d\",\n    \"bf201cbbd058ef51aef89b1be4158c2a\",\n    \"ef457a7b3ab437cd78ab9f82dc083048\",\n    \"225c3db49d60b5ef30ed0bfc649ebf78\",\n    \"b413cc1dc4969dcbe4cb6a55c0f2e359\",\n    \"e81b2acfd792b171389c8f47a0e14504\",\n]\n\n# NOTE: this script is meant to be run from the root of the repository\nif __name__ == \"__main__\":\n    # load CLIP model\n    model = SentenceTransformer(\"clip-ViT-B-32\")\n\n    # load image metadata\n    image_filepaths, image_ids = [], []\n    with open(\"data/MMQA_images.jsonl\") as f:\n        possible_endings = {'.JPG', '.png', '.jpeg', '.jpg', '.tif', '.JPEG', '.tiff', '.PNG', '.Jpg', '.gif'}\n        for line in f:\n            dict_line = json.loads(line)\n            image_id = dict_line[\"id\"]\n\n            # skip corrupted images:\n            if image_id in CORRUPTED_IMAGE_IDS:\n                continue\n\n            # add image to image_ids\n            image_ids.append(image_id)\n\n            # find the correct image file\n            for ending in possible_endings:\n                if os.path.exists(f\"/ssd1/mdrusso/mmqa-images/{image_id}{ending}\"):\n                    image_id += ending\n                    break\n\n            image_filepaths.append(f\"/ssd1/mdrusso/mmqa-images/{image_id}\")\n\n    # create directory for embeddings\n    os.makedirs(\"testdata/mmqa-image-embeddings/\", exist_ok=True)\n\n    # generate embeddings in batches of 128 at a time\n    indices = np.linspace(0, len(image_filepaths), len(image_filepaths)//128, dtype=int)\n    total_embeds = len(indices)\n    print(f\"Generating {total_embeds} batches of embeddings...\")\n    gen_indices = []\n    for iter_idx, start_idx in 
tqdm(enumerate(indices), total=total_embeds):\n        # check if embedding needs to be computed\n        end_idx = indices[iter_idx + 1] if iter_idx + 1 < len(indices) else None\n        filename = f\"testdata/mmqa-image-embeddings/{start_idx}_{end_idx}.npy\"\n        if end_idx is not None and not os.path.exists(filename):\n            # generate embeddings\n            batch_fps = image_filepaths[start_idx:end_idx]\n            batch_images = [Image.open(fp) for fp in batch_fps]\n            embeddings = model.encode(batch_images)\n            \n            # save embeddings to disk\n            with open(filename, \"wb\") as f:\n                np.save(f, embeddings)\n\n            gen_indices.append((start_idx, end_idx))\n    print(\"Done generating embeddings.\")\n\n    # initialize chroma client\n    chroma_client = chromadb.PersistentClient(\".chroma-mmqa\")\n\n    # initialize embedding function\n    sentence_transformer_ef = embedding_functions.SentenceTransformerEmbeddingFunction(\n        model_name=\"clip-ViT-B-32\"\n    )\n\n    # create a collection\n    collection = chroma_client.get_or_create_collection(\n        name=\"mmqa-images\",\n        embedding_function=sentence_transformer_ef,\n        metadata={\"hnsw:space\": \"cosine\"},\n    )\n\n    # insert documents in batches\n    total_inserts = len(gen_indices)\n    print(f\"Inserting {total_inserts} batches into the collection...\")\n    for start_idx, end_idx in tqdm(gen_indices, total=total_inserts):\n        embeddings = np.load(f\"testdata/mmqa-image-embeddings/{start_idx}_{end_idx}.npy\")\n        collection.add(\n            documents=[os.path.basename(fp) for fp in image_filepaths[start_idx:end_idx]],\n            embeddings=embeddings.tolist(),\n            ids=image_ids[start_idx:end_idx],\n        )\n"
  },
  {
    "path": "abacus-research/helper-scripts/mmqa-gen-image-title-index.py",
    "content": "import json\nimport os\nimport time\n\nimport chromadb\nimport chromadb.utils.embedding_functions as embedding_functions\nimport numpy as np\nfrom openai import OpenAI\nfrom tqdm import tqdm\n\n# NOTE: this script is meant to be run from the root of the repository\nif __name__ == \"__main__\":\n    # initialize openai client\n    openai_client = OpenAI()\n\n    # load image metadata\n    image_title_set = set()\n    image_titles, image_ids = [], []\n    with open(\"data/MMQA_images.jsonl\") as f:\n        for line in f:\n            dict_line = json.loads(line)\n            image_title = dict_line[\"title\"]\n            if image_title == \"\":\n                image_title = dict_line[\"url\"]\n\n            if image_title not in image_title_set:\n                image_titles.append(image_title)\n                image_title_set.add(image_title)\n            else:\n                idx = 1\n                while f\"{image_title} ({idx})\" in image_title_set:\n                    idx += 1\n                image_title = f\"{image_title} ({idx})\"\n                image_titles.append(image_title)\n                image_title_set.add(image_title)\n\n            image_ids.append(dict_line[\"id\"])\n\n    # create directory for embeddings\n    os.makedirs(\"testdata/mmqa-image-title-embeddings/\", exist_ok=True)\n\n    # generate embeddings in batches of 1000 at a time\n    indices = np.linspace(0, len(image_titles), len(image_titles)//1000, dtype=int)\n    total_embeds = len(indices)\n    print(f\"Generating {total_embeds} batches of embeddings...\")\n    gen_indices = []\n    for iter_idx, start_idx in tqdm(enumerate(indices), total=total_embeds):\n        # check if embedding needs to be computed\n        end_idx = indices[iter_idx + 1] if iter_idx + 1 < len(indices) else None\n        filename = f\"testdata/mmqa-image-title-embeddings/{start_idx}_{end_idx}.npy\"\n        if end_idx is not None and not os.path.exists(filename):\n            # generate 
embeddings\n            batch = image_titles[start_idx:end_idx]\n            resp = openai_client.embeddings.create(input=batch, model=\"text-embedding-3-small\")\n            embeddings = [item.embedding for item in resp.data]\n\n            # save embeddings to disk\n            with open(filename, \"wb\") as f:\n                np.save(f, np.array(embeddings))\n\n            gen_indices.append((start_idx, end_idx))\n            time.sleep(1)\n    print(\"Done generating embeddings.\")\n\n    # initialize chroma client\n    chroma_client = chromadb.PersistentClient(\".chroma-mmqa\")\n\n    # initialize embedding function\n    openai_ef = embedding_functions.OpenAIEmbeddingFunction(\n        api_key=os.environ[\"OPENAI_API_KEY\"],\n        model_name=\"text-embedding-3-small\"\n    )\n\n    # create a collection\n    collection = chroma_client.get_or_create_collection(\n        name=\"mmqa-image-titles\",\n        embedding_function=openai_ef,\n        metadata={\"hnsw:space\": \"cosine\"},\n    )\n\n    # insert documents in batches\n    total_inserts = len(gen_indices)\n    print(f\"Inserting {total_inserts} batches into the collection...\")\n    for start_idx, end_idx in tqdm(gen_indices, total=total_inserts):\n        embeddings = np.load(f\"testdata/mmqa-image-title-embeddings/{start_idx}_{end_idx}.npy\")\n        collection.add(\n            documents=image_titles[start_idx:end_idx],\n            embeddings=embeddings.tolist(),\n            ids=image_ids[start_idx:end_idx],\n        )\n"
  },
  {
    "path": "abacus-research/helper-scripts/mmqa-gen-table-index.py",
    "content": "import json\nimport os\nimport time\n\nimport chromadb\nimport chromadb.utils.embedding_functions as embedding_functions\nimport numpy as np\nfrom openai import OpenAI\nfrom tqdm import tqdm\n\n# NOTE: this script is meant to be run from the root of the repository\nif __name__ == \"__main__\":\n    # initialize openai client\n    openai_client = OpenAI()\n\n    # load table texts\n    table_texts, table_ids = [], []\n    with open(\"data/MMQA_tables.jsonl\") as f:\n        for line in f:\n            dict_line = json.loads(line)\n            \n            # get page title and table name\n            page_title = dict_line[\"title\"]\n            table_name = dict_line[\"table\"][\"table_name\"]\n\n            # get table column names and empty column indices\n            table_header = dict_line[\"table\"][\"header\"]\n            column_names = [col[\"column_name\"] for col in table_header if col[\"column_name\"] != \"\"]\n            empty_col_indices = set([idx for idx, col in enumerate(table_header) if col[\"column_name\"] == \"\"])\n\n            # create string for table data\n            text = f\"{page_title}: {table_name}\\n\\n{','.join(column_names)}\\n\"\n\n            # parse table row data\n            table_rows = dict_line[\"table\"][\"table_rows\"]\n            for row in table_rows:\n                row_data = []\n                for idx, cell in enumerate(row):\n                    if idx in empty_col_indices:\n                        continue\n                    row_data.append(cell[\"text\"])\n\n                text += \",\".join(row_data) + \"\\n\"\n\n            table_texts.append(text)\n            table_ids.append(dict_line[\"id\"])\n\n    # create directory for embeddings\n    os.makedirs(\"testdata/mmqa-table-embeddings/\", exist_ok=True)\n\n    # generate embeddings in batches of 1000 at a time\n    indices = np.linspace(0, len(table_texts), len(table_texts)//1000, dtype=int)\n    total_embeds = len(indices)\n    
print(f\"Generating {total_embeds} batches of embeddings...\")\n    gen_indices = []\n    for iter_idx, start_idx in tqdm(enumerate(indices), total=total_embeds):\n        # check if embedding needs to be computed\n        end_idx = indices[iter_idx + 1] if iter_idx + 1 < len(indices) else None\n        filename = f\"testdata/mmqa-table-embeddings/{start_idx}_{end_idx}.npy\"\n        if end_idx is not None and not os.path.exists(filename):\n            # generate embeddings\n            batch = table_texts[start_idx:end_idx]\n            resp = openai_client.embeddings.create(input=batch, model=\"text-embedding-3-small\")\n            embeddings = [item.embedding for item in resp.data]\n\n            # save embeddings to disk\n            with open(filename, \"wb\") as f:\n                np.save(f, np.array(embeddings))\n\n            gen_indices.append((start_idx, end_idx))\n            time.sleep(1)\n    print(\"Done generating embeddings.\")\n\n    # initialize chroma client\n    chroma_client = chromadb.PersistentClient(\".chroma-mmqa\")\n\n    # initialize embedding function\n    openai_ef = embedding_functions.OpenAIEmbeddingFunction(\n        api_key=os.environ[\"OPENAI_API_KEY\"],\n        model_name=\"text-embedding-3-small\"\n    )\n\n    # create a collection\n    collection = chroma_client.get_or_create_collection(\n        name=\"mmqa-tables\",\n        embedding_function=openai_ef,\n        metadata={\"hnsw:space\": \"cosine\"},\n    )\n\n    # insert documents in batches\n    total_inserts = len(gen_indices)\n    print(f\"Inserting {total_inserts} batches into the collection...\")\n    for start_idx, end_idx in tqdm(gen_indices, total=total_inserts):\n        embeddings = np.load(f\"testdata/mmqa-table-embeddings/{start_idx}_{end_idx}.npy\")\n        collection.add(\n            documents=table_texts[start_idx:end_idx],\n            embeddings=embeddings.tolist(),\n            ids=table_ids[start_idx:end_idx],\n        )\n"
  },
  {
    "path": "abacus-research/helper-scripts/mmqa-gen-text-index.py",
    "content": "import json\nimport os\nimport time\n\nimport chromadb\nimport chromadb.utils.embedding_functions as embedding_functions\nimport numpy as np\nfrom openai import OpenAI\nfrom tqdm import tqdm\n\n# NOTE: this script is meant to be run from the root of the repository\nif __name__ == \"__main__\":\n    # initialize openai client\n    openai_client = OpenAI()\n\n    # load texts\n    texts, text_ids = [], []\n    with open(\"data/MMQA_texts.jsonl\") as f:\n        for line in f:\n            dict_line = json.loads(line)\n            title = dict_line[\"title\"]\n            text = dict_line[\"text\"]\n            texts.append(f\"{title}: {text}\")\n            text_ids.append(dict_line[\"id\"])\n\n    # create directory for embeddings\n    os.makedirs(\"testdata/mmqa-text-embeddings/\", exist_ok=True)\n\n    # generate embeddings in batches of 1000 at a time\n    indices = np.linspace(0, len(texts), len(texts)//1000, dtype=int)\n    total_embeds = len(indices)\n    print(f\"Generating {total_embeds} batches of embeddings...\")\n    gen_indices = []\n    for iter_idx, start_idx in tqdm(enumerate(indices), total=total_embeds):\n        # check if embedding needs to be computed\n        end_idx = indices[iter_idx + 1] if iter_idx + 1 < len(indices) else None\n        filename = f\"testdata/mmqa-text-embeddings/{start_idx}_{end_idx}.npy\"\n        if end_idx is not None and not os.path.exists(filename):\n            # generate embeddings\n            batch = texts[start_idx:end_idx]\n            resp = openai_client.embeddings.create(input=batch, model=\"text-embedding-3-small\")\n            embeddings = [item.embedding for item in resp.data]\n\n            # save embeddings to disk\n            with open(filename, \"wb\") as f:\n                np.save(f, np.array(embeddings))\n\n            gen_indices.append((start_idx, end_idx))\n            time.sleep(1)\n    print(\"Done generating embeddings.\")\n\n    # initialize chroma client\n    chroma_client = 
chromadb.PersistentClient(\".chroma-mmqa\")\n\n    # initialize embedding function\n    openai_ef = embedding_functions.OpenAIEmbeddingFunction(\n        api_key=os.environ[\"OPENAI_API_KEY\"],\n        model_name=\"text-embedding-3-small\"\n    )\n\n    # create a collection\n    collection = chroma_client.get_or_create_collection(\n        name=\"mmqa-texts\",\n        embedding_function=openai_ef,\n        metadata={\"hnsw:space\": \"cosine\"},\n    )\n\n    # insert documents in batches\n    total_inserts = len(gen_indices)\n    print(f\"Inserting {total_inserts} batches into the collection...\")\n    for start_idx, end_idx in tqdm(gen_indices, total=total_inserts):\n        embeddings = np.load(f\"testdata/mmqa-text-embeddings/{start_idx}_{end_idx}.npy\")\n        collection.add(\n            documents=texts[start_idx:end_idx],\n            embeddings=embeddings.tolist(),\n            ids=text_ids[start_idx:end_idx],\n        )\n"
  },
  {
    "path": "abacus-research/mmqa-complex-demo.py",
    "content": "import argparse\nimport base64\nimport json\nimport os\nimport string\nimport time\n\nimport numpy as np\nimport regex as re\n\nimport palimpzest as pz\nfrom palimpzest.constants import Model\nfrom palimpzest.core.lib.schemas import ImageBase64\n\nCORRUPTED_IMAGE_IDS = [\n    \"17ae0616ac745e70781203267f3a382d\",\n    \"bf201cbbd058ef51aef89b1be4158c2a\",\n    \"ef457a7b3ab437cd78ab9f82dc083048\",\n    \"225c3db49d60b5ef30ed0bfc649ebf78\",\n    \"b413cc1dc4969dcbe4cb6a55c0f2e359\",\n    \"e81b2acfd792b171389c8f47a0e14504\",\n]\n\nmmqa_entry_cols = [\n    {\"name\": \"qid\", \"type\": str, \"desc\": \"The id of the MMQA question\"},\n    {\"name\": \"question\", \"type\": str, \"desc\": \"The question which needs to be answered\"},\n]\nmmqa_text_search_cols = [\n    {\"name\": \"text_search_string\", \"type\": str, \"desc\": \"A string used to search for relevant text snippets.\"},\n]\nmmqa_table_search_cols = [\n    {\"name\": \"table_search_string\", \"type\": str, \"desc\": \"A string used to search for relevant tables.\"},\n]\nmmqa_image_search_cols = [\n    {\"name\": \"image_search_string\", \"type\": str, \"desc\": \"A string used to search for relevant images.\"},\n]\n\nmmqa_text_cols = [\n    {\"name\": \"text_id\", \"type\": str, \"desc\": \"The id for the given text snippet.\"},\n    {\"name\": \"text\", \"type\": str, \"desc\": \"A text snippet which may or may not support the question.\"},\n]\n\nmmqa_table_cols = [\n    {\"name\": \"table_id\", \"type\": str, \"desc\": \"The id for the given table.\"},\n    {\"name\": \"table\", \"type\": str, \"desc\": \"A table which may or may not support the question.\"},\n]\n\nmmqa_image_cols = [\n    {\"name\": \"image_id\", \"type\": str, \"desc\": \"The id for the given image.\"},\n    {\"name\": \"image\", \"type\": ImageBase64, \"desc\": \"An image which may or may not support the question.\"},\n]\n\nmmqa_answer_cols = [\n    {\"name\": \"answers\", \"type\": list[str], \"desc\": \"The 
answer(s) to the question. Answer the question using the relevant information from gathered image(s), text(s), and table(s). Return your answer as a JSON list of strings. Do not include any additional context or an explanation in your answer, simply list the entities asked for by the question\"},\n]\n\ndef get_json_from_answer(answer: str):\n    \"\"\"\n    This function parses an LLM response which is supposed to output a JSON object\n    and optimistically searches for the substring containing the JSON object.\n    \"\"\"\n    # split off context / excess, which models sometimes output after answer\n    answer = answer.split(\"Context:\")[0]\n    answer = answer.split(\"# this is the answer\")[0]\n    # trim the answer to only include the JSON array\n    if not answer.strip().startswith(\"[\"):\n        # Find the start index of the actual JSON string assuming the prefix is followed by the JSON array\n        start_index = answer.find(\"[\")\n        if start_index != -1:\n            # Remove the prefix and any leading characters before the JSON starts\n            answer = answer[start_index:]\n    if not answer.strip().endswith(\"]\"):\n        # Find the end index of the actual JSON string\n        # assuming the suffix is preceded by the JSON object/array\n        end_index = answer.rfind(\"]\")\n        if end_index != -1:\n            # Remove the suffix and any trailing characters after the JSON ends\n            answer = answer[: end_index + 1]\n    # Handle weird escaped values. I am not sure why the model\n    # is returning these, but the JSON parser can't take them\n    answer = answer.replace(r\"\\_\", \"_\")\n    answer = answer.replace(\"\\\\n\", \"\\n\")\n    # Remove https and http prefixes to not conflict with comment detection\n    # Handle comments in the JSON response. 
Use regex from // until end of line\n    answer = re.sub(r\"(?<!https?:)\\/\\/.*?$\", \"\", answer, flags=re.MULTILINE)\n    answer = re.sub(r\",\\n.*\\.\\.\\.$\", \"\", answer, flags=re.MULTILINE)\n    # Sanitize newlines in the JSON response\n    answer = answer.replace(\"\\n\", \" \")\n    # finally, parse and return the JSON object; errors are handled by the caller\n    return json.loads(answer)\n\n\nclass MMQAValidator(pz.Validator):\n    def __init__(self, dataset: list[dict]):\n        super().__init__()\n        self.dataset = dataset\n\n        # compute qid to label mapping\n        self.qid_to_labels = self._compute_qid_to_labels()\n\n    def _compute_qid_to_labels(self) -> dict:\n        \"\"\"Compute the label for a MMQA question given its entry in the dataset.\"\"\"\n        qid_to_labels = {}\n        for entry in self.dataset:\n            # get the answers\n            answers = [answer[\"answer\"] for answer in entry[\"answers\"]]\n            supporting_text_ids = [context[\"doc_id\"] for context in entry[\"supporting_context\"] if context[\"doc_part\"] == \"text\"]\n            supporting_table_ids = [context[\"doc_id\"] for context in entry[\"supporting_context\"] if context[\"doc_part\"] == \"table\"]\n            supporting_image_ids = [context[\"doc_id\"] for context in entry[\"supporting_context\"] if context[\"doc_part\"] == \"image\"]\n\n            label_dict = {\n                \"answers\": answers,\n                \"supporting_text_ids\": supporting_text_ids,\n                \"supporting_table_ids\": supporting_table_ids,\n                \"supporting_image_ids\": supporting_image_ids,\n            }\n            qid_to_labels[entry[\"qid\"]] = label_dict\n\n        return qid_to_labels\n\n    def recall(self, preds: list | None, targets: list):\n        if preds is None or len(targets) == 0:\n            return 0.0\n\n        tp, fn = 0, 0\n        try:\n            # compute recall of retrieved ids and return\n            preds = 
[str(pred).lower() for pred in preds]\n            targets = [str(target).lower() for target in targets]\n            remove_tokens = [c for c in string.punctuation if c != \"/\"]\n            for token in remove_tokens:\n                preds = [pred.replace(token, \"\") for pred in preds]\n                targets = [target.replace(token, \"\") for target in targets]\n\n\n            for target in targets:\n                if target in preds:\n                    tp += 1\n                else:\n                    fn += 1\n\n            return tp / (tp + fn)\n\n        except Exception:\n            os.makedirs(\"mmqa-recall-eval-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"mmqa-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def f1(self, preds: list | None, targets: list):\n        if preds is None or len(targets) == 0:\n            return 0.0\n\n        tp, fp, fn = 0, 0, 0\n        try:\n            # compute recall of retrieved ids and return\n            preds = [str(pred).lower() for pred in preds]\n            targets = [str(target).lower() for target in targets]\n\n            remove_tokens = [c for c in string.punctuation if c != \"/\"]\n            for token in remove_tokens:\n                preds = [pred.replace(token, \"\") for pred in preds]\n                targets = [target.replace(token, \"\") for target in targets]\n\n            for pred in preds:\n                if pred in targets:\n                    tp += 1\n                else:\n                    fp += 1\n            for target in targets:\n                if target not in preds:\n                    fn += 1\n\n            # compute overall f1 score and return\n            recall = tp / (tp + fn) if (tp + fn) > 0 else 0\n            precision = tp / (tp + fp) if (tp + fp) > 0 else 0\n            f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 
0\n\n            return f1\n\n        except Exception:\n            os.makedirs(\"mmqa-recall-eval-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"mmqa-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        if \"answers\" not in fields:\n            return None\n        preds = output.get(\"answers\")\n        targets = self.qid_to_labels[str(input_record[\"qid\"])][\"answers\"]\n        return self.f1(preds, targets)\n\n    def join_score_fn(self, condition: str, left_input_record: dict, right_input_record: dict, output: bool) -> float | None:\n        if condition == \"The text snippet is relevant to the question based on the text search string.\":\n            pred = right_input_record[\"text_id\"]\n            targets = self.qid_to_labels[left_input_record[\"qid\"]][\"supporting_text_ids\"]\n            return pred in targets and output or pred not in targets and not output\n        elif condition == \"The table is relevant to the question based on the table search string.\":\n            pred = right_input_record[\"table_id\"]\n            targets = self.qid_to_labels[left_input_record[\"qid\"]][\"supporting_table_ids\"]\n            return pred in targets and output or pred not in targets and not output\n        elif condition == \"The image is relevant to the question based on the image search string.\":\n            pred = right_input_record[\"image_id\"]\n            targets = self.qid_to_labels[left_input_record[\"qid\"]][\"supporting_image_ids\"]\n            return pred in targets and output or pred not in targets and not output\n        else:\n            raise NotImplementedError(f\"Validator.join_score_fn not implemented for condition {condition}.\")\n\n\nclass MMQAQuestionDataset(pz.IterDataset):\n    def __init__(self, dataset: list[dict]):\n        
super().__init__(id=\"mmqa-questions\", schema=mmqa_entry_cols)\n        self.dataset = [{\"qid\": entry[\"qid\"], \"question\": entry[\"question\"]} for entry in dataset]\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        return self.dataset[idx]\n\n\nclass MMQATextDataset(pz.IterDataset):\n    def __init__(self, dataset: list[dict]):\n        super().__init__(id=\"mmqa-texts\", schema=mmqa_text_cols)\n\n        # construct mapping from text id to text\n        text_id_to_text = {}\n        with open(\"data/MMQA_texts.jsonl\") as f:\n            for line in f:\n                dict_line = json.loads(line)\n                text_id_to_text[dict_line[\"id\"]] = f\"{dict_line['title']}: {dict_line['text']}\"\n\n        # construct dataset\n        self.dataset = []\n        for entry in dataset:\n            for context in entry[\"supporting_context\"]:\n                if context[\"doc_part\"] == \"text\":\n                    text_id = context[\"doc_id\"]\n                    text = text_id_to_text[text_id]\n                    self.dataset.append({\"text_id\": text_id, \"text\": text})\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        return self.dataset[idx]\n\n\nclass MMQATableDataset(pz.IterDataset):\n    def __init__(self, dataset: list[dict]):\n        super().__init__(id=\"mmqa-tables\", schema=mmqa_table_cols)\n\n        # construct mapping from table id to table string\n        table_id_to_table = {}\n        with open(\"data/MMQA_tables.jsonl\") as f:\n            for line in f:\n                dict_line = json.loads(line)\n\n                # get page title and table name\n                page_title = dict_line[\"title\"]\n                table_name = dict_line[\"table\"][\"table_name\"]\n\n                # get table column names and empty column indices\n                table_header = dict_line[\"table\"][\"header\"]\n                
column_names = [col[\"column_name\"] for col in table_header if col[\"column_name\"] != \"\"]\n                empty_col_indices = set([idx for idx, col in enumerate(table_header) if col[\"column_name\"] == \"\"])\n\n                # create string for table data\n                text = f\"{page_title}: {table_name}\\n\\n{','.join(column_names)}\\n\"\n\n                # parse table row data\n                table_rows = dict_line[\"table\"][\"table_rows\"]\n                for row in table_rows:\n                    row_data = []\n                    for idx, cell in enumerate(row):\n                        if idx in empty_col_indices:\n                            continue\n                        row_data.append(cell[\"text\"])\n\n                    text += \",\".join(row_data) + \"\\n\"\n\n                table_id_to_table[dict_line[\"id\"]] = text\n\n        # construct dataset\n        self.dataset = []\n        for entry in dataset:\n            for context in entry[\"supporting_context\"]:\n                if context[\"doc_part\"] == \"table\":\n                    table_id = context[\"doc_id\"]\n                    table = table_id_to_table[table_id]\n                    self.dataset.append({\"table_id\": table_id, \"table\": table})\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        return self.dataset[idx]\n\n\nclass MMQAImageDataset(pz.IterDataset):\n    def __init__(self, dataset: list[dict]):\n        super().__init__(id=\"mmqa-images\", schema=mmqa_image_cols)\n\n        # construct mapping from image id to image base64 object\n        image_id_to_image = {}\n        with open(\"data/MMQA_images.jsonl\") as f:\n            possible_endings = {'.JPG', '.png', '.jpeg', '.jpg', '.tif', '.JPEG', '.tiff', '.PNG', '.Jpg', '.gif'}\n            for line in f:\n                dict_line = json.loads(line)\n                image_id = dict_line[\"id\"]\n\n                # skip corrupted images:\n        
        if image_id in CORRUPTED_IMAGE_IDS:\n                    continue\n\n                # find the correct image file\n                image_filepath = None\n                for ending in possible_endings:\n                    filepath = f\"data/final_dataset_images/{image_id}{ending}\"\n                    if os.path.exists(filepath):\n                        image_filepath = filepath\n                        break\n\n                # read the image file and convert to base64\n                with open(image_filepath, \"rb\") as f:\n                    contents = base64.b64encode(f.read()).decode(\"utf-8\")\n                    image_id_to_image[image_id] = contents\n\n        # construct dataset\n        self.dataset = []\n        for entry in dataset:\n            for context in entry[\"supporting_context\"]:\n                if context[\"doc_part\"] == \"image\":\n                    image_id = context[\"doc_id\"]\n                    image = image_id_to_image[image_id]\n                    self.dataset.append({\"image_id\": image_id, \"image\": image})\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        return self.dataset[idx]\n\n\ndef get_dataset(split: str, shuffle: bool, seed: int, num_samples: int | None) -> list[str]:\n    dataset = []\n    with open(f\"data/MMQA_{split}.jsonl\") as f:\n        for line in f:\n            dict_line = json.loads(line)\n            if \"image\" in dict_line[\"metadata\"][\"modalities\"] and len(dict_line[\"metadata\"][\"modalities\"]) > 1:\n                dataset.append(dict_line)\n\n    # shuffle the questions for the given seed\n    if shuffle:\n        rng = np.random.default_rng(seed=seed)\n        rng.shuffle(dataset)\n    \n    return dataset if num_samples is None else dataset[:num_samples]\n\n\ndef compute_f1(final_df, answers_df):\n    merged_df = final_df.merge(answers_df, on=\"qid\", how=\"left\")\n    tp, fp, fn = 0, 0, 0\n    for _, row in 
merged_df.iterrows():\n        targets = [str(target).lower() for target in row[\"gt_answers\"]]\n        preds = row[\"answers\"]\n        if isinstance(preds, str):\n            try:\n                # convert single quotes to double quotes before parsing for JSON\n                preds = preds.replace(\"'\", '\"')\n                # try parsing preds as JSON list and cast everything to str to match targets\n                preds = get_json_from_answer(preds)\n                preds = [str(pred).lower() for pred in preds]\n            except Exception:\n                # if that fails, give it a shot as a singleton answer that the LLM failed to wrap in a list\n                preds = [preds.lower()]\n            remove_tokens = [c for c in string.punctuation if c != \"/\"]\n            for token in remove_tokens:\n                preds = [pred.replace(token, \"\") for pred in preds]\n                targets = [target.replace(token, \"\") for target in targets]\n        elif isinstance(preds, list):\n            preds = [str(pred).lower() for pred in preds]\n            remove_tokens = [c for c in string.punctuation if c != \"/\"]\n            for token in remove_tokens:\n                preds = [pred.replace(token, \"\") for pred in preds]\n                targets = [target.replace(token, \"\") for target in targets]\n        else:\n            preds = []\n        for pred in preds:\n            if pred in targets:\n                tp += 1\n            else:\n                fp += 1\n        for target in targets:\n            if target not in preds:\n                fn += 1\n    # compute overall f1 score and return\n    recall = tp / (tp + fn) if (tp + fn) > 0 else 0\n    precision = tp / (tp + fp) if (tp + fp) > 0 else 0\n    f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0\n    return f1\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    
parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\"--progress\", default=False, action=\"store_true\", help=\"Print progress output\")\n    parser.add_argument(\"--gpt4-mini-only\", default=False, action=\"store_true\", help=\"Use only GPT-4o-mini\")\n    parser.add_argument(\n        \"--execution-strategy\",\n        default=\"parallel\",\n        type=str,\n        help=\"The plan executor to use. One of sequential, pipelined, parallel\",\n    )\n    parser.add_argument(\n        \"--sentinel-execution-strategy\",\n        default=\"mab\",\n        type=str,\n        help=\"The sentinel execution strategy to use. One of mab or random\",\n    )\n    parser.add_argument(\n        \"--policy\",\n        default=\"maxquality\",\n        type=str,\n        help=\"One of 'mincost', 'mintime', 'maxquality'\",\n    )\n    parser.add_argument(\n        \"--val-examples\",\n        default=20,\n        type=int,\n        help=\"Number of validation examples to sample from\",\n    )\n    parser.add_argument(\n        \"--model\",\n        default=\"gpt-4o\",\n        type=str,\n        help=\"One of 'gpt-4o', 'gpt-4o-mini', 'llama'\",\n    )\n    parser.add_argument(\n        \"--seed\",\n        default=42,\n        type=int,\n        help=\"Seed used to initialize RNG for MAB sampling algorithm\",\n    )\n    parser.add_argument(\n        \"--k\",\n        default=6,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--j\",\n        default=4,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--sample-budget\",\n        default=100,\n        type=int,\n        help=\"Total sample budget in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        
\"--quality\",\n        default=None,\n        type=float,\n        help=\"Quality threshold\",\n    )\n    parser.add_argument(\n        \"--exp-name\",\n        default=None,\n        type=str,\n        help=\"The experiment name.\",\n    )\n\n    args = parser.parse_args()\n\n    # create directory for profiling data\n    os.makedirs(\"mmqa-complex-data\", exist_ok=True)\n\n    verbose = args.verbose\n    progress = args.progress\n    seed = args.seed\n    val_examples = args.val_examples\n    k = args.k\n    j = args.j\n    sample_budget = args.sample_budget\n    execution_strategy = args.execution_strategy\n    sentinel_execution_strategy = args.sentinel_execution_strategy\n    exp_name = (\n        f\"mmqa-complex-final-{sentinel_execution_strategy}-k{k}-j{j}-budget{sample_budget}-seed{seed}\"\n        if args.exp_name is None\n        else args.exp_name\n    )\n\n    policy = pz.MaxQuality()\n    if args.policy == \"mincost\":\n        policy = pz.MinCost()\n    elif args.policy == \"minlatency\":\n        policy = pz.MinTime()\n    elif args.quality is not None and args.policy == \"mincostatfixedquality\":\n        policy = pz.MinCostAtFixedQuality(min_quality=args.quality)\n    elif args.quality is not None and args.policy == \"minlatencyatfixedquality\":\n        policy = pz.MinTimeAtFixedQuality(min_quality=args.quality)\n    print(f\"USING POLICY: {policy}\")\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # create the train and test dataset\n    train_dataset = get_dataset(split=\"train\", shuffle=True, seed=seed, num_samples=val_examples)\n    test_dataset = get_dataset(split=\"dev\", shuffle=True, seed=seed, num_samples=100)\n\n    # create validator for MMQA\n    validator = MMQAValidator(train_dataset)\n\n    # create train datasets for questions, texts, 
tables, and images\n    train_question_dataset = MMQAQuestionDataset(train_dataset)\n    train_text_dataset = MMQATextDataset(train_dataset)\n    train_table_dataset = MMQATableDataset(train_dataset)\n    train_image_dataset = MMQAImageDataset(train_dataset)\n    train_dataset = {\n        train_question_dataset.id: train_question_dataset,\n        train_text_dataset.id: train_text_dataset,\n        train_table_dataset.id: train_table_dataset,\n        train_image_dataset.id: train_image_dataset,\n    }\n\n    # construct plan\n    test_question_dataset = MMQAQuestionDataset(test_dataset)\n    print(f\"Test Question Dataset: {len(test_question_dataset)}\")\n    test_text_dataset = MMQATextDataset(test_dataset)\n    print(f\"Text Dataset: {len(test_text_dataset)}\")\n    test_table_dataset = MMQATableDataset(test_dataset)\n    print(f\"Table Dataset: {len(test_table_dataset)}\")\n    test_image_dataset = MMQAImageDataset(test_dataset)\n    print(f\"Image Dataset: {len(test_image_dataset)}\")\n\n    text_plan = test_question_dataset.sem_map(mmqa_text_search_cols, depends_on=[\"question\"])\n    text_plan = text_plan.sem_join(\n        test_text_dataset,\n        condition=\"The text snippet is relevant to the question based on the text search string.\",\n        depends_on=[\"text_search_string\", \"text\"],\n        how=\"left\",\n    )\n    text_plan = text_plan.groupby(pz.GroupBySig([\"qid\", \"question\", \"text_search_string\"], agg_funcs=[\"list\", \"list\"], agg_fields=[\"text_id\", \"text\"]))\n    text_plan = text_plan.map(\n        udf=lambda record: {\"text\": \"...\".join(record[\"list(text)\"]) if record[\"list(text)\"] != [None] else [None], **record},\n        cols=[{\"name\": \"text\", \"type\": str, \"desc\": \"All relevant text snippets concatenated together.\"}],\n    )\n    text_plan = text_plan.project([\"qid\", \"question\", \"text\"])\n\n    table_plan = test_question_dataset.sem_map(mmqa_table_search_cols, depends_on=[\"question\"])\n    
table_plan = table_plan.sem_join(\n        test_table_dataset,\n        condition=\"The table is relevant to the question based on the table search string.\",\n        depends_on=[\"table_search_string\", \"table\"],\n        how=\"left\",\n    )\n    table_plan = table_plan.groupby(pz.GroupBySig([\"qid\", \"question\", \"table_search_string\"], agg_funcs=[\"list\", \"list\"], agg_fields=[\"table_id\", \"table\"]))\n    table_plan = table_plan.map(\n        udf=lambda record: {\"table\": \"\\n\\n\".join(record[\"list(table)\"]) if record[\"list(table)\"] != [None] else [None], **record},\n        cols=[{\"name\": \"table\", \"type\": str, \"desc\": \"All relevant tables concatenated together.\"}],\n    )\n    table_plan = table_plan.project([\"qid\", \"question\", \"table\"])\n\n    image_plan = test_question_dataset.sem_map(mmqa_image_search_cols, depends_on=[\"question\"])\n    image_plan = image_plan.sem_join(\n        test_image_dataset,\n        condition=\"The image is relevant to the question based on the image search string.\",\n        depends_on=[\"image_search_string\", \"image\"],\n        how=\"left\",\n    )\n    image_plan = image_plan.groupby(pz.GroupBySig([\"qid\", \"question\", \"image_search_string\"], agg_funcs=[\"list\", \"list\"], agg_fields=[\"image_id\", \"image\"]))\n    image_plan = image_plan.map(\n        udf=lambda record: {\"image\": record[\"list(image)\"] if record[\"list(image)\"] != [None] else [None], **record},\n        cols=[{\"name\": \"image\", \"type\": list[ImageBase64], \"desc\": \"All relevant images.\"}],\n    )\n    image_plan = image_plan.project([\"qid\", \"question\", \"image\"])\n    plan = text_plan.join(table_plan, on=[\"qid\", \"question\"]).join(image_plan, on=[\"qid\", \"question\"])\n    plan = plan.sem_map(mmqa_answer_cols, depends_on=[\"question\", \"text\", \"table\", \"image\"])\n\n    # execute pz plan\n    config = pz.QueryProcessorConfig(\n        policy=policy,\n        optimizer_strategy=\"pareto\",\n  
      sentinel_execution_strategy=sentinel_execution_strategy,\n        execution_strategy=execution_strategy,\n        use_final_op_quality=True,\n        max_workers=64,\n        verbose=verbose,\n        available_models=[\n            Model.GPT_4o_MINI,\n        ],\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=progress,\n        k=k,\n        j=j,\n        sample_budget=sample_budget,\n        seed=seed,\n        exp_name=exp_name,\n    )\n\n    data_record_collection = plan.optimize_and_run(config=config, train_dataset=train_dataset, validator=validator)\n\n    print(data_record_collection.to_df())\n    data_record_collection.to_df().to_csv(f\"mmqa-complex-data/{exp_name}-output.csv\", index=False)\n\n    # create filepaths for records and stats\n    records_path = f\"mmqa-complex-data/{exp_name}-records.json\"\n    stats_path = f\"mmqa-complex-data/{exp_name}-profiling.json\"\n\n    # save record outputs\n    record_jsons = []\n    for record in data_record_collection:\n        record_dict = record.to_dict()\n        record_dict = {\n            k: v\n            for k, v in record_dict.items()\n            if k in [\"qid\", \"question\", \"supporting_text_ids\", \"supporting_table_ids\", \"supporting_image_ids\", \"answers\"]\n        }\n        record_jsons.append(record_dict)\n\n    with open(records_path, \"w\") as f:\n        json.dump(record_jsons, f)\n\n    # read the appropriate dataset\n    dataset = []\n    with open(\"data/MMQA_dev.jsonl\") as f:\n        for line in f:\n            dict_line = json.loads(line)\n            if \"image\" in dict_line[\"metadata\"][\"modalities\"] and len(dict_line[\"metadata\"][\"modalities\"]) > 1:\n                dataset.append(dict_line)\n\n    # shuffle the questions for the given seed\n    rng = np.random.default_rng(seed=seed)\n    rng.shuffle(dataset)\n\n    # trim to 100 samples\n    dataset = dataset[:100]\n  
  answer_dataset = []\n    for item in dataset:\n        answers = [str(elt[\"answer\"]) for elt in item[\"answers\"]]\n        answer_dataset.append({\n            \"qid\": item[\"qid\"],\n            \"gt_answers\": answers\n        })\n\n    # construct dataframe of ground-truth answers\n    import pandas as pd\n    answers_df = pd.DataFrame(answer_dataset)\n\n    # get final plan str\n    final_plan_id = list(data_record_collection.execution_stats.plan_stats.keys())[0]\n    final_plan_str = data_record_collection.execution_stats.plan_strs[final_plan_id]\n\n    # write stats to disk\n    stats_dict = {\n        \"f1\": compute_f1(data_record_collection.to_df(), answers_df),\n        \"optimization_time\": data_record_collection.execution_stats.optimization_time,\n        \"optimization_cost\": data_record_collection.execution_stats.optimization_cost,\n        \"plan_execution_time\": data_record_collection.execution_stats.plan_execution_time,\n        \"plan_execution_cost\": data_record_collection.execution_stats.plan_execution_cost,\n        \"total_execution_time\": data_record_collection.execution_stats.total_execution_time,\n        \"total_execution_cost\": data_record_collection.execution_stats.total_execution_cost,\n        \"plan_str\": final_plan_str,\n    }\n    print(f\"F1 IS: {stats_dict['f1']}\")\n\n    with open(f\"mmqa-complex-data/{exp_name}-stats.json\", \"w\") as f:\n        json.dump(stats_dict, f)\n"
  },
  {
    "path": "abacus-research/mmqa-demo.py",
    "content": "import argparse\nimport base64\nimport json\nimport os\nimport string\nimport time\n\nimport chromadb\nimport numpy as np\nimport regex as re\nfrom chromadb.utils.embedding_functions import (\n    SentenceTransformerEmbeddingFunction,\n)\nfrom chromadb.utils.embedding_functions.openai_embedding_function import (\n    OpenAIEmbeddingFunction,\n)\n\nimport palimpzest as pz\nfrom palimpzest.constants import Model\nfrom palimpzest.core.lib.schemas import ImageBase64\n\nmmqa_entry_cols = [\n    {\"name\": \"qid\", \"type\": str, \"desc\": \"The id of the MMQA question\"},\n    {\"name\": \"question\", \"type\": str, \"desc\": \"The question which needs to be answered\"},\n]\n\nmmqa_text_cols = [\n    {\"name\": \"supporting_text_ids\", \"type\": list[str], \"desc\": \"A list of text ids for text snippets which may support the question.\"},\n    {\"name\": \"supporting_texts\", \"type\": list[str], \"desc\": \"A list of text snippets which may support the question.\"},\n]\n\nmmqa_table_cols = [\n    {\"name\": \"supporting_table_ids\", \"type\": list[str], \"desc\": \"A list of table ids for tables which may support the question.\"},\n    {\"name\": \"supporting_tables\", \"type\": list[str], \"desc\": \"A list of tables which may support the question.\"},\n]\n\nmmqa_image_cols = [\n    {\"name\": \"supporting_image_ids\", \"type\": list[str], \"desc\": \"A list of image ids whose images may support the question.\"},\n    {\"name\": \"supporting_images\", \"type\": list[ImageBase64], \"desc\": \"A list of images which may support the question.\"},\n]\n\nmmqa_answer_cols = [\n    {\"name\": \"answers\", \"type\": list[str], \"desc\": \"The answer(s) to the question. Answer the question using the relevant information from gathered image(s), text(s), and table(s). 
Do not include any additional context or an explanation in your final list, simply list the entities asked for by the question\"},\n] # Return your answer as a JSON list of strings.\n\ndef get_json_from_answer(answer: str):\n    \"\"\"\n    This function parses an LLM response which is supposed to output a JSON object\n    and optimistically searches for the substring containing the JSON object.\n    \"\"\"\n    # split off context / excess, which models sometimes output after answer\n    answer = answer.split(\"Context:\")[0]\n    answer = answer.split(\"# this is the answer\")[0]\n    # trim the answer to only include the JSON array\n    if not answer.strip().startswith(\"[\"):\n        # Find the start index of the actual JSON string assuming the prefix is followed by the JSON array\n        start_index = answer.find(\"[\")\n        if start_index != -1:\n            # Remove the prefix and any leading characters before the JSON starts\n            answer = answer[start_index:]\n    if not answer.strip().endswith(\"]\"):\n        # Find the end index of the actual JSON string\n        # assuming the suffix is preceded by the JSON object/array\n        end_index = answer.rfind(\"]\")\n        if end_index != -1:\n            # Remove the suffix and any trailing characters after the JSON ends\n            answer = answer[: end_index + 1]\n    # Handle weird escaped values. I am not sure why the model\n    # is returning these, but the JSON parser can't take them\n    answer = answer.replace(r\"\\_\", \"_\")\n    answer = answer.replace(\"\\\\n\", \"\\n\")\n    # Remove https and http prefixes to not conflict with comment detection\n    # Handle comments in the JSON response. 
Use regex from // until end of line\n    answer = re.sub(r\"(?<!https?:)\\/\\/.*?$\", \"\", answer, flags=re.MULTILINE)\n    answer = re.sub(r\",\\n.*\\.\\.\\.$\", \"\", answer, flags=re.MULTILINE)\n    # Sanitize newlines in the JSON response\n    answer = answer.replace(\"\\n\", \" \")\n    # finally, parse and return the JSON object; errors are handled by the caller\n    return json.loads(answer)\n\n\nclass MMQAValidator(pz.Validator):\n    def __init__(\n        self,\n        num_samples: int = 5,\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__()\n\n        # read the appropriate dataset\n        dataset = []\n        with open(\"data/MMQA_train.jsonl\") as f:\n            for line in f:\n                dict_line = json.loads(line)\n                if \"image\" in dict_line[\"metadata\"][\"modalities\"] and len(dict_line[\"metadata\"][\"modalities\"]) > 1:\n                    dataset.append(dict_line)\n\n        # shuffle the questions for the given seed\n        if shuffle:\n            rng = np.random.default_rng(seed=seed)\n            rng.shuffle(dataset)\n\n        # trim to number of samples\n        self.dataset = dataset[:num_samples]\n        self.num_samples = num_samples\n        self.shuffle = shuffle\n        self.seed = seed\n\n        # compute qid to label mapping\n        self.qid_to_labels = self._compute_qid_to_labels()\n\n    def _compute_qid_to_labels(self) -> dict:\n        \"\"\"Compute the label for a MMQA question given its entry in the dataset.\"\"\"\n        qid_to_labels = {}\n        for entry in self.dataset:\n            # get the answers\n            answers = [answer[\"answer\"] for answer in entry[\"answers\"]]\n            supporting_text_doc_ids = [context[\"doc_id\"] for context in entry[\"supporting_context\"] if context[\"doc_part\"] == \"text\"]\n            supporting_table_doc_ids = [context[\"doc_id\"] for context in entry[\"supporting_context\"] if context[\"doc_part\"] == 
\"table\"]\n            supporting_image_doc_ids = [context[\"doc_id\"] for context in entry[\"supporting_context\"] if context[\"doc_part\"] == \"image\"]\n\n            # NOTE: inside the optimizer, our qualities will effectively be divided by two,\n            #       because we are not providing a label for supporting texts, tables, and images,\n            #       however this should be okay b/c it will affect all records equally\n            label_dict = {\n                \"answers\": answers,\n                \"supporting_text_ids\": supporting_text_doc_ids,\n                \"supporting_table_ids\": supporting_table_doc_ids,\n                \"supporting_image_ids\": supporting_image_doc_ids,\n                \"supporting_texts\": [],\n                \"supporting_tables\": [],\n                \"supporting_images\": [],\n            }\n            qid_to_labels[entry[\"qid\"]] = label_dict\n\n        return qid_to_labels\n\n    def recall(self, preds: list | None, targets: list):\n        if preds is None or len(targets) == 0:\n            return 0.0\n\n        tp, fn = 0, 0\n        try:\n            if isinstance(preds, list) and len(preds) > 0 and isinstance(preds[0], list):\n                preds = preds[0]\n\n            # compute recall of retrieved ids and return\n            preds = [str(pred).lower() for pred in preds]\n            targets = [str(target).lower() for target in targets]\n            remove_tokens = [c for c in string.punctuation if c != \"/\"]\n            for token in remove_tokens:\n                preds = [pred.replace(token, \"\") for pred in preds]\n                targets = [target.replace(token, \"\") for target in targets]\n\n\n            for target in targets:\n                if target in preds:\n                    tp += 1\n                else:\n                    fn += 1\n\n            return tp / (tp + fn)\n\n        except Exception:\n            os.makedirs(\"mmqa-recall-eval-errors\", exist_ok=True)\n            
ts = time.time()\n            with open(f\"mmqa-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def f1(self, preds: list | None, targets: list):\n        if preds is None or len(targets) == 0:\n            return 0.0\n\n        tp, fp, fn = 0, 0, 0\n        try:\n            if isinstance(preds, list) and len(preds) > 0 and isinstance(preds[0], list):\n                preds = preds[0]\n\n            # compute recall of retrieved ids and return\n            preds = [str(pred).lower() for pred in preds]\n            targets = [str(target).lower() for target in targets]\n\n            remove_tokens = [c for c in string.punctuation if c != \"/\"]\n            for token in remove_tokens:\n                preds = [pred.replace(token, \"\") for pred in preds]\n                targets = [target.replace(token, \"\") for target in targets]\n\n            for pred in preds:\n                if pred in targets:\n                    tp += 1\n                else:\n                    fp += 1\n            for target in targets:\n                if target not in preds:\n                    fn += 1\n\n            # compute overall f1 score and return\n            recall = tp / (tp + fn) if (tp + fn) > 0 else 0\n            precision = tp / (tp + fp) if (tp + fp) > 0 else 0\n            f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0\n\n            return f1\n\n        except Exception:\n            os.makedirs(\"mmqa-recall-eval-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"mmqa-recall-eval-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(preds))\n            return 0.0\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        preds = output.get(\"answers\")\n        targets = self.qid_to_labels[str(input_record[\"qid\"])][\"answers\"]\n        return self.f1(preds, 
targets)\n\n    def topk_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        if \"supporting_text_ids\" in fields:\n            preds = output.get(\"supporting_text_ids\")\n            targets = self.qid_to_labels[str(input_record[\"qid\"])][\"supporting_text_ids\"]\n            return self.recall(preds, targets)\n        elif \"supporting_table_ids\" in fields:\n            preds = output.get(\"supporting_table_ids\")\n            targets = self.qid_to_labels[str(input_record[\"qid\"])][\"supporting_table_ids\"]\n            return self.recall(preds, targets)\n        elif \"supporting_image_ids\" in fields:\n            preds = output.get(\"supporting_image_ids\")\n            targets = self.qid_to_labels[str(input_record[\"qid\"])][\"supporting_image_ids\"]\n            return self.recall(preds, targets)\n        else:\n            raise NotImplementedError(f\"Validator.topk_score_fn not implemented for fields {fields}.\")\n\n\nclass MMQADataset(pz.IterDataset):\n    def __init__(\n        self,\n        num_samples: int = 5,\n        split: str = \"dev\",\n        shuffle: bool = False,\n        seed: int = 42,\n    ):\n        super().__init__(id=\"mmqa\", schema=mmqa_entry_cols)\n\n        # read the appropriate dataset\n        dataset = []\n        with open(f\"data/MMQA_{split}.jsonl\") as f:\n            for line in f:\n                dict_line = json.loads(line)\n                if \"image\" in dict_line[\"metadata\"][\"modalities\"] and len(dict_line[\"metadata\"][\"modalities\"]) > 1:\n                    dataset.append(dict_line)\n\n        # shuffle the questions for the given seed\n        if shuffle:\n            rng = np.random.default_rng(seed=seed)\n            rng.shuffle(dataset)\n\n        # trim to number of samples\n        self.dataset = dataset[:num_samples]\n        self.num_samples = num_samples\n        self.shuffle = shuffle\n        self.seed = seed\n        self.split = split\n\n    def 
__len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx: int):\n        # get entry\n        entry = self.dataset[idx]\n\n        # get input fields\n        qid = entry[\"qid\"]\n        question = entry[\"question\"]\n\n        # create item with fields\n        item = {\"qid\": qid, \"question\": question}\n\n        return item\n\n\ndef compute_f1(final_df, answers_df):\n    merged_df = final_df.merge(answers_df, on=\"qid\", how=\"left\")\n    tp, fp, fn = 0, 0, 0\n    for _, row in merged_df.iterrows():\n        targets = [str(target).lower() for target in row[\"gt_answers\"]]\n        preds = row[\"answers\"]\n        if isinstance(preds, str):\n            try:\n                # convert single quotes to double quotes before parsing for JSON\n                preds = preds.replace(\"'\", '\"')\n                # try parsing preds as JSON list and cast everything to str to match targets\n                preds = get_json_from_answer(preds)\n                if isinstance(preds, list) and len(preds) > 0 and isinstance(preds[0], list):\n                    preds = preds[0]\n                preds = [str(pred).lower() for pred in preds]\n            except Exception:\n                # if that fails, give it a shot as a singleton answer that the LLM failed to wrap in a list\n                preds = [str(preds).lower()]\n            remove_tokens = [c for c in string.punctuation if c != \"/\"]\n            for token in remove_tokens:\n                preds = [pred.replace(token, \"\") for pred in preds]\n                targets = [target.replace(token, \"\") for target in targets]\n        elif isinstance(preds, list):\n            if isinstance(preds, list) and len(preds) > 0 and isinstance(preds[0], list):\n                preds = preds[0]\n            preds = [str(pred).lower() for pred in preds]\n            remove_tokens = [c for c in string.punctuation if c != \"/\"]\n            for token in remove_tokens:\n                preds = 
[pred.replace(token, \"\") for pred in preds]\n                targets = [target.replace(token, \"\") for target in targets]\n        else:\n            preds = []\n        for pred in preds:\n            if pred in targets:\n                tp += 1\n            else:\n                fp += 1\n        for target in targets:\n            if target not in preds:\n                fn += 1\n    # compute overall f1 score and return\n    recall = tp / (tp + fn) if (tp + fn) > 0 else 0\n    precision = tp / (tp + fp) if (tp + fp) > 0 else 0\n    f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0\n    return f1\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\"--progress\", default=False, action=\"store_true\", help=\"Print progress output\")\n    parser.add_argument(\"--gpt4-mini-only\", default=False, action=\"store_true\", help=\"Use only GPT-4o-mini\")\n    parser.add_argument(\n        \"--execution-strategy\",\n        default=\"parallel\",\n        type=str,\n        help=\"The plan executor to use. One of sequential, pipelined, parallel\",\n    )\n    parser.add_argument(\n        \"--sentinel-execution-strategy\",\n        default=\"mab\",\n        type=str,\n        help=\"The sentinel execution strategy to use. 
One of mab or random\",\n    )\n    parser.add_argument(\n        \"--policy\",\n        default=\"maxquality\",\n        type=str,\n        help=\"One of 'mincost', 'mintime', 'maxquality'\",\n    )\n    parser.add_argument(\n        \"--val-examples\",\n        default=20,\n        type=int,\n        help=\"Number of validation examples to sample from\",\n    )\n    parser.add_argument(\n        \"--model\",\n        default=\"gpt-4o\",\n        type=str,\n        help=\"One of 'gpt-4o', 'gpt-4o-mini', 'llama'\",\n    )\n    parser.add_argument(\n        \"--seed\",\n        default=42,\n        type=int,\n        help=\"Seed used to initialize RNG for MAB sampling algorithm\",\n    )\n    parser.add_argument(\n        \"--k\",\n        default=6,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--j\",\n        default=4,\n        type=int,\n        help=\"Number of columns to sample in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--sample-budget\",\n        default=100,\n        type=int,\n        help=\"Total sample budget in Random Sampling or MAB sentinel execution\",\n    )\n    parser.add_argument(\n        \"--quality\",\n        default=None,\n        type=float,\n        help=\"Quality threshold\",\n    )\n    parser.add_argument(\n        \"--exp-name\",\n        default=None,\n        type=str,\n        help=\"The experiment name.\",\n    )\n\n    args = parser.parse_args()\n\n    # create directory for profiling data\n    os.makedirs(\"opt-profiling-data\", exist_ok=True)\n\n    verbose = args.verbose\n    progress = args.progress\n    seed = args.seed\n    val_examples = args.val_examples\n    k = args.k\n    j = args.j\n    sample_budget = args.sample_budget\n    execution_strategy = args.execution_strategy\n    sentinel_execution_strategy = args.sentinel_execution_strategy\n    exp_name = (\n        
f\"mmqa-final-{sentinel_execution_strategy}-k{k}-j{j}-budget{sample_budget}-seed{seed}\"\n        if args.exp_name is None\n        else args.exp_name\n    )\n\n    policy = pz.MaxQuality()\n    if args.policy == \"mincost\":\n        policy = pz.MinCost()\n    elif args.policy == \"minlatency\":\n        policy = pz.MinTime()\n    elif args.quality is not None and args.policy == \"mincostatfixedquality\":\n        policy = pz.MinCostAtFixedQuality(min_quality=args.quality)\n    elif args.quality is not None and args.policy == \"minlatencyatfixedquality\":\n        policy = pz.MinTimeAtFixedQuality(min_quality=args.quality)\n    print(f\"USING POLICY: {policy}\")\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # create validator for MMQA\n    validator = MMQAValidator(num_samples=val_examples, shuffle=True, seed=seed)\n\n    # create datasets for MMQA\n    train_dataset = MMQADataset(split=\"train\", num_samples=val_examples, shuffle=True, seed=seed)\n    train_dataset = {train_dataset.id: train_dataset}\n\n    # load index [text-embedding-3-small]\n    chroma_client = chromadb.PersistentClient(\".chroma-mmqa\")\n    openai_ef = OpenAIEmbeddingFunction(\n        api_key=os.environ[\"OPENAI_API_KEY\"],\n        model_name=\"text-embedding-3-small\",\n    )\n    sentence_transformer_ef = SentenceTransformerEmbeddingFunction(\n        model_name=\"clip-ViT-B-32\"\n    )\n    text_index = chroma_client.get_collection(\"mmqa-texts\", embedding_function=openai_ef)\n    table_index = chroma_client.get_collection(\"mmqa-tables\", embedding_function=openai_ef)\n    image_index = chroma_client.get_collection(\"mmqa-images\", embedding_function=sentence_transformer_ef)\n\n    def get_results_and_ids(index: chromadb.Collection, query: list[list[float]], n_results: int, image=False) -> 
tuple[list[str], list[str]]:\n        # execute query with embeddings\n        results = index.query(query, n_results=n_results)\n\n        # get list of result terms with their cosine similarity scores\n        final_results = []\n        for query_doc_ids, query_docs, query_distances in zip(results[\"ids\"], results[\"documents\"], results[\"distances\"]):\n            for doc_id, doc, dist in zip(query_doc_ids, query_docs, query_distances):\n                cosine_similarity = 1 - dist\n                final_results.append({\"content\": doc, \"id\": doc_id, \"similarity\": cosine_similarity})\n\n        # sort the results by similarity score\n        sorted_results = sorted(final_results, key=lambda result: result[\"similarity\"], reverse=True)\n\n        # remove duplicates\n        sorted_results_set = set()\n        final_sorted_results, final_sorted_result_ids = [], []\n        for result in sorted_results:\n            if result[\"content\"] not in sorted_results_set:\n                sorted_results_set.add(result[\"content\"])\n                final_sorted_results.append(result[\"content\"])\n                final_sorted_result_ids.append(result[\"id\"])\n\n        # return the top-k most similar results and their ids\n        return final_sorted_results[:n_results], final_sorted_result_ids[:n_results]\n\n    def text_search_func(index: chromadb.Collection, query: list[list[float]], k: int) -> dict[str, list[str]]:\n        # execute query with embeddings\n        results, result_ids = get_results_and_ids(index, query, n_results=k)\n        return {\"supporting_texts\": results, \"supporting_text_ids\": result_ids}\n\n    def table_search_func(index: chromadb.Collection, query: list[list[float]], k: int) -> dict[str, list[str]]:\n        # execute query with embeddings\n        results, result_ids = get_results_and_ids(index, query, n_results=k)\n        return {\"supporting_tables\": results, \"supporting_table_ids\": result_ids}\n\n    def image_search_func(index: chromadb.Collection, 
query: list[list[float]], k: int) -> dict[str, list[str]]:\n        # limit max number of results to 5\n        # k = min(k, 5)\n\n        # execute query with embeddings\n        _, result_ids = get_results_and_ids(index, query, n_results=k, image=True)\n        possible_endings = {'.JPG', '.png', '.jpeg', '.jpg', '.tif', '.JPEG', '.tiff', '.PNG', '.Jpg', '.gif'}\n\n        results = []\n        for image_id in result_ids:\n            # find the correct image file\n            for ending in possible_endings:\n                if os.path.exists(f\"/ssd1/mdrusso/mmqa-images/{image_id}{ending}\"):\n                    image_id += ending\n                    break\n\n            # load image from disk\n            with open(f\"/ssd1/mdrusso/mmqa-images/{image_id}\", \"rb\") as f:\n                base64_image_str = base64.b64encode(f.read()).decode(\"utf-8\")\n                results.append(base64_image_str)\n\n        return {\"supporting_images\": results, \"supporting_image_ids\": result_ids}\n\n    # construct plan\n    plan = MMQADataset(split=\"dev\", num_samples=100, shuffle=True, seed=seed)\n    plan = plan.sem_topk(\n        index=text_index,\n        search_func=text_search_func,\n        search_attr=\"question\",\n        output_attrs=mmqa_text_cols,\n    )\n    plan = plan.sem_topk(\n        index=table_index,\n        search_func=table_search_func,\n        search_attr=\"question\",\n        output_attrs=mmqa_table_cols,\n    )\n    plan = plan.sem_topk(\n        index=image_index,\n        search_func=image_search_func,\n        search_attr=\"question\",\n        output_attrs=mmqa_image_cols,\n    )\n    plan = plan.sem_map(mmqa_answer_cols, depends_on=[\"question\", \"supporting_texts\", \"supporting_tables\", \"supporting_images\"])\n\n    # execute pz plan\n    config = pz.QueryProcessorConfig(\n        policy=policy,\n        optimizer_strategy=\"pareto\",\n        sentinel_execution_strategy=sentinel_execution_strategy,\n        
execution_strategy=execution_strategy,\n        use_final_op_quality=True,\n        max_workers=1,\n        verbose=verbose,\n        available_models=[\n            Model.GPT_4o_MINI,\n        ],\n        allow_bonded_query=True,\n        allow_critic=True,\n        allow_mixtures=True,\n        allow_rag_reduction=True,\n        progress=progress,\n        k=k,\n        j=j,\n        sample_budget=sample_budget,\n        seed=seed,\n        exp_name=exp_name,\n    )\n\n    data_record_collection = plan.optimize_and_run(config=config, train_dataset=train_dataset, validator=validator)\n\n    print(data_record_collection.to_df())\n    data_record_collection.to_df().to_csv(f\"opt-profiling-data/{exp_name}-output.csv\", index=False)\n\n    # create filepaths for records and stats\n    records_path = f\"opt-profiling-data/{exp_name}-records.json\"\n    stats_path = f\"opt-profiling-data/{exp_name}-profiling.json\"\n\n    # save record outputs\n    record_jsons = []\n    for record in data_record_collection:\n        record_dict = record.to_dict()\n        record_dict = {\n            k: v\n            for k, v in record_dict.items()\n            if k in [\"qid\", \"question\", \"supporting_text_ids\", \"supporting_table_ids\", \"supporting_image_ids\", \"answers\"]\n        }\n        record_jsons.append(record_dict)\n\n    with open(records_path, \"w\") as f:\n        json.dump(record_jsons, f)\n\n    # read the appropriate dataset\n    dataset = []\n    with open(\"data/MMQA_dev.jsonl\") as f:\n        for line in f:\n            dict_line = json.loads(line)\n            if \"image\" in dict_line[\"metadata\"][\"modalities\"] and len(dict_line[\"metadata\"][\"modalities\"]) > 1:\n                dataset.append(dict_line)\n\n    # shuffle the questions for the given seed\n    rng = np.random.default_rng(seed=seed)\n    rng.shuffle(dataset)\n\n    # trim to 100 samples\n    dataset = dataset[:100]\n    answer_dataset = []\n    for item in dataset:\n        answers = 
list(map(lambda elt: str(elt[\"answer\"]), item[\"answers\"]))\n        answer_dataset.append({\n            \"qid\": item[\"qid\"],\n            \"gt_answers\": answers\n        })\n\n    # construct dataframe of ground-truth answers\n    import pandas as pd\n    answers_df = pd.DataFrame(answer_dataset)\n\n    # get final plan str\n    final_plan_id = list(data_record_collection.execution_stats.plan_stats.keys())[0]\n    final_plan_str = data_record_collection.execution_stats.plan_strs[final_plan_id]\n\n    # write stats to disk\n    stats_dict = {\n        \"f1\": compute_f1(data_record_collection.to_df(), answers_df),\n        \"optimization_time\": data_record_collection.execution_stats.optimization_time,\n        \"optimization_cost\": data_record_collection.execution_stats.optimization_cost,\n        \"plan_execution_time\": data_record_collection.execution_stats.plan_execution_time,\n        \"plan_execution_cost\": data_record_collection.execution_stats.plan_execution_cost,\n        \"total_execution_time\": data_record_collection.execution_stats.total_execution_time,\n        \"total_execution_cost\": data_record_collection.execution_stats.total_execution_cost,\n        \"plan_str\": final_plan_str,\n    }\n    print(f\"F1 IS: {stats_dict['f1']}\")\n\n    with open(f\"opt-profiling-data/{exp_name}-stats.json\", \"w\") as f:\n        json.dump(stats_dict, f)\n"
  },
  {
    "path": "abacus-research/run_ablation_study.sh",
    "content": "#!/bin/bash\n\nfor seed in {0..9}\ndo\n    for priors in none naive sample\n    do\n        for sentinel in mab random\n        do\n            for opt in pareto greedy\n            do\n                k=6\n                j=4\n                if [[ $sentinel -eq \"random\" ]]; then\n                    j=8\n                fi\n                priors_file=\"none\"\n                if [[ $priors -eq \"naive\" ]]; then\n                    priors_file=\"cheap-priors.json\"\n                elif [[ $priors -eq \"sample\" ]]; then\n                    priors_file=\"biodex-priors.json\"\n                fi\n\n                exp_name=\"ablation-${priors}-${sentinel}-${opt}-seed${seed}\"\n                FILE=\"ablation-data/${exp_name}-metrics.json\"\n                if [ -f $FILE ]; then\n                    echo \"Skipping because $FILE exists.\"\n                else\n                    echo \"Running Seed: ${seed} -- priors: ${priors} (${priors_file}) -- sentinel: ${sentinel} -- k: ${k} -- j: ${j} -- opt: ${opt}\"\n                    python biodex-ablation.py --priors-file $priors_file --k $k --j $j --sample-budget 150 --optimizer-strategy $opt --sentinel-execution-strategy $sentinel --seed $seed --exp-name $exp_name\n                fi\n            done\n        done\n    done\ndone\n"
  },
  {
    "path": "abacus-research/run_biodex.sh",
    "content": "#!/bin/bash\n\nfor seed in {0..9}\ndo\n    echo \"Running Seed: ${seed}\"\n    exp_name=\"biodex-final-mab-k6-j4-budget150-seed${seed}\"\n    python biodex-demo.py --progress --policy maxquality --val-examples 20 --k 6 --j 4 --sample-budget 150 --seed $seed --exp-name $exp_name --gpt4-mini-only\ndone\n"
  },
  {
    "path": "abacus-research/run_biodex_cascades.sh",
    "content": "#!/bin/bash\n\n\nfor seed in {0..9}\ndo\n  for budget in 150 300 450\n  do\n    for strategy in \"greedy\" \"pareto\"\n    do\n      cost=0.5\n      k=0\n      j=0\n      if [[ $budget -eq 150 ]]; then\n        k=6\n        j=4\n      elif [[ $budget -eq 300 ]]; then\n        k=24\n        j=5\n      elif [[ $budget -eq 450 ]]; then\n        k=48\n        j=6\n      fi\n      # no priors\n      exp_name=\"biodex-${strategy}-cost${cost}-budget${budget}-k${k}-j${j}-seed${seed}\"\n      FILE=\"pareto-cascades-data/${exp_name}-metrics.json\"\n      if [ -f $FILE ]; then\n        echo \"Skipping because $FILE exists.\"\n      else\n        echo \"Running Seed: ${seed} -- cost: ${cost} -- budget: ${budget} -- k: ${k} -- j: ${j} -- strategy: ${strategy}\"\n        python biodex-pareto-cascades.py --progress --k $k --j $j --sample-budget $budget --optimizer-strategy $strategy --cost $cost --seed $seed --exp-name $exp_name\n      fi\n\n      # sample priors\n      exp_name=\"biodex-${strategy}-cost${cost}-with-priors-budget${budget}-k${k}-j${j}-seed${seed}\"\n      FILE=\"pareto-cascades-data/${exp_name}-metrics.json\"\n      if [ -f $FILE ]; then\n        echo \"Skipping because $FILE exists.\"\n      else\n        echo \"Running Seed: ${seed} -- cost: ${cost} -- SAMPLE PRIORS -- budget: ${budget} -- k: ${k} -- j: ${j} -- strategy: ${strategy}\"\n        python biodex-pareto-cascades.py --progress --priors-file biodex-priors-cascades.json --k $k --j $j --sample-budget $budget --optimizer-strategy $strategy --cost $cost --seed $seed --exp-name $exp_name\n      fi\n\n      # naive priors\n      exp_name=\"biodex-${strategy}-cost${cost}-cheap-priors-budget${budget}-k${k}-j${j}-seed${seed}\"\n      FILE=\"pareto-cascades-data/${exp_name}-metrics.json\"\n      if [ -f $FILE ]; then\n        echo \"Skipping because $FILE exists.\"\n      else\n        echo \"Running Seed: ${seed} -- cost: ${cost} -- CHEAP PRIORS -- budget: ${budget} -- k: ${k} -- j: ${j} -- 
strategy: ${strategy}\"\n        python biodex-pareto-cascades.py --progress --priors-file cheap-priors-cascades.json --k $k --j $j --sample-budget $budget --optimizer-strategy $strategy --cost $cost --seed $seed --exp-name $exp_name\n      fi\n    done\n  done\ndone\n"
  },
  {
    "path": "abacus-research/run_biodex_cost_threshold.sh",
    "content": "#!/bin/bash\n\n\nfor cost in 1.0 2.0 4.0 8.0 999.99\ndo\n  for seed in {0..9}\n  do\n    # set variables\n    budget=450\n    k=48\n    j=3\n\n    # no priors\n    exp_name=\"biodex-pareto-cost${cost}-budget${budget}-k${k}-j${j}-seed${seed}\"\n    FILE=\"max-quality-at-cost-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- cost: ${cost} -- budget: ${budget} -- k: ${k} -- j: ${j} -- strategy: ${strategy}\"\n      python biodex-max-quality-at-cost.py --progress --k $k --j $j --sample-budget $budget --cost $cost --seed $seed --exp-name $exp_name\n    fi\n\n    # sample priors\n    exp_name=\"biodex-pareto-cost${cost}-with-priors-budget${budget}-k${k}-j${j}-seed${seed}\"\n    FILE=\"max-quality-at-cost-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- cost: ${cost} -- SAMPLE PRIORS -- budget: ${budget} -- k: ${k} -- j: ${j} -- strategy: ${strategy}\"\n      python biodex-max-quality-at-cost.py --progress --priors-file biodex-priors.json --k $k --j $j --sample-budget $budget --cost $cost --seed $seed --exp-name $exp_name\n    fi\n  done\ndone\n"
  },
  {
    "path": "abacus-research/run_biodex_min_cost_latency.sh",
    "content": "#!/bin/bash\n\n\nfor policy in \"mincost\" \"minlatency\"\ndo\n  for seed in {0..9}\n  do\n    # set variables\n    budget=150\n    k=6\n    j=4\n\n    echo \"Running Seed: ${seed}\"\n    exp_name=\"biodex-final-${policy}-k6-j4-budget150-seed${seed}\"\n    python biodex-demo.py --progress --policy $policy --val-examples 20 --k 6 --j 4 --sample-budget 150 --seed $seed --exp-name $exp_name --gpt4-mini-only\n\n  done\ndone\n"
  },
  {
    "path": "abacus-research/run_biodex_priors.sh",
    "content": "#!/bin/bash\n\nfor sample_budget in 10 20 50 100\ndo\n  for seed in {0..9}\n  do\n    k=0\n    j=0\n    if [[ $sample_budget -eq 10 ]]; then\n      k=2\n      j=2\n    elif [[ $sample_budget -eq 20 ]]; then\n      k=2\n      j=2\n    elif [[ $sample_budget -eq 50 ]]; then\n      k=3\n      j=3\n    elif [[ $sample_budget -eq 100 ]]; then\n      k=4\n      j=4\n    fi\n\n    # run without priors\n    exp_name=\"biodex-no-priors-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: NO PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python biodex-demo.py --progress --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name $exp_name\n    fi\n\n    # run with sample-based priors\n    exp_name=\"biodex-with-priors-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: WITH PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python biodex-demo.py --progress --priors-file biodex-priors.json --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name $exp_name\n    fi\n\n    # run with cheap priors \n    exp_name=\"biodex-cheap-priors-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: CHEAP PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python biodex-demo.py --progress --priors-file cheap-priors.json --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name $exp_name\n    fi\n  done\ndone\n"
  },
  {
    "path": "abacus-research/run_biodex_priors_constrained.sh",
    "content": "#!/bin/bash\n\nfor sample_budget in 10 20 50 100\ndo\n  for seed in {0..9}\n  do\n    k=0\n    j=0\n    if [[ $sample_budget -eq 10 ]]; then\n      k=4\n      j=1\n    elif [[ $sample_budget -eq 20 ]]; then\n      k=4\n      j=1\n    elif [[ $sample_budget -eq 50 ]]; then\n      k=4\n      j=2\n    elif [[ $sample_budget -eq 100 ]]; then\n      k=5\n      j=3\n    fi\n\n    # run without priors\n    exp_name=\"biodex-no-priors-constrained-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: NO PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python biodex-demo.py --constrained --progress --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name $exp_name\n    fi\n\n    # run with sample-based priors\n    exp_name=\"biodex-with-priors-constrained-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: WITH PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python biodex-demo.py --constrained --progress --priors-file biodex-priors.json --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name $exp_name\n    fi\n\n    # run with cheap priors \n    exp_name=\"biodex-cheap-priors-constrained-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: CHEAP PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python biodex-demo.py --constrained --progress --priors-file cheap-priors.json --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name 
$exp_name\n    fi\n  done\ndone\n"
  },
  {
    "path": "abacus-research/run_cuad.sh",
    "content": "#!/bin/bash\n\nfor seed in {0..9}\ndo\n  policy=\"maxquality\"\n  echo \"Running Seed: ${seed} -- policy: ${policy}\"\n  exp_name=\"cuad-${policy}-k6-j4-budget50-seed${seed}\"\n  python cuad-demo.py --k 6 --j 4 --sample-budget 50 --seed $seed --exp-name $exp_name --gpt4-mini-only\ndone\n"
  },
  {
    "path": "abacus-research/run_cuad_cost_threshold.sh",
    "content": "#!/bin/bash\n\n\nfor cost in 1.0 2.0 4.0 8.0 999.99\ndo\n  for seed in {0..9}\n  do\n    # set variables\n    budget=300\n    k=60\n    j=5\n\n    # no priors\n    exp_name=\"cuad-pareto-cost${cost}-budget${budget}-k${k}-j${j}-seed${seed}\"\n    FILE=\"max-quality-at-cost-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- cost: ${cost} -- budget: ${budget} -- k: ${k} -- j: ${j} -- strategy: ${strategy}\"\n      python cuad-max-quality-at-cost.py --k $k --j $j --sample-budget $budget --cost $cost --seed $seed --exp-name $exp_name\n    fi\n\n    # sample priors\n    exp_name=\"cuad-pareto-cost${cost}-with-priors-budget${budget}-k${k}-j${j}-seed${seed}\"\n    FILE=\"max-quality-at-cost-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- cost: ${cost} -- SAMPLE PRIORS -- budget: ${budget} -- k: ${k} -- j: ${j} -- strategy: ${strategy}\"\n      python cuad-max-quality-at-cost.py --priors-file cuad-priors.json --k $k --j $j --sample-budget $budget --cost $cost --seed $seed --exp-name $exp_name\n    fi\n\n  done\ndone\n"
  },
  {
    "path": "abacus-research/run_cuad_min_cost_latency.sh",
    "content": "#!/bin/bash\n\nfor policy in \"mincost\" \"minlatency\"\ndo\n  for seed in {0..9}\n  do\n    echo \"Running Seed: ${seed}\"\n    exp_name=\"cuad-final-${policy}-k6-j4-budget50-seed${seed}\"\n    python cuad-demo.py --policy $policy --k 6 --j 4 --sample-budget 50 --seed $seed --exp-name $exp_name --gpt4-mini-only\n  done\ndone\n"
  },
  {
    "path": "abacus-research/run_cuad_priors.sh",
    "content": "#!/bin/bash\n\nfor sample_budget in 5 10 20 50\ndo\n  for seed in {0..9}\n  do\n    k=0\n    j=0\n    if [[ $sample_budget -eq 5 ]]; then\n      k=2\n      j=3\n    elif [[ $sample_budget -eq 10 ]]; then\n      k=3\n      j=2\n    elif [[ $sample_budget -eq 20 ]]; then\n      k=3\n      j=3\n    elif [[ $sample_budget -eq 50 ]]; then\n      k=6\n      j=4\n    fi\n\n    # run without priors\n    exp_name=\"cuad-no-priors-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: NO PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python cuad-demo.py --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name $exp_name\n    fi\n\n    # run with sample based priors\n    exp_name=\"cuad-with-priors-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: WITH PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python cuad-demo.py --priors-file cuad-priors.json --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name $exp_name\n    fi\n\n    # run with cheap priors \n    exp_name=\"cuad-cheap-priors-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: CHEAP PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python cuad-demo.py --priors-file cheap-priors.json --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name $exp_name\n    fi\n  done\ndone\n"
  },
  {
    "path": "abacus-research/run_cuad_priors_constrained.sh",
    "content": "#!/bin/bash\n\nfor sample_budget in 5 10 20 50\ndo\n  for seed in {0..9}\n  do\n    k=0\n    j=0\n    if [[ $sample_budget -eq 5 ]]; then\n      k=2\n      j=3\n    elif [[ $sample_budget -eq 10 ]]; then\n      k=3\n      j=2\n    elif [[ $sample_budget -eq 20 ]]; then\n      k=3\n      j=3\n    elif [[ $sample_budget -eq 50 ]]; then\n      k=6\n      j=4\n    fi\n\n    # run without priors\n    exp_name=\"cuad-no-priors-constrained-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: NO PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python cuad-demo.py --constrained --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name $exp_name\n    fi\n\n    # run with sample based priors\n    exp_name=\"cuad-with-priors-constrained-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: WITH PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python cuad-demo.py --constrained --priors-file cuad-priors.json --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name $exp_name\n    fi\n\n    # run with cheap priors \n    exp_name=\"cuad-cheap-priors-constrained-k${k}-j${j}-budget${sample_budget}-seed${seed}\"\n    FILE=\"opt-profiling-data/${exp_name}-metrics.json\"\n    if [ -f $FILE ]; then\n      echo \"Skipping because $FILE exists.\"\n    else\n      echo \"Running Seed: ${seed} -- priors: CHEAP PRIORS -- k: ${k} -- j: ${j} -- budget: ${sample_budget}\"\n      python cuad-demo.py --constrained --priors-file cheap-priors.json --k $k --j $j --sample-budget $sample_budget --seed $seed --exp-name $exp_name\n    fi\n  done\ndone\n"
  },
  {
    "path": "abacus-research/run_mmqa.sh",
    "content": "#!/bin/bash\n\nfor seed in {0..9}\ndo\n    echo \"Running Seed: ${seed}\"\n    exp_name=\"mmqa-final-mab-k6-j4-budget150-seed${seed}\"\n    python mmqa-demo.py --progress --k 6 --j 4 --sample-budget 150 --seed $seed --exp-name $exp_name --gpt4-mini-only\ndone\n"
  },
  {
    "path": "abacus-research/run_mmqa_complex.sh",
    "content": "#!/bin/bash\n\n# for seed in {0..9}\n# Lotus error'ed on seed 5 and 7, so we limit to these seeds only for a consistent comparison\nfor seed in 0 1 2 3 4 6 8 9\ndo\n    policy=\"maxquality\"\n    exp_name=\"mmqa-complex-${policy}-k6-j4-budget350-seed${seed}\"\n    FILE=\"mmqa-complex-data/${exp_name}-stats.json\"\n    if [ -f $FILE ]; then\n        echo \"Skipping because $FILE exists.\"\n    else\n        echo \"Running Seed: ${seed} -- ${policy}\"\n        python mmqa-complex-demo.py --progress --k 6 --j 4 --sample-budget 350 --seed $seed --exp-name $exp_name --gpt4-mini-only\n    fi\ndone\n"
  },
  {
    "path": "abacus-research/run_mmqa_complex_min_cost_latency.sh",
    "content": "#!/bin/bash\n\n# for seed in {0..9}\n# Lotus error'ed on seed 5 and 7, so we limit to these seeds only for a consistent comparison\nfor seed in 0 1 2 3 4 6 8 9\ndo\n    policy=\"mincost\"\n    exp_name=\"mmqa-complex-${policy}-k6-j4-budget350-seed${seed}\"\n    FILE=\"mmqa-complex-data/${exp_name}-stats.json\"\n    if [ -f $FILE ]; then\n        echo \"Skipping because $FILE exists.\"\n    else\n        echo \"Running Seed: ${seed} -- ${policy}\"\n        python mmqa-complex-demo.py --progress --k 6 --j 4 --sample-budget 350 --policy $policy --seed $seed --exp-name $exp_name --gpt4-mini-only\n    fi\n\n    policy=\"minlatency\"\n    exp_name=\"mmqa-complex-${policy}-k6-j4-budget350-seed${seed}\"\n    FILE=\"mmqa-complex-data/${exp_name}-stats.json\"\n    if [ -f $FILE ]; then\n        echo \"Skipping because $FILE exists.\"\n    else\n        echo \"Running Seed: ${seed} -- ${policy}\"\n        python mmqa-complex-demo.py --progress --k 6 --j 4 --sample-budget 350 --policy $policy --seed $seed --exp-name $exp_name --gpt4-mini-only\n    fi\ndone\n"
  },
  {
    "path": "abacus-research/run_mmqa_min_cost_latency.sh",
    "content": "#!/bin/bash\n\nfor policy in \"mincost\" \"minlatency\"\ndo\n    for seed in {0..9}\n    do\n        echo \"Running Seed: ${seed}\"\n        exp_name=\"mmqa-final-${policy}-k6-j4-budget150-seed${seed}\"\n        python mmqa-demo.py --progress --policy $policy --k 6 --j 4 --sample-budget 150 --seed $seed --exp-name $exp_name --gpt4-mini-only\n    done\ndone\n"
  },
  {
    "path": "abacus-research/score_biodex.py",
    "content": "import json\n\nimport numpy as np\n\n\ndef compute_final_metrics(metric: str, dir: str, exp_base_name: str):\n    qualities = []\n    opt_costs, run_costs = [], []\n    opt_times, run_times = [], []\n    total_costs, total_times = [], []\n    print(f\"--- {metric} ---\")\n    for seed in range(10):\n        exp_name = f\"{exp_base_name}-seed{seed}\"\n        with open(f\"{dir}/{exp_name}-metrics.json\") as f:\n            metrics = json.load(f)\n        qualities.append(metrics[\"rp@5\"])\n        opt_costs.append(metrics[\"optimization_cost\"])\n        opt_times.append(metrics[\"optimization_time\"])\n        run_costs.append(metrics[\"plan_execution_cost\"])\n        run_times.append(metrics[\"plan_execution_time\"])\n        total_costs.append(metrics[\"total_execution_cost\"])\n        total_times.append(metrics[\"total_execution_time\"])\n    \n    print(f\"Opt. Cost: {np.mean(opt_costs):.3f} +/- {np.std(opt_costs):.3f}\")\n    print(f\"Opt. Time: {np.mean(opt_times):.3f} +/- {np.std(opt_times):.3f}\")\n    print(f\"Run Cost: {np.mean(run_costs):.3f} +/- {np.std(run_costs):.3f}\")\n    print(f\"Run Time: {np.mean(run_times):.3f} +/- {np.std(run_times):.3f}\")\n    print(f\"Total Cost: {np.mean(total_costs):.3f} +/- {np.std(total_costs):.3f}\")\n    print(f\"Total Time: {np.mean(total_times):.3f} +/- {np.std(total_times):.3f}\")\n    print(f\"Quality: {np.mean(qualities):.3f} +/- {np.std(qualities):.3f}\")\n    print(\"-------\")\n\nif __name__ == \"__main__\":\n    compute_final_metrics(\"quality\", \"opt-profiling-data\", \"biodex-final-mab-k6-j4-budget150\")\n    compute_final_metrics(\"cost\", \"min-cost-at-quality-data\", \"biodex-pareto-min-cost-budget150-k6-j4\")\n    compute_final_metrics(\"latency\", \"min-latency-at-quality-data\", \"biodex-pareto-min-latency-budget150-k6-j4\")\n"
  },
  {
    "path": "abacus-research/score_cuad.py",
    "content": "import json\nimport os\n\nimport numpy as np\n\n\ndef compute_final_metrics(metric: str, dir: str, exp_base_name: str):\n    qualities = []\n    opt_costs, run_costs = [], []\n    opt_times, run_times = [], []\n    total_costs, total_times = [], []\n    print(f\"--- {metric} ---\")\n    for seed in range(10):\n        exp_name = f\"{exp_base_name}-seed{seed}\"\n        if os.path.exists(f\"{dir}/{exp_name}-metrics.json\") is False:\n            print(f\"Missing {dir}/{exp_name}-metrics.json\")\n            continue\n        with open(f\"{dir}/{exp_name}-metrics.json\") as f:\n            metrics = json.load(f)\n        qualities.append(metrics[\"f1\"])\n        opt_costs.append(metrics[\"optimization_cost\"])\n        opt_times.append(metrics[\"optimization_time\"])\n        run_costs.append(metrics[\"plan_execution_cost\"])\n        run_times.append(metrics[\"plan_execution_time\"])\n        total_costs.append(metrics[\"total_execution_cost\"])\n        total_times.append(metrics[\"total_execution_time\"])\n    \n    print(f\"Opt. Cost: {np.mean(opt_costs):.3f} +/- {np.std(opt_costs):.3f}\")\n    print(f\"Opt. Time: {np.mean(opt_times):.3f} +/- {np.std(opt_times):.3f}\")\n    print(f\"Run Cost: {np.mean(run_costs):.3f} +/- {np.std(run_costs):.3f}\")\n    print(f\"Run Time: {np.mean(run_times):.3f} +/- {np.std(run_times):.3f}\")\n    print(f\"Total Cost: {np.mean(total_costs):.3f} +/- {np.std(total_costs):.3f}\")\n    print(f\"Total Time: {np.mean(total_times):.3f} +/- {np.std(total_times):.3f}\")\n    print(f\"Quality: {np.mean(qualities):.3f} +/- {np.std(qualities):.3f}\")\n    print(\"-------\")\n\nif __name__ == \"__main__\":\n    compute_final_metrics(\"quality\", \"opt-profiling-data\", \"cuad-final-mab-k6-j4-budget50\")\n    compute_final_metrics(\"cost\", \"opt-profiling-data\", \"cuad-final-mincost-k6-j4-budget50\")\n    compute_final_metrics(\"latency\", \"opt-profiling-data\", \"cuad-final-minlatency-k6-j4-budget50\")"
  },
  {
    "path": "abacus-research/score_mmqa.py",
    "content": "import json\n\nimport numpy as np\n\n\ndef compute_final_metrics(metric: str, dir: str, exp_base_name: str):\n    qualities = []\n    opt_costs, run_costs = [], []\n    opt_times, run_times = [], []\n    total_costs, total_times = [], []\n    print(f\"--- {metric} ---\")\n    for seed in range(10):\n        exp_name = f\"{exp_base_name}-seed{seed}\"\n        with open(f\"{dir}/{exp_name}-stats.json\") as f:\n            metrics = json.load(f)\n        qualities.append(metrics[\"f1\"])\n        opt_costs.append(metrics[\"optimization_cost\"])\n        opt_times.append(metrics[\"optimization_time\"])\n        run_costs.append(metrics[\"plan_execution_cost\"])\n        run_times.append(metrics[\"plan_execution_time\"])\n        total_costs.append(metrics[\"total_execution_cost\"])\n        total_times.append(metrics[\"total_execution_time\"])\n    \n    print(f\"Opt. Cost: {np.mean(opt_costs):.3f} +/- {np.std(opt_costs):.3f}\")\n    print(f\"Opt. Time: {np.mean(opt_times):.3f} +/- {np.std(opt_times):.3f}\")\n    print(f\"Run Cost: {np.mean(run_costs):.3f} +/- {np.std(run_costs):.3f}\")\n    print(f\"Run Time: {np.mean(run_times):.3f} +/- {np.std(run_times):.3f}\")\n    print(f\"Total Cost: {np.mean(total_costs):.3f} +/- {np.std(total_costs):.3f}\")\n    print(f\"Total Time: {np.mean(total_times):.3f} +/- {np.std(total_times):.3f}\")\n    print(f\"Quality: {np.mean(qualities):.3f} +/- {np.std(qualities):.3f}\")\n    print(\"-------\")\n\nif __name__ == \"__main__\":\n    compute_final_metrics(\"quality\", \"opt-profiling-data\", \"mmqa-final-mab-k6-j4-budget150\")\n    compute_final_metrics(\"cost\", \"opt-profiling-data\", \"mmqa-final-mincost-k6-j4-budget150\")\n    compute_final_metrics(\"latency\", \"opt-profiling-data\", \"mmqa-final-minlatency-k6-j4-budget150\")\n"
  },
  {
    "path": "abacus-research/score_mmqa_complex.py",
    "content": "import json\nimport os\n\nimport numpy as np\n\n\ndef compute_final_metrics(metric: str, dir: str, exp_base_name: str):\n    qualities = []\n    opt_costs, run_costs = [], []\n    opt_times, run_times = [], []\n    total_costs, total_times = [], []\n    print(f\"--- {metric} ---\")\n    for seed in [0, 1, 2, 3, 4, 6, 8, 9]:\n        exp_name = f\"{exp_base_name}-seed{seed}\"\n        if os.path.exists(f\"{dir}/{exp_name}-stats.json\"):\n            with open(f\"{dir}/{exp_name}-stats.json\") as f:\n                metrics = json.load(f)\n            qualities.append(metrics[\"f1\"])\n            opt_costs.append(metrics[\"optimization_cost\"])\n            opt_times.append(metrics[\"optimization_time\"])\n            run_costs.append(metrics[\"plan_execution_cost\"])\n            run_times.append(metrics[\"plan_execution_time\"])\n            total_costs.append(metrics[\"total_execution_cost\"])\n            total_times.append(metrics[\"total_execution_time\"])\n    \n    print(f\"Opt. Cost: {np.mean(opt_costs):.3f} +/- {np.std(opt_costs):.3f}\")\n    print(f\"Opt. Time: {np.mean(opt_times):.3f} +/- {np.std(opt_times):.3f}\")\n    print(f\"Run Cost: {np.mean(run_costs):.3f} +/- {np.std(run_costs):.3f}\")\n    print(f\"Run Time: {np.mean(run_times):.3f} +/- {np.std(run_times):.3f}\")\n    print(f\"Total Cost: {np.mean(total_costs):.3f} +/- {np.std(total_costs):.3f}\")\n    print(f\"Total Time: {np.mean(total_times):.3f} +/- {np.std(total_times):.3f}\")\n    print(f\"Quality: {np.mean(qualities):.3f} +/- {np.std(qualities):.3f}\")\n    print(\"-------\")\n\nif __name__ == \"__main__\":\n    compute_final_metrics(\"quality\", \"opt-profiling-data\", \"mmqa-complex-final-mab-k6-j4-budget350\")\n    compute_final_metrics(\"cost\", \"opt-profiling-data\", \"mmqa-complex-mincost-k6-j4-budget350\")\n    compute_final_metrics(\"latency\", \"opt-profiling-data\", \"mmqa-complex-minlatency-k6-j4-budget350\")\n"
  },
  {
    "path": "abacus-research/setup_cuad_data.py",
    "content": "#!/usr/bin/env python\n\"\"\"\nScript to download CUAD dataset and set up local data directory.\nThis replaces the need for HuggingFace datasets library.\n\"\"\"\n\nimport os\nimport urllib.request\nimport zipfile\n\n\ndef setup_cuad_data():\n    # Create cuad-data directory\n    data_dir = \"cuad-data\"\n    if not os.path.exists(data_dir):\n        os.makedirs(data_dir)\n        print(f\"Created directory: {data_dir}\")\n    \n    # Download CUAD data zip file\n    data_url = \"https://github.com/TheAtticusProject/cuad/raw/main/data.zip\"\n    zip_path = os.path.join(data_dir, \"data.zip\")\n    \n    if not os.path.exists(zip_path):\n        print(f\"Downloading CUAD data from {data_url}...\")\n        urllib.request.urlretrieve(data_url, zip_path)\n        print(f\"Downloaded to {zip_path}\")\n    else:\n        print(f\"Data already downloaded at {zip_path}\")\n    \n    # Extract the zip file\n    print(\"Extracting data...\")\n    with zipfile.ZipFile(zip_path, 'r') as zip_ref:\n        zip_ref.extractall(data_dir)\n    print(f\"Extracted data to {data_dir}\")\n    \n    # Download the dataset loading script (for reference, not actually used)\n    script_url = \"https://huggingface.co/datasets/theatticusproject/cuad-qa/resolve/main/cuad-qa.py\"\n    script_path = os.path.join(data_dir, \"cuad-qa.py\")\n    \n    if not os.path.exists(script_path):\n        print(f\"Downloading CUAD dataset script from {script_url}...\")\n        urllib.request.urlretrieve(script_url, script_path)\n        print(f\"Downloaded to {script_path}\")\n    else:\n        print(f\"Script already exists at {script_path}\")\n    \n    # List extracted files\n    print(\"\\nExtracted files:\")\n    for file in os.listdir(data_dir):\n        if file.endswith('.json'):\n            file_path = os.path.join(data_dir, file)\n            size = os.path.getsize(file_path) / (1024 * 1024)  # Size in MB\n            print(f\"  - {file} ({size:.2f} MB)\")\n    \n    
print(\"\\nSetup complete! CUAD data is ready in the 'cuad-data' directory.\")\n    print(\"\\nTo use this data in your scripts, update the data loading to:\")\n    print(\"  - train data: cuad-data/train_separate_questions.json\")\n    print(\"  - test data: cuad-data/test.json\")\n    \n    return data_dir\n\nif __name__ == \"__main__\":\n    setup_cuad_data()"
  },
  {
    "path": "demos/audio-demo.py",
    "content": "import os\n\nimport kagglehub\n\nimport palimpzest as pz\n\n\nclass SmallAudioDataset(pz.AudioFileDataset):\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n\n        # Limit to first 10 audio files for demo purposes\n        self.filepaths = self.filepaths[:10]\n\n\nif __name__ == \"__main__\":\n    # Download latest version\n    path = kagglehub.dataset_download(\"rushibalajiputthewad/sound-classification-of-animal-voice\")\n    print(f\"Dataset downloaded to: {path}\")\n\n    # create simple plan to classify animal sounds\n    plan = SmallAudioDataset(id=\"animal-sounds\", path=os.path.join(path, \"Animal-Soundprepros\"))\n    plan = plan.sem_map(cols=[{\"name\": \"animal\", \"type\": str, \"description\": \"The type of animal making the sound in the recording.\"}])\n\n    # run plan un-optimized\n    config = pz.QueryProcessorConfig(\n        policy=pz.MaxQuality(),\n        # available_models=[pz.Model.GEMINI_2_0_FLASH, pz.Model.GEMINI_2_5_FLASH, pz.Model.GEMINI_2_5_PRO],\n    )\n    output = plan.run(config)\n\n    print(output.to_df())\n"
  },
  {
    "path": "demos/caching-demo.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nRealistic Demo showcasing prompt caching capabilities in Palimpzest.\n\nThis demo processes multiple employee travel requests against a comprehensive\nCorporate Travel Policy. The policy text (~2000 tokens) is included in the\nsystem prompt, creating a realistic scenario for prompt caching where a large\nstatic context is reused across multiple dynamic inputs.\n\nWorkload:\n- Context: A lengthy 10-page Corporate Travel & Expense Policy.\n- Input: Short email requests from employees.\n- Task: Analyze each request for policy compliance, identifying violations and reimbursable amounts.\n\nSupported caching providers:\n- OpenAI (GPT-4o, GPT-4o-mini): Automatic prefix caching\n- Anthropic (Claude 3.5 Sonnet/Haiku): Explicit cache_control markers\n- Gemini: Implicit caching\n\"\"\"\n\nimport argparse\nimport os\nimport time\nfrom typing import List\n\nfrom dotenv import load_dotenv\n\nimport palimpzest as pz\nfrom palimpzest.constants import Model\nfrom palimpzest.core.lib.schemas import TextFile\n\nload_dotenv()\n\n# =============================================================================\n# MOCK DATA: CORPORATE TRAVEL POLICY (Static Context > 1024 tokens)\n# =============================================================================\nCORPORATE_TRAVEL_POLICY = \"\"\"\nGLOBAL CORP TRAVEL & EXPENSE POLICY (v2024.1)\n\nSECTION 1: OVERVIEW AND PHILOSOPHY\nGlobal Corp expects employees to act responsibly and professionally when incurring and submitting costs. \nThe company will reimburse employees for reasonable and necessary expenses incurred during approved business travel. \nThis policy applies to all employees, contractors, and consultants.\n\nSECTION 2: AIR TRAVEL\n2.1 Booking Window: All domestic flights must be booked at least 14 days in advance. 
International flights must be booked 21 days in advance.\n2.2 Class of Service:\n    - Economy Class: Required for all domestic flights under 6 hours.\n    - Premium Economy: Allowed for domestic flights over 6 hours or international flights under 8 hours.\n    - Business Class: Allowed for international flights exceeding 8 hours duration.\n    - First Class: Strictly prohibited unless approved by the CEO.\n2.3 Ancillary Fees:\n    - Checked Bags: Up to two bags reimbursed for trips > 3 days. One bag for trips <= 3 days.\n    - Wi-Fi: Reimbursed only if business justification is provided (e.g., \"urgent client deadline\").\n    - Seat Selection: Fees > $50 require VP approval.\n\nSECTION 3: LODGING\n3.1 Hotel Caps (Nightly Rates excluding taxes):\n    - Tier 1 Cities (NY, London, Tokyo, SF, Zurich): $350 USD\n    - Tier 2 Cities (Chicago, Paris, Berlin, Austin): $250 USD\n    - All Other Locations: $175 USD\n3.2 Room Type: Standard single rooms only. Suites are prohibited.\n3.3 Laundry: Reasonable laundry expenses reimbursed for trips exceeding 5 consecutive nights.\n\nSECTION 4: MEALS AND ENTERTAINMENT\n4.1 Daily Meal Allowance (Per Diem):\n    - Tier 1 Cities: $100/day\n    - Tier 2 Cities: $75/day\n    - Others: $60/day\n4.2 Client Entertainment:\n    - Must include at least one current or prospective client.\n    - Cap is $150 per person (including employees).\n    - Names and affiliations of all attendees must be documented.\n4.3 Alcohol:\n    - Reimbursable only with dinner.\n    - Moderate consumption allowed (max 2 drinks per person).\n    - \"Top Shelf\" liquors prohibited.\n\nSECTION 5: GROUND TRANSPORTATION\n5.1 Ride Share/Taxi: Preferred mode for travel between airport and hotel.\n5.2 Car Rentals:\n    - Class: Intermediate/Mid-size or smaller.\n    - Insurance: Decline CDW/LDW (covered by corporate policy).\n    - Fuel: Pre-paid fuel options are prohibited; cars must be returned full.\n5.3 Rail: Economy/Standard class only. 
Acela Business Class permitted for Northeast Corridor travel.\n\nSECTION 6: MISCELLANEOUS\n6.1 Tipping:\n    - Meals: 15-20%\n    - Taxis: 10-15%\n    - Bellhop: $1-2 per bag\n6.2 Non-Reimbursable Items:\n    - Personal grooming/toiletries.\n    - Fines (parking, speeding).\n    - Airline club memberships.\n    - In-room movies.\n    - Lost luggage/property.\n\nSECTION 7: SUBMISSION PROCESS\nExpenses must be submitted within 30 days of trip completion. Receipts required for all expenses > $25.\n\"\"\"\n\n# =============================================================================\n# MOCK DATA: EMPLOYEE REQUESTS (Dynamic Inputs)\n# =============================================================================\nEMPLOYEE_REQUESTS = [\n    # Request 1: Compliant\n    \"\"\"Subject: Trip to London\n    I booked a flight to London (8.5 hours) in Business Class for the client summit. \n    Hotel is $320/night. Meal expenses were about $90/day. \n    Receipts attached.\"\"\",\n    # Request 2: Violation (Booking window & First Class)\n    \"\"\"Subject: Urgent NY Trip\n    I need to fly to New York tomorrow. Booked First Class because it was the only seat left.\n    Hotel is the Ritz at $500/night. \n    Also expensed $40 for in-flight Wi-Fi to finish the Q3 report.\"\"\",\n    # Request 3: Violation (Car Rental & Alcohol)\n    \"\"\"Subject: Austin Conference\n    Rented a luxury SUV for the team in Austin. \n    Dinner with the team (no clients) came to $800 ($200/person) including 3 bottles of wine.\n    Hotel was $240/night.\"\"\",\n    # Request 4: Compliant (Tier 2 City)\n    \"\"\"Subject: Berlin Site Visit\n    Flew Economy to Berlin. Hotel was $220/night.\n    Took a taxi from TXL ($45 + $5 tip).\n    Daily meals averaged $70.\"\"\",\n    # Request 5: Violation (Misc items)\n    \"\"\"Subject: Tokyo Tech Symposium\n    Trip duration: 4 days. 
\n    Expensed:\n    - Flight (Premium Econ, 11 hours)\n    - Hotel ($340/night)\n    - Laundry service ($60)\n    - Forgotten toothbrush replacement ($15)\n    - Parking ticket ($50)\n    \"\"\",\n]\n\n# Output Schema\nOUTPUT_SCHEMA = [\n    {\"name\": \"status\", \"type\": str, \"desc\": \"One of: 'COMPLIANT', 'PARTIAL_VIOLATION', 'MAJOR_VIOLATION'\"},\n    {\n        \"name\": \"violations\",\n        \"type\": str,\n        \"desc\": \"A list of specific policy violations found, referencing the specific section numbers (e.g., 'Violation of Section 2.2'). If compliant, return 'None'.\",\n    },\n    {\n        \"name\": \"reimbursable_summary\",\n        \"type\": str,\n        \"desc\": \"A concise summary of what should be reimbursed vs rejected based on the policy text.\",\n    },\n    {\n        \"name\": \"flag_for_review\",\n        \"type\": bool,\n        \"desc\": \"True if the request requires manual review by a manager (e.g. for high amounts or ambiguous justifications).\",\n    },\n]\n\nTASK_DESC = f\"\"\"\nYou are an AI auditor for Global Corp. Your job is to review employee travel expense descriptions against the Corporate Travel Policy.\nThe full policy text is provided below. 
\n\n{CORPORATE_TRAVEL_POLICY}\n\nAnalyze the input email and determine if the expenses adhere to the policy.\n\"\"\"\n\n\nclass TravelRequestDataset(pz.IterDataset):\n    \"\"\"Custom dataset that provides travel requests as text records.\"\"\"\n\n    def __init__(self, requests: List[str]):\n        super().__init__(id=\"travel_requests\", schema=TextFile)\n        self.requests = requests\n\n    def __len__(self):\n        return len(self.requests)\n\n    def __getitem__(self, idx: int):\n        return {\n            \"filename\": f\"request_{idx + 1}.txt\",\n            \"contents\": self.requests[idx],\n        }\n\n\n# Model mapping (Same as original)\nMODEL_MAPPING = {\n    \"gpt-4o\": Model.GPT_4o,\n    \"gpt-4o-mini\": Model.GPT_4o_MINI,\n    \"claude-4-0-sonnet\": Model.CLAUDE_4_SONNET,\n    # \"claude-3-7-sonnet\": Model.CLAUDE_3_7_SONNET, # deprecated model testing\n    \"claude-4-5-haiku\": Model.CLAUDE_4_5_HAIKU,\n    \"gemini-2.5-flash\": Model.GOOGLE_GEMINI_2_5_FLASH,\n    # \"deepseek-v3\": Model.DEEPSEEK_V3,\n}\n\n\ndef get_model_from_string(model_str: str) -> Model:\n    if model_str.lower() in MODEL_MAPPING:\n        return MODEL_MAPPING[model_str.lower()]\n    for model in Model:\n        if model.value.lower() == model_str.lower():\n            return model\n    raise ValueError(f\"Unknown model: {model_str}\")\n\n\ndef print_cache_stats(execution_stats):\n    \"\"\"Print cache-related statistics from execution.\"\"\"\n    print(\"\\n\" + \"=\" * 60)\n    print(\" CACHE STATISTICS & COST ANALYSIS\")\n    print(\"=\" * 60)\n\n    # Token counts are now disjoint:\n    # - input_text_tokens: regular (non-cached) input tokens\n    # - cache_read_tokens: tokens read from cache (hits)\n    # - cache_creation_tokens: tokens written to cache\n    regular_input = execution_stats.input_text_tokens\n    cache_read = execution_stats.cache_read_tokens\n    cache_creation = execution_stats.cache_creation_tokens\n    total_output = 
execution_stats.output_text_tokens\n    total_embedding = execution_stats.embedding_input_tokens\n\n    # Logical total = regular + cache read + cache creation\n    logical_total_input = regular_input + cache_read + cache_creation\n\n    print(f\"{'Metric':<35} | {'Count':<15}\")\n    print(\"-\" * 55)\n    print(f\"{'Logical Total Input Tokens':<35} | {logical_total_input:,}\")\n    print(f\"{'  - Regular Input (full rate)':<35} | {regular_input:,}\")\n    print(f\"{'  - Cache Read (discounted)':<35} | {cache_read:,}\")\n    print(f\"{'  - Cache Creation':<35} | {cache_creation:,}\")\n    print(\"-\" * 55)\n    print(f\"{'Total Output Tokens':<35} | {total_output:,}\")\n    if total_embedding > 0:\n        print(f\"{'Total Embedding Input Tokens':<35} | {total_embedding:,}\")\n    print(\"-\" * 55)\n    print(f\"{'Total Execution Cost':<35} | ${execution_stats.total_execution_cost:.6f}\")\n\n    # Calculate and display cache hit rate\n    # Hit rate = cache_read / (regular_input + cache_read)\n    total_cacheable = regular_input + cache_read\n    if total_cacheable > 0:\n        hit_rate = (cache_read / total_cacheable) * 100\n        print(f\"\\nCache Hit Rate: {hit_rate:.1f}%\")\n\n\ndef main():\n    parser = argparse.ArgumentParser(description=\"Demo showcasing prompt caching in Palimpzest\")\n    parser.add_argument(\"--model\", type=str, default=\"gpt-4o-mini\", help=\"Model to use\")\n    parser.add_argument(\"--num-records\", type=int, default=5, help=\"Number of requests to process\")\n    parser.add_argument(\"--verbose\", action=\"store_true\", help=\"Enable verbose output\")\n    parser.add_argument(\"--profile\", action=\"store_true\", help=\"Save profiling data\")\n\n    args = parser.parse_args()\n    model = get_model_from_string(args.model)\n\n    # Validate env vars (Simplified for brevity)\n    if model.is_provider_openai() and not os.getenv(\"OPENAI_API_KEY\"):\n        print(\"ERROR: OPENAI_API_KEY not set\")\n        return\n    if 
model.is_provider_anthropic() and not os.getenv(\"ANTHROPIC_API_KEY\"):\n        print(\"ERROR: ANTHROPIC_API_KEY not set\")\n        return\n    if (model.is_provider_google_ai_studio() or model.is_provider_vertex_ai()) and not os.getenv(\"GOOGLE_API_KEY\"):\n        print(\"ERROR: GOOGLE_API_KEY not set\")\n        return\n\n    print(\"=\" * 60)\n    print(\" PZ CACHING DEMO: CORPORATE AUDIT\")\n    print(\"=\" * 60)\n    print(f\"Model: {model.value}\")\n    print(\n        f\"Policy Context Size: ~{len(CORPORATE_TRAVEL_POLICY.split())} words (~{int(len(CORPORATE_TRAVEL_POLICY.split()) * 1.3)} tokens)\"\n    )\n\n    # Repeat the request list if user wants more records than we have mocks\n    base_requests = EMPLOYEE_REQUESTS\n    requests = []\n    while len(requests) < args.num_records:\n        requests.extend(base_requests)\n    requests = requests[: args.num_records]\n\n    print(f\"Processing {len(requests)} travel requests...\")\n\n    # Build Plan\n    dataset = TravelRequestDataset(requests)\n\n    # The 'desc' field incorporates the huge CORPORATE_TRAVEL_POLICY string.\n    # This ensures the System Prompt is large (>1024 tokens) and identical for all records.\n    plan = dataset.sem_map(OUTPUT_SCHEMA, desc=TASK_DESC)\n\n    config = pz.QueryProcessorConfig(\n        policy=pz.MaxQuality(),\n        verbose=args.verbose,\n        execution_strategy=\"sequential\",  # Sequential often easier to debug caching behavior initially\n        available_models=[model],\n    )\n\n    start_time = time.time()\n    result = plan.run(config)\n    end_time = time.time()\n\n    # Output Results\n    print(\"\\n\" + \"=\" * 60)\n    print(\" AUDIT RESULTS\")\n    print(\"=\" * 60)\n    for i, record in enumerate(result.data_records):\n        print(f\"\\n[Request {i + 1}]\")\n        print(f\"Status: {record.status}\")\n        print(f\"Violations: {record.violations}\")\n        print(f\"Summary: {record.reimbursable_summary}\")\n\n    
print_cache_stats(result.execution_stats)\n    print(f\"\\nWall Clock Time: {end_time - start_time:.2f}s\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "demos/demo_core.py",
    "content": "#!/usr/bin/env python3\nimport json\nimport os\n\nimport pandas as pd\nfrom tabulate import tabulate\n\nimport palimpzest as pz\nfrom palimpzest.core.elements.groupbysig import GroupBySig\nfrom palimpzest.core.elements.records import DataRecord\n\nsci_paper_cols = [\n    {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the paper. This is a natural language title, not a number or letter.\"},\n    {\"name\": \"publication_year\", \"type\": int, \"desc\": \"The year the paper was published. This is a number.\"},\n    {\"name\": \"author\", \"type\": str, \"desc\": \"The name of the first author of the paper\"},\n    {\"name\": \"institution\", \"type\": str, \"desc\": \"The institution of the first author of the paper\"},\n    {\"name\": \"journal\", \"type\": str, \"desc\": \"The name of the journal the paper was published in\"},\n    {\"name\": \"funding_agency\", \"type\": str, \"desc\": \"The name of the funding agency that supported the research\"},\n]\n\nemail_cols = [\n    {\"name\": \"sender\", \"type\": str, \"desc\": \"The email address of the sender\"},\n    {\"name\": \"subject\", \"type\": str, \"desc\": \"The subject of the email\"},\n]\n\ndog_image_cols = [\n    {\"name\": \"breed\", \"type\": str, \"desc\": \"The breed of the dog\"},\n]\n\ndef build_sci_paper_plan(dataset):\n    \"\"\"Build a plan that extracts bibliographic fields from scientific paper PDFs\"\"\"\n    return pz.PDFFileDataset(id=\"science-papers\", path=dataset).sem_map(sci_paper_cols)\n\ndef build_test_pdf_plan(dataset):\n    \"\"\"This tests whether we can process a PDF file\"\"\"\n    return pz.PDFFileDataset(id=\"pdf-files\", path=dataset)\n\ndef build_mit_battery_paper_plan(dataset):\n    \"\"\"Build a plan that extracts bibliographic fields from paper PDFs and keeps only battery papers from MIT\"\"\"\n    sci_papers = pz.PDFFileDataset(id=\"science-papers\", path=dataset).sem_map(sci_paper_cols)\n    battery_papers = sci_papers.sem_filter(\"The paper is about batteries\")\n    
mit_papers = battery_papers.sem_filter(\"The paper is from MIT\")\n    return mit_papers\n\ndef build_enron_plan(dataset):\n    \"\"\"Build a plan for processing Enron email data\"\"\"\n    return pz.TextFileDataset(id=\"enron-emails\", path=dataset).sem_map(email_cols)\n\ndef compute_enron_stats(dataset):\n    \"\"\"Compute statistics on Enron email data\"\"\"\n    emails = pz.TextFileDataset(id=\"enron-emails\", path=dataset).sem_map(email_cols)\n    subject_line_lengths = emails.sem_map([{\"name\": \"words\", \"type\": int, \"desc\": \"The number of words in the subject field\"}])\n    return subject_line_lengths\n\ndef enron_gby_plan(dataset):\n    \"\"\"Group Enron emails by sender\"\"\"\n    emails = pz.TextFileDataset(id=\"enron-emails\", path=dataset).sem_map(email_cols)\n    ops = [\"count\"]\n    fields = [\"sender\"]\n    groupbyfields = [\"sender\"]\n    gby_desc = GroupBySig(groupbyfields, ops, fields)\n    grouped_emails = emails.groupby(gby_desc)\n    return grouped_emails\n\ndef enron_count_plan(dataset):\n    \"\"\"Count total Enron emails\"\"\"\n    emails = pz.TextFileDataset(id=\"enron-emails\", path=dataset).sem_map(email_cols)\n    ops = [\"count\"]\n    fields = [\"sender\"]\n    groupbyfields = []\n    gby_desc = GroupBySig(groupbyfields, ops, fields)\n    count_emails = emails.groupby(gby_desc)\n    return count_emails\n\ndef enron_average_count_plan(dataset):\n    \"\"\"Calculate average number of emails per sender\"\"\"\n    emails = pz.TextFileDataset(id=\"enron-emails\", path=dataset).sem_map(email_cols)\n    ops = [\"count\"]\n    fields = [\"sender\"]\n    groupbyfields = [\"sender\"]\n    gby_desc = GroupBySig(groupbyfields, ops, fields)\n    grouped_emails = emails.groupby(gby_desc)\n    ops = [\"average\"]\n    fields = [\"count(sender)\"]\n    groupbyfields = []\n    gby_desc = GroupBySig(groupbyfields, ops, fields)\n    average_emails_per_sender = grouped_emails.groupby(gby_desc)\n    return average_emails_per_sender\n\ndef 
enron_limit_plan(dataset, limit=5):\n    \"\"\"Get limited number of Enron emails\"\"\"\n    emails = pz.TextFileDataset(id=\"enron-emails\", path=dataset).sem_map(email_cols)\n    limit_data = emails.limit(limit)\n    return limit_data\n\ndef build_image_plan(dataset):\n    \"\"\"Build a plan for processing dog images\"\"\"\n    images = pz.ImageFileDataset(id=\"dog-images\", path=dataset)\n    filtered_images = images.sem_filter(\"The image contains one or more dogs\")\n    dog_images = filtered_images.sem_map(dog_image_cols)\n    return dog_images\n\ndef build_image_agg_plan(dataset):\n    \"\"\"Build a plan for aggregating dog images by breed\"\"\"\n    images = pz.ImageFileDataset(id=\"dog-images\", path=dataset)\n    filtered_images = images.sem_filter(\"The image contains one or more dogs\")\n    dog_images = filtered_images.sem_map(dog_image_cols)\n    ops = [\"count\"]\n    fields = [\"breed\"]\n    groupbyfields = [\"breed\"]\n    gby_desc = GroupBySig(groupbyfields, ops, fields)\n    grouped_dog_images = dog_images.groupby(gby_desc)\n    return grouped_dog_images\n\ndef build_join_plan(dataset1, dataset2):\n    \"\"\"Build a plan that joins two datasets\"\"\"\n    ds1 = pz.TextFileDataset(id=\"enron-emails\", path=dataset1).sem_map(email_cols)\n    ds2 = pz.TextFileDataset(id=\"other-enron-emails\", path=dataset2).sem_map(email_cols)\n    joined = ds1.sem_join(ds2, condition=\"sender\")\n    return joined\n\ndef build_join_image_plan(dataset1, dataset2):\n    \"\"\"Build a plan that joins two datasets with images\"\"\"\n    ds1 = pz.ImageFileDataset(id=\"dog-images\", path=dataset1).sem_map(dog_image_cols)\n    ds2 = pz.ImageFileDataset(id=\"other-dog-images\", path=dataset2).sem_map(dog_image_cols)\n    joined = ds1.sem_join(ds2, condition=\"breed\")\n    return joined\n\ndef get_task_config(task, dataset, join_dataset=None):\n    \"\"\"Get configuration for a specific task\"\"\"\n    if task == \"paper\":\n        root_set = 
build_mit_battery_paper_plan(dataset)\n        cols = [\"title\", \"publication_year\", \"author\", \"institution\", \"journal\", \"funding_agency\"]\n        stat_path = \"profiling-data/paper-profiling.json\"\n    elif task == \"enron\":\n        root_set = build_enron_plan(dataset)\n        cols = [\"sender\", \"subject\"]\n        stat_path = \"profiling-data/enron-profiling.json\"\n    elif task == \"enronGby\":\n        root_set = enron_gby_plan(dataset)\n        cols = [\"sender\", \"count(sender)\"]\n        stat_path = \"profiling-data/egby-profiling.json\"\n    elif task in (\"enronCount\", \"count\"):\n        root_set = enron_count_plan(dataset)\n        cols = [\"count(sender)\"]\n        stat_path = \"profiling-data/ecount-profiling.json\"\n    elif task in (\"enronAvgCount\", \"average\"):\n        root_set = enron_average_count_plan(dataset)\n        cols = [\"average(count(sender))\"]\n        stat_path = \"profiling-data/e-profiling.json\"\n    elif task == \"enronmap\":\n        root_set = compute_enron_stats(dataset)\n        cols = [\"sender\", \"subject\", \"words\"]\n        stat_path = \"profiling-data/emap-profiling.json\"\n    elif task == \"pdftest\":\n        root_set = build_test_pdf_plan(dataset)\n        cols = [\"filename\"]\n        stat_path = \"profiling-data/pdftest-profiling.json\"\n    elif task == \"scitest\":\n        root_set = build_sci_paper_plan(dataset)\n        cols = [\"title\", \"author\", \"institution\", \"journal\", \"funding_agency\"]\n        stat_path = \"profiling-data/scitest-profiling.json\"\n    elif task == \"image\":\n        root_set = build_image_plan(dataset)\n        cols = None\n        stat_path = \"profiling-data/image-profiling.json\"\n    elif task == \"gbyImage\":\n        root_set = build_image_agg_plan(dataset)\n        cols = [\"breed\", \"count(breed)\"]\n        stat_path = \"profiling-data/gbyImage-profiling.json\"\n    elif task == \"limit\":\n        root_set = enron_limit_plan(dataset, 
5)\n        cols = [\"sender\", \"subject\"]\n        stat_path = \"profiling-data/limit-profiling.json\"\n    elif task == \"join\":\n        root_set = build_join_plan(dataset, join_dataset)\n        cols = [\"filename\", \"sender\", \"subject\"]\n        stat_path = \"profiling-data/join-profiling.json\"\n    elif task == \"joinImage\":\n        root_set = build_join_image_plan(dataset, join_dataset)\n        cols = None\n        stat_path = \"profiling-data/joinImage-profiling.json\"\n    else:\n        raise ValueError(f\"Unknown task: {task}\")\n    \n    return root_set, cols, stat_path\n\ndef execute_task(task, dataset, policy, join_dataset=None, verbose=False, profile=False, execution_strategy=\"sequential\", optimizer_strategy=\"pareto\"):\n    \"\"\"Execute a task and return results\"\"\"\n    root_set, cols, stat_path = get_task_config(task, dataset, join_dataset)\n    config = pz.QueryProcessorConfig(\n        policy=policy,\n        verbose=verbose,\n        execution_strategy=execution_strategy,\n        optimizer_strategy=optimizer_strategy,\n    )\n    data_record_collection = root_set.run(config)\n\n    if profile:\n        os.makedirs(\"profiling-data\", exist_ok=True)\n        with open(stat_path, \"w\") as f:\n            json.dump(data_record_collection.execution_stats.to_json(), f)\n\n    return data_record_collection.data_records, data_record_collection.execution_stats, cols\n\ndef format_results_table(records: list[DataRecord], cols=None):\n    \"\"\"Format records as a table\"\"\"\n    records = [record.to_dict(include_bytes=False) for record in records]\n    records_df = pd.DataFrame(records)\n    print_cols = records_df.columns if cols is None else cols\n    final_df = records_df[print_cols] if not records_df.empty else pd.DataFrame(columns=print_cols)\n    return tabulate(final_df, headers=\"keys\", tablefmt=\"psql\")\n"
  },
  {
    "path": "demos/enron-demo.py",
    "content": "import json\nimport os\n\nimport palimpzest as pz\nfrom palimpzest.core.lib.schemas import TextFile\n\n\nclass EnronValidator(pz.Validator):\n    def __init__(self, labels_file: str):\n        super().__init__()\n\n        self.filename_to_labels = {}\n        if labels_file:\n            with open(labels_file) as f:\n                self.filename_to_labels = json.load(f)\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        filename = input_record[\"filename\"]\n        labels = self.filename_to_labels[filename]\n        if len(labels) == 0:\n            return None\n\n        labels = labels[0]\n        return (float(labels[\"sender\"] == output[\"sender\"]) + float(labels[\"subject\"] == output[\"subject\"])) / 2.0\n\n\nclass EnronDataset(pz.IterDataset):\n    def __init__(self, dir: str, labels_file: str | None = None, split: str = \"test\"):\n        super().__init__(id=\"enron\", schema=TextFile)\n        self.filepaths = [os.path.join(dir, filename) for filename in os.listdir(dir)]\n        self.filepaths = self.filepaths[:50] if split == \"train\" else self.filepaths[50:150]\n        self.filename_to_labels = {}\n        if labels_file:\n            with open(labels_file) as f:\n                self.filename_to_labels = json.load(f)\n\n    def __len__(self):\n        return len(self.filepaths)\n\n    def __getitem__(self, idx: int):\n        # get input fields\n        filepath = self.filepaths[idx]\n        filename = os.path.basename(filepath)\n        with open(filepath) as f:\n            contents = f.read()\n\n        # create item with fields\n        item = {\"filename\": filename, \"contents\": contents}\n\n        return item\n\n\nif __name__ == \"__main__\":\n    # create validator and train_dataset\n    validator = EnronValidator(labels_file=\"testdata/enron-eval-medium-labels.json\")\n    train_dataset = EnronDataset(dir=\"testdata/enron-eval-medium\", split=\"train\")\n\n    
# construct plan\n    plan = EnronDataset(dir=\"testdata/enron-eval-medium\", split=\"test\")\n    plan = plan.sem_map([\n        {\"name\": \"subject\", \"type\": str, \"desc\": \"The subject of the email\"},\n        {\"name\": \"sender\", \"type\": str, \"desc\": \"The email address of the email's sender\"},\n    ])\n    plan = plan.sem_filter(\n        'The email refers to a fraudulent scheme (i.e., \"Raptor\", \"Deathstar\", \"Chewco\", and/or \"Fat Boy\")',\n        depends_on=[\"contents\"],\n    )\n    plan = plan.sem_filter(\n        \"The email is not quoting from a news article or an article written by someone outside of Enron\",\n        depends_on=[\"contents\"],\n    )\n\n    # execute pz plan\n    config = pz.QueryProcessorConfig(\n        policy=pz.MaxQuality(),\n        execution_strategy=\"parallel\",\n        k=5,\n        j=6,\n        sample_budget=100,\n        max_workers=20,\n        progress=True,\n    )\n    output = plan.optimize_and_run(train_dataset=train_dataset, validator=validator, config=config)\n\n    # print output dataframe\n    print(output.to_df())\n\n    # print precision and recall\n    with open(\"testdata/enron-eval-medium-labels.json\") as f:\n        filename_to_labels = json.load(f)\n        test_filenames = os.listdir(\"testdata/enron-eval-medium\")[50:150]\n        filename_to_labels = {k: v for k, v in filename_to_labels.items() if k in test_filenames}\n\n    target_filenames = set(filename for filename, labels in filename_to_labels.items() if labels != [])\n    pred_filenames = set(output.to_df()[\"filename\"])\n    tp = sum(filename in target_filenames for filename in pred_filenames)\n    fp = len(pred_filenames) - tp\n    fn = len(target_filenames) - tp\n\n    print(f\"PRECISION: {tp/(tp + fp) if tp + fp > 0 else 0.0:.3f}\")\n    print(f\"RECALL: {tp/(tp + fn) if tp + fn > 0 else 0.0:.3f}\")\n"
  },
  {
    "path": "demos/image-demo.py",
    "content": "#!/usr/bin/env python3\n\"\"\"This script is a demo for image processing; it is simply an abridged version of simpleDemo.py\"\"\"\n\nimport os\nimport time\n\nimport gradio as gr\nimport numpy as np\nfrom PIL import Image\n\nimport palimpzest as pz\n\nif not os.environ.get(\"OPENAI_API_KEY\"):\n    from palimpzest.utils.env_helpers import load_env\n\n    load_env()\n\ndog_image_cols = [\n    {\"name\": \"breed\", \"type\": str, \"desc\": \"The breed of the dog\"},\n]\n\ndef build_image_plan(dataset):\n    images = pz.ImageFileDataset(id=\"images\", path=dataset)\n    filtered_images = images.sem_filter(\"The image contains one or more dogs\")\n    dog_images = filtered_images.sem_map(dog_image_cols)\n    return dog_images\n\n\nif __name__ == \"__main__\":\n    # start the timer and warn if no API key is set\n    start_time = time.time()\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    print(\"Starting image task\")\n    policy = pz.MaxQuality()\n    plan = build_image_plan(\"testdata/images-tiny\")\n    config = pz.QueryProcessorConfig(policy=policy)\n    data_record_collection = plan.run(config)\n\n    imgs, breeds = [], []\n    for record in data_record_collection:\n        print(\"Trying to open \", record.filename)\n        path = os.path.join(\"testdata/images-tiny/\", record.filename)\n        img = Image.open(path).resize((128, 128))\n        img_arr = np.asarray(img)\n        imgs.append(img_arr)\n        breeds.append(record.breed)\n\n    with gr.Blocks() as demo:\n        img_blocks, breed_blocks = [], []\n        for img, breed in zip(imgs, breeds):\n            with gr.Row():\n                with gr.Column():\n                    img_blocks.append(gr.Image(value=img))\n                with gr.Column():\n                    breed_blocks.append(gr.Textbox(value=breed))\n\n        
plan_str = list(data_record_collection.execution_stats.plan_strs.values())[0]\n        gr.Textbox(value=plan_str, info=\"Query Plan\")\n\n    end_time = time.time()\n    print(\"Elapsed time:\", end_time - start_time)\n    demo.launch()\n"
  },
  {
    "path": "demos/join-data/animal-texts/animal1.txt",
    "content": "The quick red fox jumped over the fence.\n"
  },
  {
    "path": "demos/join-data/animal-texts/animal2.txt",
    "content": "The black dog sat next to the bed.\n"
  },
  {
    "path": "demos/join-data/animal-texts/animal3.txt",
    "content": "The white polar bear swam gently in the ocean.\n"
  },
  {
    "path": "demos/join-data/animal-texts/animal4.txt",
    "content": "The labrador swam in the lake, the sun glistening off its shiny black coat.\n"
  },
  {
    "path": "demos/join-data/animal-texts/animal5.txt",
    "content": "Clifford was a big red dog.\n"
  },
  {
    "path": "demos/join-data/animal-texts/animal6.txt",
    "content": "The elephant was wise and grey.\n"
  },
  {
    "path": "demos/join-demo.py",
    "content": "import argparse\n\nimport palimpzest as pz\n\n# define columns for datasets\ntext_animal_cols = [\n    {\"name\": \"animal\", \"type\": str, \"desc\": \"The type of animal mentioned in the text\"},\n    {\"name\": \"color\", \"type\": str, \"desc\": \"The color of the animal mentioned in the text\"},\n]\nimage_animal_cols = [\n    {\"name\": \"animal\", \"type\": str, \"desc\": \"The type of animal in the image\"},\n    {\"name\": \"color\", \"type\": str, \"desc\": \"The color of the animal in the image\"},\n]\n\n# query plans\ndef run_text_join():\n    \"\"\"Build a plan that joins two datasets\"\"\"\n    ds1 = pz.TextFileDataset(id=\"animals1\", path=\"join-data/animal-texts/\").sem_map(text_animal_cols)\n    ds2 = pz.TextFileDataset(id=\"animals2\", path=\"join-data/animal-texts/\").sem_map(text_animal_cols)\n    ds3 = ds1.sem_join(ds2, condition=\"both animals are canines with the same color\")\n    config = pz.QueryProcessorConfig(\n        policy=pz.MaxQuality(),\n        execution_strategy=\"parallel\",\n        join_parallelism=64,\n    )\n    data_record_collection = ds3.run(config)\n    print(data_record_collection.to_df())\n\n\ndef run_image_join():\n    \"\"\"Build a plan that joins two datasets with images\"\"\"\n    ds1 = pz.ImageFileDataset(id=\"animals1\", path=\"join-data/animal-images/\").sem_map(image_animal_cols)\n    ds2 = pz.ImageFileDataset(id=\"animals2\", path=\"join-data/animal-images/\").sem_map(image_animal_cols)\n    ds3 = ds1.sem_join(ds2, condition=\"both animals are canines with the same color\")\n    config = pz.QueryProcessorConfig(\n        policy=pz.MaxQuality(),\n        execution_strategy=\"parallel\",\n        join_parallelism=64,\n    )\n    data_record_collection = ds3.run(config)\n    print(data_record_collection.to_df())\n\n\ndef run_text_image_join():\n    \"\"\"Build a plan that joins a dataset with text to a dataset with images\"\"\"\n    ds1 = pz.TextFileDataset(id=\"animals1\", 
path=\"join-data/animal-texts/\").sem_map(text_animal_cols)\n    ds2 = pz.ImageFileDataset(id=\"animals2\", path=\"join-data/animal-images/\").sem_map(image_animal_cols)\n    ds3 = ds1.sem_join(ds2, condition=\"both animals are canines with the same color\")\n    config = pz.QueryProcessorConfig(\n        policy=pz.MaxQuality(),\n        execution_strategy=\"parallel\",\n        join_parallelism=64,\n    )\n    data_record_collection = ds3.run(config)\n    print(data_record_collection.to_df())\n\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser(description=\"Run the Palimpzest join demo.\")\n    parser.add_argument(\"--task\", type=str, help=\"Which join demo to run\")\n    args = parser.parse_args()\n\n    if args.task == \"text-join\":\n        run_text_join()\n    elif args.task == \"image-join\":\n        run_image_join()\n    elif args.task == \"text-image-join\":\n        run_text_image_join()\n    else:\n        print(\"Please provide a valid task: one of 'text-join', 'image-join', 'text-image-join'\")\n        exit(1)\n"
  },
  {
    "path": "demos/paper-demo.py",
    "content": "import argparse\nimport json\nimport os\n\nimport gradio as gr\nimport numpy as np\nimport pandas as pd\nfrom PIL import Image\n\nimport palimpzest as pz\nfrom palimpzest.core.lib.schemas import ImageFilepath\nfrom palimpzest.utils.udfs import xls_to_tables\n\n\ndef print_table(records, cols=None, plan_str=None):\n    \"\"\"Helper function to print execution results using Gradio\"\"\"\n    if len(records) == 0:\n        print(\"No records met search criteria\")\n        return\n\n    records = [record.to_dict() for record in records]\n    records_df = pd.DataFrame(records)\n    print_cols = records_df.columns if cols is None else cols\n\n    with gr.Blocks() as demo:\n        gr.Dataframe(records_df[print_cols])\n\n        if plan_str is not None:\n            gr.Textbox(value=plan_str, info=\"Physical Plan\")\n\n    demo.launch()\n\n\n# Addresses far from MIT; we use a simple lookup like this to make the\n# experiments reproducible w/out needing a Google API key for geocoding lookups\nFAR_AWAY_ADDRS = [\n    \"Melcher St\",\n    \"Sleeper St\",\n    \"437 D St\",\n    \"Seaport Blvd\",\n    \"50 Liberty Dr\",\n    \"Telegraph St\",\n    \"Columbia Rd\",\n    \"E 6th St\",\n    \"E 7th St\",\n    \"E 5th St\",\n]\n\n\ndef within_two_miles_of_mit(record: dict):\n    # NOTE: I'm using this hard-coded function so that folks w/out a\n    #       Geocoding API key from Google can still run this example\n    try:\n        return not any([street.lower() in record[\"address\"].lower() for street in FAR_AWAY_ADDRS])\n    except Exception:\n        return False\n\n\ndef in_price_range(record: dict):\n    try:\n        price = record[\"price\"]\n        if isinstance(price, str):\n            price = price.strip()\n            price = int(price.replace(\"$\", \"\").replace(\",\", \"\"))\n        return 6e5 < price <= 2e6\n    except Exception:\n        return False\n\nemail_cols = [\n    {\"name\": \"sender\", \"type\": str, \"desc\": \"The email address of 
the sender\"},\n    {\"name\": \"subject\", \"type\": str, \"desc\": \"The subject of the email\"},\n]\n\ncase_data_cols = [\n    {\"name\": \"case_submitter_id\", \"type\": str, \"desc\": \"The ID of the case\"},\n    {\"name\": \"age_at_diagnosis\", \"type\": int | float, \"desc\": \"The age of the patient at the time of diagnosis\"},\n    {\"name\": \"race\", \"type\": str, \"desc\": \"An arbitrary classification of a taxonomic group that is a division of a species.\"},\n    {\"name\": \"ethnicity\", \"type\": str, \"desc\": \"Whether an individual describes themselves as Hispanic or Latino or not.\"},\n    {\"name\": \"gender\", \"type\": str, \"desc\": \"Text designations that identify gender.\"},\n    {\"name\": \"vital_status\", \"type\": str, \"desc\": \"The vital status of the patient\"},\n    {\"name\": \"ajcc_pathologic_t\", \"type\": str, \"desc\": \"Code of pathological T (primary tumor) to define the size or contiguous extension of the primary tumor (T), using staging criteria from the American Joint Committee on Cancer (AJCC).\"},\n    {\"name\": \"ajcc_pathologic_n\", \"type\": str, \"desc\": \"The codes that represent the stage of cancer based on the nodes present (N stage) according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.\"},\n    {\"name\": \"ajcc_pathologic_stage\", \"type\": str, \"desc\": \"The extent of a cancer, especially whether the disease has spread from the original site to other parts of the body based on AJCC staging criteria.\"},\n    {\"name\": \"tumor_grade\", \"type\": int | float, \"desc\": \"Numeric value to express the degree of abnormality of cancer cells, a measure of differentiation and aggressiveness.\"},\n    {\"name\": \"tumor_focality\", \"type\": str, \"desc\": \"The text term used to describe whether the patient's disease originated in a single location or multiple locations.\"},\n    {\"name\": \"tumor_largest_dimension_diameter\", \"type\": int | float, \"desc\": \"The tumor 
largest dimension diameter.\"},\n    {\"name\": \"primary_diagnosis\", \"type\": str, \"desc\": \"Text term used to describe the patient's histologic diagnosis, as described by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O).\"},\n    {\"name\": \"morphology\", \"type\": str, \"desc\": \"The Morphological code of the tumor, as described by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O).\"},\n    {\"name\": \"tissue_or_organ_of_origin\", \"type\": str, \"desc\": \"The text term used to describe the anatomic site of origin, of the patient's malignant disease, as described by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O).\"},\n    {\"name\": \"study\", \"type\": str, \"desc\": \"The last name of the author of the study, from the table name\"},\n    {\"name\": \"filename\", \"type\": str, \"desc\": \"The name of the file the record was extracted from\"}\n]\n\nreal_estate_listing_cols = [\n    {\"name\": \"listing\", \"type\": str, \"desc\": \"The name of the listing\"},\n    {\"name\": \"text_content\", \"type\": str, \"desc\": \"The content of the listing's text description\"},\n    {\"name\": \"image_filepaths\", \"type\": list[ImageFilepath], \"desc\": \"A list of the filepaths for each image of the listing\"},\n]\n\nreal_estate_text_cols = [\n    {\"name\": \"address\", \"type\": str, \"desc\": \"The address of the property\"},\n    {\"name\": \"price\", \"type\": int | float, \"desc\": \"The listed price of the property\"},\n]\n\nreal_estate_image_cols = [\n    {\"name\": \"is_modern_and_attractive\", \"type\": bool, \"desc\": \"True if the home interior design is modern and attractive and False otherwise\"},\n    {\"name\": \"has_natural_sunlight\", \"type\": bool, \"desc\": \"True if the home interior has lots of natural sunlight and False otherwise\"},\n]\n\ntable_cols = [\n    {\"name\": \"rows\", 
\"type\": list[str], \"desc\": \"The rows of the table\"},\n    {\"name\": \"header\", \"type\": list[str], \"desc\": \"The header of the table\"},\n    {\"name\": \"name\", \"type\": str, \"desc\": \"The name of the table\"},\n    {\"name\": \"filename\", \"type\": str, \"desc\": \"The name of the file the table was extracted from\"}\n]\n\n\nclass RealEstateListingDataset(pz.IterDataset):\n    def __init__(self, listings_dir):\n        super().__init__(id=\"real-estate\", schema=real_estate_listing_cols)\n        self.listings_dir = listings_dir\n        self.listings = sorted(os.listdir(self.listings_dir))\n        self.listings = [file for file in self.listings if not file.startswith(\".\")]\n\n    def __len__(self):\n        return len(self.listings)\n\n    def __getitem__(self, idx: int):\n        # get listing\n        listing = self.listings[idx]\n\n        # get fields\n        image_filepaths, text_content = [], None\n        listing_dir = os.path.join(self.listings_dir, listing)\n        for file in os.listdir(listing_dir):\n            if file.endswith(\".txt\"):\n                with open(os.path.join(listing_dir, file), \"rb\") as f:\n                    text_content = f.read().decode(\"utf-8\")\n            elif file.endswith(\".png\"):\n                image_filepaths.append(os.path.join(listing_dir, file))\n\n        # construct and return dictionary with fields\n        return {\"listing\": listing, \"text_content\": text_content, \"image_filepaths\": image_filepaths}\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    parser.add_argument(\"--viz\", default=False, action=\"store_true\", help=\"Visualize output in Gradio\")\n    parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\"--profile\", default=False, action=\"store_true\", help=\"Profile execution\")\n    
parser.add_argument(\"--dataset\", type=str, help=\"The path to the dataset\")\n    parser.add_argument(\n        \"--workload\", type=str, help=\"The workload to run. One of enron, real-estate, medical-schema-matching.\"\n    )\n    parser.add_argument(\n        \"--executor\",\n        type=str,\n        help=\"The plan executor to use. One of sequential, pipelined, parallel\",\n        default=\"parallel\",\n    )\n    parser.add_argument(\n        \"--policy\",\n        type=str,\n        help=\"One of 'mincost', 'mintime', 'maxquality'\",\n        default=\"maxquality\",\n    )\n\n    args = parser.parse_args()\n\n    # The user has to indicate the dataset id and the workload\n    if args.dataset is None:\n        print(\"Please provide a dataset id\")\n        exit(1)\n    if args.workload is None:\n        print(\"Please provide a workload\")\n        exit(1)\n\n    # create directory for profiling data\n    if args.profile:\n        os.makedirs(\"profiling-data\", exist_ok=True)\n\n    dataset = args.dataset\n    workload = args.workload\n    visualize = args.viz\n    verbose = args.verbose\n    profile = args.profile\n    policy = pz.MaxQuality()\n    if args.policy == \"mincost\":\n        policy = pz.MinCost()\n    elif args.policy == \"mintime\":\n        policy = pz.MinTime()\n    elif args.policy == \"maxquality\":\n        policy = pz.MaxQuality()\n    else:\n        print(\"Policy not supported for this demo\")\n        exit(1)\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # create pz plan\n    if workload == \"enron\":\n        plan = pz.TextFileDataset(id=\"enron\", path=dataset)\n        plan = plan.sem_map(email_cols)\n        plan = plan.sem_filter(\n            \"The email is not quoting from a news article or an article written by someone outside of 
Enron\",\n            depends_on=[\"contents\"],\n        )\n        plan = plan.sem_filter(\n            'The email refers to a fraudulent scheme (i.e., \"Raptor\", \"Deathstar\", \"Chewco\", and/or \"Fat Boy\")',\n            depends_on=[\"contents\"],\n        )\n\n    elif workload == \"real-estate\":\n        plan = RealEstateListingDataset(dataset)\n        plan = plan.sem_map(real_estate_text_cols, depends_on=\"text_content\")\n        plan = plan.sem_map(real_estate_image_cols, depends_on=\"image_filepaths\")\n        plan = plan.sem_filter(\n            \"The interior is modern and attractive, and has lots of natural sunlight\",\n            depends_on=[\"is_modern_and_attractive\", \"has_natural_sunlight\"],\n        )\n        plan = plan.filter(within_two_miles_of_mit, depends_on=\"address\")\n        plan = plan.filter(in_price_range, depends_on=\"price\")\n\n    elif workload == \"medical-schema-matching\":\n        plan = dataset.add_columns(xls_to_tables, cols=table_cols, cardinality=pz.Cardinality.ONE_TO_MANY)\n        plan = plan.sem_filter(\"The rows of the table contain the patient age\")\n        plan = plan.sem_map(case_data_cols, cardinality=pz.Cardinality.ONE_TO_MANY)\n\n    # construct config and run plan\n    config = pz.QueryProcessorConfig(\n        verbose=verbose,\n        policy=policy,\n        execution_strategy=args.executor,\n    )\n    data_record_collection = plan.run(config)\n    print(data_record_collection.to_df())\n\n    # save statistics\n    if profile:\n        stats_path = f\"profiling-data/{workload}-profiling.json\"\n        execution_stats_dict = data_record_collection.execution_stats.to_json()\n        with open(stats_path, \"w\") as f:\n            json.dump(execution_stats_dict, f)\n\n    # visualize output in Gradio\n    if visualize:\n        plan_str = list(data_record_collection.execution_stats.plan_strs.values())[-1]\n        if workload == \"enron\":\n            
print_table(data_record_collection.data_records, cols=[\"sender\", \"subject\"], plan_str=plan_str)\n\n        elif workload == \"real-estate\":\n            fst_imgs, snd_imgs, thrd_imgs, addrs, prices = [], [], [], [], []\n            for record in data_record_collection:\n                addrs.append(record.address)\n                prices.append(record.price)\n                for idx, img_name in enumerate([\"img1.png\", \"img2.png\", \"img3.png\"]):\n                    path = os.path.join(dataset, record.listing, img_name)\n                    img = Image.open(path)\n                    img_arr = np.asarray(img)\n                    if idx == 0:\n                        fst_imgs.append(img_arr)\n                    elif idx == 1:\n                        snd_imgs.append(img_arr)\n                    elif idx == 2:\n                        thrd_imgs.append(img_arr)\n\n            with gr.Blocks() as demo:\n                fst_img_blocks, snd_img_blocks, thrd_img_blocks, addr_blocks, price_blocks = [], [], [], [], []\n                for fst_img, snd_img, thrd_img, addr, price in zip(fst_imgs, snd_imgs, thrd_imgs, addrs, prices):\n                    with gr.Row(equal_height=True):\n                        with gr.Column():\n                            fst_img_blocks.append(gr.Image(value=fst_img))\n                        with gr.Column():\n                            snd_img_blocks.append(gr.Image(value=snd_img))\n                        with gr.Column():\n                            thrd_img_blocks.append(gr.Image(value=thrd_img))\n                    with gr.Row():\n                        with gr.Column():\n                            addr_blocks.append(gr.Textbox(value=addr, info=\"Address\"))\n                        with gr.Column():\n                            price_blocks.append(gr.Textbox(value=price, info=\"Price\"))\n\n                plan_str = list(data_record_collection.execution_stats.plan_strs.values())[0]\n                
gr.Textbox(value=plan_str, info=\"Query Plan\")\n\n            demo.launch()\n"
  },
  {
    "path": "demos/real-estate-demo.py",
    "content": "import argparse\nimport os\n\nimport gradio as gr\nimport numpy as np\nimport pandas as pd\nfrom PIL import Image\n\nimport palimpzest as pz\nfrom palimpzest.core.lib.schemas import ImageFilepath\n\n\ndef print_table(records, cols=None, plan_str=None):\n    \"\"\"Helper function to print execution results using Gradio\"\"\"\n    if len(records) == 0:\n        print(\"No records met search criteria\")\n        return\n\n    records = [record.to_dict() for record in records]\n    records_df = pd.DataFrame(records)\n    print_cols = records_df.columns if cols is None else cols\n\n    with gr.Blocks() as demo:\n        gr.Dataframe(records_df[print_cols])\n\n        if plan_str is not None:\n            gr.Textbox(value=plan_str, info=\"Physical Plan\")\n\n    demo.launch()\n\n\n# Addresses far from MIT; we use a simple lookup like this to make the\n# experiments reproducible w/out needing a Google API key for geocoding lookups\nFAR_AWAY_ADDRS = [\n    \"Melcher St\",\n    \"Sleeper St\",\n    \"437 D St\",\n    \"Seaport Blvd\",\n    \"50 Liberty Dr\",\n    \"Telegraph St\",\n    \"Columbia Rd\",\n    \"E 6th St\",\n    \"E 7th St\",\n    \"E 5th St\",\n]\n\n\ndef within_two_miles_of_mit(record: dict):\n    # NOTE: I'm using this hard-coded function so that folks w/out a\n    #       Geocoding API key from Google can still run this example\n    try:\n        return not any([street.lower() in record[\"address\"].lower() for street in FAR_AWAY_ADDRS])\n    except Exception:\n        return False\n\n\ndef in_price_range(record: dict):\n    try:\n        price = record[\"price\"]\n        if isinstance(price, str):\n            price = price.strip()\n            price = int(price.replace(\"$\", \"\").replace(\",\", \"\"))\n        return 6e5 < price <= 2e6\n    except Exception:\n        return False\n\nreal_estate_listing_cols = [\n    {\"name\": \"listing\", \"type\": str, \"desc\": \"The name of the listing\"},\n    {\"name\": \"text_content\", 
\"type\": str, \"desc\": \"The content of the listing's text description\"},\n    {\"name\": \"image_filepaths\", \"type\": list[ImageFilepath], \"desc\": \"A list of the filepaths for each image of the listing\"},\n]\n\nreal_estate_text_cols = [\n    {\"name\": \"address\", \"type\": str, \"desc\": \"The address of the property\"},\n    {\"name\": \"price\", \"type\": int | float, \"desc\": \"The listed price of the property\"},\n]\n\nreal_estate_image_cols = [\n    {\"name\": \"is_modern_and_attractive\", \"type\": bool, \"desc\": \"True if the home interior design is modern and attractive and False otherwise\"},\n    {\"name\": \"has_natural_sunlight\", \"type\": bool, \"desc\": \"True if the home interior has lots of natural sunlight and False otherwise\"},\n]\n\n# class RealEstateValidator(pz.Validator):\n#     def __init__(self, labels_file: str):\n#         super().__init__()\n#         with open(labels_file) as f:\n#             self.filename_to_labels = json.load(f)\n\n#     def filter_score_fn(self, filter_str: str, input_record: dict, output: bool) -> float | None:\n#         filename = input_record[\"filename\"]\n#         labels = self.filename_to_labels[filename]\n#         if labels is None:\n#             return None\n\n#         if \"business transactions\" in filter_str:\n#             return float(labels[\"mentions_transaction\"] == output)\n#         elif \"first-hand discussion\" in filter_str:\n#             return float(labels[\"firsthand_discussion\"] == output)\n#         else:\n#             return None\n\n#     def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n#         # NOTE: we score the map based on the sender and subject fields only, as summary is too subjective;\n#         #       we could also use an LLM judge within this function to score the summary field if desired\n#         filename = input_record[\"filename\"]\n#         labels = self.filename_to_labels[filename]\n#         if 
labels is None:\n#             return None\n\n#         return (float(labels[\"sender\"] == output[\"sender\"]) + float(labels[\"subject\"] == output[\"subject\"])) / 2.0\n\n\nclass RealEstateListingDataset(pz.IterDataset):\n    def __init__(self, listings_dir):\n        super().__init__(id=\"real-estate\", schema=real_estate_listing_cols)\n        self.listings_dir = listings_dir\n        self.listings = sorted(os.listdir(self.listings_dir))\n        self.listings = [file for file in self.listings if not file.startswith(\".\")]\n\n    def __len__(self):\n        return len(self.listings)\n\n    def __getitem__(self, idx: int):\n        # get listing\n        listing = self.listings[idx]\n\n        # get fields\n        image_filepaths, text_content = [], None\n        listing_dir = os.path.join(self.listings_dir, listing)\n        for file in os.listdir(listing_dir):\n            if file.endswith(\".txt\"):\n                with open(os.path.join(listing_dir, file), \"rb\") as f:\n                    text_content = f.read().decode(\"utf-8\")\n            elif file.endswith(\".png\"):\n                image_filepaths.append(os.path.join(listing_dir, file))\n\n        # construct and return dictionary with fields\n        return {\"listing\": listing, \"text_content\": text_content, \"image_filepaths\": image_filepaths}\n\n\nif __name__ == \"__main__\":\n    # parse arguments\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    parser.add_argument(\"--viz\", default=False, action=\"store_true\", help=\"Visualize output in Gradio\")\n    parser.add_argument(\"--dataset\", type=str, help=\"The path to the dataset\")\n    parser.add_argument(\n        \"--policy\",\n        type=str,\n        help=\"One of 'mincost', 'mintime', 'maxquality'\",\n        default=\"maxquality\",\n    )\n\n    args = parser.parse_args()\n\n    # The user has to indicate the dataset id and the workload\n    if args.dataset is None:\n        print(\"Please provide a 
dataset id\")\n        exit(1)\n\n    dataset = args.dataset\n    visualize = args.viz\n    policy = pz.MaxQuality()\n    if args.policy == \"mincost\":\n        policy = pz.MinCost()\n    elif args.policy == \"mintime\":\n        policy = pz.MinTime()\n    elif args.policy == \"maxquality\":\n        policy = pz.MaxQuality()\n    else:\n        print(\"Policy not supported for this demo\")\n        exit(1)\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # create pz plan\n    plan = RealEstateListingDataset(dataset)\n    plan = plan.sem_map(real_estate_text_cols, depends_on=\"text_content\")\n    plan = plan.sem_map(real_estate_image_cols, depends_on=\"image_filepaths\")\n    plan = plan.sem_filter(\n        \"The interior is modern and attractive, and has lots of natural sunlight\",\n        depends_on=[\"is_modern_and_attractive\", \"has_natural_sunlight\"],\n    )\n    plan = plan.filter(within_two_miles_of_mit, depends_on=\"address\")\n    plan = plan.filter(in_price_range, depends_on=\"price\")\n\n    # construct config and run plan\n    config = pz.QueryProcessorConfig(\n        policy=policy,\n        available_models=[pz.Model.GPT_5_MINI],\n        k=6,\n        j=6,\n        sample_budget=125,\n    )\n    data_record_collection = plan.optimize_and_run(config, validator=pz.Validator(model=pz.Model.o4_MINI))\n    print(data_record_collection.to_df())\n\n    # preds = data_record_collection.to_df()[\"listing\"].tolist()\n    # gt_df = pd.read_csv(\"testdata/groundtruth/real-estate-eval-100.csv\")\n    # labels = gt_df.listing.tolist()\n    # tp, fp, fn = 0, 0, 0\n    # for pred in preds:\n    #     if pred in labels:\n    #         tp += 1\n    #     else:\n    #         fp += 1\n    # for label in labels:\n    #     if label not in preds:\n    #         fn += 1\n    # 
precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0\n    # recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0\n    # f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0.0\n    # print(f\"Precision: {precision:.2f}\")\n    # print(f\"Recall: {recall:.4f}\")\n    # print(f\"F1: {f1:.4f}\")\n\n    # visualize output in Gradio\n    if visualize:\n        plan_str = list(data_record_collection.execution_stats.plan_strs.values())[-1]\n        fst_imgs, snd_imgs, thrd_imgs, addrs, prices = [], [], [], [], []\n        for record in data_record_collection:\n            addrs.append(record.address)\n            prices.append(record.price)\n            for idx, img_name in enumerate([\"img1.png\", \"img2.png\", \"img3.png\"]):\n                path = os.path.join(dataset, record.listing, img_name)\n                img = Image.open(path)\n                img_arr = np.asarray(img)\n                if idx == 0:\n                    fst_imgs.append(img_arr)\n                elif idx == 1:\n                    snd_imgs.append(img_arr)\n                elif idx == 2:\n                    thrd_imgs.append(img_arr)\n\n        with gr.Blocks() as demo:\n            fst_img_blocks, snd_img_blocks, thrd_img_blocks, addr_blocks, price_blocks = [], [], [], [], []\n            for fst_img, snd_img, thrd_img, addr, price in zip(fst_imgs, snd_imgs, thrd_imgs, addrs, prices):\n                with gr.Row(equal_height=True):\n                    with gr.Column():\n                        fst_img_blocks.append(gr.Image(value=fst_img))\n                    with gr.Column():\n                        snd_img_blocks.append(gr.Image(value=snd_img))\n                    with gr.Column():\n                        thrd_img_blocks.append(gr.Image(value=thrd_img))\n                with gr.Row():\n                    with gr.Column():\n                        addr_blocks.append(gr.Textbox(value=addr, info=\"Address\"))\n                    with gr.Column():\n            
            price_blocks.append(gr.Textbox(value=price, info=\"Price\"))\n\n            plan_str = list(data_record_collection.execution_stats.plan_strs.values())[0]\n            gr.Textbox(value=plan_str, info=\"Query Plan\")\n\n        demo.launch()\n"
  },
  {
    "path": "demos/simple-demo.py",
    "content": "#!/usr/bin/env python3\nimport argparse\nimport os\nimport time\n\nfrom demo_core import execute_task, format_results_table\nfrom dotenv import load_dotenv\n\nimport palimpzest as pz\n\nload_dotenv()\n\ndef main():\n    # parse arguments\n    start_time = time.time()\n    parser = argparse.ArgumentParser(description=\"Run a simple demo\")\n    parser.add_argument(\"--verbose\", default=False, action=\"store_true\", help=\"Print verbose output\")\n    parser.add_argument(\"--profile\", default=False, action=\"store_true\", help=\"Profile execution\")\n    parser.add_argument(\"--dataset\", type=str, help=\"Path to the dataset\")\n    parser.add_argument(\"--join-dataset\", type=str, help=\"Path to the join dataset (if needed)\", default=None)\n    parser.add_argument(\"--task\", type=str, help=\"The task to run\")\n    parser.add_argument(\n        \"--execution-strategy\",\n        type=str,\n        help=\"The execution strategy to use. One of sequential, pipelined, parallel\",\n        default=\"sequential\",\n    )\n    parser.add_argument(\n        \"--policy\",\n        type=str,\n        help=\"One of 'mincost', 'mintime', 'maxquality'\",\n        default=\"maxquality\",\n    )\n\n    args = parser.parse_args()\n\n    # The user has to indicate the dataset and the task\n    if args.dataset is None:\n        print(\"Please provide a path to the dataset\")\n        exit(1)\n    if args.task is None:\n        print(\"Please provide a task\")\n        exit(1)\n\n    # Set up execution parameters\n    dataset = args.dataset\n    join_dataset = args.join_dataset\n    task = args.task\n    verbose = args.verbose\n    profile = args.profile\n\n    # Set policy\n    policy = pz.MaxQuality()\n    if args.policy == \"mincost\":\n        policy = pz.MinCost()\n    elif args.policy == \"mintime\":\n        policy = pz.MinTime()\n    elif args.policy == \"maxquality\":\n        policy = pz.MaxQuality()\n    else:\n        print(\"Policy not supported for 
this demo\")\n        exit(1)\n\n    if os.getenv(\"OPENAI_API_KEY\") is None and os.getenv(\"TOGETHER_API_KEY\") is None and os.getenv(\"ANTHROPIC_API_KEY\") is None:\n        print(\"WARNING: OPENAI_API_KEY, TOGETHER_API_KEY, and ANTHROPIC_API_KEY are unset\")\n\n    # Execute task\n    records, execution_stats, cols = execute_task(\n        task=task,\n        dataset=dataset,\n        policy=policy,\n        join_dataset=join_dataset,\n        verbose=verbose,\n        profile=profile,\n        execution_strategy=args.execution_strategy\n    )\n\n    # Print results\n    print(f\"Policy is: {str(policy)}\")\n    print(\"Executed plan:\")\n    plan_str = list(execution_stats.plan_strs.values())[0]\n    print(plan_str)\n    end_time = time.time()\n    print(\"Elapsed time:\", end_time - start_time)\n\n    print(format_results_table(records, cols=cols))\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "demos/vllm-demo.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nMinimal demo for running a vLLM model with Palimpzest.\n\nPrerequisites:\n  1. Start a vLLM server serving a small model, e.g.:\n     vllm serve Qwen/Qwen2.5-1.5B-Instruct --port 8000\n  2. Run this script:\n     python demos/vllm-demo.py \\\n       --api-base http://localhost:8000/v1 \\\n       --model-id openai/Qwen/Qwen2.5-1.5B-Instruct\n\"\"\"\nimport argparse\nimport os\n\nfrom pydantic import BaseModel, Field\n\nimport palimpzest as pz\n\n\nclass SentimentResult(BaseModel):\n    sentiment: str = Field(description=\"The sentiment of the text: positive, negative, or neutral\")\n\n\ndef main():\n    parser = argparse.ArgumentParser(description=\"Run a minimal vLLM demo\")\n    parser.add_argument(\"--api-base\", type=str, required=True, help=\"vLLM server base URL (e.g. http://localhost:8000/v1)\")\n    parser.add_argument(\"--model-id\", type=str, required=True, help=\"Model ID for litellm (e.g. openai/Qwen/Qwen2.5-1.5B-Instruct)\")\n    parser.add_argument(\"--max-tokens\", type=int, default=128, help=\"Max tokens for completion\")\n    parser.add_argument(\"--verbose\", action=\"store_true\", default=False)\n    args = parser.parse_args()\n\n    # Create the vLLM model with api_base and kwargs on the Model instance\n    vllm_model = pz.Model(args.model_id, api_base=args.api_base, max_tokens=args.max_tokens)\n\n    # Load the enron-tiny dataset\n    data_path = os.path.join(os.path.dirname(__file__), \"..\", \"testdata\", \"enron-tiny\")\n    dataset = pz.TextFileDataset(id=\"test-sentiment\", path=data_path)\n    dataset = dataset.sem_map(SentimentResult, desc=\"Classify the sentiment of the text\")\n\n    # Configure with vLLM model\n    config = pz.QueryProcessorConfig(\n        policy=pz.MaxQuality(),\n        available_models=[vllm_model],\n        execution_strategy=\"sequential\",\n        optimizer_strategy=\"pareto\",\n        verbose=args.verbose,\n    )\n\n    output = dataset.run(config)\n    for 
record in output:\n        print(record)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "evals/quest/eval.py",
    "content": "import argparse\nimport copy\nimport json\nimport os\nimport random\nimport time\n\nimport palimpzest as pz\n\n\ndef prepare_docs_for_query(items: list, gt_docs: list) -> list:\n    items = copy.copy(items)\n    random.shuffle(items)\n    final_items = [doc for doc in items if doc[\"title\"] in gt_docs]\n    while len(final_items) < 1000 and len(items) > 0:\n        item = items.pop(0)\n        if item not in final_items:\n            final_items.append(item)\n    return final_items\n\n\ndef palimpzest_run_query(query: dict, documents: list) -> list[str]:\n    gt_docs = query[\"docs\"]\n    items = prepare_docs_for_query(documents, gt_docs)\n\n    schema = [\n        {\"name\": \"title\", \"type\": str, \"desc\": \"Document title\"},\n        {\"name\": \"text\", \"type\": str, \"desc\": \"Document content\"},\n    ]\n\n    dataset = pz.MemoryDataset(\n        id=\"quest-docs\",\n        vals=items,\n        schema=schema,\n    )\n\n    query_text = query[\"query\"]\n    plan = dataset.sem_filter(\n        f'This document is relevant to the entity-seeking query: \"{query_text}\". 
'\n        \"Return True if the document helps answer the query, False otherwise.\",\n        depends_on=[\"text\"],\n    ).project([\"title\"])\n\n    config = pz.QueryProcessorConfig(\n        policy=pz.MaxQuality(),\n        execution_strategy=\"parallel\",\n        progress=True,\n    )\n    output = plan.run(config)\n    execution_stats = output.execution_stats\n    time_secs = execution_stats.total_execution_time if execution_stats else 0.0\n    cost = execution_stats.total_execution_cost if execution_stats else 0.0\n    return [record[\"title\"] for record in output], time_secs, cost\n\n\ndef main():\n    parser = argparse.ArgumentParser(description=\"Evaluate Palimpzest on QUEST\")\n    parser.add_argument(\n        \"--domain\",\n        type=str,\n        required=True,\n        choices=[\"films\", \"books\"],\n        help=\"The domain to evaluate.\",\n    )\n    parser.add_argument(\n        \"--queries\",\n        type=str,\n        required=True,\n        help=\"Path to the file containing the queries (e.g. 
test.jsonl).\",\n    )\n    parser.add_argument(\n        \"--documents\",\n        type=str,\n        default=\"data/documents.jsonl\",\n        help=\"Path to documents.jsonl (QUEST format: title, text per line).\",\n    )\n    parser.add_argument(\n        \"--limit\",\n        type=int,\n        default=None,\n        help=\"Limit number of queries to evaluate (for debugging).\",\n    )\n    parser.add_argument(\n        \"--seed\",\n        type=int,\n        default=42,\n        help=\"Random seed for document shuffling.\",\n    )\n    args = parser.parse_args()\n\n    random.seed(args.seed)\n\n    if not os.path.exists(args.documents):\n        raise FileNotFoundError(\n            f\"Documents file not found: {args.documents}\\n\"\n        )\n    with open(args.documents) as f:\n        documents = [json.loads(line) for line in f]\n\n    queries = []\n    with open(args.queries) as f:\n        for line in f:\n            d = json.loads(line)\n            if d[\"metadata\"][\"domain\"] == args.domain:\n                queries.append(d)\n\n    if args.limit:\n        queries = queries[: args.limit]\n\n    results = []\n    for i, query in enumerate(queries):\n        print(f\"[{i + 1}/{len(queries)}] Executing query: {query['query']}\")\n        pred_docs, cur_time, cur_cost = palimpzest_run_query(query, documents)\n\n        gt_docs = query[\"docs\"]\n        preds = set(pred_docs)\n        labels = set(gt_docs)\n\n        tp = sum(1 for pred in preds if pred in labels)\n        fp = len(preds) - tp\n        fn = sum(1 for label in labels if label not in preds)\n\n        precision = tp / (tp + fp) if (tp + fp) > 0 else 0\n        recall = tp / (tp + fn) if (tp + fn) > 0 else 0\n        f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0\n\n        result = {\n            \"query\": query[\"query\"],\n            \"predicted_docs\": pred_docs,\n            \"ground_truth_docs\": gt_docs,\n            \"precision\": 
precision,\n            \"recall\": recall,\n            \"f1_score\": f1,\n            \"time\": cur_time,\n            \"cost\": cur_cost\n        }\n        results.append(result)\n\n    ts = int(time.time())\n    out_path = f\"results_{args.domain}_{ts}.json\"\n    with open(out_path, \"w\") as f:\n        json.dump(results, f, indent=4)\n    print(f\"\\nResults saved to {out_path}\")\n\n    # avoid a ZeroDivisionError when no queries matched the selected domain\n    n = len(results)\n    if n == 0:\n        print(\"No queries were evaluated; skipping summary statistics.\")\n        return\n\n    avg_precision = sum(r[\"precision\"] for r in results) / n\n    avg_recall = sum(r[\"recall\"] for r in results) / n\n    avg_f1 = sum(r[\"f1_score\"] for r in results) / n\n    avg_time = sum(r[\"time\"] for r in results) / n\n    avg_cost = sum(r[\"cost\"] for r in results) / n\n\n    print(f\"Average Precision: {avg_precision:.4f}\")\n    print(f\"Average Recall: {avg_recall:.4f}\")\n    print(f\"Average F1 Score: {avg_f1:.4f}\")\n    print(f\"Average Time: {avg_time:.4f}s\")\n    print(f\"Average Cost: ${avg_cost:.4f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "pyproject.toml",
    "content": "[project]\nname = \"palimpzest\"\nversion = \"1.5.3\"\ndescription = \"Palimpzest is a system which enables anyone to process AI-powered analytical queries simply by defining them in a declarative language\"\nreadme = \"README.md\"\nrequires-python = \">=3.12\"\nkeywords = [\"relational\", \"optimization\", \"llm\", \"AI programming\", \"extraction\", \"tools\", \"document\", \"search\", \"integration\"]\nauthors = [\n    {name=\"MIT DSG Semantic Management Lab\", email=\"michjc@csail.mit.edu\"},\n]\ndependencies = [\n    \"anthropic>=0.79.0\",\n    \"beautifulsoup4>=4.13.4\",\n    \"chromadb>=1.0.15\",\n    \"colorama>=0.4.6\",\n    \"datasets>=4.0.0\",\n    \"fastapi>=0.115.0\",\n    \"google-genai>=1.0.0\",\n    \"gradio>=5.26.0\",\n    \"litellm>=1.81.11, <1.82.7\",\n    \"numpy==2.0.2\",\n    \"openai>=1.0\",\n    \"pandas>=2.1.1\",\n    \"pytest>=8.2.2\",\n    \"pillow>=11.3.0\",\n    \"prettytable>=3.9.0\",\n    \"psutil==5.9.5\",\n    \"PyLD>=2.0.4\",\n    \"pyarrow>=20.0.0\",\n    \"pypdf>=5.1.0\",\n    \"pytest-mock>=3.14.0\",\n    \"python-dotenv>=1.2.1\",\n    \"pyyaml>=6.0.1\",\n    \"requests>=2.25\",\n    \"ruff>=0.9.0\",\n    \"sentence-transformers==5.0.0\",\n    \"setuptools>=70.1.1\",\n    \"smolagents[toolkit]\",\n    \"tabulate>=0.9.0\",\n    \"together>=1.5.5\",\n    \"tqdm~=4.66.1\",\n    \"rich[jupyter]>=13.9.2\",\n]\nclassifiers=[\n    \"Development Status :: 4 - Beta\",  # Change as appropriate\n    \"Intended Audience :: Developers\",\n    \"License :: OSI Approved :: MIT License\",  # Change as appropriate\n    \"Programming Language :: Python :: 3\",\n    \"Programming Language :: Python :: 3.8\",  # Specify versions you support\n    # Add more classifiers as appropriate\n]\n\n[project.optional-dependencies]\nvllm = [\n    \"vllm>=0.10.1.1\",\n]\n\n[tool.setuptools]\npackage-dir = {\"\" = \"src\"}\ninclude-package-data = true\n\n[tool.setuptools.packages.find]\nwhere = [\"src\"]\nnamespaces = 
false\n\n[tool.setuptools.package-data]\n\"*\" = [\"*.txt\", \"*.rst\", \"*.md\", \"*.json\"]\n\n[tool.pytest.ini_options]\ntestpaths = [\"tests/pytest\"]\nfilterwarnings = [\n    \"error\",\n    \"ignore::DeprecationWarning\",\n    \"ignore::ResourceWarning\",\n    \"ignore::UserWarning\",\n]\n\n[build-system]\nrequires = [\"setuptools\"]\nbuild-backend = \"setuptools.build_meta\"\n\n[project.urls]\nhomepage = \"https://palimpzest.org\"\nrepository = \"https://github.com/mitdbg/palimpzest/\"\ndocumentation = \"https://palimpzest.org\"\n# changelog = \"https://github.com/me/spam/blob/master/CHANGELOG.md\"\n"
  },
  {
    "path": "quickstart.ipynb",
    "content": "{\n  \"cells\": [\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"dBfyB-7Hytwy\"\n      },\n      \"source\": [\n        \"![PZ-banner](https://palimpzest-workloads.s3.us-east-1.amazonaws.com/palimpzest-cropped.png)\\n\",\n        \"\\n\",\n        \"# Palimpzest Quickstart\\n\",\n        \"This notebook contains a sample program to guide you through the features of the Palimpzest (PZ) library. PZ provides a high-level, declarative interface for composing and executing pipelines of semantic operators.\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"2-TkUeCFx1et\"\n      },\n      \"source\": [\n        \"## Load Private Key(s)\\n\",\n        \"1. Click on the \\\"key\\\" icon on the left-hand-side of the Colab notebook.\\n\",\n        \"2. In the sidebar that opens, click `+ Add new secret`\\n\",\n        \"  - **Note:** your secrets are not visible to anyone other than Google and your version of the notebook.\\n\",\n        \"3. Enter one or more of the following keys as secrets:\\n\",\n        \"  - `OPENAI_API_KEY`\\n\",\n        \"  - `TOGETHER_API_KEY`\\n\",\n        \"    - You can create a `together.ai` API key [here](https://api.together.ai/) for this demo (it comes with $1 of free API requests)\\n\",\n        \"4. Make sure you have toggled `Notebook access` ON\\n\",\n        \"5. 
Execute the cell below to store these keys in notebook environment variables.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"zmmkh1n8efxA\"\n      },\n      \"source\": [\n        \"#### Note: for the changes to take effect, you may need to restart the session (`Runtime > Restart Session`) if you've already connected the notebook to a runtime\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"-DgUrHNtZu0z\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"from google.colab import userdata\\n\",\n        \"import os\\n\",\n        \"\\n\",\n        \"# set environment variables\\n\",\n        \"def set_api_key_from_secret(key_name):\\n\",\n        \"  try:\\n\",\n        \"    os.environ[key_name] = userdata.get(key_name)\\n\",\n        \"  except:\\n\",\n        \"    pass\\n\",\n        \"\\n\",\n        \"set_api_key_from_secret('OPENAI_API_KEY')\\n\",\n        \"set_api_key_from_secret('TOGETHER_API_KEY')\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"HNFA4gTzxvE2\"\n      },\n      \"source\": [\n        \"## Install Palimpzest\\n\",\n        \"First, let's install the Palimpzest package. This may take a few minutes. 
**PIP dependency error messages are expected and can be ignored.**\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"collapsed\": true,\n        \"id\": \"4AxQGqXIyXsP\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"!pip install palimpzest==0.7.6\\n\",\n        \"!pip install --upgrade pyarrow\\n\",\n        \"!pip install chromadb==0.6.3\\n\",\n        \"import palimpzest as pz\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"qSAC96Rb-Ggy\"\n      },\n      \"source\": [\n        \"## Download Test Files\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"cUwsu8XOzgJd\"\n      },\n      \"source\": [\n        \"Next, we'll download the datasets we need for this demo:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"collapsed\": true,\n        \"id\": \"IXv-pxMhx0i1\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"# download tar files with testdata\\n\",\n        \"!wget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/enron-tiny.tar.gz\\n\",\n        \"!wget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/real-estate-eval-5.tar.gz\\n\",\n        \"!wget -nc https://palimpzest-workloads.s3.us-east-1.amazonaws.com/chroma-biodex.tar.gz\\n\",\n        \"\\n\",\n        \"# open tar files\\n\",\n        \"!tar -xzf enron-tiny.tar.gz\\n\",\n        \"!tar -xzf real-estate-eval-5.tar.gz\\n\",\n        \"!tar -xzf chroma-biodex.tar.gz\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"fw5mmyAY_EaS\"\n      },\n      \"source\": [\n        \"# First PZ Program: Filtering Enron Emails\\n\",\n        \"For this demo, we will work with a small subset of the Enron Email Dataset to identify emails matching 
some search criteria.\\n\",\n        \"\\n\",\n        \"We are going to use Palimpzest to perform the following tasks:\\n\",\n        \"1. Load the text files that contain the emails. (Each `.txt` file contains a single email).\\n\",\n        \"2. Compute the sender, subject, and date of each email.\\n\",\n        \"3. Filter the emails for ones that mention a vacation plan and were sent in the month of July.\\n\",\n        \"\\n\",\n        \"We can compose these tasks into a PZ program as follows:\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"collapsed\": true,\n        \"id\": \"lE8xx1s7xoQQ\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"# define the fields we wish to compute\\n\",\n        \"email_cols = [\\n\",\n        \"    {\\\"name\\\": \\\"sender\\\", \\\"type\\\": str, \\\"desc\\\": \\\"The email address of the sender\\\"},\\n\",\n        \"    {\\\"name\\\": \\\"subject\\\", \\\"type\\\": str, \\\"desc\\\": \\\"The subject of the email\\\"},\\n\",\n        \"    {\\\"name\\\": \\\"date\\\", \\\"type\\\": str, \\\"desc\\\": \\\"The date the email was sent\\\"},\\n\",\n        \"]\\n\",\n        \"\\n\",\n        \"# lazily construct the computation to get emails about holidays sent in July\\n\",\n        \"dataset = pz.Dataset(\\\"enron-tiny/\\\")\\n\",\n        \"dataset = dataset.sem_add_columns(email_cols)\\n\",\n        \"dataset = dataset.sem_filter(\\\"The email was sent in July\\\")\\n\",\n        \"dataset = dataset.sem_filter(\\\"The email is about holidays\\\")\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"ZRYDgD3RsMCo\"\n      },\n      \"source\": [\n        \"First, we define the set of columns we want to compute in `email_cols`.\\n\",\n        \"\\n\",\n        \"Next, we create a dataset by simply constructing `pz.Dataset()` with the path to our files.\\n\",\n        
\"\\n\",\n        \"We then instruct PZ to compute the email columns with a call to `sem_add_columns()`.\\n\",\n        \"\\n\",\n        \"Finally, we apply our two natural language filters with `sem_filter()`.\\n\",\n        \"\\n\",\n        \"**Note:** due to PZ's lazy execution, the code above will not execute the PZ program. It simply defines the semantic computation graph.\\n\",\n        \"\\n\",\n        \"In the next cell, we execute the PZ program with the goal of optimizing for quality:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"cSbS7uC7tUyS\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"# execute the computation w/the MaxQuality policy\\n\",\n        \"config = pz.QueryProcessorConfig(policy=pz.MaxQuality(), execution_strategy=\\\"parallel\\\", progress=True)\\n\",\n        \"output = dataset.run(config)\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"9hAjI6JrtbIU\"\n      },\n      \"source\": [\n        \"Once our pipeline completes, we can convert the output to a Pandas DataFrame:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"cyTrabGGtaZq\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"# display output (if using Jupyter, otherwise use print(output_df))\\n\",\n        \"output_df = output.to_df(cols=[\\\"date\\\", \\\"sender\\\", \\\"subject\\\"])\\n\",\n        \"display(output_df)\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"55DHU5XNAYBR\"\n      },\n      \"source\": [\n        \"Furthermore, Palimpzest provides a detailed report of the execution, with statistics about the runtime and cost of each operation, as well as the final plan that PZ executed.\\n\",\n        \"\\n\",\n        \"These statistics are stored in 
`output.execution_stats`:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"ottmnW4OAhXv\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"print(f\\\"Optimization Time: {output.execution_stats.optimization_time:.2f}s\\\")\\n\",\n        \"print(f\\\"Optimization Cost: ${output.execution_stats.optimization_cost:.3f}\\\")\\n\",\n        \"print(\\\"---\\\")\\n\",\n        \"print(f\\\"Plan Execution Time: {output.execution_stats.plan_execution_time:.2f}s\\\")\\n\",\n        \"print(f\\\"Plan Execution Cost: ${output.execution_stats.plan_execution_cost:.3f}\\\")\\n\",\n        \"\\n\",\n        \"print(\\\"Final plan executed:\\\")\\n\",\n        \"print(\\\"---\\\")\\n\",\n        \"final_plan_id = list(output.execution_stats.plan_strs.keys())[-1]\\n\",\n        \"print(output.execution_stats.plan_strs[final_plan_id])\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"ojm-qRxMyO0i\"\n      },\n      \"source\": [\n        \"# Second PZ Program: Multi-Modal Data Processing\\n\",\n        \"\\n\",\n        \"For our next demo, we will work with a small dataset of five real estate listings to search for properties of interest.\\n\",\n        \"\\n\",\n        \"We are going to use Palimpzest to execute the following pipeline.\\n\",\n        \"1. Load the images and text description for each listing\\n\",\n        \"2. Compute the price and address of each listing from the text description\\n\",\n        \"3. Filter for homes within our price range\\n\",\n        \"4. 
Filter for homes that look modern and attractive\\n\",\n        \"\\n\",\n        \"Let's take a moment to visualize the homes in our dataset:\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"zUHkZDwC6EdC\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"from PIL import Image\\n\",\n        \"import numpy as np\\n\",\n        \"import gradio as gr\\n\",\n        \"\\n\",\n        \"# Boilerplate code to build our visualization\\n\",\n        \"fst_imgs, snd_imgs, thrd_imgs, texts = [], [], [], []\\n\",\n        \"for idx in range(1, 6):\\n\",\n        \"    listing = f\\\"listing{idx}\\\"\\n\",\n        \"    with open(os.path.join(\\\"real-estate-eval-5\\\", listing, \\\"listing-text.txt\\\")) as f:\\n\",\n        \"        texts.append(f.read())\\n\",\n        \"    for idx, img_name in enumerate([\\\"img1.png\\\", \\\"img2.png\\\", \\\"img3.png\\\"]):\\n\",\n        \"        path = os.path.join(\\\"real-estate-eval-5\\\", listing, img_name)\\n\",\n        \"        img = Image.open(path)\\n\",\n        \"        img_arr = np.asarray(img)\\n\",\n        \"        if idx == 0:\\n\",\n        \"            fst_imgs.append(img_arr)\\n\",\n        \"        elif idx == 1:\\n\",\n        \"            snd_imgs.append(img_arr)\\n\",\n        \"        elif idx == 2:\\n\",\n        \"            thrd_imgs.append(img_arr)\\n\",\n        \"\\n\",\n        \"with gr.Blocks() as demo:\\n\",\n        \"    fst_img_blocks, snd_img_blocks, thrd_img_blocks, text_blocks = [], [], [], []\\n\",\n        \"    for fst_img, snd_img, thrd_img, text in zip(fst_imgs, snd_imgs, thrd_imgs, texts):\\n\",\n        \"        with gr.Row(equal_height=True):\\n\",\n        \"            with gr.Column():\\n\",\n        \"                fst_img_blocks.append(gr.Image(value=fst_img))\\n\",\n        \"            with gr.Column():\\n\",\n        \"                
snd_img_blocks.append(gr.Image(value=snd_img))\\n\",\n        \"            with gr.Column():\\n\",\n        \"                thrd_img_blocks.append(gr.Image(value=thrd_img))\\n\",\n        \"        with gr.Row():\\n\",\n        \"            with gr.Column():\\n\",\n        \"                text_blocks.append(gr.Textbox(value=text, info=\\\"Text Description\\\"))\\n\",\n        \"\\n\",\n        \"demo.launch()\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"Yg4yRYx26ecr\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"demo.close()\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"GpNe3bFk6FMD\"\n      },\n      \"source\": [\n        \"As a first step, we need to write a custom `pz.DataReader` to enable PZ to load our data properly:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"2FcqpZySyWbl\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"from palimpzest.core.lib.fields import ImageFilepathField, ListField\\n\",\n        \"\\n\",\n        \"# we first define the schema for each record output by the DataReader\\n\",\n        \"real_estate_listing_cols = [\\n\",\n        \"    {\\\"name\\\": \\\"listing\\\", \\\"type\\\": str, \\\"desc\\\": \\\"The name of the listing\\\"},\\n\",\n        \"    {\\\"name\\\": \\\"text_content\\\", \\\"type\\\": str, \\\"desc\\\": \\\"The content of the listing's text description\\\"},\\n\",\n        \"    {\\\"name\\\": \\\"image_filepaths\\\", \\\"type\\\": ListField(ImageFilepathField), \\\"desc\\\": \\\"A list of the filepaths for each image of the listing\\\"},\\n\",\n        \"]\\n\",\n        \"\\n\",\n        \"# we then implement the DataReader\\n\",\n        \"class RealEstateListingReader(pz.DataReader):\\n\",\n        \"    def __init__(self, 
listings_dir):\\n\",\n        \"        super().__init__(schema=real_estate_listing_cols)\\n\",\n        \"        self.listings_dir = listings_dir\\n\",\n        \"        self.listings = sorted(os.listdir(self.listings_dir))\\n\",\n        \"\\n\",\n        \"    def __len__(self):\\n\",\n        \"        return len(self.listings)\\n\",\n        \"\\n\",\n        \"    def __getitem__(self, idx: int):\\n\",\n        \"        # get listing\\n\",\n        \"        listing = self.listings[idx]\\n\",\n        \"\\n\",\n        \"        # get fields\\n\",\n        \"        image_filepaths, text_content = [], None\\n\",\n        \"        listing_dir = os.path.join(self.listings_dir, listing)\\n\",\n        \"        for file in os.listdir(listing_dir):\\n\",\n        \"            if file.endswith(\\\".txt\\\"):\\n\",\n        \"                with open(os.path.join(listing_dir, file), \\\"rb\\\") as f:\\n\",\n        \"                    text_content = f.read().decode(\\\"utf-8\\\")\\n\",\n        \"            elif file.endswith(\\\".png\\\"):\\n\",\n        \"                image_filepaths.append(os.path.join(listing_dir, file))\\n\",\n        \"\\n\",\n        \"        # construct and return dictionary with fields\\n\",\n        \"        return {\\\"listing\\\": listing, \\\"text_content\\\": text_content, \\\"image_filepaths\\\": image_filepaths}\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"Q45qFYcM0N9w\"\n      },\n      \"source\": [\n        \"Every `pz.DataReader` must have the following:\\n\",\n        \"1. A `schema` defining the fields present in each output record\\n\",\n        \"2. A `__len__()` function which returns the number of items in the dataset\\n\",\n        \"3. 
A `__getitem__(idx)` function which returns the `idx`th item in the dataset\\n\",\n        \"\\n\",\n        \"Once we've implemented the `pz.DataReader`, we can compose our PZ program as follows:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"1qOI9WOY0X4_\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"# schema for computing the address and price of each home\\n\",\n        \"real_estate_text_cols = [\\n\",\n        \"    {\\\"name\\\": \\\"address\\\", \\\"type\\\": str, \\\"desc\\\": \\\"The address of the property\\\"},\\n\",\n        \"    {\\\"name\\\": \\\"price\\\", \\\"type\\\": int | float, \\\"desc\\\": \\\"The listed price of the property\\\"},\\n\",\n        \"]\\n\",\n        \"\\n\",\n        \"# define a UDF for filtering based on a price range\\n\",\n        \"def in_price_range(record: dict):\\n\",\n        \"    try:\\n\",\n        \"        price = record[\\\"price\\\"]\\n\",\n        \"        if isinstance(price, str):\\n\",\n        \"            price = price.strip()\\n\",\n        \"            price = int(price.replace(\\\"$\\\", \\\"\\\").replace(\\\",\\\", \\\"\\\"))\\n\",\n        \"        return 6e5 < price <= 2e6\\n\",\n        \"    except Exception:\\n\",\n        \"        return False\\n\",\n        \"\\n\",\n        \"# construct our PZ program to filter for listings matching our search criteria\\n\",\n        \"ds = pz.Dataset(RealEstateListingReader(\\\"real-estate-eval-5\\\"))\\n\",\n        \"ds = ds.sem_add_columns(real_estate_text_cols, depends_on=\\\"text_content\\\")\\n\",\n        \"ds = ds.sem_filter(\\n\",\n        \"    \\\"The interior is modern and attractive, and has lots of natural sunlight\\\",\\n\",\n        \"    depends_on=\\\"image_filepaths\\\",\\n\",\n        \")\\n\",\n        \"ds = ds.filter(in_price_range, depends_on=\\\"price\\\")\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n    
  \"metadata\": {\n        \"id\": \"eitGXQCS0YRY\"\n      },\n      \"source\": [\n        \"First, we write a schema for the `address` and `price` fields we wish to compute.\\n\",\n        \"\\n\",\n        \"Next, we write a UDF to filter for homes based on our price range.\\n\",\n        \"\\n\",\n        \"Then we compose our program by:\\n\",\n        \"1. Constructing our `pz.DataReader` with the real estate data\\n\",\n        \"2. Using `sem_add_columns()` to compute the `address` and `price`\\n\",\n        \"3. Using a `sem_filter()` to filter for modern homes with lots of sunlight\\n\",\n        \"4. Using our UDF to filter for homes based on our price range\\n\",\n        \"\\n\",\n        \"We now execute the program:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"uFmakjcQ4W5o\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"# execute the computation w/the MaxQuality policy\\n\",\n        \"config = pz.QueryProcessorConfig(policy=pz.MaxQuality(), execution_strategy=\\\"parallel\\\", progress=True)\\n\",\n        \"output = ds.run(config)\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"YHesFcjc4snL\"\n      },\n      \"source\": [\n        \"Now let's take a look at our output:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"JAxeR_R54vuY\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"from PIL import Image\\n\",\n        \"import numpy as np\\n\",\n        \"import gradio as gr\\n\",\n        \"\\n\",\n        \"demo.close()\\n\",\n        \"\\n\",\n        \"# Boilerplate code to build our visualization\\n\",\n        \"fst_imgs, snd_imgs, thrd_imgs, addrs, prices = [], [], [], [], []\\n\",\n        \"for record in output:\\n\",\n        \"    addrs.append(record.address)\\n\",\n        \" 
   prices.append(record.price)\\n\",\n        \"    for idx, img_name in enumerate([\\\"img1.png\\\", \\\"img2.png\\\", \\\"img3.png\\\"]):\\n\",\n        \"        path = os.path.join(\\\"real-estate-eval-5\\\", record.listing, img_name)\\n\",\n        \"        img = Image.open(path)\\n\",\n        \"        img_arr = np.asarray(img)\\n\",\n        \"        if idx == 0:\\n\",\n        \"            fst_imgs.append(img_arr)\\n\",\n        \"        elif idx == 1:\\n\",\n        \"            snd_imgs.append(img_arr)\\n\",\n        \"        elif idx == 2:\\n\",\n        \"            thrd_imgs.append(img_arr)\\n\",\n        \"\\n\",\n        \"with gr.Blocks() as demo:\\n\",\n        \"    fst_img_blocks, snd_img_blocks, thrd_img_blocks, addr_blocks, price_blocks = [], [], [], [], []\\n\",\n        \"    for fst_img, snd_img, thrd_img, addr, price in zip(fst_imgs, snd_imgs, thrd_imgs, addrs, prices):\\n\",\n        \"        with gr.Row(equal_height=True):\\n\",\n        \"            with gr.Column():\\n\",\n        \"                fst_img_blocks.append(gr.Image(value=fst_img))\\n\",\n        \"            with gr.Column():\\n\",\n        \"                snd_img_blocks.append(gr.Image(value=snd_img))\\n\",\n        \"            with gr.Column():\\n\",\n        \"                thrd_img_blocks.append(gr.Image(value=thrd_img))\\n\",\n        \"        with gr.Row():\\n\",\n        \"            with gr.Column():\\n\",\n        \"                addr_blocks.append(gr.Textbox(value=addr, info=\\\"Address\\\"))\\n\",\n        \"            with gr.Column():\\n\",\n        \"                price_blocks.append(gr.Textbox(value=price, info=\\\"Price\\\"))\\n\",\n        \"\\n\",\n        \"    plan_str = list(output.execution_stats.plan_strs.values())[0]\\n\",\n        \"    gr.Textbox(value=plan_str, info=\\\"Query Plan\\\")\\n\",\n        \"\\n\",\n        \"demo.launch()\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 
null,\n      \"metadata\": {\n        \"id\": \"XpCo8uF54sPi\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"demo.close()\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"VAgG4wBKyZM0\"\n      },\n      \"source\": [\n        \"# Third PZ Program: Optimizing a Biomedical Classification Pipeline\\n\",\n        \"\\n\",\n        \"For our final demo, we will work with a subset of the BioDEX dataset.\\n\",\n        \"\\n\",\n        \"Each input in the dataset is a medical report describing an adverse reaction a patient had in response to taking one or more drugs.\\n\",\n        \"\\n\",\n        \"The goal is to correctly predict the reactions experienced by the patient by matching them to a database of ~24,300 official medical reaction terms.\\n\",\n        \"\\n\",\n        \"We are going to use Palimpzest to implement the following pipeline:\\n\",\n        \"1. Load a medical report\\n\",\n        \"2. Compute a list of reactions mentioned in the report\\n\",\n        \"3. Retrieve the most similar reaction terms from a vector database with embeddings for each of the ~24,300 official terms\\n\",\n        \"4. 
Re-rank the list of official terms based on their relevance\\n\",\n        \"\\n\",\n        \"First, we will once again create a `pz.DataReader` to load the medical reports:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"_CeZ9Ib1yY1n\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"import datasets\\n\",\n        \"from functools import partial\\n\",\n        \"\\n\",\n        \"# define the schema for records returned by the DataReader\\n\",\n        \"biodex_entry_cols = [\\n\",\n        \"    {\\\"name\\\": \\\"pmid\\\", \\\"type\\\": str, \\\"desc\\\": \\\"The PubMed ID of the medical paper\\\"},\\n\",\n        \"    {\\\"name\\\": \\\"title\\\", \\\"type\\\": str, \\\"desc\\\": \\\"The title of the medical paper\\\"},\\n\",\n        \"    {\\\"name\\\": \\\"abstract\\\", \\\"type\\\": str, \\\"desc\\\": \\\"The abstract of the medical paper\\\"},\\n\",\n        \"    {\\\"name\\\": \\\"fulltext\\\", \\\"type\\\": str, \\\"desc\\\": \\\"The full text of the medical paper, which contains information relevant for creating a drug safety report.\\\"},\\n\",\n        \"]\\n\",\n        \"\\n\",\n        \"# implement the DataReader\\n\",\n        \"class BiodexReader(pz.DataReader):\\n\",\n        \"    def __init__(\\n\",\n        \"        self,\\n\",\n        \"        rp_at_k: int = 5,\\n\",\n        \"        num_samples: int = 10,\\n\",\n        \"        split: str = \\\"test\\\",\\n\",\n        \"        shuffle: bool = True,\\n\",\n        \"        seed: int = 42,\\n\",\n        \"    ):\\n\",\n        \"        super().__init__(biodex_entry_cols)\\n\",\n        \"\\n\",\n        \"        self.dataset = datasets.load_dataset(\\\"BioDEX/BioDEX-Reactions\\\", split=split).to_pandas()\\n\",\n        \"        if shuffle:\\n\",\n        \"            self.dataset = self.dataset.sample(n=num_samples, 
random_state=seed).to_dict(orient=\\\"records\\\")\\n\",\n        \"        else:\\n\",\n        \"            self.dataset = self.dataset.to_dict(orient=\\\"records\\\")[:num_samples]\\n\",\n        \"\\n\",\n        \"        self.rp_at_k = rp_at_k\\n\",\n        \"        self.num_samples = num_samples\\n\",\n        \"        self.shuffle = shuffle\\n\",\n        \"        self.seed = seed\\n\",\n        \"        self.split = split\\n\",\n        \"\\n\",\n        \"    def compute_label(self, entry: dict) -> dict:\\n\",\n        \"        \\\"\\\"\\\"Compute the label for a BioDEX report given its entry in the dataset.\\\"\\\"\\\"\\n\",\n        \"        reactions_lst = [\\n\",\n        \"            reaction.strip().lower().replace(\\\"'\\\", \\\"\\\").replace(\\\"^\\\", \\\"\\\")\\n\",\n        \"            for reaction in entry[\\\"reactions\\\"].split(\\\",\\\")\\n\",\n        \"        ]\\n\",\n        \"        label_dict = {\\n\",\n        \"            \\\"reactions\\\": reactions_lst,\\n\",\n        \"            \\\"reaction_labels\\\": reactions_lst,\\n\",\n        \"            \\\"ranked_reaction_labels\\\": reactions_lst,\\n\",\n        \"        }\\n\",\n        \"        return label_dict\\n\",\n        \"\\n\",\n        \"    @staticmethod\\n\",\n        \"    def rank_precision_at_k(preds, targets, k: int):\\n\",\n        \"        if preds is None:\\n\",\n        \"            return 0.0\\n\",\n        \"\\n\",\n        \"        try:\\n\",\n        \"            # lower-case each list\\n\",\n        \"            preds = [pred.strip().lower().replace(\\\"'\\\", \\\"\\\").replace(\\\"^\\\", \\\"\\\") for pred in preds]\\n\",\n        \"            targets = set([target.strip().lower().replace(\\\"'\\\", \\\"\\\").replace(\\\"^\\\", \\\"\\\") for target in targets])\\n\",\n        \"\\n\",\n        \"            # compute rank-precision at k\\n\",\n        \"            rn = len(targets)\\n\",\n        \"            denom = min(k, 
rn)\\n\",\n        \"            total = 0.0\\n\",\n        \"            for i in range(k):\\n\",\n        \"                total += preds[i] in targets if i < len(preds) else 0.0\\n\",\n        \"\\n\",\n        \"            return total / denom\\n\",\n        \"\\n\",\n        \"        except Exception:\\n\",\n        \"            return 0.0\\n\",\n        \"\\n\",\n        \"    @staticmethod\\n\",\n        \"    def term_recall(preds, targets):\\n\",\n        \"        if preds is None:\\n\",\n        \"            return 0.0\\n\",\n        \"\\n\",\n        \"        try:\\n\",\n        \"            # normalize and deduplicate the terms in each list\\n\",\n        \"            pred_terms = set([\\n\",\n        \"                term.strip()\\n\",\n        \"                for pred in preds\\n\",\n        \"                for term in pred.lower().replace(\\\"'\\\", \\\"\\\").replace(\\\"^\\\", \\\"\\\").split(\\\" \\\")\\n\",\n        \"            ])\\n\",\n        \"            target_terms = set([\\n\",\n        \"                term.strip()\\n\",\n        \"                for target in targets\\n\",\n        \"                for term in target.lower().replace(\\\"'\\\", \\\"\\\").replace(\\\"^\\\", \\\"\\\").split(\\\" \\\")\\n\",\n        \"            ])\\n\",\n        \"\\n\",\n        \"            # compute term recall and return\\n\",\n        \"            intersect = pred_terms.intersection(target_terms)\\n\",\n        \"            term_recall = len(intersect) / len(target_terms)\\n\",\n        \"\\n\",\n        \"            return term_recall\\n\",\n        \"\\n\",\n        \"        except Exception:\\n\",\n        \"            return 0.0\\n\",\n        \"\\n\",\n        \"    def __len__(self):\\n\",\n        \"        return len(self.dataset)\\n\",\n        \"\\n\",\n        \"    def __getitem__(self, idx: int):\\n\",\n        \"        # get entry\\n\",\n        \"        entry = self.dataset[idx]\\n\",\n        \"\\n\",\n        \"        # get 
input fields\\n\",\n        \"        pmid = entry[\\\"pmid\\\"]\\n\",\n        \"        title = entry[\\\"title\\\"]\\n\",\n        \"        abstract = entry[\\\"abstract\\\"]\\n\",\n        \"        fulltext = entry[\\\"fulltext\\\"]\\n\",\n        \"\\n\",\n        \"        # create item with fields\\n\",\n        \"        item = {\\\"fields\\\": {}, \\\"labels\\\": {}, \\\"score_fn\\\": {}}\\n\",\n        \"        item[\\\"fields\\\"][\\\"pmid\\\"] = pmid\\n\",\n        \"        item[\\\"fields\\\"][\\\"title\\\"] = title\\n\",\n        \"        item[\\\"fields\\\"][\\\"abstract\\\"] = abstract\\n\",\n        \"        item[\\\"fields\\\"][\\\"fulltext\\\"] = fulltext\\n\",\n        \"\\n\",\n        \"        if self.split == \\\"train\\\":\\n\",\n        \"            # add label info\\n\",\n        \"            item[\\\"labels\\\"] = self.compute_label(entry)\\n\",\n        \"\\n\",\n        \"            # add scoring functions for list fields\\n\",\n        \"            rank_precision_at_k = partial(BiodexReader.rank_precision_at_k, k=self.rp_at_k)\\n\",\n        \"            item[\\\"score_fn\\\"][\\\"reactions\\\"] = BiodexReader.term_recall\\n\",\n        \"            item[\\\"score_fn\\\"][\\\"reaction_labels\\\"] = BiodexReader.term_recall\\n\",\n        \"            item[\\\"score_fn\\\"][\\\"ranked_reaction_labels\\\"] = rank_precision_at_k\\n\",\n        \"\\n\",\n        \"        return item\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"fyr71cs3EQpl\"\n      },\n      \"source\": [\n        \"There are a few new features of this `pz.DataReader` which are needed for the optimization process:\\n\",\n        \"1. `__getitem__()` returns a dictionary with top-level keys `{\\\"fields\\\", \\\"labels\\\", \\\"score_fn\\\"}`\\n\",\n        \"2. `fields` contains the data emitted by the `pz.DataReader`\\n\",\n        \"3. 
(for `train` data only): `labels` contains the expected results for each output field\\n\",\n        \"4. (for `train` data only): `score_fn` contains scoring functions for each output field\\n\",\n        \"\\n\",\n        \"Once we've defined our `pz.DataReader`, we can create our training and test datasets:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"aHYqdDrlG8zP\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"SEED = 123\\n\",\n        \"\\n\",\n        \"# create train and test datasets; and validator\\n\",\n        \"train_datareader = BiodexReader(split=\\\"train\\\", seed=SEED)\\n\",\n        \"test_datareader = BiodexReader(split=\\\"test\\\", num_samples=20, seed=SEED)\\n\",\n        \"validator = pz.Validator(train_datareader, None)\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"lHaVbQHiG-Rf\"\n      },\n      \"source\": [\n        \"We now implement the logic for the `sem_topk` operator for you. 
It fetches the five most similar medical terms for each reaction computed by PZ, sorts the combined results by similarity, removes duplicates, and returns the top-k most similar terms.\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"JE3scuaXI6HB\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"import chromadb\\n\",\n        \"from chromadb.utils.embedding_functions.openai_embedding_function import OpenAIEmbeddingFunction\\n\",\n        \"\\n\",\n        \"# load index [text-embedding-3-small]\\n\",\n        \"chroma_client = chromadb.PersistentClient(\\\".chroma-biodex\\\")\\n\",\n        \"openai_ef = OpenAIEmbeddingFunction(\\n\",\n        \"  api_key=os.environ[\\\"OPENAI_API_KEY\\\"],\\n\",\n        \"  model_name=\\\"text-embedding-3-small\\\",\\n\",\n        \")\\n\",\n        \"index = chroma_client.get_collection(\\\"biodex-reaction-terms\\\", embedding_function=openai_ef)\\n\",\n        \"\\n\",\n        \"def search_func(index: chromadb.Collection, query: list[list[float]], k: int) -> dict[str, list[str]]:\\n\",\n        \"    # execute query with embeddings; fetch the 5 nearest terms per reaction\\n\",\n        \"    results = index.query(query, n_results=5)\\n\",\n        \"\\n\",\n        \"    # get list of result terms with their cosine similarity scores\\n\",\n        \"    final_results = []\\n\",\n        \"    for query_docs, query_distances in zip(results[\\\"documents\\\"], results[\\\"distances\\\"]):\\n\",\n        \"        for doc, dist in zip(query_docs, query_distances):\\n\",\n        \"            cosine_similarity = 1 - dist\\n\",\n        \"            final_results.append({\\\"content\\\": doc, \\\"similarity\\\": cosine_similarity})\\n\",\n        \"\\n\",\n        \"    # sort the results by similarity score\\n\",\n        \"    sorted_results = sorted(final_results, key=lambda result: result[\\\"similarity\\\"], reverse=True)\\n\",\n        \"\\n\",\n        \"    # remove duplicates\\n\",\n      
  \"    sorted_results_set = set()\\n\",\n        \"    final_sorted_results = []\\n\",\n        \"    for result in sorted_results:\\n\",\n        \"        if result[\\\"content\\\"] not in sorted_results_set:\\n\",\n        \"            sorted_results_set.add(result[\\\"content\\\"])\\n\",\n        \"            final_sorted_results.append(result[\\\"content\\\"])\\n\",\n        \"\\n\",\n        \"    # return the top-k similar results and generation stats\\n\",\n        \"    return {\\\"reaction_labels\\\": final_sorted_results[:k]}\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"dCYi92YCJSuq\"\n      },\n      \"source\": [\n        \"Finally, we can construct our PZ program:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"Z3UkmjQFBSjc\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"# define the schema for each computation in our program\\n\",\n        \"biodex_reactions_cols = [\\n\",\n        \"    {\\\"name\\\": \\\"reactions\\\", \\\"type\\\": list[str], \\\"desc\\\": \\\"The list of all medical conditions experienced by the patient as discussed in the report. Try to provide as many relevant medical conditions as possible.\\\"},\\n\",\n        \"]\\n\",\n        \"biodex_reaction_labels_cols = [\\n\",\n        \"    {\\\"name\\\": \\\"reaction_labels\\\", \\\"type\\\": list[str], \\\"desc\\\": \\\"Official terms for medical conditions listed in `reactions`\\\"},\\n\",\n        \"]\\n\",\n        \"biodex_ranked_reactions_labels_cols = [\\n\",\n        \"    {\\\"name\\\": \\\"ranked_reaction_labels\\\", \\\"type\\\": list[str], \\\"desc\\\": \\\"The ranked list of medical conditions experienced by the patient. The most relevant label occurs first in the list. 
Be sure to rank ALL of the inputs.\\\"},\\n\",\n        \"]\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"# construct pz plan\\n\",\n        \"plan = pz.Dataset(test_datareader)\\n\",\n        \"plan = plan.sem_add_columns(biodex_reactions_cols)\\n\",\n        \"plan = plan.sem_topk(\\n\",\n        \"    index=index,\\n\",\n        \"    search_func=search_func,\\n\",\n        \"    search_attr=\\\"reactions\\\",\\n\",\n        \"    output_attrs=biodex_reaction_labels_cols,\\n\",\n        \")\\n\",\n        \"plan = plan.sem_add_columns(biodex_ranked_reactions_labels_cols, depends_on=[\\\"title\\\", \\\"abstract\\\", \\\"fulltext\\\", \\\"reaction_labels\\\"])\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"yvFZO-LKKq3F\"\n      },\n      \"source\": [\n        \"First, let's execute our plan without training data and score our performance:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"khhJJUXTLNDz\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"def score_output(output, seed):\\n\",\n        \"    # score output\\n\",\n        \"    test_dataset = datasets.load_dataset(\\\"BioDEX/BioDEX-Reactions\\\", split=\\\"test\\\").to_pandas()\\n\",\n        \"    test_dataset = test_dataset.sample(n=20, random_state=seed).to_dict(orient=\\\"records\\\")\\n\",\n        \"\\n\",\n        \"    # construct mapping from pmid --> label (field, value) pairs\\n\",\n        \"    def compute_target_record(entry):\\n\",\n        \"        reactions_lst = [\\n\",\n        \"            reaction.strip().lower().replace(\\\"'\\\", \\\"\\\").replace(\\\"^\\\", \\\"\\\")\\n\",\n        \"            for reaction in entry[\\\"reactions\\\"].split(\\\",\\\")\\n\",\n        \"        ]\\n\",\n        \"        label_dict = {\\\"ranked_reaction_labels\\\": reactions_lst}\\n\",\n        \"        return label_dict\\n\",\n 
       \"\\n\",\n        \"    label_fields_to_values = {\\n\",\n        \"        entry[\\\"pmid\\\"]: compute_target_record(entry) for entry in test_dataset\\n\",\n        \"    }\\n\",\n        \"\\n\",\n        \"    def rank_precision_at_k(preds: list, targets: list, k: int):\\n\",\n        \"        if preds is None:\\n\",\n        \"            return 0.0\\n\",\n        \"\\n\",\n        \"        # lower-case each list\\n\",\n        \"        preds = [pred.lower().replace(\\\"'\\\", \\\"\\\").replace(\\\"^\\\", \\\"\\\") for pred in preds]\\n\",\n        \"        targets = set([target.lower().replace(\\\"'\\\", \\\"\\\").replace(\\\"^\\\", \\\"\\\") for target in targets])\\n\",\n        \"\\n\",\n        \"        # compute rank-precision at k\\n\",\n        \"        rn = len(targets)\\n\",\n        \"        denom = min(k, rn)\\n\",\n        \"        total = 0.0\\n\",\n        \"        for i in range(k):\\n\",\n        \"            total += preds[i] in targets if i < len(preds) else 0.0\\n\",\n        \"\\n\",\n        \"        return total / denom\\n\",\n        \"\\n\",\n        \"    def compute_avg_rp_at_k(records, k=5):\\n\",\n        \"        total_rp_at_k = 0\\n\",\n        \"        bad = 0\\n\",\n        \"        for record in records:\\n\",\n        \"            pmid = record['pmid']\\n\",\n        \"            preds = record['ranked_reaction_labels']\\n\",\n        \"            targets = label_fields_to_values[pmid]['ranked_reaction_labels']\\n\",\n        \"            try:\\n\",\n        \"                total_rp_at_k += rank_precision_at_k(preds, targets, k)\\n\",\n        \"            except Exception:\\n\",\n        \"                bad += 1\\n\",\n        \"\\n\",\n        \"        return total_rp_at_k / len(records), bad\\n\",\n        \"\\n\",\n        \"    rp_at_k, bad = compute_avg_rp_at_k([record.to_dict() for record in output], k=5)\\n\",\n        \"    final_plan_id = 
list(output.execution_stats.plan_stats.keys())[0]\\n\",\n        \"    final_plan_str = output.execution_stats.plan_strs[final_plan_id]\\n\",\n        \"    print(\\\"---\\\")\\n\",\n        \"    print(\\\"#########################\\\")\\n\",\n        \"    print(f\\\"##### RP@5: {rp_at_k:.5f} #####\\\")\\n\",\n        \"    print(\\\"#########################\\\")\\n\",\n        \"    print(\\\"---\\\")\\n\",\n        \"    print(f\\\"Optimization time: {output.execution_stats.optimization_time:.2f}s\\\")\\n\",\n        \"    print(f\\\"Optimization cost: ${output.execution_stats.optimization_cost:.3f}\\\")\\n\",\n        \"    print(\\\"---\\\")\\n\",\n        \"    print(f\\\"Plan exec. time: {output.execution_stats.plan_execution_time:.2f}s\\\")\\n\",\n        \"    print(f\\\"Plan exec. cost: ${output.execution_stats.plan_execution_cost:.3f}\\\")\\n\",\n        \"    print(\\\"---\\\")\\n\",\n        \"    print(f\\\"Total time: {output.execution_stats.total_execution_time:.2f}s\\\")\\n\",\n        \"    print(f\\\"Total Cost: ${output.execution_stats.total_execution_cost:.3f}\\\")\\n\",\n        \"    print(\\\"---\\\")\\n\",\n        \"    print(\\\"Final Plan:\\\")\\n\",\n        \"    print(final_plan_str)\\n\",\n        \"\\n\",\n        \"import logging\\n\",\n        \"logger = logging.getLogger()\\n\",\n        \"logger.disabled = True\\n\",\n        \"\\n\",\n        \"# execute pz plan\\n\",\n        \"config = pz.QueryProcessorConfig(\\n\",\n        \"    policy=pz.MaxQuality(),\\n\",\n        \"    execution_strategy=\\\"parallel\\\",\\n\",\n        \"    max_workers=64,\\n\",\n        \"    progress=True,\\n\",\n        \")\\n\",\n        \"\\n\",\n        \"output = plan.run(config=config, seed=SEED)\\n\",\n        \"score_output(output, seed=SEED)\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {\n        \"id\": \"ORUDS35-K1Nf\"\n      },\n      \"source\": [\n        \"Now, let's run the program again while 
using our `train_datareader` as a validation dataset:\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"a4htdaTDKpFc\"\n      },\n      \"outputs\": [],\n      \"source\": [\n        \"import logging\\n\",\n        \"logger = logging.getLogger()\\n\",\n        \"logger.disabled = True\\n\",\n        \"\\n\",\n        \"# execute pz plan\\n\",\n        \"config = pz.QueryProcessorConfig(\\n\",\n        \"    policy=pz.MaxQuality(),\\n\",\n        \"    validator=validator,\\n\",\n        \"    optimizer_strategy=\\\"pareto\\\",\\n\",\n        \"    sentinel_execution_strategy=\\\"mab\\\",\\n\",\n        \"    execution_strategy=\\\"parallel\\\",\\n\",\n        \"    use_final_op_quality=True,\\n\",\n        \"    max_workers=64,\\n\",\n        \"    progress=True,\\n\",\n        \")\\n\",\n        \"\\n\",\n        \"output = plan.run(config=config, k=6, j=4, sample_budget=72, seed=SEED)\\n\",\n        \"score_output(output, seed=SEED)\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {\n        \"id\": \"IT3iZr6-Kpab\"\n      },\n      \"outputs\": [],\n      \"source\": []\n    }\n  ],\n  \"metadata\": {\n    \"colab\": {\n      \"provenance\": []\n    },\n    \"kernelspec\": {\n      \"display_name\": \"Python 3\",\n      \"name\": \"python3\"\n    },\n    \"language_info\": {\n      \"name\": \"python\"\n    }\n  },\n  \"nbformat\": 4,\n  \"nbformat_minor\": 0\n}\n"
  },
  {
    "path": "ruff.toml",
    "content": "# Config https://docs.astral.sh/ruff/configuration/\nline-length = 120\nindent-width = 4\nexclude = [\"*.ipynb\"]\n\n# Assume Python 3.8\ntarget-version = \"py38\"\n\n[lint]\nignore = [\"E501\"]\nfixable = [\"ALL\"]\nunfixable = []\nselect = [\"E\", \"F\", \"UP\", \"B\", \"SIM\", \"I\", \"N\"]\n"
  },
  {
    "path": "scripts/capture_litellm_stats.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nScript to invoke LLM providers through LiteLLM and capture token/cost statistics.\n\nThis script:\n1. Loads messages from JSON files generated by generate_test_messages.py\n2. Sends requests through LiteLLM (the same path palimpzest uses)\n3. Saves all usage metadata and response stats returned by LiteLLM\n4. Waits 10 seconds\n5. Sends the request again and saves the second set of stats\n\nThis allows us to compare LiteLLM's reported statistics with:\n- Direct provider API calls (from capture_provider_stats.py)\n- Palimpzest's generator stats tracking\n\nSupported providers:\n- Anthropic: claude-sonnet-4-5-20250929 (text, image, text+image)\n- Google/Gemini: gemini-2.5-flash (all seven modality combinations)\n- OpenAI: gpt-4o-2024-08-06 (text, image, text+image)\n- OpenAI: gpt-4o-audio-preview (text+audio, audio)\n- Azure: gpt-4o via Azure OpenAI (text, image, text+image)\n\nOutput files are saved to: scripts/litellm_stats/\n\"\"\"\n\nimport argparse\nimport json\nimport os\nimport sys\nimport time\nimport uuid\nfrom typing import Any\n\nimport litellm\nfrom litellm.integrations.custom_logger import CustomLogger\n\n# Add project root to path\nsys.path.insert(0, os.path.join(os.path.dirname(__file__), \"..\"))\n\nimport contextlib\n\nfrom palimpzest.constants import Model\n\n\n# =============================================================================\n# RAW RESPONSE CAPTURE CALLBACK\n# =============================================================================\nclass RawProviderStatsCapture(CustomLogger):\n    \"\"\"\n    Custom LiteLLM callback to capture raw provider usage stats before normalization.\n\n    LiteLLM normalizes all responses to OpenAI format, which loses provider-specific\n    details like Gemini's per-modality token breakdowns. 
This callback captures the\n    original provider response data.\n    \"\"\"\n\n    def __init__(self):\n        self.last_raw_response = None\n        self.last_raw_usage = None\n        self.last_provider = None\n\n    def log_success_event(self, kwargs, response_obj, start_time, end_time):\n        \"\"\"Called after a successful LLM API call.\"\"\"\n        try:\n            # Store the provider info\n            self.last_provider = kwargs.get(\"custom_llm_provider\") or kwargs.get(\"model\", \"\").split(\"/\")[0]\n\n            # Try to get the original response from hidden params\n            if hasattr(response_obj, \"_hidden_params\") and response_obj._hidden_params:\n                hidden = response_obj._hidden_params\n                self.last_raw_response = hidden.get(\"original_response\")\n\n                # For some providers, the raw response might be in different locations\n                if self.last_raw_response is None:\n                    self.last_raw_response = hidden.get(\"raw_response\")\n\n            # Try to extract raw usage from the response object itself\n            # Some providers have additional attributes that aren't in model_dump()\n            if hasattr(response_obj, \"_response_ms\"):\n                if self.last_raw_response is None:\n                    self.last_raw_response = {}\n                self.last_raw_response[\"_response_ms\"] = response_obj._response_ms\n\n            # For Vertex AI / Gemini, check for provider-specific usage fields\n            if hasattr(response_obj, \"vertex_ai_usage_metadata\"):\n                self.last_raw_usage = response_obj.vertex_ai_usage_metadata\n            elif hasattr(response_obj, \"_vertex_ai_response\"):\n                self.last_raw_response = response_obj._vertex_ai_response\n\n        except Exception as e:\n            # Don't let callback errors break the main flow\n            print(f\"    [Callback] Error capturing raw response: {e}\")\n\n    def 
log_failure_event(self, kwargs, response_obj, start_time, end_time):\n        \"\"\"Called after a failed LLM API call.\"\"\"\n        self.last_raw_response = None\n        self.last_raw_usage = None\n        self.last_provider = None\n\n    def reset(self):\n        \"\"\"Reset captured data for next request.\"\"\"\n        self.last_raw_response = None\n        self.last_raw_usage = None\n        self.last_provider = None\n\n    def get_captured_data(self) -> dict[str, Any]:\n        \"\"\"Return captured raw data and reset for next request.\"\"\"\n        data = {\n            \"raw_provider_response\": self.last_raw_response,\n            \"raw_provider_usage\": self.last_raw_usage,\n            \"detected_provider\": self.last_provider,\n        }\n        self.reset()\n        return data\n\n\n# Global callback instance\nraw_stats_capture = RawProviderStatsCapture()\n\n# Register the callback with LiteLLM\nlitellm.callbacks = [raw_stats_capture]\n\n# Enable return of response headers (helps with some providers)\nlitellm.return_response_headers = True\n\n\n# =============================================================================\n# PROVIDER CONFIGURATIONS\n# =============================================================================\n# Maps provider name to Model enum and supported modalities\n# The Model enum is used for:\n# 1. Getting the LiteLLM model name via model.value\n# 2. 
Initializing PromptManager which needs a Model enum\nPROVIDER_MODALITY_SUPPORT = {\n    \"anthropic\": {\n        \"model\": Model.CLAUDE_4_5_SONNET,\n        \"supported_modalities\": [\"text-only\", \"image-only\", \"text-image\"],\n        # Note: Anthropic does not support audio\n    },\n    \"openai\": {\n        \"model\": Model.GPT_4o,\n        \"supported_modalities\": [\"text-only\", \"image-only\", \"text-image\"],\n    },\n    \"openai-audio\": {\n        \"model\": Model.GPT_4o_AUDIO_PREVIEW,\n        \"supported_modalities\": [\"audio-only\", \"text-audio\"],\n    },\n    \"gemini\": {\n        \"model\": Model.GOOGLE_GEMINI_2_5_FLASH,\n        \"supported_modalities\": [\n            \"text-only\",\n            \"image-only\",\n            \"audio-only\",\n            \"text-image\",\n            \"text-audio\",\n            \"image-audio\",\n            \"text-image-audio\",\n        ],\n    },\n    \"vertex_ai\": {\n        \"model\": Model.GEMINI_2_5_FLASH,\n        \"supported_modalities\": [\n            \"text-only\",\n            \"image-only\",\n            \"audio-only\",\n            \"text-image\",\n            \"text-audio\",\n            \"image-audio\",\n            \"text-image-audio\",\n        ],\n    },\n    \"azure\": {\n        \"model\": Model.AZURE_GPT_4o,\n        \"supported_modalities\": [\"text-only\", \"image-only\", \"text-image\"],\n    },\n}\n\n\ndef load_messages(modality: str, provider: str, messages_dir: str) -> list[dict]:\n    \"\"\"Load messages from JSON file for a given modality/provider combination.\"\"\"\n    filepath = os.path.join(messages_dir, f\"{modality}_{provider}.json\")\n    with open(filepath) as f:\n        return json.load(f)\n\n\ndef transform_messages_for_litellm(messages: list[dict]) -> list[dict]:\n    \"\"\"\n    Transform palimpzest message format to LiteLLM-compatible format.\n\n    LiteLLM expects:\n    - Messages with role and content\n    - Content can be string or list of content blocks\n  
  - No 'type' field at the message level (that's palimpzest-specific)\n\n    This function consolidates multiple user messages with different types\n    into single messages with combined content.\n    \"\"\"\n    litellm_messages = []\n\n    for msg in messages:\n        role = msg.get(\"role\")\n        msg_type = msg.get(\"type\")\n        content = msg.get(\"content\")\n\n        if role == \"system\":\n            # System messages pass through as-is\n            # Content may be string or list of content blocks (for caching)\n            litellm_messages.append({\"role\": \"system\", \"content\": content})\n\n        elif role == \"user\":\n            # User messages need consolidation\n            if msg_type == \"text\":\n                # Text content - string or list of content blocks\n                if litellm_messages and litellm_messages[-1][\"role\"] == \"user\":\n                    # Merge with existing user message\n                    existing = litellm_messages[-1][\"content\"]\n                    if isinstance(existing, str):\n                        if isinstance(content, str):\n                            litellm_messages[-1][\"content\"] = [\n                                {\"type\": \"text\", \"text\": existing},\n                                {\"type\": \"text\", \"text\": content},\n                            ]\n                        else:\n                            litellm_messages[-1][\"content\"] = [\n                                {\"type\": \"text\", \"text\": existing}\n                            ] + content\n                    else:\n                        if isinstance(content, str):\n                            existing.append({\"type\": \"text\", \"text\": content})\n                        else:\n                            existing.extend(content)\n                else:\n                    litellm_messages.append({\"role\": \"user\", \"content\": content})\n\n            elif msg_type == \"image\":\n             
   # Image content - list of image_url blocks\n                if litellm_messages and litellm_messages[-1][\"role\"] == \"user\":\n                    existing = litellm_messages[-1][\"content\"]\n                    if isinstance(existing, str):\n                        litellm_messages[-1][\"content\"] = [\n                            {\"type\": \"text\", \"text\": existing}\n                        ] + content\n                    else:\n                        existing.extend(content)\n                else:\n                    litellm_messages.append({\"role\": \"user\", \"content\": content})\n\n            elif msg_type == \"input_audio\":\n                # Audio content - list of input_audio blocks\n                if litellm_messages and litellm_messages[-1][\"role\"] == \"user\":\n                    existing = litellm_messages[-1][\"content\"]\n                    if isinstance(existing, str):\n                        litellm_messages[-1][\"content\"] = [\n                            {\"type\": \"text\", \"text\": existing}\n                        ] + content\n                    else:\n                        existing.extend(content)\n                else:\n                    litellm_messages.append({\"role\": \"user\", \"content\": content})\n\n        elif role == \"assistant\":\n            litellm_messages.append({\"role\": \"assistant\", \"content\": content})\n\n    return litellm_messages\n\n\ndef call_litellm_api(\n    messages: list[dict],\n    model: Model,\n    provider: str,\n    cache_key: str | None = None,\n) -> dict[str, Any]:\n    \"\"\"\n    Call LiteLLM completion API and return all usage statistics.\n\n    This function captures both:\n    - Option A: Raw provider usage via callback (if available)\n    - Option C: Normalized LiteLLM usage (fallback)\n\n    Args:\n        messages: List of message dicts (palimpzest format)\n        model: Model enum (used for model name and provider detection)\n        provider: Provider name for 
logging\n        cache_key: Optional prompt_cache_key for OpenAI sticky routing to same cache shard\n\n    Returns dict with:\n    - usage: Normalized usage dict from LiteLLM response (Option C fallback)\n    - usage_raw: Raw provider usage if captured via callback (Option A)\n    - response_content: First 200 chars of response\n    - model: Model used\n    - raw_response: Full response object serialized\n    \"\"\"\n    # Reset the callback to capture fresh data for this request\n    raw_stats_capture.reset()\n\n    # Transform messages to LiteLLM format\n    litellm_messages = transform_messages_for_litellm(messages)\n\n    # Get the LiteLLM model name from the Model enum\n    model_name = model.value\n\n    # Set up completion kwargs\n    completion_kwargs = {\n        \"temperature\": 0.0,\n    }\n\n    # Add modalities for audio models\n    if \"audio\" in model_name.lower():\n        completion_kwargs[\"modalities\"] = [\"text\"]\n\n    # Apply provider-specific caching configuration\n    # Messages from generator_messages already have cache_control markers for Anthropic\n    if (model.is_provider_openai() or model.is_provider_azure()) and cache_key:\n        # OpenAI: Use prompt_cache_key for sticky routing to the same cache shard\n        # https://platform.openai.com/docs/guides/prompt-caching\n        completion_kwargs[\"extra_body\"] = {\"prompt_cache_key\": cache_key}\n\n    # Make the LiteLLM call\n    response = litellm.completion(\n        model=model_name,\n        messages=litellm_messages,\n        **completion_kwargs,\n    )\n\n    # ==========================================================================\n    # Option C (Fallback): Extract normalized usage stats from LiteLLM response\n    # ==========================================================================\n    usage_normalized = {}\n    if response.usage:\n        usage_normalized = response.usage.model_dump()\n\n    # 
==========================================================================\n    # Option A: Get raw provider data captured by callback\n    # ==========================================================================\n    callback_data = raw_stats_capture.get_captured_data()\n    usage_raw = callback_data.get(\"raw_provider_usage\")\n\n    # Also try to extract raw usage from _hidden_params\n    hidden_params = {}\n    try:\n        if hasattr(response, \"_hidden_params\") and response._hidden_params:\n            hidden_params = dict(response._hidden_params)\n            # Some providers store original response here\n            if \"original_response\" in hidden_params:\n                original = hidden_params[\"original_response\"]\n                if isinstance(original, dict) and \"usage_metadata\" in original:\n                    usage_raw = original[\"usage_metadata\"]\n                elif hasattr(original, \"usage_metadata\"):\n                    with contextlib.suppress(Exception):\n                        usage_raw = original.usage_metadata.model_dump() if hasattr(original.usage_metadata, \"model_dump\") else dict(original.usage_metadata)\n    except Exception:\n        pass\n\n    # Get response text safely\n    try:\n        response_text = (\n            response.choices[0].message.content[:200]\n            if response.choices and response.choices[0].message.content\n            else None\n        )\n    except Exception:\n        response_text = None\n\n    # Serialize the full response for debugging\n    try:\n        raw_response = response.model_dump()\n    except Exception:\n        raw_response = str(response)\n\n    return {\n        \"provider\": provider,\n        \"model\": model_name,\n        \"usage\": usage_normalized,  # Option C: Normalized LiteLLM format\n        \"usage_raw\": usage_raw,  # Option A: Raw provider format (if captured)\n        \"response_content\": response_text,\n        \"raw_response\": raw_response,\n        
\"hidden_params\": hidden_params,\n        \"callback_data\": callback_data,\n    }\n\n\ndef capture_stats_for_provider(\n    provider: str,\n    modality: str,\n    messages: list[dict],\n    model: Model,\n) -> dict[str, Any]:\n    \"\"\"\n    Capture stats for a provider by making two requests with a delay.\n\n    Args:\n        provider: Provider name (for logging and file naming)\n        modality: Modality name\n        messages: List of message dicts\n        model: Model enum\n\n    Returns dict with:\n    - first_request: stats from first request\n    - second_request: stats from second request (should show cache hits)\n    \"\"\"\n    # Generate a unique cache key for OpenAI (ensures both requests hit the same cache shard)\n    # Reference: capture_provider_stats.py and PromptManager.__init__\n    openai_cache_key = f\"pz-test-{uuid.uuid4().hex[:12]}\" if provider in (\"openai\", \"openai-audio\", \"azure\") else None\n\n    print(\"    First request...\")\n    first_stats = call_litellm_api(messages, model, provider, cache_key=openai_cache_key)\n    print(f\"      Usage: {first_stats['usage']}\")\n\n    print(\"    Waiting 20 seconds for cache to be available...\")\n    time.sleep(20)\n\n    print(\"    Second request (should show cache hits)...\")\n    second_stats = call_litellm_api(messages, model, provider, cache_key=openai_cache_key)\n    print(f\"      Usage: {second_stats['usage']}\")\n\n    return {\n        \"provider\": provider,\n        \"model\": model.value,\n        \"modality\": modality,\n        \"first_request\": first_stats,\n        \"second_request\": second_stats,\n    }\n\n\ndef save_stats(stats: dict[str, Any], output_dir: str, provider: str, modality: str) -> str:\n    \"\"\"Save stats to a JSON file.\"\"\"\n    os.makedirs(output_dir, exist_ok=True)\n    output_path = os.path.join(output_dir, f\"{provider}_{modality}.json\")\n\n    with open(output_path, \"w\") as f:\n        json.dump(stats, f, indent=2, default=str)\n\n    
return output_path\n\n\ndef main():\n    \"\"\"Capture LiteLLM stats for supported provider/modality combinations.\"\"\"\n    parser = argparse.ArgumentParser(\n        description=\"Capture token/cost statistics from LLM providers via LiteLLM.\",\n        formatter_class=argparse.RawDescriptionHelpFormatter,\n        epilog=f\"\"\"\nAvailable providers: {', '.join(PROVIDER_MODALITY_SUPPORT.keys())}\nAvailable modalities: text-only, image-only, audio-only, text-image, text-audio, image-audio, text-image-audio\n\nExamples:\n  python capture_litellm_stats.py                              # Run all providers/modalities\n  python capture_litellm_stats.py -p openai                    # Run all modalities for OpenAI\n  python capture_litellm_stats.py -p openai -m text-only       # Run only text-only for OpenAI\n  python capture_litellm_stats.py -p anthropic -m text-image   # Run only text-image for Anthropic\n        \"\"\"\n    )\n    parser.add_argument(\n        \"-p\", \"--provider\",\n        nargs=\"+\",\n        choices=list(PROVIDER_MODALITY_SUPPORT.keys()),\n        help=\"Provider(s) to run. If not specified, runs all providers.\",\n    )\n    parser.add_argument(\n        \"-m\", \"--modality\",\n        nargs=\"+\",\n        choices=[\"text-only\", \"image-only\", \"audio-only\", \"text-image\", \"text-audio\", \"image-audio\", \"text-image-audio\"],\n        help=\"Modality(ies) to run. 
If not specified, runs all supported modalities for each provider.\",\n    )\n    args = parser.parse_args()\n\n    messages_dir = os.path.join(\n        os.path.dirname(__file__),\n        \"..\",\n        \"tests\",\n        \"pytest\",\n        \"data\",\n        \"generator_messages\",\n    )\n    messages_dir = os.path.abspath(messages_dir)\n\n    output_dir = os.path.join(\n        os.path.dirname(__file__),\n        \"litellm_stats\",\n    )\n    output_dir = os.path.abspath(output_dir)\n\n    print(f\"Loading messages from: {messages_dir}\")\n    print(f\"Saving stats to: {output_dir}\\n\")\n\n    # Ensure output directory exists\n    os.makedirs(output_dir, exist_ok=True)\n\n    # Determine which providers to run\n    providers_to_run = args.provider if args.provider else list(PROVIDER_MODALITY_SUPPORT.keys())\n    print(f\"Providers to run: {providers_to_run}\\n\")\n\n    for provider in providers_to_run:\n        config = PROVIDER_MODALITY_SUPPORT[provider]\n        model = config[\"model\"]\n        supported_modalities = config[\"supported_modalities\"]\n\n        # Filter modalities if specified\n        if args.modality:\n            modalities_to_run = [m for m in args.modality if m in supported_modalities]\n            if not modalities_to_run:\n                print(f\"\\nProvider: {provider} - SKIPPED (none of {args.modality} supported)\")\n                continue\n        else:\n            modalities_to_run = supported_modalities\n\n        print(f\"\\nProvider: {provider} (model: {model.value})\")\n        print(f\"  Modalities to run: {modalities_to_run}\")\n\n        for modality in modalities_to_run:\n            print(f\"\\n  Processing modality: {modality}\")\n\n            try:\n                messages = load_messages(modality, provider, messages_dir)\n                print(f\"    Loaded {len(messages)} messages from {modality}_{provider}.json\")\n\n                stats = capture_stats_for_provider(provider, modality, messages, 
model)\n\n                output_path = save_stats(stats, output_dir, provider, modality)\n                print(f\"    Saved to: {output_path}\")\n\n            except FileNotFoundError as e:\n                print(f\"    SKIPPED: Message file not found - {e}\")\n            except Exception as e:\n                print(f\"    ERROR: {e}\")\n                import traceback\n                traceback.print_exc()\n\n    print(\"\\nDone!\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/capture_provider_stats.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nScript to directly invoke LLM providers and capture token/cost statistics.\n\nThis script:\n1. Loads messages from JSON files generated by generate_test_messages.py\n2. Sends requests directly to each provider's API (not through litellm)\n3. Saves the token/cost related stats returned by the provider\n4. Waits 10 seconds\n5. Sends the request again and saves the second set of stats\n\nThis allows us to establish baseline expectations for what the providers return,\nwhich can then be used to validate the palimpzest generator's stats tracking.\n\nSupported providers:\n- Anthropic: claude-sonnet-4-5-20250929 (text, image, text+image)\n- Google/Vertex AI: gemini-2.5-flash (all seven modality combinations)\n- OpenAI: gpt-4o-2024-08-06 (text, image, text+image)\n- OpenAI: gpt-4o-audio-preview (text+audio, audio)\n- Azure: gpt-4o-2024-08-06 via Azure OpenAI (text, image, text+image)\n\nOutput files are saved to: tests/pytest/scripts/provider_stats/\n\"\"\"\n\nimport argparse\nimport base64\nimport json\nimport os\nimport sys\nimport time\nimport uuid\nfrom typing import Any\n\n\ndef detect_image_media_type(base64_data: str) -> str:\n    \"\"\"\n    Detect the actual image format from base64 data by examining the magic bytes.\n\n    Args:\n        base64_data: Base64-encoded image data.\n\n    Returns:\n        The detected media type (e.g., 'image/png', 'image/jpeg').\n        Defaults to 'image/jpeg' if format cannot be determined.\n    \"\"\"\n    try:\n        # Decode first few bytes to check magic numbers\n        header = base64.b64decode(base64_data[:32])\n\n        # PNG: 89 50 4E 47 0D 0A 1A 0A\n        if header[:8] == b\"\\x89PNG\\r\\n\\x1a\\n\":\n            return \"image/png\"\n        # JPEG: FF D8 FF\n        if header[:3] == b\"\\xff\\xd8\\xff\":\n            return \"image/jpeg\"\n        # GIF: GIF87a or GIF89a\n        if header[:6] in (b\"GIF87a\", b\"GIF89a\"):\n            return \"image/gif\"\n        
# WebP: RIFF....WEBP\n        if header[:4] == b\"RIFF\" and header[8:12] == b\"WEBP\":\n            return \"image/webp\"\n    except Exception:\n        pass\n\n    return \"image/jpeg\"\n\n# Add project root to path\nsys.path.insert(0, os.path.join(os.path.dirname(__file__), \"..\"))\n\n\n# =============================================================================\n# PROVIDER CONFIGURATIONS\n# =============================================================================\nPROVIDER_MODALITY_SUPPORT = {\n    \"anthropic\": {\n        \"model\": \"claude-sonnet-4-5-20250929\",\n        \"supported_modalities\": [\"text-only\", \"image-only\", \"text-image\"],\n        # Note: Anthropic does not support audio\n    },\n    \"openai\": {\n        \"model\": \"gpt-4o-2024-08-06\",\n        \"supported_modalities\": [\"text-only\", \"image-only\", \"text-image\"],\n    },\n    \"openai-audio\": {\n        \"model\": \"gpt-4o-audio-preview\",\n        \"supported_modalities\": [\"audio-only\", \"text-audio\"],\n    },\n    \"gemini\": {\n        \"model\": \"gemini-2.5-flash\",\n        \"supported_modalities\": [\n            \"text-only\",\n            \"image-only\",\n            \"audio-only\",\n            \"text-image\",\n            \"text-audio\",\n            \"image-audio\",\n            \"text-image-audio\",\n        ],\n    },\n    \"vertex_ai\": {\n        \"model\": \"gemini-2.5-flash\",\n        \"supported_modalities\": [\n            \"text-only\",\n            \"image-only\",\n            \"audio-only\",\n            \"text-image\",\n            \"text-audio\",\n            \"image-audio\",\n            \"text-image-audio\",\n        ],\n    },\n    \"azure\": {\n        \"model\": \"gpt-4o-2024-08-06\",\n        \"supported_modalities\": [\"text-only\", \"image-only\", \"text-image\"],\n    },\n}\n\n\ndef load_messages(modality: str, provider: str, messages_dir: str) -> list[dict]:\n    \"\"\"Load messages from JSON file for a given modality/provider 
combination.\"\"\"\n    filepath = os.path.join(messages_dir, f\"{modality}_{provider}.json\")\n    with open(filepath) as f:\n        return json.load(f)\n\n\ndef transform_messages_for_openai(messages: list[dict]) -> list[dict]:\n    \"\"\"\n    Transform palimpzest/litellm message format to OpenAI API format.\n\n    OpenAI expects:\n    - system messages with string content\n    - user messages with string content or array of content parts\n\n    Input messages may have content as string or list of content blocks.\n    \"\"\"\n    openai_messages = []\n\n    for msg in messages:\n        role = msg.get(\"role\")\n        msg_type = msg.get(\"type\")\n        content = msg.get(\"content\")\n\n        if role == \"system\":\n            # System content may be string or list of content blocks\n            if isinstance(content, list):\n                # Extract text from content blocks\n                text_parts = [block.get(\"text\", \"\") for block in content if block.get(\"type\") == \"text\"]\n                openai_messages.append({\"role\": \"system\", \"content\": \"\".join(text_parts)})\n            else:\n                openai_messages.append({\"role\": \"system\", \"content\": content})\n\n        elif role == \"user\":\n            if msg_type == \"text\":\n                # Content may be string or list of content blocks\n                if isinstance(content, list):\n                    # Already content blocks - add them directly\n                    content_parts = []\n                    for block in content:\n                        # Convert to OpenAI format (remove cache_control if present)\n                        openai_block = {\"type\": block.get(\"type\", \"text\")}\n                        if block.get(\"type\") == \"text\":\n                            openai_block[\"text\"] = block.get(\"text\", \"\")\n                        content_parts.append(openai_block)\n\n                    if openai_messages and openai_messages[-1][\"role\"] 
== \"user\":\n                        existing_content = openai_messages[-1][\"content\"]\n                        if isinstance(existing_content, str):\n                            openai_messages[-1][\"content\"] = [\n                                {\"type\": \"text\", \"text\": existing_content}\n                            ] + content_parts\n                        else:\n                            existing_content.extend(content_parts)\n                    else:\n                        openai_messages.append({\"role\": \"user\", \"content\": content_parts})\n                else:\n                    # String content\n                    if openai_messages and openai_messages[-1][\"role\"] == \"user\":\n                        existing_content = openai_messages[-1][\"content\"]\n                        if isinstance(existing_content, str):\n                            openai_messages[-1][\"content\"] = [\n                                {\"type\": \"text\", \"text\": existing_content},\n                                {\"type\": \"text\", \"text\": content},\n                            ]\n                        else:\n                            existing_content.append({\"type\": \"text\", \"text\": content})\n                    else:\n                        openai_messages.append({\"role\": \"user\", \"content\": content})\n\n            elif msg_type == \"image\":\n                # Image content\n                image_parts = []\n                for img in content:\n                    if img.get(\"type\") == \"image_url\":\n                        image_parts.append(img)\n\n                if openai_messages and openai_messages[-1][\"role\"] == \"user\":\n                    existing_content = openai_messages[-1][\"content\"]\n                    if isinstance(existing_content, str):\n                        openai_messages[-1][\"content\"] = [\n                            {\"type\": \"text\", \"text\": existing_content}\n                        ] + 
image_parts\n                    else:\n                        existing_content.extend(image_parts)\n                else:\n                    openai_messages.append({\"role\": \"user\", \"content\": image_parts})\n\n            elif msg_type == \"input_audio\":\n                # Audio content\n                audio_parts = []\n                for audio in content:\n                    if audio.get(\"type\") == \"input_audio\":\n                        audio_parts.append(audio)\n\n                if openai_messages and openai_messages[-1][\"role\"] == \"user\":\n                    existing_content = openai_messages[-1][\"content\"]\n                    if isinstance(existing_content, str):\n                        openai_messages[-1][\"content\"] = [\n                            {\"type\": \"text\", \"text\": existing_content}\n                        ] + audio_parts\n                    else:\n                        existing_content.extend(audio_parts)\n                else:\n                    openai_messages.append({\"role\": \"user\", \"content\": audio_parts})\n\n    return openai_messages\n\n\ndef transform_messages_for_anthropic(messages: list[dict]) -> tuple[list[dict] | None, list[dict]]:\n    \"\"\"\n    Transform palimpzest/litellm message format to Anthropic API format.\n\n    Input messages may already have cache_control markers from PromptManager.\n    This function preserves those markers while converting to Anthropic's native format.\n\n    Anthropic expects:\n    - system as a separate parameter (not in messages)\n    - user/assistant messages with content as array of content blocks\n    - cache_control markers for caching (preserved from input)\n    \"\"\"\n    system_prompt = None\n    anthropic_messages = []\n\n    for msg in messages:\n        role = msg.get(\"role\")\n        msg_type = msg.get(\"type\")\n        content = msg.get(\"content\")\n\n        if role == \"system\":\n            # Anthropic uses system as a separate parameter\n       
     # Content may already be a list of content blocks with cache_control\n            if isinstance(content, list):\n                # Already in content block format (from PromptManager)\n                system_prompt = content\n            else:\n                # String content - wrap in content block with cache_control\n                system_prompt = [\n                    {\n                        \"type\": \"text\",\n                        \"text\": content,\n                        \"cache_control\": {\"type\": \"ephemeral\"},\n                    }\n                ]\n\n        elif role == \"user\":\n            if msg_type == \"text\":\n                # Content may be string or list of content blocks\n                if isinstance(content, list):\n                    # Already content blocks (may have cache_control) - preserve them\n                    for block in content:\n                        if anthropic_messages and anthropic_messages[-1][\"role\"] == \"user\":\n                            anthropic_messages[-1][\"content\"].append(block)\n                        else:\n                            anthropic_messages.append({\"role\": \"user\", \"content\": [block]})\n                else:\n                    # String content\n                    content_block = {\"type\": \"text\", \"text\": content}\n                    if anthropic_messages and anthropic_messages[-1][\"role\"] == \"user\":\n                        anthropic_messages[-1][\"content\"].append(content_block)\n                    else:\n                        anthropic_messages.append({\"role\": \"user\", \"content\": [content_block]})\n\n            elif msg_type == \"image\":\n                # Image content - Anthropic uses base64 format\n                for img in content:\n                    if img.get(\"type\") == \"image_url\":\n                        url = img[\"image_url\"][\"url\"]\n                        if url.startswith(\"data:\"):\n                            
# Extract base64 data\n                            _, data = url.split(\";base64,\")\n                            # Detect actual media type from image data (in case URL has wrong type)\n                            media_type = detect_image_media_type(data)\n                            image_block = {\n                                \"type\": \"image\",\n                                \"source\": {\n                                    \"type\": \"base64\",\n                                    \"media_type\": media_type,\n                                    \"data\": data,\n                                },\n                            }\n                            # Preserve cache_control if present on the original block\n                            if \"cache_control\" in img:\n                                image_block[\"cache_control\"] = img[\"cache_control\"]\n                            if anthropic_messages and anthropic_messages[-1][\"role\"] == \"user\":\n                                anthropic_messages[-1][\"content\"].append(image_block)\n                            else:\n                                anthropic_messages.append({\"role\": \"user\", \"content\": [image_block]})\n\n    return system_prompt, anthropic_messages\n\n\ndef transform_messages_for_gemini(messages: list[dict]) -> tuple[str | None, list[dict]]:\n    \"\"\"\n    Transform palimpzest/litellm message format to Gemini API format.\n\n    Gemini expects:\n    - role: \"user\" or \"model\"\n    - parts: list of content parts\n\n    Input messages may have content as string or list of content blocks.\n    \"\"\"\n    gemini_contents = []\n    system_instruction = None\n\n    for msg in messages:\n        role = msg.get(\"role\")\n        msg_type = msg.get(\"type\")\n        content = msg.get(\"content\")\n\n        if role == \"system\":\n            # Gemini uses system_instruction\n            # Content may be string or list of content blocks\n            if isinstance(content, 
list):\n                # Extract text from content blocks\n                text_parts = [block.get(\"text\", \"\") for block in content if block.get(\"type\") == \"text\"]\n                system_instruction = \"\".join(text_parts)\n            else:\n                system_instruction = content\n\n        elif role == \"user\":\n            parts = []\n\n            if msg_type == \"text\":\n                # Content may be string or list of content blocks\n                if isinstance(content, list):\n                    for block in content:\n                        if block.get(\"type\") == \"text\":\n                            parts.append({\"text\": block.get(\"text\", \"\")})\n                else:\n                    parts.append({\"text\": content})\n\n            elif msg_type == \"image\":\n                for img in content:\n                    if img.get(\"type\") == \"image_url\":\n                        url = img[\"image_url\"][\"url\"]\n                        if url.startswith(\"data:\"):\n                            _, data = url.split(\";base64,\")\n                            # Detect actual media type from image data\n                            media_type = detect_image_media_type(data)\n                            parts.append({\n                                \"inline_data\": {\n                                    \"mime_type\": media_type,\n                                    \"data\": data,\n                                }\n                            })\n\n            elif msg_type == \"input_audio\":\n                for audio in content:\n                    if audio.get(\"type\") == \"input_audio\":\n                        audio_data = audio[\"input_audio\"]\n                        parts.append({\n                            \"inline_data\": {\n                                \"mime_type\": f\"audio/{audio_data.get('format', 'wav')}\",\n                                \"data\": audio_data[\"data\"],\n                         
   }\n                        })\n\n            if parts:\n                if gemini_contents and gemini_contents[-1][\"role\"] == \"user\":\n                    gemini_contents[-1][\"parts\"].extend(parts)\n                else:\n                    gemini_contents.append({\"role\": \"user\", \"parts\": parts})\n\n    return system_instruction, gemini_contents\n\n\ndef call_openai_api(messages: list[dict], model: str, cache_key: str | None = None) -> dict[str, Any]:\n    \"\"\"\n    Call OpenAI API directly and return usage statistics.\n\n    Args:\n        messages: List of message dicts\n        model: Model name\n        cache_key: Optional prompt_cache_key for sticky routing to same cache shard\n\n    Returns dict with:\n    - completion_tokens\n    - prompt_tokens\n    - prompt_tokens_details (cached_tokens, text_tokens, image_tokens, audio_tokens)\n    - total_tokens\n    \"\"\"\n    import openai\n\n    client = openai.OpenAI()\n\n    openai_messages = transform_messages_for_openai(messages)\n\n    kwargs = {\"model\": model, \"messages\": openai_messages, \"temperature\": 0.0}\n\n    # Check if this is an audio model\n    if \"audio\" in model:\n        kwargs[\"modalities\"] = [\"text\"]\n\n    # Add prompt_cache_key for caching (ensures requests route to same cache shard)\n    if cache_key:\n        kwargs[\"extra_body\"] = {\"prompt_cache_key\": cache_key}\n\n    response = client.chat.completions.create(**kwargs)\n\n    # Extract complete usage stats\n    usage_dict = {}\n    if response.usage:\n        usage_dict = response.usage.model_dump()\n\n    # Get response text safely\n    try:\n        response_text = response.choices[0].message.content[:200] if response.choices and response.choices[0].message.content else None\n    except Exception:\n        response_text = None\n\n    # Serialize the full response\n    try:\n        raw_response = response.model_dump()\n    except Exception:\n        raw_response = str(response)\n\n    return {\n        
\"provider\": \"openai\",\n        \"model\": model,\n        \"usage\": usage_dict,\n        \"response_content\": response_text,\n        \"raw_response\": raw_response,\n    }\n\n\n# NOTE: this function was generated speculatively and has not been tested, so it may have errors\ndef call_azure_api(messages: list[dict], model: str, cache_key: str | None = None) -> dict[str, Any]:\n    \"\"\"\n    Call Azure OpenAI API directly and return usage statistics.\n\n    Uses the same message format as OpenAI, but routes through Azure endpoints.\n\n    Args:\n        messages: List of message dicts\n        model: Model name (deployment name)\n        cache_key: Optional prompt_cache_key for sticky routing to same cache shard\n\n    Returns dict with:\n    - completion_tokens\n    - prompt_tokens\n    - prompt_tokens_details (cached_tokens, text_tokens, image_tokens, audio_tokens)\n    - total_tokens\n    \"\"\"\n    import openai\n\n    api_key = os.environ.get(\"AZURE_API_KEY\") or os.environ.get(\"AZURE_OPENAI_API_KEY\")\n    azure_endpoint = os.environ.get(\"AZURE_API_BASE\")\n    api_version = os.environ.get(\"AZURE_API_VERSION\", \"2024-12-01-preview\")\n\n    if not api_key:\n        raise ValueError(\"AZURE_API_KEY or AZURE_OPENAI_API_KEY must be set\")\n    if not azure_endpoint:\n        raise ValueError(\"AZURE_API_BASE must be set\")\n\n    client = openai.AzureOpenAI(\n        api_key=api_key,\n        azure_endpoint=azure_endpoint,\n        api_version=api_version,\n    )\n\n    openai_messages = transform_messages_for_openai(messages)\n\n    kwargs = {\"model\": model, \"messages\": openai_messages, \"temperature\": 0.0}\n\n    # Add prompt_cache_key for caching (ensures requests route to same cache shard)\n    if cache_key:\n        kwargs[\"extra_body\"] = {\"prompt_cache_key\": cache_key}\n\n    response = client.chat.completions.create(**kwargs)\n\n    # Extract complete usage stats\n    usage_dict = {}\n    if response.usage:\n        usage_dict = 
response.usage.model_dump()\n\n    # Get response text safely\n    try:\n        response_text = response.choices[0].message.content[:200] if response.choices and response.choices[0].message.content else None\n    except Exception:\n        response_text = None\n\n    # Serialize the full response\n    try:\n        raw_response = response.model_dump()\n    except Exception:\n        raw_response = str(response)\n\n    return {\n        \"provider\": \"azure\",\n        \"model\": model,\n        \"usage\": usage_dict,\n        \"response_content\": response_text,\n        \"raw_response\": raw_response,\n    }\n\n\ndef call_anthropic_api(messages: list[dict], model: str) -> dict[str, Any]:\n    \"\"\"\n    Call Anthropic API directly and return usage statistics.\n\n    Returns dict with:\n    - input_tokens\n    - output_tokens\n    - cache_creation_input_tokens\n    - cache_read_input_tokens\n    \"\"\"\n    import anthropic\n\n    client = anthropic.Anthropic()\n\n    system_prompt, anthropic_messages = transform_messages_for_anthropic(messages)\n\n    response = client.messages.create(\n        model=model,\n        max_tokens=1024,\n        system=system_prompt,\n        messages=anthropic_messages,\n    )\n\n    # Extract complete usage stats\n    usage_dict = {}\n    if response.usage:\n        usage_dict = response.usage.model_dump()\n\n    # Get response text safely\n    try:\n        response_text = response.content[0].text[:200] if response.content and response.content[0].text else None\n    except Exception:\n        response_text = None\n\n    # Serialize the full response\n    try:\n        raw_response = response.model_dump()\n    except Exception:\n        raw_response = str(response)\n\n    return {\n        \"provider\": \"anthropic\",\n        \"model\": model,\n        \"usage\": usage_dict,\n        \"response_content\": response_text,\n        \"raw_response\": raw_response,\n    }\n\n\ndef call_gemini_api(messages: list[dict], model: str, 
use_vertex: bool = False) -> dict[str, Any]:\n    \"\"\"\n    Call Gemini API directly and return usage statistics.\n\n    Args:\n        messages: List of message dicts\n        model: Model name\n        use_vertex: If True, use Vertex AI; otherwise use Google AI Studio\n\n    Returns dict with usage statistics.\n    \"\"\"\n    from google import genai\n    from google.genai import types\n\n    system_instruction, gemini_contents = transform_messages_for_gemini(messages)\n\n    # Create client for Google AI Studio or Vertex AI\n    if use_vertex:\n        # Vertex AI requires project and location\n        client = genai.Client(\n            vertexai=True,\n            project=os.environ.get(\"GOOGLE_CLOUD_PROJECT\", os.environ.get(\"VERTEXAI_PROJECT\")),\n            location=os.environ.get(\"GOOGLE_CLOUD_LOCATION\", os.environ.get(\"VERTEXAI_LOCATION\", \"us-central1\")),\n        )\n    else:\n        # Google AI Studio uses API key from environment\n        client = genai.Client()\n\n    # Build the config\n    config = types.GenerateContentConfig(\n        temperature=0.0,\n        system_instruction=system_instruction if system_instruction else None,\n    )\n\n    response = client.models.generate_content(\n        model=model,\n        contents=gemini_contents,\n        config=config,\n    )\n\n    # Extract complete usage stats from usage_metadata\n    usage_metadata = response.usage_metadata\n    usage_dict = {}\n    if usage_metadata:\n        # Try model_dump() first (Pydantic models), then to_dict(), then manual extraction\n        try:\n            usage_dict = usage_metadata.model_dump()\n        except AttributeError:\n            try:\n                usage_dict = usage_metadata.to_dict()\n            except AttributeError:\n                # Manual extraction of known Gemini usage fields\n                usage_dict = {\n                    \"prompt_token_count\": getattr(usage_metadata, \"prompt_token_count\", None),\n          
          \"candidates_token_count\": getattr(usage_metadata, \"candidates_token_count\", None),\n                    \"total_token_count\": getattr(usage_metadata, \"total_token_count\", None),\n                    \"cached_content_token_count\": getattr(usage_metadata, \"cached_content_token_count\", None),\n                }\n\n    # Get response text safely\n    try:\n        response_text = response.text[:200] if response.text else None\n    except Exception:\n        response_text = None\n\n    # Serialize the full response\n    try:\n        # Try model_dump() first (Pydantic models)\n        raw_response = response.model_dump()\n    except AttributeError:\n        try:\n            raw_response = response.to_dict()\n        except AttributeError:\n            # Manual serialization\n            try:\n                raw_response = {\n                    \"text\": response.text if hasattr(response, \"text\") else None,\n                    \"candidates\": [\n                        {\n                            \"content\": {\n                                \"parts\": [{\"text\": getattr(part, \"text\", str(part))} for part in c.content.parts] if c.content and c.content.parts else [],\n                                \"role\": c.content.role if c.content else None,\n                            },\n                            \"finish_reason\": str(c.finish_reason) if hasattr(c, \"finish_reason\") else None,\n                        }\n                        for c in (response.candidates or [])\n                    ],\n                    \"usage_metadata\": usage_dict,\n                    \"model_version\": getattr(response, \"model_version\", None),\n                }\n            except Exception as e:\n                raw_response = {\"error\": str(e), \"response_str\": str(response)}\n\n    return {\n        \"provider\": \"vertex_ai\" if use_vertex else \"gemini\",\n        \"model\": model,\n        \"usage\": usage_dict,\n        
\"response_content\": response_text,\n        \"raw_response\": raw_response,\n    }\n\n\ndef capture_stats_for_provider(\n    provider: str,\n    modality: str,\n    messages: list[dict],\n    model: str,\n) -> dict[str, Any]:\n    \"\"\"\n    Capture stats for a provider by making two requests with a 10-second gap.\n\n    Returns dict with:\n    - first_request: stats from first request\n    - second_request: stats from second request (should show cache hits)\n    \"\"\"\n    # Generate a unique cache key for OpenAI/Azure (ensures both requests hit the same cache shard)\n    openai_cache_key = f\"pz-test-{uuid.uuid4().hex[:12]}\" if provider in (\"openai\", \"openai-audio\", \"azure\") else None\n\n    print(\"    First request...\")\n    if provider == \"openai\" or provider == \"openai-audio\":\n        first_stats = call_openai_api(messages, model, cache_key=openai_cache_key)\n    elif provider == \"azure\":\n        first_stats = call_azure_api(messages, model, cache_key=openai_cache_key)\n    elif provider == \"anthropic\":\n        first_stats = call_anthropic_api(messages, model)\n    elif provider == \"gemini\":\n        first_stats = call_gemini_api(messages, model, use_vertex=False)\n    elif provider == \"vertex_ai\":\n        first_stats = call_gemini_api(messages, model, use_vertex=True)\n    else:\n        raise ValueError(f\"Unknown provider: {provider}\")\n\n    print(f\"      Usage: {first_stats['usage']}\")\n\n    print(\"    Waiting 20 seconds for cache to be available...\")\n    time.sleep(20)\n\n    print(\"    Second request (should show cache hits)...\")\n    if provider == \"openai\" or provider == \"openai-audio\":\n        second_stats = call_openai_api(messages, model, cache_key=openai_cache_key)\n    elif provider == \"azure\":\n        second_stats = call_azure_api(messages, model, cache_key=openai_cache_key)\n    elif provider == \"anthropic\":\n        second_stats = call_anthropic_api(messages, model)\n    elif provider == 
\"gemini\":\n        second_stats = call_gemini_api(messages, model, use_vertex=False)\n    elif provider == \"vertex_ai\":\n        second_stats = call_gemini_api(messages, model, use_vertex=True)\n\n    print(f\"      Usage: {second_stats['usage']}\")\n\n    return {\n        \"provider\": provider,\n        \"model\": model,\n        \"modality\": modality,\n        \"first_request\": first_stats,\n        \"second_request\": second_stats,\n    }\n\n\ndef save_stats(stats: dict[str, Any], output_dir: str, provider: str, modality: str) -> str:\n    \"\"\"Save stats to a JSON file.\"\"\"\n    os.makedirs(output_dir, exist_ok=True)\n    output_path = os.path.join(output_dir, f\"{provider}_{modality}.json\")\n\n    with open(output_path, \"w\") as f:\n        json.dump(stats, f, indent=2)\n\n    return output_path\n\n\ndef main():\n    \"\"\"Capture provider stats for supported provider/modality combinations.\"\"\"\n    parser = argparse.ArgumentParser(\n        description=\"Capture token/cost statistics from LLM providers.\",\n        formatter_class=argparse.RawDescriptionHelpFormatter,\n        epilog=f\"\"\"\nAvailable providers: {', '.join(PROVIDER_MODALITY_SUPPORT.keys())}\nAvailable modalities: text-only, image-only, audio-only, text-image, text-audio, image-audio, text-image-audio\n\nExamples:\n  python capture_provider_stats.py                              # Run all providers/modalities\n  python capture_provider_stats.py -p openai                    # Run all modalities for OpenAI\n  python capture_provider_stats.py -p openai -m text-only       # Run only text-only for OpenAI\n  python capture_provider_stats.py -p anthropic -m text-image   # Run only text-image for Anthropic\n        \"\"\"\n    )\n    parser.add_argument(\n        \"-p\", \"--provider\",\n        nargs=\"+\",\n        choices=list(PROVIDER_MODALITY_SUPPORT.keys()),\n        help=\"Provider(s) to run. 
If not specified, runs all providers.\",\n    )\n    parser.add_argument(\n        \"-m\", \"--modality\",\n        nargs=\"+\",\n        choices=[\"text-only\", \"image-only\", \"audio-only\", \"text-image\", \"text-audio\", \"image-audio\", \"text-image-audio\"],\n        help=\"Modality(ies) to run. If not specified, runs all supported modalities for each provider.\",\n    )\n    args = parser.parse_args()\n\n    messages_dir = os.path.join(\n        os.path.dirname(__file__),\n        \"..\",\n        \"tests\",\n        \"pytest\",\n        \"data\",\n        \"generator_messages\",\n    )\n    messages_dir = os.path.abspath(messages_dir)\n\n    output_dir = os.path.join(\n        os.path.dirname(__file__),\n        \"provider_stats\",\n    )\n    output_dir = os.path.abspath(output_dir)\n\n    print(f\"Loading messages from: {messages_dir}\")\n    print(f\"Saving stats to: {output_dir}\\n\")\n\n    # Determine which providers to run\n    providers_to_run = args.provider if args.provider else list(PROVIDER_MODALITY_SUPPORT.keys())\n    print(f\"Providers to run: {providers_to_run}\\n\")\n\n    for provider in providers_to_run:\n        config = PROVIDER_MODALITY_SUPPORT[provider]\n        model = config[\"model\"]\n        supported_modalities = config[\"supported_modalities\"]\n\n        # Filter modalities if specified\n        if args.modality:\n            modalities_to_run = [m for m in args.modality if m in supported_modalities]\n            if not modalities_to_run:\n                print(f\"\\nProvider: {provider} - SKIPPED (none of {args.modality} supported)\")\n                continue\n        else:\n            modalities_to_run = supported_modalities\n\n        print(f\"\\nProvider: {provider} (model: {model})\")\n        print(f\"  Modalities to run: {modalities_to_run}\")\n\n        for modality in modalities_to_run:\n            print(f\"\\n  Processing modality: {modality}\")\n\n            try:\n                messages = 
load_messages(modality, provider, messages_dir)\n                print(f\"    Loaded {len(messages)} messages from {modality}_{provider}.json\")\n\n                stats = capture_stats_for_provider(provider, modality, messages, model)\n\n                output_path = save_stats(stats, output_dir, provider, modality)\n                print(f\"    Saved to: {output_path}\")\n\n            except FileNotFoundError as e:\n                print(f\"    SKIPPED: Message file not found - {e}\")\n            except Exception as e:\n                print(f\"    ERROR: {e}\")\n                import traceback\n                traceback.print_exc()\n\n    print(\"\\nDone!\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/generate_test_messages.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nScript to generate test messages for each provider/modality combination.\n\nThis script uses the Generator class directly to create message payloads.\nIt uses the 'generating_messages_only' flag to retrieve the exact messages\nthat would be sent to the provider without making an actual API call.\n\nSupported provider/modality combinations:\n- Anthropic: text-only, image-only, text-image (no audio support)\n- OpenAI: text-only, image-only, text-image\n- OpenAI-Audio: audio-only, text-audio\n- Gemini: all 7 modality combinations\n- Vertex AI: all 7 modality combinations\n- Azure: text-only, image-only, text-image\n\nOutput files are saved to: tests/pytest/data/generator_messages/\nFormat: {modality}_{provider}.json (e.g., text-only_anthropic.json)\n\"\"\"\n\nimport json\nimport os\nimport sys\n\nfrom pydantic import BaseModel, Field\n\n# Add project root to path\nsys.path.insert(0, os.path.join(os.path.dirname(__file__), \"..\"))\n\nfrom palimpzest.constants import Model, PromptStrategy\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.lib.schemas import AudioFilepath, ImageFilepath, union_schemas\nfrom palimpzest.query.generators.generators import Generator\n\n\ndef generate_session_id(provider: str, modality: str) -> str:\n    \"\"\"\n    Generate a unique 12-character session ID for a provider/modality combination.\n    This ensures each modality test has a unique prompt prefix, preventing cross-modality cache hits.\n    The ID is deterministic based on provider+modality so regenerating produces consistent results.\n    \"\"\"\n    import hashlib\n    hash_input = f\"{provider}_{modality}\"\n    hash_hex = hashlib.md5(hash_input.encode()).hexdigest()\n    return hash_hex[:12].upper()\n\nSTATIC_CONTEXT = \"\"\"\nWILDLIFE CONSERVATION & RESEARCH CENTER: SPECIES IDENTIFICATION MANUAL (v2025.1)\n\nSECTION 1: INTRODUCTION AND MISSION\nThe Wildlife Conservation & Research Center (WCRC) is 
dedicated to the preservation, study, and rehabilitation of diverse wildlife species.\nAll staff members, researchers, and volunteers must adhere to these protocols for accurate species identification and data collection.\nOur mission combines advanced biological sciences with conservation efforts to protect endangered and threatened populations worldwide.\n\nSECTION 2: MAMMAL IDENTIFICATION PROTOCOLS\n\n2.1 ELEPHANTS (Family Elephantidae):\n    - African Savanna Elephant: Larger ears (shaped like Africa), concave back, two fingers on trunk tip. Weight: 5,000-14,000 lbs.\n    - African Forest Elephant: Smaller stature, oval-shaped ears, straighter tusks pointing downward.\n    - Asian Elephant: Smaller ears, convex back, one finger on trunk tip, twin domes on head. Weight: 4,000-11,000 lbs.\n    - Vocalizations: Trumpeting (alarm/excitement), rumbling (long-distance communication), roaring (distress).\n\n2.2 BIG CATS (Family Felidae):\n    - Lion (Panthera leo): Tawny coat, males have distinctive mane. Social, live in prides. Height: 3.5-4 ft at shoulder.\n    - Tiger (Panthera tigris): Orange coat with black stripes, white underbelly. Solitary hunters. Largest cat species.\n    - Leopard (Panthera pardus): Golden-yellow coat with rosette patterns. Excellent climbers, often cache prey in trees.\n    - Cheetah (Acinonyx jubatus): Spotted coat, black \"tear marks\" from eyes to mouth. Fastest land animal (70 mph).\n    - Vocalizations: Roaring (lions, tigers, leopards), chirping/purring (cheetahs cannot roar).\n\n2.3 BEARS (Family Ursidae):\n    - Brown Bear (Ursus arctos): Large shoulder hump, dish-shaped face, long claws. Includes grizzly subspecies.\n    - Black Bear (Ursus americanus): Straight facial profile, no shoulder hump, shorter claws. Most common North American bear.\n    - Polar Bear (Ursus maritimus): White fur, longer neck, smaller ears. 
Marine mammal adapted to Arctic conditions.\n    - Giant Panda (Ailuropoda melanoleuca): Black and white coloring, feeds almost exclusively on bamboo.\n    - Vocalizations: Roaring, growling, huffing, jaw-popping (threat displays).\n\n2.4 PRIMATES (Order Primates):\n    - Gorilla: Largest primate, silver-back males, knuckle-walking locomotion. Vocalizations include chest-beating, hooting.\n    - Chimpanzee: Highly intelligent, uses tools, complex social structures. Vocalizations: pant-hoots, screams.\n    - Orangutan: Red-orange fur, arboreal lifestyle, solitary. Long calls can travel over 1 km.\n    - Gibbon: Smaller apes, brachiation locomotion, distinctive whooping songs for territorial marking.\n\nSECTION 3: BIRD IDENTIFICATION PROTOCOLS\n\n3.1 RAPTORS (Order Accipitriformes/Falconiformes):\n    - Bald Eagle: White head and tail, yellow beak. Wingspan: 6-7.5 ft. Call: high-pitched chattering.\n    - Golden Eagle: Dark brown plumage, golden nape. Powerful hunters of small mammals.\n    - Peregrine Falcon: Blue-gray back, barred underparts. Fastest bird in dive (240+ mph).\n    - Red-tailed Hawk: Brown back, pale underparts, distinctive red tail. Most common North American hawk.\n\n3.2 PARROTS (Order Psittaciformes):\n    - Macaw: Large, colorful, long tail feathers. Powerful curved beaks. Highly social and vocal.\n    - African Grey: Gray plumage, red tail. Exceptional mimicry and cognitive abilities.\n    - Cockatoo: White or pink plumage, distinctive crest. Loud screeching vocalizations.\n\nSECTION 4: REPTILE IDENTIFICATION PROTOCOLS\n\n4.1 CROCODILIANS (Order Crocodilia):\n    - American Alligator: Broad, U-shaped snout, dark coloration. Freshwater habitats.\n    - Nile Crocodile: V-shaped snout, aggressive. Can reach 16-18 ft in length.\n    - Gharial: Extremely narrow snout, fish-eating specialist. 
Critically endangered.\n\n4.2 LARGE SNAKES (Families Pythonidae/Boidae):\n    - Reticulated Python: Longest snake species (up to 23 ft), complex geometric patterns.\n    - Green Anaconda: Heaviest snake species, olive-green with black spots. Semi-aquatic.\n    - King Cobra: Longest venomous snake (up to 18 ft), distinctive hood when threatened.\n\nSECTION 5: DATA COLLECTION AND ANALYSIS\n\n5.1 Visual Identification:\n    - Document body shape, size, coloration, and distinctive markings.\n    - Note behavioral characteristics and habitat context.\n    - Use standardized photography protocols for pattern matching.\n\n5.2 Audio Identification:\n    - Record vocalizations with frequency analysis equipment.\n    - Tag recordings with behavioral context (territorial, mating, alarm, social).\n    - Cross-reference with vocalization databases for species confirmation.\n\n5.3 Biometric Data:\n    - Record body measurements according to species-specific protocols.\n    - Document age indicators (teeth wear, plumage, etc.).\n    - Collect genetic samples when possible for lineage verification.\n\nYou are an AI Research Assistant for the WCRC. 
Your job is to analyze data inputs (text descriptions, images, and/or audio recordings) and identify the species based on the characteristics described in this manual.\nAnalyze all provided inputs and determine the most likely species identification.\n\"\"\"\n\nclass TextInputSchema(BaseModel):\n    \"\"\"Schema for text-only input.\"\"\"\n    text: str = Field(description=\"Description of an animal\")\n    age: int = Field(description=\"The age of the animal in years\")\n\n\nclass ImageInputSchema(BaseModel):\n    \"\"\"Schema for image-only input.\"\"\"\n    image_file: ImageFilepath = Field(description=\"File path to an image of an animal\")\n    height: float = Field(description=\"The estimated height of the animal in cm\")\n\n\nclass AudioInputSchema(BaseModel):\n    \"\"\"Schema for audio-only input.\"\"\"\n    audio_file: AudioFilepath = Field(description=\"File path to an audio recording of an animal\")\n    year: float = Field(description=\"The year the recording was made\")\n\n\n# Union schemas for multi-modal inputs\nTextImageInputSchema = union_schemas([TextInputSchema, ImageInputSchema])\nTextAudioInputSchema = union_schemas([TextInputSchema, AudioInputSchema])\nImageAudioInputSchema = union_schemas([ImageInputSchema, AudioInputSchema])\nTextImageAudioInputSchema = union_schemas([TextInputSchema, ImageInputSchema, AudioInputSchema])\n\n\nclass OutputSchema(BaseModel):\n    \"\"\"Output schema for animal identification.\"\"\"\n    animal: str = Field(description=\"The animal in the input\")\n\nMODALITY_CONFIGS = {\n    \"text-only\": {\n        \"input_schema\": TextInputSchema,\n        \"data_item\": {\n            \"text\": \"An elephant is a large gray animal with a trunk and big ears. 
It makes a trumpeting sound.\",\n            \"age\": 15,\n        },\n    },\n    \"image-only\": {\n        \"input_schema\": ImageInputSchema,\n        \"data_item\": {\n            \"image_file\": \"tests/pytest/data/elephant.png\",\n            \"height\": 304.5,\n        },\n    },\n    \"audio-only\": {\n        \"input_schema\": AudioInputSchema,\n        \"data_item\": {\n            \"audio_file\": \"tests/pytest/data/elephant.wav\",\n            \"year\": 2020,\n        },\n    },\n    \"text-image\": {\n        \"input_schema\": TextImageInputSchema,\n        \"data_item\": {\n            \"text\": \"An elephant is a large gray animal with a trunk and big ears. It makes a trumpeting sound.\",\n            \"age\": 15,\n            \"image_file\": \"tests/pytest/data/elephant.png\",\n            \"height\": 304.5,\n        },\n    },\n    \"text-audio\": {\n        \"input_schema\": TextAudioInputSchema,\n        \"data_item\": {\n            \"text\": \"An elephant is a large gray animal with a trunk and big ears. It makes a trumpeting sound.\",\n            \"age\": 15,\n            \"audio_file\": \"tests/pytest/data/elephant.wav\",\n            \"year\": 2020,\n        },\n    },\n    \"image-audio\": {\n        \"input_schema\": ImageAudioInputSchema,\n        \"data_item\": {\n            \"image_file\": \"tests/pytest/data/elephant.png\",\n            \"height\": 304.5,\n            \"audio_file\": \"tests/pytest/data/elephant.wav\",\n            \"year\": 2020,\n        },\n    },\n    \"text-image-audio\": {\n        \"input_schema\": TextImageAudioInputSchema,\n        \"data_item\": {\n            \"text\": \"An elephant is a large gray animal with a trunk and big ears. 
It makes a trumpeting sound.\",\n            \"age\": 15,\n            \"image_file\": \"tests/pytest/data/elephant.png\",\n            \"height\": 304.5,\n            \"audio_file\": \"tests/pytest/data/elephant.wav\",\n            \"year\": 2020,\n        },\n    },\n}\n\n# Maps provider name to (Model enum, supported modalities)\nPROVIDER_CONFIGS = {\n    \"anthropic\": {\n        \"model\": Model.CLAUDE_4_5_SONNET,\n        \"supported_modalities\": [\"text-only\", \"image-only\", \"text-image\"],\n    },\n    \"openai\": {\n        \"model\": Model.GPT_4o,\n        \"supported_modalities\": [\"text-only\", \"image-only\", \"text-image\"],\n    },\n    \"openai-audio\": {\n        \"model\": Model.GPT_4o_AUDIO_PREVIEW,\n        \"supported_modalities\": [\"audio-only\", \"text-audio\"],\n    },\n    \"gemini\": {\n        \"model\": Model.GOOGLE_GEMINI_2_5_FLASH,\n        \"supported_modalities\": [\n            \"text-only\", \"image-only\", \"audio-only\",\n            \"text-image\", \"text-audio\", \"image-audio\", \"text-image-audio\",\n        ],\n    },\n    \"vertex_ai\": {\n        \"model\": Model.GEMINI_2_5_FLASH,\n        \"supported_modalities\": [\n            \"text-only\", \"image-only\", \"audio-only\",\n            \"text-image\", \"text-audio\", \"image-audio\", \"text-image-audio\",\n        ],\n    },\n    \"azure\": {\n        \"model\": Model.AZURE_GPT_4o,\n        \"supported_modalities\": [\"text-only\", \"image-only\", \"text-image\"],\n    },\n}\n\n\ndef save_messages(modality: str, provider: str, messages: list[dict], output_dir: str) -> str:\n    \"\"\"\n    Save messages to a JSON file.\n\n    Args:\n        modality: Modality name\n        provider: Provider name\n        messages: List of message dicts\n        output_dir: Directory to save files\n\n    Returns:\n        Path to the saved file\n    \"\"\"\n    os.makedirs(output_dir, exist_ok=True)\n    output_path = os.path.join(output_dir, f\"{modality}_{provider}.json\")\n\n   
 # Convert messages to JSON-serializable format\n    serializable_messages = []\n    for msg in messages:\n        serializable_msg = msg.copy()\n        serializable_messages.append(serializable_msg)\n\n    with open(output_path, \"w\") as f:\n        json.dump(serializable_messages, f, indent=2, default=str)\n\n    return output_path\n\n\ndef main():\n    \"\"\"Generate and save messages for all provider/modality combinations.\"\"\"\n    # Ensure the output directory follows the repository structure\n    output_dir = os.path.join(\n        os.path.dirname(__file__),\n        \"..\",\n        \"tests\",\n        \"pytest\",\n        \"data\",\n        \"generator_messages\",\n    )\n    output_dir = os.path.abspath(output_dir)\n\n    # Count total combinations\n    total_combinations = sum(\n        len(provider_config[\"supported_modalities\"])\n        for provider_config in PROVIDER_CONFIGS.values()\n    )\n\n    print(f\"Generating test messages for {total_combinations} provider/modality combinations...\")\n    print(f\"Output directory: {output_dir}\")\n    print(f\"Static context length: ~{len(STATIC_CONTEXT.split())} words\\n\")\n\n    generated_count = 0\n\n    for provider, provider_config in PROVIDER_CONFIGS.items():\n        model = provider_config[\"model\"]\n        supported_modalities = provider_config[\"supported_modalities\"]\n\n        print(f\"Provider: {provider} (model: {model.value})\")\n        print(f\"  Supported modalities: {supported_modalities}\")\n\n        for modality in supported_modalities:\n            config = MODALITY_CONFIGS[modality]\n            print(f\"  Generating: {modality}_{provider}\")\n\n            try:\n                # Prepare input record\n                input_schema = config[\"input_schema\"]\n                data_item = config[\"data_item\"]\n                input_record = DataRecord(input_schema(**data_item), source_indices=[0])\n\n                # Instantiate Generator\n                generator = 
Generator(\n                    model=model,\n                    prompt_strategy=PromptStrategy.MAP,\n                    reasoning_effort=None,\n                    desc=STATIC_CONTEXT,\n                )\n\n                # Generate unique session ID for this provider/modality to prevent cross-modality cache hits\n                session_id = generate_session_id(provider, modality)\n\n                # Call the generator with the new flag\n                # Pass cache_isolation_id to inject session ID at start of system/user prompts\n                messages = generator(\n                    candidate=input_record,\n                    fields=OutputSchema.model_fields,\n                    output_schema=OutputSchema,\n                    generating_messages_only=True,\n                    cache_isolation_id=session_id,\n                )\n\n                # Manually save the messages using local helper\n                output_path = save_messages(modality, provider, messages, output_dir)\n\n                print(f\"    Session ID: {session_id}\")\n                print(f\"    Saved to: {output_path}\")\n                print(f\"    Messages: {len(messages)}\")\n\n                # Print message summary\n                for i, msg in enumerate(messages):\n                    role = msg.get(\"role\", \"unknown\")\n                    msg_type = msg.get(\"type\", \"unknown\")\n                    content = msg.get(\"content\", \"\")\n                    content_len = len(content) if isinstance(content, str) else len(str(content))\n                    print(f\"      [{i}] role={role}, type={msg_type}, len={content_len}\")\n\n                generated_count += 1\n\n            except Exception as e:\n                print(f\"    ERROR: {e}\")\n                import traceback\n                traceback.print_exc()\n\n        print()\n\n    print(f\"Done! Generated {generated_count}/{total_combinations} message files.\")\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "scripts/update_model_info.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nScript to automatically update pz_models_information.json with data from external sources.\n\nData Sources:\n- LiteLLM proxy /model/info endpoint: Dynamic model info (100% accuracy, prioritized)\n- LiteLLM model_prices_and_context_window.json: Cost and capability data (fallback)\n- MMLU-Pro leaderboard: Quality scores (fuzzy matching acceptable)\n- Artificial Analysis: Latency data (fuzzy matching acceptable)\n\nUsage:\n    python scripts/update_model_info.py MODEL_ID [MODEL_ID ...] [--use-endpoint]\n\"\"\"\n\nimport argparse\nimport json\nimport os\nimport socket\nimport subprocess\nimport sys\nimport time\nfrom typing import Any\n\nimport requests\nimport yaml\n\n# Add src to path to import from palimpzest\nsys.path.insert(0, os.path.join(os.path.dirname(__file__), \"..\", \"src\"))\n\nfrom palimpzest.utils.model_info_helpers import (\n    LATENCY_TPS_DATA,\n    MMLU_PRO_SCORES,\n    derive_model_flags,\n    fuzzy_match_score,\n)\n\n# Constants\nLITELLM_URL = \"https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json\"\nPZ_MODELS_PATH = os.path.join(\n    os.path.dirname(__file__),\n    \"..\",\n    \"src\",\n    \"palimpzest\",\n    \"utils\",\n    \"pz_models_information.json\",\n)\n\n# Provider mapping from LiteLLM prefixes to our provider strings\nPROVIDER_MAPPING = {\n    \"openai\": \"openai\",\n    \"anthropic\": \"anthropic\",\n    \"claude\": \"anthropic\",\n    \"vertex_ai\": \"vertex_ai\",\n    \"gemini\": \"gemini\",\n    \"together_ai\": \"together_ai\",\n    \"together\": \"together_ai\",\n    \"hosted_vllm\": \"hosted_vllm\",\n    \"groq\": \"groq\",\n    \"mistral\": \"mistral\",\n    \"cohere\": \"cohere\",\n    \"bedrock\": \"bedrock\",\n    \"azure\": \"azure\",\n    \"deepseek\": \"deepseek\",\n    \"fireworks_ai\": \"fireworks_ai\",\n    \"xai\": \"xai\",\n}\n\n# API key environment variable mapping\nAPI_KEY_MAPPING = {\n    \"openai\": \"OPENAI_API_KEY\",\n    
\"azure\": \"AZURE_API_KEY\",\n    \"anthropic\": \"ANTHROPIC_API_KEY\",\n    \"vertex_ai\": \"GOOGLE_APPLICATION_CREDENTIALS\",\n    \"gemini\": \"GEMINI_API_KEY\",\n    \"together_ai\": \"TOGETHER_API_KEY\",\n    \"hosted_vllm\": \"VLLM_API_KEY\",\n    \"groq\": \"GROQ_API_KEY\",\n    \"mistral\": \"MISTRAL_API_KEY\",\n    \"cohere\": \"COHERE_API_KEY\",\n    \"deepseek\": \"DEEPSEEK_API_KEY\",\n    \"fireworks_ai\": \"FIREWORKS_API_KEY\",\n    \"xai\": \"XAI_API_KEY\",\n}\n\n# Field mapping from LiteLLM endpoint to PZ format\nFIELD_MAPPING = [\n    (\"usd_per_input_token\", \"input_cost_per_token\", None),\n    (\"usd_per_output_token\", \"output_cost_per_token\", None),\n    (\"usd_per_audio_input_token\", \"input_cost_per_audio_token\", None),\n    (\"usd_per_audio_output_token\", \"output_cost_per_audio_token\", None),\n    (\"usd_per_image_output_token\", \"output_cost_per_image_token\", None),\n    (\"usd_per_cache_read_token\", \"cache_read_input_token_cost\", None),\n    (\"usd_per_cache_creation_token\", \"cache_creation_input_token_cost\", None),\n    (\"supports_prompt_caching\", \"supports_prompt_caching\", False),\n]\n\n# Boolean capability fields derived from endpoint\nCAPABILITY_MAPPING = [\n    (\"is_vision_model\", \"supports_vision\", False),\n    (\"is_audio_model\", \"supports_audio_input\", False),\n    (\"is_reasoning_model\", \"supports_reasoning\", False),\n]\n\n# MMLU_PRO_SCORES, LATENCY_TPS_DATA, and fuzzy_match_score are imported from\n# palimpzest.utils.model_info_helpers\n\n# Alias for backwards compatibility in this script\nLATENCY_DATA = LATENCY_TPS_DATA\n\n\n# =============================================================================\n# LiteLLM Proxy Endpoint Functions\n# =============================================================================\n\ndef get_free_port() -> int:\n    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:\n        s.bind((\"\", 0))\n        return s.getsockname()[1]\n\n\ndef 
extract_provider(model_id: str) -> str:\n    \"\"\"Extract provider from model ID.\"\"\"\n    if \"/\" in model_id:\n        prefix = model_id.split(\"/\")[0].lower()\n        return PROVIDER_MAPPING.get(prefix, prefix)\n\n    model_lower = model_id.lower()\n    \n    # OpenAI\n    if any(x in model_lower for x in [\"gpt\", \"o1-\", \"o3-\", \"o4-\", \"dall-e\", \"whisper\"]):\n        return \"openai\"\n    \n    # Anthropic\n    if \"claude\" in model_lower:\n        return \"anthropic\"\n    \n    # Google (Vertex AI / Gemini)\n    if \"gemini\" in model_lower or \"bison\" in model_lower:\n        return \"vertex_ai\"\n    \n    # Meta / Together / Llama\n    if \"llama\" in model_lower:\n        return \"together_ai\"\n    \n    # Mistral\n    if \"mistral\" in model_lower or \"mixtral\" in model_lower:\n        return \"mistral\"\n\n    # DeepSeek\n    if \"deepseek\" in model_lower:\n        return \"deepseek\"\n\n    return \"unknown\"\n\n\ndef get_api_key_env_var(provider: str) -> str | None:\n    return API_KEY_MAPPING.get(provider)\n\n\ndef generate_config_yaml(model_ids: list[str]) -> str:\n    config_id = 0\n    config_filename = f\"litellm_config_{config_id}.yaml\"\n    # Advance to the first filename that is not already taken\n    while os.path.exists(config_filename):\n        config_id += 1\n        config_filename = f\"litellm_config_{config_id}.yaml\"\n\n    config_list = []\n    for model_id in model_ids:\n        provider = extract_provider(model_id)\n        env_var_name = get_api_key_env_var(provider)\n        api_key_val = f\"os.environ/{env_var_name}\" if env_var_name else None\n\n        entry = {\n            \"model_name\": model_id,\n            \"litellm_params\": {\n                \"model\": model_id,\n                \"api_key\": api_key_val,\n            },\n        }\n        config_list.append(entry)\n\n    yaml_structure = {\"model_list\": config_list}\n    with open(config_filename, \"w\") as f:\n        yaml.dump(yaml_structure, f, default_flow_style=False, sort_keys=False)\n\n    return config_filename\n\n\ndef fetch_dynamic_model_info(model_ids: 
list[str]) -> dict[str, Any]:\n    if not model_ids:\n        return {}\n\n    port = get_free_port()\n    proxy_url = f\"http://127.0.0.1:{port}\"\n    config_filename = generate_config_yaml(model_ids)\n    server_env = os.environ.copy()\n    process = None\n    dynamic_model_info = {}\n\n    print(f\"Starting LiteLLM proxy on port {port} for {len(model_ids)} models...\")\n\n    try:\n        process = subprocess.Popen(\n            [\"litellm\", \"--config\", config_filename, \"--port\", str(port)],\n            stdout=subprocess.PIPE,\n            stderr=subprocess.PIPE,\n            env=server_env,\n        )\n\n        server_ready = False\n        max_retries = 30\n        for i in range(max_retries):\n            if process.poll() is not None:\n                _, stderr = process.communicate()\n                print(f\"  LiteLLM process died unexpectedly: {stderr.decode()}\")\n                break\n            try:\n                requests.get(f\"{proxy_url}/health/readiness\", timeout=1)\n                server_ready = True\n                print(f\"  Server ready after {i + 1} attempts\")\n                break\n            except (requests.exceptions.ConnectionError, requests.exceptions.ReadTimeout):\n                time.sleep(0.5)\n\n        if not server_ready:\n            print(\"  Timeout: LiteLLM server failed to start within the limit.\")\n            return {}\n\n        try:\n            response = requests.get(f\"{proxy_url}/model/info\", timeout=10)\n            response.raise_for_status()\n            model_data = response.json()\n\n            if \"data\" in model_data and len(model_data[\"data\"]) > 0:\n                for item in model_data[\"data\"]:\n                    model_name = item.get(\"model_name\")\n                    model_info = item.get(\"model_info\", {})\n                    dynamic_model_info[model_name] = model_info\n                    print(f\"  Retrieved info for: {model_name}\")\n            else:\n                
print(\"  WARNING: No model data returned from endpoint\")\n        except Exception as e:\n            print(f\"  Error fetching model info: {e}\")\n\n    finally:\n        if process:\n            process.terminate()\n            try:\n                process.wait(timeout=5)\n            except subprocess.TimeoutExpired:\n                process.kill()\n\n        if os.path.exists(config_filename):\n            os.remove(config_filename)\n\n    return dynamic_model_info\n\n\n# =============================================================================\n# Data Fetching Functions\n# =============================================================================\n\ndef fetch_litellm_data() -> dict[str, Any]:\n    print(f\"Fetching LiteLLM data from {LITELLM_URL}...\")\n    try:\n        response = requests.get(LITELLM_URL, timeout=30)\n        response.raise_for_status()\n        data = response.json()\n        print(f\"  Found {len(data)} models in LiteLLM database\")\n        return data\n    except Exception as e:\n        print(f\"  Error fetching LiteLLM data: {e}\")\n        return {}\n\n\ndef load_existing_data() -> dict[str, Any]:\n    if os.path.exists(PZ_MODELS_PATH):\n        with open(PZ_MODELS_PATH) as f:\n            return json.load(f)\n    return {}\n\n\ndef save_data(data: dict[str, Any]) -> None:\n    with open(PZ_MODELS_PATH, \"w\") as f:\n        json.dump(data, f, indent=4)\n    print(f\"  [System] Successfully saved to {PZ_MODELS_PATH}\")\n\n\n# =============================================================================\n# Matching and Conversion Functions\n# =============================================================================\n\n# fuzzy_match_score is imported from palimpzest.utils.model_info_helpers\n\n\ndef derive_model_flags_with_provider(model_id: str, provider: str) -> dict[str, bool]:\n    \"\"\"Wrapper around derive_model_flags that also adds provider-specific flags.\"\"\"\n    flags = derive_model_flags(model_id)\n    if 
provider == \"hosted_vllm\":\n        flags[\"is_vllm_model\"] = True\n    return flags\n\n\n# =============================================================================\n# Interactive Review Functions\n# =============================================================================\n\ndef prompt_for_value(field_name: str, current_value: Any, value_type: str = \"any\") -> Any:\n    while True:\n        user_input = input(f\"    > Enter new value for '{field_name}' (or press Enter to keep current): \").strip()\n        if user_input == \"\":\n            return current_value\n        \n        try:\n            if user_input.lower() == \"none\":\n                return None\n            if value_type == \"float\":\n                return float(user_input)\n            elif value_type == \"int\":\n                return int(user_input)\n            elif value_type == \"bool\":\n                return user_input.lower() in (\"true\", \"yes\", \"1\", \"y\")\n            else:\n                try:\n                    return json.loads(user_input)\n                except json.JSONDecodeError:\n                    return user_input\n        except ValueError as e:\n            print(f\"    Invalid input: {e}. Try again.\")\n\n\ndef review_field(\n    field_name: str,\n    value: Any,\n    from_endpoint: bool,\n    interactive: bool = True,\n    value_type: str = \"any\"\n) -> tuple[Any, bool]:\n    \"\"\"\n    Review a single field.\n    Logic:\n    1. If from_endpoint is True and value not None -> VERIFIED (return immediately)\n    2. If interactive -> Ask User (1. Correct, 2. 
Incorrect)\n    \"\"\"\n    if from_endpoint and value is not None:\n        # Verified automatically by endpoint\n        return value, False\n\n    if not interactive:\n        return value, False\n\n    print(f\"\\n  [Review] {field_name}: {value}\")\n    if from_endpoint and value is None:\n        print(\"    (Source: Endpoint returned Null)\")\n    else:\n        print(\"    (Source: Derived/Static/Fallback)\")\n\n    while True:\n        choice = input(\"    1. Yes, information is correct\\n    2. No, enter different value\\n    Choice [1]: \").strip()\n        if choice == \"\" or choice == \"1\":\n            return value, False\n        elif choice == \"2\":\n            new_value = prompt_for_value(field_name, value, value_type)\n            return new_value, True\n        else:\n            print(\"    Invalid choice.\")\n\n\ndef convert_and_review_model(\n    model_id: str,\n    litellm_static: dict[str, Any] | None,\n    litellm_dynamic: dict[str, Any] | None,\n    existing_entry: dict[str, Any] | None,\n    interactive: bool = True,\n) -> dict[str, Any]:\n    \"\"\"\n    1. Aggregates all data into a Draft Entry.\n    2. Displays the Draft Entry (User can see Current State).\n    3. Iterates fields to Verify (Prioritizing endpoint).\n    \"\"\"\n    print(f\"\\n{'='*60}\")\n    print(f\"PROCESSING: {model_id}\")\n    print(f\"{'='*60}\")\n\n    # --- PHASE 1: Build Draft Entry & Source Map ---\n    \n    endpoint_fields: set[str] = set()\n    raw_data: dict[str, Any] = {}\n\n    # 1. Base: Static Data\n    if litellm_static:\n        raw_data.update(litellm_static)\n    \n    # 2. Overlay: Dynamic Data (Priority)\n    if litellm_dynamic:\n        for key, val in litellm_dynamic.items():\n            if val is not None:\n                raw_data[key] = val\n                endpoint_fields.add(key)\n\n    # 3. 
Construct Candidate dictionary\n    candidate = {}\n    source_map = {} # Map field -> is_from_endpoint\n\n    # Provider\n    prov = raw_data.get(\"litellm_provider\") or extract_provider(model_id)\n    candidate[\"provider\"] = prov\n    source_map[\"provider\"] = \"litellm_provider\" in endpoint_fields\n\n    # Costs & Caching\n    for pz_field, litellm_field, default in FIELD_MAPPING:\n        val = raw_data.get(litellm_field, default)\n        candidate[pz_field] = val\n        source_map[pz_field] = litellm_field in endpoint_fields\n\n    # Capabilities\n    for pz_field, litellm_field, default in CAPABILITY_MAPPING:\n        val = raw_data.get(litellm_field, default)\n        # Special logic for audio\n        if pz_field == \"is_audio_model\":\n            audio_in = raw_data.get(\"supports_audio_input\", False)\n            audio_out = raw_data.get(\"supports_audio_output\", False)\n            val = audio_in or audio_out\n            source_map[pz_field] = (\"supports_audio_input\" in endpoint_fields or \n                                    \"supports_audio_output\" in endpoint_fields)\n        else:\n            source_map[pz_field] = litellm_field in endpoint_fields\n        candidate[pz_field] = val\n\n    # Modes\n    mode = raw_data.get(\"mode\", \"chat\")\n    mode_src = \"mode\" in endpoint_fields\n    candidate[\"is_text_model\"] = mode in [\"chat\", \"completion\"]\n    source_map[\"is_text_model\"] = mode_src\n    candidate[\"is_embedding_model\"] = mode == \"embedding\"\n    source_map[\"is_embedding_model\"] = mode_src\n\n    # Flags (Always derived, never endpoint)\n    flags = derive_model_flags_with_provider(model_id, candidate[\"provider\"])\n    for k, v in flags.items():\n        candidate[k] = v\n        source_map[k] = False\n\n    # Scores / Latency (Fuzzy or Existing)\n    mmlu = fuzzy_match_score(model_id, MMLU_PRO_SCORES)\n    if mmlu is None and existing_entry:\n        mmlu = existing_entry.get(\"MMLU_Pro_score\")\n    
candidate[\"MMLU_Pro_score\"] = mmlu\n    source_map[\"MMLU_Pro_score\"] = False\n\n    tps = fuzzy_match_score(model_id, LATENCY_DATA)\n    sec_per_tok = round(1.0 / tps, 6) if tps else None\n    if sec_per_tok is None and existing_entry:\n        sec_per_tok = existing_entry.get(\"seconds_per_output_token\")\n    candidate[\"seconds_per_output_token\"] = sec_per_tok\n    source_map[\"seconds_per_output_token\"] = False\n    \n    # Audio Cache Read (check existing)\n    acr = existing_entry.get(\"usd_per_audio_cache_read_token\") if existing_entry else None\n    if acr is not None:\n        candidate[\"usd_per_audio_cache_read_token\"] = acr\n        source_map[\"usd_per_audio_cache_read_token\"] = False\n\n    # Note\n    if existing_entry and existing_entry.get(\"note\"):\n        candidate[\"note\"] = existing_entry[\"note\"]\n        source_map[\"note\"] = False\n\n    # Sources\n    src_list = [LITELLM_URL]\n    if existing_entry and existing_entry.get(\"sources\"):\n        existing_srcs = existing_entry[\"sources\"]\n        # dict.fromkeys de-duplicates while keeping insertion order (set() would reorder between runs)\n        if isinstance(existing_srcs, list):\n            src_list = list(dict.fromkeys(src_list + existing_srcs))\n        elif existing_srcs:\n            src_list = list(dict.fromkeys(src_list + [existing_srcs]))\n    candidate[\"sources\"] = src_list\n\n    # --- PHASE 2: Display Current State ---\n    \n    print(\"\\n--- Current State (Draft) ---\")\n    display_dict = {}\n    for k, v in candidate.items():\n        if k == \"sources\":\n            continue\n        src_label = \"ENDPOINT\" if source_map.get(k, False) and v is not None else \"DERIVED/STATIC\"\n        display_dict[k] = f\"{v}  [{src_label}]\"\n    \n    print(json.dumps(display_dict, indent=2))\n    print(\"-\" * 30)\n\n    # --- PHASE 3: Verification Loop ---\n\n    final_entry = {}\n    final_entry[\"sources\"] = candidate[\"sources\"]\n\n    # Iterate over specific keys to ensure order and types\n    \n    # Provider\n    final_entry[\"provider\"], _ = review_field(\n        
\"provider\", candidate[\"provider\"], source_map[\"provider\"], interactive, \"str\"\n    )\n\n    # All cost/cache fields\n    for k in [f[0] for f in FIELD_MAPPING] + [\"usd_per_audio_cache_read_token\"]:\n        if k in candidate:\n            vtype = \"float\" if \"usd_\" in k else \"bool\"\n            final_entry[k], _ = review_field(\n                k, candidate[k], source_map.get(k, False), interactive, vtype\n            )\n\n    # Capabilities & Modes\n    bool_keys = [f[0] for f in CAPABILITY_MAPPING] + [\"is_text_model\", \"is_embedding_model\"] + list(flags.keys())\n    for k in bool_keys:\n        if k in candidate:\n            final_entry[k], _ = review_field(\n                k, candidate[k], source_map.get(k, False), interactive, \"bool\"\n            )\n\n    # Stats\n    final_entry[\"MMLU_Pro_score\"], _ = review_field(\n        \"MMLU_Pro_score\", candidate[\"MMLU_Pro_score\"], False, interactive, \"float\"\n    )\n    final_entry[\"seconds_per_output_token\"], _ = review_field(\n        \"seconds_per_output_token\", candidate[\"seconds_per_output_token\"], False, interactive, \"float\"\n    )\n\n    # Note\n    if \"note\" in candidate:\n        final_entry[\"note\"], _ = review_field(\n            \"note\", candidate[\"note\"], False, interactive, \"str\"\n        )\n\n    # Cleanup Nulls\n    cleaned_entry = {k: v for k, v in final_entry.items() if v is not None}\n    \n    return cleaned_entry\n\n\ndef update_model(\n    model_id: str,\n    existing_data: dict[str, Any],\n    litellm_static: dict[str, Any],\n    litellm_dynamic: dict[str, Any] | None = None,\n    interactive: bool = True,\n) -> dict[str, Any] | None:\n    static_entry = None\n    if model_id in litellm_static:\n        static_entry = litellm_static[model_id]\n    else:\n        if \"/\" in model_id:\n            model_name = model_id.split(\"/\", 1)[1]\n            if model_name in litellm_static:\n                static_entry = litellm_static[model_name]\n\n    
dynamic_entry = litellm_dynamic.get(model_id) if litellm_dynamic else None\n    \n    if static_entry is None and dynamic_entry is None:\n        print(f\"\\n  WARNING: No LiteLLM data found for {model_id}\")\n    \n    existing_entry = existing_data.get(model_id)\n\n    new_entry = convert_and_review_model(\n        model_id,\n        static_entry,\n        dynamic_entry,\n        existing_entry,\n        interactive=interactive,\n    )\n    return new_entry\n\n\ndef process_models(\n    model_ids: list[str],\n    existing_data: dict[str, Any],\n    litellm_static: dict[str, Any],\n    use_endpoint: bool = False,\n    interactive: bool = True,\n    skip_existing: bool = False,\n) -> None:\n    \"\"\"\n    Process each model and save it to file. When interactive is True, ask the user to confirm each write first.\n    \"\"\"\n    litellm_dynamic = None\n    if use_endpoint:\n        litellm_dynamic = fetch_dynamic_model_info(model_ids)\n\n    # Work on a shallow copy of existing_data so entries can be saved incrementally\n    current_data_state = existing_data.copy()\n\n    for model_id in model_ids:\n        # Check if model exists and if we should skip it\n        if skip_existing and model_id in current_data_state:\n            print(f\"\\n  [System] Model '{model_id}' already exists in file. Skipping.\")\n            continue\n\n        new_entry = update_model(\n            model_id, current_data_state, litellm_static, litellm_dynamic,\n            interactive=interactive\n        )\n\n        if new_entry:\n            # Display Final Result\n            print(\"\\n\" + \"-\"*30)\n            print(f\"FINAL JSON FOR: {model_id}\")\n            print(json.dumps(new_entry, indent=2))\n            print(\"-\" * 30)\n\n            # Ask user to write to file\n            should_save = True\n            if interactive:\n                confirm = input(f\"Write '{model_id}' to json file? 
[y/N]: \").strip().lower()\n                should_save = confirm == 'y'\n\n            if should_save:\n                current_data_state[model_id] = new_entry\n                save_data(current_data_state)\n            else:\n                print(f\"  [System] Skipped saving {model_id}.\")\n\n\n# =============================================================================\n# Main Entry Point\n# =============================================================================\n\ndef main() -> None:\n    parser = argparse.ArgumentParser(\n        description=\"Update pz_models_information.json with external data sources\",\n        formatter_class=argparse.RawDescriptionHelpFormatter\n    )\n    parser.add_argument(\"model_ids\", nargs=\"*\", help=\"Model IDs to add (IDs already in the file are skipped)\")\n    parser.add_argument(\"--use-endpoint\", action=\"store_true\", help=\"Fetch dynamic model info from the endpoint\")\n    parser.add_argument(\"--non-interactive\", action=\"store_true\", help=\"Skip review and auto-save\")\n    parser.add_argument(\"--update-all\", action=\"store_true\", help=\"Update every model already present in the JSON file\")\n\n    args = parser.parse_args()\n\n    litellm_static = fetch_litellm_data()\n    if not litellm_static:\n        return\n\n    existing_data = load_existing_data()\n\n    skip_existing = False\n    if args.update_all:\n        model_ids = list(existing_data.keys())\n    elif args.model_ids:\n        model_ids = args.model_ids\n        skip_existing = True\n    else:\n        parser.print_help()\n        return\n\n    interactive = not args.non_interactive\n\n    # Run the main processing loop\n    process_models(\n        model_ids,\n        existing_data,\n        litellm_static,\n        use_endpoint=args.use_endpoint,\n        interactive=interactive,\n        skip_existing=skip_existing,\n    )\n\n    print(\"\\nAll operations complete.\")\n\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "src/palimpzest/__init__.py",
    "content": "import logging\n\nfrom palimpzest.constants import Cardinality, Model\nfrom palimpzest.core.data.context import Context, TextFileContext\nfrom palimpzest.core.data.dataset import Dataset\nfrom palimpzest.core.data.iter_dataset import (\n    AudioFileDataset,\n    HTMLFileDataset,\n    ImageFileDataset,\n    IterDataset,\n    MemoryDataset,\n    PDFFileDataset,\n    TextFileDataset,\n    XLSFileDataset,\n)\nfrom palimpzest.core.elements.groupbysig import GroupBySig\nfrom palimpzest.core.lib.schemas import AudioBase64, AudioFilepath, ImageBase64, ImageFilepath, ImageURL\nfrom palimpzest.policy import (\n    MaxQuality,\n    MaxQualityAtFixedCost,\n    MaxQualityAtFixedTime,\n    MinCost,\n    MinCostAtFixedQuality,\n    MinTime,\n    MinTimeAtFixedQuality,\n    PlanCost,\n    Policy,\n)\nfrom palimpzest.query.processor.config import QueryProcessorConfig\nfrom palimpzest.validator.validator import Validator\n\n# Initialize the root logger\nlogging.getLogger(__name__).addHandler(logging.NullHandler())\n\n__all__ = [\n    # constants\n    \"Cardinality\",\n    \"Model\",\n    # core\n    \"GroupBySig\",\n    \"Context\",\n    \"TextFileContext\",\n    \"Dataset\",\n    \"IterDataset\",\n    \"AudioFileDataset\",\n    \"MemoryDataset\",\n    \"HTMLFileDataset\",\n    \"ImageFileDataset\",\n    \"PDFFileDataset\",\n    \"TextFileDataset\",\n    \"XLSFileDataset\",\n    # schemas\n    \"AudioBase64\",\n    \"AudioFilepath\",\n    \"ImageBase64\",\n    \"ImageFilepath\",\n    \"ImageURL\",\n    # policy\n    \"MaxQuality\",\n    \"MaxQualityAtFixedCost\",\n    \"MaxQualityAtFixedTime\",\n    \"MinCost\",\n    \"MinCostAtFixedQuality\",\n    \"MinTime\",\n    \"MinTimeAtFixedQuality\",\n    \"PlanCost\",\n    \"Policy\",\n    # query\n    \"QueryProcessorConfig\",\n    # validator\n    \"Validator\",\n]\n"
  },
  {
    "path": "src/palimpzest/agents/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/agents/compute_agents.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/agents/search_agents.py",
    "content": "import json\nimport textwrap\nimport time\nfrom collections.abc import Generator\nfrom typing import TYPE_CHECKING, Any\n\nfrom rich.console import Group\nfrom rich.live import Live\nfrom rich.markdown import Markdown\nfrom rich.rule import Rule\nfrom rich.text import Text\n\nif TYPE_CHECKING:\n    import PIL.Image\n\nfrom smolagents.agent_types import handle_agent_output_types\nfrom smolagents.agents import (\n    ActionOutput,\n    CodeAgent,\n    FinalAnswerPromptTemplate,\n    ManagedAgentPromptTemplate,\n    PlanningPromptTemplate,\n    PromptTemplates,\n    RunResult,\n    ToolOutput,\n    populate_template,\n)\nfrom smolagents.local_python_executor import fix_final_answer_code\nfrom smolagents.memory import (\n    ActionStep,\n    FinalAnswerStep,\n    PlanningStep,\n    SystemPromptStep,\n    TaskStep,\n    Timing,\n    TokenUsage,\n    ToolCall,\n)\nfrom smolagents.models import (\n    CODEAGENT_RESPONSE_FORMAT,\n    ChatMessage,\n    ChatMessageStreamDelta,\n    MessageRole,\n    agglomerate_stream_deltas,\n)\nfrom smolagents.monitoring import YELLOW_HEX, LogLevel\nfrom smolagents.utils import (\n    AgentError,\n    AgentExecutionError,\n    AgentGenerationError,\n    AgentMaxStepsError,\n    AgentParsingError,\n    extract_code_from_text,\n    parse_code_blobs,\n    truncate_content,\n)\n\nfrom palimpzest.prompts import (\n    CODE_AGENT_SYSTEM_PROMPT,\n    DATA_DISCOVERY_AGENT_INITIAL_PLAN_PROMPT,\n    DATA_DISCOVERY_AGENT_REPORT_PROMPT,\n    DATA_DISCOVERY_AGENT_TASK_PROMPT,\n    DATA_DISCOVERY_AGENT_UPDATE_PLAN_POST_MESSAGES_PROMPT,\n    DATA_DISCOVERY_AGENT_UPDATE_PLAN_PRE_MESSAGES_PROMPT,\n    FINAL_ANSWER_POST_MESSAGES_PROMPT,\n    FINAL_ANSWER_PRE_MESSAGES_PROMPT,\n)\n\n\n# TODO: make this use memory the way you want\nclass PZBaseAgent(CodeAgent):\n    def __init__(self, run_id: str, context_description: str, *args, **kwargs):\n        # memory_config = {\n        #     \"vector_store\": {\n        #         \"provider\": 
\"chroma\",\n        #         \"config\": {\n        #             \"collection_name\": f\"palimpzest-memory-{self.__class__.__name__}\",\n        #             \"path\": \"./pz-chroma\",\n        #         }\n        #     }\n        # }\n        # self.pz_memory = Memory.from_config(memory_config)\n        self.run_id = run_id\n        self.context_description = context_description\n        super().__init__(*args, **kwargs)\n\n    def write_memory_to_messages(\n        self,\n        summary_mode: bool = False,\n    ) -> list[ChatMessage]:\n        \"\"\"\n        Reads past llm_outputs, actions, and observations or errors from the memory into a series of messages\n        that can be used as input to the LLM. Adds a number of keywords (such as PLAN, error, etc) to help\n        the LLM.\n        \"\"\"\n        messages = self.memory.system_prompt.to_messages(summary_mode=summary_mode)\n        for memory_step in self.memory.steps:\n            messages.extend(memory_step.to_messages(summary_mode=summary_mode))\n        return messages\n\n    def _generate_planning_step(\n        self, task, is_first_step: bool, step: int\n    ) -> Generator[ChatMessageStreamDelta | PlanningStep]:\n        start_time = time.time()\n        if is_first_step:\n            input_messages = [\n                ChatMessage(\n                    role=MessageRole.USER,\n                    content=[\n                        {\n                            \"type\": \"text\",\n                            \"text\": populate_template(\n                                self.prompt_templates[\"planning\"][\"initial_plan\"],\n                                variables={\"task\": task, \"tools\": self.tools, \"managed_agents\": self.managed_agents, \"context_description\": self.context_description},\n                            ),\n                        }\n                    ],\n                )\n            ]\n            if self.stream_outputs and hasattr(self.model, 
\"generate_stream\"):\n                plan_message_content = \"\"\n                output_stream = self.model.generate_stream(input_messages, stop_sequences=[\"<end_plan>\"])  # type: ignore\n                input_tokens, output_tokens = 0, 0\n                with Live(\"\", console=self.logger.console, vertical_overflow=\"visible\") as live:\n                    for event in output_stream:\n                        if event.content is not None:\n                            plan_message_content += event.content\n                            live.update(Markdown(plan_message_content))\n                            if event.token_usage:\n                                output_tokens += event.token_usage.output_tokens\n                                input_tokens = event.token_usage.input_tokens\n                        yield event\n            else:\n                plan_message = self.model.generate(input_messages, stop_sequences=[\"<end_plan>\"])\n                plan_message_content = plan_message.content\n                input_tokens, output_tokens = (\n                    (\n                        plan_message.token_usage.input_tokens,\n                        plan_message.token_usage.output_tokens,\n                    )\n                    if plan_message.token_usage\n                    else (None, None)\n                )\n            plan = textwrap.dedent(\n                f\"\"\"Here are the facts I know and the plan of action that I will follow to solve the task:\\n```\\n{plan_message_content}\\n```\"\"\"\n            )\n        else:\n            # Summary mode removes the system prompt and previous planning messages output by the model.\n            # Removing previous planning messages avoids influencing too much the new plan.\n            memory_messages = self.write_memory_to_messages(summary_mode=True)\n            plan_update_pre = ChatMessage(\n                role=MessageRole.SYSTEM,\n                content=[\n                    {\n            
            \"type\": \"text\",\n                        \"text\": populate_template(\n                            self.prompt_templates[\"planning\"][\"update_plan_pre_messages\"], variables={\"task\": task, \"context_description\": self.context_description}\n                        ),\n                    }\n                ],\n            )\n            plan_update_post = ChatMessage(\n                role=MessageRole.USER,\n                content=[\n                    {\n                        \"type\": \"text\",\n                        \"text\": populate_template(\n                            self.prompt_templates[\"planning\"][\"update_plan_post_messages\"],\n                            variables={\n                                \"task\": task,\n                                \"tools\": self.tools,\n                                \"managed_agents\": self.managed_agents,\n                                \"remaining_steps\": (self.max_steps - step),\n                                \"context_description\": self.context_description,\n                            },\n                        ),\n                    }\n                ],\n            )\n            input_messages = [plan_update_pre] + memory_messages + [plan_update_post]\n            if self.stream_outputs and hasattr(self.model, \"generate_stream\"):\n                plan_message_content = \"\"\n                input_tokens, output_tokens = 0, 0\n                with Live(\"\", console=self.logger.console, vertical_overflow=\"visible\") as live:\n                    for event in self.model.generate_stream(\n                        input_messages,\n                        stop_sequences=[\"<end_plan>\"],\n                    ):  # type: ignore\n                        if event.content is not None:\n                            plan_message_content += event.content\n                            live.update(Markdown(plan_message_content))\n                            if event.token_usage:\n      
                          output_tokens += event.token_usage.output_tokens\n                                input_tokens = event.token_usage.input_tokens\n                        yield event\n            else:\n                plan_message = self.model.generate(input_messages, stop_sequences=[\"<end_plan>\"])\n                plan_message_content = plan_message.content\n                if plan_message.token_usage is not None:\n                    input_tokens, output_tokens = (\n                        plan_message.token_usage.input_tokens,\n                        plan_message.token_usage.output_tokens,\n                    )\n                else:\n                    # Mirror the initial-plan branch so both names are always bound\n                    input_tokens, output_tokens = None, None\n            plan = textwrap.dedent(\n                f\"\"\"I still need to solve the task I was given:\\n```\\n{self.task}\\n```\\n\\nHere are the facts I know and my new/updated plan of action to solve the task:\\n```\\n{plan_message_content}\\n```\"\"\"\n            )\n        log_headline = \"Initial plan\" if is_first_step else \"Updated plan\"\n        self.logger.log(Rule(f\"[bold]{log_headline}\", style=\"orange\"), Text(plan), level=LogLevel.INFO)\n        yield PlanningStep(\n            model_input_messages=input_messages,\n            plan=plan,\n            model_output_message=ChatMessage(role=MessageRole.ASSISTANT, content=plan_message_content),\n            token_usage=TokenUsage(input_tokens=input_tokens, output_tokens=output_tokens),\n            timing=Timing(start_time=start_time, end_time=time.time()),\n        )\n\n    # def _curate_messages(self, input_messages: list[ChatMessage]) -> list[ChatMessage]:\n    #     \"\"\"\n    #     Try returning:\n    #     - System Prompt + task\n    #     - Current Plan\n    #     - Summary of previous conversation\n    #     \"\"\"\n    #     # initialize with the system prompt & original task\n    #     curated_messages = input_messages[:2]\n\n    #     # find the last planning step message\n    #     idx = len(self.memory.steps) - 1\n    #     while idx > -1:\n    #         step = 
self.memory.steps[idx]\n    #         if isinstance(step, PlanningStep):\n    #             curated_messages.append(step.model_output_message)\n    #             break\n    #         idx -= 1\n\n    #     # add summary of chat history\n    #     history = self.pz_memory.search(\"A condensed summary of the execution trace of the agent.\", run_id=self.run_id)\n    #     for msg in history[\"results\"]:\n    #         pass\n\n    #     return curated_messages\n\n    def _step_stream(self, memory_step: ActionStep) -> Generator[ChatMessageStreamDelta | ActionOutput | ToolOutput]:\n        \"\"\"\n        Perform one step in the ReAct framework: the agent thinks, acts, and observes the result.\n        Yields ChatMessageStreamDelta during the run if streaming is enabled.\n        At the end, yields either None if the step is not final, or the final answer.\n        \"\"\"\n        memory_messages = self.write_memory_to_messages()\n\n        input_messages = memory_messages.copy()\n\n        ### Generate model output ###\n        memory_step.model_input_messages = input_messages\n        try:\n            additional_args: dict[str, Any] = {}\n            if self.grammar:\n                additional_args[\"grammar\"] = self.grammar\n            if self._use_structured_outputs_internally:\n                additional_args[\"response_format\"] = CODEAGENT_RESPONSE_FORMAT\n            if self.stream_outputs:\n                output_stream = self.model.generate_stream(\n                    input_messages,\n                    stop_sequences=[\"<end_code>\", \"Observation:\", \"Calling tools:\"],\n                    **additional_args,\n                )\n                chat_message_stream_deltas: list[ChatMessageStreamDelta] = []\n                with Live(\"\", console=self.logger.console, vertical_overflow=\"visible\") as live:\n                    for event in output_stream:\n                        chat_message_stream_deltas.append(event)\n                        
live.update(\n                            Markdown(agglomerate_stream_deltas(chat_message_stream_deltas).render_as_markdown())\n                        )\n                        yield event\n                chat_message = agglomerate_stream_deltas(chat_message_stream_deltas)\n                memory_step.model_output_message = chat_message\n                output_text = chat_message.content\n            else:\n                chat_message: ChatMessage = self.model.generate(\n                    input_messages,\n                    stop_sequences=[\"<end_code>\", \"Observation:\", \"Calling tools:\"],\n                    **additional_args,\n                )\n                memory_step.model_output_message = chat_message\n                output_text = chat_message.content\n                self.logger.log_markdown(\n                    content=output_text,\n                    title=\"Output message of the LLM:\",\n                    level=LogLevel.DEBUG,\n                )\n\n            # This adds <end_code> sequence to the history.\n            # This will nudge ulterior LLM calls to finish with <end_code>, thus efficiently stopping generation.\n            if output_text and output_text.strip().endswith(\"```\"):\n                output_text += \"<end_code>\"\n                memory_step.model_output_message.content = output_text\n\n            memory_step.token_usage = chat_message.token_usage\n            memory_step.model_output = output_text\n        except Exception as e:\n            raise AgentGenerationError(f\"Error in generating model output:\\n{e}\", self.logger) from e\n\n        ### Parse output ###\n        try:\n            if self._use_structured_outputs_internally:\n                code_action = json.loads(output_text)[\"code\"]\n                code_action = extract_code_from_text(code_action) or code_action\n            else:\n                code_action = parse_code_blobs(output_text)\n            code_action = 
fix_final_answer_code(code_action)\n            memory_step.code_action = code_action\n        except Exception as e:\n            error_msg = f\"Error in code parsing:\\n{e}\\nMake sure to provide correct code blobs.\"\n            raise AgentParsingError(error_msg, self.logger) from e\n\n        memory_step.tool_calls = [\n            ToolCall(\n                name=\"python_interpreter\",\n                arguments=code_action,\n                id=f\"call_{len(self.memory.steps)}\",\n            )\n        ]\n\n        ### Execute action ###\n        self.logger.log_code(title=\"Executing parsed code:\", content=code_action, level=LogLevel.INFO)\n        is_final_answer = False\n        try:\n            output, execution_logs, is_final_answer = self.python_executor(code_action)\n            execution_outputs_console = []\n            if len(execution_logs) > 0:\n                execution_outputs_console += [\n                    Text(\"Execution logs:\", style=\"bold\"),\n                    Text(execution_logs),\n                ]\n            observation = \"Execution logs:\\n\" + execution_logs\n        except Exception as e:\n            if hasattr(self.python_executor, \"state\") and \"_print_outputs\" in self.python_executor.state:\n                execution_logs = str(self.python_executor.state[\"_print_outputs\"])\n                if len(execution_logs) > 0:\n                    execution_outputs_console = [\n                        Text(\"Execution logs:\", style=\"bold\"),\n                        Text(execution_logs),\n                    ]\n                    memory_step.observations = \"Execution logs:\\n\" + execution_logs\n                    self.logger.log(Group(*execution_outputs_console), level=LogLevel.INFO)\n            error_msg = str(e)\n            if \"Import of \" in error_msg and \" is not allowed\" in error_msg:\n                self.logger.log(\n                    \"[bold red]Warning to user: Code execution failed due to an 
unauthorized import - Consider passing said import under `additional_authorized_imports` when initializing your CodeAgent.\",\n                    level=LogLevel.INFO,\n                )\n            raise AgentExecutionError(error_msg, self.logger) from e\n\n        truncated_output = truncate_content(str(output))\n        observation += \"Last output from code snippet:\\n\" + truncated_output\n        memory_step.observations = observation\n        \n        # # TODO: add output to self.pz_memory\n        # def get_role(msg_role):\n        #     return str(msg_role).split(\".\")[-1].lower()\n\n        # messages = [\n        #     {\"role\": get_role(memory_step.model_output_message.role), \"content\": memory_step.model_output_message.content},\n        #     {\"role\": \"user\", \"content\": memory_step.observations},\n        # ]\n        # self.pz_memory.add(messages, run_id=self.run_id, agent_id=self.name)\n\n        execution_outputs_console += [\n            Text(\n                f\"{('Out - Final answer' if is_final_answer else 'Out')}: {truncated_output}\",\n                style=(f\"bold {YELLOW_HEX}\" if is_final_answer else \"\"),\n            ),\n        ]\n        self.logger.log(Group(*execution_outputs_console), level=LogLevel.INFO)\n        memory_step.action_output = output\n        yield ActionOutput(output=output, is_final_answer=is_final_answer)\n\n    def _run_stream(\n        self, task: str, max_steps: int, images: list[\"PIL.Image.Image\"] | None = None\n    ) -> Generator[ActionStep | PlanningStep | FinalAnswerStep | ChatMessageStreamDelta]:\n        \"\"\"\n        Execute the agent.\n        \"\"\"\n        self.step_number = 1\n        returned_final_answer = False\n        while not returned_final_answer and self.step_number <= self.max_steps:\n            # Run a planning step if scheduled\n            if self.planning_interval is not None and (\n                self.step_number == 1 or (self.step_number - 1) % 
self.planning_interval == 0\n            ):\n                planning_start_time = time.time()\n                planning_step = None\n                for element in self._generate_planning_step(\n                    self.task, is_first_step=len(self.memory.steps) == 1, step=self.step_number\n                ):  # Don't use the attribute step_number here, because there can be steps from previous runs\n                    yield element\n                    planning_step = element\n                assert isinstance(planning_step, PlanningStep)  # Last yielded element should be a PlanningStep\n                self.memory.steps.append(planning_step)\n                planning_end_time = time.time()\n                planning_step.timing = Timing(\n                    start_time=planning_start_time,\n                    end_time=planning_end_time,\n                )\n\n            # Start action step!\n            action_step_start_time = time.time()\n            action_step = ActionStep(\n                step_number=self.step_number,\n                timing=Timing(start_time=action_step_start_time),\n                observations_images=images,\n            )\n            self.logger.log_rule(f\"Step {self.step_number}\", level=LogLevel.INFO)\n            try:\n                for output in self._step_stream(action_step):\n                    # Yield streaming deltas\n                    if not isinstance(output, (ActionOutput, ToolOutput)):\n                        # non-action, non-tool output\n                        yield output\n\n                    if isinstance(output, (ActionOutput, ToolOutput)) and output.is_final_answer:\n                        if self.final_answer_checks:\n                            self._validate_final_answer(output.output)\n                        returned_final_answer = True\n                        action_step.is_final_answer = True\n                        final_answer = output.output\n                        # handle final step\n        
    except AgentGenerationError as e:\n                # Agent generation errors are not caused by a Model error but an implementation error: so we should raise them and exit.\n                raise e\n            except AgentError as e:\n                # Other AgentError types are caused by the Model, so we should log them and iterate.\n                action_step.error = e\n            finally:\n                self._finalize_step(action_step)\n                self.memory.steps.append(action_step)\n                yield action_step\n                self.step_number += 1\n\n        if not returned_final_answer and self.step_number == self.max_steps + 1:\n            final_answer = self._handle_max_steps_reached(self.task, images)\n            yield action_step\n        yield FinalAnswerStep(handle_agent_output_types(final_answer))\n\n    def run(\n        self,\n        task: str,\n        stream: bool = False,\n        reset: bool = True,\n        images: list[\"PIL.Image.Image\"] | None = None,\n        additional_args: dict | None = None,\n        max_steps: int | None = None,\n    ):\n        \"\"\"\n        Run the agent for the given task.\n\n        Args:\n            task (`str`): Task to perform.\n            stream (`bool`): Whether to run in streaming mode.\n                If `True`, returns a generator that yields each step as it is executed. You must iterate over this generator to process the individual steps (e.g., using a for loop or `next()`).\n                If `False`, executes all steps internally and returns only the final answer after completion.\n            reset (`bool`): Whether to reset the conversation or keep it going from previous run.\n            images (`list[PIL.Image.Image]`, *optional*): Image(s) objects.\n            additional_args (`dict`, *optional*): Any other variables that you want to pass to the agent run, for instance images or dataframes. 
Give them clear names!\n            max_steps (`int`, *optional*): Maximum number of steps the agent can take to solve the task. if not provided, will use the agent's default value.\n\n        Example:\n        ```py\n        from smolagents import CodeAgent\n        agent = CodeAgent(tools=[])\n        agent.run(\"What is the result of 2 power 3.7384?\")\n        ```\n        \"\"\"\n        max_steps = max_steps or self.max_steps\n        self.task = task\n        self.interrupt_switch = False\n        if additional_args is not None:\n            self.state.update(additional_args)\n            self.task += f\"\"\"\nYou have been provided with these additional arguments, that you can access using the keys as variables in your python code:\n{str(additional_args)}.\"\"\"\n\n        self.memory.system_prompt = SystemPromptStep(system_prompt=self.system_prompt)\n        if reset:\n            self.memory.reset()\n            self.monitor.reset()\n\n        self.logger.log_task(\n            content=self.task.strip(),\n            subtitle=f\"{type(self.model).__name__} - {(self.model.model_id if hasattr(self.model, 'model_id') else '')}\",\n            level=LogLevel.INFO,\n            title=self.name if hasattr(self, \"name\") else None,\n        )\n        self.memory.steps.append(TaskStep(task=self.task, task_images=images))\n\n        if getattr(self, \"python_executor\", None):\n            self.python_executor.send_variables(variables=self.state)\n            self.python_executor.send_tools({**self.tools, **self.managed_agents})\n\n        if stream:\n            # The steps are returned as they are executed through a generator to iterate on.\n            return self._run_stream(task=self.task, max_steps=max_steps, images=images)\n        run_start_time = time.time()\n        # Outputs are returned only at the end. 
We only look at the last step.\n\n        steps = list(self._run_stream(task=self.task, max_steps=max_steps, images=images))\n        assert isinstance(steps[-1], FinalAnswerStep)\n        output = steps[-1].output\n\n        if self.return_full_result:\n            total_input_tokens = 0\n            total_output_tokens = 0\n            correct_token_usage = True\n            for step in self.memory.steps:\n                if isinstance(step, (ActionStep, PlanningStep)):\n                    if step.token_usage is None:\n                        correct_token_usage = False\n                        break\n                    else:\n                        total_input_tokens += step.token_usage.input_tokens\n                        total_output_tokens += step.token_usage.output_tokens\n            if correct_token_usage:\n                token_usage = TokenUsage(input_tokens=total_input_tokens, output_tokens=total_output_tokens)\n            else:\n                token_usage = None\n\n            if self.memory.steps and isinstance(getattr(self.memory.steps[-1], \"error\", None), AgentMaxStepsError):\n                state = \"max_steps_error\"\n            else:\n                state = \"success\"\n\n            messages = self.memory.get_full_steps()\n\n            return RunResult(\n                output=output,\n                token_usage=token_usage,\n                messages=messages,\n                timing=Timing(start_time=run_start_time, end_time=time.time()),\n                state=state,\n            )\n\n        return output\n\n\nclass PZBaseManagedAgent(PZBaseAgent):\n\n    def __call__(self, task: str, **kwargs):\n        \"\"\"Adds additional prompting for the managed agent, runs it, and wraps the output.\n        This method is called only by a managed agent.\n        \"\"\"\n        full_task = populate_template(\n            self.prompt_templates[\"managed_agent\"][\"task\"],\n            variables=dict(name=self.name, task=task, 
context_description=self.context_description),\n        )\n        result = self.run(full_task, **kwargs)\n        report = result.output if isinstance(result, RunResult) else result\n        answer = populate_template(\n            self.prompt_templates[\"managed_agent\"][\"report\"], variables=dict(name=self.name, final_answer=report)\n        )\n        if self.provide_run_summary:\n            answer += \"\\n\\nFor more detail, find below a summary of this agent's work:\\n<summary_of_work>\\n\"\n            for message in self.write_memory_to_messages(summary_mode=True):\n                content = message.content\n                answer += \"\\n\" + truncate_content(str(content)) + \"\\n---\"\n            answer += \"\\n</summary_of_work>\"\n        return answer\n\n\nclass DataDiscoveryAgent(PZBaseManagedAgent):\n    def __init__(self, run_id: str, context_description: str, *args, **kwargs):\n        self.description = \"\"\"A team member that will search a data repository to find files which help to answer your question.\n    Ask him for all your questions that require searching a repository of relevant data.\n    Provide him as much context as possible, in particular if you need to search on a specific timeframe!\n    And don't hesitate to provide him with a complex search task, like finding a difference between two files.\n    Your request must be a real sentence, not a keyword search! 
Like \"Find me this information (...)\" rather than a few keywords.\n        \"\"\"\n        prompt_templates = PromptTemplates(\n            system_prompt=CODE_AGENT_SYSTEM_PROMPT,\n            planning=PlanningPromptTemplate(\n                initial_plan=DATA_DISCOVERY_AGENT_INITIAL_PLAN_PROMPT,\n                update_plan_pre_messages=DATA_DISCOVERY_AGENT_UPDATE_PLAN_PRE_MESSAGES_PROMPT,\n                update_plan_post_messages=DATA_DISCOVERY_AGENT_UPDATE_PLAN_POST_MESSAGES_PROMPT,\n            ),\n            managed_agent=ManagedAgentPromptTemplate(task=DATA_DISCOVERY_AGENT_TASK_PROMPT, report=DATA_DISCOVERY_AGENT_REPORT_PROMPT),\n            final_answer=FinalAnswerPromptTemplate(pre_messages=FINAL_ANSWER_PRE_MESSAGES_PROMPT, post_messages=FINAL_ANSWER_POST_MESSAGES_PROMPT),\n        )\n\n        super().__init__(\n            *args,\n            run_id=run_id,\n            context_description=context_description,\n            prompt_templates=prompt_templates,\n            max_steps=20,\n            verbosity_level=2,\n            planning_interval=4,\n            name=\"data_discovery_agent\",\n            description=self.description,\n            provide_run_summary=True,\n            **kwargs,\n        )\n        self.prompt_templates[\"managed_agent\"][\"task\"] += \"\"\"Additionally, if after some searching you find out that you need more information to answer the question, you can use `final_answer` with your request for clarification as argument to request for more information.\"\"\"\n\n\nclass SearchManagerAgent(PZBaseAgent):\n    def __init__(self, run_id: str, context_description: str, *args, **kwargs):\n        prompt_templates = PromptTemplates(\n            system_prompt=CODE_AGENT_SYSTEM_PROMPT,\n            planning=PlanningPromptTemplate(\n                initial_plan=DATA_DISCOVERY_AGENT_INITIAL_PLAN_PROMPT,\n                update_plan_pre_messages=DATA_DISCOVERY_AGENT_UPDATE_PLAN_PRE_MESSAGES_PROMPT,\n                
update_plan_post_messages=DATA_DISCOVERY_AGENT_UPDATE_PLAN_POST_MESSAGES_PROMPT,\n            ),\n            managed_agent=ManagedAgentPromptTemplate(task=DATA_DISCOVERY_AGENT_TASK_PROMPT, report=DATA_DISCOVERY_AGENT_REPORT_PROMPT),\n            final_answer=FinalAnswerPromptTemplate(pre_messages=FINAL_ANSWER_PRE_MESSAGES_PROMPT, post_messages=FINAL_ANSWER_POST_MESSAGES_PROMPT),\n        )\n        super().__init__(\n            *args,\n            run_id=run_id,\n            context_description=context_description,\n            prompt_templates=prompt_templates,\n            max_steps=12,\n            verbosity_level=2,\n            additional_authorized_imports=[\"*\"],\n            planning_interval=4,\n            return_full_result=True,\n            **kwargs,\n        )\n\n# class ManagerAgent(CodeAgent):\n\n#     def _step_stream(self, memory_step: ActionStep) -> Generator[ChatMessageStreamDelta | ActionOutput | ToolOutput]:\n#         \"\"\"\n#         Perform one step in the ReAct framework: the agent thinks, acts, and observes the result.\n#         Yields ChatMessageStreamDelta during the run if streaming is enabled.\n#         At the end, yields either None if the step is not final, or the final answer.\n#         \"\"\"\n#         raise NotImplementedError(\"This method should be implemented in child classes\")"
  },
  {
    "path": "src/palimpzest/constants.py",
    "content": "### This file contains constants used by Palimpzest ###\nfrom __future__ import annotations\n\nimport os\nfrom enum import Enum\n\nimport litellm\n\nfrom palimpzest.utils.model_info_helpers import ModelMetricsManager, predict_local_model_metrics\n\n\nclass PromptStrategy(str, Enum):\n    \"\"\"\n    PromptStrategy describes the prompting technique to be used by a Generator when\n    performing some task with a specified Model.\n    \"\"\"\n\n    # aggregation prompt strategies\n    AGG = \"aggregation\"\n    AGG_NO_REASONING = \"aggregation-no-reasoning\"\n\n    # filter prompt strategies\n    FILTER = \"filter\"\n    FILTER_NO_REASONING = \"filter-no-reasoning\"\n    FILTER_CRITIC = \"filter-critic\"\n    FILTER_REFINE = \"filter-refine\"\n    FILTER_MOA_PROPOSER = \"filter-mixture-of-agents-proposer\"\n    FILTER_MOA_AGG = \"filter-mixture-of-agents-aggregator\"\n    FILTER_SPLIT_PROPOSER = \"filter-split-proposer\"\n    FILTER_SPLIT_MERGER = \"filter-split-merger\"\n\n    # join prompt strategies\n    JOIN = \"join\"\n    JOIN_NO_REASONING = \"join-no-reasoning\"\n\n    # map prompt strategies\n    MAP = \"map\"\n    MAP_NO_REASONING = \"map-no-reasoning\"\n    MAP_CRITIC = \"map-critic\"\n    MAP_REFINE = \"map-refine\"\n    MAP_MOA_PROPOSER = \"map-mixture-of-agents-proposer\"\n    MAP_MOA_AGG = \"map-mixture-of-agents-aggregator\"\n    MAP_SPLIT_PROPOSER = \"map-split-proposer\"\n    MAP_SPLIT_MERGER = \"map-split-merger\"\n\n    def is_agg_prompt(self):\n        return \"aggregation\" in self.value\n\n    def is_filter_prompt(self):\n        return \"filter\" in self.value\n\n    def is_join_prompt(self):\n        return \"join\" in self.value\n\n    def is_map_prompt(self):\n        return \"map\" in self.value\n\n    def is_critic_prompt(self):\n        return \"critic\" in self.value\n\n    def is_refine_prompt(self):\n        return \"refine\" in self.value\n\n    def is_moa_proposer_prompt(self):\n        return 
\"mixture-of-agents-proposer\" in self.value\n\n    def is_moa_aggregator_prompt(self):\n        return \"mixture-of-agents-aggregator\" in self.value\n\n    def is_split_proposer_prompt(self):\n        return \"split-proposer\" in self.value\n\n    def is_split_merger_prompt(self):\n        return \"split-merger\" in self.value\n\n    def is_no_reasoning_prompt(self):\n        return \"no-reasoning\" in self.value\n\n\nclass Modality(str, Enum):\n    TEXT = \"text\"\n    IMAGE = \"image\"\n    AUDIO = \"audio\"\n\n\nclass AggFunc(str, Enum):\n    COUNT = \"count\"\n    AVERAGE = \"average\"\n    SUM = \"sum\"\n    MIN = \"min\"\n    MAX = \"max\"\n\nclass Cardinality(str, Enum):\n    ONE_TO_ONE = \"one-to-one\"\n    ONE_TO_MANY = \"one-to-many\"\n\n    @classmethod\n    def _missing_(cls, value):\n        if value:\n            normalized_value = \"\".join([x for x in value if x.isalpha()]).lower()\n            for member in cls:\n                normalized_member = \"\".join([x for x in member if x.isalpha()]).lower()\n                if normalized_member == normalized_value:\n                    return member\n        return cls.ONE_TO_ONE\n\n\nclass PickOutputStrategy(str, Enum):\n    CHAMPION = \"champion\"\n    ENSEMBLE = \"ensemble\"\n\n\nAUDIO_EXTENSIONS = [\".wav\"]\nIMAGE_EXTENSIONS = [\".jpg\", \".jpeg\", \".png\", \".gif\", \".bmp\", \".tiff\"]\nPDF_EXTENSIONS = [\".pdf\"]\nXLS_EXTENSIONS = [\".xls\", \".xlsx\"]\nHTML_EXTENSIONS = [\".html\", \".htm\"]\n\n# the number of seconds the parallel execution will sleep for while waiting for futures to complete\nPARALLEL_EXECUTION_SLEEP_INTERVAL_SECS = 0.3\n\n# default PDF parser\nDEFAULT_PDF_PROCESSOR = \"pypdf\"\n\n# character limit for various IDs\nMAX_ID_CHARS = 10\n\n# maximum number of rows to display in a table\nMAX_ROWS = 5\n\n# maximum number of rows to parse from an HTML\nMAX_HTML_ROWS = 10000\n\n\ndef log_attempt_number(retry_state):\n    \"\"\"return the result of the last call attempt\"\"\"\n    
print(f\"Retrying: {retry_state.attempt_number}...\")\n\n\n# Palimpzest root directory\nPZ_DIR = os.path.join(os.path.expanduser(\"~\"), \".palimpzest\")\n\n# assume 500 MB/sec for local SSD scan time\nLOCAL_SCAN_TIME_PER_KB = 1 / (float(500) * 1024)\n\n# assume 30 GB/sec for sequential access of memory\nMEMORY_SCAN_TIME_PER_KB = 1 / (float(30) * 1024 * 1024)\n\n# assume 1 KB per record\nNAIVE_BYTES_PER_RECORD = 1024\n\n# rough conversion from # of characters --> # of tokens; assumes 1 token ~= 4 chars\nTOKENS_PER_CHARACTER = 0.25\n\n# rough estimate of the number of tokens the context is allowed to take up for LLAMA3 models\nLLAMA_CONTEXT_TOKENS_LIMIT = 6000\n\n# a naive estimate for the input record size\nNAIVE_EST_SOURCE_RECORD_SIZE_IN_BYTES = 1_000_000\n\n# a naive estimate for filter selectivity\nNAIVE_EST_FILTER_SELECTIVITY = 0.5\n\n# a naive estimate for join selectivity\nNAIVE_EST_JOIN_SELECTIVITY = 0.5\n\n# a naive estimate for the number of input tokens processed per record\nNAIVE_EST_NUM_INPUT_TOKENS = 1000\n\n# a naive estimate for the number of output tokens processed per record\nNAIVE_EST_NUM_OUTPUT_TOKENS = 100\n\n# a naive estimate for the number of groups returned by a group by\nNAIVE_EST_NUM_GROUPS = 3\n\n# a naive estimate for the factor of increase (loosely termed \"selectivity\") for one-to-many cardinality operations\nNAIVE_EST_ONE_TO_MANY_SELECTIVITY = 2\n\n# a naive estimate of the time it takes to extract the latex for an equation from an image file using Skema\nNAIVE_IMAGE_TO_EQUATION_LATEX_TIME_PER_RECORD = 10.0\n\n# a naive estimate of the time it takes to extract the text from a PDF using a PDF processor\nNAIVE_PDF_PROCESSOR_TIME_PER_RECORD = 10.0\n\n# whether or not to log LLM outputs\nLOG_LLM_OUTPUT = False\n\n# maximum number of models to use when user does not narrow optimization space\nMAX_AVAILABLE_MODELS = 5\n\nclass Model:\n    \"\"\"\n    Model describes the underlying LLM which should be used to perform some operation\n    
which requires invoking an LLM.\n    \"\"\"\n    # Registry of known models (maps value string to Model instance)\n    _registry: dict[str, Model] = {}\n\n    def __init__(self, model_id: str, api_base: str | None = None, **vllm_kwargs):\n        self.metrics_manager = ModelMetricsManager()\n        self.model_id = model_id\n        self.api_base = api_base\n        self.vllm_kwargs = vllm_kwargs if vllm_kwargs else {}\n\n        # For vLLM models (api_base is set), try to get model info from litellm's local data\n        if api_base is not None:\n            self.model_specs = self._get_litellm_model_specs(model_id)\n        else:\n            self.model_specs = self.metrics_manager.get_model_metrics(model_id)\n            if not self.model_specs:\n                raise ValueError(\"Palimpzest currently does not contain information about this model.\")\n\n        Model._registry[model_id] = self\n\n    def _get_litellm_model_specs(self, model_id: str) -> dict:\n        \"\"\"Get model specs from litellm's local model_cost data for vLLM models.\"\"\"\n        # Use predict function to get quality, latency metrics, and capability flags\n        predicted_metrics = predict_local_model_metrics(model_id)\n\n        # Start with defaults, then overlay predicted values\n        specs = {\n            \"is_text_model\": True,\n            \"is_vision_model\": False,\n            \"is_llama_model\": False,\n            \"is_clip_model\": False,\n            \"is_audio_model\": False,\n            \"is_reasoning_model\": False,\n            \"is_embedding_model\": False,\n            \"is_text_image_multimodal_embedding_model\": False,\n            \"is_vllm_model\": True,  # Mark as vLLM model\n            \"usd_per_input_token\": 0.0,  # Cost always 0 for local model\n            \"usd_per_output_token\": 0.0,\n            \"seconds_per_output_token\": predicted_metrics[\"seconds_per_output_token\"],\n            \"MMLU_Pro_score\": 
predicted_metrics[\"MMLU_Pro_score\"],\n        }\n\n        # Overlay all flags detected from model name (including False values like is_text_model for embeddings)\n        for key, value in predicted_metrics.items():\n            if key.startswith(\"is_\"):\n                specs[key] = value\n\n        # Try litellm for additional capability detection (may not work for local models)\n        try:\n            if litellm.supports_vision(model=model_id):\n                specs[\"is_vision_model\"] = True\n        except Exception:\n            pass\n\n        try:\n            if litellm.supports_audio_input(model=model_id):\n                specs[\"is_audio_model\"] = True\n        except Exception:\n            pass\n\n        return specs\n\n    def __lt__(self, other):\n        if isinstance(other, Model):\n            return self.value < other.value\n        if isinstance(other, str):\n            return self.value < other\n        return NotImplemented\n\n    @classmethod\n    def get_all_models(cls) -> list[Model]:\n        return list(cls._registry.values())\n\n    @property\n    def value(self) -> str:\n        return self.model_id\n\n    @property\n    def provider(self) -> str | None:\n        \"\"\"Returns the provider string for this model.\"\"\"\n        return self.model_specs.get(\"provider\")\n\n    @property\n    def api_key_env_var(self) -> str | None:\n        \"\"\"\n        Returns the standard environment variable name for this provider's API key.\n        \"\"\"\n        if self.provider == \"gemini\":\n            return \"GEMINI_API_KEY\" if os.getenv(\"GEMINI_API_KEY\") else \"GOOGLE_API_KEY\"\n        if self.provider == \"azure\":\n            return \"AZURE_API_KEY\" if os.getenv(\"AZURE_API_KEY\") else \"AZURE_OPENAI_API_KEY\"\n        mapping = {\n            \"openai\": \"OPENAI_API_KEY\",\n            \"vertex_ai\": \"GOOGLE_APPLICATION_CREDENTIALS\",\n            \"anthropic\": \"ANTHROPIC_API_KEY\",\n            \"together_ai\": 
\"TOGETHER_API_KEY\",\n            \"hosted_vllm\": \"VLLM_API_KEY\"\n        }\n        return mapping.get(self.provider)\n\n    def __repr__(self) -> str:\n        return self.value\n\n    def __str__(self) -> str:\n        return self.value\n\n    def __eq__(self, other: object) -> bool:\n        if isinstance(other, Model):\n            return self.value == other.value\n        if isinstance(other, str):\n            return self.value == other\n        return NotImplemented\n\n    def __hash__(self) -> int:\n        return hash(self.value)\n\n    def is_llama_model(self) -> bool:\n        return self.model_specs.get(\"is_llama_model\", False)\n    \n    def is_vllm_model(self) -> bool:\n        return self.model_specs.get(\"is_vllm_model\", False) and self.api_base is not None\n    \n    def is_embedding_model(self) -> bool:\n        return self.model_specs.get(\"is_embedding_model\", False)\n\n    def is_text_image_multimodal_embedding_model(self) -> bool:\n        return self.model_specs.get(\"is_text_image_multimodal_embedding_model\", False)\n\n    def is_provider_vertex_ai(self) -> bool:\n        return self.provider == \"vertex_ai\"\n    \n    def is_provider_anthropic(self) -> bool:\n        return self.provider == \"anthropic\"\n    \n    def is_provider_google_ai_studio(self) -> bool:\n        return self.provider == \"gemini\"\n\n    def is_provider_openai(self) -> bool:\n        return self.provider == \"openai\"\n    \n    def is_provider_azure(self) -> bool:\n        return self.provider == \"azure\"\n    \n    def is_provider_together_ai(self) -> bool:\n        return self.provider == \"together_ai\"\n    \n    def is_provider_deepseek(self) -> bool:\n        return self.provider == \"deepseek\"\n\n    def is_provider_ollama(self) -> bool:\n        return self.provider == \"ollama\"\n\n    def is_model_gemini(self) -> bool:\n        return \"gemini\" in self.value.lower()\n\n    def get_model_name(self) -> str:\n        return 
self.value.split(\"/\")[-1] if \"/\" in self.value else self.value\n\n    def is_o_model(self) -> bool:\n        return self.model_specs.get(\"is_o_model\", False)\n\n    def is_gpt_5_model(self) -> bool:\n        return self.model_specs.get(\"is_gpt_5_model\", False)\n\n    def is_reasoning_model(self) -> bool:\n        return self.model_specs.get(\"is_reasoning_model\", False)\n\n    def is_text_model(self) -> bool:\n        return self.model_specs.get(\"is_text_model\", False)\n\n    def is_vision_model(self) -> bool:\n        return self.model_specs.get(\"is_vision_model\", False)\n\n    def is_audio_model(self) -> bool:\n        return self.model_specs.get(\"is_audio_model\", False)\n\n    def is_text_image_multimodal_model(self) -> bool:\n        return self.is_text_model() and self.is_vision_model()\n\n    def is_text_audio_multimodal_model(self) -> bool:\n        return self.is_audio_model() and self.is_text_model()\n\n    def supports_prompt_caching(self) -> bool:\n        return (self.is_provider_anthropic() or self.is_provider_google_ai_studio() or self.is_provider_vertex_ai() or self.is_provider_openai() or self.is_provider_azure()) \\\n            and self.model_specs.get(\"supports_prompt_caching\", False)\n\n    def get_usd_per_input_token(self) -> float:\n        return self.model_specs.get(\"usd_per_input_token\", 0.0)\n\n    def get_usd_per_audio_input_token(self) -> float:\n        return self.model_specs.get(\"usd_per_audio_input_token\", self.get_usd_per_input_token())\n    \n    # forward-looking, TODO: default value discussion\n    def get_usd_per_image_input_token(self) -> float:\n        return self.model_specs.get(\"usd_per_image_input_token\", self.get_usd_per_input_token())\n\n    def get_usd_per_cache_read_token(self) -> float:\n        return self.model_specs.get(\"usd_per_cache_read_token\", self.get_usd_per_input_token())\n    \n    def get_usd_per_audio_cache_read_token(self) -> float:\n        return 
self.model_specs.get(\"usd_per_audio_cache_read_token\", self.get_usd_per_cache_read_token())\n    \n    def get_usd_per_image_cache_read_token(self) -> float:\n        return self.model_specs.get(\"usd_per_image_cache_read_token\", self.get_usd_per_cache_read_token())\n    \n    # forward looking; Gemini explicit\n    def get_usd_per_cached_token_per_hour(self) -> float:\n        return self.model_specs.get(\"usd_per_cached_token_per_hour\", 0.0)\n     \n    def get_usd_per_cache_creation_token(self) -> float:\n        return self.model_specs.get(\"usd_per_cache_creation_token\", 0.0)\n    \n    def get_usd_per_output_token(self) -> float:\n        return self.model_specs.get(\"usd_per_output_token\", 0.0)\n    \n    # forward-looking\n    def get_usd_per_audio_cache_creation_token(self) -> float:\n        return self.model_specs.get(\"usd_per_audio_cache_creation_token\", 0.0)\n    \n    # forward-looking\n    def get_usd_per_image_cache_creation_token(self) -> float:\n        return self.model_specs.get(\"usd_per_image_cache_creation_token\", 0.0)\n    \n    def get_seconds_per_output_token(self) -> float:\n        return self.model_specs.get(\"seconds_per_output_token\", 0.0)\n\n    def get_overall_score(self) -> float:\n        return self.model_specs.get(\"MMLU_Pro_score\", 0.0)\n\n# TODO: investigate which (if any llama3 models are still supported by TogetherAI)\n# Model.LLAMA3_2_3B = Model(\"together_ai/meta-llama/Llama-3.2-3B-Instruct-Turbo\") - seems to be deprecated\nModel.LLAMA3_1_8B = Model(\"together_ai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo\")\nModel.LLAMA3_3_70B = Model(\"together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo\")\nModel.LLAMA3_2_90B_V = Model(\"together_ai/meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo\")\nModel.DEEPSEEK_V3 = Model(\"together_ai/deepseek-ai/DeepSeek-V3\")\nModel.DEEPSEEK_R1_DISTILL_QWEN_1_5B = Model(\"together_ai/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B\")\nModel.GPT_4o = 
Model(\"openai/gpt-4o-2024-08-06\")\nModel.GPT_4o_MINI = Model(\"openai/gpt-4o-mini-2024-07-18\")\nModel.GPT_4_1 = Model(\"openai/gpt-4.1-2025-04-14\")\nModel.GPT_4_1_MINI = Model(\"openai/gpt-4.1-mini-2025-04-14\")\nModel.GPT_4_1_NANO = Model(\"openai/gpt-4.1-nano-2025-04-14\")\nModel.GPT_5 = Model(\"openai/gpt-5-2025-08-07\")\nModel.GPT_5_MINI = Model(\"openai/gpt-5-mini-2025-08-07\")\nModel.GPT_5_NANO = Model(\"openai/gpt-5-nano-2025-08-07\")\nModel.GPT_5_2 = Model(\"openai/gpt-5.2-2025-12-11\")\nModel.o4_MINI = Model(\"openai/o4-mini-2025-04-16\")  # noqa: N815\n# Model.CLAUDE_3_5_SONNET = Model(\"anthropic/claude-3-5-sonnet-20241022\") - retired 10/28/2025\nModel.CLAUDE_3_7_SONNET = Model(\"anthropic/claude-3-7-sonnet-20250219\")\nModel.CLAUDE_4_SONNET = Model(\"anthropic/claude-sonnet-4-20250514\")\nModel.CLAUDE_4_5_SONNET = Model(\"anthropic/claude-sonnet-4-5-20250929\")\nModel.CLAUDE_3_5_HAIKU = Model(\"anthropic/claude-3-5-haiku-20241022\")\nModel.CLAUDE_4_5_HAIKU = Model(\"anthropic/claude-haiku-4-5-20251001\")\nModel.GEMINI_3_0_PRO = Model(\"vertex_ai/gemini-3-pro-preview\")  # image\nModel.GEMINI_3_0_FLASH = Model(\"vertex_ai/gemini-3-flash-preview\")  # Text, Image, Video, Audio, and PDF\nModel.GEMINI_2_0_FLASH = Model(\"vertex_ai/gemini-2.0-flash\")\nModel.GEMINI_2_5_FLASH = Model(\"vertex_ai/gemini-2.5-flash\")\nModel.GEMINI_2_5_PRO = Model(\"vertex_ai/gemini-2.5-pro\")\nModel.GOOGLE_GEMINI_3_0_PRO = Model(\"gemini/gemini-3-pro-preview\")\nModel.GOOGLE_GEMINI_3_0_FLASH = Model(\"gemini/gemini-3-flash-preview\")\nModel.GOOGLE_GEMINI_2_5_FLASH = Model(\"gemini/gemini-2.5-flash\")\nModel.GOOGLE_GEMINI_2_5_FLASH_LITE = Model(\"gemini/gemini-2.5-flash-lite\")\nModel.GOOGLE_GEMINI_2_5_PRO = Model(\"gemini/gemini-2.5-pro\")\nModel.LLAMA_4_MAVERICK = Model(\"vertex_ai/meta/llama-4-maverick-17b-128e-instruct-maas\")\nModel.GPT_4o_AUDIO_PREVIEW = Model(\"openai/gpt-4o-audio-preview\")\nModel.GPT_4o_MINI_AUDIO_PREVIEW = 
Model(\"openai/gpt-4o-mini-audio-preview\")\nModel.AZURE_GPT_4o = Model(\"azure/gpt-4o-2024-08-06\")\nModel.AZURE_GPT_4o_MINI = Model(\"azure/gpt-4o-mini-2024-07-18\")\nModel.AZURE_GPT_4_1 = Model(\"azure/gpt-4.1-2025-04-14\")\nModel.AZURE_GPT_4_1_MINI = Model(\"azure/gpt-4.1-mini-2025-04-14\")\nModel.AZURE_GPT_4_1_NANO = Model(\"azure/gpt-4.1-nano-2025-04-14\")\nModel.AZURE_o4_MINI = Model(\"azure/o4-mini-2025-04-16\")  # noqa: N815\nModel.AZURE_GPT_4o_AUDIO_PREVIEW = Model(\"azure/gpt-4o-audio-preview\")\nModel.AZURE_GPT_4o_MINI_AUDIO_PREVIEW = Model(\"azure/gpt-4o-mini-audio-preview\")\nModel.TEXT_EMBEDDING_3_SMALL = Model(\"openai/text-embedding-3-small\")\nModel.CLIP_VIT_B_32 = Model(\"clip-ViT-B-32\")\nModel.NOMIC_EMBED_TEXT = Model(\"ollama/nomic-embed-text\")\n\n#### MODEL PERFORMANCE & COST METRICS ####\n# Overall model quality is computed using MMLU-Pro; multi-modal models currently use the same score for vision\n# - in the future we should split quality for vision vs. multi-modal vs. text\n# - code quality was computed using HumanEval, but that benchmark is too easy and should be replaced.\n# - https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro\n# - https://www.vals.ai/benchmarks/mmlu_pro-08-12-2025\n#\n# Cost is presented in terms of USD / token for input tokens and USD / token for\n# generated tokens.\n#\n# Time is presented in seconds per output token. I grabbed some semi-recent estimates\n# from the internet for this quick POC, but we can and should do more to model these\n# values more precisely:\n# - https://artificialanalysis.ai/models/llama-3-1-instruct-8b\n# \n# Model metrics now fetched from singular json file curated_model_info.json\n"
  },
  {
    "path": "src/palimpzest/core/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/core/data/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/core/data/context.py",
    "content": "from __future__ import annotations\n\nimport os\nimport re\nfrom abc import ABC\nfrom typing import Callable\n\nfrom pydantic import BaseModel\nfrom smolagents import CodeAgent, LiteLLMModel\n\nfrom palimpzest.core.data import context_manager\nfrom palimpzest.core.data.dataset import Dataset\nfrom palimpzest.core.lib.schemas import create_schema_from_fields, union_schemas\nfrom palimpzest.query.operators.logical import ComputeOperator, ContextScan, LogicalOperator, SearchOperator\nfrom palimpzest.utils.hash_helpers import hash_for_id\n\nPZ_INSTRUCTION = \"\"\"\\n\\nYou are a CodeAgent who is a specialist at writing declarative AI programs with the Palimpzest (PZ) library.\n\nPalimpzest is a programming framework which provides you with **semantic operators** (e.g. semantic maps, semantic filters, etc.)\nwhich are like their traditional counterparts, except they can execute instructions provided in natural language.\n\nFor example, if you wanted to write a program to extract the title and abstract from a directory of papers,\nyou could write the following in PZ:\n```\nimport palimpzest as pz\nfrom dotenv import load_dotenv\n\n# Load environment variables from .env file\nload_dotenv()\n\n# Define columns for semantic map (sem_map) operation; each column is specified\n# with a dictionary containing the following keys:\n# - \"name\": the name of the field to compute\n# - \"type\": the type of the field to compute\n# - \"description\": the natural language description of the field\npaper_cols = [\n    {\"name\": \"title\", \"type\": str, \"description\": \"the title of the paper\"},\n    {\"name\": \"abstract\", \"type\": str, \"description\": \"the paper's abstract\"},\n]\n\n# construct the data processing pipeline with PZ\nds = pz.TextFileDataset(id=\"papers\", path=\"path/to/papers\")\nds = ds.sem_map(cols)\n\n# optimize and execute the PZ program\nvalidator = pz.Validator()\nconfig = pz.QueryProcessorConfig(\n    policy=pz.MaxQuality(),\n    
execution_strategy=\"parallel\",\n    max_workers=20,\n    progress=True,\n)\noutput = ds.optimize_and_run(config=config, validator=validator)\n\n# write the execution stats to json\noutput.execution_stats.to_json(\"pz_program_stats.json\")\n\n# write the output to a CSV and print the output CSV filepath so the user knows where to find it\noutput_filepath = \"pz_program_output.csv\"\noutput.to_df().to_csv(output_filepath, index=False)\nprint(f\"Results at: {output_filepath}\")\n```\n\nTo initialize a dataset in PZ, simply pass the path to a directory to `pz.TextFileDataset()`\n(if your data consists of text-based files). For example:\n```\nimport palimpzest as pz\nfrom dotenv import load_dotenv\n\n# Load environment variables from .env file\nload_dotenv()\n\nds = pz.TextFileDataset(id=\"files\", path=\"path/to/files\")\n```\n\nPalimpzest has two primary **semantic operators** which you can use to construct data processing pipelines:\n- sem_filter(predicate: str): executes a semantic filter specified by the natural language predicate on a given PZ dataset\n- sem_map(cols: list[dict]): executes a semantic map to compute the `cols` on a given PZ dataset\n\nAs a second example, consider the following PZ program which filters for papers about batteries that are from MIT\nand computes a summary for each one:\n```\nimport palimpzest as pz\nfrom dotenv import load_dotenv\n\n# Load environment variables from .env file\nload_dotenv()\n\n# construct the PZ program\nds = pz.TextFileDataset(id=\"papers\", path=\"path/to/research-papers\")\nds = ds.sem_filter(\"The paper is about batteries\")\nds = ds.sem_filter(\"The paper is from MIT\")\nds = ds.sem_map([{\"name\": \"summary\", \"type\": str, \"description\": \"A summary of the paper\"}])\n\n# optimize and execute the PZ program\nvalidator = pz.Validator()\nconfig = pz.QueryProcessorConfig(\n    policy=pz.MaxQuality(),\n    execution_strategy=\"parallel\",\n    max_workers=20,\n    progress=True,\n)\noutput = 
ds.optimize_and_run(config=config, validator=validator)\n\n# write the execution stats to json\noutput.execution_stats.to_json(\"pz_program_stats.json\")\n\n# write the output to a CSV and print the output CSV filepath so the user knows where to find it\noutput_filepath = \"pz_program_output.csv\"\noutput.to_df().to_csv(output_filepath, index=False)\nprint(f\"Results at: {output_filepath}\")\n```\n\nBe sure to always:\n- execute your program using the `.optimize_and_run()` format shown above\n- call `output.execution_stats.to_json(\"pz_program_stats.json\")` to write execution statistics to disk\n- write your output to CSV and print where you wrote it!\n\"\"\"\n\nclass Context(Dataset, ABC):\n    \"\"\"\n    The `Context` class is an abstract base class for root `Datasets` whose data is accessed\n    via user-defined methods. Classes which inherit from this class must implement two methods:\n\n    - `list_filepaths()`: which lists the files that the `Context` has access to.\n    - `read_filepath(path: str)`: which reads the file corresponding to the given `path`.\n\n    A `Context` is a special type of `Dataset` that represents a view over an underlying `Dataset`.\n    Each `Context` has a `name` which uniquely identifies it, as well as a natural language `description`\n    of the data / computation that the `Context` represents. 
Similar to `Dataset`s, `Context`s can be\n    lazily transformed using functions such as `sem_filter`, `sem_map`, `sem_join`, etc., and they may\n    be materialized or unmaterialized.\n    \"\"\"\n\n    def __init__(\n            self,\n            id: str,\n            description: str,\n            operator: LogicalOperator,\n            schema: type[BaseModel] | None = None,\n            sources: list[Context] | Context | None = None,\n            materialized: bool = False,\n        ) -> None:\n        \"\"\"\n        Constructor for the `Context` class.\n\n        Args:\n            id (`str`): a string identifier for the `Context`\n            description (`str`): the description of the data contained within the `Context`\n            operator (`LogicalOperator`): The `LogicalOperator` used to compute this `Context`.\n            schema: (`type[BaseModel] | None`): The schema of this `Context`.\n            sources (`list[Context] | Context | None`): The (list of) `Context(s)` which are input(s) to\n                the operator used to compute this `Context`.\n            materialized (`bool`): True if the `Context` has been computed, False otherwise\n        \"\"\"\n        # set the description\n        self._description = description\n\n        # set the materialization status\n        self._materialized = materialized\n\n        # compute schema and call parent constructor\n        if schema is None:\n            schema = create_schema_from_fields([{\"name\": \"context\", \"description\": \"The context\", \"type\": str}])\n        super().__init__(sources=sources, operator=operator, schema=schema, id=id)\n\n        # set the tools associated with this Context\n        self._tools = [getattr(self, attr) for attr in dir(self) if attr.startswith(\"tool_\")]\n\n        # add Context to ContextManager\n        cm = context_manager.ContextManager()\n        cm.add_context(self)\n\n    @property\n    def description(self) -> str:\n        \"\"\"The string 
containing all of the information computed for this `Context`\"\"\"\n        return self._description\n\n    @property\n    def materialized(self) -> bool:\n        \"\"\"The boolean which specifies whether the `Context` has been computed or not\"\"\"\n        return self._materialized\n\n    @property\n    def tools(self) -> list[Callable]:\n        \"\"\"The list of tools associated with this `Context`\"\"\"\n        return self._tools\n\n    def __str__(self) -> str:\n        return f\"Context(id={self.id}, description={self.description:20s}, materialized={self.materialized})\"\n\n    def set_description(self, description: str) -> None:\n        \"\"\"\n        Update the context's description.\n        \"\"\"\n        self._description = description\n\n    def set_materialized(self, materialized: bool) -> None:\n        \"\"\"\n        Update the context's materialization status.\n        \"\"\"\n        self._materialized = materialized\n\n    def compute(self, instruction: str) -> Context:\n        # construct new description and output schema\n        new_id = hash_for_id(instruction)\n        new_description = f\"Parent Context ID: {self.id}\\n\\nThis Context is the result of computing the following instruction on the parent context.\\n\\nINSTRUCTION: {instruction}\\n\\n\"\n        inter_schema = create_schema_from_fields([{\"name\": f\"result-{new_id}\", \"desc\": \"The result from computing the instruction on the input Context\", \"type\": str}])\n        new_output_schema = union_schemas([self.schema, inter_schema])\n\n        # construct logical operator\n        operator = ComputeOperator(\n            input_schema=self.schema,\n            output_schema=new_output_schema,\n            context_id=new_id,\n            instruction=instruction,\n        )\n\n        return Context(id=new_id, description=new_description, operator=operator, sources=[self], materialized=False)\n\n    def search(self, search_query: str) -> Context:\n        # 
construct new description and output schema\n        new_id = hash_for_id(search_query)\n        new_description = f\"Parent Context ID: {self.id}\\n\\nThis Context is the result of searching the parent context for information related to the following query.\\n\\nSEARCH QUERY: {search_query}\\n\\n\"\n\n        # construct logical operator\n        operator = SearchOperator(\n            input_schema=self.schema,\n            output_schema=self.schema,\n            context_id=new_id,\n            search_query=search_query,\n        )\n\n        return Context(id=new_id, description=new_description, operator=operator, sources=[self], materialized=False)\n\nclass TextFileContext(Context):\n    def __init__(self, path: str, id: str, description: str) -> None:\n        \"\"\"\n        Constructor for the `TextFileContext` class.\n\n        Args:\n            path (str): The path to the file or directory\n            id (str): a string identifier for the `Context`\n            description (str): The description of the data contained within the `Context`\n        \"\"\"\n        # check that path is a valid file or directory\n        assert os.path.isfile(path) or os.path.isdir(path), f\"Path {path} is neither a file nor a directory\"\n\n        # get list of filepaths\n        self.filepaths = []\n        if os.path.isfile(path):\n            self.filepaths = [path]\n        else:\n            for root, _, files in os.walk(path):\n                for file in files:\n                    fp = os.path.join(root, file)\n                    self.filepaths.append(fp)\n            self.filepaths = sorted(self.filepaths)\n\n        # call parent constructor to set id, operator, and schema\n        schema = create_schema_from_fields([{\"name\": \"context\", \"desc\": \"The context\", \"type\": str}])\n        super().__init__(\n            id=id,\n            
description=description,\n            operator=ContextScan(context=self, output_schema=schema),\n            schema=schema,\n            materialized=True,\n        )\n\n    def _check_filter_answer_text(self, answer_text: str) -> dict | None:\n        \"\"\"\n        Check the lowercased answer text for a filter decision, in this order:\n        - return {\"passed_operator\": True} if \"true\" is in the answer text;\n        - return {\"passed_operator\": False} if \"false\" is in the answer text;\n        - return {\"passed_operator\": True} if \"yes\" is in the answer text.\n        Otherwise, return None.\n        \"\"\"\n        # NOTE: we may be able to eliminate this condition by specifying this JSON output in the prompt;\n        # however, that would also need to coincide with a change to allow the parse_answer_fn to set \"passed_operator\"\n        if \"true\" in answer_text.lower():\n            return {\"passed_operator\": True}\n        elif \"false\" in answer_text.lower():\n            return {\"passed_operator\": False}\n        elif \"yes\" in answer_text.lower():\n            return {\"passed_operator\": True}\n\n        return None\n\n    def _parse_filter_answer(self, completion_text: str) -> dict[str, bool]:\n        \"\"\"Extract the answer from the completion text for filter operations.\"\"\"\n        # if the model followed the default instructions, the completion text will place\n        # its answer between \"ANSWER:\" and \"---\"\n        regex = re.compile(\"answer:(.*?)---\", re.IGNORECASE | re.DOTALL)\n        matches = regex.findall(completion_text)\n        if len(matches) > 0:\n            answer_text = matches[0].strip()\n            field_answers = self._check_filter_answer_text(answer_text)\n            if field_answers is not None:\n                return field_answers\n\n        # if the first regex didn't find an answer, try taking all the text after \"ANSWER:\"\n        regex = re.compile(\"answer:(.*)\", re.IGNORECASE | re.DOTALL)\n        matches = regex.findall(completion_text)\n        if len(matches) > 0:\n            answer_text = matches[0].strip()\n       
     field_answers = self._check_filter_answer_text(answer_text)\n            if field_answers is not None:\n                return field_answers\n\n        # finally, try taking all of the text; throw an exception if this doesn't work\n        field_answers = self._check_filter_answer_text(completion_text)\n        if field_answers is None:\n            raise Exception(f\"Could not parse answer from completion text: {completion_text}\")\n\n        return field_answers\n\n    # def tool_list_filepaths(self) -> list[str]:\n    #     \"\"\"\n    #     This tool returns the list of all of the filepaths which the `Context` has access to.\n\n    #     Args:\n    #         None\n        \n    #     Returns:\n    #         list[str]: A list of file paths for all files in the `Context`.\n    #     \"\"\"\n    #     return self.filepaths\n\n    # def tool_read_filepath(self, path: str) -> str:\n    #     \"\"\"\n    #     This tool takes a filepath (`path`) as input and returns the content of the file as a string.\n    #     It handles both CSV files and html / regular text files. It does not handle images.\n\n    #     Args:\n    #         path (str): The path to the file to read.\n\n    #     Returns:\n    #         str: The content of the file as a string.\n    #     \"\"\"\n    #     if path.endswith(\".csv\"):\n    #         return pd.read_csv(path, encoding=\"ISO-8859-1\").to_string(index=False)\n\n    #     with open(path, encoding='utf-8') as file:\n    #         content = file.read()\n\n    #     return content\n\n    def tool_execute_semantic_operators(self, instruction: str) -> str:\n        \"\"\"\n        This tool takes an `instruction` as input and invokes an expert to write a semantic data processing pipeline\n        to execute the instruction. 
The tool returns the path to a CSV file which contains the output of the pipeline.\n\n        For example, the tool could be invoked as follows to extract the title and abstract from a dataset of research papers:\n        ```\n        instruction = \"Write a program to extract the title and abstract from each research paper\"\n        result_csv_filepath = tool_execute_semantic_operators(instruction)\n        ```\n\n        Args:\n            instruction: The instruction specifying the semantic data processing pipeline that you need to execute.\n\n        Returns:\n            str: the filepath to the CSV containing the output from running the data processing pipeline.\n        \"\"\"\n        from smolagents import tool\n\n        @tool\n        def tool_list_filepaths() -> list[str]:\n            \"\"\"\n            This tool returns the list of all of the filepaths which the `Context` has access to.\n\n            NOTE: You may want to execute this before writing your PZ program to determine where the data lives.\n\n            Args:\n                None\n\n            Returns:\n                list[str]: A list of file paths for all files in the `Context`.\n            \"\"\"\n            return self.filepaths\n\n        # NOTE: the model is an OpenAI model, so it must be given the OpenAI API key\n        agent = CodeAgent(\n            model=LiteLLMModel(model_id=\"openai/o1\", api_key=os.getenv(\"OPENAI_API_KEY\")),\n            tools=[tool_list_filepaths],\n            max_steps=20,\n            planning_interval=4,\n            add_base_tools=False,\n            return_full_result=True,\n            additional_authorized_imports=[\"dotenv\", \"json\", \"palimpzest\", \"pandas\"],\n            instructions=PZ_INSTRUCTION,\n        )\n        result = agent.run(instruction)\n        response = result.output\n\n        return response\n"
  },
  {
    "path": "src/palimpzest/core/data/context_manager.py",
    "content": "from __future__ import annotations\n\nimport os\nimport pickle\n\nimport chromadb\nimport chromadb.utils.embedding_functions as embedding_functions\nimport tiktoken\n\nfrom palimpzest.constants import PZ_DIR\nfrom palimpzest.core.data import context\n\n\nclass ContextNotFoundError(Exception):\n    pass\n\n\nclass ContextManager:\n    \"\"\"\n    This class manages the long-term storage of `Contexts`. Each new `Context` is added to\n    the `ContextManager` and serialized to disk. `Contexts` are also indexed, which enables\n    PZ to search for `Context(s)` which may support `search()` and `compute()` operations.\n    \"\"\"\n    def __init__(self):\n        # create directory with serialized contexts (if it doesn't already exist)\n        self.context_dir = os.path.join(PZ_DIR, \"contexts\")\n        os.makedirs(self.context_dir, exist_ok=True)\n\n        # create vector store (if it doesn't already exist)\n        self.chroma_dir = os.path.join(PZ_DIR, \"chroma\")\n        os.makedirs(self.chroma_dir, exist_ok=True)\n        self.chroma_client = chromadb.PersistentClient(self.chroma_dir)\n\n        # pick embedding function based on presence of API key(s)\n        self.emb_fn = None\n        if os.getenv(\"OPENAI_API_KEY\"):\n            self.emb_fn = embedding_functions.OpenAIEmbeddingFunction(\n                api_key=os.getenv(\"OPENAI_API_KEY\"),\n                model_name=\"text-embedding-3-small\"\n            )\n\n        self.index = self.chroma_client.get_or_create_collection(\"contexts\", embedding_function=self.emb_fn)\n\n    @staticmethod\n    def from_pkl(path: str) -> context.Context:\n        \"\"\"Load a `Context` from its serialized pickle file.\"\"\"\n        with open(path, \"rb\") as f:\n            context = pickle.load(f)\n\n        return context\n\n    @staticmethod\n    def to_pkl(context: context.Context, path: str) -> None:\n        \"\"\"Write the given `Context` to a pickle file at the provided `path`.\"\"\"\n        
with open(path, \"wb\") as f:\n            pickle.dump(context, f)\n\n    def num_tokens_from_string(self, string: str, encoding_name: str) -> int:\n        \"\"\"Returns the number of tokens in a text string.\"\"\"\n        encoding = tiktoken.get_encoding(encoding_name)\n        num_tokens = len(encoding.encode(string))\n        return num_tokens\n\n    def add_context(self, context: context.Context, update: bool = False) -> None:\n        \"\"\"\n        Add the new `Context` to the `ContextManager` by serializing and writing it to disk.\n\n        Args:\n            context (`Context`): the context to add to the `ContextManager`\n            update (`bool`): whether or not to update an existing context\n\n        TODO: track cost\n        \"\"\"\n        # return early if the context already exists and we're not performing an update\n        id = context.id\n        context_path = os.path.join(self.context_dir, f\"{id}.pkl\")\n        if os.path.exists(context_path) and update is False:\n            return\n\n        # write the context to disk\n        ContextManager.to_pkl(context, context_path)\n\n        # compute number of tokens in context.description\n        description = context.description\n        while self.num_tokens_from_string(description, \"cl100k_base\") > 8192:\n            description = description[:int(0.9*len(description))]\n \n        # add context to vector store\n        context_embeddings = self.emb_fn([description])\n        context_payload = {\n            \"ids\": [context.id],\n            \"embeddings\": context_embeddings,\n            \"metadatas\": [{\"id\": context.id, \"materialized\": context.materialized}],\n            \"documents\": [context.description],\n        }\n        if update:\n            self.index.update(**context_payload)\n        else:\n            self.index.add(**context_payload)\n\n    def update_context(self, id: str, description: str, materialized: bool = True) -> None:\n        \"\"\"\n        Update an 
existing `Context` with the given `id` by appending the given `description` to its current description.\n\n        Args:\n            id (str): the id of the updated `Context`\n            description (str): the text to append to the description of the specified `Context`\n            materialized (bool): boolean to set the materialization status of the `Context`\n\n        Raises:\n            ContextNotFoundError: if the given `id` doesn't point to a `Context` in the `ContextManager`.\n        \"\"\"\n        context = self.get_context(id)\n        new_description = context.description + description  # TODO: should description have RESULT replaced on update? as opposed to appending? should description be some pydantic BaseModel?\n        context.set_description(new_description)\n        context.set_materialized(materialized)\n        self.add_context(context, update=True)\n\n    def get_context(self, id: str) -> context.Context:\n        \"\"\"\n        Returns the `Context` specified by the given `id`.\n\n        Args:\n            id (str): the id of the retrieved `Context`\n\n        Returns:\n            `Context`: the specified `Context`.\n\n        Raises:\n            ContextNotFoundError: if no serialized `Context` exists for the given `id`.\n        \"\"\"\n        context_path = os.path.join(self.context_dir, f\"{id}.pkl\")\n        try:\n            return ContextManager.from_pkl(context_path)\n        except FileNotFoundError as err:\n            raise ContextNotFoundError from err\n\n    def search_context(self, query: str, k: int = 1, where: dict | None = None) -> list[context.Context]:\n        \"\"\"\n        Returns the top-k most relevant `Context(s)` for the given query. 
If provided,\n        the where dictionary will be used to filter the search results.\n\n        TODO:\n        3) update CostModel to account for benefit of using existing Context(s)\n        ---\n        4) unit test\n        5) track cost\n        \"\"\"\n        # embed the search query\n        query_embeddings = self.emb_fn([query])\n\n        # look up ids of most similar contexts\n        results = self.index.query(\n            query_embeddings=query_embeddings,\n            n_results=k,\n            where=where,\n        )\n        ids = results[\"ids\"][0]\n\n        # load and return Context objects\n        contexts = []\n        for id in ids:\n            context_path = os.path.join(self.context_dir, f\"{id}.pkl\")\n            contexts.append(ContextManager.from_pkl(context_path))\n\n        return contexts\n"
  },
  {
    "path": "src/palimpzest/core/data/dataset.py",
    "content": "from __future__ import annotations\n\nimport warnings\nfrom collections.abc import Iterator\nfrom typing import Callable\n\nfrom chromadb.api.models.Collection import Collection\nfrom pydantic import BaseModel\n\nfrom palimpzest.constants import AggFunc, Cardinality\nfrom palimpzest.core.elements.filters import Filter\nfrom palimpzest.core.elements.groupbysig import GroupBySig\nfrom palimpzest.core.lib.schemas import create_schema_from_fields, project, relax_schema, union_schemas\nfrom palimpzest.policy import construct_policy_from_kwargs\nfrom palimpzest.query.operators.logical import (\n    Aggregate,\n    ConvertScan,\n    Distinct,\n    FilteredScan,\n    GroupByAggregate,\n    JoinOp,\n    LimitScan,\n    LogicalOperator,\n    Project,\n    TopKScan,\n)\nfrom palimpzest.query.processor.config import QueryProcessorConfig\nfrom palimpzest.utils.hash_helpers import hash_for_serialized_dict\nfrom palimpzest.validator.validator import Validator\n\n\n# TODO?: remove `schema` from `Dataset` and access it from `operator`?\n# - Q: how do you handle datasets with multiple sources?\n#    - for joins the operator should have the union'ed schema\n#    - but for Contexts it may be trickier\nclass Dataset:\n    \"\"\"\n    A `Dataset` represents a collection of structured or unstructured data that can be processed and\n    transformed. Each `Dataset` is either a \"root\" `Dataset` (which yields data items) or it is the\n    result of performing data processing operation(s) on root `Dataset(s)`.\n\n    Users can perform computations on a `Dataset` in a lazy or eager fashion. Applying functions\n    such as `sem_filter`, `sem_map`, `sem_join`, `sem_agg`, etc. 
will lazily create a new `Dataset`.\n    Users can invoke the `run()` method to execute the computation and retrieve a materialized `Dataset`.\n    Materialized `Dataset`s can be processed further, or their results can be retrieved using `.get()`.\n    \n    A root `Dataset` must subclass at least one of `pz.IterDataset`, `pz.IndexDataset`, or `pz.Context`.\n    Each of these classes supports a different access pattern:\n\n        - `pz.IterDataset`: supports accessing data via iteration\n            - Ex: iterating over a list of PDFs\n            - Ex: iterating over rows in a DataFrame\n        - `pz.IndexDataset`: supports accessing data via point lookups / queries\n            - Ex: querying a vector database\n            - Ex: querying a SQL database\n        - `pz.Context`: supports accessing data with an agent\n            - Ex: processing a set of CSV files with a data science agent\n            - Ex: processing time series data with a data cleaning agent\n\n    A root `Dataset` may subclass more than one of the aforementioned classes. 
For example, the root\n    `Dataset` for a list of files may inherit from `pz.IterDataset` and `pz.IndexDataset` to support\n    iterating over the files and performing point lookups for individual files.\n\n    For details on how to create your own root `Dataset`, please see: TODO\n    \"\"\"\n    def __init__(\n            self,\n            sources: list[Dataset] | Dataset | None,\n            operator: LogicalOperator,\n            schema: type[BaseModel] | None = None,\n            id: str | None = None,\n        ) -> None:\n        \"\"\"\n        Initialize a `Dataset` with one or more `sources` and the operator that is being applied.\n        Root `Datasets` subclass `pz.IterDataset`, `pz.IndexDataset`, and/or `pz.Context` and use\n        their own constructors.\n\n        Args:\n            sources (`list[Dataset] | Dataset`): The (list of) `Dataset(s)` which are input(s) to\n                the operator used to compute this `Dataset`.\n            operator (`LogicalOperator`): The `LogicalOperator` used to compute this `Dataset`.\n            schema (type[`BaseModel`] | None): The schema of this `Dataset`.\n            id (str | None): an identifier for this `Dataset` provided by the user\n\n        Raises:\n            ValueError: if `sources` is not a `Dataset` or list of `Datasets`\n        \"\"\"\n        # set sources\n        self._sources = []\n        if isinstance(sources, list):\n            self._sources = sources\n        elif isinstance(sources, Dataset):\n            self._sources = [sources]\n        elif sources is not None:\n            raise ValueError(\"Dataset sources must be another Dataset or a list of Datasets. 
For root Datasets, you must subclass pz.IterDataset, pz.IndexDataset, or pz.Context.\")\n\n        # set the logical operator and schema\n        self._operator: LogicalOperator = operator\n        self._schema = schema\n\n        # compute the dataset id\n        self._id = self._compute_dataset_id() if id is None else id\n\n    @property\n    def id(self) -> str:\n        \"\"\"The string identifier for this `Dataset`\"\"\"\n        return self._id\n\n    @property\n    def schema(self) -> type[BaseModel]:\n        \"\"\"The Pydantic model defining the schema of this `Dataset`\"\"\"\n        return self._schema\n\n    @property\n    def is_root(self) -> bool:\n        return len(self._sources) == 0\n\n    def __str__(self) -> str:\n        return f\"Dataset(schema={self._schema}, id={self._id}, op_id={self._operator.get_logical_op_id()})\"\n\n    def __iter__(self) -> Iterator[Dataset]:\n        for source in self._sources:\n            yield from source\n        yield self\n\n    def _compute_dataset_id(self) -> str:\n        \"\"\"\n        Compute the identifier for this `Dataset`. The ID is uniquely defined by the operation(s)\n        applied to the `Dataset's` sources.\n        \"\"\"\n        return hash_for_serialized_dict({\n            \"source_ids\": [source.id for source in self._sources],\n            \"logical_op_id\": self._operator.get_logical_op_id(),\n        })\n\n    def _set_root_datasets(self, new_root_datasets: dict[str, Dataset]) -> None:\n        \"\"\"\n        Update the root dataset(s) for this dataset with the `new_root_datasets`. 
This is used during\n        optimization to reuse the same physical plan while running it on a train dataset.\n\n        Args:\n            new_root_datasets (dict[str, Dataset]): the new root datasets for this dataset.\n        \"\"\"\n        new_sources = []\n        for old_source in self._sources:\n            if old_source.id in new_root_datasets:\n                new_sources.append(new_root_datasets[old_source.id])\n            else:\n                old_source._set_root_datasets(new_root_datasets)\n                new_sources.append(old_source)\n        self._sources = new_sources\n\n    # TODO: the entire way (unique) logical op ids are computed and stored needs to be rethought\n    def _generate_unique_logical_op_ids(self, topo_idx: int | None = None) -> None:\n        \"\"\"\n        Generate unique operation IDs for all operators in this dataset and its sources.\n        This is used to ensure that each operator can be uniquely identified during execution.\n        \"\"\"\n        # generate the unique op ids for all sources' operators\n        for source in self._sources:\n            topo_idx = source._generate_unique_logical_op_ids(topo_idx)\n            topo_idx += 1\n\n        # if topo_idx is None, this is the first call, so we initialize it to 0\n        if topo_idx is None:\n            topo_idx = 0\n\n        # compute this operator's unique operator id\n        this_unique_logical_op_id = f\"{topo_idx}-{self._operator.get_logical_op_id()}\"\n\n        # update the unique logical op id for this operator\n        self._operator.set_unique_logical_op_id(this_unique_logical_op_id)\n\n        # return the current unique full_op_id for this operator\n        return topo_idx\n\n    # TODO\n    def _resolve_depends_on(self, depends_on: list[str]) -> list[str]:\n        \"\"\"\n        TODO: resolve the `depends_on` strings to their full field names ({Dataset.id}.{field_name}).\n        \"\"\"\n        return []\n\n    def _get_root_datasets(self) -> 
dict[str, Dataset]:\n        \"\"\"Return a mapping from the id --> Dataset for all root datasets.\"\"\"\n        if self.is_root:\n            return {self.id: self}\n\n        root_datasets = {}\n        for source in self._sources:\n            child_root_datasets = source._get_root_datasets()\n            root_datasets = {**root_datasets, **child_root_datasets}\n\n        return root_datasets\n\n    def relax_types(self) -> None:\n        \"\"\"\n        Relax the types in this Dataset's schema and all upstream Datasets' schemas to be more permissive.\n        \"\"\"\n        # relax the types in this dataset's schema\n        self._schema = relax_schema(self._schema)\n\n        # relax the types in dataset's operator's input and output schemas\n        self._operator.input_schema = None if self._operator.input_schema is None else relax_schema(self._operator.input_schema)\n        self._operator.output_schema = relax_schema(self._operator.output_schema)\n\n        # recursively relax the types in all upstream datasets\n        for source in self._sources:\n            source.relax_types()\n\n    def get_upstream_datasets(self) -> list[Dataset]:\n        \"\"\"\n        Get the list of all upstream datasets that are sources to this dataset.\n        \"\"\"\n        # recursively get the upstream datasets\n        upstream = []\n        for source in self._sources:\n            upstream.extend(source.get_upstream_datasets())\n            upstream.append(source)\n        return upstream\n\n    def get_limit(self) -> int | None:\n        \"\"\"Get the limit applied to this Dataset, if any.\"\"\"\n        if isinstance(self._operator, LimitScan):\n            return self._operator.limit\n\n        source_limits = []\n        for source in self._sources:\n            source_limit = source.get_limit()\n            if source_limit is not None:\n                source_limits.append(source_limit)\n\n        if len(source_limits) == 0:\n            return None\n\n        
return min([limit for limit in source_limits if limit is not None])\n\n    def copy(self):\n        return Dataset(\n            sources=[source.copy() for source in self._sources],\n            operator=self._operator.copy(),\n            schema=self._schema,\n            id=self.id,\n        )\n\n    def join(self, other: Dataset, on: str | list[str], how: str = \"inner\") -> Dataset:\n        \"\"\"\n        Perform the specified join on the specified (list of) column(s)\n        \"\"\"\n        # enforce type for on\n        if isinstance(on, str):\n            on = [on]\n\n        # construct new output schema\n        combined_schema = union_schemas([self.schema, other.schema], join=True, on=on)\n\n        # construct logical operator\n        operator = JoinOp(\n            input_schema=combined_schema,\n            output_schema=combined_schema,\n            condition=\"\",\n            on=on,\n            how=how,\n            depends_on=on,\n        )\n\n        return Dataset(sources=[self, other], operator=operator, schema=combined_schema)\n\n    def sem_join(self, other: Dataset, condition: str, desc: str | None = None, depends_on: str | list[str] | None = None, how: str = \"inner\") -> Dataset:\n        \"\"\"\n        Perform a semantic (inner) join on the specified join predicate\n        \"\"\"\n        # enforce type for depends_on\n        if isinstance(depends_on, str):\n            depends_on = [depends_on]\n\n        # construct new output schema\n        combined_schema = union_schemas([self.schema, other.schema], join=True)\n\n        # construct logical operator\n        operator = JoinOp(\n            input_schema=combined_schema,\n            output_schema=combined_schema,\n            condition=condition,\n            how=how,\n            desc=desc,\n            depends_on=depends_on,\n        )\n\n        return Dataset(sources=[self, other], operator=operator, schema=combined_schema)\n\n    def filter(\n        self,\n        filter: 
Callable,\n        depends_on: str | list[str] | None = None,\n    ) -> Dataset:\n        \"\"\"Add a user defined function as a filter to the Set. This filter will possibly restrict the items that are returned later.\"\"\"\n        # construct Filter object\n        f = None\n        if callable(filter):\n            f = Filter(filter_fn=filter)\n        else:\n            error_str = f\"Only support callable for filter, currently got {type(filter)}\"\n            if isinstance(filter, str):\n                error_str += \". Consider using sem_filter() for semantic filters.\"\n            raise Exception(error_str)\n\n        # enforce type for depends_on\n        if isinstance(depends_on, str):\n            depends_on = [depends_on]\n\n        # construct logical operator\n        operator = FilteredScan(input_schema=self.schema, output_schema=self.schema, filter=f, depends_on=depends_on)\n\n        return Dataset(sources=[self], operator=operator, schema=self.schema)\n\n    def sem_filter(\n        self,\n        filter: str,\n        desc: str | None = None,\n        depends_on: str | list[str] | None = None,\n    ) -> Dataset:\n        \"\"\"Add a natural language description of a filter to the Set. 
This filter will possibly restrict the items that are returned later.\"\"\"\n        # construct Filter object\n        f = None\n        if isinstance(filter, str):\n            f = Filter(filter)\n        else:\n            raise Exception(\"sem_filter() only supports `str` input for _filter.\", type(filter))\n\n        # enforce type for depends_on\n        if isinstance(depends_on, str):\n            depends_on = [depends_on]\n\n        # construct logical operator\n        operator = FilteredScan(input_schema=self.schema, output_schema=self.schema, filter=f, desc=desc, depends_on=depends_on)\n\n        return Dataset(sources=[self], operator=operator, schema=self.schema)\n\n    def _sem_map(self, cols: list[dict] | type[BaseModel] | None,\n                 cardinality: Cardinality,\n                 desc: str | None = None,\n                 depends_on: str | list[str] | None = None) -> Dataset:\n        \"\"\"Execute the semantic map operation with the appropriate cardinality.\"\"\"\n        # construct new output schema\n        new_output_schema = None\n        if cols is None:\n            new_output_schema = self.schema\n        elif isinstance(cols, list):\n            cols = create_schema_from_fields(cols)\n            new_output_schema = union_schemas([self.schema, cols])\n        elif issubclass(cols, BaseModel):\n            new_output_schema = union_schemas([self.schema, cols])\n        else:\n            raise ValueError(\"`cols` must be a list of dictionaries or a BaseModel.\")\n\n        # enforce type for depends_on\n        if isinstance(depends_on, str):\n            depends_on = [depends_on]\n\n        # construct logical operator\n        operator = ConvertScan(\n            input_schema=self.schema,\n            output_schema=new_output_schema,\n            cardinality=cardinality,\n            udf=None,\n            desc=desc,\n            depends_on=depends_on,\n        )\n\n        return Dataset(sources=[self], operator=operator, 
schema=new_output_schema)\n\n    def sem_add_columns(self, cols: list[dict] | type[BaseModel],\n                        cardinality: Cardinality = Cardinality.ONE_TO_ONE,\n                        desc: str | None = None,\n                        depends_on: str | list[str] | None = None) -> Dataset:\n        \"\"\"\n        NOTE: we are renaming this function to `sem_map` and deprecating `sem_add_columns` in the next\n        release of PZ. To update your code, simply change your calls from `.sem_add_columns(...)` to `.sem_map(...)`.\n        The function arguments will stay the same.\n\n        Add new columns by specifying the column names, descriptions, and types.\n        The column will be computed during the execution of the Dataset.\n        Example:\n            sem_add_columns(\n                [{'name': 'greeting', 'desc': 'The greeting message', 'type': str},\n                 {'name': 'age', 'desc': 'The age of the person', 'type': int},\n                 {'name': 'full_name', 'desc': 'The name of the person', 'type': str}]\n            )\n        \"\"\"\n        # issue deprecation warning\n        warnings.warn(\n            \"we are renaming this function to `sem_map` and deprecating `sem_add_columns` in the next\"\n            \" release of PZ. To update your code, simply change your calls from `.sem_add_columns(...)`\"\n            \" to `.sem_map(...)`. The function arguments will stay the same.\",\n            DeprecationWarning,\n            stacklevel=2\n        )\n\n        return self._sem_map(cols, cardinality, desc, depends_on)\n\n    def sem_map(self, cols: list[dict] | type[BaseModel], desc: str | None = None, depends_on: str | list[str] | None = None) -> Dataset:\n        \"\"\"\n        Compute new field(s) by specifying their names, descriptions, and types. For each input there will\n        be one output. 
The field(s) will be computed during the execution of the Dataset.\n\n        Example:\n            sem_map(\n                [{'name': 'greeting', 'desc': 'The greeting message', 'type': str},\n                 {'name': 'age', 'desc': 'The age of the person', 'type': int},\n                 {'name': 'full_name', 'desc': 'The name of the person', 'type': str}]\n            )\n        \"\"\"\n        return self._sem_map(cols, Cardinality.ONE_TO_ONE, desc, depends_on)\n\n    def sem_flat_map(self, cols: list[dict] | type[BaseModel], desc: str | None = None, depends_on: str | list[str] | None = None) -> Dataset:\n        \"\"\"\n        Compute new field(s) by specifying their names, descriptions, and types. For each input there will\n        be one or more output(s). The field(s) will be computed during the execution of the Dataset.\n\n        Example:\n            sem_flat_map(\n                cols=[\n                    {'name': 'author_name', 'description': 'The name of the author', 'type': str},\n                    {'name': 'institution', 'description': 'The institution of the author', 'type': str},\n                    {'name': 'email', 'description': 'The author's email', 'type': str},\n                ]\n            )\n        \"\"\"\n        return self._sem_map(cols, Cardinality.ONE_TO_MANY, desc, depends_on)\n\n    def _map(self, udf: Callable,\n            cols: list[dict] | type[BaseModel] | None,\n            cardinality: Cardinality,\n            depends_on: str | list[str] | None = None) -> Dataset:\n        \"\"\"Execute the map operation with the appropriate cardinality.\"\"\"\n        # construct new output schema\n        new_output_schema = None\n        if cols is None:\n            new_output_schema = self.schema\n        elif isinstance(cols, list):\n            cols = create_schema_from_fields(cols)\n            new_output_schema = union_schemas([self.schema, cols])\n        elif issubclass(cols, BaseModel):\n            new_output_schema = 
union_schemas([self.schema, cols])\n        else:\n            raise ValueError(\"`cols` must be a list of dictionaries, a BaseModel, or None.\")\n\n        # enforce type for depends_on\n        if isinstance(depends_on, str):\n            depends_on = [depends_on]\n\n        # construct logical operator\n        operator = ConvertScan(\n            input_schema=self.schema,\n            output_schema=new_output_schema,\n            cardinality=cardinality,\n            udf=udf,\n            depends_on=depends_on,\n        )\n\n        return Dataset(sources=[self], operator=operator, schema=new_output_schema)\n\n    def add_columns(self, udf: Callable,\n                    cols: list[dict] | type[BaseModel] | None,\n                    cardinality: Cardinality = Cardinality.ONE_TO_ONE,\n                    depends_on: str | list[str] | None = None) -> Dataset:\n        \"\"\"\n        NOTE: we are renaming this function to `map` and deprecating `add_columns` in the next\n        release of PZ. To update your code, simply change your calls from `.add_columns(...)` to `.map(...)`.\n        The function arguments will stay the same.\n\n        Compute new fields (or update existing ones) with a UDF. 
For each input, this function will compute one output.\n\n        Set `cols=None` if your add_columns operation is not computing any new fields.\n\n        Examples:\n            add_columns(\n                udf=compute_personal_greeting,\n                cols=[\n                    {'name': 'greeting', 'description': 'The greeting message', 'type': str},\n                    {'name': 'age', 'description': 'The age of the person', 'type': int},\n                    {'name': 'full_name', 'description': 'The name of the person', 'type': str},\n                ]\n            )\n        \"\"\"\n        # issue deprecation warning\n        warnings.warn(\n            \"we are renaming this function to `map` and deprecating `add_columns` in the next\"\n            \" release of PZ. To update your code, simply change your calls from `.add_columns(...)`\"\n            \" to `.map(...)`. The function arguments will stay the same.\",\n            DeprecationWarning,\n            stacklevel=2\n        )\n\n        # sanity check inputs\n        if udf is None:\n            raise ValueError(\"`udf` must be provided for add_columns.\")\n\n        return self._map(udf, cols, cardinality, depends_on)\n\n    def map(self, udf: Callable,\n            cols: list[dict] | type[BaseModel] | None,\n            depends_on: str | list[str] | None = None) -> Dataset:\n        \"\"\"\n        Compute new fields (or update existing ones) with a UDF. 
For each input, this function will compute one output.\n\n        Set `cols=None` if your map is not computing any new fields.\n\n        Examples:\n            map(\n                udf=compute_personal_greeting,\n                cols=[\n                    {'name': 'greeting', 'description': 'The greeting message', 'type': str},\n                    {'name': 'age', 'description': 'The age of the person', 'type': int},\n                    {'name': 'full_name', 'description': 'The name of the person', 'type': str},\n                ]\n            )\n        \"\"\"\n        # sanity check inputs\n        if udf is None:\n            raise ValueError(\"`udf` must be provided for map.\")\n\n        return self._map(udf, cols, Cardinality.ONE_TO_ONE, depends_on)\n\n    def flat_map(self, udf: Callable,\n            cols: list[dict] | type[BaseModel] | None,\n            depends_on: str | list[str] | None = None) -> Dataset:\n        \"\"\"\n        Compute new fields (or update existing ones) with a UDF. 
For each input, this function will compute one or more outputs.\n\n        Set `cols=None` if your flat_map is not computing any new fields.\n\n        Examples:\n            flat_map(\n                udf=extract_paper_authors,\n                cols=[\n                    {'name': 'author_name', 'description': 'The name of the author', 'type': str},\n                    {'name': 'institution', 'description': 'The institution of the author', 'type': str},\n                    {'name': 'email', 'description': 'The author's email', 'type': str},\n                ]\n            )\n        \"\"\"\n        # sanity check inputs\n        if udf is None:\n            raise ValueError(\"`udf` must be provided for flat_map.\")\n\n        return self._map(udf, cols, Cardinality.ONE_TO_MANY, depends_on)\n\n    def count(self) -> Dataset:\n        \"\"\"Apply a count aggregation to this set\"\"\"\n        operator = Aggregate(input_schema=self.schema, agg_func=AggFunc.COUNT)\n        return Dataset(sources=[self], operator=operator, schema=operator.output_schema)\n\n    def average(self) -> Dataset:\n        \"\"\"Apply an average aggregation to this set\"\"\"\n        operator = Aggregate(input_schema=self.schema, agg_func=AggFunc.AVERAGE)\n        return Dataset(sources=[self], operator=operator, schema=operator.output_schema)\n\n    def sum(self) -> Dataset:\n        \"\"\"Apply a summation to this set\"\"\"\n        operator = Aggregate(input_schema=self.schema, agg_func=AggFunc.SUM)\n        return Dataset(sources=[self], operator=operator, schema=operator.output_schema)\n\n    def min(self) -> Dataset:\n        \"\"\"Apply a min operator to this set\"\"\"\n        operator = Aggregate(input_schema=self.schema, agg_func=AggFunc.MIN)\n        return Dataset(sources=[self], operator=operator, schema=operator.output_schema)\n\n    def max(self) -> Dataset:\n        \"\"\"Apply a max operator to this set\"\"\"\n        operator = Aggregate(input_schema=self.schema, 
agg_func=AggFunc.MAX)\n        return Dataset(sources=[self], operator=operator, schema=operator.output_schema)\n\n    def groupby(self, groupby: GroupBySig) -> Dataset:\n        output_schema = groupby.output_schema()\n        operator = GroupByAggregate(input_schema=self.schema, output_schema=output_schema, group_by_sig=groupby)\n        return Dataset(sources=[self], operator=operator, schema=output_schema)\n\n    def sem_agg(self, col: dict | type[BaseModel], agg: str, depends_on: str | list[str] | None = None) -> Dataset:\n        \"\"\"\n        Apply a semantic aggregation to this set. The `agg` string will be applied using an LLM\n        over the entire set of inputs' fields specified in `depends_on` to generate the output `col`.\n\n        Example:\n            sem_agg(\n                col={'name': 'overall_sentiment', 'desc': 'The overall sentiment of the reviews', 'type': str},\n                agg=\"Compute the overall sentiment of the reviews as POSITIVE or NEGATIVE.\",\n                depends_on=\"review_text\",\n            )\n        \"\"\"\n        # construct new output schema\n        new_output_schema = None\n        if isinstance(col, dict):\n            new_output_schema = create_schema_from_fields([col])\n        elif issubclass(col, BaseModel):\n            assert len(col.model_fields) == 1, \"For semantic aggregation, when passing a BaseModel to `col` it must have exactly one field.\"\n            new_output_schema = col\n        else:\n            raise ValueError(\"`col` must be a dictionary or a single-field BaseModel.\")\n\n        # enforce type for depends_on\n        if isinstance(depends_on, str):\n            depends_on = [depends_on]\n\n        # construct logical operator\n        operator = Aggregate(input_schema=self.schema, output_schema=new_output_schema, agg_str=agg, depends_on=depends_on)\n\n        return Dataset(sources=[self], operator=operator, schema=operator.output_schema)\n\n    def sem_topk(\n        self,\n      
  index: Collection,\n        search_attr: str,\n        output_attrs: list[dict] | type[BaseModel],\n        search_func: Callable | None = None,\n        k: int = -1,\n    ) -> Dataset:\n        \"\"\"\n        Retrieve the top-k nearest neighbors of the value of the `search_attr` from the `index` and\n        use these results to construct the `output_attrs` field(s).\n        \"\"\"\n        # construct new output schema\n        new_output_schema = None\n        if isinstance(output_attrs, list):\n            output_attrs = create_schema_from_fields(output_attrs)\n            new_output_schema = union_schemas([self.schema, output_attrs])\n        elif issubclass(output_attrs, BaseModel):\n            new_output_schema = union_schemas([self.schema, output_attrs])\n        else:\n            raise ValueError(\"`output_attrs` must be a list of dictionaries or a BaseModel.\")\n\n        # TODO: revisit once we can think through abstraction(s)\n        # # construct the PZIndex from the user-provided index\n        # index = index_factory(index)\n\n        # construct logical operator\n        operator = TopKScan(\n            input_schema=self.schema,\n            output_schema=new_output_schema,\n            index=index,\n            search_func=search_func,\n            search_attr=search_attr,\n            output_attrs=output_attrs,\n            k=k,\n        )\n\n        return Dataset(sources=[self], operator=operator, schema=new_output_schema)\n\n    def limit(self, n: int) -> Dataset:\n        \"\"\"Limit the set size to no more than n rows\"\"\"\n        operator = LimitScan(input_schema=self.schema, output_schema=self.schema, limit=n)\n        return Dataset(sources=[self], operator=operator, schema=self.schema)\n\n    def distinct(self, distinct_cols: list[str] | None = None) -> Dataset:\n        \"\"\"Return a new Dataset with distinct rows based on the current schema.\"\"\"\n        operator = Distinct(input_schema=self.schema, 
output_schema=self.schema, distinct_cols=distinct_cols)\n        return Dataset(sources=[self], operator=operator, schema=self.schema)\n\n    def project(self, project_cols: list[str] | str) -> Dataset:\n        \"\"\"Project the Set to only include the specified columns.\"\"\"\n        project_cols = project_cols if isinstance(project_cols, list) else [project_cols]\n        new_output_schema = project(self.schema, project_cols)\n        operator = Project(input_schema=self.schema, output_schema=new_output_schema, project_cols=project_cols)\n        return Dataset(sources=[self], operator=operator, schema=new_output_schema)\n\n    def run(self, config: QueryProcessorConfig | None = None, **kwargs):\n        \"\"\"Invoke the QueryProcessor to execute the query. `kwargs` will be applied to the QueryProcessorConfig.\"\"\"\n        # TODO: this import currently needs to be here to avoid a circular import; we should fix this in a subsequent PR\n        from palimpzest.query.processor.query_processor_factory import QueryProcessorFactory\n\n        # as syntactic sugar, we will allow some keyword arguments to parameterize our policies\n        policy = construct_policy_from_kwargs(**kwargs)\n        if policy is not None:\n            kwargs[\"policy\"] = policy\n\n        # construct unique logical op ids for all operators in this dataset\n        self._generate_unique_logical_op_ids()\n\n        return QueryProcessorFactory.create_and_run_processor(self, config)\n\n    def optimize_and_run(self, config: QueryProcessorConfig | None = None, train_dataset: dict[str, Dataset] | Dataset | None = None, validator: Validator | None = None, **kwargs):\n        \"\"\"Optimize the PZ program using the train_dataset and validator before running the optimized plan.\"\"\"\n        # TODO: this import currently needs to be here to avoid a circular import; we should fix this in a subsequent PR\n        from palimpzest.query.processor.query_processor_factory import 
QueryProcessorFactory\n\n        # confirm that either train_dataset or validator is provided\n        assert train_dataset is not None or validator is not None, \"Must provide at least one of train_dataset or validator to use optimize_and_run()\"\n\n        # validate the train_dataset has one input for each source dataset and normalize its type to be a dict\n        if train_dataset is not None:\n            root_datasets = self._get_root_datasets()\n            if isinstance(train_dataset, Dataset) and len(root_datasets) > 1:\n                raise ValueError(\n                    \"For plans with more than one root dataset, `train_dataset` must be a dictionary mapping\"\n                    \" {'dataset_id' --> Dataset} for all root Datasets\"\n                )\n\n            elif isinstance(train_dataset, Dataset):\n                root_dataset_id = list(root_datasets.values())[0].id\n                if train_dataset.id != root_dataset_id:\n                    warnings.warn(\n                        f\"train_dataset.id={train_dataset.id} does not match root dataset id={root_dataset_id}\\n\"\n                        f\"Setting train_dataset to be the training data for root dataset with id={root_dataset_id} anyway.\",\n                        stacklevel=2,\n                    )\n                train_dataset = {root_dataset_id: train_dataset}\n\n            elif not all(dataset_id in train_dataset for dataset_id in root_datasets):\n                missing_ids = [dataset_id for dataset_id in root_datasets if dataset_id not in train_dataset]\n                raise ValueError(\n                    f\"`train_dataset` is missing the following root dataset id(s): {missing_ids}\"\n                )\n\n        # as syntactic sugar, we will allow some keyword arguments to parameterize our policies\n        policy = construct_policy_from_kwargs(**kwargs)\n        if policy is not None:\n            kwargs[\"policy\"] = policy\n            # config defaults to None, so only set the policy when a config object was provided\n            if config is not None:\n                config.policy = policy\n\n
        # construct unique logical op ids for all operators in this dataset\n        self._generate_unique_logical_op_ids()\n\n        return QueryProcessorFactory.create_and_run_processor(self, config, train_dataset, validator)\n"
  },
  {
    "path": "src/palimpzest/core/data/index_dataset.py",
    "content": "from __future__ import annotations\n\nfrom abc import ABC, abstractmethod\n\nfrom chromadb.api.models.Collection import Collection\n\n\ndef index_factory(index: Collection) -> PZIndex:\n    \"\"\"\n    Factory function to create a PZ index based on the type of the provided index.\n\n    Args:\n        index (Collection): The index provided by the user.\n\n    Returns:\n        PZIndex: The PZ wrapped Index.\n    \"\"\"\n    if isinstance(index, Collection):\n        return ChromaIndex(index)\n    else:\n        raise TypeError(f\"Unsupported index type: {type(index)}\\nindex must be a `chromadb.api.models.Collection.Collection`\")\n\n\nclass BaseIndex(ABC):\n\n    def __init__(self, index: Collection):\n        self.index = index\n\n    def __str__(self):\n        \"\"\"\n        Return a string representation of the index.\n        \"\"\"\n        return f\"{self.__class__.__name__}\"\n\n    @abstractmethod\n    def search(self, query_embedding: list[float] | list[list[float]], results_per_query: int = 1) -> list | list[list]:\n        \"\"\"\n        Query the index with an embedding or a list of embeddings.\n\n        Args:\n            query_embedding (list[float] | list[list[float]]): The query embedding or list of query embeddings to search for.\n            results_per_query (int): The number of top results to retrieve for each query.\n\n        Returns:\n            list | list[list]: The top results for the query. If query_embedding is a list of embeddings, the top\n                results for each embedding in the list are returned. Each list will contain the\n                raw elements yielded by the index. This way, users can program against the\n                results they expect to get from e.g. 
chromadb or ragatouille.\n        \"\"\"\n        pass\n\n\nclass ChromaIndex(BaseIndex):\n    def __init__(self, index: Collection):\n        assert isinstance(index, Collection), \"ChromaIndex input must be a `chromadb.api.models.Collection.Collection`\"\n        super().__init__(index)\n\n\n# define type for PZIndex\nPZIndex = ChromaIndex\n"
  },
  {
    "path": "src/palimpzest/core/data/iter_dataset.py",
    "content": "from __future__ import annotations\n\nimport base64\nimport os\nfrom abc import ABC, abstractmethod\nfrom io import BytesIO\nfrom pathlib import Path\n\nimport pandas as pd\nfrom bs4 import BeautifulSoup\nfrom pydantic import BaseModel\n\nfrom palimpzest import constants\nfrom palimpzest.core.data import dataset\nfrom palimpzest.core.lib.schemas import (\n    AudioFile,\n    DefaultSchema,\n    ImageFile,\n    PDFFile,\n    TextFile,\n    WebPage,\n    XLSFile,\n    create_schema_from_df,\n    create_schema_from_fields,\n)\nfrom palimpzest.query.operators.logical import BaseScan\nfrom palimpzest.tools.pdfparser import get_text_from_pdf\n\n\n####################\n### BASE CLASSES ###\n####################\nclass IterDataset(dataset.Dataset, ABC):\n    \"\"\"\n    The `IterDataset` is an abstract base class for root `Datasets` whose data is accessed\n    via iteration. Classes which inherit from this class must implement two methods:\n\n    - `__len__()`: which returns the number of elements in the dataset\n    - `__getitem__(idx: int)`: which takes in an `idx` and returns the element at that index\n    \"\"\"\n\n    def __init__(self, id: str, schema: type[BaseModel] | list[dict]) -> None:\n        \"\"\"\n            Constructor for the `IterDataset` class.\n\n            Args:\n                id (str): a string identifier for the `Dataset`\n                schema (BaseModel | list[dict]): The output schema of the records returned by the `Dataset`\n        \"\"\"\n        # compute Schema and call parent constructor\n        schema = create_schema_from_fields(schema) if isinstance(schema, list) else schema\n        super().__init__(sources=None, operator=BaseScan(datasource=self, output_schema=schema), schema=schema, id=id)\n\n    @abstractmethod\n    def __len__(self) -> int:\n        \"\"\"Returns the number of items in the `Dataset`.\"\"\"\n        pass\n\n    @abstractmethod\n    def __getitem__(self, idx: int) -> dict:\n        \"\"\"\n        
Returns a single item from the `Dataset` at the given index.\n\n        Args:\n            idx (int): The index of the item to return\n\n        Returns:\n            dict: A dictionary representing the item at the given index. The dictionary\n                  keys (i.e. fields) should match the fields specified in the schema of the\n                  dataset, and the values should be the values associated with those fields.\n\n                    # Example return value\n                    {\"field1\": value1, \"field2\": value2, ...}\n\n        \"\"\"\n        pass\n\n\nclass BaseFileDataset(IterDataset):\n    \"\"\"\n    BaseFileDataset is the base class for multiple `IterDatasets` which iterate over\n    different types of files.\n    \"\"\"\n\n    def __init__(self, path: str, **kwargs) -> None:\n        \"\"\"\n        Constructor for the `BaseFileDataset` class.\n\n        Args:\n            path (str): The path to the file\n            kwargs (dict): Keyword arguments containing the `Dataset's` id and file-specific `Schema`\n        \"\"\"\n        # check that path is a valid file or directory\n        assert os.path.isfile(path) or os.path.isdir(path), f\"Path {path} is not a file nor a directory\"\n\n        # get list of filepaths\n        self.filepaths = []\n        if os.path.isfile(path):\n            self.filepaths = [path]\n        else:\n            self.filepaths = [\n                os.path.join(path, filename)\n                for filename in sorted(os.listdir(path))\n                if os.path.isfile(os.path.join(path, filename))\n            ]\n\n        # call parent constructor to set id, operator, and schema\n        super().__init__(**kwargs)\n\n    def __len__(self) -> int:\n        return len(self.filepaths)\n\n\nclass BaseFileDirectoryDataset(IterDataset):\n    \"\"\"\n    BaseFileDirectoryDataset is the base class for multiple `IterDatasets` which iterate over\n    different types of files. 
This class walks the entire directory tree rooted at `path`.\n    \"\"\"\n\n    def __init__(self, path: str, **kwargs) -> None:\n        \"\"\"\n        Constructor for the `BaseFileDirectoryDataset` class.\n\n        Args:\n            path (str): The path to a file, or to the root of the directory tree to walk\n            kwargs (dict): Keyword arguments containing the `Dataset's` id and file-specific `Schema`\n        \"\"\"\n        # check that path is a valid file or directory\n        assert os.path.isfile(path) or os.path.isdir(path), f\"Path {path} is not a file nor a directory\"\n\n        # get list of filepaths\n        self.filepaths = []\n        if os.path.isfile(path):\n            self.filepaths = [path]\n        else:\n            self.filepaths = []\n            for root, _, files in os.walk(path):\n                for file in files:\n                    fp = os.path.join(root, file)\n                    self.filepaths.append(fp)\n            self.filepaths = sorted(self.filepaths)\n\n        # call parent constructor to set id, operator, and schema\n        super().__init__(**kwargs)\n\n    def __len__(self) -> int:\n        return len(self.filepaths)\n\n########################\n### CONCRETE CLASSES ###\n########################\nclass MemoryDataset(IterDataset):\n    \"\"\"\n    MemoryDataset returns one or more dictionaries that reflect the contents of an in-memory Python object `vals`.\n    If `vals` is not a pd.DataFrame, then the dictionary returned by `__getitem__()` has a single field called \"value\".\n    Otherwise, the dictionary contains the key-value mapping from columns to values for the `idx` row in the dataframe.\n\n    TODO(gerardo): Add support for other types of in-memory data structures (he has some code for subclassing\n        MemoryDataset on his branch)\n    \"\"\"\n\n    def __init__(self, id: str, vals: list | pd.DataFrame, schema: type[BaseModel] | list[dict] | None = None) -> None:\n        \"\"\"\n        Constructor for the `MemoryDataset` class. 
The `schema` is set to the default `DefaultSchema` schema.\n        If `vals` is a pd.DataFrame, then the schema is set to the schema inferred from the DataFrame.\n\n        Args:\n            id (str): a string identifier for the `Dataset`\n            vals (Any): The in-memory data to iterate over\n        \"\"\"\n        # if list[dict] --> convert to pd.DataFrame first\n        self.vals = pd.DataFrame(vals) if isinstance(vals, list) and all([isinstance(item, dict) for item in vals]) else vals\n        if schema is None:\n            schema = create_schema_from_df(self.vals) if isinstance(self.vals, pd.DataFrame) else DefaultSchema\n        super().__init__(id=id, schema=schema)\n\n    def __len__(self) -> int:\n        return len(self.vals)\n\n    def __getitem__(self, idx: int) -> dict:\n        \"\"\"\n        Returns a dictionary with the value(s) for the element at the specified `idx` in `vals`.\n\n        Args:\n            idx (int): The index of the item to return\n\n        Returns:\n            dict: If `vals` is not a pd.DataFrame, then the dictionary has a single field called \"value\".\n                Otherwise, the dictionary contains the key-value mapping from columns to values for the\n                `idx` row in the dataframe.\n            \n            .. 
code-block:: python\n            \n                # Example return value at idx = 0, for the following list of values\n                # [42, 43, 44, ...]\n                {\"value\": 42}\n\n                # Example return value at idx = 0, for the following DataFrame:\n                # +---------+---------+---------+\n                # |  name   |   job   |  hobby  |\n                # +---------+---------+---------+\n                # |  Alice  |  doctor |  tennis |\n                # |  Bob    |  lawyer |  chess  |\n                # +---------+---------+---------+\n                {\"name\": \"Alice\", \"job\": \"doctor\", \"hobby\": \"tennis\"}\n        \"\"\"\n        item = (\n            self.vals.iloc[idx].to_dict()\n            if isinstance(self.vals, pd.DataFrame)\n            else {\"value\": self.vals[idx]}\n        )\n\n        return item\n\n\nclass HTMLFileDataset(BaseFileDataset):\n    \"\"\"\n    HTMLFileDataset returns a dictionary for each HTML file in a directory. Each dictionary contains the\n    filename, raw HTML content, and parsed content of a single HTML file in the directory.\n    \"\"\"\n    def __init__(self, id: str, path: str) -> None:\n        \"\"\"\n        Constructor for the `HTMLFileDataset` class. 
The `schema` is set to the `WebPage` schema.\n\n        Args:\n            id (str): a string identifier for the `Dataset`\n            path (str): The path to the directory\n        \"\"\"\n        super().__init__(path=path, id=id, schema=WebPage)\n        self.filepaths = [fp for fp in self.filepaths if fp.endswith(tuple(constants.HTML_EXTENSIONS))]\n\n    def _html_to_text_with_links(self, html: str) -> str:\n        # Parse the HTML content\n        soup = BeautifulSoup(html, \"html.parser\")\n\n        # Find all hyperlink tags\n        for a in soup.find_all(\"a\"):\n            # Check if the hyperlink tag has an 'href' attribute\n            if a.has_attr(\"href\"):\n                # Replace the hyperlink with its text and URL in parentheses\n                a.replace_with(f\"{a.text} ({a['href']})\")\n\n        # Extract text from the modified HTML\n        text = soup.get_text(separator=\"\\n\", strip=True)\n\n        return text\n\n    def __getitem__(self, idx: int) -> dict:\n        \"\"\"\n        Returns a dictionary with the filename, raw HTML content, and parsed content of the HTML file at the\n        specified `idx`.\n\n        Args:\n            idx (int): The index of the item to return\n        \n        Returns:\n            dict: A dictionary with the filename, raw HTML content, and parsed content of the HTML file.\n\n            .. 
code-block:: python\n\n                {\n                    \"filename\": \"file.html\",\n                    \"html\": \"raw HTML content here\",\n                    \"text\": \"parsed text content here\",\n                }\n        \"\"\"\n        item = {}\n        filepath = self.filepaths[idx]\n        item[\"filename\"] = os.path.basename(filepath)\n        with open(filepath) as f:\n            text_content = f.read()\n\n        html = text_content\n        tokens = html.split()[: constants.MAX_HTML_ROWS]\n        item[\"html\"] = \" \".join(tokens)\n\n        stripped_html = self._html_to_text_with_links(text_content)\n        tokens = stripped_html.split()[: constants.MAX_HTML_ROWS]\n        item[\"text\"] = \" \".join(tokens)\n\n        return item\n\n\nclass ImageFileDataset(BaseFileDataset):\n    \"\"\"\n    ImageFileDataset returns a dictionary for each image file in a directory. Each dictionary contains the\n    filename and the base64 encoded bytes content of a single image file in the directory.\n    \"\"\"\n    def __init__(self, id: str, path: str) -> None:\n        \"\"\"\n        Constructor for the `ImageFileDataset` class. The `schema` is set to the `ImageFile` schema.\n\n        Args:\n            id (str): a string identifier for the `Dataset`\n            path (str): The path to the directory\n        \"\"\"\n        super().__init__(path=path, id=id, schema=ImageFile)\n        self.filepaths = [fp for fp in self.filepaths if fp.endswith(tuple(constants.IMAGE_EXTENSIONS))]\n\n    def __getitem__(self, idx: int) -> dict:\n        \"\"\"\n        Returns a dictionary with the filename and base64 encoded bytes content of the image file at the\n        specified `idx`.\n\n        Args:\n            idx (int): The index of the item to return\n\n        Returns:\n            dict: A dictionary with the filename and base64 encoded bytes content of the image file.\n\n            .. 
code-block:: python\n\n                {\n                    \"filename\": \"image.jpg\",\n                    \"contents\": \"base64-encoded image content as a UTF-8 string\",\n                }\n        \"\"\"\n        filepath = self.filepaths[idx]\n        filename = os.path.basename(filepath)\n        with open(filepath, \"rb\") as f:\n            contents = base64.b64encode(f.read()).decode(\"utf-8\")\n\n        return {\"filename\": filename, \"contents\": contents}\n\n\nclass PDFFileDataset(BaseFileDataset):\n    \"\"\"\n    PDFFileDataset returns a dictionary for each PDF file in a directory. Each dictionary contains the\n    filename, raw PDF content, and parsed text content of a single PDF file in the directory.\n\n    This class also uses one of a predefined set of PDF processors to extract text content from the PDF files.\n    \"\"\"\n    def __init__(\n        self,\n        id: str,\n        path: str,\n        pdfprocessor: str = \"pypdf\",\n        file_cache_dir: str = \"/tmp\",\n    ) -> None:\n        \"\"\"\n        Constructor for the `PDFFileDataset` class. 
The `schema` is set to the `PDFFile` schema.\n\n        Args:\n            id (str): a string identifier for the `Dataset`\n            path (str): The path to the directory\n            pdfprocessor (str): The PDF processor to use for extracting text content from the PDF files\n            file_cache_dir (str): The directory to store the temporary files generated during PDF processing\n        \"\"\"\n        super().__init__(path=path, id=id, schema=PDFFile)\n        self.filepaths = [fp for fp in self.filepaths if fp.endswith(tuple(constants.PDF_EXTENSIONS))]\n        self.pdfprocessor = pdfprocessor\n        self.file_cache_dir = file_cache_dir\n\n    def __getitem__(self, idx: int) -> dict:\n        \"\"\"\n        Returns a dictionary with the filename, raw PDF content, and parsed text content of the PDF file at the\n        specified `idx`.\n\n        Args:\n            idx (int): The index of the item to return\n\n        Returns:\n            dict: A dictionary with the filename, raw PDF content, and parsed text content of the PDF file.\n\n            .. code-block:: python\n\n                {\n                    \"filename\": \"file.pdf\",\n                    \"contents\": b\"raw PDF content here\",\n                    \"text_contents\": \"parsed text content here\",\n                }\n        \"\"\"\n        filepath = self.filepaths[idx]\n        pdf_filename = os.path.basename(filepath)\n        with open(filepath, \"rb\") as f:\n            pdf_bytes = f.read()\n\n        # generate text_content from PDF\n        text_content = get_text_from_pdf(pdf_filename, pdf_bytes, pdfprocessor=self.pdfprocessor, file_cache_dir=self.file_cache_dir)\n\n        # construct and return item\n        return {\"filename\": pdf_filename, \"contents\": pdf_bytes, \"text_contents\": text_content}\n\n\nclass TextFileDataset(BaseFileDataset):\n    \"\"\"\n    TextFileDataset returns a dictionary for each text file in a directory. 
Each dictionary contains the\n    filename and contents of a single text file in the directory.\n    \"\"\"\n    def __init__(self, id: str, path: str) -> None:\n        \"\"\"\n        Constructor for the `TextFileDataset` class. The `schema` is set to the `TextFile` schema.\n\n        Args:\n            id (str): a string identifier for the `Dataset`\n            path (str): The path to the directory\n        \"\"\"\n        super().__init__(path=path, id=id, schema=TextFile)\n\n    def __getitem__(self, idx: int) -> dict:\n        \"\"\"\n        Returns a dictionary with the filename and contents of the text file at the specified `idx`.\n\n        Args:\n            idx (int): The index of the item to return\n\n        Returns:\n            dict: A dictionary with the filename and contents of the text file.\n\n            .. code-block:: python\n\n                {\n                    \"filename\": \"file.txt\",\n                    \"contents\": \"text content here\",\n                }\n        \"\"\"\n        filepath = self.filepaths[idx]\n        filename = os.path.basename(filepath)\n        with open(filepath) as f:\n            contents = f.read()\n\n        return {\"filename\": filename, \"contents\": contents}\n\n\nclass XLSFileDataset(BaseFileDataset):\n    \"\"\"\n    XLSFileDataset returns a dictionary for each XLS file in a directory. Each dictionary contains the\n    filename, contents, sheet names, and the number of sheets for a single XLS file in the directory.\n    \"\"\"\n    def __init__(self, id: str, path: str) -> None:\n        \"\"\"\n        Constructor for the `XLSFileDataset` class. 
The `schema` is set to the `XLSFile` schema.\n\n        Args:\n            id (str): a string identifier for the `Dataset`\n            path (str): The path to the directory\n        \"\"\"\n        super().__init__(path=path, id=id, schema=XLSFile)\n        self.filepaths = [fp for fp in self.filepaths if fp.endswith(tuple(constants.XLS_EXTENSIONS))]\n\n    def __getitem__(self, idx: int) -> dict:\n        \"\"\"\n        Returns a dictionary with the filename, contents, sheet names, and the number of sheets of the XLS file at the\n        specified `idx`.\n\n        Args:\n            idx (int): The index of the item to return\n\n        Returns:\n            dict: A dictionary with the filename, contents, sheet names, and the number of sheets of the XLS file.\n\n            .. code-block:: python\n\n                {\n                    \"filename\": \"file.xls\",\n                    \"contents\": b\"raw XLS content here\",\n                    \"sheet_names\": [\"Sheet1\", \"Sheet2\", \"Sheet3\"],\n                    \"number_sheets\": 3,\n                }\n        \"\"\"\n        filepath = self.filepaths[idx]\n        filename = os.path.basename(filepath)\n        with open(filepath, \"rb\") as f:\n            contents = f.read()\n\n        xls = pd.ExcelFile(BytesIO(contents), engine=\"openpyxl\")\n\n        return {\n            \"filename\": filename,\n            \"contents\": contents,\n            \"sheet_names\": xls.sheet_names,\n            \"number_sheets\": len(xls.sheet_names),\n        }\n\n\nclass AudioFileDataset(BaseFileDirectoryDataset):\n    \"\"\"\n    AudioFileDataset returns a dictionary for each audio file in a directory. Each dictionary contains the\n    filename and the base64 encoded bytes content of a single audio file in the directory.\n    \"\"\"\n    def __init__(self, id: str, path: str) -> None:\n        \"\"\"\n        Constructor for the `AudioFileDataset` class. 
The `schema` is set to the `AudioFile` schema.\n\n        Args:\n            id (str): a string identifier for the `Dataset`\n            path (str): The path to the directory\n        \"\"\"\n        super().__init__(path=path, id=id, schema=AudioFile)\n        self.filepaths = [fp for fp in self.filepaths if fp.endswith(tuple(constants.AUDIO_EXTENSIONS))]\n\n    def __getitem__(self, idx: int) -> dict:\n        \"\"\"\n        Returns a dictionary with the filename and base64 encoded bytes content of the audio file at the\n        specified `idx`.\n\n        Args:\n            idx (int): The index of the item to return\n\n        Returns:\n            dict: A dictionary with the filename and base64 encoded bytes content of the audio file.\n\n            .. code-block:: python\n\n                {\n                    \"filename\": \"audio.wav\",\n                    \"contents\": \"base64-encoded audio content as a UTF-8 string\",\n                }\n        \"\"\"\n        filepath = self.filepaths[idx]\n        filename = os.path.basename(filepath)\n        with open(filepath, \"rb\") as f:\n            contents = base64.b64encode(f.read()).decode(\"utf-8\")\n\n        return {\"filename\": filename, \"contents\": contents}\n\n\ndef get_local_source(id: str, path: str | Path, **kwargs) -> dataset.Dataset:\n    \"\"\"Return a `Dataset` for a local file or directory.\"\"\"\n    if os.path.isfile(path):\n        return TextFileDataset(id, path)\n\n    elif os.path.isdir(path):\n        if all([f.endswith(tuple(constants.IMAGE_EXTENSIONS)) for f in os.listdir(path)]):\n            return ImageFileDataset(id, path)\n\n        elif all([f.endswith(tuple(constants.PDF_EXTENSIONS)) for f in os.listdir(path)]):\n            pdfprocessor = kwargs.get(\"pdfprocessor\", constants.DEFAULT_PDF_PROCESSOR)\n            file_cache_dir = kwargs.get(\"file_cache_dir\", \"/tmp\")\n            return PDFFileDataset(\n                id=id, path=path, pdfprocessor=pdfprocessor, 
file_cache_dir=file_cache_dir\n            )\n\n        elif all([f.endswith(tuple(constants.XLS_EXTENSIONS)) for f in os.listdir(path)]):\n            return XLSFileDataset(id, path)\n\n        elif all([f.endswith(tuple(constants.HTML_EXTENSIONS)) for f in os.listdir(path)]):\n            return HTMLFileDataset(id, path)\n\n        else:\n            return TextFileDataset(id, path)\n    else:\n        raise ValueError(f\"Path {path} is invalid. Does not point to a file or directory.\")\n\n\ndef resolve_datasource(id: str, source: str | Path | list | pd.DataFrame, **kwargs) -> dataset.Dataset:\n    \"\"\"\n    This helper function returns a `Dataset` object based on the `source` type.\n    The returned `Dataset` object is guaranteed to have a schema.\n    \"\"\"\n    if isinstance(source, (str, Path)):\n        source = get_local_source(id, source, **kwargs)\n\n    elif isinstance(source, (list, pd.DataFrame)):\n        source = MemoryDataset(id=id, vals=source)\n\n    else:\n        raise ValueError(f\"Invalid source type: {type(source)}, We only support str, Path, list[dict], and pd.DataFrame\")\n\n    return source\n"
  },
  {
    "path": "src/palimpzest/core/elements/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/core/elements/filters.py",
    "content": "from __future__ import annotations\n\nfrom typing import Any, Callable\n\n\n#############################\n# Filters that can be applied against a particular Schema\n#############################\n# TODO: think through a way to give filter functions fixed strings that could not be affected by a copy\n#       potentially changing the address of a function; I don't think this happens today, but it's worth safeguarding against\nclass Filter:\n    \"\"\"A filter that can be applied to a Set\"\"\"\n\n    def __init__(self, filter_condition: str | None = None, filter_fn: Callable | None = None) -> None:\n        self.filter_condition = filter_condition\n        self.filter_fn = filter_fn\n\n    def serialize(self) -> dict[str, Any]:\n        return {\n            \"filter_condition\": self.filter_condition,\n            \"filter_fn\": self.filter_fn.__name__ if self.filter_fn is not None else None,\n        }\n\n    def get_filter_str(self) -> str:\n        return self.filter_condition if self.filter_condition is not None else self.filter_fn.__name__\n\n    def __repr__(self) -> str:\n        return \"Filter(\" + self.get_filter_str() + \")\"\n\n    def __hash__(self) -> int:\n        # custom hash function\n        return hash(self.filter_condition) if self.filter_condition is not None else hash(self.filter_fn.__name__)\n\n    def __eq__(self, other) -> bool:\n        # __eq__ should be defined for consistency with __hash__\n        return (\n            isinstance(other, Filter)\n            and self.filter_condition == other.filter_condition\n            and self.filter_fn == other.filter_fn\n        )\n\n    def __str__(self) -> str:\n        return self.get_filter_str()\n"
  },
  {
    "path": "src/palimpzest/core/elements/groupbysig.py",
    "content": "from __future__ import annotations\n\nfrom typing import Any\n\nfrom pydantic import BaseModel\n\nfrom palimpzest.core.lib.schemas import create_schema_from_fields\n\n# TODO:\n# - move the arguments for group_by_fields, agg_funcs, and agg_fields into the Dataset.groupby() operator\n# - construct the correct output schema using the input schema and the group by and aggregation fields\n# - remove/update all other references to GroupBySig in the codebase\n\n# TODO:\n# - move the arguments for group_by_fields, agg_funcs, and agg_fields into the Dataset.groupby() operator\n# - construct the correct output schema using the input schema and the group by and aggregation fields\n# - remove/update all other references to GroupBySig in the codebase\n\n# signature for a group by aggregate that applies\n# group and aggregation to an input tuple\nclass GroupBySig:\n    def __init__(self, group_by_fields: list[str], agg_funcs: list[str], agg_fields: list[str]):\n        self.group_by_fields = group_by_fields\n        self.agg_funcs = agg_funcs\n        self.agg_fields = agg_fields\n\n    def validate_schema(self, input_schema: type[BaseModel]) -> tuple[bool, str | None]:\n        for f in self.group_by_fields:\n            if f not in input_schema.model_fields:\n                return (False, \"Supplied schema has no field \" + f)\n        for f in self.agg_fields:\n            if f not in input_schema.model_fields:\n                return (False, \"Supplied schema has no field \" + f)\n        return (True, None)\n\n    def serialize(self) -> dict[str, Any]:\n        out = {\n            \"group_by_fields\": self.group_by_fields,\n            \"agg_funcs\": self.agg_funcs,\n            \"agg_fields\": self.agg_fields,\n        }\n        return out\n\n    def __str__(self) -> str:\n        return \"GroupBy(\" + repr(self.serialize()) + \")\"\n\n    def __hash__(self) -> int:\n        # custom hash function\n        return hash(repr(self.serialize()))\n\n    def 
__eq__(self, other) -> bool:\n        # __eq__ should be defined for consistency with __hash__\n        return isinstance(other, GroupBySig) and self.serialize() == other.serialize()\n\n    def get_agg_field_names(self) -> list[str]:\n        ops = []\n        for i in range(0, len(self.agg_fields)):\n            ops.append(self.agg_funcs[i] + \"(\" + self.agg_fields[i] + \")\")\n        return ops\n\n    # TODO: output schema needs to account for input schema types and create new output schema types\n    def output_schema(self) -> type[BaseModel]:\n        # the output class varies depending on the group by, so here\n        # we dynamically construct this output\n        fields = []\n        for g in self.group_by_fields:\n            f = {\"name\": g, \"type\": Any, \"desc\": f\"Group by field: {g}\"}\n            fields.append(f)\n\n        ops = self.get_agg_field_names()\n        for op in ops:\n            f = {\"name\": op, \"type\": Any, \"desc\": f\"Aggregate field: {op}\"}\n            fields.append(f)\n\n        return create_schema_from_fields(fields)\n"
  },
  {
    "path": "src/palimpzest/core/elements/records.py",
    "content": "from __future__ import annotations\n\nimport json\nfrom collections.abc import Generator\nfrom copy import deepcopy\nfrom typing import Any\n\nimport pandas as pd\nfrom pydantic import BaseModel\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.core.data import context\nfrom palimpzest.core.lib.schemas import (\n    AUDIO_FIELD_TYPES,\n    IMAGE_FIELD_TYPES,\n    AudioBase64,\n    AudioFilepath,\n    ImageBase64,\n    ImageFilepath,\n    ImageURL,\n    project,\n    union_schemas,\n)\nfrom palimpzest.core.models import ExecutionStats, PlanStats, RecordOpStats\nfrom palimpzest.utils.hash_helpers import hash_for_id\n\n\nclass DataRecord:\n    \"\"\"A DataRecord is a single record of data matching some schema defined by a BaseModel.\"\"\"\n\n    def __init__(\n        self,\n        data_item: BaseModel,\n        source_indices: str | int | list[str | int],\n        parent_ids: str | list[str] | None = None,\n        cardinality_idx: int | None = None,\n    ):\n        # check that source_indices are provided\n        assert source_indices is not None, \"Every DataRecord must be constructed with source index (or indices)\"\n\n        # normalize to list[str]\n        if not isinstance(source_indices, list):\n            source_indices = [source_indices]\n\n        # normalize to list[str]\n        if isinstance(parent_ids, str):\n            parent_ids = [parent_ids]\n\n        # data for the data record\n        self._data_item = data_item\n\n        # the index in the root Dataset from which this DataRecord is derived;\n        # each source index takes the form: f\"{root_dataset.id}-{idx}\"\n        self._source_indices = sorted(source_indices)\n\n        # the id(s) of the parent record(s) from which this DataRecord is derived\n        self._parent_ids = parent_ids\n\n        # store the cardinality index\n        self._cardinality_idx = cardinality_idx\n\n        # indicator variable which may be flipped by filter operations to signal when 
a record has been filtered out\n        self._passed_operator = True\n\n        # NOTE: Record ids are hashed based on:\n        # 0. their schema (keys)\n        # 1. their parent record id(s) (or source_indices if there is no parent record)\n        # 2. their index in the fan out (if this is in a one-to-many operation)\n        #\n        # We currently do NOT hash just based on record content (i.e. schema (key, value) pairs)\n        # because multiple outputs for a given operation may have the exact same\n        # schema (key, value) pairs.\n        #\n        # We may revisit this hashing scheme in the future.\n\n        # unique identifier for the record\n        schema_fields = sorted(list(type(data_item).model_fields))\n        # compute the parent/source fallback separately so that the schema fields (and the\n        # cardinality index, when present) always contribute to the id, as described in the NOTE above\n        lineage_str = str(parent_ids) if parent_ids is not None else str(self._source_indices)\n        id_str = (\n            str(schema_fields) + lineage_str\n            if cardinality_idx is None\n            else str(schema_fields) + str(cardinality_idx) + lineage_str\n        )\n        self._id = hash_for_id(id_str)\n\n\n    # TODO: raise an exception if one of these fields is present in the schema\n    # - put these in a constant list up top\n    # - import the constant list in Dataset (if possible) and check at plan creation time\n    def __setattr__(self, name: str, value: Any, /) -> None:\n        if name in [\"_data_item\", \"_source_indices\", \"_parent_ids\", \"_cardinality_idx\", \"_passed_operator\", \"_id\"]:\n            super().__setattr__(name, value)\n        else:\n            setattr(self._data_item, name, value)\n\n\n    def __getattr__(self, name: str) -> Any:\n        return getattr(self._data_item, name)\n\n\n    def __getitem__(self, field: str) -> Any:\n        return getattr(self._data_item, field)\n\n\n    def __setitem__(self, field: str, value: Any) -> None:\n        setattr(self._data_item, field, value)\n\n\n    def __str__(self, truncate: int | None = 15) -> str:\n        if 
truncate is not None:\n            items = (f\"{k}={str(v)[:truncate]!r}{'...' if len(str(v)) > truncate else ''}\" for k, v in sorted(self._data_item.model_dump().items()))\n        else:\n            items = (f\"{k}={v!r}\" for k, v in sorted(self._data_item.model_dump().items()))\n        return \"{}({})\".format(type(self).__name__, \", \".join(items))\n\n\n    def __repr__(self) -> str:\n        return self.__str__(truncate=None)\n\n\n    def __eq__(self, other):\n        return isinstance(other, DataRecord) and self._data_item == other._data_item\n\n\n    def __hash__(self):\n        return hash(self.to_json_str(bytes_to_str=True, sorted=True))\n\n\n    def __iter__(self):\n        yield from self._data_item.__iter__()\n\n\n    def get_field_names(self):\n        return list(type(self._data_item).model_fields.keys())\n\n\n    def get_field_type(self, field_name: str) -> FieldInfo:\n        return type(self._data_item).model_fields[field_name]\n\n    @property\n    def schema(self) -> type[BaseModel]:\n        return type(self._data_item)\n\n    def copy(self) -> DataRecord:\n        # get the set of fields to copy from the parent record\n        copy_field_names = [field.split(\".\")[-1] for field in self.get_field_names()]\n\n        # copy field types and values from the parent\n        data_item = {field_name: self[field_name] for field_name in copy_field_names}\n\n        # make copy of the current record\n        new_dr = DataRecord(\n            self.schema(**data_item),\n            source_indices=self._source_indices,\n            parent_ids=self._parent_ids,\n            cardinality_idx=self._cardinality_idx,\n        )\n\n        # copy the passed_operator attribute\n        new_dr._passed_operator = self._passed_operator\n\n        return new_dr\n\n    @staticmethod\n    def from_parent(\n        schema: type[BaseModel],\n        data_item: dict,\n        parent_record: DataRecord,\n        project_cols: list[str] | None = None,\n        
cardinality_idx: int | None = None,\n    ) -> DataRecord:\n        # if project_cols is None, then the new schema is a union of the provided schema and parent_record.schema;\n        # if project_cols is an empty list, then the new schema is simply the provided schema\n        # otherwise, it's a ProjectSchema\n        new_schema = None\n        if project_cols is None:\n            new_schema = union_schemas([schema, parent_record.schema])\n        elif project_cols == []:\n            new_schema = schema\n        else:\n            new_schema = union_schemas([schema, parent_record.schema])\n            new_schema = project(new_schema, project_cols)\n\n        # get the set of fields and field descriptions to copy from the parent record\n        copy_field_names = parent_record.get_field_names() if project_cols is None else project_cols\n        copy_field_names = [field.split(\".\")[-1] for field in copy_field_names]\n\n        # copy fields from the parent\n        data_item.update({field_name: parent_record[field_name] for field_name in copy_field_names})\n\n        # corner-case: wrap values in lists if the new schema expects a list but the data item has a single value\n        for field_name, field_info in new_schema.model_fields.items():\n            field_should_be_list = hasattr(field_info.annotation, '__origin__') and field_info.annotation.__origin__ is list\n            field_is_not_list = field_name in data_item and not isinstance(data_item[field_name], list)\n            if field_should_be_list and field_is_not_list:\n                data_item[field_name] = [data_item[field_name]]\n\n        # make new record which has parent_record as its parent (and the same source_indices)\n        new_dr = DataRecord(\n            new_schema(**data_item),\n            source_indices=parent_record._source_indices,\n            parent_ids=[parent_record._id],\n            cardinality_idx=cardinality_idx,\n        )\n\n        return new_dr\n\n    @staticmethod\n    
def from_agg_parents(\n        data_item: BaseModel,\n        parent_records: DataRecordSet,\n        cardinality_idx: int | None = None,\n    ) -> DataRecord:\n        # flatten source indices from all parents\n        source_indices = [\n            source_idx\n            for parent_record in parent_records\n            for source_idx in parent_record._source_indices\n        ]\n\n        # make new record which has all parent records as its parents\n        return DataRecord(\n            data_item,\n            source_indices=source_indices,\n            parent_ids=[parent_record._id for parent_record in parent_records],\n            cardinality_idx=cardinality_idx,\n        )\n\n    @staticmethod\n    def from_join_parents(\n        schema: type[BaseModel],\n        left_parent_record: DataRecord | None,\n        right_parent_record: DataRecord | None,\n        project_cols: list[str] | None = None,\n        cardinality_idx: int = None,\n    ) -> DataRecord:\n        # get the set of fields and field descriptions to copy from the parent record(s)\n        left_copy_field_names = [] if left_parent_record is None else (\n            left_parent_record.get_field_names()\n            if project_cols is None\n            else [col for col in project_cols if col in left_parent_record.get_field_names()]\n        )\n        right_copy_field_names = [] if right_parent_record is None else (\n            right_parent_record.get_field_names()\n            if project_cols is None\n            else [col for col in project_cols if col in right_parent_record.get_field_names()]\n        )\n        left_copy_field_names = [field.split(\".\")[-1] for field in left_copy_field_names]\n        right_copy_field_names = [field.split(\".\")[-1] for field in right_copy_field_names]\n\n        # copy fields from the parents\n        data_item = {field_name: left_parent_record[field_name] for field_name in left_copy_field_names}\n        for field_name in right_copy_field_names:\n       
     new_field_name = field_name\n            if field_name in left_copy_field_names:\n                new_field_name = f\"{field_name}_right\"\n            data_item[new_field_name] = right_parent_record[field_name]\n\n        # for any missing fields in the schema, set them to None\n        for field_name in schema.model_fields:\n            if field_name not in data_item:\n                data_item[field_name] = None\n\n        # make new record which has left and right parent record as its parents\n        left_parent_source_indices = [] if left_parent_record is None else list(left_parent_record._source_indices)\n        right_parent_source_indices = [] if right_parent_record is None else list(right_parent_record._source_indices)\n        left_parent_record_id = [] if left_parent_record is None else [left_parent_record._id]\n        right_parent_record_id = [] if right_parent_record is None else [right_parent_record._id]\n        new_dr = DataRecord(\n            schema(**data_item),\n            source_indices=left_parent_source_indices + right_parent_source_indices,\n            parent_ids=left_parent_record_id + right_parent_record_id,\n            cardinality_idx=cardinality_idx,\n        )\n\n        return new_dr\n\n    @staticmethod\n    def to_df(records: list[DataRecord], project_cols: list[str] | None = None) -> pd.DataFrame:\n        if len(records) == 0:\n            return pd.DataFrame()\n\n        fields = records[0].get_field_names()\n        if project_cols is not None and len(project_cols) > 0:\n            fields = [field for field in fields if field in project_cols]\n\n        # convert Context --> str\n        for record in records:\n            for k in fields:\n                if isinstance(record[k], context.Context):\n                    record[k] = record[k].description\n\n        return pd.DataFrame([\n            {k: record[k] for k in fields}\n            for record in records\n        ])\n\n    def to_json_str(self, include_bytes: 
bool = True, bytes_to_str: bool = False, project_cols: list[str] | None = None, sorted: bool = False):\n        \"\"\"Return a JSON representation of this DataRecord\"\"\"\n        record_dict = self.to_dict(include_bytes, bytes_to_str, project_cols, sorted)\n        return json.dumps(record_dict, indent=2)\n\n    def to_dict(self, include_bytes: bool = True, bytes_to_str: bool = False, project_cols: list[str] | None = None, _sorted: bool = False, mask_filepaths: bool = False):\n        \"\"\"Return a dictionary representation of this DataRecord\"\"\"\n        # TODO(chjun): In case of numpy types, the json.dumps will fail. Convert to native types.\n        # Better ways to handle this.\n        field_values = {\n            k: v.description if isinstance(v, context.Context) else v\n            for k, v in self._data_item.model_dump().items()\n        }\n        dct = pd.Series(field_values).to_dict()\n\n        if project_cols is not None and len(project_cols) > 0:\n            project_field_names = set(field.split(\".\")[-1] for field in project_cols)\n            dct = {k: v for k, v in dct.items() if k in project_field_names}\n\n        if not include_bytes:\n            bytes_field_types = [bytes, list[bytes], bytes | None, list[bytes] | None, bytes | Any, list[bytes] | Any]\n            bytes_field_types += AUDIO_FIELD_TYPES + IMAGE_FIELD_TYPES\n            for k in dct:\n                field_type = self.get_field_type(k)\n                if field_type.annotation in bytes_field_types:\n                    dct[k] = \"<bytes>\"\n\n        if bytes_to_str:\n            for k, v in dct.items():\n                if isinstance(v, bytes):\n                    dct[k] = v.decode(\"utf-8\")\n                elif isinstance(v, list) and len(v) > 0 and any([isinstance(elt, bytes) for elt in v]):\n                    dct[k] = [elt.decode(\"utf-8\") if isinstance(elt, bytes) else elt for elt in v]\n\n        if _sorted:\n            dct = dict(sorted(dct.items()))\n\n     
   if mask_filepaths:\n            for k in dct:\n                field_type = self.get_field_type(k)\n                if field_type.annotation in [AudioBase64, AudioFilepath, ImageBase64, ImageFilepath, ImageURL]:\n                    dct[k] = \"<bytes>\"\n\n        return deepcopy(dct)\n\n\nclass DataRecordSet:\n    \"\"\"\n    A DataRecordSet contains a list of DataRecords that share the same schema, same parent(s), and same source(s).\n\n    We explicitly check that this is True.\n\n    The record_op_stats could be empty if the DataRecordSet is not from executing an operator.\n    \"\"\"\n    def __init__(\n            self,\n            data_records: list[DataRecord],\n            record_op_stats: list[RecordOpStats],\n            field_to_score_fn: dict[str, str | callable] | None = None,\n            input: int | DataRecord | list[DataRecord] | tuple[list[DataRecord]] | None = None,\n        ):\n        # set data_records, parent_ids, and source_indices; note that it is possible for\n        # data_records to be an empty list in the event of a failed convert\n        self.data_records = data_records\n        self.parent_ids = data_records[0]._parent_ids if len(data_records) > 0 else None\n        self.source_indices = data_records[0]._source_indices if len(data_records) > 0 else None\n        self.schema = data_records[0].schema if len(data_records) > 0 else None\n\n        # the input to the operator which produced the data_records; type is tuple[DataRecord] | tuple[int]\n        # - for scan operators, input is a singleton tuple[int] which wraps the source_idx, e.g.: (source_idx,)\n        # - for join operators, input is a tuple with one entry for the left input DataRecord and one entry for the right input DataRecord\n        # - for aggregate operators, input is a tuple with all the input DataRecords to the aggregation\n        # - for all other operators, input is a singleton tuple[DataRecord] which wraps the single input\n        self.input = input\n\n 
       # set statistics for generating these records\n        self.record_op_stats = record_op_stats\n\n        # assign field_to_score_fn if provided\n        self.field_to_score_fn = {} if field_to_score_fn is None else field_to_score_fn\n\n    def get_total_cost(self) -> float:\n        return sum([record_op_stats.cost_per_record for record_op_stats in self.record_op_stats])\n\n    def get_field_to_score_fn(self) -> dict[str, str | callable]:\n        return self.field_to_score_fn\n\n    def __getitem__(self, slice) -> DataRecord | list[DataRecord]:\n        return self.data_records[slice]\n\n    def __len__(self) -> int:\n        return len(self.data_records)\n\n    def __iter__(self) -> Generator[DataRecord]:\n        yield from self.data_records\n\n\nclass DataRecordCollection:\n    \"\"\"\n    A DataRecordCollection contains a list of DataRecords.\n\n    This is a wrapper class for list[DataRecord] to support more advanced features for output of execute().\n\n    The difference between DataRecordSet and DataRecordCollection \n\n    Goal: \n        DataRecordSet is a set of DataRecords that share the same schema, same parents, and same sources.\n        DataRecordCollection is a general wrapper for list[DataRecord].\n    \n    Usage:\n        DataRecordSet is used for the output of executing an operator.\n        DataRecordCollection is used for the output of executing a query, we definitely could extend it to support more advanced features for output of execute().\n    \"\"\"\n    def __init__(self, data_records: list[DataRecord], execution_stats: ExecutionStats | None = None, plan_stats: PlanStats | None = None):\n        self.data_records = data_records\n        self.execution_stats = execution_stats\n        self.plan_stats = plan_stats\n        self.executed_plans = self._get_executed_plans()\n\n    def __iter__(self) -> Generator[DataRecord]:\n        \"\"\"Allow iterating directly over the data records\"\"\"\n        yield from self.data_records\n\n    
def __len__(self):\n        \"\"\"Return the number of records in the collection\"\"\"\n        return len(self.data_records)\n\n    def to_df(self, cols: list[str] | None = None):\n        return DataRecord.to_df(self.data_records, cols)\n\n    def _get_executed_plans(self):\n        if self.plan_stats is not None:\n            return [self.plan_stats.plan_str]\n        elif self.execution_stats is not None:\n            return list(self.execution_stats.plan_strs.values())\n        else:\n            return None\n"
  },
  {
    "path": "src/palimpzest/core/lib/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/core/lib/schemas.py",
    "content": "from __future__ import annotations\n\nimport sys\nfrom typing import Any, TypeAliasType\n\nimport pandas as pd\nfrom pydantic import BaseModel, Field, create_model\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.utils.hash_helpers import hash_for_serialized_dict\n\n# DEFINITIONS\nPANDAS_DTYPE_TO_PYDANTIC = {\n    \"object\": str,\n    \"bool\": bool,\n    \"int64\": int,\n    \"float64\": float,\n}\n\n# IMAGE TYPES\nImageFilepath = TypeAliasType('ImageFilepath', str)\nImageBase64 = TypeAliasType('ImageBase64', str)\nImageURL = TypeAliasType('ImageURL', str)\n\n# AUDIO TYPES\nAudioFilepath = TypeAliasType('AudioFilepath', str)\nAudioBase64 = TypeAliasType('AudioBase64', str)\n\nIMAGE_LIST_FIELD_TYPES = [\n    list[ImageBase64],\n    list[ImageFilepath],\n    list[ImageURL],\n    list[ImageBase64] | None,\n    list[ImageFilepath] | None,\n    list[ImageURL] | None,\n    list[ImageBase64] | Any,\n    list[ImageFilepath] | Any,\n    list[ImageURL] | Any,\n]\nIMAGE_FIELD_TYPES = IMAGE_LIST_FIELD_TYPES + [\n    ImageBase64, ImageFilepath, ImageURL,\n    ImageBase64 | None, ImageFilepath | None, ImageURL | None,\n    ImageBase64 | Any, ImageFilepath | Any, ImageURL | Any,\n]\nAUDIO_LIST_FIELD_TYPES = [\n    list[AudioBase64],\n    list[AudioFilepath],\n    list[AudioBase64] | None,\n    list[AudioFilepath] | None,\n    list[AudioBase64] | Any,\n    list[AudioFilepath] | Any,\n]\nAUDIO_FIELD_TYPES = AUDIO_LIST_FIELD_TYPES + [\n    AudioBase64, AudioFilepath,\n    AudioBase64 | None, AudioFilepath | None,\n    AudioBase64 | Any, AudioFilepath | Any,\n]\n\n\ndef get_schema_field_names(schema: type[BaseModel], id: str | None = None) -> list[str]:\n    \"\"\"Return the field names of a Pydantic model.\"\"\"\n    return list(schema.model_fields) if id is None else [f\"{schema.__name__}.{id}.{field_name}\" for field_name in schema.model_fields]\n\n\ndef _create_pickleable_model(fields: dict[str, tuple[type, FieldInfo]]) -> type[BaseModel]:\n    
\"\"\"Create a Pydantic model that can be pickled.\"\"\"\n    # create unique name for the unioned model\n    new_schema_name = f\"Schema{sorted(fields.keys())}\"\n    new_schema_id = hash_for_serialized_dict({\n        field_name: {\"annotation\": str(annotation), \"default\": str(field.default), \"description\": field.description}\n        for field_name, (annotation, field) in fields.items()\n    })\n\n    # if this class already exists, get it from the module and return\n    module = sys.modules[__name__]\n    if hasattr(module, new_schema_id):\n        return getattr(module, new_schema_id)\n\n    # create the class dynamically\n    new_model = create_model(new_schema_name, **fields)\n\n    # register it in the module's namespace so pickle can find it\n    module = sys.modules[__name__]\n    setattr(module, new_schema_id, new_model)\n    new_model.__module__ = module.__name__\n\n    return new_model\n\n\ndef relax_schema(model: type[BaseModel]) -> type[BaseModel]:\n    \"\"\"Updates the type annotation for every field in the BaseModel to include typing.Any\"\"\"\n    fields = {}\n    for field_name, field in model.model_fields.items():\n        fields[field_name] = (field.annotation | Any, field)\n\n    return _create_pickleable_model(fields)\n\n\ndef project(model: type[BaseModel], project_fields: list[str]) -> type[BaseModel]:\n    \"\"\"Project a Pydantic model to only the specified columns.\"\"\"\n    # make sure projection column names are shortened\n    project_fields = [field_name.split(\".\")[-1] for field_name in project_fields]\n\n    # build up the fields for the new schema\n    fields = {}\n    for field_name, field in model.model_fields.items():\n        if field_name in project_fields:\n            fields[field_name] = (field.annotation, field)\n\n    # create and return the new schema\n    return _create_pickleable_model(fields)\n\n\ndef create_schema_from_fields(fields: list[dict]) -> type[BaseModel]:\n    \"\"\"Create a Pydantic model from a 
list of fields.\"\"\"\n    fields_ = {}\n    for field in fields:\n        assert \"name\" in field, \"fields must contain a 'name' key\"\n        assert \"type\" in field, \"fields must contain a 'type' key\"\n        assert \"desc\" in field or \"description\" in field, \"fields must contain a 'description' key\"\n\n        # for backwards compatibility, rename \"desc\" to \"description\"\n        if \"desc\" in field:\n            field[\"description\"] = field.pop(\"desc\")\n        field_name = field[\"name\"]\n        field_type = field[\"type\"]\n        fields_[field_name] = (field_type, Field(**{k: v for k, v in field.items() if k not in [\"name\", \"type\"]}))\n\n    return _create_pickleable_model(fields_)\n\n\ndef create_schema_from_df(df: pd.DataFrame) -> type[BaseModel]:\n    \"\"\"Create a Pydantic model from a Pandas DataFrame.\"\"\"\n    fields = {}\n    for column, dtype in zip(df.columns, df.dtypes):\n        column = f\"column_{column}\" if isinstance(column, int) else column\n        field_desc = f\"The {column} column from an input DataFrame\"\n        annotation = PANDAS_DTYPE_TO_PYDANTIC.get(str(dtype), Any)\n        fields[column] = (annotation, Field(description=field_desc))\n\n    # create and return the new schema\n    return _create_pickleable_model(fields)\n\n\ndef union_schemas(models: list[type[BaseModel]], join: bool = False, on: list[str] | None = None) -> type[BaseModel]:\n    \"\"\"Union multiple Pydantic models into a single model.\"\"\"\n    # convert on to empty list if None\n    if on is None:\n        on = []\n\n    # build up the fields for the new schema\n    fields = {}\n    for model in models:\n        for field_name, field in model.model_fields.items():\n            # for non-join unions, make sure duplicate fields have the same type\n            if not join and field_name in fields:\n                assert fields[field_name][0] == field.annotation, f\"Field {field_name} has different types in different models\"\n\n    
        # for joins with \"on\" specified, no need to rename fields in \"on\"\n            elif join and field_name in on and field_name in fields:\n                continue\n\n            # otherwise, rename duplicate fields by appending _right\n            elif join and field_name in fields:\n                while field_name in fields:\n                    field_name = f\"{field_name}_right\"\n\n            # add the field to the new schema\n            fields[field_name] = (field.annotation, field)\n\n    # create and return the new schema\n    return _create_pickleable_model(fields)\n\n###################################################################################\n# \"Core\" useful Schemas. These are Schemas that almost everyone will need.\n# File, TextFile, Image, PDF, etc.\n###################################################################################\n\n\n# First-level Schema's\nclass DefaultSchema(BaseModel):\n    \"\"\"Store context data.\"\"\"\n    value: Any = Field(description=\"The value of the input data\")\n\nclass Download(BaseModel):\n    \"\"\"A download is a URL and the contents of the download.\"\"\"\n    url: str = Field(description=\"The URL of the download\")\n    content: bytes = Field(description=\"The contents of the download\")\n    timestamp: str = Field(description=\"The timestamp of the download\")\n\nclass File(BaseModel):\n    \"\"\"\n    A File is defined by two Fields:\n    - the filename (string)\n    - the contents of the file (bytes)\n    \"\"\"\n    filename: str = Field(description=\"The UNIX-style name of the file\")\n    contents: bytes = Field(description=\"The contents of the file\")\n\nclass TextFile(BaseModel):\n    \"\"\"A text file is a File that contains only text. 
No binary data.\"\"\"\n    filename: str = Field(description=\"The UNIX-style name of the file\")\n    contents: str = Field(description=\"The contents of the file\")\n\nclass Average(BaseModel):\n    average: float = Field(description=\"The average value of items in the dataset\")\n\nclass Count(BaseModel):\n    count: int = Field(description=\"The count of items in the dataset\")\n\nclass Sum(BaseModel):\n    sum: int = Field(description=\"The summation of items in the dataset\")\n\nclass Min(BaseModel):\n    min: int | float = Field(description=\"The minimum value of some items in the dataset\")\n\nclass Max(BaseModel):\n    max: int | float = Field(description=\"The maximum value of some items in the dataset\")\n\nclass OperatorDerivedSchema(BaseModel):\n    \"\"\"Schema defined by an operator, e.g., a join or a group by\"\"\"\n\nclass Table(BaseModel):\n    \"\"\"A Table is an object composed of a header and rows.\"\"\"\n    filename: str = Field(description=\"The name of the file the table was extracted from\")\n    name: str = Field(description=\"The name of the table\")\n    header: list[str] = Field(description=\"The header of the table\")\n    rows: list[list] = Field(description=\"The rows of the table\")\n\nclass URL(BaseModel):\n    \"\"\"A URL is a string that represents a web address.\"\"\"\n    url: str = Field(description=\"A URL\")\n\nclass WebPage(BaseModel):\n    \"\"\"A web page is a URL and the contents of the page.\"\"\"\n    text: str = Field(description=\"The text contents of the web page\")\n    html: str = Field(description=\"The html contents of the web page\")\n    timestamp: str = Field(description=\"The timestamp of the download\")\n    filename: str = Field(description=\"The name of the file the web page was downloaded from\")\n\n# Second-level Schemas\nclass ImageFile(File):\n    \"\"\"A file that contains an image.\"\"\"\n    contents: ImageBase64 = Field(description=\"The contents of the image encoded as a base64 
string\")\n\nclass AudioFile(File):\n    \"\"\"A file that contains audio.\"\"\"\n    contents: AudioBase64 = Field(description=\"The contents of an audio recording encoded as a base64 string\")\n\nclass PDFFile(File):\n    \"\"\"A PDF file is a File that is a PDF. It has specialized fields, font information, etc.\"\"\"\n    # This class is currently very impoverished. It needs a lot more fields before it can correctly represent a PDF.\n    text_contents: str = Field(description=\"The text-only contents of the PDF\")\n\nclass XLSFile(File):\n    \"\"\"An XLS file is a File that contains one or more Excel spreadsheets.\"\"\"\n    number_sheets: int = Field(description=\"The number of sheets in the Excel file\")\n    sheet_names: list[str] = Field(description=\"The names of the sheets in the Excel file\")\n\n# Third-level Schemas\nclass EquationImage(ImageFile):\n    \"\"\"An image that contains a mathematical equation.\"\"\"\n    equation_text: str = Field(description=\"The text representation of the equation in the image\")\n\nclass PlotImage(ImageFile):\n    \"\"\"An image that contains a plot, such as a graph or chart.\"\"\"\n    plot_description: str = Field(description=\"A description of the plot\")\n"
  },
  {
    "path": "src/palimpzest/core/models.py",
    "content": "from __future__ import annotations\n\nimport json\nimport time\nfrom abc import abstractmethod\nfrom typing import Any\n\nfrom pydantic import BaseModel, Field\n\n\nclass GenerationStats(BaseModel):\n    \"\"\"\n    Model for storing statistics about the execution of an operator on a single record.\n    \"\"\"\n\n    model_name: str | None = None\n\n    # The raw answer as output from the generator (a list of strings, possibly of len 1)\n    # raw_answers: Optional[List[str]] = field(default_factory=list)\n\n    # the total number of input text tokens processed by this operator; None if this operation did not use any LLM\n    # typed as a float because GenerationStats may be amortized (i.e. divided) acorss a number of output records\n    input_text_tokens: float = 0.0\n\n    # the total number of input audio tokens processed by this operation.\n    input_audio_tokens: float = 0.0\n\n    # the total number of input image tokens processed by this operation.\n    input_image_tokens: float = 0.0\n\n    # the total number of cache read tokens processed by this operation (charged at a discount, typically 0.1x input rate)\n    cache_read_tokens: float = 0.0\n\n    # the total number of tokens written to the cache in this operation (Anthropic only) (charged at creation rate, typically 1.25x input rate)\n    cache_creation_tokens: float = 0.0\n\n    # the number of output text tokens generated by the model\n    output_text_tokens: float = 0.0\n\n    # the total number of input tokens processed by embedding models\n    embedding_input_tokens: float = 0.0\n\n    # the total cost of processing the input and output tokens; None if this operation did not use an LLM\n    # TODO: future PR: cost_per_record --> total_cost\n    cost_per_record: float = 0.0\n\n    # (if applicable) the time (in seconds) spent executing a call to an LLM\n    llm_call_duration_secs: float = 0.0\n\n    # (if applicable) the time (in seconds) spent executing a call to a function\n    
fn_call_duration_secs: float = 0.0\n\n    # (if applicable) the total number of LLM calls made by this operator\n    total_llm_calls: float = 0.0\n\n    # (if applicable) the total number of embedding LLM calls made by this operator\n    total_embedding_llm_calls: float = 0.0\n\n    def __iadd__(self, other: GenerationStats) -> GenerationStats:\n        for field in type(self).model_fields:\n            if field == \"model_name\":\n                continue\n            setattr(self, field, getattr(self, field) + getattr(other, field))\n        return self\n\n    def __add__(self, other: GenerationStats) -> GenerationStats:\n        dct = {\n            field: getattr(self, field) + getattr(other, field)\n            for field in type(self).model_fields\n            if field != \"model_name\"\n        }\n        dct[\"model_name\"] = self.model_name\n        return GenerationStats(**dct)\n\n    # Do the same as iadd and add but with division operator\n    def __itruediv__(self, quotient: float) -> GenerationStats:\n        if quotient == 0:\n            raise ZeroDivisionError(\"Cannot divide by zero\")\n        if isinstance(quotient, int):\n            quotient = float(quotient)\n        for field in type(self).model_fields:\n            if field == \"model_name\":\n                continue\n            setattr(self, field, getattr(self, field) / quotient)\n        return self\n\n    def __truediv__(self, quotient: float) -> GenerationStats:\n        if quotient == 0:\n            raise ZeroDivisionError(\"Cannot divide by zero\")\n        if isinstance(quotient, int):\n            quotient = float(quotient)\n        dct = {\n            field: getattr(self, field) / quotient\n            for field in type(self).model_fields\n            if field != \"model_name\"\n        }\n        dct[\"model_name\"] = self.model_name\n        return GenerationStats(**dct)\n\n    def __radd__(self, other: int) -> GenerationStats:\n        assert not isinstance(other, 
GenerationStats), \"This should not be called with a GenerationStats object\"\n        return self\n\n    # NOTE: this is added temporarily to help track cost of compute agent writing PZ code;\n    #       once we find a long-term solution for tracking that cost, we can remove this\n    def to_json(self, filepath: str | None = None) -> dict | None:\n        if filepath is None:\n            return self.model_dump(mode=\"json\")\n\n        with open(filepath, \"w\") as f:\n            json.dump(self.model_dump(mode=\"json\"), f)\n\n\nclass RecordOpStats(BaseModel):\n    \"\"\"\n    Model for storing statistics about the execution of an operator on a single record.\n    \"\"\"\n\n    ##### REQUIRED FIELDS #####\n    # record id; an identifier for this record\n    record_id: str | int\n\n    # identifier for the parent(s) of this record\n    record_parent_ids: list[str | int] | None\n\n    # identifier for the source indices of this record\n    record_source_indices: list[str | int]\n\n    # a dictionary with the record state after being processed by the operator\n    record_state: dict[str, Any]\n\n    # operation id; an identifier for this operation's physical op id\n    full_op_id: str\n\n    # logical operation id; the logical op id for this physical op\n    logical_op_id: str\n\n    # operation name\n    op_name: str\n\n    # the time spent by the data record just in this operation\n    time_per_record: float\n\n    # the cost (in dollars) to generate this record at this operation\n    cost_per_record: float\n\n    ##### NOT-OPTIONAL, BUT FILLED BY EXECUTION CLASS AFTER CONSTRUCTOR CALL #####\n    # the ID(s) of the physical operation(s) which produced the input record(s) for this record at this operation\n    source_unique_full_op_ids: list[str] | None = None\n\n    # the ID(s) of the logical operation(s) which produced the input record(s) for this record at this operation\n    source_unique_logical_op_ids: list[str] | None = None\n\n    # the ID of the physical 
plan which produced this record at this operation\n    plan_id: str = \"\"\n\n    ##### OPTIONAL, BUT FILLED BY COST MODEL AFTER SAMPLE DATA EXECUTION #####\n    quality: float | None = None\n\n    ##### OPTIONAL FIELDS (I.E. ONLY MANDATORY FOR CERTAIN OPERATORS) #####\n    # (if applicable) the name of the model used to generate the output for this record\n    model_name: str | None = None\n\n    # (if applicable) the mapping from field-name to generated output for this record\n    answer: dict[str, Any] | None = None\n\n    # (if applicable) the mapping from field-name to generated output for this record\n    # raw_answers: Optional[List[str, Any]] = field(default_factory=list)\n\n    # (if applicable) the list of input fields for the generation for this record\n    input_fields: list[str] | None = None\n\n    # (if applicable) the list of generated fields for this record\n    generated_fields: list[str] | None = None\n\n    # the number of input text tokens processed by this operation\n    # typed as a float because GenerationStats may be amortized (i.e. 
divided) across a number of output records\n    input_text_tokens: float = 0.0\n\n    # the number of input audio tokens processed by this operation\n    input_audio_tokens: float = 0.0\n\n    # the number of input image tokens processed by this operation\n    input_image_tokens: float = 0.0\n\n    # the number of cache read tokens processed by this operation\n    cache_read_tokens: float = 0.0\n\n    # the number of tokens written to cache by this operation\n    cache_creation_tokens: float = 0.0\n\n    # the number of output text tokens generated by this operation\n    output_text_tokens: float = 0.0\n\n    # the number of input tokens processed by embedding models\n    embedding_input_tokens: float = 0.0\n\n    # (if applicable) the filter text (or a string representation of the filter function) applied to this record\n    filter_str: str | None = None\n\n    # (if applicable) the join condition applied to this record\n    join_condition: str | None = None\n\n    # the True/False result of whether this record was output by the operator or not\n    # (can only be False if the operator is a Filter or Join)\n    passed_operator: bool = True\n\n    # (if applicable) the time (in seconds) spent executing a call to an LLM\n    llm_call_duration_secs: float = 0.0\n\n    # (if applicable) the time (in seconds) spent executing a UDF or calling an external api\n    fn_call_duration_secs: float = 0.0\n\n    # (if applicable) the total number of LLM calls made by this operator\n    total_llm_calls: float = 0.0\n\n    # (if applicable) the total number of embedding LLM calls made by this operator\n    total_embedding_llm_calls: float = 0.0\n\n    # (if applicable) a boolean indicating whether this is the statistics captured from a failed convert operation\n    failed_convert: bool | None = None\n\n    # an OPTIONAL dictionary with more detailed information about this operation;\n    op_details: dict[str, Any] = Field(default_factory=dict)\n\n\nclass 
OperatorStats(BaseModel):\n    \"\"\"\n    Model for storing statistics captured within a given operator.\n    \"\"\"\n\n    # the full ID of the physical operation in which these stats were collected\n    full_op_id: str\n\n    # the name of the physical operation in which these stats were collected\n    op_name: str\n\n    # the total time spent in this operation\n    total_op_time: float = 0.0\n\n    # the total cost of this operation\n    total_op_cost: float = 0.0\n\n    # the number of input text tokens processed by this operation\n    input_text_tokens: float = 0.0\n\n    # the number of input audio tokens processed by this operation\n    input_audio_tokens: float = 0.0\n\n    # the number of input image tokens processed by this operation\n    input_image_tokens: float = 0.0\n\n    # the number of cache read tokens processed by this operation\n    cache_read_tokens: float = 0.0\n\n    # the number of tokens written to cache by this operation\n    cache_creation_tokens: float = 0.0\n\n    # the number of output text tokens generated by this operation\n    output_text_tokens: float = 0.0\n\n    # the number of input tokens processed by embedding models\n    embedding_input_tokens: float = 0.0\n\n    # a list of RecordOpStats processed by the operation\n    record_op_stats_lst: list[RecordOpStats] = Field(default_factory=list)\n\n    # the unique full ID(s) of the physical operator(s) which precede this one (used by PlanStats)\n    source_unique_full_op_ids: list[str] | None = None\n\n    # the unique full ID(s) of the logical operator(s) which precede this one (used by SentinelPlanStats)\n    source_unique_logical_op_ids: list[str] | None = None\n\n    # the ID of the physical plan which this operator is part of\n    plan_id: str = \"\"\n\n    # an OPTIONAL dictionary with more detailed information about this operation;\n    op_details: dict[str, Any] = Field(default_factory=dict)\n\n    def __iadd__(self, stats: OperatorStats | RecordOpStats) -> 
OperatorStats:\n        \"\"\"\n        Sum the given stats to this operator's stats. The given stats can be either:\n\n        1. an OperatorStats object\n        2. a RecordOpStats object\n\n        NOTE: in case (1.) we assume the execution layer guarantees that `stats` is\n              generated by the same operator in the same plan. Thus, we assume the\n              full_op_ids, op_name, source_op_id, etc. do not need to be updated.\n        \"\"\"\n        if isinstance(stats, OperatorStats):\n            self.total_op_time += stats.total_op_time\n            self.total_op_cost += stats.total_op_cost\n            self.input_text_tokens += stats.input_text_tokens\n            self.input_audio_tokens += stats.input_audio_tokens\n            self.input_image_tokens += stats.input_image_tokens\n            self.cache_read_tokens += stats.cache_read_tokens\n            self.cache_creation_tokens += stats.cache_creation_tokens\n            self.output_text_tokens += stats.output_text_tokens\n            self.embedding_input_tokens += stats.embedding_input_tokens\n            self.record_op_stats_lst.extend(stats.record_op_stats_lst)\n\n        elif isinstance(stats, RecordOpStats):\n            stats.source_unique_full_op_ids = self.source_unique_full_op_ids\n            stats.plan_id = self.plan_id\n            self.record_op_stats_lst.append(stats)\n            self.total_op_time += stats.time_per_record\n            self.total_op_cost += stats.cost_per_record\n            self.input_text_tokens += stats.input_text_tokens\n            self.input_audio_tokens += stats.input_audio_tokens\n            self.input_image_tokens += stats.input_image_tokens\n            self.cache_read_tokens += stats.cache_read_tokens\n            self.cache_creation_tokens += stats.cache_creation_tokens\n            self.output_text_tokens += stats.output_text_tokens\n            self.embedding_input_tokens += stats.embedding_input_tokens\n\n        else:\n            raise 
TypeError(f\"Cannot add {type(stats)} to OperatorStats\")\n\n        return self\n\n\nclass BasePlanStats(BaseModel):\n    \"\"\"\n    Model for storing statistics captured for an entire plan.\n\n    This class is subclassed for tracking:\n    - PlanStats: the statistics for execution of a PhysicalPlan\n    - SentinelPlanStats: the statistics for execution of a SentinelPlan\n\n    The key difference between the two subclasses is that the `operator_stats`\n    field in the PlanStats maps from the physical operator ids to their corresponding\n    OperatorStats objects.\n\n    The `operator_stats` field in the SentinelPlanStats maps from a logical operator id\n    to another dictionary which maps from the physical operator ids to their corresponding\n    OperatorStats objects.\n    \"\"\"\n\n    # id for identifying the physical plan\n    plan_id: str\n\n    # string representation of the physical plan\n    plan_str: str | None = None\n\n    # dictionary whose values are OperatorStats objects;\n    # PlanStats maps {full_op_id -> OperatorStats}\n    # SentinelPlanStats maps {logical_op_id -> {full_op_id -> OperatorStats}}\n    operator_stats: dict[str, OperatorStats | dict[str, OperatorStats]] = Field(default_factory=dict)\n\n    # dictionary whose values are GenerationStats objects for validation;\n    # only used by SentinelPlanStats\n    validation_gen_stats: dict[str, GenerationStats] = Field(default_factory=dict)\n\n    # total runtime for the plan measured from the start to the end of PhysicalPlan.execute()\n    total_plan_time: float = 0.0\n\n    # total cost for plan\n    total_plan_cost: float = 0.0\n\n    # input text tokens processed by this plan\n    input_text_tokens: float = 0.0\n\n    # input audio tokens processed by this plan\n    input_audio_tokens: float = 0.0\n\n    # input image tokens processed by this plan\n    input_image_tokens: float = 0.0\n\n    # cache read tokens processed by this plan\n    cache_read_tokens: float = 0.0\n\n    # tokens 
written to cache by this plan\n    cache_creation_tokens: float = 0.0\n\n    # output text tokens generated by this plan\n    output_text_tokens: float = 0.0\n\n    # embedding input tokens processed by this plan\n    embedding_input_tokens: float = 0.0\n\n    # start time for the plan execution; should be set by calling PlanStats.start()\n    start_time: float | None = None\n\n    def start(self) -> None:\n        \"\"\"Start the timer for this plan execution.\"\"\"\n        self.start_time = time.time()\n\n    def finish(self) -> None:\n        \"\"\"Finish the timer for this plan execution.\"\"\"\n        if self.start_time is None:\n            raise RuntimeError(\"PlanStats.start() must be called before PlanStats.finish()\")\n        self.total_plan_time = time.time() - self.start_time\n        self.total_plan_cost = self.sum_op_stats_field(\"total_op_cost\") + self.sum_validation_stats_field(\"cost_per_record\")\n        self.input_text_tokens = self.sum_op_stats_field(\"input_text_tokens\") + self.sum_validation_stats_field(\"input_text_tokens\")\n        self.input_audio_tokens = self.sum_op_stats_field(\"input_audio_tokens\") + self.sum_validation_stats_field(\"input_audio_tokens\")\n        self.input_image_tokens = self.sum_op_stats_field(\"input_image_tokens\") + self.sum_validation_stats_field(\"input_image_tokens\")\n        self.cache_read_tokens = self.sum_op_stats_field(\"cache_read_tokens\") + self.sum_validation_stats_field(\"cache_read_tokens\")\n        self.cache_creation_tokens = self.sum_op_stats_field(\"cache_creation_tokens\") + self.sum_validation_stats_field(\"cache_creation_tokens\")\n        self.output_text_tokens = self.sum_op_stats_field(\"output_text_tokens\") + self.sum_validation_stats_field(\"output_text_tokens\")\n        self.embedding_input_tokens = self.sum_op_stats_field(\"embedding_input_tokens\") + self.sum_validation_stats_field(\"embedding_input_tokens\")\n\n    @staticmethod\n    @abstractmethod\n    def 
from_plan(plan) -> BasePlanStats:\n        \"\"\"\n        Initialize this PlanStats object from a PhysicalPlan or SentinelPlan object.\n        \"\"\"\n        pass\n\n    @abstractmethod\n    def sum_op_stats_field(self, field_name: str) -> float | int:\n        \"\"\"Sum a given field across all operator stats in this plan.\"\"\"\n        pass\n\n    def sum_validation_stats_field(self, field_name: str) -> float | int:\n        \"\"\"Sum a given field across all validation generation stats in this plan.\"\"\"\n        return sum([getattr(gen_stats, field_name) for _, gen_stats in self.validation_gen_stats.items()])\n\n    @abstractmethod\n    def add_record_op_stats(self, unique_full_op_id: str, record_op_stats: RecordOpStats | list[RecordOpStats]) -> None:\n        \"\"\"\n        Add the given RecordOpStats to this plan's operator stats for the given operator id.\n        \"\"\"\n        pass\n\n    @abstractmethod\n    def __iadd__(self, plan_stats: BasePlanStats) -> None:\n        \"\"\"\n        Add the given PlanStats to this plan's operator stats.\n        \"\"\"\n        pass\n\n    @abstractmethod\n    def __str__(self) -> str:\n        \"\"\"\n        Return a string representation of this plan's statistics.\n        \"\"\"\n        pass\n\n    def get_total_cost_so_far(self) -> float:\n        \"\"\"\n        Get the total cost incurred so far in this plan execution.\n        \"\"\"\n        return self.sum_op_stats_field(\"total_op_cost\") + self.sum_validation_stats_field(\"cost_per_record\")\n\n\nclass PlanStats(BasePlanStats):\n    \"\"\"\n    Subclass of BasePlanStats which captures statistics from the execution of a single PhysicalPlan.\n    \"\"\"\n    @staticmethod\n    def from_plan(plan) -> PlanStats:\n        \"\"\"\n        Initialize this PlanStats object from a PhysicalPlan object.\n        \"\"\"\n        # TODO?: have PhysicalPlan return PlanStats object\n        operator_stats = {}\n        for topo_idx, op in enumerate(plan):\n       
     unique_full_op_id = f\"{topo_idx}-{op.get_full_op_id()}\"\n            operator_stats[unique_full_op_id] = OperatorStats(\n                full_op_id=op.get_full_op_id(),\n                op_name=op.op_name(),\n                source_unique_full_op_ids=plan.get_source_unique_full_op_ids(topo_idx, op),\n                plan_id=plan.plan_id,\n                op_details={k: str(v) for k, v in op.get_id_params().items()},\n            )\n\n        return PlanStats(plan_id=plan.plan_id, plan_str=str(plan), operator_stats=operator_stats)\n \n    def sum_op_stats_field(self, field_name: str) -> float | int:\n        \"\"\"Sum a given field across all operator stats in this plan.\"\"\"\n        return sum([getattr(op_stats, field_name) for _, op_stats in self.operator_stats.items()])\n\n    def add_record_op_stats(self, unique_full_op_id: str, record_op_stats: RecordOpStats | list[RecordOpStats]) -> None:\n        \"\"\"\n        Add the given RecordOpStats to this plan's operator stats for the given operator id.\n        \"\"\"\n        # normalize input type to be list[RecordOpStats]\n        record_op_stats_lst = record_op_stats if isinstance(record_op_stats, list) else [record_op_stats]\n\n        # update operator stats\n        for record_op_stats in record_op_stats_lst:\n            if unique_full_op_id in self.operator_stats:\n                self.operator_stats[unique_full_op_id] += record_op_stats\n            else:\n                raise ValueError(f\"RecordOpStats with unique_full_op_id {unique_full_op_id} not found in PlanStats\")\n\n    def __iadd__(self, plan_stats: PlanStats) -> None:\n        \"\"\"\n        NOTE: we assume the execution layer guarantees:\n        1. these plan_stats belong to the same plan\n        2. 
these plan_stats come from sequential (non-overlapping) executions of the same plan\n\n        The latter criterion implies it is okay for this method to sum the plan (and operator) runtimes.\n        \"\"\"\n        self.total_plan_time += plan_stats.total_plan_time\n        self.total_plan_cost += plan_stats.total_plan_cost\n        self.input_text_tokens += plan_stats.input_text_tokens\n        self.input_audio_tokens += plan_stats.input_audio_tokens\n        self.input_image_tokens += plan_stats.input_image_tokens\n        self.cache_read_tokens += plan_stats.cache_read_tokens\n        self.cache_creation_tokens += plan_stats.cache_creation_tokens\n        self.output_text_tokens += plan_stats.output_text_tokens\n        self.embedding_input_tokens += plan_stats.embedding_input_tokens\n        for unique_full_op_id, op_stats in plan_stats.operator_stats.items():\n            if unique_full_op_id in self.operator_stats:\n                self.operator_stats[unique_full_op_id] += op_stats\n            else:\n                self.operator_stats[unique_full_op_id] = op_stats\n\n        # return self so that augmented assignment (+=) does not rebind the target to None\n        return self\n\n    def __str__(self) -> str:\n        stats = f\"total_plan_time={self.total_plan_time} \\n\"\n        stats += f\"total_plan_cost={self.total_plan_cost} \\n\"\n        stats += f\"input_text_tokens={self.input_text_tokens} \\n\"\n        stats += f\"input_audio_tokens={self.input_audio_tokens} \\n\"\n        stats += f\"input_image_tokens={self.input_image_tokens} \\n\"\n        stats += f\"cache_read_tokens={self.cache_read_tokens} \\n\"\n        stats += f\"cache_creation_tokens={self.cache_creation_tokens} \\n\"\n        stats += f\"output_text_tokens={self.output_text_tokens} \\n\"\n        stats += f\"embedding_input_tokens={self.embedding_input_tokens} \\n\"\n        for idx, op_stats in enumerate(self.operator_stats.values()):\n            stats += f\"{idx}. 
{op_stats.op_name} time={op_stats.total_op_time} cost={op_stats.total_op_cost} \\n\"\n        return stats\n\n\nclass SentinelPlanStats(BasePlanStats):\n    \"\"\"\n    Subclass of BasePlanStats which captures statistics from the execution of a single SentinelPlan.\n    \"\"\"\n    @staticmethod\n    def from_plan(plan) -> SentinelPlanStats:\n        \"\"\"\n        Initialize this SentinelPlanStats object from a SentinelPlan object.\n        \"\"\"\n        operator_stats = {}\n        for topo_idx, (logical_op_id, op_set) in enumerate(plan):\n            unique_logical_op_id = f\"{topo_idx}-{logical_op_id}\"\n            operator_stats[unique_logical_op_id] = {}\n            for physical_op in op_set:\n                full_op_id = physical_op.get_full_op_id()\n                operator_stats[unique_logical_op_id][full_op_id] = OperatorStats(\n                    full_op_id=full_op_id,\n                    op_name=physical_op.op_name(),\n                    source_unique_logical_op_ids=plan.get_source_unique_logical_op_ids(unique_logical_op_id),\n                    plan_id=plan.plan_id,\n                    op_details={k: str(v) for k, v in physical_op.get_id_params().items()},\n                )\n\n        return SentinelPlanStats(plan_id=plan.plan_id, plan_str=str(plan), operator_stats=operator_stats)\n\n    def sum_op_stats_field(self, field_name: str) -> float | int:\n        \"\"\"Sum a given field across all operator stats in this plan.\"\"\"\n        return sum(sum([getattr(op_stats, field_name) for _, op_stats in phys_op_stats.items()]) for _, phys_op_stats in self.operator_stats.items())\n\n    def add_record_op_stats(self, unique_logical_op_id: str, record_op_stats: RecordOpStats | list[RecordOpStats]) -> None:\n        \"\"\"\n        Add the given RecordOpStats to this plan's operator stats for the given operator set id.\n        \"\"\"\n        # normalize input type to be list[RecordOpStats]\n        record_op_stats_lst = record_op_stats if 
isinstance(record_op_stats, list) else [record_op_stats]\n\n        # update operator stats\n        for record_op_stats in record_op_stats_lst:\n            full_op_id = record_op_stats.full_op_id\n            if unique_logical_op_id in self.operator_stats:\n                if full_op_id in self.operator_stats[unique_logical_op_id]:\n                    self.operator_stats[unique_logical_op_id][full_op_id] += record_op_stats\n                else:\n                    raise ValueError(f\"RecordOpStats with full_op_id {full_op_id} not found in SentinelPlanStats\")\n            else:\n                raise ValueError(f\"RecordOpStats with unique_logical_op_id {unique_logical_op_id} not found in SentinelPlanStats\")\n\n    def add_validation_gen_stats(self, unique_logical_op_id: str, gen_stats: GenerationStats) -> None:\n        \"\"\"\n        Add the given GenerationStats to this plan's validation generation stats for the given logical operator id.\n        \"\"\"\n        if unique_logical_op_id in self.validation_gen_stats:\n            self.validation_gen_stats[unique_logical_op_id] += gen_stats\n        else:\n            self.validation_gen_stats[unique_logical_op_id] = gen_stats\n\n    def __iadd__(self, plan_stats: SentinelPlanStats) -> None:\n        \"\"\"\n        NOTE: we assume the execution layer guarantees:\n        1. these plan_stats belong to the same plan\n        2. 
these plan_stats come from sequential (non-overlapping) executions of the same plan\n\n        The latter criterion implies it is okay for this method to sum the plan (and operator) runtimes.\n        \"\"\"\n        self.total_plan_time += plan_stats.total_plan_time\n        self.total_plan_cost += plan_stats.total_plan_cost\n        self.input_text_tokens += plan_stats.input_text_tokens\n        self.input_audio_tokens += plan_stats.input_audio_tokens\n        self.input_image_tokens += plan_stats.input_image_tokens\n        self.cache_read_tokens += plan_stats.cache_read_tokens\n        self.cache_creation_tokens += plan_stats.cache_creation_tokens\n        self.output_text_tokens += plan_stats.output_text_tokens\n        self.embedding_input_tokens += plan_stats.embedding_input_tokens\n        for unique_logical_op_id, physical_op_stats in plan_stats.operator_stats.items():\n            # adopt the incoming stats wholesale if this logical operator is new to us;\n            # skipping the inner loop avoids adding the adopted stats to themselves\n            if unique_logical_op_id not in self.operator_stats:\n                self.operator_stats[unique_logical_op_id] = physical_op_stats\n                continue\n\n            # otherwise, merge the stats one physical operator at a time\n            for full_op_id, op_stats in physical_op_stats.items():\n                if full_op_id in self.operator_stats[unique_logical_op_id]:\n                    self.operator_stats[unique_logical_op_id][full_op_id] += op_stats\n                else:\n                    self.operator_stats[unique_logical_op_id][full_op_id] = op_stats\n\n        for unique_logical_op_id, gen_stats in plan_stats.validation_gen_stats.items():\n            if unique_logical_op_id in self.validation_gen_stats:\n                self.validation_gen_stats[unique_logical_op_id] += gen_stats\n            else:\n                self.validation_gen_stats[unique_logical_op_id] = gen_stats\n\n        # return self so that augmented assignment (+=) does not rebind the target to None\n        return self\n\n    def __str__(self) -> str:\n        stats = f\"total_plan_time={self.total_plan_time} \\n\"\n        stats += f\"total_plan_cost={self.total_plan_cost} \\n\"\n        stats += f\"input_text_tokens={self.input_text_tokens} \\n\"\n        stats 
+= f\"input_audio_tokens={self.input_audio_tokens} \\n\"\n        stats += f\"input_image_tokens={self.input_image_tokens} \\n\"\n        stats += f\"cache_read_tokens={self.cache_read_tokens} \\n\"\n        stats += f\"cache_creation_tokens={self.cache_creation_tokens} \\n\"\n        stats += f\"output_text_tokens={self.output_text_tokens} \\n\"\n        stats += f\"embedding_input_tokens={self.embedding_input_tokens} \\n\"\n        for outer_idx, physical_op_stats in enumerate(self.operator_stats.values()):\n            total_time = sum([op_stats.total_op_time for op_stats in physical_op_stats.values()])\n            total_cost = sum([op_stats.total_op_cost for op_stats in physical_op_stats.values()])\n            stats += f\"{outer_idx}. total_time={total_time} total_cost={total_cost} \\n\"\n            for inner_idx, op_stats in enumerate(physical_op_stats.values()):\n                stats += f\"    {outer_idx}.{inner_idx}. {op_stats.op_name} time={op_stats.total_op_time} cost={op_stats.total_op_cost} \\n\"\n        return stats\n\n\nclass ExecutionStats(BaseModel):\n    \"\"\"\n    Model for storing statistics captured for the entire execution of a workload.\n    \"\"\"\n\n    # string for identifying this workload execution\n    execution_id: str | None = None\n\n    # dictionary of SentinelPlanStats objects (one for each sentinel plan run during execution)\n    sentinel_plan_stats: dict[str, SentinelPlanStats] = Field(default_factory=dict)\n\n    # dictionary of PlanStats objects (one for each plan run during execution)\n    plan_stats: dict[str, PlanStats] = Field(default_factory=dict)\n\n    # total time spent optimizing\n    optimization_time: float = 0.0\n\n    # total cost of optimizing\n    optimization_cost: float = 0.0\n\n    # total time spent executing the optimized plan\n    plan_execution_time: float = 0.0\n\n    # total cost of executing the optimized plan\n    plan_execution_cost: float = 0.0\n\n    # total runtime for the entire execution\n    
total_execution_time: float = 0.0\n\n    # total cost for the entire execution\n    total_execution_cost: float = 0.0\n\n    # input text tokens processed\n    input_text_tokens: float = 0.0\n\n    # input audio tokens processed\n    input_audio_tokens: float = 0.0\n\n    # input image tokens processed\n    input_image_tokens: float = 0.0\n\n    # cache read tokens processed\n    cache_read_tokens: float = 0.0\n\n    # tokens written to cache\n    cache_creation_tokens: float = 0.0\n\n    # output text tokens generated\n    output_text_tokens: float = 0.0\n\n    # embedding input tokens processed\n    embedding_input_tokens: float = 0.0\n\n    # dictionary of sentinel plan strings; useful for printing executed sentinel plans in demos\n    sentinel_plan_strs: dict[str, str] = Field(default_factory=dict)\n\n    # dictionary of plan strings; useful for printing executed plans in demos\n    plan_strs: dict[str, str] = Field(default_factory=dict)\n\n    # start time for the execution; should be set by calling ExecutionStats.start()\n    start_time: float | None = None\n\n    # end time for the optimization;\n    optimization_end_time: float | None = None\n\n    def start(self) -> None:\n        \"\"\"Start the timer for this execution.\"\"\"\n        self.start_time = time.time()\n\n    def finish_optimization(self) -> None:\n        \"\"\"Finish the timer for the optimization phase of this execution.\"\"\"\n        if self.start_time is None:\n            raise RuntimeError(\"ExecutionStats.start() must be called before ExecutionStats.finish_optimization()\")\n\n        # compute optimization time and cost\n        self.optimization_end_time = time.time()\n        self.optimization_time = self.optimization_end_time - self.start_time\n        self.optimization_cost = self.sum_sentinel_plan_costs()\n\n        # compute sentinel_plan_strs\n        self.sentinel_plan_strs = {plan_id: plan_stats.plan_str for plan_id, plan_stats in self.sentinel_plan_stats.items()}\n\n    
def finish(self) -> None:\n        \"\"\"Finish the timer for this execution.\"\"\"\n        if self.start_time is None:\n            raise RuntimeError(\"ExecutionStats.start() must be called before ExecutionStats.finish()\")\n\n        # compute time for plan and total execution\n        end_time = time.time()\n        self.plan_execution_time = (\n            end_time - self.optimization_end_time\n            if self.optimization_end_time is not None\n            else end_time - self.start_time\n        )\n        self.total_execution_time = end_time - self.start_time\n\n        # compute the cost for plan and total execution\n        self.plan_execution_cost = self.sum_plan_costs()\n        self.total_execution_cost = self.optimization_cost + self.plan_execution_cost\n\n        # compute the tokens for total execution\n        self.input_text_tokens = self.sum_plan_stats_field(\"input_text_tokens\")\n        self.input_audio_tokens = self.sum_plan_stats_field(\"input_audio_tokens\")\n        self.input_image_tokens = self.sum_plan_stats_field(\"input_image_tokens\")\n        self.cache_read_tokens = self.sum_plan_stats_field(\"cache_read_tokens\")\n        self.cache_creation_tokens = self.sum_plan_stats_field(\"cache_creation_tokens\")\n        self.output_text_tokens = self.sum_plan_stats_field(\"output_text_tokens\")\n        self.embedding_input_tokens = self.sum_plan_stats_field(\"embedding_input_tokens\")\n\n        # compute plan_strs\n        self.plan_strs = {plan_id: plan_stats.plan_str for plan_id, plan_stats in self.plan_stats.items()}\n\n    def sum_plan_stats_field(self, field_name: str) -> float | int:\n        \"\"\"\n        Sum a given field across all PlanStats in this execution.\n        \"\"\"\n        sentinel_plan_field_sum = sum([plan_stats.sum_op_stats_field(field_name) + plan_stats.sum_validation_stats_field(field_name) for _, plan_stats in self.sentinel_plan_stats.items()])\n        plan_field_sum = 
sum([plan_stats.sum_op_stats_field(field_name) for _, plan_stats in self.plan_stats.items()])\n        return plan_field_sum + sentinel_plan_field_sum\n\n    def sum_sentinel_plan_costs(self) -> float:\n        \"\"\"\n        Sum the costs of all SentinelPlans in this execution.\n        \"\"\"\n        return sum([\n            plan_stats.sum_op_stats_field(\"total_op_cost\") + plan_stats.sum_validation_stats_field(\"cost_per_record\")\n            for _, plan_stats in self.sentinel_plan_stats.items()\n        ])\n\n    def sum_plan_costs(self) -> float:\n        \"\"\"\n        Sum the costs of all PhysicalPlans in this execution.\n        \"\"\"\n        return sum([plan_stats.sum_op_stats_field(\"total_op_cost\") for _, plan_stats in self.plan_stats.items()])\n\n    def add_plan_stats(self, plan_stats: PlanStats | SentinelPlanStats | list[PlanStats] | list[SentinelPlanStats]) -> None:\n        \"\"\"\n        Add the given PlanStats (or SentinelPlanStats) to this execution's plan stats.\n\n        NOTE: we make the assumption that the same plan cannot be run more than once in parallel,\n        i.e. each plan stats object for an individual plan comes from two different (sequential)\n        periods in time. 
Thus, PlanStats objects can be summed.\n        \"\"\"\n        # normalize input type to be list[PlanStats] or list[SentinelPlanStats]\n        if isinstance(plan_stats, (PlanStats, SentinelPlanStats)):\n            plan_stats = [plan_stats]\n\n        for plan_stats_obj in plan_stats:\n            if isinstance(plan_stats_obj, PlanStats) and plan_stats_obj.plan_id not in self.plan_stats:\n                self.plan_stats[plan_stats_obj.plan_id] = plan_stats_obj\n            elif isinstance(plan_stats_obj, PlanStats):\n                self.plan_stats[plan_stats_obj.plan_id] += plan_stats_obj\n            elif isinstance(plan_stats_obj, SentinelPlanStats) and plan_stats_obj.plan_id not in self.sentinel_plan_stats:\n                self.sentinel_plan_stats[plan_stats_obj.plan_id] = plan_stats_obj\n            elif isinstance(plan_stats_obj, SentinelPlanStats):\n                self.sentinel_plan_stats[plan_stats_obj.plan_id] += plan_stats_obj\n            else:\n                raise TypeError(f\"Cannot add {type(plan_stats_obj)} to ExecutionStats\")\n\n    def to_json(self, filepath: str | None = None) -> dict | None:\n        if filepath is None:\n            return self.model_dump(mode=\"json\")\n\n        with open(filepath, \"w\") as f:\n            json.dump(self.model_dump(mode=\"json\"), f)\n\n\nclass OperatorCostEstimates(BaseModel):\n    \"\"\"\n    Model for storing estimates of key metrics of interest for each operator.\n    \"\"\"\n\n    # (estimated) number of records output by this operator\n    cardinality: float\n\n    # (estimated) avg. 
time spent in this operator per-record\n    time_per_record: float\n\n    # (estimated) dollars spent per-record by this operator\n    cost_per_record: float\n\n    # (estimated) quality of the output from this operator\n    quality: float\n\n    # lower bound on cardinality\n    cardinality_lower_bound: float | None = None\n\n    # upper bound on cardinality\n    cardinality_upper_bound: float | None = None\n\n    # lower bound on time_per_record\n    time_per_record_lower_bound: float | None = None\n\n    # upper bound on time_per_record\n    time_per_record_upper_bound: float | None = None\n\n    # lower bound on cost_per_record\n    cost_per_record_lower_bound: float | None = None\n\n    # upper bound on cost_per_record\n    cost_per_record_upper_bound: float | None = None\n\n    # lower bound on quality\n    quality_lower_bound: float | None = None\n\n    # upper bound on quality\n    quality_upper_bound: float | None = None\n\n    def __rmul__(self, multiplier: float) -> OperatorCostEstimates:\n        \"\"\"\n        Multiply all fields by a scalar.\n        \"\"\"\n        dct = {field_name: getattr(self, field_name) * multiplier for field_name in self.model_fields}\n        return OperatorCostEstimates(**dct)\n\n    def model_post_init(self, __context: Any) -> None:\n        if self.cardinality_lower_bound is None and self.cardinality_upper_bound is None:\n            self.cardinality_lower_bound = self.cardinality\n            self.cardinality_upper_bound = self.cardinality\n\n        if self.time_per_record_lower_bound is None and self.time_per_record_upper_bound is None:\n            self.time_per_record_lower_bound = self.time_per_record\n            self.time_per_record_upper_bound = self.time_per_record\n\n        if self.cost_per_record_lower_bound is None and self.cost_per_record_upper_bound is None:\n            self.cost_per_record_lower_bound = self.cost_per_record\n            self.cost_per_record_upper_bound = self.cost_per_record\n\n        
if self.quality_lower_bound is None and self.quality_upper_bound is None:\n            self.quality_lower_bound = self.quality\n            self.quality_upper_bound = self.quality\n\n\nclass PlanCost(BaseModel):\n    \"\"\"\n    Model for storing the (cost, time, quality) estimates of (sub)-plans and their upper and lower bounds.\n    \"\"\"\n\n    # the expression cost\n    cost: float\n\n    # the expression runtime\n    time: float\n\n    # the expression quality\n    quality: float\n\n    # operator-specific cost estimates\n    op_estimates: OperatorCostEstimates | None = None\n\n    # lower bound on the expression cost\n    cost_lower_bound: float | None = None\n\n    # upper bound on the expression cost\n    cost_upper_bound: float | None = None\n\n    # lower bound on the expression time\n    time_lower_bound: float | None = None\n\n    # upper bound on the expression time\n    time_upper_bound: float | None = None\n\n    # lower bound on the expression quality\n    quality_lower_bound: float | None = None\n\n    # upper bound on the expression quality\n    quality_upper_bound: float | None = None\n\n    def __hash__(self):\n        return hash(f\"{self.cost}-{self.time}-{self.quality}\")\n\n    def __eq__(self, other: Any) -> bool:\n        if not isinstance(other, PlanCost):\n            return False\n        return (\n            self.cost == other.cost\n            and self.time == other.time\n            and self.quality == other.quality\n        )\n\n    def model_post_init(self, __context: Any) -> None:\n        if self.time_lower_bound is None and self.time_upper_bound is None:\n            self.time_lower_bound = self.time\n            self.time_upper_bound = self.time\n\n        if self.cost_lower_bound is None and self.cost_upper_bound is None:\n            self.cost_lower_bound = self.cost\n            self.cost_upper_bound = self.cost\n\n        if self.quality_lower_bound is None and self.quality_upper_bound is None:\n            
self.quality_lower_bound = self.quality\n            self.quality_upper_bound = self.quality\n\n    def join_add(self, left_plan_cost: PlanCost, right_plan_cost: PlanCost, execution_strategy: str = \"parallel\") -> PlanCost:\n        \"\"\"\n        Add the PlanCost objects for two joined plans (left_plan_cost and right_plan_cost)\n        to the PlanCost object for the join operator. The execution strategy determines how\n        the input times are combined. If the execution strategy is \"parallel\", the input time\n        is the maximum of the two times. If the execution strategy is \"sequential\" (which is\n        currently anything else), the input time is the sum of the two times.\n\n        For quality, we compute the product of the operator quality with the average of the\n        two input qualities.\n\n        NOTE: we currently assume the updating of the op_estimates is handled by the caller\n        as there is not a universally correct meaning of addition of op_estimates.\n        \"\"\"\n        dct = {}\n        for model_field in [\"cost\", \"cost_lower_bound\", \"cost_upper_bound\"]:\n            op_field_value = getattr(self, model_field)\n            left_plan_field_value = getattr(left_plan_cost, model_field)\n            right_plan_field_value = getattr(right_plan_cost, model_field)\n            if op_field_value is not None and left_plan_field_value is not None and right_plan_field_value is not None:\n                dct[model_field] = op_field_value + left_plan_field_value + right_plan_field_value\n\n        for model_field in [\"time\", \"time_lower_bound\", \"time_upper_bound\"]:\n            op_field_value = getattr(self, model_field)\n            left_plan_field_value = getattr(left_plan_cost, model_field)\n            right_plan_field_value = getattr(right_plan_cost, model_field)\n            if op_field_value is not None and left_plan_field_value is not None and right_plan_field_value is not None:\n                if 
execution_strategy == \"parallel\":\n                    dct[model_field] = op_field_value + max(left_plan_field_value, right_plan_field_value)\n                else:\n                    dct[model_field] = op_field_value + left_plan_field_value + right_plan_field_value\n\n        for model_field in [\"quality\", \"quality_lower_bound\", \"quality_upper_bound\"]:\n            op_field_value = getattr(self, model_field)\n            left_plan_field_value = getattr(left_plan_cost, model_field)\n            right_plan_field_value = getattr(right_plan_cost, model_field)\n            if op_field_value is not None and left_plan_field_value is not None and right_plan_field_value is not None:\n                dct[model_field] = op_field_value * ((left_plan_field_value + right_plan_field_value) / 2.0)\n\n        return PlanCost(**dct)\n\n    def __iadd__(self, other: PlanCost) -> PlanCost:\n        \"\"\"\n        NOTE: we currently assume the updating of the op_estimates is handled by the caller\n        as there is not a universally correct meaning of addition of op_estimates.\n        \"\"\"\n        self.cost += other.cost\n        self.time += other.time\n        self.quality *= other.quality\n        for model_field in [\"cost_lower_bound\", \"cost_upper_bound\", \"time_lower_bound\", \"time_upper_bound\"]:\n            if getattr(self, model_field) is not None and getattr(other, model_field) is not None:\n                summation = getattr(self, model_field) + getattr(other, model_field)\n                setattr(self, model_field, summation)\n\n        for model_field in [\"quality_lower_bound\", \"quality_upper_bound\"]:\n            if getattr(self, model_field) is not None and getattr(other, model_field) is not None:\n                product = getattr(self, model_field) * getattr(other, model_field)\n                setattr(self, model_field, product)\n\n        return self\n\n    def __add__(self, other: PlanCost) -> PlanCost:\n        \"\"\"\n        NOTE: we 
currently assume the updating of the op_estimates is handled by the caller\n        as there is not a universally correct meaning of addition of op_estimates.\n        \"\"\"\n        dct = {\n            field: getattr(self, field) + getattr(other, field)\n            for field in [\n                \"cost\",\n                \"cost_lower_bound\",\n                \"cost_upper_bound\",\n                \"time\",\n                \"time_lower_bound\",\n                \"time_upper_bound\",\n            ]\n        }\n        for model_field in [\"quality\", \"quality_lower_bound\", \"quality_upper_bound\"]:\n            dct[model_field] = getattr(self, model_field) * getattr(other, model_field)\n\n        return PlanCost(**dct)\n"
  },
  {
    "path": "src/palimpzest/policy.py",
    "content": "from __future__ import annotations\n\nimport json\n\nfrom palimpzest.core.models import PlanCost\n\n\ndef construct_policy_from_kwargs(**kwargs) -> Policy | None:\n    \"\"\"\n    Construct and return a policy object which is defined by the keyword arguments.\n\n    This function accepts the following keyword arguments:\n    - max_quality\n    - min_cost\n    - min_time\n    - cost_budget\n    - time_budget\n    - quality_threshold\n\n    If none of these keyword arguments are provided, the function will return None.\n    \"\"\"\n    # compute the number of objectives and constraints in the kwargs\n    num_objectives = sum([bool(kwargs.get(key, False)) for key in [\"max_quality\", \"min_cost\", \"min_time\"]])\n    num_constraints = sum([bool(kwargs.get(key, False)) for key in [\"cost_budget\", \"time_budget\", \"quality_threshold\"]])\n\n    # if there are no policy kwargs provided, return None\n    if num_objectives == 0 and num_constraints == 0:\n        return None\n\n    # Otherwise, assert that kwargs are valid\n    assert num_objectives == 1, \"Must optimize for one of max_quality, min_cost, or min_time.\"\n    assert num_constraints <= 1, \"Currently, PZ supports at most one constraint.\"\n\n    # print warning if user tries to set a constraint and optimization goal on the same metric\n    if \"max_quality\" in kwargs and \"quality_threshold\" in kwargs:\n        print(\"Warning: Setting a constraint on quality and optimizing for quality is redundant.\")\n\n    if \"min_cost\" in kwargs and \"cost_budget\" in kwargs:\n        print(\"Warning: Setting a constraint on cost and optimizing for cost is redundant.\")\n\n    if \"min_time\" in kwargs and \"time_budget\" in kwargs:\n        print(\"Warning: Setting a constraint on time and optimizing for time is redundant.\")\n\n    # construct the policy object\n    policy = None\n    if \"max_quality\" in kwargs and \"cost_budget\" in kwargs:\n        policy = 
MaxQualityAtFixedCost(kwargs[\"cost_budget\"])\n    elif \"max_quality\" in kwargs and \"time_budget\" in kwargs:\n        policy = MaxQualityAtFixedTime(kwargs[\"time_budget\"])\n    elif \"max_quality\" in kwargs:\n        policy = MaxQuality()\n    elif \"min_cost\" in kwargs and \"quality_threshold\" in kwargs:\n        policy = MinCostAtFixedQuality(kwargs[\"quality_threshold\"])\n    elif \"min_cost\" in kwargs:\n        policy = MinCost()\n    elif \"min_time\" in kwargs and \"quality_threshold\" in kwargs:\n        policy = MinTimeAtFixedQuality(kwargs[\"quality_threshold\"])\n    elif \"min_time\" in kwargs:\n        policy = MinTime()\n\n    return policy\n\n\nclass Policy:\n    \"\"\"\n    Base class for a policy. Each policy has two methods: constraint() and choose().\n    The first method determines whether the given cost, runtime, and quality for a plan\n    (or sub-plan) satisfy the policy's constraint(s). The second method takes in the\n    (cost, runtime, quality) tuples for two plans (or subplans) and returns True if the\n    first plan is better than the second one and False otherwise.\n    \"\"\"\n\n    def __init__(self):\n        pass\n\n    def get_primary_metric(self) -> str:\n        \"\"\"\n        Returns one of [\"cost\", \"time\", \"quality\"]; whichever corresponds to the\n        maximization / minimization goal of the policy.\n\n        Eventually we may make policies more general by allowing users to optimize\n        some function: f(cost, time, quality). 
In that case, we may deprecate this\n        method and update its callers.\n        \"\"\"\n        raise NotImplementedError(\"Calling this method from an abstract base class.\")\n\n    def get_dict(self) -> dict:\n        \"\"\"\n        Returns a dict representation of the policy which specifies how much weight\n        (in [0,1]) should be given to each metric.\n        \"\"\"\n        raise NotImplementedError(\"Calling this method from an abstract base class.\")\n\n    def constraint(self, plan: PlanCost) -> bool:\n        \"\"\"\n        Return True if the given (cost, runtime, quality) for a plan (or subplan)\n        satisfy the policy's constraint(s). Otherwise, return False.\n        \"\"\"\n        raise NotImplementedError(\"Calling this method from an abstract base class.\")\n\n    def choose(self, plan: PlanCost, other_plan: PlanCost) -> float:\n        \"\"\"\n        Return True if plan is better than other_plan and return False otherwise.\n        \"\"\"\n        raise NotImplementedError(\"Calling this method from an abstract base class.\")\n\n    def to_json_str(self) -> str:\n        \"\"\"Serialize the policy's class name and metric weights to a JSON string.\"\"\"\n        return json.dumps({\n            \"type\": self.__class__.__name__,\n            \"config\": self.get_dict()\n        }, indent=2)\n\n\nclass MaxQuality(Policy):\n    \"\"\"\n    This policy has no constraints and computes the best plan as the one with\n    the higher quality.\n    \"\"\"\n\n    def __str__(self):\n        return \"Maximum Quality\"\n\n    def get_primary_metric(self) -> str:\n        return \"quality\"\n\n    def get_dict(self) -> dict:\n        return {\"cost\": 0.0, \"time\": 0.0, \"quality\": 1.0}\n\n    def constraint(self, plan: PlanCost) -> bool:\n        \"\"\"There is no constraint.\"\"\"\n        return True\n\n    def choose(self, plan: PlanCost, other_plan: PlanCost) -> float:\n        \"\"\"\n        Return True if plan has higher quality than 
other_plan and return False otherwise.\n        Use cost and then runtime as tiebreakers.\n        \"\"\"\n        if plan.quality == other_plan.quality:\n            if plan.cost == other_plan.cost:\n                return plan.time < other_plan.time\n            return plan.cost < other_plan.cost\n\n        return plan.quality > other_plan.quality\n\n\nclass MinCost(Policy):\n    \"\"\"\n    This policy has no constraints and computes the best plan as the one with\n    the lower cost.\n    \"\"\"\n\n    def __str__(self):\n        return \"Minimum Cost\"\n\n    def get_primary_metric(self) -> str:\n        return \"cost\"\n\n    def get_dict(self) -> dict:\n        return {\"cost\": 1.0, \"time\": 0.0, \"quality\": 0.0}\n\n    def constraint(self, plan: PlanCost) -> bool:\n        \"\"\"There is no constraint.\"\"\"\n        return True\n\n    def choose(self, plan: PlanCost, other_plan: PlanCost) -> bool:\n        \"\"\"\n        Return True if plan has lower cost than other_plan and return False otherwise.\n        Use quality and then runtime as tiebreakers.\n        \"\"\"\n        if plan.cost == other_plan.cost:\n            if plan.quality == other_plan.quality:\n                return plan.time < other_plan.time\n            return plan.quality > other_plan.quality\n\n        return plan.cost < other_plan.cost\n\n\nclass MinTime(Policy):\n    \"\"\"\n    This policy has no constraints and computes the best plan as the one with\n    the lower runtime.\n    \"\"\"\n\n    def __str__(self):\n        return \"Minimum Time\"\n\n    def get_primary_metric(self) -> str:\n        return \"time\"\n\n    def get_dict(self) -> dict:\n        return {\"cost\": 0.0, \"time\": 1.0, \"quality\": 0.0}\n\n    def constraint(self, plan: PlanCost) -> bool:\n        \"\"\"There is no constraint.\"\"\"\n        return True\n\n    def choose(self, plan: PlanCost, other_plan: PlanCost) -> bool:\n        \"\"\"\n        Return True if plan has lower runtime than other_plan and 
return False otherwise.\n        Use quality and then cost as tiebreakers.\n        \"\"\"\n        if plan.time == other_plan.time:\n            if plan.quality == other_plan.quality:\n                return plan.cost < other_plan.cost\n            return plan.quality > other_plan.quality\n\n        return plan.time < other_plan.time\n\n\nclass MaxQualityAtFixedCost(Policy):\n    \"\"\"\n    This policy applies a constraint (upper bound) on the cost of the plan\n    and tries to maximize quality subject to that constraint.\n    \"\"\"\n\n    def __init__(self, max_cost: float):\n        self.max_cost = max_cost\n\n    def __str__(self):\n        return \"MaxQuality@FixedCost\"\n\n    def get_primary_metric(self) -> str:\n        return \"quality\"\n\n    def get_dict(self) -> dict:\n        return {\"cost\": 0.5, \"time\": 0.0, \"quality\": 0.5}\n\n    def constraint(self, plan: PlanCost) -> bool:\n        return plan.cost < self.max_cost\n\n    def choose(self, plan: PlanCost, other_plan: PlanCost) -> bool:\n        \"\"\"\n        Return True if plan has higher quality than other_plan and return False otherwise.\n        Use cost and then runtime as tiebreakers.\n        \"\"\"\n        if plan.quality == other_plan.quality:\n            if plan.cost == other_plan.cost:\n                return plan.time < other_plan.time\n            return plan.cost < other_plan.cost\n\n        return plan.quality > other_plan.quality\n\n\nclass MaxQualityAtFixedTime(Policy):\n    \"\"\"\n    This policy applies a constraint (upper bound) on the runtime of the plan\n    and tries to maximize quality subject to that constraint.\n    \"\"\"\n\n    def __init__(self, max_time: float):\n        self.max_time = max_time\n\n    def __str__(self):\n        return \"MaxQuality@FixedTime\"\n\n    def get_primary_metric(self) -> str:\n        return \"quality\"\n\n    def get_dict(self) -> dict:\n        return {\"cost\": 0.0, \"time\": 0.5, \"quality\": 0.5}\n\n    def 
constraint(self, plan: PlanCost) -> bool:\n        return plan.time < self.max_time\n\n    def choose(self, plan: PlanCost, other_plan: PlanCost) -> bool:\n        \"\"\"\n        Return True if plan has higher quality than other_plan and return False otherwise.\n        Use runtime and then cost as tiebreakers.\n        \"\"\"\n        if plan.quality == other_plan.quality:\n            if plan.time == other_plan.time:\n                return plan.cost < other_plan.cost\n            return plan.time < other_plan.time\n\n        return plan.quality > other_plan.quality\n\n\nclass MinCostAtFixedQuality(Policy):\n    \"\"\"\n    This policy applies a constraint (lower bound) on the quality of the plan\n    and tries to minimize cost subject to that constraint.\n    \"\"\"\n\n    def __init__(self, min_quality: float):\n        self.min_quality = min_quality\n\n    def __str__(self):\n        return \"MinCost@FixedQuality\"\n\n    def get_primary_metric(self) -> str:\n        return \"cost\"\n\n    def get_dict(self) -> dict:\n        return {\"cost\": 0.5, \"time\": 0.0, \"quality\": 0.5}\n\n    def constraint(self, plan: PlanCost) -> bool:\n        return plan.quality > self.min_quality\n\n    def choose(self, plan: PlanCost, other_plan: PlanCost) -> bool:\n        \"\"\"\n        Return True if plan has lower cost than other_plan and return False otherwise.\n        Use quality and then runtime as tiebreakers.\n        \"\"\"\n        if plan.cost == other_plan.cost:\n            if plan.quality == other_plan.quality:\n                return plan.time < other_plan.time\n            return plan.quality > other_plan.quality\n\n        return plan.cost < other_plan.cost\n\n\nclass MinTimeAtFixedQuality(Policy):\n    \"\"\"\n    This policy applies a constraint (lower bound) on the quality of the plan\n    and tries to minimize runtime subject to that constraint.\n    \"\"\"\n\n    def __init__(self, min_quality: float):\n        self.min_quality = 
min_quality\n\n    def __str__(self):\n        return \"MinTime@FixedQuality\"\n\n    def get_primary_metric(self) -> str:\n        return \"time\"\n\n    def get_dict(self) -> dict:\n        return {\"cost\": 0.0, \"time\": 0.5, \"quality\": 0.5}\n\n    def constraint(self, plan: PlanCost) -> bool:\n        return plan.quality > self.min_quality\n\n    def choose(self, plan: PlanCost, other_plan: PlanCost) -> bool:\n        \"\"\"\n        Return True if plan has lower runtime than other_plan and return False otherwise.\n        Use quality and then cost as tiebreakers.\n        \"\"\"\n        if plan.time == other_plan.time:\n            if plan.quality == other_plan.quality:\n                return plan.cost < other_plan.cost\n            return plan.quality > other_plan.quality\n\n        return plan.time < other_plan.time\n"
  },
  {
    "path": "src/palimpzest/prompts/__init__.py",
    "content": "from palimpzest.prompts.agent_prompts import (\n    CODE_AGENT_SYSTEM_PROMPT,\n    DATA_DISCOVERY_AGENT_INITIAL_PLAN_PROMPT,\n    DATA_DISCOVERY_AGENT_REPORT_PROMPT,\n    DATA_DISCOVERY_AGENT_TASK_PROMPT,\n    DATA_DISCOVERY_AGENT_UPDATE_PLAN_POST_MESSAGES_PROMPT,\n    DATA_DISCOVERY_AGENT_UPDATE_PLAN_PRE_MESSAGES_PROMPT,\n    FINAL_ANSWER_POST_MESSAGES_PROMPT,\n    FINAL_ANSWER_PRE_MESSAGES_PROMPT,\n)\nfrom palimpzest.prompts.context_search import CONTEXT_SEARCH_PROMPT\nfrom palimpzest.prompts.prompt_factory import PromptFactory\nfrom palimpzest.prompts.prompt_manager import PromptManager\nfrom palimpzest.prompts.utils import (\n    ONE_TO_MANY_OUTPUT_FORMAT_INSTRUCTION,\n    ONE_TO_ONE_OUTPUT_FORMAT_INSTRUCTION,\n)\nfrom palimpzest.prompts.validator import (\n    FLAT_MAP_IMAGE_VALIDATOR_PROMPT,\n    FLAT_MAP_VALIDATOR_PROMPT,\n    MAP_IMAGE_VALIDATOR_PROMPT,\n    MAP_VALIDATOR_PROMPT,\n    RETRIEVE_VALIDATOR_PROMPT,\n)\n\n__all__ = [\n    # agent prompts\n    \"CODE_AGENT_SYSTEM_PROMPT\",\n    \"DATA_DISCOVERY_AGENT_INITIAL_PLAN_PROMPT\",\n    \"DATA_DISCOVERY_AGENT_REPORT_PROMPT\",\n    \"DATA_DISCOVERY_AGENT_TASK_PROMPT\",\n    \"DATA_DISCOVERY_AGENT_UPDATE_PLAN_POST_MESSAGES_PROMPT\",\n    \"DATA_DISCOVERY_AGENT_UPDATE_PLAN_PRE_MESSAGES_PROMPT\",\n    \"FINAL_ANSWER_POST_MESSAGES_PROMPT\",\n    \"FINAL_ANSWER_PRE_MESSAGES_PROMPT\",\n    # context search\n    \"CONTEXT_SEARCH_PROMPT\",\n    # prompt cache\n    \"PromptManager\",\n    # prompt factory\n    \"PromptFactory\",\n    # utils\n    \"ONE_TO_MANY_OUTPUT_FORMAT_INSTRUCTION\",\n    \"ONE_TO_ONE_OUTPUT_FORMAT_INSTRUCTION\",\n    # validator\n    \"FLAT_MAP_IMAGE_VALIDATOR_PROMPT\",\n    \"FLAT_MAP_VALIDATOR_PROMPT\",\n    \"MAP_IMAGE_VALIDATOR_PROMPT\",\n    \"MAP_VALIDATOR_PROMPT\",\n    \"RETRIEVE_VALIDATOR_PROMPT\",\n]\n"
  },
  {
    "path": "src/palimpzest/prompts/agent_prompts.py",
    "content": "CODE_AGENT_SYSTEM_PROMPT = \"\"\"You are an expert assistant who can solve any task using code blobs. You will be given a task to solve as best you can.\nTo do so, you have been given access to a list of tools: these tools are basically Python functions which you can call with code.\nTo solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', '<code>', and 'Observation:' sequences.\n\nAt each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task and the tools that you want to use.\nThen in the '<code>' sequence, you should write the code in simple Python. The code sequence must end with '</code>' sequence.\nDuring each intermediate step, you can use 'print()' to save whatever important information you will then need.\nThese print outputs will then appear in the 'Observation:' field, which will be available as input for the next step.\nIn the end you have to return a final answer using the `final_answer` tool.\n\nHere are a few examples using notional tools:\n---\nTask: \"Generate an image of the oldest person in this document.\"\n\nThought: I will proceed step by step and use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.\n<code>\nanswer = document_qa(document=document, question=\"Who is the oldest person mentioned?\")\nprint(answer)\n</code>\nObservation: \"The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland.\"\n\nThought: I will now generate an image showcasing the oldest person.\n<code>\nimage = image_generator(\"A portrait of John Doe, a 55-year-old man living in Canada.\")\nfinal_answer(image)\n</code>\n\n---\nTask: \"What is the result of the following operation: 5 + 3 + 1294.678?\"\n\nThought: I will use python code to compute the result of the operation and then return the final answer using the `final_answer` 
tool\n<code>\nresult = 5 + 3 + 1294.678\nfinal_answer(result)\n</code>\n\n---\nTask:\n\"Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French.\nYou have been provided with these additional arguments, that you can access using the keys as variables in your python code:\n{'question': 'Quel est l'animal sur l'image?', 'image': 'path/to/image.jpg'}\"\n\nThought: I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.\n<code>\ntranslated_question = translator(question=question, src_lang=\"French\", tgt_lang=\"English\")\nprint(f\"The translated question is {translated_question}.\")\nanswer = image_qa(image=image, question=translated_question)\nfinal_answer(f\"The answer is {answer}\")\n</code>\n\n---\nTask:\nIn a 1979 interview, Stanislaus Ulam discusses with Martin Sherwin about other great physicists of his time, including Oppenheimer.\nWhat does he say was the consequence of Einstein learning too much math on his creativity, in one word?\n\nThought: I need to find and read the 1979 interview of Stanislaus Ulam with Martin Sherwin.\n<code>\npages = web_search(query=\"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\")\nprint(pages)\n</code>\nObservation:\nNo result found for query \"1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein\".\n\nThought: The query was maybe too restrictive and did not find any results. 
Let's try again with a broader query.\n<code>\npages = web_search(query=\"1979 interview Stanislaus Ulam\")\nprint(pages)\n</code>\nObservation:\nFound 6 pages:\n[Stanislaus Ulam 1979 interview](https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/)\n\n[Ulam discusses Manhattan Project](https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/)\n\n(truncated)\n\nThought: I will read the first 2 pages to know more.\n<code>\nfor url in [\"https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/\", \"https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/\"]:\n    whole_page = visit_webpage(url)\n    print(whole_page)\n    print(\"\\n\" + \"=\"*80 + \"\\n\")  # Print separator between pages\n</code>\nObservation:\nManhattan Project Locations:\nLos Alamos, NM\nStanislaus Ulam was a Polish-American mathematician. He worked on the Manhattan Project at Los Alamos and later helped design the hydrogen bomb. In this interview, he discusses his work at\n(truncated)\n\nThought: I now have the final answer: from the webpages visited, Stanislaus Ulam says of Einstein: \"He learned too much mathematics and sort of diminished, it seems to me personally, it seems to me his purely physics creativity.\" Let's answer in one word.\n<code>\nfinal_answer(\"diminished\")\n</code>\n\n---\nTask: \"Which city has the highest population: Guangzhou or Shanghai?\"\n\nThought: I need to get the populations for both cities and compare them: I will use the tool `web_search` to get the population of both cities.\n<code>\nfor city in [\"Guangzhou\", \"Shanghai\"]:\n    print(f\"Population {city}:\", web_search(f\"{city} population\"))\n</code>\nObservation:\nPopulation Guangzhou: ['Guangzhou has a population of 15 million inhabitants as of 2021.']\nPopulation Shanghai: '26 million (2019)'\n\nThought: Now I know that Shanghai has the highest population.\n<code>\nfinal_answer(\"Shanghai\")\n</code>\n\n---\nTask: 
\"What is the current age of the pope, raised to the power 0.36?\"\n\nThought: I will use the tool `wikipedia_search` to get the age of the pope, and confirm that with a web search.\n<code>\npope_age_wiki = wikipedia_search(query=\"current pope age\")\nprint(\"Pope age as per wikipedia:\", pope_age_wiki)\npope_age_search = web_search(query=\"current pope age\")\nprint(\"Pope age as per google search:\", pope_age_search)\n</code>\nObservation:\nPope age: \"The pope Francis is currently 88 years old.\"\n\nThought: I know that the pope is 88 years old. Let's compute the result using python code.\n<code>\npope_current_age = 88 ** 0.36\nfinal_answer(pope_current_age)\n</code>\n\nThe above examples used notional tools that might not exist for you. On top of performing computations in the Python code snippets that you create, you only have access to these tools, behaving like regular python functions:\n```python\n{%- for tool in tools.values() %}\ndef {{ tool.name }}({% for arg_name, arg_info in tool.inputs.items() %}{{ arg_name }}: {{ arg_info.type }}{% if not loop.last %}, {% endif %}{% endfor %}) -> {{tool.output_type}}:\n    \\\"\\\"\\\"{{ tool.description }}\n\n    Args:\n    {%- for arg_name, arg_info in tool.inputs.items() %}\n        {{ arg_name }}: {{ arg_info.description }}\n    {%- endfor %}\n    \\\"\\\"\\\"\n{% endfor %}\n```\n\n{%- if managed_agents and managed_agents.values() | list %}\nYou can also give tasks to team members.\nCalling a team member works similarly to calling a tool: provide the task description as the 'task' argument. 
Since this team member is a real human, be as detailed and verbose as necessary in your task description.\nYou can also include any relevant variables or context using the 'additional_args' argument.\nHere is a list of the team members that you can call:\n```python\n{%- for agent in managed_agents.values() %}\ndef {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:\n    \\\"\\\"\\\"{{ agent.description }}\n\n    Args:\n        task: Long detailed description of the task.\n        additional_args: Dictionary of extra inputs to pass to the managed agent, e.g. images, dataframes, or any other contextual data it may need.\n    \\\"\\\"\\\"\n{% endfor %}\n```\n{%- endif %}\n\nHere are the rules you should always follow to solve your task:\n1. Always provide a 'Thought:' sequence, and a '<code>' sequence ending with '</code>', else you will fail.\n2. Use only variables that you have defined!\n3. Always use the right arguments for the tools. DO NOT pass the arguments as a dict as in 'answer = wikipedia_search({'query': \"What is the place where James Bond lives?\"})', but use the arguments directly as in 'answer = wikipedia_search(query=\"What is the place where James Bond lives?\")'.\n4. Take care to not chain too many sequential tool calls in the same code block, especially when the output format is unpredictable. For instance, a call to wikipedia_search has an unpredictable return format, so do not have another tool call that depends on its output in the same block: rather output results with print() to use them in the next block.\n5. Call a tool only when needed, and never re-do a tool call that you previously did with the exact same parameters.\n6. Don't name any new variable with the same name as a tool: for instance don't name a variable 'final_answer'.\n7. Never create any notional variables in our code, as having these in your logs will derail you from the true variables.\n8. 
You can use imports in your code, but only from the following list of modules: {{authorized_imports}}\n9. The state persists between code executions: so if in one step you've created variables or imported modules, these will all persist.\n10. Don't give up! You're in charge of solving the task, not providing directions to solve it.\n\n{%- if custom_instructions %}\n{{custom_instructions}}\n{%- endif %}\n\nNow Begin!\"\"\"\n\n\nDATA_DISCOVERY_AGENT_INITIAL_PLAN_PROMPT = \"\"\"You are a world expert at analyzing a situation to derive facts, and plan accordingly towards solving a task.\nBelow I will present you a task. You will need to 1. build a survey of facts known or needed to solve the task, then 2. make a plan of action to solve the task.\n\n## 1. Facts survey\nYou will build a comprehensive preparatory survey of which facts we have at our disposal and which ones we still need.\nThese \"facts\" will typically be specific names, dates, values, etc. Be sure to report the facts you look up and derive, as well as the sources where you find these facts.\nYour answer should use the below headings:\n### 1.1. Facts given in the task\nList here the specific facts given in the task that could help you (there might be nothing here).\n\n### 1.2. Facts to look up\nList here any facts that we may need to look up.\nAlso list where to find each of these, for instance a website, a file... - maybe the task contains some sources that you should re-use here.\n\n### 1.3. Facts to derive\nList here anything that we want to derive from the above by logical reasoning, for instance computation or simulation.\n\n### 1.4. Fact Sources\nList the source of each fact that you have looked up or derived. The source may be a file, a database table, a web page, etc.\nBe sure that someone can use these sources to reproduce your work. \n\nDon't make any assumptions. For each item, provide a thorough reasoning. Do not add anything else on top of the four headings above.\n\n## 2. 
Plan\nThen for the given task, develop a step-by-step high-level plan taking into account the above inputs and list of facts.\nThis plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.\nDo not skip steps, do not add any superfluous steps. Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.\nAfter writing the final step of the plan, write the '<end_plan>' tag and stop there.\n\nYou can leverage these tools, behaving like regular python functions:\n```python\n{%- for tool in tools.values() %}\ndef {{ tool.name }}({% for arg_name, arg_info in tool.inputs.items() %}{{ arg_name }}: {{ arg_info.type }}{% if not loop.last %}, {% endif %}{% endfor %}) -> {{tool.output_type}}:\n    \\\"\\\"\\\"{{ tool.description }}\n\n    Args:\n    {%- for arg_name, arg_info in tool.inputs.items() %}\n        {{ arg_name }}: {{ arg_info.description }}\n    {%- endfor %}\n    \\\"\\\"\\\"\n{% endfor %}\n```\n\nThe tools you have been given will provide you with access to a dataset with the following description:\n\nContext: {{context_description}}\\n\n\n{%- if managed_agents and managed_agents.values() | list %}\nYou can also give tasks to team members.\nCalling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\nYou can also include any relevant variables or context using the 'additional_args' argument.\nHere is a list of the team members that you can call:\n```python\n{%- for agent in managed_agents.values() %}\ndef {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:\n    \\\"\\\"\\\"{{ agent.description }}\n\n    Args:\n        task: Long detailed description of the task.\n        additional_args: Dictionary of extra inputs to pass to the managed agent, e.g. 
images, dataframes, or any other contextual data it may need.\n    \\\"\\\"\\\"\n{% endfor %}\n```\n{%- endif %}\n\n---\nNow begin! Here is your task:\n```\nsearch: {{task}}\n```\nFirst in part 1, write the facts survey, then in part 2, write your plan.\n\"\"\"\n\nDATA_DISCOVERY_AGENT_UPDATE_PLAN_PRE_MESSAGES_PROMPT = \"\"\"You are a world expert at analyzing a situation and planning accordingly towards solving a task.\nYou have been given the following task:\n```\nsearch: {{task}}\n```\nYou are working with the following dataset:\n```\ncontext: {{context_description}}\n```\n\nBelow you will find a history of attempts made to solve this task.\nYou will first have to produce a survey of known and unknown facts, then propose a step-by-step high-level plan to solve the task.\nIf the previous tries so far have met some success, your updated plan can build on these results.\nIf you are stalled, you can make a completely new plan starting from scratch.\n\nFind the task and history below:\n\"\"\"\n\nDATA_DISCOVERY_AGENT_UPDATE_PLAN_POST_MESSAGES_PROMPT = \"\"\"Now write your updated facts below, taking into account the above history:\n## 1. Updated facts survey\n### 1.1. Facts given in the task\n### 1.2. Facts that we have learned\n### 1.3. Facts still to look up\n### 1.4. Facts still to derive\n### 1.5. Fact Sources\n\nThen write a step-by-step high-level plan to solve the task above.\n## 2. Plan\n### 2.1. ...\nEtc.\nThis plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.\nBeware that you have {remaining_steps} steps remaining.\nDo not skip steps, do not add any superfluous steps. 
Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.\nAfter writing the final step of the plan, write the '<end_plan>' tag and stop there.\n\nYou can leverage these tools, behaving like regular python functions:\n```python\n{%- for tool in tools.values() %}\ndef {{ tool.name }}({% for arg_name, arg_info in tool.inputs.items() %}{{ arg_name }}: {{ arg_info.type }}{% if not loop.last %}, {% endif %}{% endfor %}) -> {{tool.output_type}}:\n    \\\"\\\"\\\"{{ tool.description }}\n\n    Args:\n    {%- for arg_name, arg_info in tool.inputs.items() %}\n        {{ arg_name }}: {{ arg_info.description }}\n    {%- endfor %}\\\"\\\"\\\"\n{% endfor %}\n```\n\nThe tools you have been given will provide you with access to a dataset with the following description:\n\nContext: {{context_description}}\n\n{%- if managed_agents and managed_agents.values() | list %}\nYou can also give tasks to team members.\nCalling a team member works similarly to calling a tool: provide the task description as the 'task' argument. Since this team member is a real human, be as detailed and verbose as necessary in your task description.\nYou can also include any relevant variables or context using the 'additional_args' argument.\nHere is a list of the team members that you can call:\n```python\n{%- for agent in managed_agents.values() %}\ndef {{ agent.name }}(task: str, additional_args: dict[str, Any]) -> str:\n    \\\"\\\"\\\"{{ agent.description }}\n\n    Args:\n        task: Long detailed description of the task.\n        additional_args: Dictionary of extra inputs to pass to the managed agent, e.g. 
images, dataframes, or any other contextual data it may need.\n    \\\"\\\"\\\"\n{% endfor %}\n```\n{%- endif %}\n\nNow write your updated facts survey below, then your new plan.\n\"\"\"\n\nDATA_DISCOVERY_AGENT_TASK_PROMPT = \"\"\"You're a helpful agent named '{{name}}'.\nYou have been submitted this task by your manager.\n---\nTask:\n{{task}}\n---\nYou have also been given access to tools which will help you navigate a dataset with the following description:\n---\nContext:\n{{context_description}}\n---\nYou're helping your manager solve a search task: so make sure to not provide a one-line answer, but give as much information as possible to give them a clear understanding of the answer and the context in which you produced it. In particular, be sure to report what source(s) you use and what the contents of those source(s) contain, even if they are irrelevant for solving this task.\n\nYour final_answer WILL HAVE to contain these parts:\n### 1. Task outcome (short version):\n### 2. Task outcome (extremely detailed version):\n### 3. Additional context (if relevant):\n\nPut all these in your final_answer tool, everything that you do not pass as an argument to final_answer will be lost.\nAnd even if your task resolution is not successful, please return as much context as possible, so that your manager can act upon this feedback.\n\"\"\"\n\nDATA_DISCOVERY_AGENT_REPORT_PROMPT = \"\"\"Here is the final answer from your managed agent '{{name}}':\n{{final_answer}}\"\"\"\n\nFINAL_ANSWER_PRE_MESSAGES_PROMPT = \"\"\"An agent tried to answer a user query but it got stuck and failed to do so. You are tasked with providing an answer instead. Here is the agent's memory:\"\"\"\n\nFINAL_ANSWER_POST_MESSAGES_PROMPT = \"\"\"Based on the above, please provide an answer to the following user task:\n{{task}}\n\"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/aggregate_prompts.py",
    "content": "\"\"\"This file contains prompts for aggregation operations.\"\"\"\n\n### BASE PROMPTS ###\nAGG_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and an output field to generate. Your task is to generate a JSON object which aggregates the input and fills in the output field with the correct value.\nYou will be provided with a description of each input field and each output field. The field in the output JSON object can be derived using information from the context.\n\n{output_format_instruction} Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nINPUT FIELDS:\n{example_input_fields}\n\nOUTPUT FIELDS:\n{example_output_fields}\n\nCONTEXT:\n{{{example_context}}}\n{{{second_example_context}}}\n{{{third_example_context}}}{image_disclaimer}{audio_disclaimer}\n\nAGGREGATION INSTRUCTION: {example_agg_instruction}\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: {example_reasoning}\n\nANSWER:\n{{{example_answer}}}\n---\n\"\"\"\n\nAGG_NO_REASONING_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and an output field to generate. Your task is to generate a JSON object which aggregates the input and fills in the output field with the correct value.\nYou will be provided with a description of each input field and each output field. 
The field in the output JSON object can be derived using information from the context.\n\n{output_format_instruction} Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nINPUT FIELDS:\n{example_input_fields}\n\nOUTPUT FIELDS:\n{example_output_fields}\n\nCONTEXT:\n{{{example_context}}}\n{{{second_example_context}}}\n{{{third_example_context}}}{image_disclaimer}{audio_disclaimer}\n\nAGGREGATION INSTRUCTION: {example_agg_instruction}\n\nANSWER:\n{{{example_answer}}}\n---\n\"\"\"\n\n\nAGG_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and an output field to generate. Your task is to generate a JSON object which aggregates the input and fills in the output field with the correct value.\nYou will be provided with a description of each input field and each output field. The field in the output JSON object can be derived using information from the context.\n{desc_section}\n{output_format_instruction} Finish your response with a newline character followed by ---\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nOUTPUT FIELDS:\n{output_fields_desc}\n\nCONTEXT:\n{context}<<image-audio-placeholder>>\n\nAGGREGATION INSTRUCTION: {agg_instruction}\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: \"\"\"\n\nAGG_NO_REASONING_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and an output field to generate. Your task is to generate a JSON object which aggregates the input and fills in the output field with the correct value.\nYou will be provided with a description of each input field and each output field. 
The field in the output JSON object can be derived using information from the context.\n{desc_section}\n{output_format_instruction} Finish your response with a newline character followed by ---\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nOUTPUT FIELDS:\n{output_fields_desc}\n\nCONTEXT:\n{context}<<image-audio-placeholder>>\n\nAGGREGATION INSTRUCTION: {agg_instruction}\n\nANSWER: \"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/context_search.py",
    "content": "\nCONTEXT_SEARCH_PROMPT = \"\"\"You are a helpful agent whose job is to propose search queries that will assist in finding information that is relevant to performing a computation.\n\nPlease propose a concise sentence which will be used to search for relevant information for the following computation instruction.\n\nINSTRUCTION: {instruction}\n\nSEARCH QUERY: \n\"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/convert_prompts.py",
    "content": "\"\"\"This file contains prompts for convert operations.\"\"\"\n\n### BASE PROMPTS ###\nMAP_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a set of output fields to generate. Your task is to generate a JSON object which fills in the output fields with the correct values.\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the context.\n\n{output_format_instruction} Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nINPUT FIELDS:\n{example_input_fields}\n\nOUTPUT FIELDS:\n{example_output_fields}\n\nCONTEXT:\n{{{example_context}}}{image_disclaimer}{audio_disclaimer}\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: {example_reasoning}\n\nANSWER:\n{{{example_answer}}}\n---\n\"\"\"\n\nMAP_NO_REASONING_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a set of output fields to generate. Your task is to generate a JSON object which fills in the output fields with the correct values.\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the context.\n\n{output_format_instruction} Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nINPUT FIELDS:\n{example_input_fields}\n\nOUTPUT FIELDS:\n{example_output_fields}\n\nCONTEXT:\n{{{example_context}}}{image_disclaimer}{audio_disclaimer}\n\nANSWER:\n{{{example_answer}}}\n---\n\"\"\"\n\n\nMAP_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a set of output fields to generate. 
Your task is to generate a JSON object which fills in the output fields with the correct values.\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the context.\n{desc_section}\n{output_format_instruction} Finish your response with a newline character followed by ---\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nOUTPUT FIELDS:\n{output_fields_desc}\n\n<<cache-boundary>>CONTEXT:\n{context}<<image-audio-placeholder>>\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: \"\"\"\n\nMAP_NO_REASONING_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a set of output fields to generate. Your task is to generate a JSON object which fills in the output fields with the correct values.\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the context.\n{desc_section}\n{output_format_instruction} Finish your response with a newline character followed by ---\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nOUTPUT FIELDS:\n{output_fields_desc}\n\n<<cache-boundary>>CONTEXT:\n{context}<<image-audio-placeholder>>\n\nANSWER: \"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/critique_and_refine_prompts.py",
    "content": "\"\"\"This file contains prompts for CritiqueAndRefineConvert operations.\"\"\"\n\n### CRITIQUE PROMPT AND CRITERIA ###\nBASE_CRITIQUE_PROMPT = \"\"\"You are a helpful assistant tasked with critiquing the output of a model based on a given prompt.\nBelow is the original user prompt used to generate the output:\n\nORIGINAL PROMPT:\n<<original-prompt-placeholder>>\n\nHere is the output generated by the model:\n\nOUTPUT:\n{original_output}\n\nYour task is to critique the output based on the following:\n{critique_criteria}\n\n{finish_instruction}\n\"\"\"\n\nMAP_CRITIQUE_CRITERIA = \"\"\"1. Does the JSON object adhere to the required format? Highlight any structural issues.\n2. Are the values of the output fields accurate based on the provided context? If any fields are incorrect or missing, provide specific examples.\n3. Are there any logical errors in reasoning used to derive the output? Provide detailed feedback on potential mistakes.\n\"\"\"\nFILTER_CRITIQUE_CRITERIA = \"\"\"1. Does the output adhere to the required TRUE/FALSE format? Highlight any issues.\n2. Is the TRUE/FALSE determination accurate based on the provided context? If there is evidence for an incorrect determination, provide specific reasons why.\n3. Are there any logical errors in reasoning used to derive the TRUE/FALSE determination? 
Provide detailed feedback on potential mistakes.\n\"\"\"\n\nMAP_CRITIQUE_FINISH_INSTRUCTION = \"\"\"Finish your critique with actionable recommendations for improving the JSON object.\"\"\"\nFILTER_CRITIQUE_FINISH_INSTRUCTION = \"\"\"Finish your critique with actionable recommendations for improving the model's reasoning and answer.\"\"\"\n\n### REFINEMENT PROMPT AND CRITERIA ###\nBASE_REFINEMENT_PROMPT = \"\"\"You are a helpful assistant tasked with refining the output of a model based on a critique.\nBelow is the original user prompt used to generate the output:\n\nORIGINAL PROMPT:\n<<original-prompt-placeholder>>\n\nHere is the original output generated by the model:\n\nOUTPUT:\n{original_output}\n\nHere is the critique of the output:\n\nCRITIQUE:\n{critique_output}\n\nYour task is to refine the original output to address the critique. Ensure that:\n{refinement_criteria}\n\n{finish_instruction}\n\"\"\"\n\nMAP_REFINEMENT_CRITERIA = \"\"\"1. The answer adheres to the required JSON format specified in the original prompt.\n2. Correctly derives all values for the output fields based on the provided context.\n3. Resolves any logical errors identified in the critique.\n\"\"\"\nFILTER_REFINEMENT_CRITERIA = \"\"\"1. The answer adheres to the required TRUE/FALSE format specified in the original prompt.\n2. Correctly derives the final TRUE/FALSE determination based on the provided context.\n3. Resolves any logical errors identified in the critique.\n\"\"\"\n\nMAP_REFINEMENT_FINISH_INSTRUCTION = \"\"\"Return the refined JSON object as your final answer.\"\"\"\nFILTER_REFINEMENT_FINISH_INSTRUCTION = \"\"\"Return the final TRUE/FALSE answer.\"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/filter_prompts.py",
    "content": "\"\"\"This file contains prompts for filter operations.\"\"\"\n\n### BASE PROMPTS ###\nFILTER_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a filter condition. Output TRUE if the context satisfies the filter condition, and FALSE otherwise.\n\nRemember, your answer must be TRUE or FALSE. Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nINPUT FIELDS:\n{example_input_fields}\n\nFILTER CONDITION: {example_filter_condition}\n\nCONTEXT:\n{{{example_context}}}{image_disclaimer}{audio_disclaimer}\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: {example_reasoning}\n\nANSWER: TRUE\n---\n\"\"\"\n\nFILTER_NO_REASONING_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a filter condition. Output TRUE if the context satisfies the filter condition, and FALSE otherwise.\n\nRemember, your answer must be TRUE or FALSE. Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nINPUT FIELDS:\n{example_input_fields}\n\nFILTER CONDITION: {example_filter_condition}\n\nCONTEXT:\n{{{example_context}}}{image_disclaimer}{audio_disclaimer}\n\nANSWER: TRUE\n---\n\"\"\"\n\nFILTER_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a filter condition. Output TRUE if the context satisfies the filter condition, and FALSE otherwise.\n{desc_section}\nRemember, your answer must be TRUE or FALSE. 
Finish your response with a newline character followed by ---\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nFILTER CONDITION: {filter_condition}\n\n<<cache-boundary>>CONTEXT:\n{context}<<image-audio-placeholder>>\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: \"\"\"\n\nFILTER_NO_REASONING_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a filter condition. Output TRUE if the context satisfies the filter condition, and FALSE otherwise.\n{desc_section}\nRemember, your answer must be TRUE or FALSE. Finish your response with a newline character followed by ---\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nFILTER CONDITION: {filter_condition}\n\n<<cache-boundary>>CONTEXT:\n{context}<<image-audio-placeholder>>\n\nANSWER: \"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/join_prompts.py",
    "content": "\"\"\"This file contains prompts for join operations.\"\"\"\n\n### BASE PROMPTS ###\nJOIN_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with two data records and a join condition. Output TRUE if the two data records satisfy the join condition, and FALSE otherwise.\n\nRemember, your answer must be TRUE or FALSE. Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nLEFT INPUT FIELDS:\n{example_input_fields}\n\nRIGHT INPUT FIELDS:\n{right_example_input_fields}\n\nJOIN CONDITION: {example_join_condition}\n\nLEFT CONTEXT:\n{{{example_context}}}{image_disclaimer}{audio_disclaimer}\n\nRIGHT CONTEXT:\n{{{right_example_context}}}{right_image_disclaimer}{right_audio_disclaimer}\n\nLet's think step-by-step in order to evaluate the join condition.\n\nREASONING: {example_reasoning}\n\nANSWER: TRUE\n---\n\"\"\"\n\nJOIN_NO_REASONING_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with two data records and a join condition. Output TRUE if the two data records satisfy the join condition, and FALSE otherwise.\n\nRemember, your answer must be TRUE or FALSE. Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nLEFT INPUT FIELDS:\n{example_input_fields}\n\nRIGHT INPUT FIELDS:\n{right_example_input_fields}\n\nJOIN CONDITION: {example_join_condition}\n\nLEFT CONTEXT:\n{{{example_context}}}{image_disclaimer}{audio_disclaimer}\n\nRIGHT CONTEXT:\n{{{right_example_context}}}{right_image_disclaimer}{right_audio_disclaimer}\n\nANSWER: TRUE\n---\n\"\"\"\n\nJOIN_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with two data records and a join condition. Output TRUE if the two data records satisfy the join condition, and FALSE otherwise.\n{desc_section}\nRemember, your answer must be TRUE or FALSE. 
Finish your response with a newline character followed by ---\n---\nLEFT INPUT FIELDS:\n{input_fields_desc}\n\nRIGHT INPUT FIELDS:\n{right_input_fields_desc}\n\nJOIN CONDITION: {join_condition}\n\n<<cache-boundary>>LEFT CONTEXT:\n{context}<<image-audio-placeholder>>\n\nRIGHT CONTEXT:\n{right_context}<<right-image-audio-placeholder>>\n\nLet's think step-by-step in order to evaluate the join condition.\n\nREASONING: \"\"\"\n\nJOIN_NO_REASONING_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with two data records and a join condition. Output TRUE if the two data records satisfy the join condition, and FALSE otherwise.\n{desc_section}\nRemember, your answer must be TRUE or FALSE. Finish your response with a newline character followed by ---\n---\nLEFT INPUT FIELDS:\n{input_fields_desc}\n\nRIGHT INPUT FIELDS:\n{right_input_fields_desc}\n\nJOIN CONDITION: {join_condition}\n\n<<cache-boundary>>LEFT CONTEXT:\n{context}<<image-audio-placeholder>>\n\nRIGHT CONTEXT:\n{right_context}<<right-image-audio-placeholder>>\n\nANSWER: \"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/moa_aggregator_prompts.py",
    "content": "\"\"\"This file contains prompts for Mixture-of-Agents aggregator operations.\"\"\"\n\n### SYSTEM PROMPTS ###\nMAP_MOA_AGG_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to generate a JSON object.\nYou will be presented with one or more outputs produced by a set of models. Your task is to synthesize these responses into a single, high-quality JSON object which fills in the output fields with the correct values.\nIt is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased or incorrect.\n\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the model responses.\n\n{output_format_instruction} Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nMODEL RESPONSE 1: the text mentions the scientist's full name \"Augusta Ada King, Countess of Lovelace\" and states she was an English mathematician who worked on Babbage's Analytical Engine.\n\nMODEL RESPONSE 2: the text passage mentions the scientist's name as \"Augusta Ada King, Countess of Lovelace, also known as Ada Lovelace\" and the scientist's birthday as \"December 10, 1815\". Therefore, the name of the scientist is \"Augusta Ada King\" and the birth year is 1815.\n\nINPUT FIELDS:\n- text: a text passage describing a scientist\n- birthday: the scientist's birthday\n\nOUTPUT FIELDS:\n- name: the name of the scientist\n- birth_year: the year the scientist was born\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: Looking at both model responses, they agree that the scientist's formal name is \"Augusta Ada King\". Model Response 2 correctly extracts the birth year from the birthday field as 1815. 
The responses are consistent and provide sufficient evidence for these values.\n\nANSWER:\n{{\n  \"name\": \"Augusta Ada King\",\n  \"birth_year\": 1815\n}}\n---\n\"\"\"\n\nFILTER_MOA_AGG_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to answer a TRUE/FALSE question.\nYou will be presented with one or more outputs produced by a set of models. Your task is to synthesize these responses into a single TRUE/FALSE answer.\nIt is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased or incorrect.\n\nYou will also be provided with a description of each input field and the filter condition.\n\nRemember, your answer must be TRUE or FALSE. Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nMODEL RESPONSE 1: The context describes Augusta Ada King, Countess of Lovelace, also known as Ada Lovelace, who is widely recognized as a foundational figure in computer science. Therefore, the answer is TRUE.\n\nMODEL RESPONSE 2: Based on the context provided, Ada Lovelace is indeed a foundational computer scientist, therefore the answer is TRUE.\n\nINPUT FIELDS:\n- text: a text passage describing a scientist\n- birthday: the scientist's birthday\n- image: an image of the scientist\n- recording: an audio recording of a newscast about the scientist's contributions to their field\n\nFILTER CONDITION: The subject of the input is a foundational computer scientist.\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: Both model responses agree that the context describes Ada Lovelace, who is widely recognized as a foundational figure in computer science. 
The evidence from the text passage supports this conclusion.\n\nANSWER: TRUE\n---\n\"\"\"\n\n### USER / INSTANCE-SPECIFIC PROMPTS ###\nMAP_MOA_AGG_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to generate a JSON object.\nYou will be presented with one or more outputs produced by a set of models. Your task is to synthesize these responses into a single, high-quality JSON object which fills in the output fields with the correct values.\nIt is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased or incorrect.\n\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the model responses.\n\n{output_format_instruction} Finish your response with a newline character followed by ---\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nOUTPUT FIELDS:\n{output_fields_desc}\n\n<<cache-boundary>>{model_responses}\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: \"\"\"\n\nFILTER_MOA_AGG_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to answer a TRUE/FALSE question.\nYou will be presented with one or more outputs produced by a set of models. Your task is to synthesize these responses into a single TRUE/FALSE answer.\nIt is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased or incorrect.\n\nYou will also be provided with a description of each input field and the filter condition.\n\nRemember, your answer must be TRUE or FALSE. Finish your response with a newline character followed by ---\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nFILTER CONDITION: {filter_condition}\n\n<<cache-boundary>>{model_responses}\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: \"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/moa_proposer_prompts.py",
    "content": "\"\"\"This file contains prompts for MixtureOfAgentsConvert operations.\"\"\"\n\n### SYSTEM PROMPTS ###\nMAP_MOA_PROPOSER_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a set of output fields to generate. Your task is to generate a detailed and succinct analysis describing what you believe is the correct value for each output field.\nBe sure to cite information from the context as evidence of why your answers are correct. Do not hallucinate evidence.\n\nYou will be provided with a description of each input field and each output field.\n\nAn example is shown below:\n---\nINPUT FIELDS:\n{example_input_fields}\n\nOUTPUT FIELDS:\n{example_output_fields}\n\nCONTEXT:\n{{{example_context}}}{image_disclaimer}{audio_disclaimer}\n\nLet's think step-by-step in order to answer the question.\n\nANSWER: {example_answer}\n---\n\"\"\"\n\nFILTER_MOA_PROPOSER_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a filter condition. Your task is to generate a detailed and succinct analysis describing whether you believe the input satisfies the filter condition.\nBe sure to cite information from the context as evidence of why your determination is correct. Do not hallucinate evidence.\n\nYou will be provided with a description of each input field.\n\nAn example is shown below:\n---\nINPUT FIELDS:\n{example_input_fields}\n\nFILTER CONDITION: {example_filter_condition}\n\nCONTEXT:\n{{{example_context}}}{image_disclaimer}{audio_disclaimer}\n\nLet's think step-by-step in order to answer the question.\n\nANSWER: {example_answer}\n---\n\"\"\"\n\n### USER / INSTANCE-SPECIFIC PROMPTS ###\nMAP_MOA_PROPOSER_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a set of output fields to generate. 
Your task is to generate a detailed and succinct analysis describing what you believe is the correct value for each output field.\nBe sure to cite information from the context as evidence of why your answers are correct. Do not hallucinate evidence.\n{desc_section}\nYou will be provided with a description of each input field and each output field.\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nOUTPUT FIELDS:\n{output_fields_desc}\n\n<<cache-boundary>>CONTEXT:\n{context}<<image-audio-placeholder>>\n\nLet's think step-by-step in order to answer the question.\n\nANSWER: \"\"\"\n\nFILTER_MOA_PROPOSER_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a filter condition. Your task is to generate a detailed and succinct analysis describing whether you believe the input satisfies the filter condition.\nBe sure to cite information from the context as evidence of why your determination is correct. Do not hallucinate evidence.\n{desc_section}\nYou will be provided with a description of each input field.\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nFILTER CONDITION: {filter_condition}\n\n<<cache-boundary>>CONTEXT:\n{context}<<image-audio-placeholder>>\n\nLet's think step-by-step in order to answer the question.\n\nANSWER: \"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/prompt_factory.py",
    "content": "\"\"\"This file contains factory methods which return template prompts and return messages for chat payloads.\"\"\"\n\nimport base64\nimport json\nimport os\nfrom typing import Any\n\nfrom pydantic import BaseModel\n\nfrom palimpzest.constants import (\n    LLAMA_CONTEXT_TOKENS_LIMIT,\n    TOKENS_PER_CHARACTER,\n    Cardinality,\n    Modality,\n    Model,\n    PromptStrategy,\n)\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.lib.schemas import (\n    AUDIO_FIELD_TYPES,\n    IMAGE_FIELD_TYPES,\n    AudioBase64,\n    AudioFilepath,\n    ImageBase64,\n    ImageFilepath,\n    ImageURL,\n)\nfrom palimpzest.prompts.aggregate_prompts import (\n    AGG_BASE_SYSTEM_PROMPT,\n    AGG_BASE_USER_PROMPT,\n    AGG_NO_REASONING_BASE_SYSTEM_PROMPT,\n    AGG_NO_REASONING_BASE_USER_PROMPT,\n)\nfrom palimpzest.prompts.convert_prompts import (\n    MAP_BASE_SYSTEM_PROMPT,\n    MAP_BASE_USER_PROMPT,\n    MAP_NO_REASONING_BASE_SYSTEM_PROMPT,\n    MAP_NO_REASONING_BASE_USER_PROMPT,\n)\nfrom palimpzest.prompts.critique_and_refine_prompts import (\n    BASE_CRITIQUE_PROMPT,\n    BASE_REFINEMENT_PROMPT,\n    FILTER_CRITIQUE_CRITERIA,\n    FILTER_CRITIQUE_FINISH_INSTRUCTION,\n    FILTER_REFINEMENT_CRITERIA,\n    FILTER_REFINEMENT_FINISH_INSTRUCTION,\n    MAP_CRITIQUE_CRITERIA,\n    MAP_CRITIQUE_FINISH_INSTRUCTION,\n    MAP_REFINEMENT_CRITERIA,\n    MAP_REFINEMENT_FINISH_INSTRUCTION,\n)\nfrom palimpzest.prompts.filter_prompts import (\n    FILTER_BASE_SYSTEM_PROMPT,\n    FILTER_BASE_USER_PROMPT,\n    FILTER_NO_REASONING_BASE_SYSTEM_PROMPT,\n    FILTER_NO_REASONING_BASE_USER_PROMPT,\n)\nfrom palimpzest.prompts.join_prompts import (\n    JOIN_BASE_SYSTEM_PROMPT,\n    JOIN_BASE_USER_PROMPT,\n    JOIN_NO_REASONING_BASE_SYSTEM_PROMPT,\n    JOIN_NO_REASONING_BASE_USER_PROMPT,\n)\nfrom palimpzest.prompts.moa_aggregator_prompts import (\n    FILTER_MOA_AGG_BASE_SYSTEM_PROMPT,\n    FILTER_MOA_AGG_BASE_USER_PROMPT,\n    MAP_MOA_AGG_BASE_SYSTEM_PROMPT,\n    
MAP_MOA_AGG_BASE_USER_PROMPT,\n)\nfrom palimpzest.prompts.moa_proposer_prompts import (\n    FILTER_MOA_PROPOSER_BASE_SYSTEM_PROMPT,\n    FILTER_MOA_PROPOSER_BASE_USER_PROMPT,\n    MAP_MOA_PROPOSER_BASE_SYSTEM_PROMPT,\n    MAP_MOA_PROPOSER_BASE_USER_PROMPT,\n)\nfrom palimpzest.prompts.split_merge_prompts import (\n    FILTER_SPLIT_MERGER_BASE_SYSTEM_PROMPT,\n    FILTER_SPLIT_MERGER_BASE_USER_PROMPT,\n    MAP_SPLIT_MERGER_BASE_SYSTEM_PROMPT,\n    MAP_SPLIT_MERGER_BASE_USER_PROMPT,\n)\nfrom palimpzest.prompts.split_proposer_prompts import (\n    FILTER_SPLIT_PROPOSER_BASE_SYSTEM_PROMPT,\n    FILTER_SPLIT_PROPOSER_BASE_USER_PROMPT,\n    MAP_SPLIT_PROPOSER_BASE_SYSTEM_PROMPT,\n    MAP_SPLIT_PROPOSER_BASE_USER_PROMPT,\n)\nfrom palimpzest.prompts.utils import (\n    AGG_AUDIO_DISCLAIMER,\n    AGG_EXAMPLE_ANSWER,\n    AGG_EXAMPLE_OUTPUT_FIELDS,\n    AGG_EXAMPLE_REASONING,\n    AGG_IMAGE_DISCLAIMER,\n    AGG_JOB_INSTRUCTION,\n    AUDIO_DISCLAIMER,\n    AUDIO_EXAMPLE_ANSWER,\n    AUDIO_EXAMPLE_CONTEXT,\n    AUDIO_EXAMPLE_INPUT_FIELDS,\n    AUDIO_EXAMPLE_OUTPUT_FIELDS,\n    AUDIO_EXAMPLE_REASONING,\n    AUDIO_SENTENCE_EXAMPLE_ANSWER,\n    DESC_SECTION,\n    EXAMPLE_AGG_INSTRUCTION,\n    EXAMPLE_FILTER_CONDITION,\n    EXAMPLE_JOIN_CONDITION,\n    FILTER_EXAMPLE_REASONING,\n    FILTER_JOB_INSTRUCTION,\n    IMAGE_DISCLAIMER,\n    IMAGE_EXAMPLE_ANSWER,\n    IMAGE_EXAMPLE_CONTEXT,\n    IMAGE_EXAMPLE_INPUT_FIELDS,\n    IMAGE_EXAMPLE_OUTPUT_FIELDS,\n    IMAGE_EXAMPLE_REASONING,\n    IMAGE_SENTENCE_EXAMPLE_ANSWER,\n    JOIN_EXAMPLE_REASONING,\n    JOIN_JOB_INSTRUCTION,\n    MAP_JOB_INSTRUCTION,\n    ONE_TO_MANY_OUTPUT_FORMAT_INSTRUCTION,\n    ONE_TO_ONE_OUTPUT_FORMAT_INSTRUCTION,\n    PROPOSER_JOB_INSTRUCTION,\n    RIGHT_AUDIO_DISCLAIMER,\n    RIGHT_AUDIO_EXAMPLE_CONTEXT,\n    RIGHT_AUDIO_EXAMPLE_INPUT_FIELDS,\n    RIGHT_IMAGE_DISCLAIMER,\n    RIGHT_IMAGE_EXAMPLE_CONTEXT,\n    RIGHT_IMAGE_EXAMPLE_INPUT_FIELDS,\n    RIGHT_TEXT_EXAMPLE_CONTEXT,\n    RIGHT_TEXT_EXAMPLE_INPUT_FIELDS,\n  
  SECOND_AUDIO_EXAMPLE_CONTEXT,\n    SECOND_IMAGE_EXAMPLE_CONTEXT,\n    SECOND_TEXT_EXAMPLE_CONTEXT,\n    TEXT_EXAMPLE_ANSWER,\n    TEXT_EXAMPLE_CONTEXT,\n    TEXT_EXAMPLE_INPUT_FIELDS,\n    TEXT_EXAMPLE_OUTPUT_FIELDS,\n    TEXT_EXAMPLE_REASONING,\n    TEXT_SENTENCE_EXAMPLE_ANSWER,\n    THIRD_AUDIO_EXAMPLE_CONTEXT,\n    THIRD_IMAGE_EXAMPLE_CONTEXT,\n    THIRD_TEXT_EXAMPLE_CONTEXT,\n)\n\n\ndef _detect_image_media_type(filepath: str | None = None, base64_data: str | None = None) -> str:\n    \"\"\"Detect image media type from file extension or base64 magic bytes.\"\"\"\n    if filepath:\n        ext = os.path.splitext(filepath)[1].lower()\n        ext_map = {\".png\": \"image/png\", \".jpg\": \"image/jpeg\", \".jpeg\": \"image/jpeg\",\n                   \".gif\": \"image/gif\", \".webp\": \"image/webp\"}\n        if ext in ext_map:\n            return ext_map[ext]\n\n    if base64_data:\n        try:\n            header = base64.b64decode(base64_data[:32])\n            if header[:8] == b\"\\x89PNG\\r\\n\\x1a\\n\":\n                return \"image/png\"\n            if header[:3] == b\"\\xff\\xd8\\xff\":\n                return \"image/jpeg\"\n            if header[:6] in (b\"GIF87a\", b\"GIF89a\"):\n                return \"image/gif\"\n            if header[:4] == b\"RIFF\" and header[8:12] == b\"WEBP\":\n                return \"image/webp\"\n        except Exception:\n            pass\n\n    # fall back to JPEG when the media type cannot be determined\n    return \"image/jpeg\"\n\n\nclass PromptFactory:\n    \"\"\"Factory class for generating prompts for the Generator given the input(s).\"\"\"\n\n    BASE_SYSTEM_PROMPT_MAP = {\n        # agg system prompts\n        PromptStrategy.AGG: AGG_BASE_SYSTEM_PROMPT,\n        PromptStrategy.AGG_NO_REASONING: AGG_NO_REASONING_BASE_SYSTEM_PROMPT,\n\n        # filter system prompts\n        PromptStrategy.FILTER: FILTER_BASE_SYSTEM_PROMPT,\n        PromptStrategy.FILTER_NO_REASONING: FILTER_NO_REASONING_BASE_SYSTEM_PROMPT,\n        PromptStrategy.FILTER_CRITIC: None,\n        
PromptStrategy.FILTER_REFINE: None,\n        PromptStrategy.FILTER_MOA_PROPOSER: FILTER_MOA_PROPOSER_BASE_SYSTEM_PROMPT,\n        PromptStrategy.FILTER_MOA_AGG: FILTER_MOA_AGG_BASE_SYSTEM_PROMPT,\n        PromptStrategy.FILTER_SPLIT_PROPOSER: FILTER_SPLIT_PROPOSER_BASE_SYSTEM_PROMPT,\n        PromptStrategy.FILTER_SPLIT_MERGER: FILTER_SPLIT_MERGER_BASE_SYSTEM_PROMPT,\n\n        # join system prompts\n        PromptStrategy.JOIN: JOIN_BASE_SYSTEM_PROMPT,\n        PromptStrategy.JOIN_NO_REASONING: JOIN_NO_REASONING_BASE_SYSTEM_PROMPT,\n\n        # map system prompts\n        PromptStrategy.MAP: MAP_BASE_SYSTEM_PROMPT,\n        PromptStrategy.MAP_NO_REASONING: MAP_NO_REASONING_BASE_SYSTEM_PROMPT,\n        PromptStrategy.MAP_CRITIC: None,\n        PromptStrategy.MAP_REFINE: None,\n        PromptStrategy.MAP_MOA_PROPOSER: MAP_MOA_PROPOSER_BASE_SYSTEM_PROMPT,\n        PromptStrategy.MAP_MOA_AGG: MAP_MOA_AGG_BASE_SYSTEM_PROMPT,\n        PromptStrategy.MAP_SPLIT_PROPOSER: MAP_SPLIT_PROPOSER_BASE_SYSTEM_PROMPT,\n        PromptStrategy.MAP_SPLIT_MERGER: MAP_SPLIT_MERGER_BASE_SYSTEM_PROMPT,\n    }\n    BASE_USER_PROMPT_MAP = {\n        # agg user prompts\n        PromptStrategy.AGG: AGG_BASE_USER_PROMPT,\n        PromptStrategy.AGG_NO_REASONING: AGG_NO_REASONING_BASE_USER_PROMPT,\n\n        # filter user prompts\n        PromptStrategy.FILTER: FILTER_BASE_USER_PROMPT,\n        PromptStrategy.FILTER_NO_REASONING: FILTER_NO_REASONING_BASE_USER_PROMPT,\n        PromptStrategy.FILTER_CRITIC: BASE_CRITIQUE_PROMPT,\n        PromptStrategy.FILTER_REFINE: BASE_REFINEMENT_PROMPT,\n        PromptStrategy.FILTER_MOA_PROPOSER: FILTER_MOA_PROPOSER_BASE_USER_PROMPT,\n        PromptStrategy.FILTER_MOA_AGG: FILTER_MOA_AGG_BASE_USER_PROMPT,\n        PromptStrategy.FILTER_SPLIT_PROPOSER: FILTER_SPLIT_PROPOSER_BASE_USER_PROMPT,\n        PromptStrategy.FILTER_SPLIT_MERGER: FILTER_SPLIT_MERGER_BASE_USER_PROMPT,\n\n        # join user prompts\n        PromptStrategy.JOIN: JOIN_BASE_USER_PROMPT,\n  
      PromptStrategy.JOIN_NO_REASONING: JOIN_NO_REASONING_BASE_USER_PROMPT,\n\n        # map user prompts\n        PromptStrategy.MAP: MAP_BASE_USER_PROMPT,\n        PromptStrategy.MAP_NO_REASONING: MAP_NO_REASONING_BASE_USER_PROMPT,\n        PromptStrategy.MAP_CRITIC: BASE_CRITIQUE_PROMPT,\n        PromptStrategy.MAP_REFINE: BASE_REFINEMENT_PROMPT,\n        PromptStrategy.MAP_MOA_PROPOSER: MAP_MOA_PROPOSER_BASE_USER_PROMPT,\n        PromptStrategy.MAP_MOA_AGG: MAP_MOA_AGG_BASE_USER_PROMPT,\n        PromptStrategy.MAP_SPLIT_PROPOSER: MAP_SPLIT_PROPOSER_BASE_USER_PROMPT,\n        PromptStrategy.MAP_SPLIT_MERGER: MAP_SPLIT_MERGER_BASE_USER_PROMPT,\n    }\n\n    def __init__(self, prompt_strategy: PromptStrategy, model: Model, cardinality: Cardinality, desc: str | None = None) -> None:\n        self.prompt_strategy = prompt_strategy\n        self.model = model\n        self.cardinality = cardinality\n        self.desc = desc\n\n    def _get_context(self, candidate: DataRecord | list[DataRecord], input_fields: list[str]) -> str:\n        \"\"\"\n        Returns the context for the prompt.\n\n        Args:\n            candidate (DataRecord): The input record.\n            input_fields (list[str]): The input fields.\n\n        Returns:\n            str: The context.\n        \"\"\"\n        # TODO: remove mask_filepaths=True after SemBench evaluation\n        # get context from input record (project_cols will be None if not provided in kwargs)\n        if isinstance(candidate, list):\n            context: list[dict] = [record.to_dict(include_bytes=False, project_cols=input_fields, mask_filepaths=True) for record in candidate]\n        else:\n            context: dict = candidate.to_dict(include_bytes=False, project_cols=input_fields, mask_filepaths=True)\n\n        # TODO: MOVE THIS LOGIC INTO A CHUNKING / CONTEXT MANAGEMENT CLASS\n        #   - this class should be able to:\n        #      - handle the context length of different models (i.e. 
self.model should be an input)\n        #      - handle images\n        #      - handle the issue with `original_messages` (ask Matt if this is not clear)\n        # TODO: this does not work for image prompts\n        # TODO: this ignores the size of the `original_messages` in critique and refine prompts\n        # NOTE: llama models are disallowed for aggregation so we can assume context is a dict here\n        # cut down on context based on window length\n        if self.model.is_llama_model():\n            assert isinstance(context, dict), \"Llama models are not allowed for aggregation operations.\"\n            total_context_len = len(json.dumps(context, indent=2))\n\n            # sort fields by length and progressively strip from the longest field until it is short enough;\n            # NOTE: LLAMA_CONTEXT_TOKENS_LIMIT is a rough estimate which leaves room for the rest of the prompt text\n            while total_context_len * TOKENS_PER_CHARACTER > LLAMA_CONTEXT_TOKENS_LIMIT:\n                # sort fields by length\n                field_lengths = [(field, len(value) if value is not None else 0) for field, value in context.items()]\n                sorted_fields = sorted(field_lengths, key=lambda item: item[1], reverse=True)\n\n                # get field with longest context\n                longest_field_name, longest_field_length = sorted_fields[0]\n\n                # trim the field\n                context_factor = LLAMA_CONTEXT_TOKENS_LIMIT / (total_context_len * TOKENS_PER_CHARACTER)\n                keep_frac_idx = int(longest_field_length * context_factor)\n                context[longest_field_name] = context[longest_field_name][:keep_frac_idx]\n\n                # update total context length\n                total_context_len = len(json.dumps(context, indent=2))\n\n        return json.dumps(context, indent=2)\n\n    def _get_input_fields(self, candidate: DataRecord, **kwargs) -> list[str]:\n        \"\"\"\n        The list of input fields to be 
templated into the prompt(s).\n        If the user provides a list of \"project_cols\" in kwargs, then this list will be returned.\n        Otherwise, this function returns the list of all field names in the candidate record.\n\n        Args:\n            candidate (DataRecord): The input record.\n            kwargs: The keyword arguments provided by the user.\n\n        Returns:\n            list[str]: The list of input field names.\n        \"\"\"\n        # NOTE: joins will include left and right input fields in project_cols, so we have to check\n        #       if the field is in the candidate record\n        input_fields = kwargs.get(\"project_cols\", candidate.get_field_names())\n        input_fields = [field for field in input_fields if field in candidate.get_field_names()]\n        return input_fields\n\n    def _get_input_modalities(self, candidate: DataRecord, input_fields: list[str]) -> set[Modality]:\n        \"\"\"\n        Returns the set of input modalities for the given input fields.\n\n        Args:\n            candidate (DataRecord): The input record.\n            input_fields (list[str]): The input fields.\n\n        Returns:\n            set[Modality]: The set of input modalities.\n        \"\"\"\n        input_modalities = []\n        for field_name in input_fields:\n            field_type = candidate.get_field_type(field_name)\n            if field_type.annotation in IMAGE_FIELD_TYPES:\n                input_modalities.append(Modality.IMAGE)\n            elif field_type.annotation in AUDIO_FIELD_TYPES:\n                input_modalities.append(Modality.AUDIO)\n            else:\n                input_modalities.append(Modality.TEXT)\n\n        return set(input_modalities)\n\n    def _get_modalities_str(self, input_modalities: set[Modality]) -> str:\n        \"\"\"\n        Returns a format string to reflect the input modalities.\n\n        Args:\n            input_modalities (set[Modality]): The input modalities.\n\n        Returns:\n            
str: The string to reflect the input modalities.\n        \"\"\"\n        if input_modalities == {Modality.TEXT}:\n            return \"text\"\n        elif input_modalities == {Modality.IMAGE}:\n            return \"image(s)\"\n        elif input_modalities == {Modality.AUDIO}:\n            return \"audio\"\n        elif input_modalities == {Modality.TEXT, Modality.IMAGE}:\n            return \"text and/or image(s)\"\n        elif input_modalities == {Modality.TEXT, Modality.AUDIO}:\n            return \"text and/or audio\"\n        elif input_modalities == {Modality.IMAGE, Modality.AUDIO}:\n            return \"image(s) and/or audio\"\n        elif input_modalities == {Modality.TEXT, Modality.IMAGE, Modality.AUDIO}:\n            return \"text, image(s), and/or audio\"\n\n    def _get_input_fields_desc(self, candidate: DataRecord, input_fields: list[str]) -> str:\n        \"\"\"\n        Returns a multi-line description of each input field for the prompt.\n\n        Args:\n            input_fields (list[str]): The input fields.\n\n        Returns:\n            str: The input fields description.\n        \"\"\"\n        input_fields_desc = \"\"\n        for field_name in input_fields:\n            input_fields_desc += f\"- {field_name}: {candidate.get_field_type(field_name).description}\\n\"\n\n        return input_fields_desc[:-1]\n\n    def _get_output_fields_desc(self, output_fields: list[str], **kwargs) -> str:\n        \"\"\"\n        Returns a multi-line description of each output field for the prompt.\n\n        Args:\n            output_fields (list[str]): The output fields.\n            kwargs: The keyword arguments provided by the user.\n\n        Returns:\n            str: The output fields description.\n        \"\"\"\n        output_fields_desc = \"\"\n        output_schema: type[BaseModel] = kwargs.get(\"output_schema\")\n        if self.prompt_strategy.is_map_prompt() or self.prompt_strategy.is_agg_prompt():\n            assert output_schema is not 
None, \"Output schema must be provided for convert prompts.\"\n\n            for field_name in sorted(output_fields):\n                desc = output_schema.model_fields[field_name].description\n                output_fields_desc += f\"- {field_name}: {'no description available' if desc is None else desc}\\n\"\n\n        # strip the last newline characters from the field descriptions and return\n        return output_fields_desc[:-1]\n\n    def _get_agg_instruction(self, **kwargs) -> str | None:\n        \"\"\"\n        Returns the aggregation instruction for the aggregation operation.\n\n        Returns:\n            str | None: The aggregation instruction (if applicable).\n        \"\"\"\n        agg_instruction = kwargs.get(\"agg_instruction\")\n        if self.prompt_strategy.is_agg_prompt():\n            assert agg_instruction is not None, \"Aggregation instruction must be provided for aggregation operations.\"\n\n        return agg_instruction\n\n    def _get_filter_condition(self, **kwargs) -> str | None:\n        \"\"\"\n        Returns the filter condition for the filter operation.\n\n        Returns:\n            str | None: The filter condition (if applicable).\n        \"\"\"\n        filter_condition = kwargs.get(\"filter_condition\")\n        if self.prompt_strategy.is_filter_prompt():\n            assert filter_condition is not None, \"Filter condition must be provided for filter operations.\"\n\n        return filter_condition\n\n    def _get_join_condition(self, **kwargs) -> str | None:\n        \"\"\"\n        Returns the join condition for the join operation.\n\n        Returns:\n            str | None: The join condition (if applicable).\n        \"\"\"\n        join_condition = kwargs.get(\"join_condition\")\n        if self.prompt_strategy.is_join_prompt():\n            assert join_condition is not None, \"Join condition must be provided for join operations.\"\n\n        return join_condition\n\n    def _get_original_output(self, **kwargs) -> 
str | None:\n        \"\"\"\n        Returns the original output from a previous model generation for the critique and refinement operations.\n\n        Args:\n            kwargs: The keyword arguments provided by the user.\n\n        Returns:\n            str | None: The original output.\n        \"\"\"\n        original_output = kwargs.get(\"original_output\")\n        if self.prompt_strategy.is_critic_prompt() or self.prompt_strategy.is_refine_prompt():\n            assert original_output is not None, (\n                \"Original output must be provided for critique and refinement operations.\"\n            )\n\n        return original_output\n\n    def _get_critique_output(self, **kwargs) -> str | None:\n        \"\"\"\n        Returns the critique output for the refinement operation.\n\n        Args:\n            kwargs: The keyword arguments provided by the user.\n\n        Returns:\n            str | None: The critique output.\n        \"\"\"\n        critique_output = kwargs.get(\"critique_output\")\n        if self.prompt_strategy.is_refine_prompt():\n            assert critique_output is not None, \"Critique output must be provided for refinement operations.\"\n\n        return critique_output\n\n    def _get_model_responses(self, **kwargs) -> str | None:\n        \"\"\"\n        Returns the model responses for the mixture-of-agents aggregation operation.\n\n        Args:\n            kwargs: The keyword arguments provided by the user.\n\n        Returns:\n            str | None: The model responses.\n        \"\"\"\n        model_responses = None\n        if self.prompt_strategy.is_moa_aggregator_prompt():\n            model_responses = \"\"\n            for idx, model_response in enumerate(kwargs.get(\"model_responses\")):\n                model_responses += f\"MODEL RESPONSE {idx + 1}: {model_response.rstrip()}\\n\\n\"\n        model_responses = model_responses.rstrip() if model_responses is not None else None\n\n        return model_responses\n\n    
def _get_chunk_outputs(self, **kwargs) -> str | None:\n        \"\"\"\n        Returns the chunk outputs for the split-convert.\n\n        Args:\n            kwargs: The keyword arguments provided by the user.\n\n        Returns:\n            str | None: The chunk outputs.\n        \"\"\"\n        chunk_outputs = None\n        if self.prompt_strategy.is_split_merger_prompt():\n            chunk_outputs = \"\"\n            for idx, chunk_output in enumerate(kwargs.get(\"chunk_outputs\")):\n                chunk_outputs += f\"CHUNK OUTPUT {idx + 1}: {chunk_output.rstrip()}\\n\\n\"\n        chunk_outputs = chunk_outputs.rstrip() if chunk_outputs is not None else None\n\n        return chunk_outputs\n\n    def _get_output_format_instruction(self) -> str:\n        \"\"\"\n        Returns the output format instruction based on the cardinality.\n\n        Returns:\n            str: The output format instruction.\n        \"\"\"\n        return (\n            ONE_TO_ONE_OUTPUT_FORMAT_INSTRUCTION\n            if self.cardinality == Cardinality.ONE_TO_ONE\n            else ONE_TO_MANY_OUTPUT_FORMAT_INSTRUCTION\n        )\n\n    def _get_job_instruction(self, input_modalities: set[Modality]) -> str | None:\n        \"\"\"\n        Returns the job instruction based on the prompt strategy.\n\n        Args:\n            input_modalities (set[Modality]): The modalities of the input fields.\n\n        Returns:\n            str | None: The job instruction.\n        \"\"\"\n        # get the job instruction based on the prompt strategy\n        job_instruction = None\n        if self.prompt_strategy.is_moa_proposer_prompt() or self.prompt_strategy.is_split_proposer_prompt():\n            job_instruction = PROPOSER_JOB_INSTRUCTION\n        elif self.prompt_strategy.is_map_prompt():\n            job_instruction = MAP_JOB_INSTRUCTION\n        elif self.prompt_strategy.is_filter_prompt():\n            job_instruction = FILTER_JOB_INSTRUCTION\n        elif 
self.prompt_strategy.is_join_prompt():\n            job_instruction = JOIN_JOB_INSTRUCTION\n        elif self.prompt_strategy.is_agg_prompt():\n            job_instruction = AGG_JOB_INSTRUCTION\n\n        # format the job instruction based on the input modalities\n        modalities = self._get_modalities_str(input_modalities)\n        if job_instruction is not None:\n            job_instruction = job_instruction.format(modalities=modalities)\n\n        return job_instruction\n\n    def _get_desc_section(self) -> str:\n        \"\"\"\n        Returns the description section for the prompt.\n\n        Returns:\n            str: The description section (if applicable).\n        \"\"\"\n        desc_section = \"\"\n        if self.desc is not None:\n            desc_section = DESC_SECTION.format(desc=self.desc)\n\n        return desc_section\n\n    def _get_critique_criteria(self) -> str | None:\n        \"\"\"\n        Returns the critique criteria for the critique operation.\n\n        Returns:\n            str | None: The critique criteria (if applicable).\n        \"\"\"\n        critique_criteria = None\n        if self.prompt_strategy.is_critic_prompt():\n            critique_criteria = MAP_CRITIQUE_CRITERIA if self.prompt_strategy.is_map_prompt() else FILTER_CRITIQUE_CRITERIA\n\n        return critique_criteria\n\n    def _get_refinement_criteria(self) -> str | None:\n        \"\"\"\n        Returns the refinement criteria for the refinement operation.\n\n        Returns:\n            str | None: The refinement criteria (if applicable).\n        \"\"\"\n        refinement_criteria = None\n        if self.prompt_strategy.is_refine_prompt():\n            refinement_criteria = MAP_REFINEMENT_CRITERIA if self.prompt_strategy.is_map_prompt() else FILTER_REFINEMENT_CRITERIA\n\n        return refinement_criteria\n\n    def _get_finish_instruction(self) -> str | None:\n        \"\"\"\n        Returns the finish instruction for the critique and refinement 
operations.\n\n        Returns:\n            str | None: The finish instruction (if applicable).\n        \"\"\"\n        finish_instruction = None\n        if self.prompt_strategy.is_critic_prompt():\n            finish_instruction = MAP_CRITIQUE_FINISH_INSTRUCTION if self.prompt_strategy.is_map_prompt() else FILTER_CRITIQUE_FINISH_INSTRUCTION\n        elif self.prompt_strategy.is_refine_prompt():\n            finish_instruction = MAP_REFINEMENT_FINISH_INSTRUCTION if self.prompt_strategy.is_map_prompt() else FILTER_REFINEMENT_FINISH_INSTRUCTION\n\n        return finish_instruction\n\n    def _get_example_input_fields(self, input_modalities: set[Modality], right: bool = False) -> str:\n        \"\"\"\n        Returns the example input fields for the prompt.\n\n        Args:\n            input_modalities (set[Modality]): The modalities of the input fields.\n            right (bool): Whether to return the right input fields for the join prompt.\n\n        Returns:\n            str: The example input fields.\n        \"\"\"\n        input_modality_to_example_input_fields = {\n            Modality.TEXT: RIGHT_TEXT_EXAMPLE_INPUT_FIELDS if right else TEXT_EXAMPLE_INPUT_FIELDS,\n            Modality.IMAGE: RIGHT_IMAGE_EXAMPLE_INPUT_FIELDS if right else IMAGE_EXAMPLE_INPUT_FIELDS,\n            Modality.AUDIO: RIGHT_AUDIO_EXAMPLE_INPUT_FIELDS if right else AUDIO_EXAMPLE_INPUT_FIELDS,\n        }\n\n        example_input_fields = \"\"\n        for input_modality in input_modalities:\n            example_input_fields += input_modality_to_example_input_fields[input_modality].rstrip()\n        example_input_fields = example_input_fields.lstrip() + \"\\n\"\n\n        return example_input_fields\n\n    def _get_example_output_fields(self, input_modalities: set[Modality]) -> str:\n        \"\"\"\n        Returns the example output fields for the prompt.\n\n        Returns:\n            str: The example output fields.\n        \"\"\"\n        if 
self.prompt_strategy.is_agg_prompt():\n            return AGG_EXAMPLE_OUTPUT_FIELDS\n\n        input_modality_to_example_output_fields = {\n            Modality.TEXT: TEXT_EXAMPLE_OUTPUT_FIELDS,\n            Modality.IMAGE: IMAGE_EXAMPLE_OUTPUT_FIELDS,\n            Modality.AUDIO: AUDIO_EXAMPLE_OUTPUT_FIELDS,\n        }\n\n        example_output_fields = \"\"\n        for input_modality in input_modalities:\n            example_output_fields += input_modality_to_example_output_fields[input_modality].rstrip()\n        example_output_fields = example_output_fields.lstrip() + \"\\n\"\n\n        return example_output_fields\n\n    def _get_example_context(self, input_modalities: set[Modality], right: bool = False, second: bool = False, third: bool = False) -> str:\n        \"\"\"\n        Returns the example context for the prompt.\n\n        Returns:\n            str: The example context.\n        \"\"\"\n        assert not (second and third), \"Cannot have both second and third example contexts.\"\n        assert not (right and (second or third)), \"Right context is only used for joins; second and third contexts are only used for aggregations.\"\n        text_example_context = TEXT_EXAMPLE_CONTEXT\n        image_example_context = IMAGE_EXAMPLE_CONTEXT\n        audio_example_context = AUDIO_EXAMPLE_CONTEXT\n        if second:\n            text_example_context = SECOND_TEXT_EXAMPLE_CONTEXT\n            image_example_context = SECOND_IMAGE_EXAMPLE_CONTEXT\n            audio_example_context = SECOND_AUDIO_EXAMPLE_CONTEXT\n        elif third:\n            text_example_context = THIRD_TEXT_EXAMPLE_CONTEXT\n            image_example_context = THIRD_IMAGE_EXAMPLE_CONTEXT\n            audio_example_context = THIRD_AUDIO_EXAMPLE_CONTEXT\n\n        input_modality_to_example_context = {\n            Modality.TEXT: RIGHT_TEXT_EXAMPLE_CONTEXT if right else text_example_context,\n            Modality.IMAGE: RIGHT_IMAGE_EXAMPLE_CONTEXT if right else image_example_context,\n            
Modality.AUDIO: RIGHT_AUDIO_EXAMPLE_CONTEXT if right else audio_example_context,\n        }\n\n        example_context = \"\"\n        for input_modality in input_modalities:\n            example_context += input_modality_to_example_context[input_modality].rstrip() + \",\"\n        example_context = example_context[:-1] + \"\\n\"\n\n        return example_context\n\n    def _get_image_disclaimer(self, input_modalities: set[Modality], right: bool = False, agg: bool = False) -> str:\n        \"\"\"\n        Returns the image disclaimer for the prompt. The disclaimer must be an empty string\n        for non-image prompts.\n\n        Returns:\n            str: The image disclaimer. If this is a text prompt then it is an empty string.\n        \"\"\"\n        assert not (right and agg), \"Right image disclaimer is only used for joins; agg image disclaimer only used for aggregations.\"\n        image_disclaimer = AGG_IMAGE_DISCLAIMER if agg else IMAGE_DISCLAIMER\n        image_disclaimer = RIGHT_IMAGE_DISCLAIMER if right else image_disclaimer\n        return image_disclaimer if Modality.IMAGE in input_modalities else \"\"\n\n    def _get_audio_disclaimer(self, input_modalities: set[Modality], right: bool = False, agg: bool = False) -> str:\n        \"\"\"\n        Returns the audio disclaimer for the prompt. The disclaimer must be an empty string\n        for non-audio prompts.\n\n        Returns:\n            str: The audio disclaimer. 
If this is a text prompt then it is an empty string.\n        \"\"\"\n        assert not (right and agg), \"Right audio disclaimer is only used for joins; agg audio disclaimer only used for aggregations.\"\n        audio_disclaimer = AGG_AUDIO_DISCLAIMER if agg else AUDIO_DISCLAIMER\n        audio_disclaimer = RIGHT_AUDIO_DISCLAIMER if right else audio_disclaimer\n        return audio_disclaimer if Modality.AUDIO in input_modalities else \"\"\n\n    def _get_example_reasoning(self, input_modalities: set[Modality]) -> str:\n        \"\"\"\n        Returns the example reasoning for the prompt.\n\n        Returns:\n            str: The example reasoning.\n        \"\"\"\n        if self.prompt_strategy.is_filter_prompt():\n            return FILTER_EXAMPLE_REASONING\n        elif self.prompt_strategy.is_join_prompt():\n            return JOIN_EXAMPLE_REASONING\n        elif self.prompt_strategy.is_agg_prompt():\n            return AGG_EXAMPLE_REASONING\n\n        input_modality_to_example_reasoning = {\n            Modality.TEXT: TEXT_EXAMPLE_REASONING,\n            Modality.IMAGE: IMAGE_EXAMPLE_REASONING,\n            Modality.AUDIO: AUDIO_EXAMPLE_REASONING,\n        }\n\n        example_reasoning = \"\"\n        for input_modality in input_modalities:\n            example_reasoning += input_modality_to_example_reasoning[input_modality] + \" \"\n        example_reasoning = example_reasoning.rstrip()\n\n        return example_reasoning\n\n    def _get_example_answer(self, input_modalities: set[Modality]) -> str:\n        \"\"\"\n        Returns the example answer for the prompt.\n\n        Returns:\n            str: The example answer.\n        \"\"\"\n        if self.prompt_strategy.is_agg_prompt():\n            return AGG_EXAMPLE_ANSWER\n\n        use_sentence_answers = self.prompt_strategy.is_split_proposer_prompt() or self.prompt_strategy.is_moa_proposer_prompt()\n        input_modality_to_example_answer = {\n            Modality.TEXT: TEXT_SENTENCE_EXAMPLE_ANSWER 
if use_sentence_answers else TEXT_EXAMPLE_ANSWER,\n            Modality.IMAGE: IMAGE_SENTENCE_EXAMPLE_ANSWER if use_sentence_answers else IMAGE_EXAMPLE_ANSWER,\n            Modality.AUDIO: AUDIO_SENTENCE_EXAMPLE_ANSWER if use_sentence_answers else AUDIO_EXAMPLE_ANSWER,\n        }\n\n        example_answer = \"\"\n        for input_modality in input_modalities:\n            example_answer += input_modality_to_example_answer[input_modality].rstrip()\n            if use_sentence_answers:\n                example_answer += \" \"\n        example_answer = example_answer + \"\\n\"\n\n        return example_answer\n\n    def _get_all_format_kwargs(\n        self,\n        candidate: DataRecord | list[DataRecord],\n        input_fields: list[str],\n        input_modalities: set[Modality],\n        output_fields: list[str],\n        right_candidate: DataRecord | None,\n        right_input_fields: list[str],\n        right_input_modalities: set[Modality],\n        **kwargs,\n    ) -> dict:\n        \"\"\"\n        Returns a dictionary containing all the format kwargs for templating the prompts.\n\n        Args:\n            candidate (DataRecord): The input record.\n            input_fields (list[str]): The input fields.\n            output_fields (list[str]): The output fields.\n            kwargs: The keyword arguments provided by the user.\n\n        Returns:\n            dict: The dictionary containing all the format kwargs.\n        \"\"\"\n        # get format kwargs which depend on the input data\n        input_format_kwargs = {\n            \"context\": self._get_context(candidate, input_fields),\n            \"input_fields_desc\": self._get_input_fields_desc(candidate[0] if isinstance(candidate, list) else candidate, input_fields),\n            \"output_fields_desc\": self._get_output_fields_desc(output_fields, **kwargs),\n            \"agg_instruction\": self._get_agg_instruction(**kwargs),\n            \"filter_condition\": self._get_filter_condition(**kwargs),\n  
          \"join_condition\": self._get_join_condition(**kwargs),\n            \"original_output\": self._get_original_output(**kwargs),\n            \"critique_output\": self._get_critique_output(**kwargs),\n            \"model_responses\": self._get_model_responses(**kwargs),\n            \"chunk_outputs\": self._get_chunk_outputs(**kwargs),\n        }\n\n        # if a right candidate is provided, we also get the context and input field descriptions for the right candidate\n        if right_candidate is not None:\n            input_format_kwargs.update({\n                \"right_context\": self._get_context(right_candidate, right_input_fields),\n                \"right_input_fields_desc\": self._get_input_fields_desc(right_candidate, right_input_fields),\n            })\n\n        # get format kwargs which depend on the prompt strategy\n        full_input_modalities = input_modalities.union(right_input_modalities)\n        prompt_strategy_format_kwargs = {\n            \"output_format_instruction\": self._get_output_format_instruction(),\n            \"job_instruction\": self._get_job_instruction(full_input_modalities),\n            \"desc_section\": self._get_desc_section(),\n            \"critique_criteria\": self._get_critique_criteria(),\n            \"refinement_criteria\": self._get_refinement_criteria(),\n            \"finish_instruction\": self._get_finish_instruction(),\n            \"example_input_fields\": self._get_example_input_fields(input_modalities),\n            \"right_example_input_fields\": self._get_example_input_fields(right_input_modalities, right=True),\n            \"example_output_fields\": self._get_example_output_fields(input_modalities),\n            \"example_context\": self._get_example_context(input_modalities),\n            \"second_example_context\": self._get_example_context(input_modalities, second=True) if self.prompt_strategy.is_agg_prompt() else \"\",\n            \"third_example_context\": 
self._get_example_context(input_modalities, third=True) if self.prompt_strategy.is_agg_prompt() else \"\",\n            \"right_example_context\": self._get_example_context(right_input_modalities, right=True),\n            \"image_disclaimer\": self._get_image_disclaimer(input_modalities, agg=self.prompt_strategy.is_agg_prompt()),\n            \"audio_disclaimer\": self._get_audio_disclaimer(input_modalities, agg=self.prompt_strategy.is_agg_prompt()),\n            \"right_image_disclaimer\": self._get_image_disclaimer(right_input_modalities, right=True),\n            \"right_audio_disclaimer\": self._get_audio_disclaimer(right_input_modalities, right=True),\n            \"example_agg_instruction\": EXAMPLE_AGG_INSTRUCTION,\n            \"example_filter_condition\": EXAMPLE_FILTER_CONDITION,\n            \"example_join_condition\": EXAMPLE_JOIN_CONDITION,\n            \"example_reasoning\": self._get_example_reasoning(input_modalities),\n            \"example_answer\": self._get_example_answer(input_modalities),\n        }\n\n        # return all format kwargs\n        return {**input_format_kwargs, **prompt_strategy_format_kwargs}\n\n    def _create_audio_messages(self, candidate: DataRecord | list[DataRecord], input_fields: list[str]) -> list[dict]:\n        \"\"\"\n        Parses the candidate record(s) and returns the audio messages for the chat payload.\n\n        Args:\n            candidate (DataRecord | list[DataRecord]): The input record(s).\n            input_fields (list[str]): The list of input fields.\n\n        Returns:\n            list[dict]: The audio messages for the chat payload.\n        \"\"\"\n        # normalize type to be list[DataRecord]\n        if isinstance(candidate, DataRecord):\n            candidate = [candidate]\n\n        # create a message for each audio recording in an input field with an audio (or list of audio) type\n        audio_content = []\n        for field_name in input_fields:\n            for dr in candidate:\n           
     field_value = dr[field_name]\n                field_type = dr.get_field_type(field_name)\n\n                # audio filepath (or list of audio filepaths)\n                if field_type.annotation in [AudioFilepath, AudioFilepath | None, AudioFilepath | Any] and field_value is not None:\n                    with open(field_value, \"rb\") as f:\n                        base64_audio_str = base64.b64encode(f.read()).decode(\"utf-8\")\n                    audio_content.append(\n                        {\"type\": \"input_audio\", \"input_audio\": {\"data\": base64_audio_str, \"format\": \"wav\"}}\n                    )\n\n                elif field_type.annotation in [list[AudioFilepath], list[AudioFilepath] | None, list[AudioFilepath] | Any]:\n                    for audio_filepath in field_value:\n                        if audio_filepath is None:\n                            continue\n                        with open(audio_filepath, \"rb\") as f:\n                            base64_audio_str = base64.b64encode(f.read()).decode(\"utf-8\")\n                        audio_content.append(\n                            {\"type\": \"input_audio\", \"input_audio\": {\"data\": base64_audio_str, \"format\": \"wav\"}}\n                        )\n\n                # pre-encoded audio (or list of pre-encoded audio)\n                elif field_type.annotation in [AudioBase64, AudioBase64 | None, AudioBase64 | Any] and field_value is not None:\n                    audio_content.append(\n                        {\"type\": \"input_audio\", \"input_audio\": {\"data\": field_value, \"format\": \"wav\"}}\n                    )\n\n                elif field_type.annotation in [list[AudioBase64], list[AudioBase64] | None, list[AudioBase64] | Any]:\n                    for base64_audio in field_value:\n                        if base64_audio is None:\n                            continue\n                        audio_content.append(\n                            {\"type\": 
\"input_audio\", \"input_audio\": {\"data\": base64_audio, \"format\": \"wav\"}}\n                        )\n\n        return [{\"role\": \"user\", \"type\": \"input_audio\", \"content\": audio_content}] if len(audio_content) > 0 else []\n\n    def _create_image_messages(self, candidate: DataRecord | list[DataRecord], input_fields: list[str]) -> list[dict]:\n        \"\"\"\n        Parses the candidate record(s) and returns the image messages for the chat payload.\n\n        Args:\n            candidate (DataRecord | list[DataRecord]): The input record(s).\n            input_fields (list[str]): The list of input fields.\n\n        Returns:\n            list[dict]: The image messages for the chat payload.\n        \"\"\"\n        # normalize type to be list[DataRecord]\n        if isinstance(candidate, DataRecord):\n            candidate = [candidate]\n\n        # create a message for each image in an input field with an image (or list of image) type\n        image_content = []\n        for field_name in input_fields:\n            for dr in candidate:\n                field_value = dr[field_name]\n                field_type = dr.get_field_type(field_name)\n\n                # image filepath (or list of image filepaths)\n                if field_type.annotation in [ImageFilepath, ImageFilepath | None, ImageFilepath | Any] and field_value is not None:\n                    with open(field_value, \"rb\") as f:\n                        base64_image_str = base64.b64encode(f.read()).decode(\"utf-8\")\n                    media_type = _detect_image_media_type(filepath=field_value, base64_data=base64_image_str)\n                    image_content.append(\n                        {\"type\": \"image_url\", \"image_url\": {\"url\": f\"data:{media_type};base64,{base64_image_str}\"}}\n                    )\n\n                elif field_type.annotation in [list[ImageFilepath], list[ImageFilepath] | None, list[ImageFilepath] | Any]:\n                    for image_filepath in 
field_value:\n                        if image_filepath is None:\n                            continue\n                        with open(image_filepath, \"rb\") as f:\n                            base64_image_str = base64.b64encode(f.read()).decode(\"utf-8\")\n                        media_type = _detect_image_media_type(filepath=image_filepath, base64_data=base64_image_str)\n                        image_content.append(\n                            {\"type\": \"image_url\", \"image_url\": {\"url\": f\"data:{media_type};base64,{base64_image_str}\"}}\n                        )\n\n                # image url (or list of image urls)\n                elif field_type.annotation in [ImageURL, ImageURL | None, ImageURL | Any] and field_value is not None:\n                    image_content.append({\"type\": \"image_url\", \"image_url\": {\"url\": field_value}})\n\n                elif field_type.annotation in [list[ImageURL], list[ImageURL] | None, list[ImageURL] | Any]:\n                    for image_url in field_value:\n                        if image_url is None:\n                            continue\n                        image_content.append({\"type\": \"image_url\", \"image_url\": {\"url\": image_url}})\n\n                # pre-encoded images (or list of pre-encoded images)\n                elif field_type.annotation in [ImageBase64, ImageBase64 | None, ImageBase64 | Any] and field_value is not None:\n                    media_type = _detect_image_media_type(base64_data=field_value)\n                    image_content.append(\n                        {\"type\": \"image_url\", \"image_url\": {\"url\": f\"data:{media_type};base64,{field_value}\"}}\n                    )\n\n                elif field_type.annotation in [list[ImageBase64], list[ImageBase64] | None, list[ImageBase64] | Any]:\n                    for base64_image in field_value:\n                        if base64_image is None:\n                            continue\n                        media_type = 
_detect_image_media_type(base64_data=base64_image)\n                        image_content.append(\n                            {\"type\": \"image_url\", \"image_url\": {\"url\": f\"data:{media_type};base64,{base64_image}\"}}\n                        )\n\n        return [{\"role\": \"user\", \"type\": \"image\", \"content\": image_content}] if len(image_content) > 0 else []\n\n    def _get_system_prompt(self, **format_kwargs) -> str | None:\n        \"\"\"\n        Returns the fully templated system prompt for the given prompt strategy.\n        Returns None if the prompt strategy does not use a system prompt.\n\n        Returns:\n            str | None: The fully templated system prompt (or None if the prompt strategy\n                does not use a system prompt).\n        \"\"\"\n        base_prompt: str = self.BASE_SYSTEM_PROMPT_MAP.get(self.prompt_strategy)\n\n        # for critic and refine prompt strategies, we do not use a base prompt\n        if base_prompt is None:\n            return base_prompt\n\n        return base_prompt.format(**format_kwargs)\n\n    def _get_user_messages(self, candidate: DataRecord | list[DataRecord], input_fields: list[str], right_candidate: DataRecord | None, right_input_fields: list[str], **kwargs) -> list[dict]:\n        \"\"\"\n        Returns a list of messages for the chat payload based on the prompt strategy.\n\n        Args:\n            candidate (DataRecord | list[DataRecord]): The input record(s).\n            input_fields (list[str]): The input fields.\n            right_candidate (DataRecord | None): The right input record (only used for join prompts).\n            right_input_fields (list[str]): The input fields of the right record.\n            kwargs: The formatting kwargs and some keyword arguments provided by the user.\n\n        Returns:\n            list[dict]: The user messages for the chat payload; templated text messages\n                are interleaved with any image / audio messages.\n        \"\"\"\n        # get the base prompt template\n        base_prompt = self.BASE_USER_PROMPT_MAP.get(self.prompt_strategy)\n\n        # get any 
image messages for the chat payload (will be an empty list if no image fields exist)\n        image_messages = self._create_image_messages(candidate, input_fields)\n\n        # get any audio messages for the chat payload (will be an empty list if no audio fields exist)\n        audio_messages = self._create_audio_messages(candidate, input_fields)\n\n        # get any right image / audio messages for the chat payload (will be an empty list if image / audio not present)\n        right_image_messages, right_audio_messages = [], []\n        if self.prompt_strategy.is_join_prompt():\n            assert right_candidate is not None, \"Right candidate must be provided for join prompts.\"\n            right_image_messages = self._create_image_messages(right_candidate, right_input_fields)\n            right_audio_messages = self._create_audio_messages(right_candidate, right_input_fields)\n\n        # get any original messages for critique and refinement operations\n        original_messages = kwargs.get(\"original_messages\")\n        if self.prompt_strategy.is_critic_prompt() or self.prompt_strategy.is_refine_prompt():\n            assert original_messages is not None, (\n                \"Original messages must be provided for critique and refinement operations.\"\n            )\n\n        # combine image and audio messages\n        image_audio_messages = image_messages + audio_messages\n        right_image_audio_messages = right_image_messages + right_audio_messages\n        has_image_audio = len(image_audio_messages) > 0\n        has_right_image_audio = len(right_image_audio_messages) > 0\n\n        # construct the user messages based on the prompt strategy\n        user_messages = []\n        if self.prompt_strategy.is_critic_prompt() or self.prompt_strategy.is_refine_prompt():\n            # NOTE: if this critic / refinement prompt is processing images / audio, those images / audio\n            # will be part of the `original_messages` and will show up in the final 
chat payload\n            base_prompt_start, base_prompt_end = base_prompt.split(\"<<original-prompt-placeholder>>\\n\")\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt_start.format(**kwargs)})\n            user_messages.extend(original_messages)\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt_end.format(**kwargs)})\n\n        # handle joins with left and right images / audio\n        elif self.prompt_strategy.is_join_prompt() and has_image_audio and has_right_image_audio:\n            base_prompt_start, base_prompt_rest = base_prompt.split(\"<<image-audio-placeholder>>\")\n            base_prompt_mid, base_prompt_end = base_prompt_rest.split(\"<<right-image-audio-placeholder>>\")\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt_start.format(**kwargs)})\n            user_messages.extend(image_audio_messages)\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt_mid.format(**kwargs)})\n            user_messages.extend(right_image_audio_messages)\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt_end.format(**kwargs)})\n\n        # handle joins with only left images / audio\n        elif self.prompt_strategy.is_join_prompt() and has_image_audio and not has_right_image_audio:\n            base_prompt = base_prompt.replace(\"<<right-image-audio-placeholder>>\", \"\")\n            base_prompt_start, base_prompt_end = base_prompt.split(\"<<image-audio-placeholder>>\")\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt_start.format(**kwargs)})\n            user_messages.extend(image_audio_messages)\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt_end.format(**kwargs)})\n\n        # handle joins with only right images / audio\n   
     elif self.prompt_strategy.is_join_prompt() and not has_image_audio and has_right_image_audio:\n            base_prompt = base_prompt.replace(\"<<image-audio-placeholder>>\", \"\")\n            base_prompt_start, base_prompt_end = base_prompt.split(\"<<right-image-audio-placeholder>>\")\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt_start.format(**kwargs)})\n            user_messages.extend(right_image_audio_messages)\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt_end.format(**kwargs)})\n\n        # handle non-joins with images / audio\n        elif not self.prompt_strategy.is_join_prompt() and has_image_audio and not self.prompt_strategy.is_moa_aggregator_prompt():\n            base_prompt_start, base_prompt_end = base_prompt.split(\"<<image-audio-placeholder>>\")\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt_start.format(**kwargs)})\n            user_messages.extend(image_audio_messages)\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt_end.format(**kwargs)})\n\n        # handle prompts w/no images or audio\n        else:\n            base_prompt = base_prompt.replace(\"<<image-audio-placeholder>>\", \"\")\n            base_prompt = base_prompt.replace(\"<<right-image-audio-placeholder>>\", \"\")\n            user_messages.append({\"role\": \"user\", \"type\": \"text\", \"content\": base_prompt.format(**kwargs)})\n\n        return user_messages\n\n    def create_messages(self, candidate: DataRecord | list[DataRecord], output_fields: list[str], right_candidate: DataRecord | None = None, **kwargs) -> list[dict]:\n        \"\"\"\n        Creates the messages for the chat payload based on the prompt strategy.\n\n        Each message will be a dictionary with the following format:\n        {\n            \"role\": \"user\" | \"system\",\n            
\"type\": \"text\" | \"image\",\n            \"content\": str\n        }\n\n        Args:\n            candidate (DataRecord | list[DataRecord]): The input record(s).\n            output_fields (list[str]): The output fields.\n            right_candidate (DataRecord | None): The other join input record (only provided for joins).\n            kwargs: The keyword arguments provided by the user.\n\n        Returns:\n            list[dict]: The messages for the chat payload.\n        \"\"\"\n        # compute the set of input fields\n        input_fields = self._get_input_fields(candidate[0] if isinstance(candidate, list) else candidate, **kwargs)\n        right_input_fields = [] if right_candidate is None else self._get_input_fields(right_candidate, **kwargs)\n\n        # use input fields to determine the left / right input modalities\n        input_modalities = self._get_input_modalities(candidate[0] if isinstance(candidate, list) else candidate, input_fields)\n        right_input_modalities = set() if right_candidate is None else self._get_input_modalities(right_candidate, right_input_fields)\n\n        # initialize messages\n        messages = []\n\n        # compute the full dictionary of format kwargs and add to kwargs\n        format_kwargs = self._get_all_format_kwargs(candidate, input_fields, input_modalities, output_fields, right_candidate, right_input_fields, right_input_modalities, **kwargs)\n        kwargs = {**kwargs, **format_kwargs}\n\n        # generate system message (if applicable)\n        system_prompt = self._get_system_prompt(**kwargs)\n        if system_prompt is not None:\n            messages.append({\"role\": \"system\", \"type\": \"text\", \"content\": system_prompt})\n\n        # generate user messages and add to messages\n        user_messages = self._get_user_messages(candidate, input_fields, right_candidate, right_input_fields, **kwargs)\n        messages.extend(user_messages)\n\n        return messages\n"
  },
  {
    "path": "src/palimpzest/prompts/prompt_manager.py",
    "content": "\"\"\"\nPrompt caching utility for different LLM providers.\n\nThis module provides provider-specific prompt caching configurations:\n- OpenAI: Automatic prefix caching with prompt_cache_key for sticky routing\n- Gemini (Google AI Studio / Vertex AI): Implicit caching (automatic prefix matching)\n- Anthropic: Explicit cache_control with ephemeral type on system and user message content\n\"\"\"\n\nimport copy\nimport uuid\nfrom typing import Any\n\nfrom palimpzest.constants import Model\n\n\nclass PromptManager:\n    \"\"\"\n    Manages prompt caching configurations and message transformations for LLM providers.\n\n    This class handles:\n    1. Session-level state (e.g., OpenAI cache keys).\n    2. Provider-specific request arguments (headers, extra_body).\n    3. Transformation of messages for providers requiring explicit markers (Anthropic).\n    4. Normalization of usage statistics.\n    \"\"\"\n    \n    CACHE_BOUNDARY_MARKER = \"<<cache-boundary>>\"\n\n    def __init__(self, model: Model):\n        self.model = model\n        # Instance-level state ensures thread safety if we use one manager per plan/execution\n        self.openai_cache_key = f\"pz-cache-{uuid.uuid4().hex[:12]}\" if (self.model.is_provider_openai() or self.model.is_provider_azure()) else None\n\n    def get_cache_kwargs(self) -> dict[str, Any]:\n        \"\"\"\n        Get provider-specific cache configuration kwargs for litellm.completion().\n\n        Returns:\n            A dictionary of kwargs to pass to litellm.completion() for enabling caching\n        \"\"\"\n        if not self.model.supports_prompt_caching():\n            return {}\n        # OpenAI and Azure OpenAI: https://platform.openai.com/docs/guides/prompt-caching\n        # Use prompt_cache_key for sticky routing to the same cache shard\n        if self.model.is_provider_openai() or self.model.is_provider_azure():\n            return {\"extra_body\": {\"prompt_cache_key\": self.openai_cache_key}}\n        
else:\n            return {}\n    \n    def inject_cache_isolation_id(self, messages: list[dict], session_id: str) -> list[dict]:\n        \"\"\"\n        Inject a cache isolation ID into messages for testing cache behavior per-modality.\n\n        This must happen BEFORE update_messages_for_caching so the ID becomes part of cached content.\n        \"\"\"\n        for msg in messages:\n            role = msg.get(\"role\")\n            content = msg.get(\"content\")\n            if role == \"system\" and isinstance(content, str) or \\\n               role == \"user\" and self.model.is_provider_anthropic() and msg.get(\"type\") == \"text\" and isinstance(content, str):\n                msg[\"content\"] = f\"[{session_id}] \" + content\n        return messages\n\n    def update_messages_for_caching(self, messages: list[dict]) -> list[dict]:\n        \"\"\"\n        Transform messages to conform to provider-specific caching requirements.\n\n        - Anthropic: Adds explicit cache_control markers.\n        - Others: Removes the generic cache boundary markers.\n\n        Returns:\n            The transformed messages list.\n        \"\"\"\n        if not self.model.supports_prompt_caching():\n            return messages\n\n        # Anthropic: Explicit cache_control with ephemeral type\n        # https://platform.claude.com/docs/en/build-with-claude/prompt-caching\n        if self.model.is_provider_anthropic():\n            return self._transform_messages_for_anthropic(messages)\n        # implicit caching for Gemini/OpenAI/Azure models that currently support caching\n        # OpenAI: https://platform.openai.com/docs/guides/prompt-caching\n        # Gemini: https://ai.google.dev/gemini-api/docs/caching\n        elif (self.model.is_provider_openai() or self.model.is_provider_azure() or\n              self.model.is_provider_google_ai_studio() or self.model.is_provider_vertex_ai()):\n            return self._remove_cache_boundary_markers(messages)\n\n        return 
messages\n\n\n    def extract_usage_stats(self, usage: dict, is_audio_op: bool) -> dict[str, int]:\n        \"\"\"\n        Normalize cache statistics from provider-specific response formats.\n        \"\"\"\n        stats = {\n            \"input_text_tokens\": 0,\n            \"input_image_tokens\": 0, # forward looking\n            \"input_audio_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"cache_read_tokens\": 0\n        }\n\n        details = usage.get(\"prompt_tokens_details\") or {}\n\n        if self.model.is_provider_openai() or self.model.is_provider_azure():\n            # only realtime audio models do, but they are not supported by PZ\n            if self.model.supports_prompt_caching() and not is_audio_op:\n                stats[\"cache_read_tokens\"] = details.get(\"cached_tokens\") or 0\n                stats[\"input_text_tokens\"] = (usage.get(\"prompt_tokens\") or 0) - stats[\"cache_read_tokens\"]\n            # audio models don't support caching for now\n            elif is_audio_op:\n                stats[\"input_text_tokens\"] = details.get(\"text_tokens\") or 0\n                stats[\"input_audio_tokens\"] = details.get(\"audio_tokens\") or 0\n            else:\n                stats[\"input_text_tokens\"] = usage.get(\"prompt_tokens\") or 0\n\n        # Moved to Gemini client class, now usage stats are extracted directly in GeminiClient\n        # elif self.model.is_provider_vertex_ai():\n        #     stats[\"cache_read_tokens\"] = usage.get(\"cache_read_input_tokens\") or 0\n        #     if stats[\"cache_read_tokens\"] == 0:\n        #         stats[\"cache_read_tokens\"] = details.get(\"cached_tokens\") or 0\n        #     stats[\"input_text_tokens\"] = details.get(\"text_tokens\") or 0\n        #     stats[\"input_audio_tokens\"] = details.get(\"audio_tokens\") or 0\n        #     stats[\"input_image_tokens\"] = details.get(\"image_tokens\") or 0\n\n        elif self.model.is_provider_anthropic():\n            
stats[\"cache_creation_tokens\"] = usage.get(\"cache_creation_input_tokens\") or 0\n            stats[\"cache_read_tokens\"] = usage.get(\"cache_read_input_tokens\") or 0\n            stats[\"input_text_tokens\"] = max(0, (usage.get(\"prompt_tokens\") or 0) - stats[\"cache_read_tokens\"] - stats[\"cache_creation_tokens\"])\n\n        elif self.model.is_vllm_model():\n            # vLLM does not seem to provide cache statistics through litellm, so we currently have no way\n            # to extract cache read/creation tokens for vLLM models.\n            pass\n\n        # all other models (assume caching not supported)\n        else:\n            if is_audio_op:\n                stats[\"input_text_tokens\"] = details.get(\"text_tokens\") or 0\n                stats[\"input_audio_tokens\"] = details.get(\"audio_tokens\") or 0\n            else:\n                stats[\"input_text_tokens\"] = usage.get(\"prompt_tokens\") or 0\n\n\n        return stats\n\n\n    def _remove_cache_boundary_markers(self, messages: list[dict]) -> list[dict]:\n        \"\"\"\n        Remove <<cache-boundary>> markers from user messages.\n\n        For providers with automatic (implicit) caching (OpenAI, Gemini), we don't need\n        explicit cache markers. 
This function cleans up the markers from prompts.\n\n        Args:\n            messages: The list of messages to transform.\n\n        Returns:\n            A new list of messages with cache boundary markers removed.\n        \"\"\"\n        result = []\n        for message in messages:\n            new_message = message.copy()\n            if new_message.get(\"role\") == \"user\":\n                content = new_message.get(\"content\", \"\")\n                if isinstance(content, str) and self.CACHE_BOUNDARY_MARKER in content:\n                    new_message[\"content\"] = content.replace(self.CACHE_BOUNDARY_MARKER, \"\")\n            result.append(new_message)\n        return result\n\n\n    def _transform_messages_for_anthropic(self, messages: list[dict]) -> list[dict]:\n        \"\"\"\n        Add cache_control markers to system messages and user prompt prefixes for Anthropic models.\n\n        This transforms messages to:\n        1. Add cache_control to system message content blocks\n        2. Convert user messages with <<cache-boundary>> marker into a single message with multiple content blocks:\n            a. Static prefix block (with cache_control) - cacheable across records\n            b. Dynamic content block (without cache_control) - changes per record\n\n        Args:\n            messages: The list of messages to transform.\n\n        Returns:\n            A new list of messages with cache_control markers added.\n        \"\"\"\n        result = []\n        for message in messages:\n            new_message = copy.deepcopy(message)\n            role = new_message.get(\"role\")\n            content = new_message.get(\"content\", \"\")\n\n            # 1. 
Handle System Messages\n            if role == \"system\":\n                if isinstance(content, str) and content:\n                    new_message[\"content\"] = [{\n                        \"type\": \"text\",\n                        \"text\": content,\n                        \"cache_control\": {\"type\": \"ephemeral\"}\n                    }]\n                elif isinstance(content, list) and content:\n                    # Apply to last block if it's text\n                    last_block = new_message[\"content\"][-1]\n                    if isinstance(last_block, dict) and last_block.get(\"type\") == \"text\":\n                        last_block[\"cache_control\"] = {\"type\": \"ephemeral\"}\n\n            # 2. Handle User Messages (The Split Logic)\n            elif role == \"user\" and isinstance(content, str) and self.CACHE_BOUNDARY_MARKER in content:\n                static, dynamic = content.split(self.CACHE_BOUNDARY_MARKER, 1)\n\n                new_blocks = []\n                if static.strip():\n                    new_blocks.append({\n                        \"type\": \"text\",\n                        \"text\": static,\n                        \"cache_control\": {\"type\": \"ephemeral\"}\n                    })\n\n                if dynamic.strip():\n                    new_blocks.append({\"type\": \"text\", \"text\": dynamic})\n\n                if new_blocks:\n                    new_message[\"content\"] = new_blocks\n                else:\n                    new_message[\"content\"] = \"\"\n\n            result.append(new_message)\n        return result\n"
  },
  {
    "path": "src/palimpzest/prompts/split_merge_prompts.py",
    "content": "\"\"\"This file contains prompts for SplitConvert aggregator operations.\"\"\"\n\n### SYSTEM PROMPTS ###\nMAP_SPLIT_MERGER_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to generate a JSON object.\nYou will be presented with one or more outputs produced by a set of models operating on chunks of an input. Your task is to synthesize these responses into a single, high-quality JSON object which fills in the output fields with the correct values.\nIt is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased, incorrect, or contain duplicates.\n\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the model responses.\n\n{output_format_instruction} Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nCHUNK 1 OUTPUT: the text mentions the scientists \"Augusta Ada King, Countess of Lovelace\" and \"Charles Babbage\". It states that King was an English mathematician who worked on Babbage's Analytical Engine.\n\nCHUNK 2 OUTPUT: the text passage mentions the scientist \"Charles Babbage\", who was a mathematician. Therefore, the name output should be [\"Charles Babbage\"] and the field_of_study output should be [\"Mathematician\"].\n\nINPUT FIELDS:\n- text: a text passage describing scientists\n\nOUTPUT FIELDS:\n- name: the list of names for each scientist mentioned in the text\n- field_of_study: a list with the field of study for each scientist\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: Looking at both chunk outputs, they specify that the scientists' formal names are \"Augusta Ada King\" and \"Charles Babbage\". Chunk Output 2 indicates that Charles Babbage was a Mathematician and Chunk Output 1 says that Augusta Ada King was an English mathematician. 
Therefore, the name output should be [\"Augusta Ada King\", \"Charles Babbage\"] and the field_of_study output should be [\"Mathematician\", \"Mathematician\"].\n\nANSWER:\n{{\n  \"name\": [\"Augusta Ada King\", \"Charles Babbage\"],\n  \"field_of_study\": [\"Mathematician\", \"Mathematician\"]\n}}\n---\n\"\"\"\n\nFILTER_SPLIT_MERGER_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to answer a TRUE/FALSE question.\nYou will be presented with one or more outputs produced by a set of models operating on chunks of an input. Your task is to synthesize these responses into a single TRUE/FALSE answer.\nIt is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased, incorrect, or contain duplicates.\n\nYou will be provided with a description of each input field and the filter condition.\n\nRemember, your answer must be TRUE or FALSE. Finish your response with a newline character followed by ---\n\nAn example is shown below:\n---\nCHUNK 1 OUTPUT: The context describes Augusta Ada King, Countess of Lovelace, also known as Ada Lovelace, who is widely recognized as a foundational figure in computer science. Therefore, the answer is TRUE.\n\nCHUNK 2 OUTPUT: Based on the context provided, Ada Lovelace is indeed a foundational computer scientist, therefore the answer is TRUE.\n\nINPUT FIELDS:\n- text: a text passage describing a scientist\n- birthday: the scientist's birthday\n- image: an image of the scientist\n- recording: an audio recording of a newscast about the scientist's contributions to their field\n\nFILTER CONDITION: The subject of the input is a foundational computer scientist.\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: Looking at both chunk outputs, they agree that the subject is a foundational computer scientist. 
Both outputs provide consistent evidence supporting this conclusion.\n\nANSWER: TRUE\n---\n\"\"\"\n\n### USER / INSTANCE-SPECIFIC PROMPTS ###\nMAP_SPLIT_MERGER_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to generate a JSON object.\nYou will be presented with one or more outputs produced by a set of models. Your task is to synthesize these responses into a single, high-quality JSON object which fills in the output fields with the correct values.\nIt is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased, incorrect, or contain duplicates.\n\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the model responses.\n\n{output_format_instruction} Finish your response with a newline character followed by ---\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nOUTPUT FIELDS:\n{output_fields_desc}\n\n<<cache-boundary>>{chunk_outputs}\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: \"\"\"\n\nFILTER_SPLIT_MERGER_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to answer a TRUE/FALSE question.\nYou will be presented with one or more outputs produced by a set of models operating on chunks of an input. Your task is to synthesize these responses into a single TRUE/FALSE answer.\nIt is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased, incorrect, or contain duplicates.\n\nYou will be provided with a description of each input field and the filter condition.\n\nRemember, your answer must be TRUE or FALSE. Finish your response with a newline character followed by ---\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nFILTER CONDITION: {filter_condition}\n\n<<cache-boundary>>{chunk_outputs}\n\nLet's think step-by-step in order to answer the question.\n\nREASONING: \"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/split_proposer_prompts.py",
    "content": "\"\"\"This file contains prompts for SplitAndMerge operations.\"\"\"\n\n### SYSTEM PROMPTS ###\nMAP_SPLIT_PROPOSER_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a set of output fields to generate. Your task is to generate a detailed and succinct analysis describing what you believe is the correct value for each output field.\nBe sure to cite information from the context as evidence of why your answers are correct. Do not hallucinate evidence.\n\nYou will be provided with a description of each input field and each output field.\n\nAn example is shown below:\n---\nINPUT FIELDS:\n{example_input_fields}\n\nOUTPUT FIELDS:\n{example_output_fields}\n\nCONTEXT:\n{{{example_context}}}{image_disclaimer}{audio_disclaimer}\n\nLet's think step-by-step in order to answer the question.\n\nANSWER: {example_answer}\n---\n\"\"\"\n\nFILTER_SPLIT_PROPOSER_BASE_SYSTEM_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a filter condition. Your task is to generate a detailed and succinct analysis describing whether you believe the input satisfies the filter condition.\nBe sure to cite information from the context as evidence of why your determination is correct. Do not hallucinate evidence.\n\nYou will be provided with a description of each input field.\n\nAn example is shown below:\n---\nINPUT FIELDS:\n{example_input_fields}\n\nFILTER CONDITION: {example_filter_condition}\n\nCONTEXT:\n{{{example_context}}}{image_disclaimer}{audio_disclaimer}\n\nLet's think step-by-step in order to answer the question.\n\nANSWER: {example_answer}\n---\n\"\"\"\n\n### USER / INSTANCE-SPECIFIC PROMPTS ###\nMAP_SPLIT_PROPOSER_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a set of output fields to generate. 
Your task is to generate a paragraph or two which describes what you believe is the correct value for each output field.\nBe sure to cite information from the context as evidence of why your answers are correct. Do not hallucinate evidence.\n{desc_section}\nYou will be provided with a description of each input field and each output field.\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nOUTPUT FIELDS:\n{output_fields_desc}\n\n<<cache-boundary>>CONTEXT:\n{context}<<image-audio-placeholder>>\n\nLet's think step-by-step in order to answer the question.\n\nANSWER: \"\"\"\n\nFILTER_SPLIT_PROPOSER_BASE_USER_PROMPT = \"\"\"You are a helpful assistant whose job is to {job_instruction}.\nYou will be presented with a context and a filter condition. Your task is to generate a detailed and succinct analysis describing whether you believe the input satisfies the filter condition.\nBe sure to cite information from the context as evidence of why your determination is correct. Do not hallucinate evidence.\n{desc_section}\nYou will be provided with a description of each input field.\n---\nINPUT FIELDS:\n{input_fields_desc}\n\nFILTER CONDITION: {filter_condition}\n\n<<cache-boundary>>CONTEXT:\n{context}<<image-audio-placeholder>>\n\nLet's think step-by-step in order to answer the question.\n\nANSWER: \"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/utils.py",
    "content": "\"\"\"This file contains utility format strings which are templated into many of our prompts.\"\"\"\n\n### FORMATTING INSTRUCTIONS ###\nONE_TO_ONE_OUTPUT_FORMAT_INSTRUCTION = \"Remember, your answer must be a valid JSON dictionary. The dictionary should only have the specified output fields.\"\nONE_TO_MANY_OUTPUT_FORMAT_INSTRUCTION = \"Remember, your answer must be a valid JSON list of dictionaries. The list may contain one or more dictionaries, and each dictionary should only have the specified output fields.\"\n\n### USER-PROVIDED DESCRIPTION FOR MAPS / FILTERS / JOINS ###\nDESC_SECTION = \"\"\"\nThe user has additionally provided you with this description of the task you need to perform:\n{desc}\n\"\"\"\n\n### JOB INSTRUCTIONS ###\nAGG_JOB_INSTRUCTION = \"\"\"analyze input {modalities} in order to perform an aggregation and generate a JSON object\"\"\"\nMAP_JOB_INSTRUCTION = \"\"\"analyze input {modalities} in order to produce a JSON object\"\"\"\nFILTER_JOB_INSTRUCTION = \"\"\"analyze input {modalities} in order to answer a TRUE / FALSE question\"\"\"\nJOIN_JOB_INSTRUCTION = \"\"\"analyze input {modalities} in order to determine whether two data records satisfy a join condition\"\"\"\nPROPOSER_JOB_INSTRUCTION = \"\"\"analyze input {modalities} in order to produce an answer to a question\"\"\"\n\n### AGG / FILTER / JOIN CONDITIONS ###\nEXAMPLE_AGG_INSTRUCTION = \"Count the distinct number of scientists in the input.\"\nEXAMPLE_FILTER_CONDITION = \"The subject of the input is a foundational computer scientist.\"\nEXAMPLE_JOIN_CONDITION = \"The two inputs are scientists in the same academic field.\"\n\n### EXAMPLE INPUT FIELDS ###\nTEXT_EXAMPLE_INPUT_FIELDS = \"\"\"\n- text: a text passage describing a scientist\n- birthday: the scientist's birthday\n\"\"\"\nIMAGE_EXAMPLE_INPUT_FIELDS = \"\"\"\n- image: an image of the scientist\n- photographer: the photographer of the image\n\"\"\"\nAUDIO_EXAMPLE_INPUT_FIELDS = \"\"\"\n- recording: an audio 
recording of a newscast about the scientist's contributions to their field\n- speaker: the speaker in the recording\n\"\"\"\nRIGHT_TEXT_EXAMPLE_INPUT_FIELDS = \"\"\"\n- contents: the contents of a text file\n\"\"\"\nRIGHT_IMAGE_EXAMPLE_INPUT_FIELDS = \"\"\"\n- headshot: a headshot of a famous scientist\n\"\"\"\nRIGHT_AUDIO_EXAMPLE_INPUT_FIELDS = \"\"\"\n- podcast: an audio recording of a podcast about historic scientists\n\"\"\"\n\n### EXAMPLE OUTPUT FIELDS ###\nTEXT_EXAMPLE_OUTPUT_FIELDS = \"\"\"- name: the name of the scientist\n- birth_year: the year the scientist was born\"\"\"\nIMAGE_EXAMPLE_OUTPUT_FIELDS = \"\"\"- is_bald: true if the scientist is bald and false otherwise\"\"\"\nAUDIO_EXAMPLE_OUTPUT_FIELDS = \"\"\"- birthplace: the city where the scientist was born\"\"\"\nAGG_EXAMPLE_OUTPUT_FIELDS = \"\"\"- num_distinct_scientists: the number of distinct scientists mentioned in the input\"\"\"\n\n### EXAMPLE CONTEXTS ###\nTEXT_EXAMPLE_CONTEXT = \"\"\"\n  \"text\": \"Augusta Ada King, Countess of Lovelace, also known as Ada Lovelace, was an English mathematician and writer chiefly known for her work on Charles Babbage's proposed mechanical general-purpose computer, the Analytical Engine. She was the first to recognise that the machine had applications beyond pure calculation.\",\n  \"birthday\": \"December 10, 1815\"\n\"\"\"\nIMAGE_EXAMPLE_CONTEXT = \"\"\"\n  \"image\": <bytes>,\n  \"photographer\": \"CameraEnthusiast1\"\n\"\"\"\nAUDIO_EXAMPLE_CONTEXT = \"\"\"\n  \"recording\": <bytes>,\n  \"speaker\": \"Walter Cronkite\"\n\"\"\"\nRIGHT_TEXT_EXAMPLE_CONTEXT = \"\"\"\n  \"contents\": \"Alan Turing was a pioneering computer scientist and mathematician. 
He is widely considered to be the father of theoretical computer science and artificial intelligence.\"\n\"\"\"\nRIGHT_IMAGE_EXAMPLE_CONTEXT = \"\"\"\n  \"headshot\": <bytes>\n\"\"\"\nRIGHT_AUDIO_EXAMPLE_CONTEXT = \"\"\"\n  \"podcast\": <bytes>\n\"\"\"\nSECOND_TEXT_EXAMPLE_CONTEXT = \"\"\"\n  \"text\": \"Alan Turing was a pioneering computer scientist and mathematician. He is widely considered to be the father of theoretical computer science and artificial intelligence.\",\n  \"birthday\": \"June 23, 1912\"\n\"\"\"\nSECOND_IMAGE_EXAMPLE_CONTEXT = \"\"\"\n  \"image\": <bytes>,\n  \"photographer\": \"PhotoPro42\"\n\"\"\"\nSECOND_AUDIO_EXAMPLE_CONTEXT = \"\"\"\n  \"recording\": <bytes>,\n  \"speaker\": \"Barbara Walters\"\n\"\"\"\nTHIRD_TEXT_EXAMPLE_CONTEXT = \"\"\"\n  \"text\": \"Ada Lovelace is a historically significant computer scientist.\",\n  \"birthday\": \"December 10, 1815\"\n\"\"\"\nTHIRD_IMAGE_EXAMPLE_CONTEXT = \"\"\"\n  \"image\": <bytes>,\n  \"photographer\": \"PicturePerfect\"\n\"\"\"\nTHIRD_AUDIO_EXAMPLE_CONTEXT = \"\"\"\n  \"recording\": <bytes>,\n  \"speaker\": \"Anderson Cooper\"\n\"\"\"\n\n### DISCLAIMERS ###\nIMAGE_DISCLAIMER = \"\"\"\n\\n<image content provided here; assume in this example the image shows Ada Lovelace wearing a hat on top of her hair>\n\"\"\"\nAUDIO_DISCLAIMER = \"\"\"\n\\n<audio content provided here; assume in this example the recording is about Ada Lovelace's upbringing in London>\n\"\"\"\nRIGHT_IMAGE_DISCLAIMER = \"\"\"\n\\n<image content provided here; assume in this example the image shows Alan Turing working at his desk>\n\"\"\"\nRIGHT_AUDIO_DISCLAIMER = \"\"\"\n\\n<audio content provided here; assume in this example the podcast is discussing Alan Turing's work on the Enigma code>\n\"\"\"\nAGG_IMAGE_DISCLAIMER = \"\"\"\n\\n<image content provided here; assume in this example the first image shows Ada Lovelace, the second image shows Alan Turing, and the third image shows Ada Lovelace again>\n\"\"\"\nAGG_AUDIO_DISCLAIMER = 
\"\"\"\n\\n<audio content provided here; assume in this example the first recording is about Ada Lovelace, the second recording is about Alan Turing, and the third recording is about Ada Lovelace again>\n\"\"\"\n\n### EXAMPLE REASONINGS ###\nTEXT_EXAMPLE_REASONING = \"\"\"The text passage mentions the scientist's name as \"Augusta Ada King, Countess of Lovelace, also known as Ada Lovelace\" and the scientist's birthday as \"December 10, 1815\". Therefore, the name of the scientist is \"Augusta Ada King\" and the birth year is 1815.\"\"\"\nIMAGE_EXAMPLE_REASONING = \"\"\"The image shows hair on top of the scientist's head, so the is_bald field should be false.\"\"\"\nAUDIO_EXAMPLE_REASONING = \"\"\"The newscast recording discusses Ada Lovelace's upbringing in London, so the birthplace field should be \"London\".\"\"\"\nAGG_EXAMPLE_REASONING = \"\"\"The input contains two distinct scientists: \"Augusta Ada King\" and \"Alan Turing\". Although \"Ada Lovelace\" is mentioned twice, she should only be counted once. Therefore, the number of distinct scientists mentioned in the input is 2.\"\"\"\nFILTER_EXAMPLE_REASONING = \"\"\"Ada Lovelace is a foundational computer scientist, therefore the answer is TRUE.\"\"\"\nJOIN_EXAMPLE_REASONING = \"\"\"The subject of the left record is Ada Lovelace and the subject of the right record is Alan Turing. Since both inputs are about computer scientists, they satisfy the join condition. 
Therefore, the answer is TRUE.\"\"\"\n\n### EXAMPLE ANSWERS ###\nAGG_EXAMPLE_ANSWER = \"\"\"\n  \"num_distinct_scientists\": 2\n\"\"\"\nTEXT_EXAMPLE_ANSWER = \"\"\"\n  \"name\": \"Augusta Ada King\",\n  \"birth_year\": 1815\n\"\"\"\nIMAGE_EXAMPLE_ANSWER = \"\"\"\n  \"is_bald\": false,\n\"\"\"\nAUDIO_EXAMPLE_ANSWER = \"\"\"\n  \"birthplace\": \"London\",\n\"\"\"\nTEXT_SENTENCE_EXAMPLE_ANSWER = \"\"\"the text passage mentions the scientist's name as \"Augusta Ada King, Countess of Lovelace, also known as Ada Lovelace\" and the scientist's birthday as \"December 10, 1815\". Therefore, the name of the scientist is \"Augusta Ada King\" and the birth year is 1815.\"\"\"\nIMAGE_SENTENCE_EXAMPLE_ANSWER = \"\"\"The image shows hair on top of the woman's head, so the is_bald field should be false.\"\"\"\nAUDIO_SENTENCE_EXAMPLE_ANSWER = \"\"\"The newscast recording discusses Ada Lovelace's upbringing in London, so her birthplace is \"London\".\"\"\"\n"
  },
  {
    "path": "src/palimpzest/prompts/validator.py",
    "content": "### MAP ###\nMAP_VALIDATOR_PROMPT = \"\"\"You are an intelligent judge whose job is to evaluate how successfully an agent executed a given instruction.\nYou will be presented with the input(s) provided to the agent followed by the output produced by the agent.\n\nEach output will be a dictionary. The keys will be **output fields** which were computed by the agent.\n\nYour job will be to assign a score of 1.0 to every output field which was computed correctly, and a score of 0.0 to every output field which was computed incorrectly. If the output for a field is a list, you may give a score in between 0.0 and 1.0 representing the fraction of correct items in the list.\n\nHere is an example evaluation:\n\nINPUT MESSAGES:\n---------------\nYou are a helpful assistant whose job is to generate a JSON object. You will be presented with a context and a set of output fields to generate. Your task is to generate a JSON object which fills in the output fields with the correct values.\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the context.\n\nINPUT FIELDS:\n- text: a text passage describing a scientist\n- birthday: the scientist's birthday\n\nOUTPUT FIELDS:\n- name: the name of the scientist\n- birth_year: the year the scientist was born\n\nCONTEXT:\n{\n  \"text\": \"Augusta Ada King, Countess of Lovelace, also known as Ada Lovelace, was an English mathematician and writer chiefly known for her work on Charles Babbage's proposed mechanical general-purpose computer, the Analytical Engine. 
She was the first to recognise that the machine had applications beyond pure calculation.\",\n  \"birthday\": \"December 10, 1815\"\n}\n\nOUTPUT:\n--------\n{\n  \"name\": \"Charles Babbage\",\n  \"birth_year\": 1815\n}\n\nEVALUATION: {\"name\": 0.0, \"birth_year\": 1.0}\n\nRemember, be sure to output your evaluation as a dictionary where each value contains a 0.0 or 1.0 score for each output field (or a score within [0.0, 1.0] for list output fields).\n\nINPUT MESSAGES:\n---------------\n\n\"\"\"\n\nMAP_IMAGE_VALIDATOR_PROMPT = \"\"\"You are an intelligent judge whose job is to evaluate how successfully an agent executed a given instruction.\nYou will be presented with the input(s) provided to the agent followed by the output produced by the agent.\n\nEach output will be a dictionary. The keys will be **output fields** which were computed by the agent.\n\nYour job will be to assign a score of 1.0 to every output field which was computed correctly, and a score of 0.0 to every output field which was computed incorrectly. If the output for a field is a list, you may give a score in between 0.0 and 1.0 representing the fraction of correct items in the list.\n\nHere is an example evaluation:\n\nINPUT MESSAGES:\n---------------\nYou are a helpful assistant whose job is to analyze input image(s) and/or text in order to produce a JSON object. You will be presented with a context and a set of output fields to generate. Your task is to generate a JSON object which fills in the output fields with the correct values.\nYou will be provided with a description of each input field and each output field. 
All of the fields in the output JSON object can be derived using information from the context.\n\nINPUT FIELDS:\n- image: an image of a scene\n- photographer: the photographer of the image\n\nOUTPUT FIELDS:\n- dog_in_image: true if a dog is in the image and false otherwise\n- person_in_image: true if a person is in the image and false otherwise\n\nCONTEXT:\n{\n  \"image\": <bytes>,\n  \"photographer\": \"CameraEnthusiast1\"\n}\n<image content provided here; assume in this example the image shows a dog and a cat playing>\n\nOUTPUT:\n--------\n{\n  \"dog_in_image\": true,\n  \"person_in_image\": true\n}\n\nEVALUATION: {\"dog_in_image\": 1.0, \"person_in_image\": 0.0}\n\nRemember, be sure to output your evaluation as a dictionary where each value contains a 0.0 or 1.0 score for each output field (or a score within [0.0, 1.0] for list output fields).\n\nINPUT MESSAGES:\n---------------\n\n\"\"\"\n\n\n### FLAT MAP ###\nFLAT_MAP_VALIDATOR_PROMPT = \"\"\"You are an intelligent judge whose job is to evaluate how successfully an agent executed a given instruction.\nYou will be presented with the input(s) provided to the agent followed by the output(s) produced by the agent.\n\nEach output will be a list of dictionaries. The keys of each dictionary will be **output fields** which were computed by the agent.\n\nYour job will be to assign a score of 1.0 to every output field which was computed correctly, and a score of 0.0 to every output field which was computed incorrectly. If the output for a field is a list, you may give a score in between 0.0 and 1.0 representing the fraction of correct items in the list.\n\nHere is an example evaluation:\n\nINPUT MESSAGES:\n---------------\nYou are a helpful assistant whose job is to generate a JSON object. You will be presented with a context and a set of output fields to generate. 
Your task is to generate a JSON object which fills in the output fields with the correct values.\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the context.\n\nINPUT FIELDS:\n- text: a text passage describing scientists\n- birthdays: text containing birth dates\n\nOUTPUT FIELDS:\n- name: the name of the scientist\n- birth_year: the year the scientist was born\n\nCONTEXT:\n{\n  \"text\": \"Augusta Ada King, Countess of Lovelace, also known as Ada Lovelace, was an English mathematician and writer chiefly known for her work on Charles Babbage's proposed mechanical general-purpose computer, the Analytical Engine. She was the first to recognise that the machine had applications beyond pure calculation.\",\n  \"birthdays\": \"...Lovelace was born on December 10, 1815, almost exactly 24 years after Babbage's birth on 26 December 1791...\"\n}\n\nOUTPUTS:\n--------\n[\n  {\n    \"name\": \"Ada Lovelace\",\n    \"birth_year\": 1815\n  },\n  {\n    \"name\": \"Charles Babbage\",\n    \"birth_year\": 1790\n  }\n]\n\nEVALUATION: [{\"name\": 1.0, \"birth_year\": 1.0}, {\"name\": 1.0, \"birth_year\": 0.0}]\n\nRemember, be sure to output your evaluation as a list of dictionaries where each dictionary contains a 0.0 or 1.0 score for each output field (or a score within [0.0, 1.0] for list output fields).\n\nINPUT MESSAGES:\n---------------\n\n\"\"\"\n\nFLAT_MAP_IMAGE_VALIDATOR_PROMPT = \"\"\"You are an intelligent judge whose job is to evaluate how successfully an agent executed a given instruction.\nYou will be presented with the input(s) provided to the agent followed by the output(s) produced by the agent.\n\nEach output will be a list of dictionaries. 
The keys of each dictionary will be **output fields** which were computed by the agent.\n\nYour job will be to assign a score of 1.0 to every output field which was computed correctly, and a score of 0.0 to every output field which was computed incorrectly. If the output for a field is a list, you may give a score in between 0.0 and 1.0 representing the fraction of correct items in the list.\n\nHere is an example evaluation:\n\nINPUT MESSAGES:\n---------------\nYou are a helpful assistant whose job is to analyze input image(s) and/or text in order to produce a JSON object. You will be presented with a context and a set of output fields to generate. Your task is to generate a JSON object which fills in the output fields with the correct values.\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the context.\n\nINPUT FIELDS:\n- image: an image of a scene\n- photographer: the photographer of the image\n\nOUTPUT FIELDS:\n- animal: the type of animal in the image\n- animal_is_canine: true if the animal is a canine and false otherwise\n\nCONTEXT:\n{\n  \"image\": <bytes>,\n  \"photographer\": \"CameraEnthusiast1\"\n}\n<image content provided here; assume in this example the image shows a dog and a cat playing>\n\nOUTPUT:\n--------\n[\n  {\n    \"animal\": \"dog\",\n    \"animal_is_canine\": true\n  },\n  {\n    \"animal\": \"cat\",\n    \"animal_is_canine\": true\n  }\n]\n\nEVALUATION: [{\"animal\": 1.0, \"animal_is_canine\": 1.0}, {\"animal\": 1.0, \"animal_is_canine\": 0.0}]\n\nRemember, be sure to output your evaluation as a list of dictionaries where each dictionary contains a 0.0 or 1.0 score for each output field (or a score within [0.0, 1.0] for list output fields).\n\nINPUT MESSAGES:\n---------------\n\n\"\"\"\n\n\n### RETRIEVE\nRETRIEVE_VALIDATOR_PROMPT = \"\"\"You are an intelligent judge whose job is to evaluate how successfully an agent executed 
a given instruction.\nYou will be presented with the input(s) provided to the agent followed by the output produced by the agent.\n\nEach output will be a dictionary. The keys will be **output fields** which were computed by the agent.\n\nYour job will be to assign a score of 1.0 to every output field which was computed correctly, and a score of 0.0 to every output field which was computed incorrectly. If the output for a field is a list, you may give a score in between 0.0 and 1.0 representing the fraction of correct items in the list.\n\nHere is an example evaluation:\n\nINPUT MESSAGES:\n---------------\nYou are a helpful assistant whose job is to generate a JSON object. You will be presented with a context and a set of output fields to generate. Your task is to generate a JSON object which fills in the output fields with the correct values.\nYou will be provided with a description of each input field and each output field. All of the fields in the output JSON object can be derived using information from the context.\n\nINPUT FIELDS:\n- text: a text passage describing a scientist\n\nOUTPUT FIELDS:\n- related_scientists: list of scientists who perform similar work as the scientist described in the text\n\nCONTEXT:\n{\n  \"text\": \"Augusta Ada King, Countess of Lovelace, also known as Ada Lovelace, was an English mathematician and writer chiefly known for her work on Charles Babbage's proposed mechanical general-purpose computer, the Analytical Engine. 
She was the first to recognise that the machine had applications beyond pure calculation.\"\n}\n\nOUTPUT:\n--------\n{\n  \"related_scientists\": [\n    \"Charles Babbage\",\n    \"Alan Turing\",\n    \"Charles Darwin\",\n    \"John von Neumann\"\n  ]\n}\n\nEVALUATION: {\"related_scientists\": 0.75}\n\nRemember, be sure to output your evaluation as a dictionary where each value contains a 0.0 or 1.0 score for each output field (or a score within [0.0, 1.0] for list output fields).\n\nINPUT MESSAGES:\n---------------\n\n\"\"\"\n"
  },
  {
    "path": "src/palimpzest/query/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/query/execution/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/query/execution/all_sample_execution_strategy.py",
    "content": "import logging\n\nimport numpy as np\n\nfrom palimpzest.core.data.dataset import Dataset\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import SentinelPlanStats\nfrom palimpzest.query.execution.execution_strategy import SentinelExecutionStrategy\nfrom palimpzest.query.operators.aggregate import AggregateOp\nfrom palimpzest.query.operators.filter import FilterOp\nfrom palimpzest.query.operators.join import JoinOp\nfrom palimpzest.query.operators.physical import PhysicalOperator\nfrom palimpzest.query.operators.scan import ContextScanOp, ScanPhysicalOp\nfrom palimpzest.query.optimizer.plan import SentinelPlan\nfrom palimpzest.utils.progress import create_progress_manager\nfrom palimpzest.validator.validator import Validator\n\nlogger = logging.getLogger(__name__)\n\nclass OpSet:\n    \"\"\"\n    This class represents the set of operators which are currently in the frontier for a given logical operator.\n    Each operator in the frontier is an instance of a PhysicalOperator which either:\n\n    1. lies on the Pareto frontier of the set of sampled operators, or\n    2. 
has been sampled fewer than j times\n    \"\"\"\n\n    def __init__(self, op_set: list[PhysicalOperator], source_unique_logical_op_ids: list[str], source_indices: list[int]):\n        # construct the set of operators\n        self.ops = op_set\n\n        # store the order in which we will sample the source records\n        self.source_indices = source_indices\n\n        # boolean indication of the type of operator in this OpSet\n        sample_op = op_set[0]\n        self.is_scan_op = isinstance(sample_op, (ScanPhysicalOp, ContextScanOp))\n        self.is_filter_op = isinstance(sample_op, FilterOp)\n        self.is_aggregate_op = isinstance(sample_op, AggregateOp)\n        self.is_llm_join = isinstance(sample_op, JoinOp)\n\n        # set the initial inputs for this logical operator\n        self.source_indices_to_inputs = {source_unique_logical_op_id: {} for source_unique_logical_op_id in source_unique_logical_op_ids}\n        if self.is_scan_op:\n            self.source_indices_to_inputs[\"source\"] = {source_idx: [int(source_idx.split(\"-\")[-1])] for source_idx in self.source_indices}\n\n    def get_op_inputs(self) -> list[tuple[PhysicalOperator, str | tuple, DataRecord | int | list | tuple | None]]:\n        \"\"\"\n        Returns a list of (operator, source_indices, input) tuples, one for each\n        frontier operator and input it should process next.\n        \"\"\"\n        # if this is an aggregate, run on every input\n        if self.is_aggregate_op:\n            op = self.ops[0]\n            all_inputs = []\n            for _, source_indices_to_inputs in self.source_indices_to_inputs.items():\n                for _, inputs in source_indices_to_inputs.items():\n                    all_inputs.extend(inputs)\n            return [(op, tuple(), all_inputs)]\n\n        # if this is an un-optimized (non-scan, non-join) operator, flatten inputs and run on each one\n        elif not self.is_scan_op and not self.is_llm_join and len(self.ops) == 1:\n            op_inputs = []\n            op = self.ops[0]\n            for _, source_indices_to_inputs in 
self.source_indices_to_inputs.items():\n                for source_indices, inputs in source_indices_to_inputs.items():\n                    for input in inputs:\n                        op_inputs.append((op, source_indices, input))\n            return op_inputs\n\n        # get the list of (op, source_indices) pairs which this operator needs to execute\n        op_source_indices_pairs = []\n        for op in self.ops:\n            # construct list of inputs by looking up the input for the given source_indices\n            for source_indices in self.source_indices:\n                op_source_indices_pairs.append((op, source_indices))\n\n        # construct the op inputs\n        op_inputs = []\n        if self.is_llm_join:\n            left_source_unique_logical_op_id, right_source_unique_logical_op_id = list(self.source_indices_to_inputs)\n            left_source_indices_to_inputs = self.source_indices_to_inputs[left_source_unique_logical_op_id]\n            right_source_indices_to_inputs = self.source_indices_to_inputs[right_source_unique_logical_op_id]\n            for op, source_indices in op_source_indices_pairs:\n                left_source_indices = source_indices[0]\n                right_source_indices = source_indices[1]\n                left_inputs = left_source_indices_to_inputs.get(left_source_indices, [])\n                right_inputs = right_source_indices_to_inputs.get(right_source_indices, [])\n                if len(left_inputs) > 0 and len(right_inputs) > 0:\n                    op_inputs.append((op, (left_source_indices, right_source_indices), (left_inputs, right_inputs)))\n            return op_inputs\n\n        # if operator is not a join\n        source_unique_logical_op_id = list(self.source_indices_to_inputs)[0]\n        op_inputs = [\n            (op, source_indices, input)\n            for op, source_indices in op_source_indices_pairs\n            for input in self.source_indices_to_inputs[source_unique_logical_op_id].get(source_indices, 
[])\n        ]\n\n        return op_inputs\n\n    def pick_highest_quality_output(self, record_sets: list[DataRecordSet]) -> DataRecordSet:\n        # if there's only one operator in the set, we return its record_set\n        if len(record_sets) == 1:\n            return record_sets[0]\n\n        # NOTE: I don't like that this assumes the models are consistent in\n        #       how they order their record outputs for one-to-many converts;\n        #       eventually we can try out more robust schemes to account for\n        #       differences in ordering\n        # aggregate records at each index in the response\n        idx_to_records = {}\n        for record_set in record_sets:\n            for idx in range(len(record_set)):\n                record, record_op_stats = record_set[idx], record_set.record_op_stats[idx]\n                if idx not in idx_to_records:\n                    idx_to_records[idx] = [(record, record_op_stats)]\n                else:\n                    idx_to_records[idx].append((record, record_op_stats))\n\n        # compute highest quality answer at each index\n        out_records = []\n        out_record_op_stats = []\n        for idx in range(len(idx_to_records)):\n            records_lst, record_op_stats_lst = zip(*idx_to_records[idx])\n            max_quality_record, max_quality = records_lst[0], record_op_stats_lst[0].quality\n            max_quality_stats = record_op_stats_lst[0]\n            for record, record_op_stats in zip(records_lst[1:], record_op_stats_lst[1:]):\n                record_quality = record_op_stats.quality\n                if record_quality > max_quality:\n                    max_quality_record = record\n                    max_quality = record_quality\n                    max_quality_stats = record_op_stats\n            out_records.append(max_quality_record)\n            out_record_op_stats.append(max_quality_stats)\n\n        # create and return final DataRecordSet\n        return DataRecordSet(out_records, 
out_record_op_stats)\n\n    def update_inputs(self, source_unique_logical_op_id: str, source_indices_to_record_sets: dict[str | tuple, list[DataRecordSet]]):\n        \"\"\"\n        Update the inputs for this logical operator based on the outputs of the previous logical operator.\n        \"\"\"\n        for source_indices, record_sets in source_indices_to_record_sets.items():\n            input = []\n            max_quality_record_set = self.pick_highest_quality_output(record_sets)\n            for record in max_quality_record_set:\n                input.append(record if record._passed_operator else None)\n\n            # inputs are keyed first by the source operator's unique id, then by source indices\n            self.source_indices_to_inputs[source_unique_logical_op_id][source_indices] = input\n\nclass AllSamplingExecutionStrategy(SentinelExecutionStrategy):\n\n    def _execute_sentinel_plan(self,\n            plan: SentinelPlan,\n            op_sets: dict[str, OpSet],\n            validator: Validator,\n            plan_stats: SentinelPlanStats,\n        ) -> SentinelPlanStats:\n        # execute operator sets in sequence\n        for topo_idx, (logical_op_id, _) in enumerate(plan):\n            # compute unique logical op id within plan\n            unique_logical_op_id = f\"{topo_idx}-{logical_op_id}\"\n\n            # get frontier ops and their next input (op_sets is keyed by unique logical op id)\n            op_inputs = op_sets[unique_logical_op_id].get_op_inputs()\n\n            # break out of the loop if op_inputs is empty, as this means all records have been filtered out\n            if len(op_inputs) == 0:\n                break\n\n            # run sampled operators on sampled inputs\n            source_indices_to_record_set_tuples, _ = self._execute_op_set(unique_logical_op_id, op_inputs)\n\n            # score the quality of each generated output\n            source_indices_to_all_record_sets = {\n                    source_indices: [(record_set, op) for record_set, op, _ in record_set_tuples]\n                    for source_indices, record_set_tuples in source_indices_to_record_set_tuples.items()\n                }\n            source_indices_to_all_record_sets, val_gen_stats = 
self._score_quality(validator, source_indices_to_all_record_sets)\n\n            # remove records that were read from the execution cache before adding to record op stats\n            new_record_op_stats = []\n            for _, record_set_tuples in source_indices_to_record_set_tuples.items():\n                for record_set, _, is_new in record_set_tuples:\n                    if is_new:\n                        new_record_op_stats.extend(record_set.record_op_stats)\n\n            # update plan stats\n            plan_stats.add_record_op_stats(unique_logical_op_id, new_record_op_stats)\n            plan_stats.add_validation_gen_stats(unique_logical_op_id, val_gen_stats)\n\n            # provide the best record sets as inputs to the next logical operator\n            next_unique_logical_op_id = plan.get_next_unique_logical_op_id(unique_logical_op_id)\n            if next_unique_logical_op_id is not None:\n                source_indices_to_all_record_sets = {\n                    source_indices: [record_set for record_set, _ in record_set_tuples]\n                    for source_indices, record_set_tuples in source_indices_to_all_record_sets.items()\n                }\n                op_sets[next_unique_logical_op_id].update_inputs(unique_logical_op_id, source_indices_to_all_record_sets)\n\n        # finalize plan stats\n        plan_stats.finish()\n\n        return plan_stats\n\n    def execute_sentinel_plan(self, plan: SentinelPlan, train_dataset: dict[str, Dataset], validator: Validator): # expected_outputs: dict[int, dict] | None):\n        \"\"\"\n        NOTE: this function currently requires us to set k and j properly in order to make\n              comparison in our research against the corresponding sample budget in MAB.\n\n        NOTE: the number of samples will slightly exceed the sample_budget if the number of operator\n        calls does not perfectly match the sample_budget. 
This may cause some minor discrepancies with\n        the progress manager as a result.\n        \"\"\"\n        logger.info(f\"Executing plan {plan.plan_id} with {self.max_workers} workers\")\n        logger.info(f\"Plan Details: {plan}\")\n\n        # initialize plan stats\n        plan_stats = SentinelPlanStats.from_plan(plan)\n        plan_stats.start()\n\n        # get lists of source indices\n        dataset_id_to_source_indices = {}\n        for dataset_id, dataset in train_dataset.items():\n            total_num_samples = len(dataset)\n            source_indices = [f\"{dataset_id}---{int(idx)}\" for idx in np.arange(total_num_samples)]\n            dataset_id_to_source_indices[dataset_id] = source_indices\n\n        # initialize set of physical operators for each logical operator\n        op_sets = {}\n        for topo_idx, (logical_op_id, op_set) in enumerate(plan):\n            unique_logical_op_id = f\"{topo_idx}-{logical_op_id}\"\n            source_unique_logical_op_ids = plan.get_source_unique_logical_op_ids(unique_logical_op_id)\n            sample_op = op_set[0]\n            if isinstance(sample_op, (ScanPhysicalOp, ContextScanOp)):\n                root_dataset_ids = plan.get_root_dataset_ids(unique_logical_op_id)\n                assert len(root_dataset_ids) == 1, f\"Scan for {sample_op} has {len(root_dataset_ids)} > 1 root dataset ids\"\n                root_dataset_id = root_dataset_ids[0]\n                source_indices = dataset_id_to_source_indices[root_dataset_id]\n                op_sets[unique_logical_op_id] = OpSet(op_set, source_unique_logical_op_ids, source_indices)\n            elif isinstance(sample_op, JoinOp):\n                assert len(source_unique_logical_op_ids) == 2, f\"Join for {sample_op} has {len(source_unique_logical_op_ids)} != 2 source logical operators\"\n                left_source_indices = op_sets[source_unique_logical_op_ids[0]].source_indices\n                right_source_indices = 
op_sets[source_unique_logical_op_ids[1]].source_indices\n                source_indices = []\n                for left_source_idx in left_source_indices:\n                    for right_source_idx in right_source_indices:\n                        source_indices.append((left_source_idx, right_source_idx))\n                op_sets[unique_logical_op_id] = OpSet(op_set, source_unique_logical_op_ids, source_indices)\n            else:\n                source_indices = op_sets[source_unique_logical_op_ids[0]].source_indices\n                op_sets[unique_logical_op_id] = OpSet(op_set, source_unique_logical_op_ids, source_indices)\n\n        # initialize and start the progress manager\n        self.progress_manager = create_progress_manager(plan, sample_budget=self.sample_budget, progress=self.progress)\n        self.progress_manager.start()\n\n        # NOTE: we must handle progress manager outside of _execute_sentinel_plan to ensure that it is shut down correctly;\n        #       if we don't have the `finally:` branch, then program crashes can cause future program runs to fail because\n        #       the progress manager cannot get a handle to the console \n        try:\n            # execute sentinel plan by sampling records and operators\n            plan_stats = self._execute_sentinel_plan(plan, op_sets, validator, plan_stats)\n\n        finally:\n            # finish progress tracking\n            self.progress_manager.finish()\n\n        logger.info(f\"Done executing sentinel plan: {plan.plan_id}\")\n        logger.debug(f\"Plan stats: (plan_cost={plan_stats.total_plan_cost}, plan_time={plan_stats.total_plan_time})\")\n\n        return plan_stats\n"
  },
  {
    "path": "src/palimpzest/query/execution/execution_strategy.py",
    "content": "import logging\nfrom abc import ABC, abstractmethod\nfrom concurrent.futures import ThreadPoolExecutor, as_completed\n\nimport numpy as np\nfrom chromadb.api.models.Collection import Collection\n\nfrom palimpzest.constants import Cardinality\nfrom palimpzest.core.data.dataset import Dataset\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import GenerationStats, PlanStats, SentinelPlanStats\nfrom palimpzest.policy import Policy\nfrom palimpzest.query.operators.convert import LLMConvert\nfrom palimpzest.query.operators.filter import LLMFilter\nfrom palimpzest.query.operators.join import JoinOp\nfrom palimpzest.query.operators.physical import PhysicalOperator\nfrom palimpzest.query.operators.scan import ContextScanOp, ScanPhysicalOp\nfrom palimpzest.query.operators.topk import TopKOp\nfrom palimpzest.query.optimizer.plan import PhysicalPlan, SentinelPlan\nfrom palimpzest.utils.progress import PZSentinelProgressManager\nfrom palimpzest.validator.validator import Validator\n\nlogger = logging.getLogger(__name__)\n\nclass BaseExecutionStrategy:\n    def __init__(self,\n                 scan_start_idx: int = 0, \n                 max_workers: int | None = None,\n                 batch_size: int | None = None,\n                 num_samples: int | None = None,\n                 verbose: bool = False,\n                 progress: bool = True,\n                 *args,\n                 **kwargs):\n        self.scan_start_idx = scan_start_idx\n        self.max_workers = max_workers\n        self.batch_size = batch_size\n        self.num_samples = num_samples\n        self.verbose = verbose\n        self.progress = progress\n\n\nclass ExecutionStrategy(BaseExecutionStrategy, ABC):\n    \"\"\"Base strategy for executing query plans. 
Defines how to execute a PhysicalPlan.\n    \"\"\"\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        logger.info(f\"Initialized ExecutionStrategy {self.__class__.__name__}\")\n        logger.debug(f\"ExecutionStrategy initialized with config: {self.__dict__}\")\n\n    @abstractmethod\n    def execute_plan(self, plan: PhysicalPlan) -> tuple[list[DataRecord], PlanStats]:\n        \"\"\"Execute a single plan according to strategy\"\"\"\n        pass\n\n    def _create_input_queues(self, plan: PhysicalPlan) -> dict[str, dict[str, list]]:\n        \"\"\"Initialize input queues for each operator in the plan.\"\"\"\n        input_queues = {f\"{topo_idx}-{op.get_full_op_id()}\": {} for topo_idx, op in enumerate(plan)}\n        for topo_idx, op in enumerate(plan):\n            full_op_id = op.get_full_op_id()\n            unique_op_id = f\"{topo_idx}-{full_op_id}\"\n            if isinstance(op, ScanPhysicalOp):\n                scan_end_idx = (\n                    len(op.datasource)\n                    if self.num_samples is None\n                    else min(self.scan_start_idx + self.num_samples, len(op.datasource))\n                )\n                input_queues[unique_op_id][f\"source_{full_op_id}\"] = [idx for idx in range(self.scan_start_idx, scan_end_idx)]\n            elif isinstance(op, ContextScanOp):\n                input_queues[unique_op_id][f\"source_{full_op_id}\"] = [None]\n            else:\n                for source_unique_full_op_id in plan.get_source_unique_full_op_ids(topo_idx, op):\n                    input_queues[unique_op_id][source_unique_full_op_id] = []\n\n        return input_queues\n\nclass SentinelExecutionStrategy(BaseExecutionStrategy, ABC):\n    \"\"\"Base strategy for executing sentinel query plans. 
Defines how to execute a SentinelPlan.\n\n    Specialized query processor that implements a sentinel strategy (e.g. MAB)\n    for coordinating optimization and execution.\n    \"\"\"\n    def __init__(\n        self,\n        policy: Policy,\n        k: int = 6,\n        j: int = 4,\n        sample_budget: int = 100,\n        sample_cost_budget: float | None = None,\n        priors: dict | None = None,\n        use_final_op_quality: bool = False,\n        seed: int = 42,\n        exp_name: str | None = None,\n        dont_use_priors: bool = False,\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        self.k = k\n        self.j = j\n        self.sample_budget = sample_budget\n        self.sample_cost_budget = sample_cost_budget\n        self.policy = policy\n        self.priors = priors\n        self.use_final_op_quality = use_final_op_quality\n        self.seed = seed\n        self.rng = np.random.default_rng(seed=seed)\n        self.exp_name = exp_name\n        self.dont_use_priors = dont_use_priors\n\n        # general cache which maps hash(logical_op_id, phys_op_id, hash(input)) --> record_set\n        self.cache: dict[int, DataRecordSet] = {}\n\n        # progress manager used to track progress of the execution\n        self.progress_manager: PZSentinelProgressManager | None = None\n\n    def _score_quality(\n        self,\n        validator: Validator,\n        source_indices_to_record_sets: dict[tuple[str], list[tuple[DataRecordSet, PhysicalOperator]]],\n    ) -> tuple[dict[int, list[DataRecordSet]], GenerationStats]:\n        # extract information about the logical operation performed at this stage of the sentinel plan;\n        # NOTE: we can infer these fields from context clues, but in the long-term we should have a more\n        #       principled way of getting these directly from attributes either stored in the sentinel_plan\n        #       or in the PhysicalOperator\n        def 
is_perfect_quality_op(op: PhysicalOperator):\n            return (\n                not isinstance(op, LLMConvert)\n                and not isinstance(op, LLMFilter)\n                and not isinstance(op, TopKOp)\n                and not isinstance(op, JoinOp)\n            )\n\n        # create minimal set of futures necessary to compute quality of each output record\n        futures, full_hashes, full_hash_to_bool_output = [], set(), {}\n        with ThreadPoolExecutor(max_workers=self.max_workers) as executor:\n            for _, record_set_tuples in source_indices_to_record_sets.items():\n                for record_set, op in record_set_tuples:\n                    # if this operation does not involve an LLM, every record_op_stats object gets perfect quality\n                    if is_perfect_quality_op(op):\n                        for record_op_stats in record_set.record_op_stats:\n                            record_op_stats.quality = 1.0\n                        continue\n\n                    # if the operation failed, assign 0.0 quality\n                    if len(record_set) == 0:\n                        record_set.record_op_stats[0].quality = 0.0\n                        continue\n\n                    # create future for map\n                    if isinstance(op, LLMConvert) and op.cardinality is Cardinality.ONE_TO_ONE:\n                        fields = op.generated_fields\n                        input_record: DataRecord = record_set.input\n                        output = record_set.data_records[0].to_dict(project_cols=fields)\n                        output_str = record_set.data_records[0].to_json_str(project_cols=fields, bytes_to_str=True, sorted=True)\n                        full_hash = f\"{hash(input_record)}{hash(output_str)}\"\n                        if full_hash not in full_hashes:\n                            full_hashes.add(full_hash)\n                            futures.append(executor.submit(validator._score_map, op, fields, 
input_record, output, full_hash))\n\n                    # create future for flat map\n                    elif isinstance(op, LLMConvert) and op.cardinality is Cardinality.ONE_TO_MANY:\n                        fields = op.generated_fields\n                        input_record: DataRecord = record_set.input\n                        output, output_strs = [], []\n                        for data_record in record_set.data_records:\n                            output.append(data_record.to_dict(project_cols=fields))\n                            output_strs.append(data_record.to_json_str(project_cols=fields, bytes_to_str=True, sorted=True))\n                        full_hash = f\"{hash(input_record)}{hash(tuple(sorted(output_strs)))}\"\n                        if full_hash not in full_hashes:\n                            full_hashes.add(full_hash)\n                            futures.append(executor.submit(validator._score_flat_map, op, fields, input_record, output, full_hash))\n\n                    # create future for top-k\n                    elif isinstance(op, TopKOp):\n                        fields = op.generated_fields\n                        input_record: DataRecord = record_set.input\n                        output = record_set.data_records[0].to_dict(project_cols=fields)\n                        output_str = record_set.data_records[0].to_json_str(project_cols=fields, bytes_to_str=True, sorted=True)\n                        full_hash = f\"{hash(input_record)}{hash(output_str)}\"\n                        if full_hash not in full_hashes:\n                            full_hashes.add(full_hash)\n                            futures.append(executor.submit(validator._score_topk, op, fields, input_record, output, full_hash))\n\n                    # create future for filter\n                    elif isinstance(op, LLMFilter):\n                        filter_str = op.filter_obj.filter_condition\n                        input_record: DataRecord = record_set.input\n     
                   output = record_set.data_records[0]._passed_operator\n                        full_hash = f\"{filter_str}{hash(input_record)}\"\n                        if full_hash not in full_hashes:\n                            full_hash_to_bool_output[full_hash] = output\n                            full_hashes.add(full_hash)\n                            futures.append(executor.submit(validator._score_filter, op, filter_str, input_record, output, full_hash))\n\n                    # create future for join\n                    elif isinstance(op, JoinOp):\n                        condition = op.condition\n                        for left_idx, left_input_record in enumerate(record_set.input[0]):\n                            for right_idx, right_input_record in enumerate(record_set.input[1]):\n                                record_idx = left_idx * len(record_set.input[1]) + right_idx\n                                output = record_set.data_records[record_idx]._passed_operator\n                                full_hash = f\"{condition}{hash(left_input_record)}{hash(right_input_record)}\"\n                                if full_hash not in full_hashes:\n                                    full_hash_to_bool_output[full_hash] = output\n                                    full_hashes.add(full_hash)\n                                    futures.append(executor.submit(validator._score_join, op, condition, left_input_record, right_input_record, output, full_hash))\n\n        # collect results from futures\n        full_hash_to_score, validation_gen_stats = {}, GenerationStats()\n        for future in as_completed(futures):\n            score, gen_stats, full_hash = future.result()\n            full_hash_to_score[full_hash] = score\n            validation_gen_stats += gen_stats\n\n        # compute quality of each output computed by this operator\n        for _, record_set_tuples in source_indices_to_record_sets.items():\n            for record_set, op in 
record_set_tuples:\n                if is_perfect_quality_op(op) or len(record_set) == 0:\n                    continue\n\n                if isinstance(op, LLMConvert) and op.cardinality is Cardinality.ONE_TO_ONE:\n                    fields = op.generated_fields\n                    input_record: DataRecord = record_set.input\n                    output_str = record_set.data_records[0].to_json_str(project_cols=fields, bytes_to_str=True, sorted=True)\n                    full_hash = f\"{hash(input_record)}{hash(output_str)}\"\n                    record_set.record_op_stats[0].quality = full_hash_to_score[full_hash]\n\n                elif isinstance(op, LLMConvert) and op.cardinality is Cardinality.ONE_TO_MANY:\n                    fields = op.generated_fields\n                    input_record: DataRecord = record_set.input\n                    output_strs = []\n                    for data_record in record_set.data_records:\n                        output_strs.append(data_record.to_json_str(project_cols=fields, bytes_to_str=True, sorted=True))\n                    full_hash = f\"{hash(input_record)}{hash(tuple(sorted(output_strs)))}\"\n                    score = full_hash_to_score[full_hash]\n                    for record_op_stats in record_set.record_op_stats:\n                        record_op_stats.quality = score\n\n                # TODO: this scoring function will (likely) bias towards small values of k since it\n                # measures precision and not recall / F1; will need to revisit this in the future\n                elif isinstance(op, TopKOp):\n                    fields = op.generated_fields\n                    input_record: DataRecord = record_set.input\n                    output_str = record_set.data_records[0].to_json_str(project_cols=fields, bytes_to_str=True, sorted=True)\n                    full_hash = f\"{hash(input_record)}{hash(output_str)}\"\n                    score = full_hash_to_score[full_hash]\n                    
record_set.record_op_stats[0].quality = score\n\n                elif isinstance(op, LLMFilter):\n                    filter_str = op.filter_obj.filter_condition\n                    input_record: DataRecord = record_set.input\n                    output = record_set.data_records[0]._passed_operator\n                    full_hash = f\"{filter_str}{hash(input_record)}\"\n                    if output == full_hash_to_bool_output[full_hash]:\n                        record_set.record_op_stats[0].quality = full_hash_to_score[full_hash]\n                    else:\n                        record_set.record_op_stats[0].quality = 1.0 - full_hash_to_score[full_hash]\n\n                elif isinstance(op, JoinOp):\n                    condition = op.condition\n                    for left_idx, left_input_record in enumerate(record_set.input[0]):\n                        for right_idx, right_input_record in enumerate(record_set.input[1]):\n                            record_idx = left_idx * len(record_set.input[1]) + right_idx\n                            output = record_set.data_records[record_idx]._passed_operator\n                            full_hash = f\"{condition}{hash(left_input_record)}{hash(right_input_record)}\"\n                            if output == full_hash_to_bool_output[full_hash]:\n                                record_set.record_op_stats[record_idx].quality = full_hash_to_score[full_hash]\n                            else:\n                                record_set.record_op_stats[record_idx].quality = 1.0 - full_hash_to_score[full_hash]\n\n        # return the quality annotated record sets\n        return source_indices_to_record_sets, validation_gen_stats\n\n    def _execute_op_set(self, unique_logical_op_id: str, op_inputs: list[tuple[PhysicalOperator, str | tuple, int | DataRecord | list[DataRecord] | tuple[list[DataRecord]]]]) -> tuple[dict[int, list[tuple[DataRecordSet, PhysicalOperator, bool]]], dict[str, int]]:\n        def 
execute_op_wrapper(operator: PhysicalOperator, source_indices: str | tuple, input: int | DataRecord | list[DataRecord] | tuple[list[DataRecord]]) -> tuple[DataRecordSet, PhysicalOperator, str | tuple, int | DataRecord | list[DataRecord] | tuple[list[DataRecord]]]:\n            # joins take the left and right inputs as separate arguments; all other operators take a single input\n            record_set = operator(input[0], input[1]) if isinstance(operator, JoinOp) else operator(input)\n            return record_set, operator, source_indices, input\n\n        def get_hash(operator: PhysicalOperator, input: int | DataRecord | list[DataRecord] | tuple[list[DataRecord]]):\n            if isinstance(input, list):\n                input = tuple(input)\n            elif isinstance(input, tuple):\n                input = (tuple(input[0]), tuple(input[1]))\n            return hash(f\"{operator.get_full_op_id()}{hash(input)}\")\n\n        # initialize mapping from source indices to output record sets\n        source_indices_to_record_sets_and_ops = {source_indices: [] for _, source_indices, _ in op_inputs}\n\n        # if any operations were previously executed, read the results from the cache\n        final_op_inputs = []\n        for operator, source_indices, input in op_inputs:\n            # compute hash\n            op_input_hash = get_hash(operator, input)\n\n            # get result from cache\n            if op_input_hash in self.cache:\n                record_set, operator = self.cache[op_input_hash]\n                source_indices_to_record_sets_and_ops[source_indices].append((record_set, operator, False))\n\n            # otherwise, add to final_op_inputs\n            else:\n                final_op_inputs.append((operator, source_indices, input))\n\n        # keep track of the number of llm operations\n        num_llm_ops = 0\n\n        # create thread pool w/max workers and run futures over worker pool\n        with ThreadPoolExecutor(max_workers=self.max_workers) as executor:\n            # create futures\n            futures = [\n                executor.submit(execute_op_wrapper, 
operator, source_indices, input)\n                for operator, source_indices, input in final_op_inputs\n            ]\n\n            output_record_sets = []\n            for future in as_completed(futures):\n                # update output record sets\n                record_set, operator, source_indices, input = future.result()\n\n                # if the operator is a join, get record_set from tuple output\n                if isinstance(operator, JoinOp):\n                    record_set = record_set[0]\n\n                output_record_sets.append((record_set, operator, source_indices, input))\n\n                # update cache\n                op_input_hash = get_hash(operator, input)\n                self.cache[op_input_hash] = (record_set, operator)\n\n                # update progress manager\n                if self._is_llm_op(operator):\n                    num_llm_ops += 1\n                    self.progress_manager.incr(unique_logical_op_id, num_samples=1, total_cost=record_set.get_total_cost())\n\n            # update mapping from source_indices to record sets and operators\n            for record_set, operator, source_indices, input in output_record_sets:\n                # add record_set to mapping from source_indices --> record_sets\n                record_set.input = input\n                source_indices_to_record_sets_and_ops[source_indices].append((record_set, operator, True))\n\n        return source_indices_to_record_sets_and_ops, num_llm_ops\n\n    def _is_llm_op(self, physical_op: PhysicalOperator) -> bool:\n        is_llm_convert = isinstance(physical_op, LLMConvert)\n        is_llm_filter = isinstance(physical_op, LLMFilter)\n        is_llm_topk = isinstance(physical_op, TopKOp) and isinstance(physical_op.index, Collection)\n        is_llm_join = isinstance(physical_op, JoinOp)\n        return is_llm_convert or is_llm_filter or is_llm_topk or is_llm_join\n\n    @abstractmethod\n    def execute_sentinel_plan(self, sentinel_plan: SentinelPlan, 
train_dataset: dict[str, Dataset], validator: Validator) -> SentinelPlanStats:\n        \"\"\"Execute a SentinelPlan according to strategy\"\"\"\n        pass\n"
  },
  {
    "path": "src/palimpzest/query/execution/execution_strategy_type.py",
    "content": "from enum import Enum\n\nfrom palimpzest.query.execution.all_sample_execution_strategy import AllSamplingExecutionStrategy\nfrom palimpzest.query.execution.mab_execution_strategy import MABExecutionStrategy\nfrom palimpzest.query.execution.parallel_execution_strategy import ParallelExecutionStrategy\nfrom palimpzest.query.execution.single_threaded_execution_strategy import (\n    PipelinedSingleThreadExecutionStrategy,\n    SequentialSingleThreadExecutionStrategy,\n)\n\n\nclass ExecutionStrategyType(Enum):\n    \"\"\"Available execution strategy types\"\"\"\n    SEQUENTIAL = SequentialSingleThreadExecutionStrategy\n    PIPELINED = PipelinedSingleThreadExecutionStrategy\n    PARALLEL = ParallelExecutionStrategy\n\n    def is_fully_parallel(self) -> bool:\n        \"\"\"Check if the execution strategy executes operators in parallel.\"\"\"\n        return self == ExecutionStrategyType.PARALLEL\n\nclass SentinelExecutionStrategyType(Enum):\n    MAB = MABExecutionStrategy\n    ALL = AllSamplingExecutionStrategy\n"
  },
  {
    "path": "src/palimpzest/query/execution/mab_execution_strategy.py",
    "content": "\nimport logging\n\nimport numpy as np\nfrom chromadb.api.models.Collection import Collection\n\nfrom palimpzest.core.data.dataset import Dataset\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import OperatorCostEstimates, OperatorStats, RecordOpStats, SentinelPlanStats\nfrom palimpzest.policy import Policy\nfrom palimpzest.query.execution.execution_strategy import SentinelExecutionStrategy\nfrom palimpzest.query.operators.aggregate import AggregateOp\nfrom palimpzest.query.operators.convert import LLMConvert\nfrom palimpzest.query.operators.filter import FilterOp, LLMFilter, NonLLMFilter\nfrom palimpzest.query.operators.join import JoinOp\nfrom palimpzest.query.operators.physical import PhysicalOperator\nfrom palimpzest.query.operators.scan import ContextScanOp, ScanPhysicalOp\nfrom palimpzest.query.operators.topk import TopKOp\nfrom palimpzest.query.optimizer.plan import SentinelPlan\nfrom palimpzest.utils.progress import create_progress_manager\nfrom palimpzest.validator.validator import Validator\n\nlogger = logging.getLogger(__name__)\n\n# NOTE: we currently do not support Sentinel Plans with aggregates or limits which are not the final plan operator\n\nclass OpFrontier:\n    \"\"\"\n    This class represents the set of operators which are currently in the frontier for a given logical operator.\n    Each operator in the frontier is an instance of a PhysicalOperator which either:\n\n    1. lies on the Pareto frontier of the set of sampled operators, or\n    2. 
has been sampled fewer than j times\n    \"\"\"\n\n    def __init__(\n            self,\n            op_set: list[PhysicalOperator],\n            source_unique_logical_op_ids: list[str],\n            root_dataset_ids: list[str],\n            source_indices: list[tuple],\n            k: int,\n            j: int,\n            seed: int,\n            policy: Policy,\n            priors: dict | None = None,\n            dont_use_priors: bool = False,\n        ):\n        # set k and j, which are the initial number of operators in the frontier and the\n        # initial number of records to sample for each frontier operator\n        self.k = min(k, len(op_set))\n        self.j = j\n        self.source_indices = source_indices\n        self.root_dataset_ids = root_dataset_ids\n        self.dont_use_priors = dont_use_priors\n\n        # store the policy that we are optimizing under\n        self.policy = policy\n\n        # store the prior beliefs on operator performance (if provided)\n        self.priors = priors\n\n        # boolean indication of the type of operator in this OpFrontier\n        sample_op = op_set[0]\n        self.is_scan_op = isinstance(sample_op, (ScanPhysicalOp, ContextScanOp))\n        self.is_filter_op = isinstance(sample_op, FilterOp)\n        self.is_aggregate_op = isinstance(sample_op, AggregateOp)\n        self.is_llm_join = isinstance(sample_op, JoinOp)\n        is_llm_convert = isinstance(sample_op, LLMConvert)\n        is_llm_filter = isinstance(sample_op, LLMFilter)\n        is_llm_topk = isinstance(sample_op, TopKOp) and isinstance(sample_op.index, Collection)\n        self.is_llm_op = is_llm_convert or is_llm_filter or is_llm_topk or self.is_llm_join\n        self.is_llm_convert = is_llm_convert\n\n        # get order in which we will sample physical operators for this logical operator\n        sample_op_indices = self._get_op_index_order(op_set, seed)\n\n        # construct the initial set of frontier and reservoir operators\n        
self.frontier_ops = [op_set[sample_idx] for sample_idx in sample_op_indices[:self.k]]\n        self.reservoir_ops = [op_set[sample_idx] for sample_idx in sample_op_indices[self.k:]]\n        self.off_frontier_ops: list[PhysicalOperator] = []\n\n        # keep track of the source indices processed by each physical operator\n        self.full_op_id_to_sources_processed = {op.get_full_op_id(): set() for op in op_set}\n        self.full_op_id_to_sources_not_processed = {op.get_full_op_id(): source_indices for op in op_set}\n        self.max_inputs = len(source_indices)\n\n        # set the initial inputs for this logical operator; we maintain a mapping from source_unique_logical_op_id --> source_indices --> input;\n        # for each unique source and (tuple of) source indices, we store its output, which is an input to this operator\n        # for scan operators, we use the default name \"source\" since these operators have no source\n        self.source_indices_to_inputs = {source_unique_logical_op_id: {} for source_unique_logical_op_id in source_unique_logical_op_ids}\n        if self.is_scan_op:\n            self.source_indices_to_inputs[\"source\"] = {source_idx: [int(source_idx.split(\"-\")[-1])] for source_idx in source_indices}\n        \n\n    def get_frontier_ops(self) -> list[PhysicalOperator]:\n        \"\"\"\n        Returns the set of frontier operators for this OpFrontier.\n        \"\"\"\n        return self.frontier_ops\n\n    def get_off_frontier_ops(self) -> list[PhysicalOperator]:\n        \"\"\"\n        Returns the set of off-frontier operators for this OpFrontier.\n        \"\"\"\n        return self.off_frontier_ops\n\n    def _compute_op_id_to_pareto_distance(self, priors: dict[str, dict[str, float]]) -> dict[str, float]:\n        \"\"\"\n        Return l2-distance for each operator from the pareto frontier.\n        \"\"\"\n        # get the dictionary representation of this policy\n        policy_dict = self.policy.get_dict()\n\n        # 
compute the pareto optimal set of operators\n        pareto_op_set = set()\n        for op_id, metrics in priors.items():\n            cost, time, quality = metrics[\"cost\"], metrics[\"time\"], metrics[\"quality\"]\n            pareto_frontier = True\n\n            # check if any other operator dominates op_id\n            for other_op_id, other_metrics in priors.items():\n                other_cost, other_time, other_quality = other_metrics[\"cost\"], other_metrics[\"time\"], other_metrics[\"quality\"]\n                if op_id == other_op_id:\n                    continue\n\n                # if op_id is dominated by other_op_id, set pareto_frontier = False and break\n                # NOTE: here we use a strict inequality (instead of the usual <= or >=) because\n                #       all ops which have equal cost / time / quality / sel. should not be\n                #       filtered out from sampling by our logic in this function\n                cost_dominated = True if policy_dict[\"cost\"] == 0.0 else other_cost < cost\n                time_dominated = True if policy_dict[\"time\"] == 0.0 else other_time < time\n                quality_dominated = True if policy_dict[\"quality\"] == 0.0 else other_quality > quality\n                if cost_dominated and time_dominated and quality_dominated:\n                    pareto_frontier = False\n                    break\n\n            # add op_id to pareto frontier if it's not dominated\n            if pareto_frontier:\n                pareto_op_set.add(op_id)\n\n        # compute the shortest distance from each operator to the pareto frontier\n        op_id_to_pareto_distance = {}\n        for op_id, metrics in priors.items():\n            # set distance to 0.0 if this operator is on the pareto frontier\n            if op_id in pareto_op_set:\n                op_id_to_pareto_distance[op_id] = 0.0\n                continue\n\n            # otherwise, compute min_dist to pareto operators\n            min_dist = 
None\n            cost, time, quality = metrics[\"cost\"], metrics[\"time\"], metrics[\"quality\"]\n            for pareto_op_id in pareto_op_set:\n                pareto_cost, pareto_time, pareto_quality = priors[pareto_op_id][\"cost\"], priors[pareto_op_id][\"time\"], priors[pareto_op_id][\"quality\"]\n\n                cost_dist_squared = 0.0 if policy_dict[\"cost\"] == 0.0 else (cost - pareto_cost) ** 2\n                time_dist_squared = 0.0 if policy_dict[\"time\"] == 0.0 else (time - pareto_time) ** 2\n                quality_dist_squared = 0.0 if policy_dict[\"quality\"] == 0.0 else (quality - pareto_quality) ** 2\n                dist = np.sqrt(cost_dist_squared + time_dist_squared + quality_dist_squared)\n                if min_dist is None or dist < min_dist:\n                    min_dist = dist\n\n            # set minimum distance for this operator\n            op_id_to_pareto_distance[op_id] = min_dist\n        \n        return op_id_to_pareto_distance\n\n    def _compute_naive_priors(self, op_set: list[PhysicalOperator]) -> dict[str, dict[str, float]]:\n        naive_priors = {}\n        for op in op_set:\n            # use naive cost estimates with dummy source estimates to compute priors\n            source_op_estimates = OperatorCostEstimates(quality=1.0, cost_per_record=0.0, time_per_record=0.0, cardinality=100)\n            op_estimates = (\n                op.naive_cost_estimates(source_op_estimates, source_op_estimates)\n                if self.is_llm_join\n                else op.naive_cost_estimates(source_op_estimates)\n            )\n\n            # get op_id for this operator\n            op_id = op.get_op_id()\n\n            # set the naive quality, cost, and time priors for this operator\n            naive_priors[op_id] = {\n                \"quality\": op_estimates.quality,\n                \"cost\": op_estimates.cost_per_record,\n                \"time\": op_estimates.time_per_record,\n            }\n\n        return naive_priors\n\n 
   def _get_op_index_order(self, op_set: list[PhysicalOperator], seed: int) -> list[int]:\n        \"\"\"\n        Returns a list of indices for the operators in the op_set.\n        \"\"\"\n        # if this is not an llm-operator, we simply return the indices in random order\n        if not self.is_llm_op or self.dont_use_priors:\n            if self.is_llm_convert:\n                print(\"Using NO PRIORS for operator sampling order\")\n            rng = np.random.default_rng(seed=seed)\n            op_indices = np.arange(len(op_set))\n            rng.shuffle(op_indices)\n            return op_indices\n\n        # if this is an llm-operator, but we do not have priors, we first compute naive priors\n        if self.priors is None or any([op_id not in self.priors for op_id in map(lambda op: op.get_op_id(), op_set)]):\n            if self.is_llm_convert:\n                print(\"Using NAIVE PRIORS for operator sampling order\")\n            self.priors = self._compute_naive_priors(op_set)\n\n        # NOTE: self.priors is a dictionary with format:\n        # {op_id: {\"quality\": quality, \"cost\": cost, \"time\": time}}\n\n        # compute mean and std. dev. 
for each field\n        qualities = [op_priors[\"quality\"] for op_priors in self.priors.values()]\n        costs = [op_priors[\"cost\"] for op_priors in self.priors.values()]\n        times = [op_priors[\"time\"] for op_priors in self.priors.values()]\n        metric_to_mean = {\"quality\": np.mean(qualities), \"cost\": np.mean(costs), \"time\": np.mean(times)}\n        metric_to_std = {\"quality\": np.std(qualities), \"cost\": np.std(costs), \"time\": np.std(times)}\n\n        # normalize the scale of each field to be the same\n        for _, op_priors in self.priors.items():\n            for metric, value in op_priors.items():\n                if metric_to_std[metric] == 0.0:\n                    op_priors[metric] = metric_to_mean[metric]\n                else:\n                    op_priors[metric] = (value - metric_to_mean[metric]) / metric_to_std[metric]\n\n        # then, we compute the l2-distance from the pareto frontier for each operator\n        op_id_to_distance = self._compute_op_id_to_pareto_distance(self.priors)\n\n        # compute tuple for every operator, invert quality so ascending sort puts\n        # best operator first: (op_id, dist, -1 * quality, cost, time);\n        op_tuples = []\n        for op in op_set:\n            op_id = op.get_op_id()\n            op_priors = self.priors[op_id]\n            op_tuple = (op_id, op_id_to_distance[op_id], -1 * op_priors[\"quality\"], op_priors[\"cost\"], op_priors[\"time\"])\n            op_tuples.append(op_tuple)\n\n        # sort tuples on distance, then second dim\n        second_dim_idx = None\n        if self.policy.get_primary_metric() == \"quality\":\n            second_dim_idx = 2\n        elif self.policy.get_primary_metric() == \"cost\":\n            second_dim_idx = 3\n        elif self.policy.get_primary_metric() == \"time\":\n            second_dim_idx = 4\n\n        # sort based on distance from pareto frontier; break ties with performance on max / min metric\n        op_tuples = 
sorted(op_tuples, key=lambda x: (x[1], x[second_dim_idx]))\n\n        # return final list of op indices in sample order\n        op_id_to_idx = {op.get_op_id(): idx for idx, op in enumerate(op_set)}\n        op_indices = [op_id_to_idx[op_tuple[0]] for op_tuple in op_tuples]\n\n        return op_indices\n\n    def _get_op_source_indices_pairs(self) -> list[tuple[PhysicalOperator, tuple[str] | None]]:\n        \"\"\"\n        Returns a list of tuples for (op, source_indices) which this operator needs to execute\n        in the next iteration.\n        \"\"\"\n        op_source_indices_pairs = []\n\n        # if this operator is not being optimized: we don't request inputs, but simply process what we are given / told to (in the case of scans)\n        if not self.is_llm_op and len(self.frontier_ops) == 1:\n            return [(self.frontier_ops[0], None)]\n\n        # otherwise, sample (operator, source_indices) pairs\n        for op in self.frontier_ops:\n            # execute new operators on first j indices per root dataset, and previously sampled operators on one per root dataset\n            new_operator = self.full_op_id_to_sources_processed[op.get_full_op_id()] == set()\n            samples_per_root_dataset = self.j if new_operator else 1\n            num_root_datasets = len(self.root_dataset_ids)\n            num_samples = samples_per_root_dataset**num_root_datasets\n            samples = self.full_op_id_to_sources_not_processed[op.get_full_op_id()][:num_samples]\n            for source_indices in samples:\n                op_source_indices_pairs.append((op, source_indices))\n\n        return op_source_indices_pairs\n\n    def get_source_indices_for_next_iteration(self) -> set[tuple[str]]:\n        \"\"\"\n        Returns the set of source indices which need to be sampled for the next iteration.\n        \"\"\"\n        op_source_indices_pairs = self._get_op_source_indices_pairs()\n        return set([source_indices for _, source_indices in 
op_source_indices_pairs if source_indices is not None])\n\n    def get_frontier_op_inputs(self, source_indices_to_sample: set[tuple[str]], max_quality_op: PhysicalOperator) -> list[tuple[PhysicalOperator, tuple[str], list[DataRecord] | list[int] | None]]:\n        \"\"\"\n        Returns the list of frontier operators and their next input to process. If there are\n        any indices in `source_indices_to_sample` which this operator does not sample on its own, then\n        we also have this frontier process those source indices' input with its max quality operator.\n        \"\"\"\n        # if this is an aggregate, run on every input\n        if self.is_aggregate_op:\n            # NOTE: we don't keep track of source indices for aggregate (would require computing powerset of all source records);\n            #       thus, we cannot currently support optimizing plans w/LLM operators after aggregations\n            op = self.frontier_ops[0]\n            all_inputs = []\n            for _, source_indices_to_inputs in self.source_indices_to_inputs.items():\n                for _, inputs in source_indices_to_inputs.items():\n                    all_inputs.extend(inputs)\n            return [(op, tuple(), all_inputs)]\n\n        ### for optimized operators\n        # get the list of (op, source_indices) pairs which this operator needs to execute\n        op_source_indices_pairs = self._get_op_source_indices_pairs()\n\n        # remove any root datasets which this op frontier does not have access to from the source_indices_to_sample\n        def remove_unavailable_root_datasets(source_indices: str | tuple) -> str | tuple | None:\n            # base case: source_indices is a string\n            if isinstance(source_indices, str):\n                return source_indices if source_indices.split(\"---\")[0] in self.root_dataset_ids else None\n\n            # recursive case: source_indices is a tuple\n            left_indices = source_indices[0]\n            right_indices = 
source_indices[1]\n            left_filtered = remove_unavailable_root_datasets(left_indices)\n            right_filtered = remove_unavailable_root_datasets(right_indices)\n            if left_filtered is None and right_filtered is None:\n                return None\n\n            if left_filtered is None:\n                return right_filtered\n            if right_filtered is None:\n                return left_filtered\n            return (left_filtered, right_filtered)\n\n        source_indices_to_sample = {remove_unavailable_root_datasets(source_indices) for source_indices in source_indices_to_sample}\n\n        # if there are any source_indices in source_indices_to_sample which are not sampled by this operator,\n        # apply the max quality operator (and any other frontier operators with no samples)\n        sampled_source_indices = set(map(lambda tup: tup[1], op_source_indices_pairs))\n        unsampled_source_indices = source_indices_to_sample - sampled_source_indices\n        for source_indices in unsampled_source_indices:\n            op_source_indices_pairs.append((max_quality_op, source_indices))\n            for op in self.frontier_ops:\n                if self.full_op_id_to_sources_processed[op.get_full_op_id()] == set() and op.get_full_op_id() != max_quality_op.get_full_op_id():\n                    op_source_indices_pairs.append((op, source_indices))\n\n        # construct the op inputs\n        op_inputs = []\n        if self.is_llm_join:\n            left_source_unique_logical_op_id, right_source_unique_logical_op_id = list(self.source_indices_to_inputs)\n            left_source_indices_to_inputs = self.source_indices_to_inputs[left_source_unique_logical_op_id]\n            right_source_indices_to_inputs = self.source_indices_to_inputs[right_source_unique_logical_op_id]\n            for op, source_indices in op_source_indices_pairs:\n                left_source_indices = source_indices[0]\n                right_source_indices = 
source_indices[1]\n                left_inputs = left_source_indices_to_inputs.get(left_source_indices, [])\n                right_inputs = right_source_indices_to_inputs.get(right_source_indices, [])\n                left_inputs = [input for input in left_inputs if input is not None]\n                right_inputs = [input for input in right_inputs if input is not None]\n                if len(left_inputs) > 0 and len(right_inputs) > 0:\n                    op_inputs.append((op, (left_source_indices, right_source_indices), (left_inputs, right_inputs)))\n            return op_inputs\n\n        # if operator is not a join\n        source_unique_logical_op_id = list(self.source_indices_to_inputs)[0]\n        op_inputs = [\n            (op, source_indices, input)\n            for op, source_indices in op_source_indices_pairs\n            for input in self.source_indices_to_inputs[source_unique_logical_op_id].get(source_indices, [])\n        ]\n\n        return op_inputs\n\n    def update_frontier(self, unique_logical_op_id: str, plan_stats: SentinelPlanStats, full_op_id_to_source_indices_processed: dict[str, set[list]]) -> None:\n        \"\"\"\n        Update the set of frontier operators, pulling in new ones from the reservoir as needed.\n        This function will:\n        1. Compute the mean, LCB, and UCB for the cost, time, quality, and selectivity of each frontier operator\n        2. Compute the pareto optimal set of frontier operators (using the mean values)\n        3. Update the frontier and reservoir sets of operators based on their LCB/UCB overlap with the pareto frontier\n        \"\"\"\n        # NOTE: downstream operators may end up re-computing the same record_id with a diff. 
input as\n        #       upstream operators change; in this case, we de-duplicate record_op_stats with identical record_ids\n        #       and keep the one with the maximum quality\n        # get a mapping from full_op_id --> list[RecordOpStats]\n        full_op_id_to_op_stats: dict[str, OperatorStats] = plan_stats.operator_stats.get(unique_logical_op_id, {})\n        full_op_id_to_record_op_stats: dict[str, list[RecordOpStats]] = {}\n        for full_op_id, op_stats in full_op_id_to_op_stats.items():\n            # skip over operators which have not been sampled\n            if len(op_stats.record_op_stats_lst) == 0:\n                continue\n\n            # compute mapping from record_id to highest quality record op stats\n            record_id_to_max_quality_record_op_stats = {}\n            for record_op_stats in op_stats.record_op_stats_lst:\n                record_id = record_op_stats.record_id\n                if record_id not in record_id_to_max_quality_record_op_stats:  # noqa: SIM114\n                    record_id_to_max_quality_record_op_stats[record_id] = record_op_stats\n\n                elif record_op_stats.quality > record_id_to_max_quality_record_op_stats[record_id].quality:\n                    record_id_to_max_quality_record_op_stats[record_id] = record_op_stats\n\n            # compute final list of record op stats\n            full_op_id_to_record_op_stats[full_op_id] = list(record_id_to_max_quality_record_op_stats.values())\n\n        # NOTE: it is possible for the full_op_id_to_record_op_stats to be empty if there is a duplicate operator\n        # (e.g. 
a scan of the same dataset) which has all of its results cached and no new_record_op_stats;\n        # in this case, we do not update the frontier\n        if full_op_id_to_record_op_stats == {}:\n            return\n\n        # update the set of source indices processed by each physical operator\n        for full_op_id, source_indices_processed in full_op_id_to_source_indices_processed.items():\n            # update the set of source indices processed\n            for source_indices in source_indices_processed:\n                source_indices = source_indices[0] if len(source_indices) == 1 else tuple(source_indices)\n                self.full_op_id_to_sources_processed[full_op_id].add(source_indices)\n                if source_indices in self.full_op_id_to_sources_not_processed[full_op_id]:\n                    self.full_op_id_to_sources_not_processed[full_op_id].remove(source_indices)\n\n            # update the set of source indices not processed\n            self.full_op_id_to_sources_not_processed[full_op_id] = [\n                indices for indices in self.full_op_id_to_sources_not_processed[full_op_id]\n                if indices not in source_indices_processed\n            ]\n\n        # compute mapping of physical op to num samples and total samples drawn\n        full_op_id_to_num_samples, total_num_samples = {}, 0\n        for full_op_id, record_op_stats_lst in full_op_id_to_record_op_stats.items():\n            # compute the number of samples as the length of the record_op_stats_lst\n            num_samples = len(record_op_stats_lst)\n            full_op_id_to_num_samples[full_op_id] = num_samples\n            total_num_samples += num_samples\n\n        # compute avg. 
selectivity, cost, time, and quality for each physical operator\n        def total_output(record_op_stats_lst: list[RecordOpStats]):\n            return sum([record_op_stats.passed_operator for record_op_stats in record_op_stats_lst])\n\n        def total_input(record_op_stats_lst: list[RecordOpStats]):\n            # TODO: this is okay for now because we only really need these calculations for Converts and Filters,\n            #       but this will need more thought if/when we optimize joins\n            all_parent_ids = []\n            for record_op_stats in record_op_stats_lst:\n                all_parent_ids.extend(\n                    [None]\n                    if record_op_stats.record_parent_ids is None\n                    else record_op_stats.record_parent_ids\n                )\n            return len(set(all_parent_ids))\n\n        full_op_id_to_mean_selectivity = {\n            full_op_id: total_output(record_op_stats_lst) / total_input(record_op_stats_lst)\n            for full_op_id, record_op_stats_lst in full_op_id_to_record_op_stats.items()\n        }\n        full_op_id_to_mean_cost = {\n            full_op_id: np.mean([record_op_stats.cost_per_record for record_op_stats in record_op_stats_lst])\n            for full_op_id, record_op_stats_lst in full_op_id_to_record_op_stats.items()\n        }\n        full_op_id_to_mean_time = {\n            full_op_id: np.mean([record_op_stats.time_per_record for record_op_stats in record_op_stats_lst])\n            for full_op_id, record_op_stats_lst in full_op_id_to_record_op_stats.items()\n        }\n        full_op_id_to_mean_quality = {\n            full_op_id: np.mean([record_op_stats.quality for record_op_stats in record_op_stats_lst if record_op_stats.quality is not None])\n            for full_op_id, record_op_stats_lst in full_op_id_to_record_op_stats.items()\n        }\n\n        # # compute average, LCB, and UCB of each operator; the confidence bounds depend upon\n        # # the computation of 
the alpha parameter, which we scale to be 0.5 * the mean (of means)\n        # # of the metric across all operators in this operator set\n        # cost_alpha = 0.5 * np.mean([mean_cost for mean_cost in full_op_id_to_mean_cost.values()])\n        # time_alpha = 0.5 * np.mean([mean_time for mean_time in full_op_id_to_mean_time.values()])\n        # quality_alpha = 0.5 * np.mean([mean_quality for mean_quality in full_op_id_to_mean_quality.values()])\n        # selectivity_alpha = 0.5 * np.mean([mean_selectivity for mean_selectivity in full_op_id_to_mean_selectivity.values()])\n        cost_alpha = 0.5 * (np.max(list(full_op_id_to_mean_cost.values())) - np.min(list(full_op_id_to_mean_cost.values())))\n        time_alpha = 0.5 * (np.max(list(full_op_id_to_mean_time.values())) - np.min(list(full_op_id_to_mean_time.values())))\n        quality_alpha = 0.5 * (np.max(list(full_op_id_to_mean_quality.values())) - np.min(list(full_op_id_to_mean_quality.values())))\n        selectivity_alpha = 0.5 * (np.max(list(full_op_id_to_mean_selectivity.values())) - np.min(list(full_op_id_to_mean_selectivity.values())))\n\n        # compute metrics for each physical operator\n        op_metrics = {}\n        for full_op_id in full_op_id_to_record_op_stats:\n            sample_ratio = np.sqrt(np.log(total_num_samples) / full_op_id_to_num_samples[full_op_id])\n            exploration_terms = np.array([cost_alpha * sample_ratio, time_alpha * sample_ratio, quality_alpha * sample_ratio, selectivity_alpha * sample_ratio])\n            mean_terms = (full_op_id_to_mean_cost[full_op_id], full_op_id_to_mean_time[full_op_id], full_op_id_to_mean_quality[full_op_id], full_op_id_to_mean_selectivity[full_op_id])\n\n            # NOTE: we could clip these; however I will not do so for now to allow for arbitrary quality metric(s)\n            lcb_terms = mean_terms - exploration_terms\n            ucb_terms = mean_terms + exploration_terms\n            op_metrics[full_op_id] = {\"mean\": mean_terms, 
\"lcb\": lcb_terms, \"ucb\": ucb_terms}\n\n        # get the tuple representation of this policy\n        policy_dict = self.policy.get_dict()\n\n        # compute the pareto optimal set of operators\n        pareto_op_set = set()\n        for full_op_id, metrics in op_metrics.items():\n            cost, time, quality, selectivity = metrics[\"mean\"]\n            pareto_frontier = True\n\n            # check if any other operator dominates full_op_id\n            for other_full_op_id, other_metrics in op_metrics.items():\n                other_cost, other_time, other_quality, other_selectivity = other_metrics[\"mean\"]\n                if full_op_id == other_full_op_id:\n                    continue\n\n                # if full_op_id is dominated by other_full_op_id, set pareto_frontier = False and break\n                # NOTE: here we use a strict inequality (instead of the usual <= or >=) because\n                #       all ops which have equal cost / time / quality / sel. should not be\n                #       filtered out from sampling by our logic in this function\n                cost_dominated = True if policy_dict[\"cost\"] == 0.0 else other_cost < cost\n                time_dominated = True if policy_dict[\"time\"] == 0.0 else other_time < time\n                quality_dominated = True if policy_dict[\"quality\"] == 0.0 else other_quality > quality\n                selectivity_dominated = True if not self.is_filter_op else other_selectivity < selectivity\n                if cost_dominated and time_dominated and quality_dominated and selectivity_dominated:\n                    pareto_frontier = False\n                    break\n\n            # add full_op_id to pareto frontier if it's not dominated\n            if pareto_frontier:\n                pareto_op_set.add(full_op_id)\n\n        # iterate over op metrics and compute the new frontier set of operators\n        new_frontier_full_op_ids = set()\n        for full_op_id, metrics in 
op_metrics.items():\n\n            # if this op is fully sampled, do not keep it on the frontier\n            if len(self.full_op_id_to_sources_processed[full_op_id]) == self.max_inputs:\n                continue\n\n            # if this op is pareto optimal keep it in our frontier ops\n            if full_op_id in pareto_op_set:\n                new_frontier_full_op_ids.add(full_op_id)\n                continue\n\n            # otherwise, if this op overlaps with an op on the pareto frontier, keep it in our frontier ops\n            # NOTE: for now, we perform an optimistic comparison with the ucb/lcb\n            pareto_frontier = True\n            op_cost, op_time, _, op_selectivity = metrics[\"lcb\"]\n            op_quality = metrics[\"ucb\"][2]\n            for pareto_full_op_id in pareto_op_set:\n                pareto_cost, pareto_time, _, pareto_selectivity = op_metrics[pareto_full_op_id][\"ucb\"]\n                pareto_quality = op_metrics[pareto_full_op_id][\"lcb\"][2]\n\n                # if full_op_id is dominated by pareto_full_op_id, set pareto_frontier = False and break\n                cost_dominated = True if policy_dict[\"cost\"] == 0.0 else pareto_cost <= op_cost\n                time_dominated = True if policy_dict[\"time\"] == 0.0 else pareto_time <= op_time\n                quality_dominated = True if policy_dict[\"quality\"] == 0.0 else pareto_quality >= op_quality\n                selectivity_dominated = True if not self.is_filter_op else pareto_selectivity <= op_selectivity\n                if cost_dominated and time_dominated and quality_dominated and selectivity_dominated:\n                    pareto_frontier = False\n                    break\n\n            # add full_op_id to pareto frontier if it's not dominated\n            if pareto_frontier:\n                new_frontier_full_op_ids.add(full_op_id)\n\n        # for operators that were in the frontier, keep them in the frontier if they\n        # are still pareto optimal, otherwise, 
move them to the end of the reservoir\n        new_frontier_ops = []\n        for op in self.frontier_ops:\n            if op.get_full_op_id() in new_frontier_full_op_ids:\n                new_frontier_ops.append(op)\n            else:\n                self.off_frontier_ops.append(op)\n\n        # if there are operators we previously sampled which are now back on the frontier\n        # add them to the frontier, otherwise, put them back in the off_frontier_ops\n        new_off_frontier_ops = []\n        for op in self.off_frontier_ops:\n            if op.get_full_op_id() in new_frontier_full_op_ids:\n                new_frontier_ops.append(op)\n            else:\n                new_off_frontier_ops.append(op)\n\n        # finally, if we have fewer than k operators in the frontier, sample new operators\n        # from the reservoir and put them in the frontier\n        while len(new_frontier_ops) < self.k and len(self.reservoir_ops) > 0:\n            new_op = self.reservoir_ops.pop(0)\n            new_frontier_ops.append(new_op)\n\n        # update the frontier and off frontier ops\n        self.frontier_ops = new_frontier_ops\n        self.off_frontier_ops = new_off_frontier_ops\n\n    def pick_highest_quality_output(self, record_sets: list[DataRecordSet]) -> DataRecordSet:\n        # if there's only one operator in the set, we return its record_set\n        if len(record_sets) == 1:\n            return record_sets[0]\n\n        # NOTE: I don't like that this assumes the models are consistent in\n        #       how they order their record outputs for one-to-many converts;\n        #       eventually we can try out more robust schemes to account for\n        #       differences in ordering\n        # aggregate records at each index in the response\n        idx_to_records = {}\n        for record_set in record_sets:\n            for idx in range(len(record_set)):\n                record, record_op_stats = record_set[idx], record_set.record_op_stats[idx]\n           
     if idx not in idx_to_records:\n                    idx_to_records[idx] = [(record, record_op_stats)]\n                else:\n                    idx_to_records[idx].append((record, record_op_stats))\n\n        # compute highest quality answer at each index\n        out_records = []\n        out_record_op_stats = []\n        for idx in range(len(idx_to_records)):\n            records_lst, record_op_stats_lst = zip(*idx_to_records[idx])\n            max_quality_record, max_quality = records_lst[0], record_op_stats_lst[0].quality if record_op_stats_lst[0].quality is not None else 0.0\n            max_quality_stats = record_op_stats_lst[0]\n            for record, record_op_stats in zip(records_lst[1:], record_op_stats_lst[1:]):\n                record_quality = record_op_stats.quality if record_op_stats.quality is not None else 0.0\n                if record_quality > max_quality:\n                    max_quality_record = record\n                    max_quality = record_quality\n                    max_quality_stats = record_op_stats\n            out_records.append(max_quality_record)\n            out_record_op_stats.append(max_quality_stats)\n\n        # create and return final DataRecordSet\n        return DataRecordSet(out_records, out_record_op_stats)\n\n    def update_inputs(self, source_unique_logical_op_id: str, source_indices_to_record_sets: dict[tuple[int], list[DataRecordSet]]):\n        \"\"\"\n        Update the inputs for this logical operator based on the outputs of the previous logical operator.\n        \"\"\"\n        for source_indices, record_sets in source_indices_to_record_sets.items():\n            input = []\n            max_quality_record_set = self.pick_highest_quality_output(record_sets)\n            for record in max_quality_record_set:\n                input.append(record if record._passed_operator else None)\n\n            self.source_indices_to_inputs[source_unique_logical_op_id][source_indices] = input\n\nclass 
MABExecutionStrategy(SentinelExecutionStrategy):\n    \"\"\"\n    This class implements the Multi-Armed Bandit (MAB) execution strategy for SentinelQueryProcessors.\n\n    NOTE: the number of samples will slightly exceed the sample_budget if the number of operator\n    calls does not perfectly match the sample_budget. This may cause some minor discrepancies with\n    the progress manager as a result.\n    \"\"\"\n    def _remove_filtered_records_from_downstream_ops(self, topo_idx: int, plan: SentinelPlan, op_frontiers: dict[str, OpFrontier], source_indices_to_all_record_sets: dict[int, list[DataRecordSet]]) -> None:\n        \"\"\"Remove records which were filtered out by a NonLLMFilter from all downstream operators.\"\"\"\n        filtered_source_indices = set()\n\n        # NonLLMFilter will have one record_set per source_indices with a single record\n        for source_indices, record_sets in source_indices_to_all_record_sets.items():\n            record: DataRecord = record_sets[0][0]\n            if not record._passed_operator:\n                filtered_source_indices.add(source_indices)\n\n        # remove filtered source indices from all downstream operators\n        if len(filtered_source_indices) > 0:\n            for downstream_topo_idx in range(topo_idx + 1, len(plan)):\n                downstream_logical_op_id = plan[downstream_topo_idx][0]\n                downstream_unique_logical_op_id = f\"{downstream_topo_idx}-{downstream_logical_op_id}\"\n                downstream_op_frontier = op_frontiers[downstream_unique_logical_op_id]\n                for full_op_id in downstream_op_frontier.full_op_id_to_sources_not_processed:\n                    downstream_op_frontier.full_op_id_to_sources_not_processed[full_op_id] = [\n                        indices for indices in downstream_op_frontier.full_op_id_to_sources_not_processed[full_op_id]\n                        if indices not in filtered_source_indices\n                    ]\n\n    def 
_get_max_quality_op(self, unique_logical_op_id: str, op_frontiers: dict[str, OpFrontier], plan_stats: SentinelPlanStats) -> PhysicalOperator:\n        \"\"\"\n        Returns the operator in the frontier with the highest (estimated) quality.\n        \"\"\"\n        # get the (off) frontier operators for this logical_op_id\n        frontier_ops = op_frontiers[unique_logical_op_id].get_frontier_ops() + op_frontiers[unique_logical_op_id].get_off_frontier_ops()\n\n        # get a mapping from full_op_id --> list[RecordOpStats]\n        full_op_id_to_op_stats: dict[str, OperatorStats] = plan_stats.operator_stats.get(unique_logical_op_id, {})\n        full_op_id_to_record_op_stats = {\n            full_op_id: op_stats.record_op_stats_lst\n            for full_op_id, op_stats in full_op_id_to_op_stats.items()\n        }\n\n        # iterate over the frontier ops and return the one with the highest quality\n        max_quality_op, max_avg_quality = None, None\n        for op in frontier_ops:\n            op_quality_stats = []\n            if op.get_full_op_id() in full_op_id_to_record_op_stats:\n                op_quality_stats = [\n                    record_op_stats.quality\n                    for record_op_stats in full_op_id_to_record_op_stats[op.get_full_op_id()]\n                    if record_op_stats.quality is not None\n                ]\n            avg_op_quality = sum(op_quality_stats) / len(op_quality_stats) if len(op_quality_stats) > 0 else 0.0\n            if max_avg_quality is None or avg_op_quality > max_avg_quality:\n                max_quality_op = op\n                max_avg_quality = avg_op_quality\n\n        return max_quality_op\n\n    def _compute_termination_condition(self, samples_drawn: int, sampling_cost: float) -> bool:\n        return (samples_drawn >= self.sample_budget) if self.sample_cost_budget is None else (sampling_cost >= self.sample_cost_budget)\n\n    def _execute_sentinel_plan(\n            self,\n            plan: SentinelPlan,\n   
         op_frontiers: dict[str, OpFrontier],\n            validator: Validator,\n            plan_stats: SentinelPlanStats,\n        ) -> SentinelPlanStats:\n        # sample records and operators and update the frontiers\n        samples_drawn, sampling_cost = 0, 0.0\n        while not self._compute_termination_condition(samples_drawn, sampling_cost):\n            # pre-compute the set of source indices which will need to be sampled\n            source_indices_to_sample = set()\n            for op_frontier in op_frontiers.values():\n                source_indices = op_frontier.get_source_indices_for_next_iteration()\n                source_indices_to_sample.update(source_indices)\n\n            # execute operator sets in sequence\n            for topo_idx, (logical_op_id, op_set) in enumerate(plan):\n                # compute unique logical op id within plan\n                unique_logical_op_id = f\"{topo_idx}-{logical_op_id}\"\n\n                # use the execution cache to determine the maximum quality operator for this logical_op_id\n                max_quality_op = self._get_max_quality_op(unique_logical_op_id, op_frontiers, plan_stats)\n\n                # get frontier ops and their next input\n                def filter_and_clean_inputs(frontier_op_inputs: list[tuple]) -> list[tuple]:\n                    cleaned_inputs = []\n                    for tup in frontier_op_inputs:\n                        input = tup[-1]\n                        if isinstance(input, list):\n                            input = [record for record in input if record is not None]\n                        if input is not None and input != []:\n                            cleaned_inputs.append((tup[0], tup[1], input))\n                    return cleaned_inputs\n                frontier_op_inputs = op_frontiers[unique_logical_op_id].get_frontier_op_inputs(source_indices_to_sample, max_quality_op)\n                frontier_op_inputs = filter_and_clean_inputs(frontier_op_inputs)\n\n            
    # skip this operator set if frontier_op_inputs is empty, as this means all records have been filtered out\n                if len(frontier_op_inputs) == 0:\n                    continue\n\n                # run sampled operators on sampled inputs and update the number of samples drawn\n                source_indices_to_record_set_tuples, num_llm_ops = self._execute_op_set(unique_logical_op_id, frontier_op_inputs)\n                samples_drawn += num_llm_ops\n\n                # score the quality of each generated output\n                source_indices_to_all_record_sets = {\n                    source_indices: [(record_set, op) for record_set, op, _ in record_set_tuples]\n                    for source_indices, record_set_tuples in source_indices_to_record_set_tuples.items()\n                }\n                source_indices_to_all_record_sets, val_gen_stats = self._score_quality(validator, source_indices_to_all_record_sets)\n\n                # update the progress manager with validation cost\n                self.progress_manager.incr_overall_progress_cost(val_gen_stats.cost_per_record)\n\n                # remove records that were read from the execution cache before adding to record op stats\n                new_record_op_stats = []\n                for _, record_set_tuples in source_indices_to_record_set_tuples.items():\n                    for record_set, _, is_new in record_set_tuples:\n                        if is_new:\n                            new_record_op_stats.extend(record_set.record_op_stats)\n\n                # update plan stats\n                plan_stats.add_record_op_stats(unique_logical_op_id, new_record_op_stats)\n                plan_stats.add_validation_gen_stats(unique_logical_op_id, val_gen_stats)\n                sampling_cost = plan_stats.get_total_cost_so_far()\n\n                # provide the best record sets as inputs to the next logical operator\n                next_unique_logical_op_id = 
plan.get_next_unique_logical_op_id(unique_logical_op_id)\n                if next_unique_logical_op_id is not None:\n                    source_indices_to_all_record_sets = {\n                        source_indices: [record_set for record_set, _ in record_set_tuples]\n                        for source_indices, record_set_tuples in source_indices_to_all_record_sets.items()\n                    }\n                    op_frontiers[next_unique_logical_op_id].update_inputs(unique_logical_op_id, source_indices_to_all_record_sets)\n\n                # update the (pareto) frontier for each set of operators\n                full_op_id_to_source_indices_processed = {}\n                for source_indices, record_set_tuples in source_indices_to_record_set_tuples.items():\n                    for _, op, _ in record_set_tuples:\n                        if op.get_full_op_id() not in full_op_id_to_source_indices_processed:\n                            full_op_id_to_source_indices_processed[op.get_full_op_id()] = set()\n                        full_op_id_to_source_indices_processed[op.get_full_op_id()].add(source_indices)\n                op_frontiers[unique_logical_op_id].update_frontier(unique_logical_op_id, plan_stats, full_op_id_to_source_indices_processed)\n\n                # if the operator is a non-llm filter which has filtered out records, remove those records from\n                # all downstream operators' full_op_id_to_sources_not_processed\n                if isinstance(op_set[0], NonLLMFilter) and next_unique_logical_op_id is not None:\n                    self._remove_filtered_records_from_downstream_ops(topo_idx, plan, op_frontiers, source_indices_to_all_record_sets)\n\n        # finalize plan stats\n        plan_stats.finish()\n\n        return plan_stats\n\n\n    def execute_sentinel_plan(self, plan: SentinelPlan, train_dataset: dict[str, Dataset], validator: Validator) -> SentinelPlanStats:\n        logger.info(f\"Executing plan {plan.plan_id} with 
{self.max_workers} workers\")\n\n        # initialize plan stats\n        plan_stats = SentinelPlanStats.from_plan(plan)\n        plan_stats.start()\n\n        # shuffle the indices of records to sample\n        dataset_id_to_shuffled_source_indices = {}\n        for dataset_id, dataset in train_dataset.items():\n            total_num_samples = len(dataset)\n            shuffled_source_indices = [f\"{dataset_id}---{int(idx)}\" for idx in np.arange(total_num_samples)]\n            self.rng.shuffle(shuffled_source_indices)\n            dataset_id_to_shuffled_source_indices[dataset_id] = shuffled_source_indices\n\n        # initialize frontier for each logical operator\n        op_frontiers = {}\n        for topo_idx, (logical_op_id, op_set) in enumerate(plan):\n            unique_logical_op_id = f\"{topo_idx}-{logical_op_id}\"\n            source_unique_logical_op_ids = plan.get_source_unique_logical_op_ids(unique_logical_op_id)\n            root_dataset_ids = plan.get_root_dataset_ids(unique_logical_op_id)\n            sample_op = op_set[0]\n            if isinstance(sample_op, (ScanPhysicalOp, ContextScanOp)):\n                assert len(root_dataset_ids) == 1, f\"Scan for {sample_op} has {len(root_dataset_ids)} > 1 root dataset ids\"\n                root_dataset_id = root_dataset_ids[0]\n                source_indices = dataset_id_to_shuffled_source_indices[root_dataset_id]\n                op_frontiers[unique_logical_op_id] = OpFrontier(op_set, source_unique_logical_op_ids, root_dataset_ids, source_indices, self.k, self.j, self.seed, self.policy, self.priors, self.dont_use_priors)\n            elif isinstance(sample_op, JoinOp):\n                assert len(source_unique_logical_op_ids) == 2, f\"Join for {sample_op} has {len(source_unique_logical_op_ids)} != 2 source logical operators\"\n                left_source_indices = op_frontiers[source_unique_logical_op_ids[0]].source_indices\n                right_source_indices = 
op_frontiers[source_unique_logical_op_ids[1]].source_indices\n                source_indices = []\n                for left_source_idx in left_source_indices:\n                    for right_source_idx in right_source_indices:\n                        source_indices.append((left_source_idx, right_source_idx))\n                op_frontiers[unique_logical_op_id] = OpFrontier(op_set, source_unique_logical_op_ids, root_dataset_ids, source_indices, self.k, self.j, self.seed, self.policy, self.priors, self.dont_use_priors)\n            else:\n                source_indices = op_frontiers[source_unique_logical_op_ids[0]].source_indices\n                op_frontiers[unique_logical_op_id] = OpFrontier(op_set, source_unique_logical_op_ids, root_dataset_ids, source_indices, self.k, self.j, self.seed, self.policy, self.priors, self.dont_use_priors)\n\n        # initialize and start the progress manager\n        self.progress_manager = create_progress_manager(plan, sample_budget=self.sample_budget, sample_cost_budget=self.sample_cost_budget, progress=self.progress)\n        self.progress_manager.start()\n\n        # NOTE: we must handle progress manager outside of _execute_sentinel_plan to ensure that it is shut down correctly;\n        #       if we don't have the `finally:` branch, then program crashes can cause future program runs to fail because\n        #       the progress manager cannot get a handle to the console\n        try:\n            # execute sentinel plan by sampling records and operators\n            plan_stats = self._execute_sentinel_plan(plan, op_frontiers, validator, plan_stats)\n\n        finally:\n            # finish progress tracking\n            self.progress_manager.finish()\n\n        logger.info(f\"Done executing sentinel plan: {plan.plan_id}\")\n        logger.debug(f\"Plan stats: (plan_cost={plan_stats.total_plan_cost}, plan_time={plan_stats.total_plan_time})\")\n\n        return plan_stats\n"
  },
  {
    "path": "src/palimpzest/query/execution/parallel_execution_strategy.py",
    "content": "import logging\nfrom concurrent.futures import ThreadPoolExecutor, wait\n\nfrom palimpzest.constants import PARALLEL_EXECUTION_SLEEP_INTERVAL_SECS\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.models import PlanStats\nfrom palimpzest.query.execution.execution_strategy import ExecutionStrategy\nfrom palimpzest.query.operators.aggregate import AggregateOp\nfrom palimpzest.query.operators.distinct import DistinctOp\nfrom palimpzest.query.operators.join import JoinOp\nfrom palimpzest.query.operators.limit import LimitScanOp\nfrom palimpzest.query.operators.scan import ContextScanOp, ScanPhysicalOp\nfrom palimpzest.query.optimizer.plan import PhysicalPlan\nfrom palimpzest.utils.progress import create_progress_manager\n\nlogger = logging.getLogger(__name__)\n\n\nclass ParallelExecutionStrategy(ExecutionStrategy):\n    \"\"\"\n    A parallel execution strategy that processes data through a pipeline of operators using thread-based parallelism.\n    \"\"\"\n\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n\n    def _any_queue_not_empty(self, queues: dict[str, list] | dict[str, dict[str, list]]) -> bool:\n        \"\"\"Helper function to check if any queue is not empty.\"\"\"\n        for _, value in queues.items():\n            if isinstance(value, dict):\n                if any(len(subqueue) > 0 for subqueue in value.values()):\n                    return True\n            elif len(value) > 0:\n                return True\n        return False\n\n    def _upstream_ops_finished(self, plan: PhysicalPlan, unique_full_op_id: str, input_queues: dict[str, dict[str, list]], future_queues: dict[str, list]) -> bool:\n        \"\"\"Helper function to check if agg / join operator is ready to process its inputs.\"\"\"\n        upstream_unique_full_op_ids = plan.get_upstream_unique_full_op_ids(unique_full_op_id)\n        upstream_input_queues = {upstream_unique_full_op_id: 
input_queues[upstream_unique_full_op_id] for upstream_unique_full_op_id in upstream_unique_full_op_ids}\n        upstream_future_queues = {upstream_unique_full_op_id: future_queues[upstream_unique_full_op_id] for upstream_unique_full_op_id in upstream_unique_full_op_ids}\n        return not (self._any_queue_not_empty(upstream_input_queues) or self._any_queue_not_empty(upstream_future_queues))\n\n    def _finish_outer_join(self, executor: ThreadPoolExecutor, plan: PhysicalPlan, unique_full_op_id: str, input_queues: dict[str, dict[str, list]], future_queues: dict[str, list]) -> None:\n        join_op_upstream_finished = self._upstream_ops_finished(plan, unique_full_op_id, input_queues, future_queues)\n        join_input_queues_empty = all(len(inputs) == 0 for inputs in input_queues[unique_full_op_id].values())\n        join_future_queue_empty = len(future_queues[unique_full_op_id]) == 0\n        if join_op_upstream_finished and join_input_queues_empty and join_future_queue_empty:\n            # process the join one last time with final=True to handle any left/right/outer join logic\n            operator = self.unique_full_op_id_to_operator[unique_full_op_id]\n            if not operator.finished:\n                def finalize_op(operator):\n                    return operator([], [], final=True)\n                future = executor.submit(finalize_op, operator)\n                future_queues[unique_full_op_id].append(future)\n                operator.set_finished()\n\n    def _process_future_results(self, unique_full_op_id: str, future_queues: dict[str, list], plan_stats: PlanStats) -> list[DataRecord]:\n        \"\"\"\n        Helper function which takes a full operator id, the future queues, and plan stats, and performs\n        the updates to plan stats and progress manager before returning the results from the finished futures.\n        \"\"\"\n        # this function is called when the future queue is not empty\n        # and the executor is not busy processing 
other futures\n        done_futures, not_done_futures = wait(future_queues[unique_full_op_id], timeout=PARALLEL_EXECUTION_SLEEP_INTERVAL_SECS)\n\n        # add the unfinished futures back to the previous op's future queue\n        future_queues[unique_full_op_id] = list(not_done_futures)\n\n        # add the finished futures to the input queue for this operator\n        output_records, total_inputs_processed, total_cost = [], 0, 0.0\n        for future in done_futures:\n            output = future.result()\n            record_set, num_inputs_processed = output if self.is_join_op[unique_full_op_id] else (output, 1)\n\n            # record set can be empty if one side of join has no input records yet\n            if len(record_set) == 0:\n                continue\n\n            # otherwise, process records and their stats\n            records = record_set.data_records\n            record_op_stats = record_set.record_op_stats\n\n            # update the inputs processed and total cost\n            total_inputs_processed += num_inputs_processed\n            total_cost += record_set.get_total_cost()\n\n            # update plan stats\n            plan_stats.add_record_op_stats(unique_full_op_id, record_op_stats)\n\n            # add records which aren't filtered to the output records\n            output_records.extend([record for record in records if record._passed_operator])\n\n        # update the progress manager\n        if total_inputs_processed > 0:\n            num_outputs = len(output_records)\n            self.progress_manager.incr(unique_full_op_id, num_inputs=total_inputs_processed, num_outputs=num_outputs, total_cost=total_cost)\n\n        return output_records\n\n    def _execute_plan(\n            self,\n            plan: PhysicalPlan,\n            input_queues: dict[str, dict[str, list]],\n            future_queues: dict[str, list],\n            plan_stats: PlanStats,\n        ) -> tuple[list[DataRecord], PlanStats]:\n        # process all of the input 
records using a thread pool\n        output_records = []\n        with ThreadPoolExecutor(max_workers=self.max_workers) as executor:\n            logger.debug(f\"Created thread pool with {self.max_workers} workers\")\n\n            # execute the plan until either:\n            # 1. all records have been processed, or\n            # 2. the final limit operation has completed (we break out of the loop if this happens)\n            final_op = plan.operator\n            while self._any_queue_not_empty(input_queues) or self._any_queue_not_empty(future_queues):\n                for topo_idx, operator in enumerate(plan):\n                    source_unique_full_op_ids = (\n                        [f\"source_{operator.get_full_op_id()}\"]\n                        if isinstance(operator, (ContextScanOp, ScanPhysicalOp))\n                        else plan.get_source_unique_full_op_ids(topo_idx, operator)\n                    )\n                    unique_full_op_id = f\"{topo_idx}-{operator.get_full_op_id()}\"\n\n                    # get any finished futures from the previous operator and add them to the input queue for this operator\n                    if not isinstance(operator, (ContextScanOp, ScanPhysicalOp)):\n                        for source_unique_full_op_id in source_unique_full_op_ids:\n                            records = self._process_future_results(source_unique_full_op_id, future_queues, plan_stats)\n                            input_queues[unique_full_op_id][source_unique_full_op_id].extend(records)\n\n                            # if the source is a left/right/outer join operator with no more inputs to process, then finish it\n                            if self.is_outer_join_op[source_unique_full_op_id]:\n                                self._finish_outer_join(executor, plan, source_unique_full_op_id, input_queues, future_queues)\n\n                    # for the final operator, add any finished futures to the output_records\n                    if 
unique_full_op_id == f\"{topo_idx}-{final_op.get_full_op_id()}\":\n                        records = self._process_future_results(unique_full_op_id, future_queues, plan_stats)\n                        output_records.extend(records)\n\n                        # if this is a left/right/outer join operator with no more inputs to process, then finish it\n                        if self.is_outer_join_op[unique_full_op_id]:\n                            self._finish_outer_join(executor, plan, unique_full_op_id, input_queues, future_queues)\n\n                    # if this operator does not have enough inputs to execute, then skip it\n                    num_inputs = sum(len(inputs) for inputs in input_queues[unique_full_op_id].values())\n                    agg_op_not_ready = isinstance(operator, AggregateOp) and not self._upstream_ops_finished(plan, unique_full_op_id, input_queues, future_queues)\n                    join_op_not_ready = isinstance(operator, JoinOp) and not self.join_has_downstream_limit_op[unique_full_op_id] and not self._upstream_ops_finished(plan, unique_full_op_id, input_queues, future_queues)\n                    if num_inputs == 0 or agg_op_not_ready or join_op_not_ready:\n                        continue\n\n                    # if this operator is an aggregate, process all the records in the input queue\n                    if isinstance(operator, AggregateOp):\n                        source_unique_full_op_id = source_unique_full_op_ids[0]\n                        input_records = [input_queues[unique_full_op_id][source_unique_full_op_id].pop(0) for _ in range(num_inputs)]\n                        future = executor.submit(operator, input_records)\n                        future_queues[unique_full_op_id].append(future)\n\n                    # if this operator is a join, process all pairs of records from the two input queues\n                    elif isinstance(operator, JoinOp):\n                        left_unique_full_source_op_id = 
source_unique_full_op_ids[0]\n                        left_num_inputs = len(input_queues[unique_full_op_id][left_unique_full_source_op_id])\n                        left_input_records = [input_queues[unique_full_op_id][left_unique_full_source_op_id].pop(0) for _ in range(left_num_inputs)]\n\n                        right_unique_full_source_op_id = source_unique_full_op_ids[1]\n                        right_num_inputs = len(input_queues[unique_full_op_id][right_unique_full_source_op_id])\n                        right_input_records = [input_queues[unique_full_op_id][right_unique_full_source_op_id].pop(0) for _ in range(right_num_inputs)]\n\n                        # NOTE: it would be nice to use executor for join inputs here; but for now synchronizing may be necessary\n                        # future = executor.submit(operator, left_input_records, right_input_records)\n                        # future_queues[unique_full_op_id].append(future)\n                        record_set, num_inputs_processed = operator(left_input_records, right_input_records)\n                        def no_op(rset, num_inputs_processed):\n                            return rset, num_inputs_processed\n                        future = executor.submit(no_op, record_set, num_inputs_processed)\n                        future_queues[unique_full_op_id].append(future)\n\n                    # if this operator is a limit, process one record at a time\n                    elif isinstance(operator, LimitScanOp):\n                        source_unique_full_op_id = source_unique_full_op_ids[0]\n                        num_records_to_process = min(len(input_queues[unique_full_op_id][source_unique_full_op_id]), operator.limit - len(output_records))\n                        for _ in range(num_records_to_process):\n                            input_record = input_queues[unique_full_op_id][source_unique_full_op_id].pop(0)\n                            future = executor.submit(operator, input_record)\n         
                   future_queues[unique_full_op_id].append(future)\n\n                        # if this is the final operator, add any finished futures to the output_records\n                        # immediately so that we can break out of the loop if we've reached the limit\n                        if unique_full_op_id == f\"{topo_idx}-{final_op.get_full_op_id()}\":\n                            records = self._process_future_results(unique_full_op_id, future_queues, plan_stats)\n                            output_records.extend(records)\n\n                    # if this operator is a distinct, process records sequentially\n                    # (distinct is not parallelized because it requires maintaining a set of seen records)\n                    elif isinstance(operator, DistinctOp):\n                        source_unique_full_op_id = source_unique_full_op_ids[0]\n                        input_records = input_queues[unique_full_op_id][source_unique_full_op_id]\n                        for record in input_records:\n                            record_set = operator(record)\n                            def no_op(rset):\n                                return rset\n                            future = executor.submit(no_op, record_set)\n                            future_queues[unique_full_op_id].append(future)\n\n                        # clear the input queue for this operator since we processed all records\n                        input_queues[unique_full_op_id][source_unique_full_op_id].clear()\n\n                    # otherwise, process records according to batch size\n                    else:\n                        source_unique_full_op_id = source_unique_full_op_ids[0]\n                        input_records = input_queues[unique_full_op_id][source_unique_full_op_id]\n                        if self.batch_size is None:\n                            for input_record in input_records:\n                                future = executor.submit(operator, 
input_record)\n                                future_queues[unique_full_op_id].append(future)\n                            input_queues[unique_full_op_id][source_unique_full_op_id].clear()\n                        else:\n                            batch_size = min(self.batch_size, len(input_records))\n                            batch_input_records = input_records[:batch_size]\n                            for input_record in batch_input_records:\n                                future = executor.submit(operator, input_record)\n                                future_queues[unique_full_op_id].append(future)\n                            input_queues[unique_full_op_id][source_unique_full_op_id] = input_records[batch_size:]\n\n                # TODO: change logic to stop upstream operators once a limit is reached\n                # break out of loop if the final operator is a LimitScanOp and we've reached its limit\n                if isinstance(final_op, LimitScanOp) and len(output_records) == final_op.limit:\n                    break\n\n        # finalize plan stats\n        plan_stats.finish()\n\n        return output_records, plan_stats\n\n    def execute_plan(self, plan: PhysicalPlan):\n        \"\"\"Initialize the stats and execute the plan.\"\"\"\n        logger.info(f\"Executing plan {plan.plan_id} with {self.max_workers} workers\")\n        logger.info(f\"Plan Details: {plan}\")\n\n        # initialize plan stats\n        plan_stats = PlanStats.from_plan(plan)\n        plan_stats.start()\n\n        # initialize input queues and future queues for each operation\n        input_queues = self._create_input_queues(plan)\n        future_queues = {f\"{topo_idx}-{op.get_full_op_id()}\": [] for topo_idx, op in enumerate(plan)}\n\n        # precompute which operators are (outer) joins and which joins have downstream limit ops\n        self.is_join_op = {f\"{topo_idx}-{op.get_full_op_id()}\": isinstance(op, JoinOp) for topo_idx, op in enumerate(plan)}\n        
self.is_outer_join_op = {f\"{topo_idx}-{op.get_full_op_id()}\": isinstance(op, JoinOp) and op.how in (\"left\", \"right\", \"outer\") for topo_idx, op in enumerate(plan)}\n        self.join_has_downstream_limit_op = {}\n        for topo_idx, op in enumerate(plan):\n            if isinstance(op, JoinOp):\n                unique_full_op_id = f\"{topo_idx}-{op.get_full_op_id()}\"\n                has_downstream_limit_op = False\n                for inner_topo_idx, op in enumerate(plan):\n                    if inner_topo_idx <= topo_idx:\n                        continue\n                    if isinstance(op, LimitScanOp):\n                        has_downstream_limit_op = True\n                        break\n                self.join_has_downstream_limit_op[unique_full_op_id] = has_downstream_limit_op\n\n        # precompute mapping from unique_full_op_id to operator instance\n        self.unique_full_op_id_to_operator = {f\"{topo_idx}-{op.get_full_op_id()}\": op for topo_idx, op in enumerate(plan)}\n\n        # initialize and start the progress manager\n        self.progress_manager = create_progress_manager(plan, num_samples=self.num_samples, progress=self.progress)\n        self.progress_manager.start()\n\n        # NOTE: we must handle progress manager outside of _execute_plan to ensure that it is shut down correctly;\n        #       if we don't have the `finally:` branch, then program crashes can cause future program runs to fail\n        #       because the progress manager cannot get a handle to the console \n        try:\n            # execute plan\n            output_records, plan_stats = self._execute_plan(plan, input_queues, future_queues, plan_stats)\n\n        finally:\n            # finish progress tracking\n            self.progress_manager.finish()\n\n        logger.info(f\"Done executing plan: {plan.plan_id}\")\n        logger.debug(f\"Plan stats: (plan_cost={plan_stats.total_plan_cost}, plan_time={plan_stats.total_plan_time})\")\n\n        return 
output_records, plan_stats\n"
  },
  {
    "path": "src/palimpzest/query/execution/single_threaded_execution_strategy.py",
    "content": "import logging\n\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.models import PlanStats\nfrom palimpzest.query.execution.execution_strategy import ExecutionStrategy\nfrom palimpzest.query.operators.aggregate import AggregateOp\nfrom palimpzest.query.operators.join import JoinOp\nfrom palimpzest.query.operators.limit import LimitScanOp\nfrom palimpzest.query.operators.scan import ContextScanOp, ScanPhysicalOp\nfrom palimpzest.query.optimizer.plan import PhysicalPlan\nfrom palimpzest.utils.progress import create_progress_manager\n\nlogger = logging.getLogger(__name__)\n\nclass SequentialSingleThreadExecutionStrategy(ExecutionStrategy):\n    \"\"\"\n    A single-threaded execution strategy that processes operators sequentially.\n    \n    This strategy processes all records through one operator completely before moving to the next operator\n    in the execution plan. For example, if we have operators A -> B -> C and records [1,2,3]:\n    1. First processes records [1,2,3] through operator A\n    2. Then takes A's output and processes all of it through operator B\n    3. 
Finally processes all of B's output through operator C\n    \"\"\"\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.max_workers = 1\n\n    def _execute_plan(self, plan: PhysicalPlan, input_queues: dict[str, dict[str, list]], plan_stats: PlanStats) -> tuple[list[DataRecord], PlanStats]:\n        # execute the plan one operator at a time\n        output_records = []\n        for topo_idx, operator in enumerate(plan):\n            # if we've filtered out all records, terminate early\n            source_unique_full_op_ids = (\n                [f\"source_{operator.get_full_op_id()}\"]\n                if isinstance(operator, (ContextScanOp, ScanPhysicalOp))\n                else plan.get_source_unique_full_op_ids(topo_idx, operator)\n            )\n            unique_full_op_id = f\"{topo_idx}-{operator.get_full_op_id()}\"\n            num_inputs = sum(len(input_queues[unique_full_op_id][source_unique_full_op_id]) for source_unique_full_op_id in source_unique_full_op_ids)\n            if num_inputs == 0:\n                break\n\n            # begin to process this operator\n            records, record_op_stats = [], []\n            logger.info(f\"Processing operator {operator.op_name()} ({unique_full_op_id})\")\n\n            # if this operator is an aggregate, process all the records in the input_queue\n            if isinstance(operator, AggregateOp):\n                source_unique_full_op_id = source_unique_full_op_ids[0]\n                record_set = operator(candidates=input_queues[unique_full_op_id][source_unique_full_op_id])\n                records = record_set.data_records\n                record_op_stats = record_set.record_op_stats\n                num_outputs = sum(record._passed_operator for record in records)\n\n                # update the progress manager\n                self.progress_manager.incr(unique_full_op_id, num_inputs=1, num_outputs=num_outputs, total_cost=record_set.get_total_cost())\n\n       
     # if this operator is a join, process all pairs of records from the two input queues\n            elif isinstance(operator, JoinOp):\n                left_full_source_op_id = source_unique_full_op_ids[0]\n                left_num_inputs = len(input_queues[unique_full_op_id][left_full_source_op_id])\n                left_input_records = [input_queues[unique_full_op_id][left_full_source_op_id].pop(0) for _ in range(left_num_inputs)]\n\n                right_full_source_op_id = source_unique_full_op_ids[1]\n                right_num_inputs = len(input_queues[unique_full_op_id][right_full_source_op_id])\n                right_input_records = [input_queues[unique_full_op_id][right_full_source_op_id].pop(0) for _ in range(right_num_inputs)]\n\n                record_set, num_inputs_processed = operator(left_input_records, right_input_records)\n                records = record_set.data_records\n                record_op_stats = record_set.record_op_stats\n\n                # process the join one last time with final=True to handle any left/right/outer join logic\n                if operator.how in (\"left\", \"right\", \"outer\"):\n                    record_set, num_inputs_processed = operator([], [], final=True)\n                    records.extend(record_set.data_records)\n                    record_op_stats.extend(record_set.record_op_stats)\n      \n                num_outputs = sum(record._passed_operator for record in records)\n\n                # update the progress manager\n                self.progress_manager.incr(unique_full_op_id, num_inputs=num_inputs_processed, num_outputs=num_outputs, total_cost=record_set.get_total_cost())\n\n            # otherwise, process the records in the input queue for this operator one at a time\n            else:\n                source_unique_full_op_id = source_unique_full_op_ids[0]\n                for input_record in input_queues[unique_full_op_id][source_unique_full_op_id]:\n                    record_set = 
operator(input_record)\n                    records.extend(record_set.data_records)\n                    record_op_stats.extend(record_set.record_op_stats)\n                    num_outputs = sum(record._passed_operator for record in record_set.data_records)\n\n                    # update the progress manager\n                    self.progress_manager.incr(unique_full_op_id, num_inputs=1, num_outputs=num_outputs, total_cost=record_set.get_total_cost())\n\n                    # finish early if this is a limit\n                    if isinstance(operator, LimitScanOp) and len(records) == operator.limit:\n                        break\n\n            # update plan stats\n            plan_stats.add_record_op_stats(unique_full_op_id, record_op_stats)\n\n            # update next input_queue (if it exists)\n            output_records = [record for record in records if record._passed_operator]\n            next_unique_full_op_id = plan.get_next_unique_full_op_id(topo_idx, operator)\n            if next_unique_full_op_id is not None:\n                input_queues[next_unique_full_op_id][unique_full_op_id] = output_records\n\n            logger.info(f\"Finished processing operator {operator.op_name()} ({unique_full_op_id}), and generated {len(records)} records\")\n\n        # finalize plan stats\n        plan_stats.finish()\n\n        return output_records, plan_stats\n\n    def execute_plan(self, plan: PhysicalPlan) -> tuple[list[DataRecord], PlanStats]:\n        \"\"\"Initialize the stats and execute the plan.\"\"\"\n        logger.info(f\"Executing plan {plan.plan_id} with {self.max_workers} workers\")\n        logger.info(f\"Plan Details: {plan}\")\n\n        # initialize plan stats\n        plan_stats = PlanStats.from_plan(plan)\n        plan_stats.start()\n\n        # initialize input queues for each operation\n        input_queues = self._create_input_queues(plan)\n\n        # initialize and start the progress manager\n        self.progress_manager = 
create_progress_manager(plan, num_samples=self.num_samples, progress=self.progress)\n        self.progress_manager.start()\n\n        # NOTE: we must handle progress manager outside of _execute_plan to ensure that it is shut down correctly;\n        #       if we don't have the `finally:` branch, then program crashes can cause future program runs to fail\n        #       because the progress manager cannot get a handle to the console \n        try:\n            # execute plan\n            output_records, plan_stats = self._execute_plan(plan, input_queues, plan_stats)\n\n        finally:\n            # finish progress tracking\n            self.progress_manager.finish()\n\n        logger.info(f\"Done executing plan: {plan.plan_id}\")\n        logger.debug(f\"Plan stats: (plan_cost={plan_stats.total_plan_cost}, plan_time={plan_stats.total_plan_time})\")\n\n        return output_records, plan_stats\n\n\nclass PipelinedSingleThreadExecutionStrategy(ExecutionStrategy):\n    \"\"\"\n    A single-threaded execution strategy that processes records through a pipeline of operators.\n    \n    This strategy implements a pipelined execution model where each record flows through\n    the entire operator chain before the next record is processed.\n\n    Example Flow:\n    For operators A -> B -> C and records [1,2,3]:\n    1. Record 1: A -> B -> C\n    2. Record 2: A -> B -> C\n    3. 
Record 3: A -> B -> C\n    \"\"\"\n\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.max_workers = 1\n\n    def _any_queue_not_empty(self, queues: dict[str, list] | dict[str, dict[str, list]]) -> bool:\n        \"\"\"Helper function to check if any queue is not empty.\"\"\"\n        for _, value in queues.items():\n            if isinstance(value, dict):\n                if any(len(subqueue) > 0 for subqueue in value.values()):\n                    return True\n            elif len(value) > 0:\n                return True\n        return False\n\n    def _upstream_ops_finished(self, plan: PhysicalPlan, unique_full_op_id: str, input_queues: dict[str, dict[str, list]]) -> bool:\n        \"\"\"Helper function to check if agg / join operator is ready to process its inputs.\"\"\"\n        upstream_unique_full_op_ids = plan.get_upstream_unique_full_op_ids(unique_full_op_id)\n        upstream_input_queues = {upstream_unique_full_op_id: input_queues[upstream_unique_full_op_id] for upstream_unique_full_op_id in upstream_unique_full_op_ids}\n        return not self._any_queue_not_empty(upstream_input_queues)\n\n\n    def _execute_plan(self, plan: PhysicalPlan, input_queues: dict[str, dict[str, list]], plan_stats: PlanStats) -> tuple[list[DataRecord], PlanStats]:\n        # execute the plan until either:\n        # 1. all records have been processed, or\n        # 2. 
the final limit operation has completed (we break out of the loop if this happens)\n        final_output_records = []\n        while self._any_queue_not_empty(input_queues):\n            for topo_idx, operator in enumerate(plan):\n                # if this operator does not have enough inputs to execute, then skip it\n                source_unique_full_op_ids = (\n                    [f\"source_{operator.get_full_op_id()}\"]\n                    if isinstance(operator, (ContextScanOp, ScanPhysicalOp))\n                    else plan.get_source_unique_full_op_ids(topo_idx, operator)\n                )\n                unique_full_op_id = f\"{topo_idx}-{operator.get_full_op_id()}\"\n\n                num_inputs = sum(len(input_queues[unique_full_op_id][source_unique_full_op_id]) for source_unique_full_op_id in source_unique_full_op_ids)\n                agg_op_not_ready = isinstance(operator, AggregateOp) and not self._upstream_ops_finished(plan, unique_full_op_id, input_queues)\n                join_op_not_ready = isinstance(operator, JoinOp) and not self._upstream_ops_finished(plan, unique_full_op_id, input_queues)\n                if num_inputs == 0 or agg_op_not_ready or join_op_not_ready:\n                    continue\n\n                # create empty lists for records and execution stats generated by executing this operator on its next input(s)\n                records, record_op_stats = [], []\n\n                # if the next operator is an aggregate, process all the records in the input_queue\n                if isinstance(operator, AggregateOp):\n                    source_unique_full_op_id = source_unique_full_op_ids[0]\n                    input_records = [input_queues[unique_full_op_id][source_unique_full_op_id].pop(0) for _ in range(num_inputs)]\n                    record_set = operator(candidates=input_records)\n                    records = record_set.data_records\n                    record_op_stats = record_set.record_op_stats\n                    
num_outputs = sum(record._passed_operator for record in records)\n\n                    # update the progress manager\n                    self.progress_manager.incr(unique_full_op_id, num_inputs=1, num_outputs=num_outputs, total_cost=record_set.get_total_cost())\n\n                # if this operator is a join, process all pairs of records from the two input queues\n                elif isinstance(operator, JoinOp):\n                    left_full_source_op_id = source_unique_full_op_ids[0]\n                    left_num_inputs = len(input_queues[unique_full_op_id][left_full_source_op_id])\n                    left_input_records = [input_queues[unique_full_op_id][left_full_source_op_id].pop(0) for _ in range(left_num_inputs)]\n\n                    right_full_source_op_id = source_unique_full_op_ids[1]\n                    right_num_inputs = len(input_queues[unique_full_op_id][right_full_source_op_id])\n                    right_input_records = [input_queues[unique_full_op_id][right_full_source_op_id].pop(0) for _ in range(right_num_inputs)]\n\n                    record_set, num_inputs_processed = operator(left_input_records, right_input_records)\n                    records = record_set.data_records\n                    record_op_stats = record_set.record_op_stats\n                    num_outputs = sum(record._passed_operator for record in records)\n\n                    # update the progress manager\n                    self.progress_manager.incr(unique_full_op_id, num_inputs=num_inputs_processed, num_outputs=num_outputs, total_cost=record_set.get_total_cost())\n\n                # otherwise, process the next record in the input queue for this operator\n                else:\n                    source_unique_full_op_id = source_unique_full_op_ids[0]\n                    input_record = input_queues[unique_full_op_id][source_unique_full_op_id].pop(0)\n                    record_set = operator(input_record)\n                    records = record_set.data_records\n    
                record_op_stats = record_set.record_op_stats\n                    num_outputs = sum(record._passed_operator for record in records)\n\n                    # update the progress manager\n                    self.progress_manager.incr(unique_full_op_id, num_inputs=1, num_outputs=num_outputs, total_cost=record_set.get_total_cost())\n\n                # if this is a join operator with no more inputs to process, then finish it\n                if isinstance(operator, JoinOp) and operator.how in (\"left\", \"right\", \"outer\"):\n                    join_op_upstream_finished = self._upstream_ops_finished(plan, unique_full_op_id, input_queues)\n                    join_input_queues_empty = all(len(inputs) == 0 for inputs in input_queues[unique_full_op_id].values())\n                    if join_op_upstream_finished and join_input_queues_empty and not operator.finished:\n                        # process the join one last time with final=True to handle any left/right/outer join logic\n                        record_set, num_inputs_processed = operator([], [], final=True)\n                        records.extend(record_set.data_records)\n                        record_op_stats.extend(record_set.record_op_stats)\n                        num_outputs += sum(record._passed_operator for record in record_set.data_records)\n                        operator.set_finished()\n\n                # update plan stats\n                plan_stats.add_record_op_stats(unique_full_op_id, record_op_stats)\n\n                # update next input_queue or final_output_records\n                output_records = [record for record in records if record._passed_operator]\n                next_unique_full_op_id = plan.get_next_unique_full_op_id(topo_idx, operator)\n                if next_unique_full_op_id is not None:\n                    input_queues[next_unique_full_op_id][unique_full_op_id].extend(output_records)\n                else:\n                    
final_output_records.extend(output_records)\n\n                logger.info(f\"Finished processing operator {operator.op_name()} ({unique_full_op_id}) on {num_inputs} records\")\n\n            # break out of loop if the final operator is a LimitScanOp and we've reached its limit\n            if isinstance(plan.operator, LimitScanOp) and len(final_output_records) == plan.operator.limit:\n                break\n\n        # finalize plan stats\n        plan_stats.finish()\n\n        return final_output_records, plan_stats\n\n    def execute_plan(self, plan: PhysicalPlan):\n        \"\"\"Initialize the stats and execute the plan.\"\"\"\n        logger.info(f\"Executing plan {plan.plan_id} with {self.max_workers} workers\")\n        logger.info(f\"Plan Details: {plan}\")\n\n        # initialize plan stats\n        plan_stats = PlanStats.from_plan(plan)\n        plan_stats.start()\n\n        # initialize input queues for each operation\n        input_queues = self._create_input_queues(plan)\n\n        # initialize and start the progress manager\n        self.progress_manager = create_progress_manager(plan, self.num_samples, self.progress)\n        self.progress_manager.start()\n\n        # NOTE: we must handle the progress manager outside of _execute_plan to ensure that it is shut down correctly;\n        #       without the `finally:` branch, a program crash can cause future program runs to fail\n        #       because the progress manager cannot get a handle to the console\n        try:\n            # execute plan\n            output_records, plan_stats = self._execute_plan(plan, input_queues, plan_stats)\n\n        finally:\n            # finish progress tracking\n            self.progress_manager.finish()\n\n        logger.info(f\"Done executing plan: {plan.plan_id}\")\n        logger.debug(f\"Plan stats: (plan_cost={plan_stats.total_plan_cost}, plan_time={plan_stats.total_plan_time})\")\n\n        return output_records, plan_stats\n"
  },
  {
    "path": "src/palimpzest/query/generators/__init__.py",
    "content": "from palimpzest.query.generators.gemini_client import GeminiClient, GeminiResponse\nfrom palimpzest.query.generators.generators import Generator\n\n__all__ = [\"Generator\", \"GeminiClient\", \"GeminiResponse\"]\n"
  },
  {
    "path": "src/palimpzest/query/generators/gemini_client.py",
    "content": "\"\"\"\nDirect client for Gemini (Google AI Studio and Vertex AI) that bypasses litellm.\n\nThis module provides a GeminiClient class that:\n1. Calls Gemini API directly via google-genai SDK\n2. Converts litellm/palimpzest message format to Gemini format\n3. Relies on implicit context caching (automatic prefix matching)\n\"\"\"\n\nfrom __future__ import annotations\n\nimport base64\nimport logging\nfrom dataclasses import dataclass\nfrom typing import Any\n\nfrom google import genai\nfrom google.genai import types\n\nlogger = logging.getLogger(__name__)\n\n\n@dataclass\nclass GeminiResponse:\n    \"\"\"Response object that mimics litellm completion response structure.\"\"\"\n    content: str\n    usage: dict\n    raw_response: Any = None\n\n\nclass GeminiClient:\n    \"\"\"\n    Direct client for Gemini (Google AI Studio and Vertex AI) that bypasses litellm.\n    Uses implicit caching (automatic prefix matching) for prompt caching.\n\n    Uses a singleton pattern per (model, use_vertex) so that client state is shared\n    across all Generator instances using the same model and provider.\n\n    Args:\n        model: Model name (e.g., \"gemini-2.5-flash\")\n        use_vertex: If True, use Vertex AI; otherwise use Google AI Studio\n    \"\"\"\n\n    _instances: dict[tuple[str, bool], GeminiClient] = {}\n    \n    # Maps reasoning_effort to Gemini thinking_budget token counts\n    # Reference: https://github.com/BerriAI/litellm/blob/620664921902d7a9bfb29897a7b27c1a7ef4ddfb/litellm/constants.py#L88\n    REASONING_EFFORT_TO_THINKING_BUDGET = {\n        \"disable\": 0,\n        \"minimal\": 128,\n        \"low\": 1024,\n        \"medium\": 2048,\n        \"high\": 4096,\n    }\n\n    @classmethod\n    def get_instance(cls, model: str, use_vertex: bool = False) -> GeminiClient:\n        \"\"\"Get or create a singleton GeminiClient for the given model and provider.\"\"\"\n        key = (model, use_vertex)\n        if key not in cls._instances:\n            
cls._instances[key] = cls(model, use_vertex)\n        return cls._instances[key]\n\n    def __init__(self, model: str, use_vertex: bool = False):\n        self.model = model\n        self.use_vertex = use_vertex\n        # Vertex AI: uses GOOGLE_APPLICATION_CREDENTIALS for auth\n        self.client = genai.Client(vertexai=True) if use_vertex else genai.Client()\n\n    def _detect_image_media_type(self, base64_data: str) -> str:\n        \"\"\"Detect image format from base64 data by examining magic bytes.\"\"\"\n        try:\n            header = base64.b64decode(base64_data[:32])\n            if header[:8] == b\"\\x89PNG\\r\\n\\x1a\\n\":\n                return \"image/png\"\n            if header[:3] == b\"\\xff\\xd8\\xff\":\n                return \"image/jpeg\"\n            if header[:6] in (b\"GIF87a\", b\"GIF89a\"):\n                return \"image/gif\"\n            if header[:4] == b\"RIFF\" and header[8:12] == b\"WEBP\":\n                return \"image/webp\"\n        except Exception:\n            pass\n        return \"image/jpeg\"\n\n    def _transform_messages(self, messages: list[dict]) -> tuple[str | None, list[dict]]:\n        \"\"\"\n        Transform litellm/palimpzest message format to Gemini API format.\n\n        Args:\n            messages: List of messages in litellm/palimpzest format\n\n        Returns:\n            Tuple of (system_instruction, gemini_contents)\n        \"\"\"\n        gemini_contents = []\n        system_instruction = None\n\n        for msg in messages:\n            role = msg.get(\"role\")\n            msg_type = msg.get(\"type\")\n            content = msg.get(\"content\")\n\n            if role == \"system\":\n                if isinstance(content, list):\n                    text_parts = [\n                        block.get(\"text\", \"\")\n                        for block in content\n                        if block.get(\"type\") == \"text\"\n                    ]\n                    system_instruction = 
\"\".join(text_parts)\n                else:\n                    system_instruction = content\n\n            elif role == \"user\":\n                parts = []\n\n                if msg_type == \"text\" or msg_type is None:\n                    if isinstance(content, list):\n                        for block in content:\n                            if block.get(\"type\") == \"text\":\n                                parts.append({\"text\": block.get(\"text\", \"\")})\n                    elif isinstance(content, str):\n                        parts.append({\"text\": content})\n\n                elif msg_type == \"image\":\n                    for img in content:\n                        if img.get(\"type\") == \"image_url\":\n                            url = img[\"image_url\"][\"url\"]\n                            if url.startswith(\"data:\"):\n                                # Robust parsing: handle \"data:[<mediatype>];base64,<data>\"\n                                base64_marker = \";base64,\"\n                                marker_idx = url.find(base64_marker)\n                                if marker_idx == -1:\n                                    continue\n                                data = url[marker_idx + len(base64_marker):]\n                                media_type = self._detect_image_media_type(data)\n                                parts.append({\n                                    \"inline_data\": {\n                                        \"mime_type\": media_type,\n                                        \"data\": data,\n                                    }\n                                })\n\n                elif msg_type == \"input_audio\":\n                    for audio in content:\n                        if audio.get(\"type\") == \"input_audio\":\n                            audio_data = audio[\"input_audio\"]\n                            parts.append({\n                                \"inline_data\": {\n                         
           \"mime_type\": f\"audio/{audio_data.get('format', 'wav')}\",\n                                    \"data\": audio_data[\"data\"],\n                                }\n                            })\n\n                if parts:\n                    # Merge consecutive user messages\n                    if gemini_contents and gemini_contents[-1][\"role\"] == \"user\":\n                        gemini_contents[-1][\"parts\"].extend(parts)\n                    else:\n                        gemini_contents.append({\"role\": \"user\", \"parts\": parts})\n\n            elif role == \"assistant\":\n                # Convert assistant to model role\n                parts = []\n                if isinstance(content, str):\n                    parts.append({\"text\": content})\n                elif isinstance(content, list):\n                    for block in content:\n                        if block.get(\"type\") == \"text\":\n                            parts.append({\"text\": block.get(\"text\", \"\")})\n\n                if parts:\n                    # Merge consecutive model messages (Gemini requires strict role alternation)\n                    if gemini_contents and gemini_contents[-1][\"role\"] == \"model\":\n                        gemini_contents[-1][\"parts\"].extend(parts)\n                    else:\n                        gemini_contents.append({\"role\": \"model\", \"parts\": parts})\n\n        return system_instruction, gemini_contents\n\n    def _extract_usage_stats(self, usage_metadata: Any) -> dict:\n        \"\"\"\n        Extract and process usage statistics from Gemini response into the\n        standard format expected by Generator.\n\n        Args:\n            usage_metadata: The usage_metadata from Gemini response\n\n        Returns:\n            Dictionary with information needed by GenerationStats.\n        \"\"\"\n        generation_stats = {\n            \"input_text_tokens\": 0,\n            \"input_image_tokens\": 0,\n            
\"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 0,\n            \"text_cache_read_tokens\": 0,\n            \"image_cache_read_tokens\": 0,\n            \"audio_cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"output_text_tokens\": 0\n        }\n\n        if usage_metadata is None:\n            return generation_stats\n\n        try:\n            raw = usage_metadata.model_dump()\n        except (AttributeError, Exception):\n            # Fallback for SDK versions without model_dump()\n            raw = vars(usage_metadata) if hasattr(usage_metadata, \"__dict__\") else {}\n            logger.warning(\"Could not call model_dump() on usage_metadata, using fallback\")\n\n        # Parse cache read tokens by modality\n        for detail in (raw.get(\"cache_tokens_details\") or []):\n            modality = (detail.get(\"modality\") or \"\").upper()\n            token_count = detail.get(\"token_count\") or 0\n            if modality == \"TEXT\":\n                generation_stats[\"text_cache_read_tokens\"] = token_count\n            elif modality == \"IMAGE\":\n                generation_stats[\"image_cache_read_tokens\"] = token_count\n            elif modality == \"AUDIO\":\n                generation_stats[\"audio_cache_read_tokens\"] = token_count\n\n        generation_stats[\"cache_read_tokens\"] = raw.get(\"cached_content_token_count\") or 0\n\n        # Parse input tokens by modality (excludes cached tokens)\n        for detail in (raw.get(\"prompt_tokens_details\") or []):\n            modality = (detail.get(\"modality\") or \"\").upper()\n            token_count = detail.get(\"token_count\") or 0\n            if modality == \"TEXT\":\n                generation_stats[\"input_text_tokens\"] = max(0, token_count - generation_stats[\"text_cache_read_tokens\"])\n            elif modality == \"IMAGE\":\n                generation_stats[\"input_image_tokens\"] = max(0, token_count - 
generation_stats[\"image_cache_read_tokens\"])\n            elif modality == \"AUDIO\":\n                generation_stats[\"input_audio_tokens\"] = max(0, token_count - generation_stats[\"audio_cache_read_tokens\"])\n\n        generation_stats[\"output_text_tokens\"] = (raw.get(\"candidates_token_count\") or 0) + (raw.get(\"thoughts_token_count\") or 0)\n\n        return generation_stats\n\n    def generate(\n        self,\n        messages: list[dict],\n        temperature: float = 0.0,\n        reasoning_effort: str | None = None,\n    ) -> GeminiResponse:\n        \"\"\"\n        Generate content using the Gemini API directly.\n\n        Args:\n            messages: List of messages in litellm/palimpzest format\n            temperature: Sampling temperature (default: 0.0)\n            reasoning_effort: Optional thinking budget level; must be a key of\n                REASONING_EFFORT_TO_THINKING_BUDGET (\"disable\", \"minimal\", \"low\", \"medium\", \"high\")\n\n        Returns:\n            GeminiResponse with content, usage stats, and raw response\n\n        Raises:\n            ValueError: If reasoning_effort is not a recognized level\n        \"\"\"\n        system_instruction, gemini_contents = self._transform_messages(messages)\n\n        # Build config\n        config_kwargs = {\"temperature\": temperature}\n        if system_instruction:\n            config_kwargs[\"system_instruction\"] = system_instruction\n\n        # Map reasoning_effort to thinking_config\n        if reasoning_effort is not None:\n            budget = self.REASONING_EFFORT_TO_THINKING_BUDGET.get(reasoning_effort)\n            if budget is None:\n                raise ValueError(f\"Invalid reasoning effort: {reasoning_effort}\")\n            config_kwargs[\"thinking_config\"] = types.ThinkingConfig(thinking_budget=budget)\n\n        response = self.client.models.generate_content(\n            model=self.model,\n            contents=gemini_contents,\n            config=types.GenerateContentConfig(**config_kwargs),\n        )\n\n        # Extract response content\n        content = \"\"\n        if response.candidates and response.candidates[0].content:\n            parts 
= response.candidates[0].content.parts\n            if parts:\n                content = \"\".join(\n                    part.text for part in parts\n                    if hasattr(part, \"text\") and part.text\n                )\n\n        # Extract and process usage stats\n        usage = self._extract_usage_stats(response.usage_metadata)\n\n        return GeminiResponse(\n            content=content,\n            usage=usage,\n            raw_response=response,\n        )\n"
  },
  {
    "path": "src/palimpzest/query/generators/generators.py",
    "content": "\"\"\"\nThis file contains the Generator classes and generator factory.\n\"\"\"\n\nfrom __future__ import annotations\n\nimport json\nimport logging\nimport os\nimport time\nimport warnings\nfrom copy import deepcopy\nfrom typing import Any, Generic, TypeVar\n\nimport litellm\nimport regex as re  # Use regex instead of re to used variable length lookbehind\nfrom colorama import Fore, Style\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.constants import Cardinality, Model, PromptStrategy\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.models import GenerationStats\nfrom palimpzest.prompts import PromptFactory, PromptManager\nfrom palimpzest.utils.model_helpers import resolve_reasoning_effort\n\n# DEFINITIONS\nGenerationOutput = tuple[dict, str | None, GenerationStats, list[dict]]\nContextType = TypeVar(\"ContextType\")\nInputType = TypeVar(\"InputType\")\n\n\nlogger = logging.getLogger(__name__)\n\ndef get_json_from_answer(answer: str, model: Model, cardinality: Cardinality) -> dict[str, Any]:\n    \"\"\"\n    This function parses an LLM response which is supposed to output a JSON object\n    and optimistically searches for the substring containing the JSON object.\n    \"\"\"\n    # model-specific trimming for LLAMA3 responses\n    if model.is_llama_model():\n        answer = answer.split(\"---\")[0]\n        answer = answer.replace(\"True\", \"true\")\n        answer = answer.replace(\"False\", \"false\")\n\n    # split off context / excess, which models sometimes output after answer\n    answer = answer.split(\"Context:\")[0]\n    answer = answer.split(\"# this is the answer\")[0]\n\n    # trim the answer to only include the JSON dictionary\n    if cardinality == Cardinality.ONE_TO_ONE:\n        if not answer.strip().startswith(\"{\"):\n            # Find the start index of the actual JSON string assuming the prefix is followed by the JSON dictionary\n            start_index = answer.find(\"{\")\n          
  if start_index != -1:\n                # Remove the prefix and any leading characters before the JSON starts\n                answer = answer[start_index:]\n\n        if not answer.strip().endswith(\"}\"):\n            # Find the end index of the actual JSON string assuming the suffix is preceded by the JSON dictionary\n            end_index = answer.rfind(\"}\")\n            if end_index != -1:\n                # Remove the suffix and any trailing characters after the JSON ends\n                answer = answer[: end_index + 1]\n\n    # otherwise, trim the answer to only include the JSON array\n    else:\n        if not answer.strip().startswith(\"[\"):\n            # Find the start index of the actual JSON string assuming the prefix is followed by the JSON array\n            start_index = answer.find(\"[\")\n            if start_index != -1:\n                # Remove the prefix and any leading characters before the JSON starts\n                answer = answer[start_index:]\n\n        if not answer.strip().endswith(\"]\"):\n            # Find the end index of the actual JSON string\n            # assuming the suffix is preceded by the JSON object/array\n            end_index = answer.rfind(\"]\")\n            if end_index != -1:\n                # Remove the suffix and any trailing characters after the JSON ends\n                answer = answer[: end_index + 1]\n\n    # Handle weird escaped values. I am not sure why the model\n    # is returning these, but the JSON parser can't take them\n    answer = answer.replace(r\"\\_\", \"_\")\n    answer = answer.replace(\"\\\\n\", \"\\n\")\n    # Remove https and http prefixes to not conflict with comment detection\n    # Handle comments in the JSON response. 
Use regex from // until end of line\n    answer = re.sub(r\"(?<!https?:)\\/\\/.*?$\", \"\", answer, flags=re.MULTILINE)\n    answer = re.sub(r\",\\n.*\\.\\.\\.$\", \"\", answer, flags=re.MULTILINE)\n    # Sanitize newlines in the JSON response\n    answer = answer.replace(\"\\n\", \" \")\n\n    # finally, parse and return the JSON object; errors are handled by the caller\n    return json.loads(answer)\n\n# TODO: push parallelism of generations into LiteLLM rather than threadpool in executor\n# TODO: make sure answer parsing works with custom prompts / parsers (can defer this)\nclass Generator(Generic[ContextType, InputType]):\n    \"\"\"\n    Class for generating new fields for a record using an LLM.\n    \"\"\"\n\n    def __init__(\n        self,\n        model: Model,\n        prompt_strategy: PromptStrategy,\n        reasoning_effort: str,\n        cardinality: Cardinality = Cardinality.ONE_TO_ONE,\n        desc: str | None = None,\n        verbose: bool = False,\n    ):\n        self.model = model\n        self.model_name = model.value\n        self.cardinality = cardinality\n        self.prompt_strategy = prompt_strategy\n        self.reasoning_effort = reasoning_effort\n        self.desc = desc\n        self.verbose = verbose\n        self.prompt_factory = PromptFactory(prompt_strategy, model, cardinality, desc)\n        self.prompt_manager = PromptManager(model)\n\n        # Initialize GeminiClient for direct Gemini API calls (Google AI Studio and Vertex AI)\n        self.gemini_client = None\n        if model.is_model_gemini():\n            from palimpzest.query.generators.gemini_client import GeminiClient\n            self.gemini_client = GeminiClient.get_instance(\n                model=model.get_model_name(),\n                use_vertex=model.is_provider_vertex_ai(),\n            )\n\n    def _parse_reasoning(self, completion_text: str, **kwargs) -> str:\n        \"\"\"Extract the reasoning for the generated output from the completion object.\"\"\"\n     
   # use a custom reasoning parser if provided\n        if kwargs.get(\"parse_reasoning\"):\n            parse_reasoning_fn = kwargs.get(\"parse_reasoning\")\n            return parse_reasoning_fn(completion_text)\n\n        # if the model followed the default instructions, the completion text will have reasoning\n        # before the \"ANSWER:\"; if this is the case, we simply extract and return that full section\n        if \"answer\" in completion_text.lower():\n            regex = re.compile(\"(.*?)answer:.*\", re.IGNORECASE | re.DOTALL)\n            matches = regex.findall(completion_text)\n            if len(matches) > 0:\n                return matches[0].strip()\n\n        # otherwise, return the full completion text\n        return completion_text\n\n    def _prepare_field_answers(self, field_answers: dict | list[dict], fields: dict[str, FieldInfo]) -> dict[str, list]:\n        \"\"\"\n        field_answers is a dictionary mapping fields to their values. For one-to-one converts, wrap each\n        answer in a list. 
For one-to-many converts, invert the list of dictionaries into a dictionary with\n        list values.\n        \"\"\"\n        # if this is a one-to-one convert, we need to wrap each answer in a list\n        if self.cardinality == Cardinality.ONE_TO_ONE:\n            field_answers = {field_name: [field_answers[field_name]] for field_name in fields}\n\n        # otherwise, we need to invert the list of dictionaries into a dictionary with list values\n        else:\n            field_answers_lst: list[dict] = deepcopy(field_answers)\n\n            field_answers = {field_name: [] for field_name in fields}\n            for answer_dict in field_answers_lst:\n                for field_name in fields:\n                    answer = answer_dict.get(field_name, None)\n                    field_answers[field_name].append(answer)\n\n        return field_answers\n\n    def _check_convert_answer_text(self, answer_text: str, fields: dict[str, FieldInfo], throw_exception: bool=False) -> dict | list[dict] | None:\n        \"\"\"\n        Try parsing the answer text into a JSON object. 
If the parsing fails, return None\n        (or re-raise the error if `throw_exception` is True).\n        \"\"\"\n        try:\n            # extract json from the answer text\n            field_answers = get_json_from_answer(answer_text, self.model, self.cardinality)\n\n            # prepare the field answers to match the expected output and return\n            return self._prepare_field_answers(field_answers, fields)\n\n        except Exception:\n            if throw_exception:\n                raise\n\n        return None\n\n    def _check_bool_answer_text(self, answer_text: str, throw_exception: bool=False) -> dict | None:\n        \"\"\"\n        Return {\"passed_operator\": True} if \"true\" is in the answer text (case-insensitive).\n        Otherwise, return {\"passed_operator\": False} if \"false\" is in the answer text.\n        If neither is present, raise an exception if `throw_exception` is True; otherwise return None.\n        \"\"\"\n        # NOTE: we may be able to eliminate this condition by specifying this JSON output in the prompt;\n        # however, that would also need to coincide with a change to allow the parse_answer_fn to set \"passed_operator\"\n        if \"true\" in answer_text.lower():\n            return {\"passed_operator\": True}\n        elif \"false\" in answer_text.lower():\n            return {\"passed_operator\": False}\n\n        if throw_exception:\n            raise Exception(f\"Could not parse answer from completion text: {answer_text}\")\n\n        return None\n\n    def _parse_convert_answer(self, completion_text: str, fields: dict[str, FieldInfo], json_output: bool) -> dict[str, list]:\n        \"\"\"Extract the answer from the completion object for convert operations.\"\"\"\n        # if the model followed the default instructions, the completion text will place\n        # its answer between \"ANSWER:\" and \"---\"\n        regex = re.compile(\"answer:(.*?)---\", re.IGNORECASE | re.DOTALL)\n        matches = regex.findall(completion_text)\n        if len(matches) > 0:\n            answer_text = matches[0].strip()\n\n            # if we don't 
expect a JSON output, return the answer text as is\n            if not json_output:\n                return answer_text\n\n            # otherwise, try to parse the answer text into a JSON object\n            field_answers = self._check_convert_answer_text(answer_text, fields)\n            if field_answers is not None:\n                return field_answers\n\n        # if the first regex didn't find an answer, try taking all the text after \"ANSWER:\"\n        regex = re.compile(\"answer:(.*)\", re.IGNORECASE | re.DOTALL)\n        matches = regex.findall(completion_text)\n        if len(matches) > 0:\n            answer_text = matches[0].strip()\n\n            # if we don't expect a JSON output, return the answer text as is\n            if not json_output:\n                return answer_text\n            \n            # otherwise, try to parse the answer text into a JSON object\n            field_answers = self._check_convert_answer_text(answer_text, fields)\n            if field_answers is not None:\n                return field_answers\n\n        # finally, try taking all of the text; for JSON output, throw an exception if parsing fails\n        if not json_output:\n            return completion_text\n\n        return self._check_convert_answer_text(completion_text, fields, throw_exception=True)\n\n    def _parse_bool_answer(self, completion_text: str, json_output: bool) -> dict[str, list]:\n        \"\"\"Extract the answer from the completion object for filter and join operations.\"\"\"\n        # if the model followed the default instructions, the completion text will place\n        # its answer between \"ANSWER:\" and \"---\"\n        regex = re.compile(\"answer:(.*?)---\", re.IGNORECASE | re.DOTALL)\n        matches = regex.findall(completion_text)\n        if len(matches) > 0:\n            answer_text = matches[0].strip()\n\n            # if we don't expect a JSON output, return the answer text as is\n            if not json_output:\n                return 
answer_text\n\n            # otherwise, try to parse the answer text into a JSON object\n            field_answers = self._check_bool_answer_text(answer_text)\n            if field_answers is not None:\n                return field_answers\n\n        # if the first regex didn't find an answer, try taking all the text after \"ANSWER:\"\n        regex = re.compile(\"answer:(.*)\", re.IGNORECASE | re.DOTALL)\n        matches = regex.findall(completion_text)\n        if len(matches) > 0:\n            answer_text = matches[0].strip()\n\n            # if we don't expect a JSON output, return the answer text as is\n            if not json_output:\n                return answer_text\n\n            # otherwise, try to parse the answer text into a JSON object\n            field_answers = self._check_bool_answer_text(answer_text)\n            if field_answers is not None:\n                return field_answers\n\n        # finally, try taking all of the text; for JSON output, throw an exception if parsing fails\n        if not json_output:\n            return completion_text\n\n        return self._check_bool_answer_text(completion_text, throw_exception=True)\n\n    def _parse_answer(self, completion_text: str, fields: dict[str, FieldInfo] | None, json_output: bool, **kwargs) -> dict[str, list]:\n        \"\"\"Extract the answer from the completion object.\"\"\"\n        # use a custom answer parser if provided\n        if kwargs.get(\"parse_answer\"):\n            parse_answer_fn = kwargs.get(\"parse_answer\")\n            return parse_answer_fn(completion_text)\n\n        # fields should be a dict if a custom answer parser is not provided\n        assert isinstance(fields, dict), \"Fields must be provided if a custom answer parser is not provided.\"\n\n        # extract the per-field answers from the completion text\n        field_answers = (\n            self._parse_bool_answer(completion_text, json_output)\n            if self.prompt_strategy.is_filter_prompt() or 
self.prompt_strategy.is_join_prompt()\n            else self._parse_convert_answer(completion_text, fields, json_output)\n        )\n\n        return field_answers\n\n    def __call__(self, candidate: DataRecord | list[DataRecord], fields: dict[str, FieldInfo] | None, right_candidate: DataRecord | None = None, json_output: bool=True, **kwargs) -> GenerationOutput:\n        \"\"\"Take the input record(s) (`candidate`), generate the output `fields`, and return the generated output.\"\"\"\n        logger.debug(f\"Generating for candidate(s) {candidate} with fields {fields}\")\n\n        # fields can only be None if the user provides an answer parser\n        fields_check = fields is not None or \"parse_answer\" in kwargs\n        assert fields_check, \"`fields` must be provided if `parse_answer` function is not provided in kwargs.\"\n\n        # if the user (or operator) provides a system prompt instead of a prompt, treat this as\n        # the prompt and print a warning\n        if \"system_prompt\" in kwargs and \"prompt\" not in kwargs:\n            kwargs[\"prompt\"] = kwargs[\"system_prompt\"]\n            kwargs.pop(\"system_prompt\")\n            warnings.warn(\"Provided `system_prompt` without providing `prompt`; setting `prompt` = `system_prompt`.\")  # noqa: B028\n\n        # generate a list of messages which can be used to construct a payload\n        messages = self.prompt_factory.create_messages(candidate, fields, right_candidate, **kwargs)\n        is_audio_op = any(msg.get(\"type\") == \"input_audio\" for msg in messages)\n\n        if \"cache_isolation_id\" in kwargs:\n            messages = self.prompt_manager.inject_cache_isolation_id(messages, kwargs[\"cache_isolation_id\"])\n\n        # generate the text completion\n        start_time = time.time()\n        completion = None\n        completion_text = None\n        try:\n            # added for testing purpose, may be removed if needed\n            if \"generating_messages_only\" in kwargs and 
kwargs[\"generating_messages_only\"]:\n                return messages\n\n            messages = self.prompt_manager.update_messages_for_caching(messages)\n\n            # Use GeminiClient directly for Google AI Studio models\n            if self.gemini_client is not None:\n                gemini_response = self.gemini_client.generate(\n                    messages=messages,\n                    temperature=kwargs.get(\"temperature\", 0.0),\n                    reasoning_effort=resolve_reasoning_effort(self.model, self.reasoning_effort) if self.model.is_reasoning_model() else None,\n                )\n                end_time = time.time()\n                completion_text = gemini_response.content\n                usage_stats = gemini_response.usage\n                logger.debug(f\"Generated completion via GeminiClient in {end_time - start_time:.2f} seconds\")\n            else:\n                # Use litellm for all other providers\n                completion_kwargs = {}\n                if not self.model.is_o_model() and not self.model.is_gpt_5_model():\n                    completion_kwargs = {\"temperature\": kwargs.get(\"temperature\", 0.0), **completion_kwargs}\n                if is_audio_op:\n                    completion_kwargs = {\"modalities\": [\"text\"], **completion_kwargs}\n                if self.model.is_reasoning_model():\n                    reasoning_effort = resolve_reasoning_effort(self.model, self.reasoning_effort)\n                    completion_kwargs = {\"reasoning_effort\": reasoning_effort, **completion_kwargs}\n                if self.model.is_vllm_model():\n                    completion_kwargs = {\"api_base\": self.model.api_base, \"api_key\": os.environ.get(\"VLLM_API_KEY\", \"fake-api-key\"), **self.model.vllm_kwargs, **completion_kwargs}\n\n                cache_kwargs = self.prompt_manager.get_cache_kwargs()\n                completion_kwargs = {**completion_kwargs, **cache_kwargs}\n                completion = 
litellm.completion(model=self.model_name, messages=messages, **completion_kwargs)\n                end_time = time.time()\n                completion_text = completion.choices[0].message.content\n                usage = completion.usage.model_dump()\n                logger.debug(f\"Generated completion via litellm in {end_time - start_time:.2f} seconds\")\n\n        # if there's an error generating the completion, we have to return an empty answer\n        # and can only account for the time spent performing the failed generation\n        except Exception as e:\n            logger.error(f\"Error generating completion: {e}\")\n            field_answers = (\n                {\"passed_operator\": False}\n                if self.prompt_strategy.is_filter_prompt() or self.prompt_strategy.is_join_prompt()\n                else {field_name: None for field_name in fields}\n            )\n            reasoning = None\n            generation_stats = GenerationStats(\n                model_name=self.model_name,\n                llm_call_duration_secs=time.time() - start_time,\n                total_llm_calls=1,\n            )\n\n            return field_answers, reasoning, generation_stats, messages\n\n        # parse usage statistics and create the GenerationStats\n        generation_stats = None\n        if completion_text is not None:\n            # get cost per input/output token for the model\n            usd_per_input_token = self.model.get_usd_per_input_token() or 0.0\n            usd_per_audio_input_token = self.model.get_usd_per_audio_input_token() or 0.0\n            usd_per_image_input_token = self.model.get_usd_per_image_input_token() or 0.0\n            usd_per_output_token = self.model.get_usd_per_output_token() or 0.0\n            usd_per_cache_read_token = self.model.get_usd_per_cache_read_token() or 0.0\n            usd_per_audio_cache_read_token = self.model.get_usd_per_audio_cache_read_token() or 0.0\n            usd_per_image_cache_read_token = 
self.model.get_usd_per_image_cache_read_token() or 0.0\n            usd_per_cache_creation_token = self.model.get_usd_per_cache_creation_token() or 0.0\n\n            # Extract usage stats based on provider\n            if self.gemini_client is not None:\n                # Usage already processed by GeminiClient\n                output_text_tokens = usage_stats.get(\"output_text_tokens\", 0)\n            else:\n                # litellm response format\n                output_text_tokens = usage.get(\"completion_tokens\") or 0\n                usage_stats = self.prompt_manager.extract_usage_stats(usage, is_audio_op)\n\n            input_text_tokens = usage_stats[\"input_text_tokens\"]\n            input_audio_tokens = usage_stats[\"input_audio_tokens\"]\n            input_image_tokens = usage_stats[\"input_image_tokens\"]\n            cache_read_tokens = usage_stats[\"cache_read_tokens\"]\n            cache_creation_tokens = usage_stats[\"cache_creation_tokens\"]\n\n            # Compute cache cost: use per-modality breakdown if available (Gemini), otherwise aggregate\n            if self.gemini_client is not None:\n                cache_cost = (\n                    usage_stats[\"text_cache_read_tokens\"] * usd_per_cache_read_token\n                    + usage_stats[\"audio_cache_read_tokens\"] * usd_per_audio_cache_read_token\n                    + usage_stats[\"image_cache_read_tokens\"] * usd_per_image_cache_read_token\n                )\n            else:\n                cache_cost = (\n                    cache_read_tokens * usd_per_cache_read_token\n                    + cache_creation_tokens * usd_per_cache_creation_token\n                )\n\n            total_cost = (\n                input_text_tokens * usd_per_input_token\n                + input_audio_tokens * usd_per_audio_input_token\n                + input_image_tokens * usd_per_image_input_token\n                + cache_cost\n                + output_text_tokens * usd_per_output_token\n           
 )\n\n            generation_stats = GenerationStats(\n                model_name=self.model_name,\n                llm_call_duration_secs=end_time - start_time,\n                fn_call_duration_secs=0.0,\n                # Raw token counts by modality\n                input_text_tokens=input_text_tokens,\n                input_audio_tokens=input_audio_tokens,\n                input_image_tokens=input_image_tokens,\n                output_text_tokens=output_text_tokens,\n                # Cache token counts\n                cache_read_tokens=cache_read_tokens,\n                cache_creation_tokens=cache_creation_tokens,\n                # Cost\n                cost_per_record=total_cost,\n                total_llm_calls=1,\n            )\n\n        # pretty print prompt + full completion output for debugging\n        prompt, system_prompt = \"\", \"\"\n        for message in messages:\n            if message[\"role\"] == \"system\":\n                content = message[\"content\"]\n                if isinstance(content, str):\n                    system_prompt += content + \"\\n\"\n                elif isinstance(content, list):\n                    # Handle Anthropic-style content blocks\n                    for block in content:\n                        if isinstance(block, dict) and block.get(\"type\") == \"text\":\n                            system_prompt += block.get(\"text\", \"\") + \"\\n\"\n            if message[\"role\"] == \"user\":\n                content = message.get(\"content\", \"\")\n                msg_type = message.get(\"type\", \"text\")\n                if msg_type == \"text\":\n                    if isinstance(content, str):\n                        prompt += content + \"\\n\"\n                    elif isinstance(content, list):\n                        # Handle Anthropic-style content blocks\n                        for block in content:\n                            if isinstance(block, dict) and block.get(\"type\") == \"text\":\n        
                        prompt += block.get(\"text\", \"\") + \"\\n\"\n                elif msg_type == \"image\":\n                    prompt += \"<image>\\n\" * len(content)\n                elif msg_type == \"input_audio\":\n                    prompt += \"<audio>\\n\" * len(content)\n        logger.debug(f\"PROMPT:\\n{prompt}\")\n        logger.debug(Fore.GREEN + f\"{completion_text}\\n\" + Style.RESET_ALL)\n\n        # parse reasoning\n        reasoning = None\n        try:\n            reasoning = self._parse_reasoning(completion_text, **kwargs)\n        except Exception as e:\n            logger.error(f\"Error parsing reasoning and answers: {e}\")\n            pass\n\n        # parse field answers\n        field_answers = None \n        if fields is not None and (self.prompt_strategy.is_filter_prompt() or self.prompt_strategy.is_join_prompt()):\n            field_answers = {\"passed_operator\": False}\n        elif fields is not None and not (self.prompt_strategy.is_filter_prompt() or self.prompt_strategy.is_join_prompt()):\n            field_answers = {field_name: None for field_name in fields}\n        try:\n            field_answers = self._parse_answer(completion_text, fields, json_output, **kwargs)\n        except Exception as e:\n            logger.error(f\"Error parsing answers: {e}\")\n            os.makedirs(\"parse-answer-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"parse-answer-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(f\"{str(self.model_name)}\\n\")\n                f.write(\"#####\\n\")\n                f.write(f\"{str(self.prompt_strategy)}\\n\")\n                f.write(\"#####\\n\")\n                f.write(f\"{str(completion_text)}\\n\")\n                f.write(\"#####\\n\")\n                f.write(f\"{str(fields)}\\n\")\n                f.write(\"#####\\n\")\n                f.write(f\"{str(e)}\\n\")\n\n        logger.debug(f\"Generated field answers: {field_answers}\")\n        
return field_answers, reasoning, generation_stats, messages\n"
  },
  {
    "path": "src/palimpzest/query/operators/__init__.py",
    "content": "from palimpzest.query.operators.aggregate import AggregateOp as _AggregateOp\nfrom palimpzest.query.operators.aggregate import ApplyGroupByOp as _ApplyGroupByOp\nfrom palimpzest.query.operators.aggregate import AverageAggregateOp as _AverageAggregateOp\nfrom palimpzest.query.operators.aggregate import CountAggregateOp as _CountAggregateOp\nfrom palimpzest.query.operators.aggregate import MaxAggregateOp as _MaxAggregateOp\nfrom palimpzest.query.operators.aggregate import MinAggregateOp as _MinAggregateOp\nfrom palimpzest.query.operators.aggregate import SemanticAggregate as _SemanticAggregate\nfrom palimpzest.query.operators.aggregate import SumAggregateOp as _SumAggregateOp\nfrom palimpzest.query.operators.convert import ConvertOp as _ConvertOp\nfrom palimpzest.query.operators.convert import LLMConvert as _LLMConvert\nfrom palimpzest.query.operators.convert import LLMConvertBonded as _LLMConvertBonded\nfrom palimpzest.query.operators.convert import NonLLMConvert as _NonLLMConvert\nfrom palimpzest.query.operators.critique_and_refine import CritiqueAndRefineConvert as _CritiqueAndRefineConvert\nfrom palimpzest.query.operators.critique_and_refine import CritiqueAndRefineFilter as _CritiqueAndRefineFilter\nfrom palimpzest.query.operators.distinct import DistinctOp as _DistinctOp\nfrom palimpzest.query.operators.filter import FilterOp as _FilterOp\nfrom palimpzest.query.operators.filter import LLMFilter as _LLMFilter\nfrom palimpzest.query.operators.filter import NonLLMFilter as _NonLLMFilter\nfrom palimpzest.query.operators.join import EmbeddingJoin as _EmbeddingJoin\nfrom palimpzest.query.operators.join import JoinOp as _JoinOp\nfrom palimpzest.query.operators.join import NestedLoopsJoin as _NestedLoopsJoin\nfrom palimpzest.query.operators.limit import LimitScanOp as _LimitScanOp\nfrom palimpzest.query.operators.logical import (\n    Aggregate as _Aggregate,\n)\nfrom palimpzest.query.operators.logical import (\n    BaseScan as _BaseScan,\n)\nfrom 
palimpzest.query.operators.logical import (\n    ConvertScan as _ConvertScan,\n)\nfrom palimpzest.query.operators.logical import (\n    Distinct as _Distinct,\n)\nfrom palimpzest.query.operators.logical import (\n    FilteredScan as _FilteredScan,\n)\nfrom palimpzest.query.operators.logical import (\n    GroupByAggregate as _GroupByAggregate,\n)\nfrom palimpzest.query.operators.logical import (\n    JoinOp as _LogicalJoinOp,\n)\nfrom palimpzest.query.operators.logical import (\n    LimitScan as _LimitScan,\n)\nfrom palimpzest.query.operators.logical import (\n    LogicalOperator as _LogicalOperator,\n)\nfrom palimpzest.query.operators.logical import (\n    Project as _Project,\n)\nfrom palimpzest.query.operators.logical import (\n    TopKScan as _TopKScan,\n)\nfrom palimpzest.query.operators.mixture_of_agents import MixtureOfAgentsConvert as _MixtureOfAgentsConvert\nfrom palimpzest.query.operators.mixture_of_agents import MixtureOfAgentsFilter as _MixtureOfAgentsFilter\nfrom palimpzest.query.operators.physical import PhysicalOperator as _PhysicalOperator\nfrom palimpzest.query.operators.project import ProjectOp as _ProjectOp\nfrom palimpzest.query.operators.rag import RAGConvert as _RAGConvert\nfrom palimpzest.query.operators.rag import RAGFilter as _RAGFilter\nfrom palimpzest.query.operators.scan import MarshalAndScanDataOp as _MarshalAndScanDataOp\nfrom palimpzest.query.operators.scan import ScanPhysicalOp as _ScanPhysicalOp\nfrom palimpzest.query.operators.split import SplitConvert as _SplitConvert\nfrom palimpzest.query.operators.split import SplitFilter as _SplitFilter\nfrom palimpzest.query.operators.topk import TopKOp as _TopKOp\n\nLOGICAL_OPERATORS = [\n    _LogicalOperator,\n    _Aggregate,\n    _BaseScan,\n    _ConvertScan,\n    _Distinct,\n    _FilteredScan,\n    _GroupByAggregate,\n    _LogicalJoinOp,\n    _LimitScan,\n    _Project,\n    _TopKScan,\n]\n\nPHYSICAL_OPERATORS = (\n    # aggregate\n    [_AggregateOp, _ApplyGroupByOp, _AverageAggregateOp, 
_CountAggregateOp, _MaxAggregateOp, _MinAggregateOp, _SemanticAggregate, _SumAggregateOp]\n    # convert\n    + [_ConvertOp, _NonLLMConvert, _LLMConvert, _LLMConvertBonded]\n    # critique and refine\n    + [_CritiqueAndRefineConvert, _CritiqueAndRefineFilter]\n    # distinct\n    + [_DistinctOp]\n    # scan\n    + [_ScanPhysicalOp, _MarshalAndScanDataOp]\n    # filter\n    + [_FilterOp, _NonLLMFilter, _LLMFilter]\n    # join\n    + [_EmbeddingJoin, _JoinOp, _NestedLoopsJoin]\n    # limit\n    + [_LimitScanOp]\n    # mixture-of-agents\n    + [_MixtureOfAgentsConvert, _MixtureOfAgentsFilter]\n    # physical\n    + [_PhysicalOperator]\n    # project\n    + [_ProjectOp]\n    # rag\n    + [_RAGConvert, _RAGFilter]\n    # top-k\n    + [_TopKOp]\n    # split\n    + [_SplitConvert, _SplitFilter]\n)\n\n__all__ = [\n    \"LOGICAL_OPERATORS\",\n    \"PHYSICAL_OPERATORS\",\n]\n"
  },
  {
    "path": "src/palimpzest/query/operators/aggregate.py",
    "content": "from __future__ import annotations\n\nimport contextlib\nimport time\nfrom typing import Any\n\nfrom palimpzest.constants import (\n    NAIVE_EST_NUM_GROUPS,\n    NAIVE_EST_NUM_INPUT_TOKENS,\n    NAIVE_EST_NUM_OUTPUT_TOKENS,\n    AggFunc,\n    Model,\n    PromptStrategy,\n)\nfrom palimpzest.core.elements.groupbysig import GroupBySig\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.lib.schemas import Average, Count, Max, Min, Sum\nfrom palimpzest.core.models import OperatorCostEstimates, RecordOpStats\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\n\nclass AggregateOp(PhysicalOperator):\n    \"\"\"\n    Aggregate operators accept a list of candidate DataRecords as input to their\n    __call__ methods. Thus, we use a slightly modified abstract base class for\n    these operators.\n    \"\"\"\n    def __call__(self, candidates: list[DataRecord]) -> DataRecordSet:\n        raise NotImplementedError(\"Using __call__ from abstract method\")\n\n\nclass ApplyGroupByOp(AggregateOp):\n    \"\"\"\n    Implementation of a GroupBy operator. This operator groups records by a set of fields\n    and applies a function to each group. 
The group_by_sig object contains the fields to\n    group by and the aggregation functions to apply to each group.\n    \"\"\"\n    def __init__(self, group_by_sig: GroupBySig, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.group_by_sig = group_by_sig\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Group-by Signature: {str(self.group_by_sig)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\"group_by_sig\": str(self.group_by_sig.serialize()), **id_params}\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"group_by_sig\": self.group_by_sig, **op_params}\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        # for now, assume applying the groupby takes negligible additional time (and no cost in USD)\n        return OperatorCostEstimates(\n            cardinality=NAIVE_EST_NUM_GROUPS,\n            time_per_record=0,\n            cost_per_record=0,\n            quality=1.0,\n        )\n\n    @staticmethod\n    def agg_init(func):\n        if func.lower() == \"count\":\n            return 0\n        elif func.lower() == \"average\":\n            return (0, 0)\n        elif func.lower() == \"sum\":\n            return 0\n        elif func.lower() == \"min\":\n            return float(\"inf\")\n        elif func.lower() == \"max\":\n            return float(\"-inf\")\n        elif func.lower() == \"list\":\n            return []\n        elif func.lower() == \"set\":\n            return set()\n        else:\n            raise Exception(\"Unknown agg function \" + func)\n\n    @staticmethod\n    def agg_merge(func, state, val):\n        if func.lower() == \"count\":\n            return state + 1\n        elif func.lower() == \"average\":\n            sum_, cnt = state\n            if val is None:\n                return (sum_, cnt)\n   
         return (sum_ + val, cnt + 1)\n        elif func.lower() == \"sum\":\n            if val is None:\n                return state\n            return state + sum(val) if isinstance(val, list) else state + val\n        elif func.lower() == \"min\":\n            if val is None:\n                return state\n            return min(state, min(val) if isinstance(val, list) else val)\n        elif func.lower() == \"max\":\n            if val is None:\n                return state\n            return max(state, max(val) if isinstance(val, list) else val)\n        elif func.lower() == \"list\":\n            state.append(val)\n            return state\n        elif func.lower() == \"set\":\n            state.add(val)\n            return state\n        else:\n            raise Exception(\"Unknown agg function \" + func)\n\n    @staticmethod\n    def agg_final(func, state):\n        if func.lower() in [\"count\", \"sum\", \"min\", \"max\", \"list\", \"set\"]:\n            return state\n        elif func.lower() == \"average\":\n            sum, cnt = state\n            return float(sum) / cnt if cnt > 0 else None\n        else:\n            raise Exception(\"Unknown agg function \" + func)\n\n    def __call__(self, candidates: list[DataRecord]) -> DataRecordSet:\n        start_time = time.time()\n\n        # build group array\n        agg_state = {}\n        for candidate in candidates:\n            group = ()\n            for f in self.group_by_sig.group_by_fields:\n                if not hasattr(candidate, f):\n                    raise TypeError(f\"ApplyGroupByOp record missing expected field {f}\")\n                group = group + (getattr(candidate, f),)\n            if group in agg_state:\n                state = agg_state[group]\n            else:\n                state = []\n                for fun in self.group_by_sig.agg_funcs:\n                    state.append(ApplyGroupByOp.agg_init(fun))\n            for i in range(0, len(self.group_by_sig.agg_funcs)):\n   
             fun = self.group_by_sig.agg_funcs[i]\n                if not hasattr(candidate, self.group_by_sig.agg_fields[i]):\n                    raise TypeError(f\"ApplyGroupByOp record missing expected field {self.group_by_sig.agg_fields[i]}\")\n                field = getattr(candidate, self.group_by_sig.agg_fields[i])\n                state[i] = ApplyGroupByOp.agg_merge(fun, state[i], field)\n            agg_state[group] = state\n\n        # return list of data records (one per group)\n        drs: list[DataRecord] = []\n        group_by_fields = self.group_by_sig.group_by_fields\n        agg_fields = self.group_by_sig.get_agg_field_names()\n        for g in agg_state:\n            # build up data item\n            data_item = {}\n            for i in range(0, len(g)):\n                k = g[i]\n                data_item[group_by_fields[i]] = k\n            vals = agg_state[g]\n            for i in range(0, len(vals)):\n                v = ApplyGroupByOp.agg_final(self.group_by_sig.agg_funcs[i], vals[i])\n                data_item[agg_fields[i]] = v\n\n            # create new DataRecord\n            schema = self.group_by_sig.output_schema()\n            data_item = schema(**data_item)\n            dr = DataRecord.from_agg_parents(data_item, parent_records=candidates)\n            drs.append(dr)\n\n        # create RecordOpStats objects\n        total_time = time.time() - start_time\n        record_op_stats_lst = []\n        for dr in drs:\n            record_op_stats = RecordOpStats(\n                record_id=dr._id,\n                record_parent_ids=dr._parent_ids,\n                record_source_indices=dr._source_indices,\n                record_state=dr.to_dict(include_bytes=False),\n                full_op_id=self.get_full_op_id(),\n                logical_op_id=self.logical_op_id,\n                op_name=self.op_name(),\n                time_per_record=total_time / len(drs),\n                cost_per_record=0.0,\n                op_details={k: 
str(v) for k, v in self.get_id_params().items()},\n            )\n            record_op_stats_lst.append(record_op_stats)\n\n        # construct and return DataRecordSet\n        return DataRecordSet(drs, record_op_stats_lst)\n\n\nclass AverageAggregateOp(AggregateOp):\n    # NOTE: we don't actually need / use agg_func here (yet)\n\n    def __init__(self, agg_func: AggFunc, *args, **kwargs):\n        # enforce that output schema is correct\n        assert kwargs[\"output_schema\"].model_fields.keys() == Average.model_fields.keys(), \"AverageAggregateOp requires output_schema to be Average\"\n\n        # enforce that input schema is a single numeric field\n        input_field_types = list(kwargs[\"input_schema\"].model_fields.values())\n        assert len(input_field_types) == 1, \"AverageAggregateOp requires input_schema to have exactly one field\"\n        numeric_field_types = [\n            bool, int, float, int | float,\n            bool | None, int | None, float | None, int | float | None,\n            bool | Any, int | Any, float | Any, int | float | Any,\n            bool | None | Any, int | None | Any, float | None | Any, int | float | None | Any,\n        ]\n        is_numeric = input_field_types[0].annotation in numeric_field_types\n        assert is_numeric, f\"AverageAggregateOp requires input_schema to have a numeric field type, i.e. 
one of: {numeric_field_types}\\nGot: {input_field_types[0]}\"\n\n        # call parent constructor\n        super().__init__(*args, **kwargs)\n        self.agg_func = agg_func\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Function: {str(self.agg_func)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\"agg_func\": str(self.agg_func), **id_params}\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"agg_func\": self.agg_func, **op_params}\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        # for now, assume applying the aggregation takes negligible additional time (and no cost in USD)\n        return OperatorCostEstimates(\n            cardinality=1,\n            time_per_record=0,\n            cost_per_record=0,\n            quality=1.0,\n        )\n\n    def __call__(self, candidates: list[DataRecord]) -> DataRecordSet:\n        start_time = time.time()\n\n        # NOTE: we currently do not guarantee that input values conform to their specified type;\n        #       as a result, we simply omit any values which do not parse to a float from the average\n        # NOTE: right now we perform a check in the constructor which enforces that the input_schema\n        #       has a single field which is numeric in nature; in the future we may want to have a\n        #       cleaner way of computing the value (rather than `float(list(candidate...))` below)\n        summation, total = 0, 0\n        for candidate in candidates:\n            try:\n                summation += float(list(candidate.to_dict().values())[0])\n                total += 1\n            except Exception:\n                pass\n        data_item = Average(average=summation / total)\n        dr = DataRecord.from_agg_parents(data_item, parent_records=candidates)\n\n        # create 
RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=dr._id,\n            record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=time.time() - start_time,\n            cost_per_record=0.0,\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n\n\nclass SumAggregateOp(AggregateOp):\n    # NOTE: we don't actually need / use agg_func here (yet)\n\n    def __init__(self, agg_func: AggFunc, *args, **kwargs):\n        # enforce that output schema is correct\n        assert kwargs[\"output_schema\"].model_fields.keys() == Sum.model_fields.keys(), \"SumAggregateOp requires output_schema to be Sum\"\n\n        # enforce that input schema is a single numeric field\n        input_field_types = list(kwargs[\"input_schema\"].model_fields.values())\n        assert len(input_field_types) == 1, \"SumAggregateOp requires input_schema to have exactly one field\"\n        numeric_field_types = [\n            bool, int, float, int | float,\n            bool | None, int | None, float | None, int | float | None,\n            bool | Any, int | Any, float | Any, int | float | Any,\n            bool | None | Any, int | None | Any, float | None | Any, int | float | None | Any,\n        ]\n        is_numeric = input_field_types[0].annotation in numeric_field_types\n        assert is_numeric, f\"SumAggregateOp requires input_schema to have a numeric field type, i.e. 
one of: {numeric_field_types}\\nGot: {input_field_types[0]}\"\n\n        # call parent constructor\n        super().__init__(*args, **kwargs)\n        self.agg_func = agg_func\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Function: {str(self.agg_func)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\"agg_func\": str(self.agg_func), **id_params}\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"agg_func\": self.agg_func, **op_params}\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        # for now, assume applying the aggregation takes negligible additional time (and no cost in USD)\n        return OperatorCostEstimates(\n            cardinality=1,\n            time_per_record=0,\n            cost_per_record=0,\n            quality=1.0,\n        )\n\n    def __call__(self, candidates: list[DataRecord]) -> DataRecordSet:\n        start_time = time.time()\n\n        # NOTE: we currently do not guarantee that input values conform to their specified type;\n        #       as a result, we simply omit any values which do not parse to a float from the sum\n        # NOTE: right now we perform a check in the constructor which enforces that the input_schema\n        #       has a single field which is numeric in nature; in the future we may want to have a\n        #       cleaner way of computing the value (rather than `float(list(candidate...))` below)\n        summation = 0\n        for candidate in candidates:\n            with contextlib.suppress(Exception):\n                summation += float(list(candidate.to_dict().values())[0])\n        data_item = Sum(sum=summation)\n        dr = DataRecord.from_agg_parents(data_item, parent_records=candidates)\n\n        # create RecordOpStats object\n        record_op_stats = RecordOpStats(\n            
record_id=dr._id,\n            record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=time.time() - start_time,\n            cost_per_record=0.0,\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n\n\nclass CountAggregateOp(AggregateOp):\n    # NOTE: we don't actually need / use agg_func here (yet)\n\n    def __init__(self, agg_func: AggFunc, *args, **kwargs):\n        # enforce that output schema is correct\n        assert kwargs[\"output_schema\"].model_fields.keys() == Count.model_fields.keys(), \"CountAggregateOp requires output_schema to be Count\"\n\n        # call parent constructor\n        super().__init__(*args, **kwargs)\n        self.agg_func = agg_func\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Function: {str(self.agg_func)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\"agg_func\": str(self.agg_func), **id_params}\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"agg_func\": self.agg_func, **op_params}\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        # for now, assume applying the aggregation takes negligible additional time (and no cost in USD)\n        return OperatorCostEstimates(\n            cardinality=1,\n            time_per_record=0,\n            cost_per_record=0,\n            quality=1.0,\n        )\n\n    def __call__(self, candidates: list[DataRecord]) -> DataRecordSet:\n        start_time = time.time()\n\n        # create new DataRecord\n        data_item = Count(count=len(candidates))\n        dr = DataRecord.from_agg_parents(data_item, 
parent_records=candidates)\n\n        # create RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=dr._id,\n            record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=time.time() - start_time,\n            cost_per_record=0.0,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n\n\nclass MinAggregateOp(AggregateOp):\n    # NOTE: we don't actually need / use agg_func here (yet)\n\n    def __init__(self, agg_func: AggFunc, *args, **kwargs):\n        # enforce that output schema is correct\n        assert kwargs[\"output_schema\"].model_fields.keys() == Min.model_fields.keys(), \"MinAggregateOp requires output_schema to be Min\"\n\n        # call parent constructor\n        super().__init__(*args, **kwargs)\n        self.agg_func = agg_func\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Function: {str(self.agg_func)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\"agg_func\": str(self.agg_func), **id_params}\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"agg_func\": self.agg_func, **op_params}\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        # for now, assume applying the aggregation takes negligible additional time (and no cost in USD)\n        return OperatorCostEstimates(\n            cardinality=1,\n            time_per_record=0,\n            cost_per_record=0,\n            quality=1.0,\n        )\n\n    def __call__(self, candidates: list[DataRecord]) -> 
DataRecordSet:\n        start_time = time.time()\n\n        # create new DataRecord\n        # NOTE: use min_value so that the builtin min() is not shadowed\n        min_value = float(\"inf\")\n        for candidate in candidates:\n            try:  # noqa: SIM105\n                min_value = min(float(list(candidate.to_dict().values())[0]), min_value)\n            except Exception:\n                pass\n        data_item = Min(min=min_value if min_value != float(\"inf\") else None)\n        dr = DataRecord.from_agg_parents(data_item, parent_records=candidates)\n\n        # create RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=dr.id,\n            record_parent_ids=dr.parent_ids,\n            record_source_indices=dr.source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=time.time() - start_time,\n            cost_per_record=0.0,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n\n\nclass MaxAggregateOp(AggregateOp):\n    # NOTE: we don't actually need / use agg_func here (yet)\n\n    def __init__(self, agg_func: AggFunc, *args, **kwargs):\n        # enforce that output schema is correct\n        assert kwargs[\"output_schema\"].model_fields.keys() == Max.model_fields.keys(), \"MaxAggregateOp requires output_schema to be Max\"\n\n        # call parent constructor\n        super().__init__(*args, **kwargs)\n        self.agg_func = agg_func\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Function: {str(self.agg_func)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\"agg_func\": str(self.agg_func), **id_params}\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"agg_func\": self.agg_func, **op_params}\n\n    def 
naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        # for now, assume applying the aggregation takes negligible additional time (and no cost in USD)\n        return OperatorCostEstimates(\n            cardinality=1,\n            time_per_record=0,\n            cost_per_record=0,\n            quality=1.0,\n        )\n\n    def __call__(self, candidates: list[DataRecord]) -> DataRecordSet:\n        start_time = time.time()\n\n        # create new DataRecord\n        # NOTE: use max_value so that the builtin max() is not shadowed\n        max_value = float(\"-inf\")\n        for candidate in candidates:\n            try:  # noqa: SIM105\n                max_value = max(float(list(candidate.to_dict().values())[0]), max_value)\n            except Exception:\n                pass\n        data_item = Max(max=max_value if max_value != float(\"-inf\") else None)\n        dr = DataRecord.from_agg_parents(data_item, parent_records=candidates)\n\n        # create RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=dr.id,\n            record_parent_ids=dr.parent_ids,\n            record_source_indices=dr.source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=time.time() - start_time,\n            cost_per_record=0.0,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n\n\nclass SemanticAggregate(AggregateOp):\n\n    def __init__(self, agg_str: str, model: Model, prompt_strategy: PromptStrategy = PromptStrategy.AGG, reasoning_effort: str = \"default\", *args, **kwargs):\n        # call parent constructor\n        super().__init__(*args, **kwargs)\n        self.agg_str = agg_str\n        self.model = model\n        self.prompt_strategy = prompt_strategy\n        self.reasoning_effort = reasoning_effort\n      
  if model is not None:\n            self.generator = Generator(model, prompt_strategy, reasoning_effort)\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Prompt Strategy: {self.prompt_strategy}\\n\"\n        op += f\"    Reasoning Effort: {self.reasoning_effort}\\n\"\n        op += f\"    Agg: {str(self.agg_str)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"agg_str\": self.agg_str,\n            \"model\": None if self.model is None else self.model.value,\n            \"prompt_strategy\": None if self.prompt_strategy is None else self.prompt_strategy.value,\n            \"reasoning_effort\": self.reasoning_effort,\n            **id_params,\n        }\n\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"agg_str\": self.agg_str,\n            \"model\": self.model,\n            \"prompt_strategy\": self.prompt_strategy,\n            \"reasoning_effort\": self.reasoning_effort,\n            **op_params,\n        }\n\n        return op_params\n\n    def get_model_name(self) -> str:\n        return self.model.value\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Compute naive cost estimates for the SemanticAggregate operation. Implicitly, these estimates\n        assume the use of a single LLM call over all input records (so the estimated number of input\n        tokens scales with the source cardinality). Child classes of SemanticAggregate may call this\n        function through super() and adjust these estimates as needed (or they can completely override\n        this function).\n        \"\"\"\n        # estimate number of input and output tokens from source\n        est_num_input_tokens = NAIVE_EST_NUM_INPUT_TOKENS * source_op_cost_estimates.cardinality\n        est_num_output_tokens = NAIVE_EST_NUM_OUTPUT_TOKENS\n\n        # get est. 
of conversion time per record from model card\n        model_conversion_time_per_record = self.model.get_seconds_per_output_token() * est_num_output_tokens\n\n        # get est. of conversion cost (in USD) per record from model card\n        usd_per_input_token = self.model.get_usd_per_input_token()\n        if getattr(self, \"prompt_strategy\", None) is not None and self.is_audio_op():\n            usd_per_input_token = self.model.get_usd_per_audio_input_token()\n\n        model_conversion_usd_per_record = (\n            usd_per_input_token * est_num_input_tokens\n            + self.model.get_usd_per_output_token() * est_num_output_tokens\n        )\n\n        # estimate quality of output based on the strength of the model being used\n        quality = self.model.get_overall_score() / 100.0\n\n        return OperatorCostEstimates(\n            cardinality=1.0,\n            time_per_record=model_conversion_time_per_record,\n            cost_per_record=model_conversion_usd_per_record,\n            quality=quality,\n        )\n\n    def __call__(self, candidates: list[DataRecord]) -> DataRecordSet:\n        start_time = time.time()\n\n        # if candidates is an empty list, return an empty DataRecordSet\n        if len(candidates) == 0:\n            return DataRecordSet([], [])\n\n        # get the set of input fields to use for the operation\n        input_fields = self.get_input_fields()\n\n        # get the set of output fields to use for the operation\n        fields_to_generate = self.get_fields_to_generate(candidates[0])\n        fields = {field: field_type for field, field_type in self.output_schema.model_fields.items() if field in fields_to_generate}\n\n        # construct kwargs for generation\n        gen_kwargs = {\"project_cols\": input_fields, \"output_schema\": self.output_schema, \"agg_instruction\": self.agg_str}\n\n        # generate outputs for all fields in a single query\n        field_answers, _, generation_stats, _ = 
self.generator(candidates, fields, **gen_kwargs)\n        assert all([field in field_answers for field in fields]), \"Not all fields were generated!\"\n\n        # construct data record for the output\n        field, value = fields_to_generate[0], field_answers[fields_to_generate[0]][0]\n        data_item = self.output_schema(**{field: value})\n        dr = DataRecord.from_agg_parents(data_item, parent_records=candidates)\n\n        # create RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=dr._id,\n            record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=time.time() - start_time,\n            cost_per_record=generation_stats.cost_per_record,\n            model_name=self.get_model_name(),\n            answer={field: value},\n            input_fields=input_fields,\n            generated_fields=fields_to_generate,\n            input_text_tokens=generation_stats.input_text_tokens,\n            input_audio_tokens=generation_stats.input_audio_tokens,\n            input_image_tokens=generation_stats.input_image_tokens,\n            cache_read_tokens=generation_stats.cache_read_tokens,\n            cache_creation_tokens=generation_stats.cache_creation_tokens,\n            output_text_tokens=generation_stats.output_text_tokens,\n            embedding_input_tokens=generation_stats.embedding_input_tokens,\n            llm_call_duration_secs=generation_stats.llm_call_duration_secs,\n            fn_call_duration_secs=generation_stats.fn_call_duration_secs,\n            total_llm_calls=generation_stats.total_llm_calls,\n            total_embedding_llm_calls=generation_stats.total_embedding_llm_calls,\n            image_operation=self.is_image_op(),\n            op_details={k: str(v) 
for k, v in self.get_id_params().items()},\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n"
  },
  {
    "path": "src/palimpzest/query/operators/compute.py",
    "content": "import functools\nimport inspect\nimport os\nimport time\nfrom typing import Any\n\nfrom smolagents import CodeAgent, LiteLLMModel, tool\n\nfrom palimpzest.core.data.context import Context\nfrom palimpzest.core.data.context_manager import ContextManager\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import GenerationStats, OperatorCostEstimates, RecordOpStats\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\n# TODO: need to store final executed code in compute() operator so that humans can debug when human-in-the-loop\n\ndef make_tool(bound_method):\n    # Get the original function and bound instance\n    func = bound_method.__func__\n    instance = bound_method.__self__\n    \n    # Get the signature and remove 'self'\n    sig = inspect.signature(func)\n    params = list(sig.parameters.values())[1:]  # skip 'self'\n    new_sig = inspect.Signature(parameters=params, return_annotation=sig.return_annotation)\n\n    # Create a wrapper function dynamically\n    @functools.wraps(func)\n    def wrapper(*args, **kwargs):\n        return func(instance, *args, **kwargs)\n\n    # Update the __signature__ to reflect the new one without 'self'\n    wrapper.__signature__ = new_sig\n\n    return wrapper\n\n\nclass SmolAgentsCompute(PhysicalOperator):\n    \"\"\"\n    \"\"\"\n    def __init__(self, context_id: str, instruction: str, additional_contexts: list[Context] | None = None, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.context_id = context_id\n        self.instruction = instruction\n        self.additional_contexts = [] if additional_contexts is None else additional_contexts\n        # self.model_id = \"anthropic/claude-3-7-sonnet-latest\"\n        self.model_id = \"openai/gpt-4o-mini-2024-07-18\"\n        # self.model_id = \"openai/gpt-4o-2024-08-06\"\n        api_key = os.getenv(\"ANTHROPIC_API_KEY\") if \"anthropic\" in self.model_id else 
os.getenv(\"OPENAI_API_KEY\")\n        self.model = LiteLLMModel(model_id=self.model_id, api_key=api_key)\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Context ID: {self.context_id:20s}\\n\"\n        op += f\"    Instruction: {self.instruction:20s}\\n\"\n        op += f\"    Add. Ctxs: {self.additional_contexts}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\n            \"context_id\": self.context_id,\n            \"instruction\": self.instruction,\n            \"additional_contexts\": self.additional_contexts,\n            **id_params,\n        }\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\n            \"context_id\": self.context_id,\n            \"instruction\": self.instruction,\n            \"additional_contexts\": self.additional_contexts,\n            **op_params,\n        }\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        return OperatorCostEstimates(\n            cardinality=source_op_cost_estimates.cardinality,\n            time_per_record=100,\n            cost_per_record=1,\n            quality=1.0,\n        )\n\n    def _create_record_set(\n        self,\n        candidate: DataRecord,\n        generation_stats: GenerationStats,\n        total_time: float,\n        answer: dict[str, Any],\n    ) -> DataRecordSet:\n        \"\"\"\n        Given an input DataRecord and the answer computed for it (a mapping from field name to value),\n        construct the resulting DataRecordSet.\n        \"\"\"\n        # create new DataRecord\n        data_item = {field: answer[field] for field in self.output_schema.model_fields if field in answer}\n        dr = DataRecord.from_parent(self.output_schema, data_item, parent_record=candidate)\n\n        # create RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=dr._id,\n  
          record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=total_time,\n            cost_per_record=generation_stats.cost_per_record,\n            model_name=self.get_model_name(),\n            input_text_tokens=generation_stats.input_text_tokens,\n            input_audio_tokens=generation_stats.input_audio_tokens,\n            input_image_tokens=generation_stats.input_image_tokens,\n            cache_read_tokens=generation_stats.cache_read_tokens,\n            cache_creation_tokens=generation_stats.cache_creation_tokens,\n            output_text_tokens=generation_stats.output_text_tokens,\n            embedding_input_tokens=generation_stats.embedding_input_tokens,\n            llm_call_duration_secs=generation_stats.llm_call_duration_secs,\n            fn_call_duration_secs=generation_stats.fn_call_duration_secs,\n            total_llm_calls=generation_stats.total_llm_calls,\n            total_embedding_llm_calls=generation_stats.total_embedding_llm_calls,\n            answer={k: v.description if isinstance(v, Context) else v for k, v in answer.items()},\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n\n    def __call__(self, candidate: DataRecord) -> Any:\n        start_time = time.time()\n\n        # get the input context object and its tools\n        input_context: Context = candidate.context\n        description = input_context.description\n        tools = [tool(make_tool(f)) for f in input_context.tools]\n\n        # update the description to include any additional contexts\n        for ctx in self.additional_contexts:\n            # TODO: remove additional context if it is an ancestor of the input context\n 
           # (not just if it is equal to the input context)\n            if ctx.id == input_context.id:\n                continue\n            description += f\"\\n\\nHere is some additional Context which may be useful:\\n\\n{ctx.description}\"\n\n        # perform the computation\n        instructions = f\"\\n\\nHere is a description of the Context whose data you will be working with, as well as any previously computed results:\\n\\n{description}\"\n        agent = CodeAgent(\n            tools=tools,\n            model=self.model,\n            add_base_tools=False,\n            instructions=instructions,\n            return_full_result=True,\n            additional_authorized_imports=[\"pandas\", \"io\", \"os\"],\n            planning_interval=4,\n            max_steps=30,\n        )\n        result = agent.run(self.instruction)\n        # NOTE: you can see the system prompt with `agent.memory.system_prompt.system_prompt`\n        # full_steps = agent.memory.get_full_steps()\n\n        # compute generation stats\n        response = result.output\n        input_tokens = result.token_usage.input_tokens\n        output_tokens = result.token_usage.output_tokens\n        cost_per_input_token = (3.0 / 1e6) if \"anthropic\" in self.model_id else (0.15 / 1e6) # (2.5 / 1e6) #\n        cost_per_output_token = (15.0 / 1e6) if \"anthropic\" in self.model_id else (0.6 / 1e6) # (10.0 / 1e6) #\n        input_cost = input_tokens * cost_per_input_token\n        output_cost = output_tokens * cost_per_output_token\n        generation_stats = GenerationStats(\n            model_name=self.model_id,\n            input_text_tokens=input_tokens,\n            output_text_tokens=output_tokens,\n            cost_per_record=input_cost + output_cost,\n            llm_call_duration_secs=time.time() - start_time,\n        )\n\n        # update the description of the computed Context to include the result\n        new_description = f\"RESULT: {response}\\n\\n\"\n        cm = ContextManager()\n  
      cm.update_context(id=self.context_id, description=new_description)\n\n        # create and return record set\n        field_answers = {\n            \"context\": cm.get_context(id=self.context_id),\n            f\"result-{self.context_id}\": response,\n        }\n        record_set = self._create_record_set(\n            candidate,\n            generation_stats,\n            time.time() - start_time,\n            field_answers,\n        )\n\n        return record_set\n\n# import json; json.dumps(agent.memory.get_full_steps())\n# agent.memory.get_full_steps()[1].keys()\n# dict_keys(['step_number', 'timing', 'model_input_messages', 'tool_calls', 'error', 'model_output_message', 'model_output', 'code_action', 'observations', 'observations_images', \n# 'action_output', 'token_usage', 'is_final_answer'])\n# agent.memory.get_full_steps()[1]['action_output']"
  },
  {
    "path": "src/palimpzest/query/operators/convert.py",
    "content": "from __future__ import annotations\n\nimport time\nfrom abc import ABC, abstractmethod\nfrom typing import Callable\n\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.constants import (\n    NAIVE_EST_NUM_INPUT_TOKENS,\n    NAIVE_EST_NUM_OUTPUT_TOKENS,\n    NAIVE_EST_ONE_TO_MANY_SELECTIVITY,\n    Cardinality,\n    Model,\n    PromptStrategy,\n)\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import GenerationStats, OperatorCostEstimates, RecordOpStats\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\n\nclass ConvertOp(PhysicalOperator, ABC):\n    def __init__(\n        self,\n        cardinality: Cardinality = Cardinality.ONE_TO_ONE,\n        udf: Callable | None = None,\n        desc: str | None = None,\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        self.cardinality = cardinality\n        self.udf = udf\n        self.desc = desc\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"cardinality\": self.cardinality.value,\n            \"udf\": self.udf,\n            \"desc\": self.desc,\n            **id_params,\n        }\n\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"cardinality\": self.cardinality,\n            \"udf\": self.udf,\n            \"desc\": self.desc,\n            **op_params,\n        }\n\n        return op_params\n\n    def _create_data_records_from_field_answers(\n        self,\n        field_answers: dict[str, list],\n        candidate: DataRecord,\n    ) -> list[DataRecord]:\n        \"\"\"\n        Given a mapping from each field to its (list of) generated value(s), we construct the corresponding\n        list of output DataRecords.\n        \"\"\"\n        # get the number of records 
generated; for some convert operations it is possible for fields to\n        # have different lengths of generated values, so we take the maximum length of any field's values\n        # to be the number of records generated\n        n_records = max([len(lst) for lst in field_answers.values()])\n        successful_convert = n_records > 0\n\n        drs = []\n        for idx in range(max(n_records, 1)):\n            # parse newly generated fields from the field_answers dictionary for this field; if the list\n            # of generated values is shorter than the number of records, we fill in with None\n            data_item = {}\n            for field in self.generated_fields:\n                data_item[field] = field_answers[field][idx] if idx < len(field_answers[field]) else None\n\n            # initialize record with the correct output schema, data_item, parent record, and cardinality idx\n            dr = DataRecord.from_parent(self.output_schema, data_item, parent_record=candidate, cardinality_idx=idx)\n\n            # append data record to list of output data records\n            drs.append(dr)\n\n        return drs, successful_convert\n\n    def _create_record_set(\n        self,\n        records: list[DataRecord],\n        field_names: list[str],\n        generation_stats: GenerationStats,\n        total_time: float,\n        successful_convert: bool,\n    ) -> DataRecordSet:\n        \"\"\"\n        Construct list of RecordOpStats objects (one for each DataRecord).\n        \"\"\"\n        # amortize the generation stats across all generated records\n        per_record_stats = generation_stats / len(records)\n        time_per_record = total_time / len(records)\n\n        # create the RecordOpStats objects for each output record\n        record_op_stats_lst = [\n            RecordOpStats(\n                record_id=dr._id,\n                record_parent_ids=dr._parent_ids,\n                record_source_indices=dr._source_indices,\n                
record_state=dr.to_dict(include_bytes=False),\n                full_op_id=self.get_full_op_id(),\n                logical_op_id=self.logical_op_id,\n                op_name=self.op_name(),\n                time_per_record=time_per_record,\n                cost_per_record=per_record_stats.cost_per_record,\n                model_name=self.get_model_name(),\n                answer={field_name: getattr(dr, field_name, None) for field_name in field_names},\n                input_fields=list(self.input_schema.model_fields),\n                generated_fields=field_names,\n                input_text_tokens=per_record_stats.input_text_tokens,\n                input_audio_tokens=per_record_stats.input_audio_tokens,\n                input_image_tokens=per_record_stats.input_image_tokens,\n                cache_read_tokens=per_record_stats.cache_read_tokens,\n                cache_creation_tokens=per_record_stats.cache_creation_tokens,\n                output_text_tokens=per_record_stats.output_text_tokens,\n                embedding_input_tokens=per_record_stats.embedding_input_tokens,\n                llm_call_duration_secs=per_record_stats.llm_call_duration_secs,\n                fn_call_duration_secs=per_record_stats.fn_call_duration_secs,\n                total_llm_calls=per_record_stats.total_llm_calls,\n                total_embedding_llm_calls=per_record_stats.total_embedding_llm_calls,\n                failed_convert=(not successful_convert),\n                op_details={k: str(v) for k, v in self.get_id_params().items()},\n            )\n            for dr in records\n        ]\n\n        # create and return the DataRecordSet\n        return DataRecordSet(records, record_op_stats_lst)\n\n    @abstractmethod\n    def convert(self, candidate: DataRecord, fields: dict[str, FieldInfo]) -> tuple[dict[str, list], GenerationStats]:\n        \"\"\"\n        This abstract method will be implemented by subclasses of ConvertOp to process the input DataRecord\n        and 
generate the value(s) for each of the specified fields. If the convert operator is a one-to-many\n        convert, then each field will have a corresponding list of output values. The dictionary mapping each\n        generated field to its (list of) value(s) is returned along with the GenerationStats object.\n\n        For example, if the input DataRecord (i.e. `candidate`) contains the contents of a scientific paper,\n        and the convert operation is supposed to extract the name and affiliation of each author into its own\n        DataRecord, then the output could be:\n\n        ({\"author\": [\"Jane Smith\", \"John Doe\"], \"affiliation\": [\"MIT\", \"Stanford University\"]}, GenerationStats(...))\n\n        Even if the convert operation is a one-to-one convert (i.e. it always generates one output DataRecord\n        for each input DataRecord), the output should still map each field to a singleton list containing its value.\n\n        A post-condition of this method is that every field in `fields` must be present in the output dictionary.\n        If there is an error in generating a field, then the value for that field must be None.\n        \"\"\"\n        pass\n\n    def __call__(self, candidate: DataRecord) -> DataRecordSet:\n        \"\"\"\n        This method converts an input DataRecord into an output DataRecordSet. 
The output DataRecordSet contains the\n        DataRecord(s) output by the operator's convert() method and their corresponding RecordOpStats objects.\n        Some subclasses may override this __call__ method to implement their own custom logic.\n        \"\"\"\n        start_time = time.time()\n\n        # get fields to generate with this convert\n        fields_to_generate = self.get_fields_to_generate(candidate)\n\n        # execute the convert\n        field_answers: dict[str, list]\n        fields = {field: field_type for field, field_type in self.output_schema.model_fields.items() if field in fields_to_generate}\n        field_answers, generation_stats = self.convert(candidate=candidate, fields=fields)\n        assert all([field in field_answers for field in fields_to_generate]), \"Not all fields were generated!\"\n\n        # replace any None values with an empty list; subclasses may override __call__ to change this behavior\n        field_answers = {field: [] if answers is None else answers for field, answers in field_answers.items()}\n\n        # transform the mapping from fields to answers into a (list of) DataRecord(s)\n        drs, successful_convert = self._create_data_records_from_field_answers(field_answers, candidate)\n\n        # construct and return DataRecordSet\n        record_set = self._create_record_set(\n            records=drs,\n            field_names=fields_to_generate,\n            generation_stats=generation_stats,\n            total_time=time.time() - start_time,\n            successful_convert=successful_convert,\n        )\n\n        return record_set\n\n\nclass NonLLMConvert(ConvertOp):\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    UDF: {self.udf.__name__}\\n\"\n        return op\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Compute naive cost estimates for the NonLLMConvert operation. 
These estimates assume\n        that the UDF convert (1) has no cost and (2) has perfect quality.\n        \"\"\"\n        # estimate cardinality and selectivity given the \"cardinality\" set by the user\n        selectivity = 1.0 if self.cardinality == Cardinality.ONE_TO_ONE else NAIVE_EST_ONE_TO_MANY_SELECTIVITY\n        cardinality = selectivity * source_op_cost_estimates.cardinality\n\n        # estimate 1 ms single-threaded execution for udf function\n        time_per_record = 0.001\n\n        # assume filter fn has perfect quality\n        return OperatorCostEstimates(\n            cardinality=cardinality,\n            time_per_record=time_per_record,\n            cost_per_record=0.0,\n            quality=1.0,\n        )\n\n    def convert(self, candidate: DataRecord, fields: dict[str, FieldInfo]) -> tuple[dict[str, list], GenerationStats]:\n        # apply UDF to input record\n        start_time = time.time()\n        field_answers = {}\n        try:\n            # execute the UDF function\n            answer = self.udf(candidate.to_dict())\n\n            if self.cardinality == Cardinality.ONE_TO_ONE:\n                # answer should be a dictionary\n                assert isinstance(answer, dict), (\n                    \"UDF must return a dictionary mapping each generated field to its value for one-to-one converts\"\n                )\n\n                # wrap each answer in a list\n                field_answers = {field_name: [answer[field_name]] for field_name in fields}\n\n            else:\n                assert isinstance(answer, list), \"UDF must return a list of dictionaries for one-to-many converts\"\n                field_answers = {field_name: [] for field_name in fields}\n                for answer_dict in answer:\n                    assert isinstance(answer_dict, dict), \"Each element of list returned by UDF must be a dictionary\"\n                    for field_name in fields:\n                        
field_answers[field_name].append(answer_dict.get(field_name, None))\n\n            if self.verbose:\n                print(f\"{self.udf.__name__}:\\n{answer}\")\n\n        except Exception as e:\n            print(f\"Error invoking user-defined function for convert: {e}\")\n            raise e\n\n        # create generation stats object containing the time spent executing the UDF function\n        generation_stats = GenerationStats(fn_call_duration_secs=time.time() - start_time)\n\n        return field_answers, generation_stats\n\n\nclass LLMConvert(ConvertOp):\n    \"\"\"\n    This is the base class for convert operations which use an LLM to generate the output fields.\n    \"\"\"\n\n    def __init__(\n        self,\n        model: Model,\n        prompt_strategy: PromptStrategy = PromptStrategy.MAP,\n        reasoning_effort: str = \"default\",\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        self.model = model\n        self.prompt_strategy = prompt_strategy\n        self.reasoning_effort = reasoning_effort\n        if model is not None:\n            self.generator = Generator(model, prompt_strategy, reasoning_effort, self.cardinality, self.desc, self.verbose)\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Prompt Strategy: {self.prompt_strategy}\\n\"\n        op += f\"    Reasoning Effort: {self.reasoning_effort}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"model\": None if self.model is None else self.model.value,\n            \"prompt_strategy\": None if self.prompt_strategy is None else self.prompt_strategy.value,\n            \"reasoning_effort\": self.reasoning_effort,\n            **id_params,\n        }\n\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"model\": self.model,\n            
\"prompt_strategy\": self.prompt_strategy,\n            \"reasoning_effort\": self.reasoning_effort,\n            **op_params,\n        }\n\n        return op_params\n\n    def get_model_name(self):\n        return None if self.model is None else self.model.value\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Compute naive cost estimates for the LLMConvert operation. Implicitly, these estimates\n        assume the use of a single LLM call for each input record. Child classes of LLMConvert\n        may call this function through super() and adjust these estimates as needed (or they can\n        completely override this function).\n        \"\"\"\n        # estimate number of input and output tokens from source\n        est_num_input_tokens = NAIVE_EST_NUM_INPUT_TOKENS\n        est_num_output_tokens = NAIVE_EST_NUM_OUTPUT_TOKENS\n\n        # get est. of conversion time per record from model card;\n        model_conversion_time_per_record = self.model.get_seconds_per_output_token() * est_num_output_tokens\n\n        # get est. 
of conversion cost (in USD) per record from model card\n        usd_per_input_token = self.model.get_usd_per_input_token()\n        if getattr(self, \"prompt_strategy\", None) is not None and self.is_audio_op():\n            usd_per_input_token = self.model.get_usd_per_audio_input_token()\n\n        model_conversion_usd_per_record = (\n            usd_per_input_token * est_num_input_tokens\n            + self.model.get_usd_per_output_token() * est_num_output_tokens\n        )\n\n        # estimate cardinality and selectivity given the \"cardinality\" set by the user\n        selectivity = 1.0 if self.cardinality == Cardinality.ONE_TO_ONE else NAIVE_EST_ONE_TO_MANY_SELECTIVITY\n        cardinality = selectivity * source_op_cost_estimates.cardinality\n\n        # estimate quality of output based on the strength of the model being used\n        quality = (self.model.get_overall_score() / 100.0)\n\n        return OperatorCostEstimates(\n            cardinality=cardinality,\n            time_per_record=model_conversion_time_per_record,\n            cost_per_record=model_conversion_usd_per_record,\n            quality=quality,\n        )\n\n\nclass LLMConvertBonded(LLMConvert):\n\n    def convert(self, candidate: DataRecord, fields: dict[str, FieldInfo]) -> tuple[dict[str, list], GenerationStats]:\n        # get the set of input fields to use for the convert operation\n        input_fields = self.get_input_fields()\n\n        # construct kwargs for generation\n        gen_kwargs = {\"project_cols\": input_fields, \"output_schema\": self.output_schema}\n\n        # generate outputs for all fields in a single query\n        field_answers, _, generation_stats, _ = self.generator(candidate, fields, **gen_kwargs)\n\n        # if there was an error for any field, execute a conventional query on that field\n        if len(field_answers) > 1:\n            for field_name, answers in field_answers.items():\n                if answers is None:\n                    
single_field_answers, _, single_field_stats, _ = self.generator(candidate, {field_name: fields[field_name]}, **gen_kwargs)\n                    field_answers.update(single_field_answers)\n                    generation_stats += single_field_stats\n\n        return field_answers, generation_stats\n"
  },
  {
    "path": "src/palimpzest/query/operators/critique_and_refine.py",
    "content": "from __future__ import annotations\n\nfrom typing import Any\n\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.constants import Cardinality, Model, PromptStrategy\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.models import GenerationStats, OperatorCostEstimates\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.operators.convert import LLMConvert\nfrom palimpzest.query.operators.filter import LLMFilter\n\n# TYPE DEFINITIONS\nFieldName = str\n\n\nclass CritiqueAndRefineConvert(LLMConvert):\n\n    def __init__(\n        self,\n        critic_model: Model,\n        refine_model: Model,\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        self.critic_model = critic_model\n        self.refine_model = refine_model\n\n        # create generators\n        self.critic_generator = Generator(self.critic_model, PromptStrategy.MAP_CRITIC, self.reasoning_effort, self.cardinality, self.desc, self.verbose)\n        self.refine_generator = Generator(self.refine_model, PromptStrategy.MAP_REFINE, self.reasoning_effort, self.cardinality, self.desc, self.verbose)\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Critic Model: {self.critic_model}\\n\"\n        op += f\"    Refine Model: {self.refine_model}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"critic_model\": self.critic_model.value,\n            \"refine_model\": self.refine_model.value,\n            **id_params,\n        }\n\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"critic_model\": self.critic_model,\n            \"refine_model\": self.refine_model,\n            **op_params,\n        }\n\n        return op_params\n\n    def naive_cost_estimates(self, source_op_cost_estimates: 
OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Currently, we are invoking `self.model`, then critiquing its output with `self.critic_model`, and\n        finally refining the output with `self.refine_model`. Thus, we roughly expect to incur the cost\n        and time of three LLMConverts. In practice, this naive quality estimate will be overwritten by the\n        CostModel's estimate once it executes a few instances of the operator.\n        \"\"\"\n        # get naive cost estimates for first LLM call and multiply by 3 for now;\n        # of course we should sum individual estimates for each model, but this is a rough estimate\n        # and in practice we will need to revamp our naive cost estimates in the near future\n        naive_op_cost_estimates = 3 * super().naive_cost_estimates(source_op_cost_estimates)\n\n        # for naive setting, estimate quality as quality of refine model\n        model_quality = self.refine_model.get_overall_score() / 100.0\n        naive_op_cost_estimates.quality = model_quality\n        naive_op_cost_estimates.quality_lower_bound = naive_op_cost_estimates.quality\n        naive_op_cost_estimates.quality_upper_bound = naive_op_cost_estimates.quality\n\n        return naive_op_cost_estimates\n\n    def convert(self, candidate: DataRecord, fields: dict[str, FieldInfo]) -> tuple[dict[FieldName, list[Any]], GenerationStats]:\n        # get input fields\n        input_fields = self.get_input_fields()\n\n        # NOTE: when I merge in the `abacus` branch, I will want to update this to reflect the changes I made to reasoning extraction\n        # execute the initial model\n        original_gen_kwargs = {\"project_cols\": input_fields, \"output_schema\": self.output_schema}\n        field_answers, reasoning, original_gen_stats, original_messages = self.generator(candidate, fields, **original_gen_kwargs)\n        original_output = f\"REASONING: {reasoning}\\nANSWER: {field_answers}\\n\"\n\n        # execute the 
critic model\n        critic_gen_kwargs = {\"original_output\": original_output, \"original_messages\": original_messages, **original_gen_kwargs}\n        _, reasoning, critic_gen_stats, _ = self.critic_generator(candidate, fields, json_output=False, **critic_gen_kwargs)\n        critique_output = f\"CRITIQUE: {reasoning}\\n\"\n\n        # execute the refinement model\n        refine_gen_kwargs = {\"critique_output\": critique_output, **critic_gen_kwargs}\n        field_answers, reasoning, refine_gen_stats, _ = self.refine_generator(candidate, fields, **refine_gen_kwargs)\n\n        # compute the total generation stats\n        generation_stats = original_gen_stats + critic_gen_stats + refine_gen_stats\n\n        return field_answers, generation_stats\n\n\nclass CritiqueAndRefineFilter(LLMFilter):\n\n    def __init__(\n        self,\n        critic_model: Model,\n        refine_model: Model,\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        self.critic_model = critic_model\n        self.refine_model = refine_model\n\n        # create generators\n        self.critic_generator = Generator(self.critic_model, PromptStrategy.FILTER_CRITIC, self.reasoning_effort, Cardinality.ONE_TO_ONE, self.desc, self.verbose)\n        self.refine_generator = Generator(self.refine_model, PromptStrategy.FILTER_REFINE, self.reasoning_effort, Cardinality.ONE_TO_ONE, self.desc, self.verbose)\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Critic Model: {self.critic_model}\\n\"\n        op += f\"    Refine Model: {self.refine_model}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"critic_model\": self.critic_model.value,\n            \"refine_model\": self.refine_model.value,\n            **id_params,\n        }\n\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        
op_params = {\n            \"critic_model\": self.critic_model,\n            \"refine_model\": self.refine_model,\n            **op_params,\n        }\n\n        return op_params\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Currently, we are invoking `self.model`, then critiquing its output with `self.critic_model`, and\n        finally refining the output with `self.refine_model`. Thus, we roughly expect to incur the cost\n        and time of three LLMFilters. In practice, this naive quality estimate will be overwritten by the\n        CostModel's estimate once it executes a few instances of the operator.\n        \"\"\"\n        # get naive cost estimates for first LLM call and multiply by 3 for now;\n        # of course we should sum individual estimates for each model, but this is a rough estimate\n        # and in practice we will need to revamp our naive cost estimates in the near future\n        naive_op_cost_estimates = 3 * super().naive_cost_estimates(source_op_cost_estimates)\n\n        # for naive setting, estimate quality as quality of refine model\n        model_quality = self.refine_model.get_overall_score() / 100.0\n        naive_op_cost_estimates.quality = model_quality\n        naive_op_cost_estimates.quality_lower_bound = naive_op_cost_estimates.quality\n        naive_op_cost_estimates.quality_upper_bound = naive_op_cost_estimates.quality\n\n        return naive_op_cost_estimates\n\n    def filter(self, candidate: DataRecord) -> tuple[dict[str, bool], GenerationStats]:\n        # get input fields\n        input_fields = self.get_input_fields()\n\n        # construct output fields\n        fields = {\"passed_operator\": FieldInfo(annotation=bool, description=\"Whether the record passed the filter operation\")}\n\n        # NOTE: when I merge in the `abacus` branch, I will want to update this to reflect the changes I made to reasoning extraction\n        # 
execute the initial model\n        original_gen_kwargs = {\"project_cols\": input_fields, \"filter_condition\": self.filter_obj.filter_condition}\n        field_answers, reasoning, original_gen_stats, original_messages = self.generator(candidate, fields, **original_gen_kwargs)\n        original_output = f\"REASONING: {reasoning}\\nANSWER: {str(field_answers['passed_operator']).upper()}\\n\"\n\n        # execute the critic model\n        critic_gen_kwargs = {\"original_output\": original_output, \"original_messages\": original_messages, **original_gen_kwargs}\n        _, reasoning, critic_gen_stats, _ = self.critic_generator(candidate, fields, json_output=False, **critic_gen_kwargs)\n        critique_output = f\"CRITIQUE: {reasoning}\\n\"\n\n        # execute the refinement model\n        refine_gen_kwargs = {\"critique_output\": critique_output, **critic_gen_kwargs}\n        field_answers, reasoning, refine_gen_stats, _ = self.refine_generator(candidate, fields, **refine_gen_kwargs)\n\n        # compute the total generation stats\n        generation_stats = original_gen_stats + critic_gen_stats + refine_gen_stats\n\n        return field_answers, generation_stats\n"
  },
  {
    "path": "src/palimpzest/query/operators/distinct.py",
    "content": "from __future__ import annotations\n\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import OperatorCostEstimates, RecordOpStats\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\n\nclass DistinctOp(PhysicalOperator):\n    def __init__(self, distinct_cols: list[str], distinct_seen: set | None = None, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.distinct_cols = distinct_cols\n        self._distinct_seen = set() if distinct_seen is None else distinct_seen\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Distinct Cols: {self.distinct_cols}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\"distinct_cols\": self.distinct_cols, **id_params}\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"distinct_cols\": self.distinct_cols, \"distinct_seen\": self._distinct_seen, **op_params}\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        # assume applying the distinct operator takes negligible additional time (and no cost in USD)\n        return OperatorCostEstimates(\n            cardinality=source_op_cost_estimates.cardinality,\n            time_per_record=0,\n            cost_per_record=0,\n            quality=1.0,\n        )\n\n    def __call__(self, candidate: DataRecord) -> DataRecordSet:\n        # create new DataRecord\n        dr = DataRecord.from_parent(schema=candidate.schema, data_item={}, parent_record=candidate)\n\n        # output record only if it has not been seen before\n        record_str = dr.to_json_str(project_cols=self.distinct_cols, bytes_to_str=True, sorted=True)\n        record_hash = f\"{hash(record_str)}\"\n        dr._passed_operator = record_hash not in self._distinct_seen\n        if dr._passed_operator:\n            
self._distinct_seen.add(record_hash)\n\n        # create RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=dr._id,\n            record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=0.0,\n            cost_per_record=0.0,\n            passed_operator=dr._passed_operator,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n"
  },
  {
    "path": "src/palimpzest/query/operators/filter.py",
    "content": "from __future__ import annotations\n\nimport time\nfrom abc import ABC, abstractmethod\nfrom typing import Any\n\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.constants import (\n    NAIVE_EST_FILTER_SELECTIVITY,\n    NAIVE_EST_NUM_INPUT_TOKENS,\n    Cardinality,\n    Model,\n    PromptStrategy,\n)\nfrom palimpzest.core.elements.filters import Filter\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import GenerationStats, OperatorCostEstimates, RecordOpStats\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\n\nclass FilterOp(PhysicalOperator, ABC):\n    def __init__(self, filter: Filter, desc: str | None = None, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        assert self.input_schema == self.output_schema, \"Input and output schemas must match for FilterOp\"\n        self.filter_obj = filter\n        self.desc = desc\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Filter: {str(self.filter_obj)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\"filter\": str(self.filter_obj), \"desc\": self.desc, **id_params}\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"filter\": self.filter_obj, \"desc\": self.desc, **op_params}\n\n    @abstractmethod\n    def filter(self, candidate: DataRecord) -> tuple[dict[str, bool], GenerationStats]:\n        \"\"\"\n        This abstract method will be implemented by subclasses of FilterOp to process the input DataRecord\n        and generate the True / False determination of whether the input record passes the filter. A dictionary\n        mapping a \"passed_operator\" key to the T/F boolean is returned along with the GenerationStats object.\n\n        For example, if the input DataRecord (i.e. 
`candidate`) contains an image of a dog, and the filter\n        operation is supposed to filter for images with dogs, then the output would be:\n\n        ({\"passed_operator\": True}, GenerationStats(...))\n\n        A post-condition of this method is that the \"passed_operator\" key must be present in the output dictionary,\n        and its value must be a boolean. If there is an error, then the value for \"passed_operator\" must be False.\n        \"\"\"\n        pass\n\n    def _create_record_set(\n        self,\n        candidate: DataRecord,\n        passed_operator: bool,\n        generation_stats: GenerationStats,\n        total_time: float,\n        answer: dict[str, Any],\n    ) -> DataRecordSet:\n        \"\"\"\n        Given an input DataRecord and a determination of whether it passed the filter or not,\n        construct the resulting DataRecordSet.\n        \"\"\"\n        # create new DataRecord and set passed_operator attribute\n        dr = DataRecord.from_parent(schema=candidate.schema, data_item={}, parent_record=candidate)\n        dr._passed_operator = passed_operator\n\n        # create RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=dr._id,\n            record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=total_time,\n            cost_per_record=generation_stats.cost_per_record,\n            model_name=self.get_model_name(),\n            filter_str=self.filter_obj.get_filter_str(),\n            input_text_tokens=generation_stats.input_text_tokens,\n            input_audio_tokens=generation_stats.input_audio_tokens,\n            input_image_tokens=generation_stats.input_image_tokens,\n            cache_read_tokens=generation_stats.cache_read_tokens,\n        
    cache_creation_tokens=generation_stats.cache_creation_tokens,\n            output_text_tokens=generation_stats.output_text_tokens,\n            embedding_input_tokens=generation_stats.embedding_input_tokens,\n            llm_call_duration_secs=generation_stats.llm_call_duration_secs,\n            fn_call_duration_secs=generation_stats.fn_call_duration_secs,\n            total_llm_calls=generation_stats.total_llm_calls,\n            total_embedding_llm_calls=generation_stats.total_embedding_llm_calls,\n            answer=answer,\n            passed_operator=passed_operator,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n\n    def __call__(self, candidate: DataRecord) -> DataRecordSet:\n        start_time = time.time()\n\n        # apply the filter operation\n        field_answers, generation_stats = self.filter(candidate)\n\n        # create and return record set\n        record_set = self._create_record_set(\n            candidate,\n            field_answers[\"passed_operator\"],\n            generation_stats,\n            time.time() - start_time,\n            field_answers\n        )\n\n        return record_set\n\n\nclass NonLLMFilter(FilterOp):\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates):\n        # estimate output cardinality using a constant assumption of the filter selectivity\n        selectivity = NAIVE_EST_FILTER_SELECTIVITY\n        cardinality = selectivity * source_op_cost_estimates.cardinality\n\n        # estimate 1 ms single-threaded execution for filter function\n        time_per_record = 0.001\n\n        # assume filter fn has perfect quality\n        return OperatorCostEstimates(\n            cardinality=cardinality,\n            time_per_record=time_per_record,\n            cost_per_record=0.0,\n            quality=1.0,\n        )\n    \n    def filter(self, candidate: DataRecord) -> tuple[dict[str, 
bool], GenerationStats]:\n        # apply filter function to input record\n        start_time = time.time()\n        answer = {}\n        try:\n            # execute the UDF filter\n            passed_operator = self.filter_obj.filter_fn(candidate.to_dict())\n            answer = {\"passed_operator\": passed_operator}\n\n            if self.verbose:\n                print(f\"{self.filter_obj.get_filter_str()}:\\n{passed_operator}\")\n\n        except Exception as e:\n            print(f\"Error invoking user-defined function for filter: {e}\")\n            raise e\n\n        # create generation stats object containing the time spent executing the UDF function\n        generation_stats = GenerationStats(fn_call_duration_secs=time.time() - start_time)\n\n        return answer, generation_stats\n\n\nclass LLMFilter(FilterOp):\n    def __init__(\n        self,\n        model: Model,\n        prompt_strategy: PromptStrategy = PromptStrategy.FILTER,\n        reasoning_effort: str = \"default\",\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        self.model = model\n        self.prompt_strategy = prompt_strategy\n        self.reasoning_effort = reasoning_effort\n        if model is not None:\n            self.generator = Generator(model, prompt_strategy, reasoning_effort, Cardinality.ONE_TO_ONE, self.desc, self.verbose)\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"model\": None if self.model is None else self.model.value,\n            \"prompt_strategy\": None if self.prompt_strategy is None else self.prompt_strategy.value,\n            \"reasoning_effort\": self.reasoning_effort,\n            **id_params,\n        }\n\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"model\": self.model,\n            \"prompt_strategy\": self.prompt_strategy,\n            
\"reasoning_effort\": self.reasoning_effort,\n            **op_params,\n        }\n\n        return op_params\n\n    def get_model_name(self):\n        return None if self.model is None else self.model.value\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates):\n        # estimate number of input tokens from source\n        est_num_input_tokens = NAIVE_EST_NUM_INPUT_TOKENS\n        if self.is_image_op():\n            est_num_input_tokens = 765 / 10  # 1024x1024 image is 765 tokens\n\n        # NOTE: the output often generates an entire reasoning sentence, thus the true value may be higher\n        # the filter operation's LLM call should only output TRUE or FALSE, thus we expect its\n        # number of output tokens to be ~1.25\n        est_num_output_tokens = 1.25\n\n        # get est. of conversion time per record from model card;\n        model_conversion_time_per_record = (\n            self.model.get_seconds_per_output_token() * est_num_output_tokens\n        )\n\n        # get est. 
of conversion cost (in USD) per record from model card\n        usd_per_input_token = (\n            self.model.get_usd_per_audio_input_token()\n            if self.is_audio_op()\n            else self.model.get_usd_per_input_token()\n        )\n        model_conversion_usd_per_record = (\n            usd_per_input_token * est_num_input_tokens\n            + self.model.get_usd_per_output_token() * est_num_output_tokens\n        )\n\n        # estimate output cardinality using a constant assumption of the filter selectivity\n        selectivity = NAIVE_EST_FILTER_SELECTIVITY\n        cardinality = selectivity * source_op_cost_estimates.cardinality\n\n        # estimate quality of output based on the strength of the model being used\n        quality = (self.model.get_overall_score() / 100.0)\n\n        return OperatorCostEstimates(\n            cardinality=cardinality,\n            time_per_record=model_conversion_time_per_record,\n            cost_per_record=model_conversion_usd_per_record,\n            quality=quality,\n        )\n\n    def filter(self, candidate: DataRecord) -> tuple[dict[str, bool], GenerationStats]:\n        # get the set of input fields to use for the filter operation\n        input_fields = self.get_input_fields()\n\n        # construct kwargs for generation\n        gen_kwargs = {\"project_cols\": input_fields, \"filter_condition\": self.filter_obj.filter_condition}\n\n        # generate output; NOTE: FieldInfo is used to indicate the output type; thus, the desc is not needed\n        fields = {\"passed_operator\": FieldInfo(annotation=bool, description=\"Whether the record passed the filter operation\")}\n        field_answers, _, generation_stats, _ = self.generator(candidate, fields, **gen_kwargs)\n\n        return field_answers, generation_stats\n"
  },
  {
    "path": "src/palimpzest/query/operators/join.py",
    "content": "from __future__ import annotations\n\nimport threading\nimport time\nfrom abc import ABC, abstractmethod\nfrom concurrent.futures import ThreadPoolExecutor, as_completed\n\nimport numpy as np\nfrom litellm import embedding as litellm_embedding\nfrom numpy.linalg import norm\nfrom PIL import Image\nfrom pydantic.fields import FieldInfo\nfrom sentence_transformers import SentenceTransformer\n\nfrom palimpzest.constants import (\n    NAIVE_EST_JOIN_SELECTIVITY,\n    NAIVE_EST_NUM_INPUT_TOKENS,\n    Cardinality,\n    Model,\n    PromptStrategy,\n)\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.lib.schemas import AUDIO_FIELD_TYPES, IMAGE_FIELD_TYPES, ImageFilepath\nfrom palimpzest.core.models import GenerationStats, OperatorCostEstimates, RecordOpStats\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\n\nclass Singleton:\n     def __new__(cls, *args, **kw):\n         if not hasattr(cls, '_instance'):\n             orig = super(Singleton, cls)  # noqa: UP008\n             cls._instance = orig.__new__(cls, *args, **kw)\n         return cls._instance\n\nclass Locks(Singleton):\n    model = None\n    clip_lock = threading.Lock()\n    exec_lock = threading.Lock()\n\n    @classmethod\n    def get_model(cls, model_name: str):\n        with cls.clip_lock:\n            if cls.model is None:\n                cls.model = SentenceTransformer(model_name)\n            return cls.model\n\ndef compute_similarity(left_embedding: list[float], right_embedding: list[float]) -> float:\n    \"\"\"\n    Compute the similarity between two embeddings using cosine similarity.\n    \"\"\"\n    return np.dot(left_embedding, right_embedding) / (norm(left_embedding) * norm(right_embedding))\n\n\nclass JoinOp(PhysicalOperator, ABC):\n    def __init__(\n        self,\n        condition: str,\n        how: str = \"inner\",\n        on: list[str] | None = None,\n 
       join_parallelism: int = 64,\n        retain_inputs: bool = True,\n        desc: str | None = None,\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        assert self.input_schema == self.output_schema, \"Input and output schemas must match for JoinOp\"\n        self.condition = condition\n        self.how = how\n        self.on = on\n        self.join_parallelism = join_parallelism\n        self.retain_inputs = retain_inputs\n        self.desc = desc\n        self.join_idx = 0\n        self.finished = False\n\n        # maintain list(s) of input records for the join\n        self._left_input_records: list[DataRecord] = []\n        self._right_input_records: list[DataRecord] = []\n\n        # maintain set of left/right record ids that have been joined (for left/right/outer joins)\n        self._left_joined_record_ids: set[str] = set()\n        self._right_joined_record_ids: set[str] = set()\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Condition: {self.condition}\\n\"\n        op += f\"    How: {self.how}\\n\"\n        op += f\"    On: {self.on}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"condition\": self.condition,\n            \"join_parallelism\": self.join_parallelism,\n            \"desc\": self.desc,\n            \"how\": self.how,\n            \"on\": self.on,\n            **id_params,\n        }\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"condition\": self.condition,\n            \"join_parallelism\": self.join_parallelism,\n            \"retain_inputs\": self.retain_inputs,\n            \"desc\": self.desc,\n            \"how\": self.how,\n            \"on\": self.on,\n            **op_params,\n        }\n        return op_params\n\n    def _compute_unmatched_records(self) -> 
DataRecordSet:\n        \"\"\"Helper function to compute unmatched records for left/right/outer joins.\"\"\"\n        def join_unmatched_records(input_records: list[DataRecord] | list[tuple[DataRecord, list[float]]], joined_record_ids: set[str], left: bool = True):\n            records, record_op_stats_lst = [], []\n            for record in input_records:\n                start_time = time.time()\n                record = record[0] if isinstance(record, tuple) else record\n                if record._id not in joined_record_ids:\n                    unmatched_dr = (\n                        DataRecord.from_join_parents(self.output_schema, record, None)\n                        if left\n                        else DataRecord.from_join_parents(self.output_schema, None, record)\n                    )\n                    unmatched_dr._passed_operator = True\n\n                    # compute record stats and add to output_record_op_stats\n                    time_per_record = time.time() - start_time\n                    record_op_stats = RecordOpStats(\n                        record_id=unmatched_dr._id,\n                        record_parent_ids=unmatched_dr._parent_ids,\n                        record_source_indices=unmatched_dr._source_indices,\n                        record_state=unmatched_dr.to_dict(include_bytes=False),\n                        full_op_id=self.get_full_op_id(),\n                        logical_op_id=self.logical_op_id,\n                        op_name=self.op_name(),\n                        time_per_record=time_per_record,\n                        cost_per_record=0.0,\n                        model_name=self.get_model_name(),\n                        join_condition=str(self.on),\n                        fn_call_duration_secs=time_per_record,\n                        answer={\"passed_operator\": True},\n                        passed_operator=True,\n                        op_details={k: str(v) for k, v in self.get_id_params().items()},\n       
             )\n                    records.append(unmatched_dr)\n                    record_op_stats_lst.append(record_op_stats)\n            return records, record_op_stats_lst\n\n        records, record_op_stats = [], []\n        if self.how == \"left\":\n            records, record_op_stats = join_unmatched_records(self._left_input_records, self._left_joined_record_ids, left=True)\n\n        elif self.how == \"right\":\n            records, record_op_stats = join_unmatched_records(self._right_input_records, self._right_joined_record_ids, left=False)\n\n        elif self.how == \"outer\":\n            records, record_op_stats = join_unmatched_records(self._left_input_records, self._left_joined_record_ids, left=True)\n            right_records, right_record_op_stats = join_unmatched_records(self._right_input_records, self._right_joined_record_ids, left=False)\n            records.extend(right_records)\n            record_op_stats.extend(right_record_op_stats)\n\n        return DataRecordSet(records, record_op_stats)\n\n    @abstractmethod\n    def naive_cost_estimates(self, left_source_op_cost_estimates: OperatorCostEstimates, right_source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        pass\n\n    def set_finished(self):\n        \"\"\"Mark the operator as finished after computing left/right/outer join logic.\"\"\"\n        self.finished = True\n\nclass RelationalJoin(JoinOp):\n\n    def get_model_name(self):\n        return None\n    \n    def _process_join_candidate_pair(self, left_candidate, right_candidate) -> tuple[DataRecord, RecordOpStats]:\n        start_time = time.time()\n\n        # determine whether or not the join was satisfied\n        passed_operator = all(\n            left_candidate[field] == right_candidate[field]\n            for field in self.on\n        )\n\n        # handle different join types\n        if self.how == \"left\" and passed_operator:\n            
self._left_joined_record_ids.add(left_candidate._id)\n        elif self.how == \"right\" and passed_operator:\n            self._right_joined_record_ids.add(right_candidate._id)\n        elif self.how == \"outer\" and passed_operator:\n            self._left_joined_record_ids.add(left_candidate._id)\n            self._right_joined_record_ids.add(right_candidate._id)\n\n        # compute output record and add to output_records\n        join_dr = DataRecord.from_join_parents(self.output_schema, left_candidate, right_candidate)\n        join_dr._passed_operator = passed_operator\n\n        # compute record stats and add to output_record_op_stats\n        time_per_record = time.time() - start_time\n        record_op_stats = RecordOpStats(\n            record_id=join_dr._id,\n            record_parent_ids=join_dr._parent_ids,\n            record_source_indices=join_dr._source_indices,\n            record_state=join_dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=time_per_record,\n            cost_per_record=0.0,\n            model_name=self.get_model_name(),\n            join_condition=str(self.on),\n            fn_call_duration_secs=time_per_record,\n            answer={\"passed_operator\": passed_operator},\n            passed_operator=passed_operator,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return join_dr, record_op_stats\n\n    def naive_cost_estimates(self, left_source_op_cost_estimates: OperatorCostEstimates, right_source_op_cost_estimates: OperatorCostEstimates):\n        # estimate output cardinality using a constant assumption of the filter selectivity\n        selectivity = NAIVE_EST_JOIN_SELECTIVITY\n        cardinality = selectivity * (left_source_op_cost_estimates.cardinality * right_source_op_cost_estimates.cardinality)\n\n        # estimate 1 ms 
execution time per input record pair\n        time_per_record = 0.001 * (left_source_op_cost_estimates.cardinality + right_source_op_cost_estimates.cardinality)\n\n        return OperatorCostEstimates(\n            cardinality=cardinality,\n            time_per_record=time_per_record,\n            cost_per_record=0.0,\n            quality=1.0,\n        )\n\n    def __call__(self, left_candidates: list[DataRecord], right_candidates: list[DataRecord], final: bool = False) -> tuple[DataRecordSet, int]:\n        # create the set of candidates to join\n        join_candidates = []\n        for candidate in left_candidates:\n            for right_candidate in right_candidates:\n                join_candidates.append((candidate, right_candidate))\n            for right_candidate in self._right_input_records:\n                join_candidates.append((candidate, right_candidate))\n        for candidate in self._left_input_records:\n            for right_candidate in right_candidates:\n                join_candidates.append((candidate, right_candidate))\n\n        # apply the join logic to each pair of candidates\n        output_records, output_record_op_stats = [], []\n        with ThreadPoolExecutor(max_workers=self.join_parallelism) as executor:\n            futures = [\n                executor.submit(self._process_join_candidate_pair, candidate, right_candidate)\n                for candidate, right_candidate in join_candidates\n            ]\n  \n            # collect results as they complete\n            for future in as_completed(futures):\n                self.join_idx += 1\n                join_output_record, join_output_record_op_stats = future.result()\n                output_records.append(join_output_record)\n                output_record_op_stats.append(join_output_record_op_stats)\n\n        # compute the number of inputs processed\n        num_inputs_processed = len(join_candidates)\n\n        # store input records to join with new records added later\n       
 if self.retain_inputs:\n            self._left_input_records.extend(left_candidates)\n            self._right_input_records.extend(right_candidates)\n        \n        # if this is the final call, then add in any left/right/outer join records that did not match\n        if final:\n            return self._compute_unmatched_records(), 0\n\n        # return empty DataRecordSet if no output records were produced\n        if len(output_records) == 0:\n            return DataRecordSet([], []), num_inputs_processed\n\n        return DataRecordSet(output_records, output_record_op_stats), num_inputs_processed\n\n\n\nclass LLMJoin(JoinOp):\n    def __init__(\n        self,\n        model: Model,\n        prompt_strategy: PromptStrategy = PromptStrategy.JOIN,\n        reasoning_effort: str = \"default\",\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        self.model = model\n        self.prompt_strategy = prompt_strategy\n        self.reasoning_effort = reasoning_effort\n        self.generator = Generator(model, prompt_strategy, reasoning_effort, Cardinality.ONE_TO_ONE, self.desc, self.verbose)\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Model: {self.model.value}\\n\"\n        op += f\"    Reasoning Effort: {self.reasoning_effort}\\n\"\n        op += f\"    Prompt Strategy: {self.prompt_strategy.value}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"model\": self.model.value,\n            \"prompt_strategy\": self.prompt_strategy.value,\n            \"reasoning_effort\": self.reasoning_effort,\n            **id_params,\n        }\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"model\": self.model,\n            \"prompt_strategy\": self.prompt_strategy,\n            \"reasoning_effort\": self.reasoning_effort,\n      
      **op_params,\n        }\n        return op_params\n\n    def get_model_name(self):\n        return self.model.value\n\n    def _process_join_candidate_pair(\n        self,\n        left_candidate: DataRecord,\n        right_candidate: DataRecord,\n        gen_kwargs: dict,\n    ) -> tuple[DataRecord, RecordOpStats]:\n        start_time = time.time()\n\n        # generate output; NOTE: FieldInfo is used to indicate the output type; thus, the desc is not needed\n        fields = {\"passed_operator\": FieldInfo(annotation=bool, description=\"Whether the records satisfy the join condition\")}\n        field_answers, _, generation_stats, _ = self.generator(left_candidate, fields, right_candidate=right_candidate, **gen_kwargs)\n\n        # determine whether or not the join was satisfied\n        passed_operator = field_answers[\"passed_operator\"]\n\n        # handle different join types\n        if self.how == \"left\" and passed_operator:\n            self._left_joined_record_ids.add(left_candidate._id)\n        elif self.how == \"right\" and passed_operator:\n            self._right_joined_record_ids.add(right_candidate._id)\n        elif self.how == \"outer\" and passed_operator:\n            self._left_joined_record_ids.add(left_candidate._id)\n            self._right_joined_record_ids.add(right_candidate._id)\n\n        # compute output record and add to output_records\n        join_dr = DataRecord.from_join_parents(self.output_schema, left_candidate, right_candidate)\n        join_dr._passed_operator = passed_operator\n\n        # compute record stats and add to output_record_op_stats\n        record_op_stats = RecordOpStats(\n            record_id=join_dr._id,\n            record_parent_ids=join_dr._parent_ids,\n            record_source_indices=join_dr._source_indices,\n            record_state=join_dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            
op_name=self.op_name(),\n            time_per_record=time.time() - start_time,\n            cost_per_record=generation_stats.cost_per_record,\n            model_name=self.get_model_name(),\n            join_condition=self.condition,\n            input_text_tokens=generation_stats.input_text_tokens,\n            input_audio_tokens=generation_stats.input_audio_tokens,\n            input_image_tokens=generation_stats.input_image_tokens,\n            cache_read_tokens=generation_stats.cache_read_tokens,\n            cache_creation_tokens=generation_stats.cache_creation_tokens,\n            output_text_tokens=generation_stats.output_text_tokens,\n            embedding_input_tokens=generation_stats.embedding_input_tokens,\n            llm_call_duration_secs=generation_stats.llm_call_duration_secs,\n            fn_call_duration_secs=generation_stats.fn_call_duration_secs,\n            total_llm_calls=generation_stats.total_llm_calls,\n            total_embedding_llm_calls=generation_stats.total_embedding_llm_calls,\n            answer=field_answers,\n            passed_operator=passed_operator,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return join_dr, record_op_stats\n\n\nclass NestedLoopsJoin(LLMJoin):\n\n    def naive_cost_estimates(self, left_source_op_cost_estimates: OperatorCostEstimates, right_source_op_cost_estimates: OperatorCostEstimates):\n        # estimate number of input tokens from source\n        est_num_input_tokens = 2 * NAIVE_EST_NUM_INPUT_TOKENS\n        if self.is_image_op():\n            est_num_input_tokens = 2 * 765 / 10  # 1024x1024 image is 765 tokens\n\n        # NOTE: the output often generates an entire reasoning sentence, thus the true value may be higher\n        # the filter operation's LLM call should only output TRUE or FALSE, thus we expect its\n        # number of output tokens to be ~1.25\n        est_num_output_tokens = 1.25\n\n        # get est. 
of conversion time per record from model card;\n        model_conversion_time_per_record = (\n            self.model.get_seconds_per_output_token() * est_num_output_tokens\n        )\n\n        # get est. of conversion cost (in USD) per record from model card\n        usd_per_input_token = (\n            self.model.get_usd_per_audio_input_token()\n            if self.is_audio_op()\n            else self.model.get_usd_per_input_token()\n        )\n\n        model_conversion_usd_per_record = (\n            usd_per_input_token * est_num_input_tokens\n            + self.model.get_usd_per_output_token() * est_num_output_tokens\n        )\n\n        # estimate output cardinality using a constant assumption of the filter selectivity\n        selectivity = NAIVE_EST_JOIN_SELECTIVITY\n        cardinality = selectivity * (left_source_op_cost_estimates.cardinality * right_source_op_cost_estimates.cardinality)\n\n        # estimate quality of output based on the strength of the model being used\n        quality = (self.model.get_overall_score() / 100.0)\n\n        return OperatorCostEstimates(\n            cardinality=cardinality,\n            time_per_record=model_conversion_time_per_record,\n            cost_per_record=model_conversion_usd_per_record,\n            quality=quality,\n        )\n\n    def __call__(self, left_candidates: list[DataRecord], right_candidates: list[DataRecord], final: bool = False) -> tuple[DataRecordSet, int]:\n        # get the set of input fields from both records in the join\n        input_fields = self.get_input_fields()\n\n        # construct kwargs for generation\n        gen_kwargs = {\"project_cols\": input_fields, \"join_condition\": self.condition}\n\n        # create the set of candidates to join\n        join_candidates = []\n        for candidate in left_candidates:\n            for right_candidate in right_candidates:\n                join_candidates.append((candidate, right_candidate))\n            for right_candidate in 
self._right_input_records:\n                join_candidates.append((candidate, right_candidate))\n        for candidate in self._left_input_records:\n            for right_candidate in right_candidates:\n                join_candidates.append((candidate, right_candidate))\n\n        # apply the generator to each pair of candidates\n        output_records, output_record_op_stats = [], []\n        with ThreadPoolExecutor(max_workers=self.join_parallelism) as executor:\n            futures = [\n                executor.submit(self._process_join_candidate_pair, candidate, right_candidate, gen_kwargs)\n                for candidate, right_candidate in join_candidates\n            ]\n  \n            # collect results as they complete\n            for future in as_completed(futures):\n                self.join_idx += 1\n                join_output_record, join_output_record_op_stats = future.result()\n                output_records.append(join_output_record)\n                output_record_op_stats.append(join_output_record_op_stats)\n                print(f\"{self.join_idx} JOINED\")\n\n        # compute the number of inputs processed\n        num_inputs_processed = len(join_candidates)\n\n        # store input records to join with new records added later\n        if self.retain_inputs:\n            self._left_input_records.extend(left_candidates)\n            self._right_input_records.extend(right_candidates)\n\n        # if this is the final call, then add in any left/right/outer join records that did not match\n        if final:\n            return self._compute_unmatched_records(), 0\n\n        # return empty DataRecordSet if no output records were produced\n        if len(output_records) == 0:\n            return DataRecordSet([], []), num_inputs_processed\n\n        return DataRecordSet(output_records, output_record_op_stats), num_inputs_processed\n\n\nclass EmbeddingJoin(LLMJoin):\n    # NOTE: we currently do not support audio joins as embedding models for audio 
seem to have\n    # specialized use cases (e.g., speech-to-text) with strict requirements on things like sample rate\n    def __init__(\n        self,\n        embedding_model: Model,\n        num_samples: int = 10,\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        self.num_samples = num_samples\n        self.samples_drawn = 0\n        self.embedding_model = embedding_model\n\n        # compute whether all fields are text fields\n        self.text_only = all([\n            field.annotation not in IMAGE_FIELD_TYPES + AUDIO_FIELD_TYPES\n            for field_name, field in self.input_schema.model_fields.items()\n            if field_name.split(\".\")[-1] in self.get_input_fields()\n        ])\n        self.locks = Locks()\n\n        # keep track of embedding costs that could not be amortized if no output records were produced\n        self.residual_embedding_cost = 0.0\n\n        # crude adjustment factor for naive estimation in unoptimized setting\n        self.naive_quality_adjustment = 0.6\n\n        # maintain list(s) of input records and their embeddings for the join\n        self._left_input_records: list[tuple[DataRecord, list[float]]] = []\n        self._right_input_records: list[tuple[DataRecord, list[float]]] = []\n\n        # maintain lowest and highest embedding similarities for matching and non-matching pairs\n        self.min_matching_sim = float(\"inf\")\n        self.max_non_matching_sim = float(\"-inf\")\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Embedding Model: {self.embedding_model.value}\\n\"\n        op += f\"    Num Samples: {self.num_samples}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"embedding_model\": self.embedding_model.value,\n            \"num_samples\": self.num_samples,\n            **id_params,\n        }\n\n        return id_params\n\n    def 
get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"embedding_model\": self.embedding_model,\n            \"num_samples\": self.num_samples,\n            **op_params,\n        }\n\n        return op_params\n\n    def naive_cost_estimates(self, left_source_op_cost_estimates: OperatorCostEstimates, right_source_op_cost_estimates: OperatorCostEstimates):\n        # estimate number of input tokens from source\n        est_num_input_tokens = 2 * NAIVE_EST_NUM_INPUT_TOKENS\n        if self.is_image_op():\n            est_num_input_tokens = 2 * 765 / 10  # 1024x1024 image is 765 tokens\n\n        # NOTE: the output often generates an entire reasoning sentence, thus the true value may be higher\n        # the filter operation's LLM call should only output TRUE or FALSE, thus we expect its\n        # number of output tokens to be ~1.25\n        est_num_output_tokens = 1.25\n\n        # get est. of conversion time per record from model card;\n        model_conversion_time_per_record = (\n            self.embedding_model.get_seconds_per_output_token() * est_num_output_tokens\n        )\n\n        # get est. 
of conversion cost (in USD) per record from model card\n        model_conversion_usd_per_record = self.embedding_model.get_usd_per_input_token() * est_num_input_tokens\n\n        # estimate output cardinality using a constant assumption of the filter selectivity\n        selectivity = NAIVE_EST_JOIN_SELECTIVITY\n        cardinality = selectivity * (left_source_op_cost_estimates.cardinality * right_source_op_cost_estimates.cardinality)\n\n        # estimate quality of output based on the strength of the model being used\n        quality = (self.model.get_overall_score() / 100.0) * self.naive_quality_adjustment\n\n        return OperatorCostEstimates(\n            cardinality=cardinality,\n            time_per_record=model_conversion_time_per_record,\n            cost_per_record=model_conversion_usd_per_record,\n            quality=quality,\n        )\n\n    def _compute_embeddings(self, candidates: list[DataRecord], input_fields: list[str]) -> tuple[np.ndarray, GenerationStats]:\n        # return empty array and empty stats if no candidates  \n        if len(candidates) == 0:\n            return np.zeros((0, 512)), GenerationStats()\n\n        start_time = time.time()\n        total_embedding_input_tokens = 0\n        embeddings = None\n        if self.text_only:\n            inputs = [dr.to_json_str(bytes_to_str=True, project_cols=input_fields, sorted=True) for dr in candidates]\n            response = litellm_embedding(input=inputs, model=self.embedding_model.value)\n            total_embedding_input_tokens = response.usage.total_tokens if response.usage is not None else 0\n            embeddings = np.array([item['embedding'] for item in response.data])\n        else:\n            model = self.locks.get_model(self.embedding_model.value)\n            embeddings = np.zeros((len(candidates), 512))  # CLIP embeddings are 512-dimensional\n            num_input_fields_present = 0\n            for field in input_fields:\n                field_inputs = []\n                
for candidate in candidates:\n                    if field not in candidate.get_field_names():\n                        continue\n                    num_input_fields_present += 1\n                    field_type = candidate.get_field_type(field)\n                    if field_type in [ImageFilepath]:\n                        field_inputs.append(Image.open(candidate[field]))\n                    else:\n                        field_inputs.append(str(candidate[field]))\n                \n                if len(field_inputs) > 0:\n                    embeddings += model.encode(field_inputs, convert_to_numpy=True)\n\n            # average embeddings over input fields present in candidates\n            embeddings /= num_input_fields_present\n\n        # compute cost of embedding(s)\n        total_embedding_cost = self.embedding_model.get_usd_per_input_token() * total_embedding_input_tokens\n        embedding_gen_stats = GenerationStats(\n            model_name=self.embedding_model.value,\n            embedding_input_tokens=total_embedding_input_tokens,\n            cost_per_record=total_embedding_cost,\n            llm_call_duration_secs=time.time() - start_time,\n            total_llm_calls=1,\n            total_embedding_llm_calls=len(candidates),\n        )\n\n        return embeddings, embedding_gen_stats\n\n    def _process_join_candidate_pair(self, left_candidate, right_candidate, gen_kwargs, embedding_sim):\n        output_record, output_record_op_stats = super()._process_join_candidate_pair(left_candidate, right_candidate, gen_kwargs)\n        return output_record, output_record_op_stats, embedding_sim\n\n    def _process_join_candidate_with_sim(self, left_candidate: DataRecord, right_candidate: DataRecord, embedding_sim: float, passed_operator: bool) -> tuple[DataRecord, RecordOpStats]:\n        # compute output record and add to output_records\n        join_dr = DataRecord.from_join_parents(self.output_schema, left_candidate, right_candidate)\n        
join_dr._passed_operator = passed_operator\n\n        # handle different join types\n        if self.how == \"left\" and passed_operator:\n            self._left_joined_record_ids.add(left_candidate._id)\n        elif self.how == \"right\" and passed_operator:\n            self._right_joined_record_ids.add(right_candidate._id)\n        elif self.how == \"outer\" and passed_operator:\n            self._left_joined_record_ids.add(left_candidate._id)\n            self._right_joined_record_ids.add(right_candidate._id)\n\n        # NOTE: embedding costs are amortized over all records and added at the end of __call__\n        # compute record stats and add to output_record_op_stats\n        record_op_stats = RecordOpStats(\n            record_id=join_dr._id,\n            record_parent_ids=join_dr._parent_ids,\n            record_source_indices=join_dr._source_indices,\n            record_state=join_dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=0.0,\n            cost_per_record=0.0,\n            model_name=self.get_model_name(),\n            join_condition=self.condition,\n            answer={\"passed_operator\": passed_operator},\n            passed_operator=passed_operator,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return join_dr, record_op_stats, embedding_sim\n\n    def __call__(self, left_candidates: list[DataRecord], right_candidates: list[DataRecord], final: bool = False) -> tuple[DataRecordSet, int]:\n        # get the set of input fields from both records in the join\n        input_fields = self.get_input_fields()\n\n        # compute the embedding for each candidate\n        left_embeddings, left_embedding_gen_stats = self._compute_embeddings(left_candidates, input_fields)\n        right_embeddings, right_embedding_gen_stats = 
self._compute_embeddings(right_candidates, input_fields)\n        total_embedding_cost = left_embedding_gen_stats.cost_per_record + right_embedding_gen_stats.cost_per_record + self.residual_embedding_cost\n        self.residual_embedding_cost = 0.0\n\n        # construct kwargs for generation\n        gen_kwargs = {\"project_cols\": input_fields, \"join_condition\": self.condition}\n\n        # TODO: add embeddings to join candidates\n        # create the set of candidates to join\n        join_candidates = []\n        for candidate, embedding in zip(left_candidates, left_embeddings):\n            for right_candidate, right_embedding in zip(right_candidates, right_embeddings):\n                embedding_sim = compute_similarity(embedding, right_embedding)\n                join_candidates.append((candidate, right_candidate, embedding_sim))\n            for right_candidate, right_embedding in self._right_input_records:\n                embedding_sim = compute_similarity(embedding, right_embedding)\n                join_candidates.append((candidate, right_candidate, embedding_sim))\n        for candidate, embedding in self._left_input_records:\n            for right_candidate, right_embedding in zip(right_candidates, right_embeddings):\n                embedding_sim = compute_similarity(embedding, right_embedding)\n                join_candidates.append((candidate, right_candidate, embedding_sim))\n\n        # prepare list of output records and their stats\n        output_records, output_record_op_stats, num_inputs_processed = [], [], 0\n\n        # draw samples until num_samples is reached\n        with self.locks.exec_lock:\n            if self.samples_drawn < self.num_samples:\n                samples_to_draw = min(self.num_samples - self.samples_drawn, len(join_candidates))\n                join_candidate_samples = join_candidates[:samples_to_draw]\n                join_candidates = join_candidates[samples_to_draw:]\n\n                # apply the generator to each 
pair of candidates\n                with ThreadPoolExecutor(max_workers=self.join_parallelism) as executor:\n                    futures = [\n                        executor.submit(self._process_join_candidate_pair, left_candidate, right_candidate, gen_kwargs, embedding_sim)\n                        for left_candidate, right_candidate, embedding_sim in join_candidate_samples\n                    ]\n\n                    # collect results as they complete\n                    similarities, joined = [], []\n                    for future in as_completed(futures):\n                        self.join_idx += 1\n                        join_output_record, join_output_record_op_stats, embedding_sim = future.result()\n                        output_records.append(join_output_record)\n                        output_record_op_stats.append(join_output_record_op_stats)\n                        similarities.append(embedding_sim)\n                        joined.append(join_output_record._passed_operator)\n                        print(f\"{self.join_idx} JOINED\")\n\n                    # sort join results by embedding similarity\n                    sorted_sim_join_tuples = sorted(zip(similarities, joined), key=lambda x: x[0])\n\n                    # compute threshold below which no records joined\n                    for embedding_sim, records_joined in sorted_sim_join_tuples:\n                        if records_joined:\n                            break\n                        if not records_joined and embedding_sim > self.max_non_matching_sim:\n                            self.max_non_matching_sim = embedding_sim\n\n                    # compute threshold above which all records joined\n                    for embedding_sim, records_joined in reversed(sorted_sim_join_tuples):\n                        if not records_joined:\n                            break\n                        if records_joined and embedding_sim < self.min_matching_sim:\n                            
self.min_matching_sim = embedding_sim\n\n                # update samples drawn and num_inputs_processed\n                self.samples_drawn += samples_to_draw\n                num_inputs_processed += samples_to_draw\n\n        # process remaining candidates based on embedding similarity\n        if len(join_candidates) > 0:\n             assert self.samples_drawn >= self.num_samples, \"All samples should have been drawn before processing remaining candidates\"\n             with ThreadPoolExecutor(max_workers=self.join_parallelism) as executor:\n                futures = []\n                for left_candidate, right_candidate, embedding_sim in join_candidates:\n                    # if the embedding similarity is lower than the threshold below which no records joined,\n                    # then we can skip the LLM call and mark the records as not joined\n                    if embedding_sim < self.max_non_matching_sim:\n                        futures.append(executor.submit(self._process_join_candidate_with_sim, left_candidate, right_candidate, embedding_sim, passed_operator=False))\n\n                    # if the embedding similarity is higher than the threshold above which all records joined,\n                    # then we can skip the LLM call and mark the records as joined\n                    elif embedding_sim > self.min_matching_sim:\n                        futures.append(executor.submit(self._process_join_candidate_with_sim, left_candidate, right_candidate, embedding_sim, passed_operator=True))\n\n                    # otherwise, we will process the LLM call\n                    else:\n                        futures.append(executor.submit(self._process_join_candidate_pair, left_candidate, right_candidate, gen_kwargs, embedding_sim))\n\n                    num_inputs_processed += 1\n\n                # collect results as they complete\n                similarities, joined = [], []\n                for future in as_completed(futures):\n                    
self.join_idx += 1\n                    join_output_record, join_output_record_op_stats, embedding_sim = future.result()\n                    output_records.append(join_output_record)\n                    output_record_op_stats.append(join_output_record_op_stats)\n                    similarities.append(embedding_sim)\n                    joined.append(join_output_record._passed_operator)\n                    print(f\"{self.join_idx} JOINED\")\n\n                ### update thresholds if there are llm calls which incrementally squeeze the boundaries ###\n                # sort join results by embedding similarity\n                sorted_sim_join_tuples = sorted(zip(similarities, joined), key=lambda x: x[0])\n\n                # potentially update threshold below which no records joined\n                for embedding_sim, records_joined in sorted_sim_join_tuples:\n                    if records_joined:\n                        break\n                    if not records_joined and embedding_sim > self.max_non_matching_sim:\n                        self.max_non_matching_sim = embedding_sim\n\n                # potentially update threshold above which all records joined\n                for embedding_sim, records_joined in reversed(sorted_sim_join_tuples):\n                    if not records_joined:\n                        break\n                    if records_joined and embedding_sim < self.min_matching_sim:\n                        self.min_matching_sim = embedding_sim\n\n        # amortize embedding costs over all output records and add to each record's op stats\n        amortized_embedding_cost = total_embedding_cost / len(output_record_op_stats) if len(output_record_op_stats) > 0 else 0.0\n        for record_op_stats in output_record_op_stats:\n            record_op_stats.cost_per_record += amortized_embedding_cost\n\n        # store input records to join with new records added later\n        if self.retain_inputs:\n            
self._left_input_records.extend(zip(left_candidates, left_embeddings))\n            self._right_input_records.extend(zip(right_candidates, right_embeddings))\n\n        # if this is the final call, then add in any left/right/outer join records that did not match\n        if final:\n            return self._compute_unmatched_records(), 0\n\n        # return empty DataRecordSet if no output records were produced\n        if len(output_records) == 0:\n            self.residual_embedding_cost = total_embedding_cost\n            return DataRecordSet([], []), num_inputs_processed\n\n        return DataRecordSet(output_records, output_record_op_stats), num_inputs_processed\n"
  },
  {
    "path": "src/palimpzest/query/operators/limit.py",
    "content": "from __future__ import annotations\n\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import OperatorCostEstimates, RecordOpStats\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\n\nclass LimitScanOp(PhysicalOperator):\n    def __init__(self, limit: int, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.limit = limit\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Limit: {self.limit}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\"limit\": self.limit, **id_params}\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"limit\": self.limit, **op_params}\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        # for now, assume applying the limit takes negligible additional time (and no cost in USD)\n        return OperatorCostEstimates(\n            cardinality=min(self.limit, source_op_cost_estimates.cardinality),\n            time_per_record=0,\n            cost_per_record=0,\n            quality=1.0,\n        )\n\n    def __call__(self, candidate: DataRecord) -> DataRecordSet:\n        # NOTE: execution layer ensures that no more than self.limit\n        #       records are returned to the user by this operator.\n        # create new DataRecord\n        dr = DataRecord.from_parent(schema=candidate.schema, data_item={}, parent_record=candidate)\n\n        # create RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=dr._id,\n            record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n      
      time_per_record=0.0,\n            cost_per_record=0.0,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n"
  },
  {
    "path": "src/palimpzest/query/operators/logical.py",
    "content": "from __future__ import annotations\n\nimport json\nfrom typing import Callable\n\nfrom pydantic import BaseModel\n\nfrom palimpzest.constants import AggFunc, Cardinality\nfrom palimpzest.core.data import context, dataset\nfrom palimpzest.core.elements.filters import Filter\nfrom palimpzest.core.elements.groupbysig import GroupBySig\nfrom palimpzest.core.lib.schemas import Average, Count, Max, Min, Sum\nfrom palimpzest.utils.hash_helpers import hash_for_id\n\n\nclass LogicalOperator:\n    \"\"\"\n    A logical operator is an operator that operates on Sets.\n\n    Right now it can be one of:\n    - BaseScan (scans data from a root Dataset)\n    - ContextScan (loads the context for a root Dataset)\n    - FilteredScan (scans input Set and applies filter)\n    - ConvertScan (scans input Set and converts it to new Schema)\n    - LimitScan (scans up to N records from a Set)\n    - GroupByAggregate (applies a group by on the Set)\n    - Aggregate (applies an aggregation on the Set)\n    - TopKScan (fetches documents from a provided input for a given query)\n    - Map (applies a function to each record in the Set without adding any new columns)\n    - ComputeOperator (executes a computation described in natural language)\n    - SearchOperator (executes a search query on the input Context)\n\n    Every logical operator must declare the get_logical_id_params() and get_logical_op_params() methods,\n    which return dictionaries of parameters that are used to compute the logical op id and to implement\n    the logical operator (respectively).\n    \"\"\"\n\n    def __init__(\n        self,\n        output_schema: type[BaseModel],\n        input_schema: type[BaseModel] | None = None,\n        depends_on: list[str] | None = None,\n    ):\n        # TODO: can we eliminate input_schema?\n        self.output_schema = output_schema\n        self.input_schema = input_schema\n        self.depends_on = [] if depends_on is None else sorted(depends_on)\n        
self.logical_op_id: str | None = None\n        self.unique_logical_op_id: str | None = None\n\n        # compute the fields generated by this logical operator\n        input_field_names = list(self.input_schema.model_fields) if self.input_schema is not None else []\n        self.generated_fields = sorted(\n            [field_name for field_name in self.output_schema.model_fields if field_name not in input_field_names]\n        )\n\n    def __str__(self) -> str:\n        raise NotImplementedError(\"Abstract method\")\n\n    def __eq__(self, other) -> bool:\n        # check the type first so that comparing against an unrelated object returns False instead of raising AttributeError\n        if not isinstance(other, self.__class__):\n            return False\n        return all(value == getattr(other, key) for key, value in self.get_logical_id_params().items())\n\n    def copy(self) -> LogicalOperator:\n        logical_op_copy = self.__class__(**self.get_logical_op_params())\n        logical_op_copy.logical_op_id = self.logical_op_id\n        logical_op_copy.unique_logical_op_id = self.unique_logical_op_id\n        return logical_op_copy\n\n    def logical_op_name(self) -> str:\n        \"\"\"Name of the logical operator.\"\"\"\n        return str(self.__class__.__name__)\n\n    def get_unique_logical_op_id(self) -> str:\n        \"\"\"\n        Get the unique logical operator id for this logical operator.\n        \"\"\"\n        return self.unique_logical_op_id\n\n    def set_unique_logical_op_id(self, unique_logical_op_id: str) -> None:\n        \"\"\"\n        Set the unique logical operator id for this logical operator.\n        This is used to uniquely identify the logical operator in the query plan.\n        \"\"\"\n        self.unique_logical_op_id = unique_logical_op_id\n\n    def get_logical_id_params(self) -> dict:\n        \"\"\"\n        Returns a dictionary mapping of logical operator parameters which are relevant\n        for computing the logical operator id.\n\n        NOTE: Should be overridden by subclasses to include class-specific parameters.\n        NOTE: 
input_schema and output_schema are not included in the id params because\n              they depend on how the Optimizer orders operations.\n        \"\"\"\n        # TODO: should we use `generated_fields` after getting rid of them in PhysicalOperator?\n        return {\"generated_fields\": self.generated_fields}\n\n    def get_logical_op_params(self) -> dict:\n        \"\"\"\n        Returns a dictionary mapping of logical operator parameters which may be used to\n        implement a physical operator associated with this logical operation.\n\n        NOTE: Should be overridden by subclasses to include class-specific parameters.\n        \"\"\"\n        return {\n            \"input_schema\": self.input_schema,\n            \"output_schema\": self.output_schema,\n            \"depends_on\": self.depends_on,\n        }\n\n    def get_logical_op_id(self):\n        \"\"\"\n        TODO: turn this into a property?\n\n        NOTE: We do not call this in the __init__() method as subclasses may set parameters\n              returned by self.get_logical_op_params() after they call to super().__init__().\n        \"\"\"\n        # return self.logical_op_id if we've computed it before\n        if self.logical_op_id is not None:\n            return self.logical_op_id\n\n        # get op name and op parameters which are relevant for computing the id\n        logical_op_name = self.logical_op_name()\n        logical_id_params = self.get_logical_id_params()\n        logical_id_params = {k: str(v) for k, v in logical_id_params.items()}\n\n        # compute, set, and return the op_id\n        hash_str = json.dumps({\"logical_op_name\": logical_op_name, **logical_id_params}, sort_keys=True)\n        self.logical_op_id = hash_for_id(hash_str)\n\n        return self.logical_op_id\n\n    def get_generated_fields(self) -> list[str]:\n        \"\"\"Returns the names of the fields generated by this logical operator.\"\"\"\n        return self.generated_fields\n\n    def 
__hash__(self):\n        if not self.logical_op_id:\n            raise ValueError(\"logical_op_id not set, unable to hash\")\n        return int(self.logical_op_id, 16)\n\n\nclass Aggregate(LogicalOperator):\n    \"\"\"\n    Aggregate is a logical operator that applies an aggregation to the input set and yields a single result.\n    This is a base class that has to be further specialized to implement specific aggregation functions.\n    \"\"\"\n\n    def __init__(\n        self,\n        agg_func: AggFunc | None = None,\n        agg_str: str | None = None,\n        *args,\n        **kwargs,\n    ):\n        assert agg_func is not None or agg_str is not None, \"Either agg_func or agg_str must be provided\"\n        if kwargs.get(\"output_schema\") is None:\n            if agg_func == AggFunc.COUNT:\n                kwargs[\"output_schema\"] = Count\n            elif agg_func == AggFunc.AVERAGE:\n                kwargs[\"output_schema\"] = Average\n            elif agg_func == AggFunc.SUM:\n                kwargs[\"output_schema\"] = Sum\n            elif agg_func == AggFunc.MIN:\n                kwargs[\"output_schema\"] = Min\n            elif agg_func == AggFunc.MAX:\n                kwargs[\"output_schema\"] = Max\n            else:\n                raise ValueError(f\"Unsupported aggregation function: {agg_func}\")\n\n        super().__init__(*args, **kwargs)\n        self.agg_func = agg_func\n        self.agg_str = agg_str\n\n    def __str__(self):\n        desc = f\"function: {str(self.agg_func.value)}\" if self.agg_func else f\"agg: {self.agg_str}\"\n        return f\"{self.__class__.__name__}({desc})\"\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\n            \"agg_func\": self.agg_func,\n            \"agg_str\": self.agg_str,\n            **logical_id_params,\n        }\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        
logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\n            \"agg_func\": self.agg_func,\n            \"agg_str\": self.agg_str,\n            **logical_op_params,\n        }\n\n        return logical_op_params\n\n\nclass BaseScan(LogicalOperator):\n    \"\"\"A BaseScan is a logical operator that represents a scan of a particular root Dataset.\"\"\"\n\n    def __init__(self, datasource: dataset.Dataset, output_schema: type[BaseModel], *args, **kwargs):\n        super().__init__(*args, output_schema=output_schema, **kwargs)\n        self.datasource = datasource\n\n    def __str__(self):\n        return f\"BaseScan({self.datasource},{self.output_schema})\"\n\n    def __eq__(self, other) -> bool:\n        return (\n            isinstance(other, BaseScan)\n            and self.input_schema == other.input_schema\n            and self.output_schema == other.output_schema\n            and self.datasource == other.datasource\n        )\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\n            \"id\": self.datasource.id,\n            **logical_id_params,\n        }\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\"datasource\": self.datasource, **logical_op_params}\n\n        return logical_op_params\n\n\nclass ContextScan(LogicalOperator):\n    \"\"\"A ContextScan is a logical operator that loads the context for a particular root Dataset.\"\"\"\n\n    def __init__(self, context: context.Context, output_schema: type[BaseModel], *args, **kwargs):\n        super().__init__(*args, output_schema=output_schema, **kwargs)\n        self.context = context\n\n    def __str__(self):\n        return f\"ContextScan({self.context},{self.output_schema})\"\n\n    def __eq__(self, other) -> bool:\n        return (\n            
isinstance(other, ContextScan)\n            and self.context.id == other.context.id\n        )\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\n            \"id\": self.context.id,\n            **logical_id_params,\n        }\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\"context\": self.context, **logical_op_params}\n\n        return logical_op_params\n\n\nclass ConvertScan(LogicalOperator):\n    \"\"\"A ConvertScan is a logical operator that represents a scan of a particular input Dataset, with conversion applied.\"\"\"\n\n    def __init__(\n        self,\n        cardinality: Cardinality = Cardinality.ONE_TO_ONE,\n        udf: Callable | None = None,\n        desc: str | None = None,\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        self.cardinality = cardinality\n        self.udf = udf\n        self.desc = desc\n\n    def __str__(self):\n        return f\"ConvertScan({self.input_schema} -> {str(self.output_schema)})\"\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\n            \"cardinality\": self.cardinality,\n            \"udf\": self.udf,\n            \"desc\": self.desc,\n            **logical_id_params,\n        }\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\n            \"cardinality\": self.cardinality,\n            \"udf\": self.udf,\n            \"desc\": self.desc,\n            **logical_op_params,\n        }\n\n        return logical_op_params\n\n\nclass Distinct(LogicalOperator):\n    def __init__(self, distinct_cols: list[str] | None, *args, **kwargs):\n        
super().__init__(*args, **kwargs)\n        # if distinct_cols is not None, check that all columns are in the input schema\n        if distinct_cols is not None:\n            for col in distinct_cols:\n                assert col in self.input_schema.model_fields, f\"Column {col} not found in input schema {self.input_schema} for Distinct operator\"\n\n        # store the list of distinct columns, sorted\n        self.distinct_cols = (\n            sorted([field_name for field_name in self.input_schema.model_fields])\n            if distinct_cols is None\n            else sorted(distinct_cols)\n        )\n\n    def __str__(self):\n        return f\"Distinct({self.distinct_cols})\"\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\"distinct_cols\": self.distinct_cols, **logical_id_params}\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\n            \"distinct_cols\": self.distinct_cols,\n            **logical_op_params,\n        }\n\n        return logical_op_params\n\n\nclass FilteredScan(LogicalOperator):\n    \"\"\"A FilteredScan is a logical operator that represents a scan of a particular input Dataset, with filters applied.\"\"\"\n\n    def __init__(\n        self,\n        filter: Filter,\n        desc: str | None = None,\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        self.filter = filter\n        self.desc = desc\n\n    def __str__(self):\n        return f\"FilteredScan({str(self.output_schema)}, {str(self.filter)})\"\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\n            \"filter\": self.filter,\n            \"desc\": self.desc,\n            **logical_id_params,\n        }\n\n        return 
logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\n            \"filter\": self.filter,\n            \"desc\": self.desc,\n            **logical_op_params,\n        }\n\n        return logical_op_params\n\n\nclass GroupByAggregate(LogicalOperator):\n    def __init__(\n        self,\n        group_by_sig: GroupBySig,\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        if not self.input_schema:\n            raise ValueError(\"GroupByAggregate requires an input schema\")\n        (valid, error) = group_by_sig.validate_schema(self.input_schema)\n        if not valid:\n            raise TypeError(error)\n        self.group_by_sig = group_by_sig\n\n    def __str__(self):\n        return f\"GroupBy({self.group_by_sig.serialize()})\"\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\"group_by_sig\": self.group_by_sig, **logical_id_params}\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\n            \"group_by_sig\": self.group_by_sig,\n            **logical_op_params,\n        }\n\n        return logical_op_params\n\n\nclass JoinOp(LogicalOperator):\n    def __init__(self, condition: str, on: list[str] | None = None, how: str = \"inner\", desc: str | None = None, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.condition = condition\n        self.on = on\n        self.how = how\n        self.desc = desc\n\n    def __str__(self):\n        return f\"Join(condition={self.condition})\" if self.on is None else f\"Join(on={self.on}, how={self.how})\"\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\n   
         \"condition\": self.condition,\n            \"on\": self.on,\n            \"how\": self.how,\n            \"desc\": self.desc,\n            **logical_id_params,\n        }\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\n            \"condition\": self.condition,\n            \"on\": self.on,\n            \"how\": self.how,\n            \"desc\": self.desc,\n            **logical_op_params,\n        }\n\n        return logical_op_params\n\n\nclass LimitScan(LogicalOperator):\n    def __init__(self, limit: int, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.limit = limit\n\n    def __str__(self):\n        return f\"LimitScan({str(self.input_schema)}, {str(self.output_schema)})\"\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\"limit\": self.limit, **logical_id_params}\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\n            \"limit\": self.limit,\n            **logical_op_params,\n        }\n\n        return logical_op_params\n\n\nclass Project(LogicalOperator):\n    def __init__(self, project_cols: list[str], *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.project_cols = project_cols\n\n    def __str__(self):\n        return f\"Project({self.input_schema}, {self.project_cols})\"\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\"project_cols\": self.project_cols, **logical_id_params}\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\n            
\"project_cols\": self.project_cols,\n            **logical_op_params,\n        }\n\n        return logical_op_params\n\n\nclass TopKScan(LogicalOperator):\n    \"\"\"A TopKScan is a logical operator that represents a scan of a particular input Dataset, with a top-k operation applied.\"\"\"\n\n    def __init__(\n        self,\n        index,\n        search_func,\n        search_attr,\n        output_attrs,\n        k,\n        *args,\n        **kwargs,\n    ):\n        super().__init__(*args, **kwargs)\n        self.index = index\n        self.search_func = search_func\n        self.search_attr = search_attr\n        self.output_attrs = output_attrs\n        self.k = k\n\n    def __str__(self):\n        return f\"TopKScan({self.input_schema} -> {str(self.output_schema)})\"\n\n    def get_logical_id_params(self) -> dict:\n        # NOTE: if we allow optimization over index, then we will need to include it in the id params\n        # NOTE: ^if we do this, we should probably make a wrapper around the index object to ensure that\n        #       it can be serialized as a string properly\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\n            \"search_attr\": self.search_attr,\n            \"output_attrs\": self.output_attrs,\n            \"k\": self.k,\n            **logical_id_params,\n        }\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\n            \"index\": self.index,\n            \"search_func\": self.search_func,\n            \"search_attr\": self.search_attr,\n            \"output_attrs\": self.output_attrs,\n            \"k\": self.k,\n            **logical_op_params,\n        }\n\n        return logical_op_params\n\n\nclass ComputeOperator(LogicalOperator):\n    \"\"\"\n    A ComputeOperator is a logical operator that performs a computation described in natural language\n    on 
a given Context.\n    \"\"\"\n\n    def __init__(self, context_id: str, instruction: str, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.context_id = context_id\n        self.instruction = instruction\n\n    def __str__(self):\n        return f\"ComputeOperator(id={self.context_id}, instr={self.instruction:20s})\"\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\n            \"context_id\": self.context_id,\n            \"instruction\": self.instruction,\n            **logical_id_params,\n        }\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\n            \"context_id\": self.context_id,\n            \"instruction\": self.instruction,\n            **logical_op_params,\n        }\n\n        return logical_op_params\n\n\nclass SearchOperator(LogicalOperator):\n    \"\"\"\n    A SearchOperator is a logical operator that executes a search described in natural language\n    on a given Context.\n    \"\"\"\n\n    def __init__(self, context_id: str, search_query: str, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.context_id = context_id\n        self.search_query = search_query\n\n    def __str__(self):\n        return f\"SearchOperator(id={self.context_id}, search_query={self.search_query:20s})\"\n\n    def get_logical_id_params(self) -> dict:\n        logical_id_params = super().get_logical_id_params()\n        logical_id_params = {\n            \"context_id\": self.context_id,\n            \"search_query\": self.search_query,\n            **logical_id_params,\n        }\n\n        return logical_id_params\n\n    def get_logical_op_params(self) -> dict:\n        logical_op_params = super().get_logical_op_params()\n        logical_op_params = {\n            \"context_id\": self.context_id,\n       
     \"search_query\": self.search_query,\n            **logical_op_params,\n        }\n\n        return logical_op_params\n"
  },
  {
    "path": "src/palimpzest/query/operators/mixture_of_agents.py",
    "content": "from __future__ import annotations\n\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.constants import Cardinality, Model, PromptStrategy\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.models import GenerationStats, OperatorCostEstimates\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.operators.convert import LLMConvert\nfrom palimpzest.query.operators.filter import LLMFilter\n\n# TYPE DEFINITIONS\nFieldName = str\n\n\nclass MixtureOfAgentsConvert(LLMConvert):\n\n    def __init__(\n        self,\n        proposer_models: list[Model],\n        temperatures: list[float],\n        aggregator_model: Model,\n        *args,\n        **kwargs,\n    ):\n        kwargs[\"model\"] = None\n        kwargs[\"prompt_strategy\"] = None\n        super().__init__(*args, **kwargs)\n        sorted_proposers, sorted_temps = zip(*[(m, t) for m, t in sorted(zip(proposer_models, temperatures), key=lambda pair: pair[0])])\n        self.proposer_models = list(sorted_proposers)\n        self.temperatures = list(sorted_temps)\n        self.aggregator_model = aggregator_model\n\n        # create generators\n        self.proposer_generators = [\n            Generator(model, PromptStrategy.MAP_MOA_PROPOSER, self.reasoning_effort, self.cardinality, self.desc, self.verbose)\n            for model in proposer_models\n        ]\n        self.aggregator_generator = Generator(aggregator_model, PromptStrategy.MAP_MOA_AGG, self.reasoning_effort, self.cardinality, self.desc, self.verbose)\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Proposer Models: {self.proposer_models}\\n\"\n        op += f\"    Temperatures: {self.temperatures}\\n\"\n        op += f\"    Aggregator Model: {self.aggregator_model}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"proposer_models\": [model.value for model 
in self.proposer_models],\n            \"temperatures\": self.temperatures,\n            \"aggregator_model\": self.aggregator_model.value,\n            **id_params,\n        }\n\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"proposer_models\": self.proposer_models,\n            \"temperatures\": self.temperatures,\n            \"aggregator_model\": self.aggregator_model,\n            **op_params,\n        }\n\n        return op_params\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Currently, we are using multiple proposer models with different temperatures to synthesize\n        answers, which are then aggregated and summarized by a single aggregator model. Thus, we\n        roughly expect to incur the cost and time of an LLMConvert * (len(proposer_models) + 1).\n        In practice, this naive quality estimate will be overwritten by the CostModel's estimate\n        once it executes a few instances of the operator.\n        \"\"\"\n        # temporarily set self.model and self.prompt_strategy so that super().naive_cost_estimates(...) 
can compute an estimate\n        self.model = self.proposer_models[0]\n        self.prompt_strategy = PromptStrategy.MAP_MOA_PROPOSER\n\n        # get naive cost estimates for single LLM call and scale it by number of LLMs used in MoA\n        naive_op_cost_estimates = super().naive_cost_estimates(source_op_cost_estimates)\n        naive_op_cost_estimates.time_per_record *= (len(self.proposer_models) + 1)\n        naive_op_cost_estimates.time_per_record_lower_bound = naive_op_cost_estimates.time_per_record\n        naive_op_cost_estimates.time_per_record_upper_bound = naive_op_cost_estimates.time_per_record\n        naive_op_cost_estimates.cost_per_record *= (len(self.proposer_models) + 1)\n        naive_op_cost_estimates.cost_per_record_lower_bound = naive_op_cost_estimates.cost_per_record\n        naive_op_cost_estimates.cost_per_record_upper_bound = naive_op_cost_estimates.cost_per_record\n\n        # for naive setting, estimate quality as mean of all model qualities\n        model_qualities = [\n            model.get_overall_score() / 100.0\n            for model in self.proposer_models + [self.aggregator_model]\n        ]\n        naive_op_cost_estimates.quality = sum(model_qualities)/(len(self.proposer_models) + 1)\n        naive_op_cost_estimates.quality_lower_bound = naive_op_cost_estimates.quality\n        naive_op_cost_estimates.quality_upper_bound = naive_op_cost_estimates.quality\n\n        # reset self.model to be None\n        self.model = None\n        self.prompt_strategy = None\n\n        return naive_op_cost_estimates\n\n    def convert(self, candidate: DataRecord, fields: dict[str, FieldInfo]) -> tuple[dict[str, list], GenerationStats]:\n        # get input fields\n        input_fields = self.get_input_fields()\n\n        # execute generator models in sequence\n        proposer_model_final_answers, proposer_model_generation_stats = [], []\n        for proposer_generator, temperature in zip(self.proposer_generators, self.temperatures):\n           
 gen_kwargs = {\"project_cols\": input_fields, \"output_schema\": self.output_schema, \"temperature\": temperature}\n            _, reasoning, generation_stats, _ = proposer_generator(candidate, fields, json_output=False, **gen_kwargs)\n            proposer_text = f\"REASONING: {reasoning}\\n\"\n            proposer_model_final_answers.append(proposer_text)\n            proposer_model_generation_stats.append(generation_stats)\n\n        # call the aggregator\n        gen_kwargs = {\n            \"project_cols\": input_fields,\n            \"output_schema\": self.output_schema,\n            \"model_responses\": proposer_model_final_answers,\n        }\n        field_answers, _, aggregator_gen_stats, _ = self.aggregator_generator(candidate, fields, **gen_kwargs)\n\n        # compute the total generation stats\n        generation_stats = sum(proposer_model_generation_stats) + aggregator_gen_stats\n\n        return field_answers, generation_stats\n\n\nclass MixtureOfAgentsFilter(LLMFilter):\n\n    def __init__(\n        self,\n        proposer_models: list[Model],\n        temperatures: list[float],\n        aggregator_model: Model,\n        *args,\n        **kwargs,\n    ):\n        kwargs[\"model\"] = None\n        kwargs[\"prompt_strategy\"] = None\n        super().__init__(*args, **kwargs)\n        sorted_proposers, sorted_temps = zip(*[(m, t) for m, t in sorted(zip(proposer_models, temperatures), key=lambda pair: pair[0])])\n        self.proposer_models = list(sorted_proposers)\n        self.temperatures = list(sorted_temps)\n        self.aggregator_model = aggregator_model\n\n        # create generators; iterate over the sorted models so each generator lines up with self.temperatures\n        self.proposer_generators = [\n            Generator(model, PromptStrategy.FILTER_MOA_PROPOSER, self.reasoning_effort, Cardinality.ONE_TO_ONE, self.desc, self.verbose)\n            for model in self.proposer_models\n        ]\n        self.aggregator_generator = Generator(aggregator_model, PromptStrategy.FILTER_MOA_AGG, self.reasoning_effort, 
Cardinality.ONE_TO_ONE, self.desc, self.verbose)\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Proposer Models: {self.proposer_models}\\n\"\n        op += f\"    Temperatures: {self.temperatures}\\n\"\n        op += f\"    Aggregator Model: {self.aggregator_model}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"proposer_models\": [model.value for model in self.proposer_models],\n            \"temperatures\": self.temperatures,\n            \"aggregator_model\": self.aggregator_model.value,\n            **id_params,\n        }\n\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"proposer_models\": self.proposer_models,\n            \"temperatures\": self.temperatures,\n            \"aggregator_model\": self.aggregator_model,\n            **op_params,\n        }\n\n        return op_params\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Currently, we are using multiple proposer models with different temperatures to synthesize\n        answers, which are then aggregated and summarized by a single aggregator model. Thus, we\n        roughly expect to incur the cost and time of an LLMFilter * (len(proposer_models) + 1).\n        In practice, this naive quality estimate will be overwritten by the CostModel's estimate\n        once it executes a few instances of the operator.\n        \"\"\"\n        # temporarily set self.model so that super().naive_cost_estimates(...) 
can compute an estimate\n        self.model = self.proposer_models[0]\n\n        # get naive cost estimates for single LLM call and scale it by number of LLMs used in MoA\n        naive_op_cost_estimates = super().naive_cost_estimates(source_op_cost_estimates)\n        naive_op_cost_estimates.time_per_record *= (len(self.proposer_models) + 1)\n        naive_op_cost_estimates.time_per_record_lower_bound = naive_op_cost_estimates.time_per_record\n        naive_op_cost_estimates.time_per_record_upper_bound = naive_op_cost_estimates.time_per_record\n        naive_op_cost_estimates.cost_per_record *= (len(self.proposer_models) + 1)\n        naive_op_cost_estimates.cost_per_record_lower_bound = naive_op_cost_estimates.cost_per_record\n        naive_op_cost_estimates.cost_per_record_upper_bound = naive_op_cost_estimates.cost_per_record\n\n        # for naive setting, estimate quality as mean of all model qualities\n        model_qualities = [\n            model.get_overall_score() / 100.0\n            for model in self.proposer_models + [self.aggregator_model]\n        ]\n        naive_op_cost_estimates.quality = sum(model_qualities)/(len(self.proposer_models) + 1)\n        naive_op_cost_estimates.quality_lower_bound = naive_op_cost_estimates.quality\n        naive_op_cost_estimates.quality_upper_bound = naive_op_cost_estimates.quality\n\n        # reset self.model to be None\n        self.model = None\n\n        return naive_op_cost_estimates\n\n    def filter(self, candidate: DataRecord) -> tuple[dict[str, bool], GenerationStats]:\n        # get input fields\n        input_fields = self.get_input_fields()\n\n        # construct output fields\n        fields = {\"passed_operator\": FieldInfo(annotation=bool, description=\"Whether the record passed the filter operation\")}\n\n        # execute generator models in sequence\n        proposer_model_final_answers, proposer_model_generation_stats = [], []\n        for proposer_generator, temperature in 
zip(self.proposer_generators, self.temperatures):\n            gen_kwargs = {\"project_cols\": input_fields, \"filter_condition\": self.filter_obj.filter_condition, \"temperature\": temperature}\n            _, reasoning, generation_stats, _ = proposer_generator(candidate, fields, json_output=False, **gen_kwargs)\n            proposer_text = f\"REASONING: {reasoning}\\n\"\n            proposer_model_final_answers.append(proposer_text)\n            proposer_model_generation_stats.append(generation_stats)\n\n        # call the aggregator\n        gen_kwargs = {\n            \"project_cols\": input_fields,\n            \"filter_condition\": self.filter_obj.filter_condition,\n            \"model_responses\": proposer_model_final_answers,\n        }\n        field_answers, _, aggregator_gen_stats, _ = self.aggregator_generator(candidate, fields, **gen_kwargs)\n\n        # compute the total generation stats\n        generation_stats = sum(proposer_model_generation_stats) + aggregator_gen_stats\n\n        return field_answers, generation_stats\n"
  },
  {
    "path": "src/palimpzest/query/operators/physical.py",
    "content": "from __future__ import annotations\n\nimport json\n\nfrom pydantic import BaseModel\n\nfrom palimpzest.constants import Modality\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.lib.schemas import AUDIO_FIELD_TYPES, IMAGE_FIELD_TYPES\nfrom palimpzest.core.models import OperatorCostEstimates\nfrom palimpzest.utils.hash_helpers import hash_for_id\n\n\nclass PhysicalOperator:\n    \"\"\"\n    All implemented physical operators should inherit from this class.\n    In order for the Optimizer to consider using a physical operator for a\n    given logical operation, the user must also write an ImplementationRule.\n    \"\"\"\n\n    def __init__(\n        self,\n        output_schema: type[BaseModel],\n        input_schema: type[BaseModel] | None = None,\n        depends_on: list[str] | None = None,\n        logical_op_id: str | None = None,\n        unique_logical_op_id: str | None = None,\n        logical_op_name: str | None = None,\n        verbose: bool = False,\n        *args,\n        **kwargs,\n    ) -> None:\n        self.output_schema = output_schema\n        self.input_schema = input_schema\n        self.depends_on = depends_on if depends_on is None else sorted(depends_on)\n        self.logical_op_id = logical_op_id\n        self.unique_logical_op_id = unique_logical_op_id\n        self.logical_op_name = logical_op_name\n        self.verbose = verbose\n        self.op_id = None\n\n        # compute the input modalities (if any) for this physical operator\n        depends_on_short_field_names = [field.split(\".\")[-1] for field in self.depends_on] if self.depends_on is not None else None\n        self.input_modalities = None\n        if self.input_schema is not None:\n            self.input_modalities = set()\n            for field_name, field in self.input_schema.model_fields.items():\n                if self.depends_on is not None and field_name not in depends_on_short_field_names:\n                    
continue\n                field_type = field.annotation\n                if field_type in IMAGE_FIELD_TYPES:\n                    self.input_modalities.add(Modality.IMAGE)\n                elif field_type in AUDIO_FIELD_TYPES:\n                    self.input_modalities.add(Modality.AUDIO)\n                else:\n                    self.input_modalities.add(Modality.TEXT)\n\n        # compute the fields generated by this physical operator\n        input_field_names = list(self.input_schema.model_fields) if self.input_schema is not None else []\n        self.generated_fields = sorted([\n            field_name\n            for field_name in self.output_schema.model_fields\n            if field_name not in input_field_names\n        ])\n\n        # sets __hash__() for each child Operator to be the base class' __hash__() method;\n        # by default, if a subclass defines __eq__() but not __hash__() Python will set that\n        # class' __hash__ to None\n        self.__class__.__hash__ = PhysicalOperator.__hash__\n\n    def __str__(self):\n        op = f\"{self.input_schema.__name__} -> {self.op_name()} -> {self.output_schema.__name__}\\n\"\n        op += f\"    ({', '.join(sorted(self.input_schema.model_fields))[:30]}) \"\n        op += f\"-> ({', '.join(sorted(self.output_schema.model_fields))[:30]})\\n\"\n        if getattr(self, \"model\", None):\n            op += f\"    Model: {self.model}\\n\"\n        return op\n\n    # def __eq__(self, other) -> bool:\n    #     all_op_params_match = all(value == getattr(other, key) for key, value in self.get_op_params().items())\n    #     return isinstance(other, self.__class__) and all_op_params_match\n    def __eq__(self, other) -> bool:\n        return isinstance(other, self.__class__) and self.get_full_op_id() == other.get_full_op_id()\n\n    def copy(self) -> PhysicalOperator:\n        return self.__class__(**self.get_op_params())\n\n    def op_name(self) -> str:\n        \"\"\"Name of the physical operator.\"\"\"\n   
     return str(self.__class__.__name__)\n\n    def get_id_params(self) -> dict:\n        \"\"\"\n        Returns a dictionary mapping of physical operator parameters which are relevant\n        for computing the physical operator id.\n\n        NOTE: Should be overridden by subclasses to include class-specific parameters.\n        NOTE: input_schema and output_schema are not included in the id params by default,\n              because they may depend on the order of operations chosen by the Optimizer.\n              This is particularly true for convert operations, where the output schema\n              is now the union of the input and output schemas of the logical operator.\n        \"\"\"\n        # return {\"generated_fields\": self.generated_fields}\n        return {}\n\n    def get_op_params(self) -> dict:\n        \"\"\"\n        Returns a dictionary mapping of physical operator parameters which may be used to\n        create a copy of this physical operation.\n\n        NOTE: Should be overridden by subclasses to include class-specific parameters.\n        \"\"\"\n        return {\n            \"output_schema\": self.output_schema,\n            \"input_schema\": self.input_schema,\n            \"depends_on\": self.depends_on,\n            \"logical_op_id\": self.logical_op_id,\n            \"unique_logical_op_id\": self.unique_logical_op_id,\n            \"logical_op_name\": self.logical_op_name,\n            \"verbose\": self.verbose,\n        }\n\n    def get_op_id(self):\n        \"\"\"\n        NOTE: We do not call this in the __init__() method as subclasses may set parameters\n              returned by self.get_id_params() after their call to super().__init__().\n\n        NOTE: This is NOT a universal ID.\n\n        Two different PhysicalOperator instances with identical returned values\n        from the call to self.get_id_params() will have equivalent op_ids.\n        \"\"\"\n        # return self.op_id if we've computed it before\n        if 
self.op_id is not None:\n            return self.op_id\n\n        # get op name and op parameters which are relevant for computing the id\n        op_name = self.op_name()\n        id_params = self.get_id_params()\n        id_params = {k: str(v) for k, v in id_params.items()}\n\n        # compute, set, and return the op_id\n        hash_str = json.dumps({\"op_name\": op_name, **id_params}, sort_keys=True)\n        self.op_id = hash_for_id(hash_str)\n\n        return self.op_id\n\n    def get_logical_op_id(self) -> str:\n        return self.logical_op_id\n\n    def get_unique_logical_op_id(self) -> str:\n        return self.unique_logical_op_id\n\n    def get_full_op_id(self):\n        return f\"{self.get_logical_op_id()}-{self.get_op_id()}\"\n\n    def is_image_op(self) -> bool:\n        \"\"\"Returns True if this physical operator is designed to handle image data.\"\"\"\n        return self.input_modalities is not None and Modality.IMAGE in self.input_modalities\n\n    def is_audio_op(self) -> bool:\n        \"\"\"Returns True if this physical operator is designed to handle audio data.\"\"\"\n        return self.input_modalities is not None and Modality.AUDIO in self.input_modalities\n\n    def __hash__(self):\n        # use get_op_id() so that the id is computed on first use; self.op_id is None until then\n        return int(self.get_op_id(), 16)  # NOTE: should we use self.get_full_op_id() instead?\n\n    def get_model_name(self) -> str | None:\n        \"\"\"Returns the name of the model used by the physical operator (if it sets self.model). 
Otherwise, it returns None.\"\"\"\n        return None\n\n    def get_input_fields(self):\n        \"\"\"Returns the list of input fields needed to execute a physical operator.\"\"\"\n        depends_on_fields = (\n            [field.split(\".\")[-1] for field in self.depends_on]\n            if self.depends_on is not None and len(self.depends_on) > 0\n            else None\n        )\n        input_fields = (\n            list(self.input_schema.model_fields)\n            if depends_on_fields is None\n            else [field for field in self.input_schema.model_fields if field in depends_on_fields]\n        )\n\n        return input_fields\n\n    def get_fields_to_generate(self, candidate: DataRecord) -> list[str]:\n        \"\"\"\n        Returns the list of field names that this operator needs to generate for the given candidate.\n        This function returns only the fields in self.generated_fields which are not already present\n        in the candidate. This is important for operators with retry logic, where we may only need to\n        recompute a subset of self.generated_fields.\n\n        Right now this is only used by convert and top-k operators.\n        \"\"\"\n        fields_to_generate = [\n            field_name\n            for field_name in self.generated_fields\n            if getattr(candidate, field_name, None) is None\n        ]\n\n        return fields_to_generate\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        This function returns a naive estimate of this operator's:\n        - cardinality\n        - time_per_record\n        - cost_per_record\n        - quality\n\n        The function takes an argument which contains the OperatorCostEstimates\n        of the physical operator whose output is the input to this operator.\n\n        These estimates will be 
used by the CostModel\n        when PZ does not have sample execution data -- and they will be necessary\n        in some cases even when sample execution data is present. (For example,\n        the cardinality of each operator cannot be estimated based on sample\n        execution data alone -- thus ScanPhysicalOps need to give at least ballpark\n        correct estimates of this quantity).\n        \"\"\"\n        raise NotImplementedError(\"CostEstimates from abstract method\")\n\n    def __call__(self, candidate: DataRecord) -> DataRecordSet:\n        raise NotImplementedError(\"Calling __call__ from abstract method\")\n"
  },
  {
    "path": "src/palimpzest/query/operators/project.py",
    "content": "from __future__ import annotations\n\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import OperatorCostEstimates, RecordOpStats\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\n\nclass ProjectOp(PhysicalOperator):\n    def __init__(self, project_cols: list[str], *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.project_cols = sorted(project_cols)\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Project Columns: {self.project_cols}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\"project_cols\": self.project_cols, **id_params}\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"project_cols\": self.project_cols, **op_params}\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        # for now, assume applying the limit takes negligible additional time (and no cost in USD)\n        return OperatorCostEstimates(\n            cardinality=source_op_cost_estimates.cardinality,\n            time_per_record=0,\n            cost_per_record=0,\n            quality=1.0,\n        )\n\n    def __call__(self, candidate: DataRecord) -> DataRecordSet:\n        # create new DataRecord with projection applied\n        dr = DataRecord.from_parent(schema=candidate.schema, data_item={}, parent_record=candidate, project_cols=self.project_cols)\n\n        # create RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=dr._id,\n            record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            
time_per_record=0.0,\n            cost_per_record=0.0,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n"
  },
  {
    "path": "src/palimpzest/query/operators/rag.py",
    "content": "from __future__ import annotations\n\nimport time\nfrom typing import Any\n\nfrom litellm import embedding as litellm_embedding\nfrom numpy import dot\nfrom numpy.linalg import norm\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.constants import NAIVE_EST_NUM_OUTPUT_TOKENS, Model\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.models import GenerationStats, OperatorCostEstimates\nfrom palimpzest.query.operators.convert import LLMConvert\nfrom palimpzest.query.operators.filter import LLMFilter\n\n\nclass RAGConvert(LLMConvert):\n    def __init__(self, embedding_model: Model, num_chunks_per_field: int, chunk_size: int = 1000, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.embedding_model = embedding_model\n        self.num_chunks_per_field = num_chunks_per_field\n        self.chunk_size = chunk_size\n\n        # crude adjustment factor for naive estimation in unoptimized setting\n        self.naive_quality_adjustment = 0.6\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Embedding Model: {self.embedding_model.value}\\n\"\n        op += f\"    Number of Chunks: {str(self.num_chunks_per_field)}\\n\"\n        op += f\"    Chunk Size: {str(self.chunk_size)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"embedding_model\": self.embedding_model.value,\n            \"num_chunks_per_field\": self.num_chunks_per_field,\n            \"chunk_size\": self.chunk_size,\n            **id_params,\n        }\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"embedding_model\": self.embedding_model,\n            \"num_chunks_per_field\": self.num_chunks_per_field,\n            \"chunk_size\": self.chunk_size,\n            **op_params,\n        }\n        return op_params\n\n    def 
naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Update the cost per record and quality estimates produced by LLMConvert's naive estimates.\n        We adjust the cost per record to account for the reduced number of input tokens following\n        the retrieval of relevant chunks, and we make a crude estimate of the quality degradation\n        that results from using a downsized input (although this may in fact improve quality in\n        some cases).\n        \"\"\"\n        # get naive cost estimates from LLMConvert\n        naive_op_cost_estimates = super().naive_cost_estimates(source_op_cost_estimates)\n\n        # re-compute cost per record assuming we use fewer input tokens; naively assume a single input field\n        est_num_input_tokens = self.num_chunks_per_field * self.chunk_size\n        est_num_output_tokens = NAIVE_EST_NUM_OUTPUT_TOKENS\n        model_conversion_usd_per_record = (\n            self.model.get_usd_per_input_token() * est_num_input_tokens\n            + self.model.get_usd_per_output_token() * est_num_output_tokens\n        )\n\n        # set refined estimate of cost per record\n        naive_op_cost_estimates.cost_per_record = model_conversion_usd_per_record\n        naive_op_cost_estimates.cost_per_record_lower_bound = naive_op_cost_estimates.cost_per_record\n        naive_op_cost_estimates.cost_per_record_upper_bound = naive_op_cost_estimates.cost_per_record\n        naive_op_cost_estimates.quality = (naive_op_cost_estimates.quality) * self.naive_quality_adjustment\n        naive_op_cost_estimates.quality_lower_bound = naive_op_cost_estimates.quality\n        naive_op_cost_estimates.quality_upper_bound = naive_op_cost_estimates.quality\n\n        return naive_op_cost_estimates\n\n    def chunk_text(self, text: str, chunk_size: int) -> list[str]:\n        \"\"\"\n        Given a text string, chunk it into substrings of length chunk_size.\n        \"\"\"\n     
   chunks = []\n        idx = 0\n        while idx + chunk_size < len(text):\n            chunks.append(text[idx : idx + chunk_size])\n            idx += chunk_size\n        \n        if idx < len(text):\n            chunks.append(text[idx:])\n\n        return chunks\n\n    def compute_embedding(self, text: str) -> tuple[list[float], GenerationStats]:\n        \"\"\"\n        Compute the embedding for a text string. Return the embedding and the GenerationStats object\n        that captures the cost of the operation.\n        \"\"\"\n        # get the embedding model name\n        model_name = self.embedding_model.value\n\n        # compute the embedding\n        start_time = time.time()\n        response = litellm_embedding(model=model_name, input=text)\n        total_time = time.time() - start_time\n\n        # extract the embedding\n        embedding = response.data[0]['embedding']\n\n        # compute the generation stats object\n        total_embedding_input_tokens = response.usage.total_tokens if response.usage is not None else 0\n        total_embedding_cost = self.embedding_model.get_usd_per_input_token() * total_embedding_input_tokens\n        embed_stats = GenerationStats(\n            model_name=model_name,  # NOTE: this should be overwritten by generation model in convert()\n            embedding_input_tokens=total_embedding_input_tokens,\n            cost_per_record=total_embedding_cost,\n            llm_call_duration_secs=total_time,\n            total_llm_calls=1,\n            total_embedding_llm_calls=1,\n        )\n\n        return embedding, embed_stats\n\n    def compute_similarity(self, query_embedding: list[float], chunk_embedding: list[float]) -> float:\n        \"\"\"\n        Compute the similarity between the query and chunk embeddings.\n        \"\"\"\n        return dot(query_embedding, chunk_embedding) / (norm(query_embedding) * norm(chunk_embedding))\n\n    def get_chunked_candidate(self, candidate: DataRecord, input_fields: list[str], 
output_fields: list[str]) -> tuple[DataRecord, GenerationStats]:\n        \"\"\"\n        For each text field, chunk the content and compute the chunk embeddings. Then select the top-k chunks\n        for each field. If a field is smaller than the chunk size, simply include the full field.\n        \"\"\"\n        # initialize stats for embedding costs\n        embed_stats = GenerationStats()\n\n        # compute embedding for output fields\n        output_fields_desc = \"\"\n        for field_name in output_fields:\n            desc = self.output_schema.model_fields[field_name].description\n            output_fields_desc += f\"- {field_name}: {'no description available' if desc is None else desc}\\n\"\n        query_embedding, query_embed_stats = self.compute_embedding(output_fields_desc)\n\n        # add cost of embedding the query to embed_stats\n        embed_stats += query_embed_stats\n\n        # for each input field, chunk its content and compute the (per-chunk) embeddings\n        for field_name in input_fields:\n            field = candidate.get_field_type(field_name)\n\n            # skip this field if it is not a string or a list of strings\n            is_string_field = field.annotation in [str, str | None, str | Any]\n            is_list_string_field = field.annotation in [list[str], list[str] | None, list[str] | Any]\n            if not (is_string_field or is_list_string_field) or candidate[field_name] is None:\n                continue\n\n            # if this is a list of strings, join the strings\n            if is_list_string_field:\n                candidate[field_name] = \"[\" + \", \".join(candidate[field_name]) + \"]\"\n\n            # skip this field if it is a string field and its length is less than the chunk size\n            if len(candidate[field_name]) < self.chunk_size:\n                continue\n\n            # chunk the content\n            chunks = self.chunk_text(candidate[field_name], self.chunk_size)\n\n            # compute 
embeddings for each chunk\n            chunk_embeddings, chunk_embed_stats_lst = zip(*[self.compute_embedding(chunk) for chunk in chunks])\n\n            # add cost of embedding each chunk to embed_stats\n            for chunk_embed_stats in chunk_embed_stats_lst:\n                embed_stats += chunk_embed_stats\n\n            # select the top-k chunks\n            sorted_chunks = sorted(\n                zip(range(len(chunks)), chunks, chunk_embeddings),\n                key=lambda tup: self.compute_similarity(query_embedding, tup[2]),\n                reverse=True,\n            )\n            top_k_chunks = [(chunk_idx, chunk) for chunk_idx, chunk, _ in sorted_chunks[:self.num_chunks_per_field]]\n\n            # sort the top-k chunks by their original index in the content, and join them with ellipses\n            top_k_chunks = [chunk for _, chunk in sorted(top_k_chunks, key=lambda tup: tup[0])]\n            candidate[field_name] = \"...\".join(top_k_chunks)\n\n        return candidate, embed_stats\n\n    def convert(self, candidate: DataRecord, fields: dict[str, FieldInfo]) -> tuple[dict[str, list], GenerationStats]:\n        # get the set of input fields to use for the convert operation\n        input_fields = self.get_input_fields()\n        output_fields = list(fields.keys())\n\n        # lookup most relevant chunks for each field using embedding search\n        candidate_copy = candidate.copy()\n        candidate_copy, embed_stats = self.get_chunked_candidate(candidate_copy, input_fields, output_fields)\n\n        # construct kwargs for generation\n        gen_kwargs = {\"project_cols\": input_fields, \"output_schema\": self.output_schema}\n\n        # generate outputs for all fields in a single query\n        field_answers, _, generation_stats, _ = self.generator(candidate_copy, fields, **gen_kwargs)\n\n        # NOTE: summing embedding stats with generation stats is messy because it will lead to misleading\n        #       measurements of 
total_input_tokens and total_output_tokens. We should fix this in the future.\n        #       The good news: as long as we compute the cost_per_record of each GenerationStats object correctly,\n        #       then the total cost of the operation will be correct (which will roll-up to correctly computing\n        #       the total cost of the operator, plan, and execution).\n        #\n        # combine stats from embedding with stats for generation\n        generation_stats += embed_stats\n\n        # if there was an error for any field, execute a conventional query on that field\n        for field_name, answers in field_answers.items():\n            if answers is None:\n                single_field_answers, _, single_field_stats, _ = self.generator(candidate_copy, {field_name: fields[field_name]}, **gen_kwargs)\n                field_answers.update(single_field_answers)\n                generation_stats += single_field_stats\n\n        return field_answers, generation_stats\n\n\nclass RAGFilter(LLMFilter):\n    def __init__(self, embedding_model: Model, num_chunks_per_field: int, chunk_size: int = 1000, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.embedding_model = embedding_model\n        self.num_chunks_per_field = num_chunks_per_field\n        self.chunk_size = chunk_size\n\n        # crude adjustment factor for naive estimation in no-sentinel setting\n        self.naive_quality_adjustment = 0.6\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Embedding Model: {self.embedding_model.value}\\n\"\n        op += f\"    Number of Chunks: {str(self.num_chunks_per_field)}\\n\"\n        op += f\"    Chunk Size: {str(self.chunk_size)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"embedding_model\": self.embedding_model.value,\n            \"num_chunks_per_field\": self.num_chunks_per_field,\n            
\"chunk_size\": self.chunk_size,\n            **id_params,\n        }\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"embedding_model\": self.embedding_model,\n            \"num_chunks_per_field\": self.num_chunks_per_field,\n            \"chunk_size\": self.chunk_size,\n            **op_params,\n        }\n        return op_params\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Update the cost per record and quality estimates produced by LLMFilter's naive estimates.\n        We adjust the cost per record to account for the reduced number of input tokens following\n        the retrieval of relevant chunks, and we make a crude estimate of the quality degradation\n        that results from using a downsized input (although this may in fact improve quality in\n        some cases).\n        \"\"\"\n        # get naive cost estimates from LLMFilter\n        naive_op_cost_estimates = super().naive_cost_estimates(source_op_cost_estimates)\n\n        # re-compute cost per record assuming we use fewer input tokens; naively assume a single input field\n        est_num_input_tokens = self.num_chunks_per_field * self.chunk_size\n        est_num_output_tokens = NAIVE_EST_NUM_OUTPUT_TOKENS\n        model_conversion_usd_per_record = (\n            self.model.get_usd_per_input_token() * est_num_input_tokens\n            + self.model.get_usd_per_output_token() * est_num_output_tokens\n        )\n\n        # set refined estimate of cost per record\n        naive_op_cost_estimates.cost_per_record = model_conversion_usd_per_record\n        naive_op_cost_estimates.cost_per_record_lower_bound = naive_op_cost_estimates.cost_per_record\n        naive_op_cost_estimates.cost_per_record_upper_bound = naive_op_cost_estimates.cost_per_record\n        naive_op_cost_estimates.quality = 
(naive_op_cost_estimates.quality) * self.naive_quality_adjustment\n        naive_op_cost_estimates.quality_lower_bound = naive_op_cost_estimates.quality\n        naive_op_cost_estimates.quality_upper_bound = naive_op_cost_estimates.quality\n\n        return naive_op_cost_estimates\n\n    def chunk_text(self, text: str, chunk_size: int) -> list[str]:\n        \"\"\"\n        Given a text string, chunk it into substrings of length chunk_size.\n        \"\"\"\n        chunks = []\n        idx = 0\n        while idx + chunk_size < len(text):\n            chunks.append(text[idx : idx + chunk_size])\n            idx += chunk_size\n        \n        if idx < len(text):\n            chunks.append(text[idx:])\n\n        return chunks\n\n    def compute_embedding(self, text: str) -> tuple[list[float], GenerationStats]:\n        \"\"\"\n        Compute the embedding for a text string. Return the embedding and the GenerationStats object\n        that captures the cost of the operation.\n        \"\"\"\n        # get the embedding model name\n        model_name = self.embedding_model.value\n\n        # compute the embedding\n        start_time = time.time()\n        response = litellm_embedding(model=model_name, input=text)\n        total_time = time.time() - start_time\n\n        # extract the embedding\n        embedding = response.data[0]['embedding']\n\n        # compute the generation stats object\n        total_embedding_input_tokens = response.usage.total_tokens if response.usage is not None else 0\n        total_embedding_cost = self.embedding_model.get_usd_per_input_token() * total_embedding_input_tokens\n        embed_stats = GenerationStats(\n            model_name=model_name,  # NOTE: this should be overwritten by generation model in filter()\n            embedding_input_tokens=total_embedding_input_tokens,\n            cost_per_record=total_embedding_cost,\n            llm_call_duration_secs=total_time,\n            total_llm_calls=1,\n            
total_embedding_llm_calls=1,\n        )\n\n        return embedding, embed_stats\n\n    def compute_similarity(self, query_embedding: list[float], chunk_embedding: list[float]) -> float:\n        \"\"\"\n        Compute the similarity between the query and chunk embeddings.\n        \"\"\"\n        return dot(query_embedding, chunk_embedding) / (norm(query_embedding) * norm(chunk_embedding))\n\n    def get_chunked_candidate(self, candidate: DataRecord, input_fields: list[str]) -> tuple[DataRecord, GenerationStats]:\n        \"\"\"\n        For each text field, chunk the content and compute the chunk embeddings. Then select the top-k chunks\n        for each field. If a field is smaller than the chunk size, simply include the full field.\n        \"\"\"\n        # initialize stats for embedding costs\n        embed_stats = GenerationStats()\n\n        # compute embedding for filter condition\n        query_embedding, query_embed_stats = self.compute_embedding(self.filter_obj.filter_condition)\n\n        # add cost of embedding the query to embed_stats\n        embed_stats += query_embed_stats\n\n        # for each input field, chunk its content and compute the (per-chunk) embeddings\n        for field_name in input_fields:\n            field = candidate.get_field_type(field_name)\n\n            # skip this field if it is not a string or a list of strings\n            is_string_field = field.annotation in [str, str | None, str | Any]\n            is_list_string_field = field.annotation in [list[str], list[str] | None, list[str] | Any]\n            if not (is_string_field or is_list_string_field):\n                continue\n\n            # if this is a list of strings, join the strings\n            if is_list_string_field:\n                candidate[field_name] = \"[\" + \", \".join(candidate[field_name]) + \"]\"\n\n            # skip this field if it is a string field and its length is less than the chunk size\n            if len(candidate[field_name]) < 
self.chunk_size:\n                continue\n\n            # chunk the content\n            chunks = self.chunk_text(candidate[field_name], self.chunk_size)\n\n            # compute embeddings for each chunk\n            chunk_embeddings, chunk_embed_stats_lst = zip(*[self.compute_embedding(chunk) for chunk in chunks])\n\n            # add cost of embedding each chunk to embed_stats\n            for chunk_embed_stats in chunk_embed_stats_lst:\n                embed_stats += chunk_embed_stats\n\n            # select the top-k chunks\n            sorted_chunks = sorted(\n                zip(range(len(chunks)), chunks, chunk_embeddings),\n                key=lambda tup: self.compute_similarity(query_embedding, tup[2]),\n                reverse=True,\n            )\n            top_k_chunks = [(chunk_idx, chunk) for chunk_idx, chunk, _ in sorted_chunks[:self.num_chunks_per_field]]\n\n            # sort the top-k chunks by their original index in the content, and join them with ellipses\n            top_k_chunks = [chunk for _, chunk in sorted(top_k_chunks, key=lambda tup: tup[0])]\n            candidate[field_name] = \"...\".join(top_k_chunks)\n\n        return candidate, embed_stats\n\n    def filter(self, candidate: DataRecord) -> tuple[dict[str, bool], GenerationStats]:\n        # get the set of input fields to use for the filter operation\n        input_fields = self.get_input_fields()\n\n        # construct output fields\n        fields = {\"passed_operator\": FieldInfo(annotation=bool, description=\"Whether the record passed the filter operation\")}\n\n        # lookup most relevant chunks for each field using embedding search\n        candidate_copy = candidate.copy()\n        candidate_copy, embed_stats = self.get_chunked_candidate(candidate_copy, input_fields)\n\n        # construct kwargs for generation\n        gen_kwargs = {\"project_cols\": input_fields, \"filter_condition\": self.filter_obj.filter_condition}\n\n        # generate outputs for all fields in 
a single query\n        field_answers, _, generation_stats, _ = self.generator(candidate_copy, fields, **gen_kwargs)\n\n        # NOTE: summing embedding stats with generation stats is messy because it will lead to misleading\n        #       measurements of total_input_tokens and total_output_tokens. We should fix this in the future.\n        #       The good news: as long as we compute the cost_per_record of each GenerationStats object correctly,\n        #       then the total cost of the operation will be correct (which will roll-up to correctly computing\n        #       the total cost of the operator, plan, and execution).\n        #\n        # combine stats from embedding with stats for generation\n        generation_stats += embed_stats\n\n        return field_answers, generation_stats\n"
  },
  {
    "path": "src/palimpzest/query/operators/scan.py",
    "content": "from __future__ import annotations\n\nimport time\nfrom abc import ABC, abstractmethod\nfrom typing import Any\n\nfrom palimpzest.constants import LOCAL_SCAN_TIME_PER_KB\nfrom palimpzest.core.data import context\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import OperatorCostEstimates, RecordOpStats\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\n\nclass ScanPhysicalOp(PhysicalOperator, ABC):\n    \"\"\"\n    Physical operators which implement root Datasets require slightly more information\n    in order to accurately compute naive cost estimates. Thus, we use a slightly\n    modified abstract base class for these operators.\n    \"\"\"\n    # datasource: IterDataset\n    def __init__(self, datasource: Any, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.datasource = datasource\n\n    def __str__(self):\n        op = f\"{self.op_name()}({self.datasource}) -> {self.output_schema}\\n\"\n        op += f\"    ({', '.join(list(self.output_schema.model_fields))[:30]})\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\"datasource_id\": self.datasource.id, **id_params}\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"datasource\": self.datasource, **op_params}\n\n    @abstractmethod\n    def naive_cost_estimates(\n        self,\n        source_op_cost_estimates: OperatorCostEstimates,\n        input_record_size_in_bytes: int | float,\n    ) -> OperatorCostEstimates:\n        \"\"\"\n        This function returns a naive estimate of this operator's:\n        - cardinality\n        - time_per_record\n        - cost_per_record\n        - quality\n\n        For the implemented operator. 
These will be used by the CostModel\n        when PZ does not have sample execution data -- and it will be necessary\n        in some cases even when sample execution data is present. (For example,\n        the cardinality of each operator cannot be estimated based on sample\n        execution data alone -- thus ScanPhysicalOps need to give\n        at least ballpark correct estimates of this quantity).\n        \"\"\"\n        pass\n\n    def __call__(self, idx: int) -> DataRecordSet:\n        \"\"\"\n        This function invokes `self.datasource.__getitem__` on the given `idx` to retrieve the next data item.\n        It then returns this item as a DataRecord wrapped in a DataRecordSet.\n        \"\"\"\n        start_time = time.time()\n        item = self.datasource[idx]\n        end_time = time.time()\n\n        # check that item covers fields in output schema\n        output_field_names = list(self.output_schema.model_fields)\n        assert all([field in item for field in output_field_names]), f\"Some fields in Dataset schema not present in item!\\n - Dataset fields: {output_field_names}\\n - Item fields: {list(item.keys())}\"\n\n        # construct a DataRecord from the item\n        data_item = self.output_schema(**{field: item[field] for field in output_field_names})\n        dr = DataRecord(data_item, source_indices=[f\"{self.datasource.id}-{idx}\"])\n\n        # create RecordOpStats objects\n        record_op_stats = RecordOpStats(\n            record_id=dr._id,\n            record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=(end_time - start_time),\n            cost_per_record=0.0,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n \n        # construct and 
return DataRecordSet object\n        return DataRecordSet([dr], [record_op_stats])\n\n\nclass MarshalAndScanDataOp(ScanPhysicalOp):\n    def naive_cost_estimates(\n        self,\n        source_op_cost_estimates: OperatorCostEstimates,\n        input_record_size_in_bytes: int | float,\n    ) -> OperatorCostEstimates:\n        # get inputs needed for naive cost estimation\n        # TODO: we should rename cardinality --> \"multiplier\" or \"selectivity\" one-to-one / one-to-many\n\n        # estimate time spent reading each record\n        per_record_size_kb = input_record_size_in_bytes / 1024.0\n\n        # TODO: cannot do the first computation b/c we cannot import iter_dataset; possibly revisit\n        # time_per_record = (\n        #     MEMORY_SCAN_TIME_PER_KB * per_record_size_kb\n        #     if isinstance(self.datasource, (iter_dataset.MemoryDataset))\n        #     else LOCAL_SCAN_TIME_PER_KB * per_record_size_kb\n        # )\n        time_per_record = LOCAL_SCAN_TIME_PER_KB * per_record_size_kb\n\n        # estimate output cardinality\n        cardinality = source_op_cost_estimates.cardinality\n\n        # for now, assume no cost per record for reading data\n        return OperatorCostEstimates(\n            cardinality=cardinality,\n            time_per_record=time_per_record,\n            cost_per_record=0,\n            quality=1.0,\n        )\n\n\nclass ContextScanOp(PhysicalOperator):\n    \"\"\"\n    Physical operator which facilitates the loading of a Context for processing.\n    \"\"\"\n\n    def __init__(self, context: context.Context, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.context = context\n\n    def __str__(self):\n        op = f\"{self.op_name()}({self.context}) -> {self.output_schema}\\n\"\n        op += f\"    ({', '.join(list(self.output_schema.model_fields))[:30]})\\n\"\n        return op\n\n    def get_id_params(self):\n        return super().get_id_params()\n\n    def get_op_params(self):\n        
op_params = super().get_op_params()\n        return {\"context\": self.context, **op_params}\n\n    def naive_cost_estimates(\n        self,\n        source_op_cost_estimates: OperatorCostEstimates,\n    ):\n        # get inputs needed for naive cost estimation\n        # TODO: we should rename cardinality --> \"multiplier\" or \"selectivity\" one-to-one / one-to-many\n\n        # estimate time spent reading each record\n        time_per_record = LOCAL_SCAN_TIME_PER_KB * 1.0\n\n        # for now, assume no cost per record for reading data\n        return OperatorCostEstimates(\n            cardinality=1.0,\n            time_per_record=time_per_record,\n            cost_per_record=0,\n            quality=1.0,\n        )\n\n    def __call__(self, *args, **kwargs) -> DataRecordSet:\n        \"\"\"\n        This function returns the context as a DataRecord wrapped in a DataRecordSet.\n        \"\"\"\n        # construct a DataRecord from the context\n        start_time = time.time()\n        dr = DataRecord(self.output_schema(), source_indices=[f\"{self.context.id}-{0}\"])\n        dr.context = self.context\n        end_time = time.time()\n\n        # create RecordOpStats objects\n        record_op_stats = RecordOpStats(\n            record_id=dr._id,\n            record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=(end_time - start_time),\n            cost_per_record=0.0,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n \n        # construct and return DataRecordSet object\n        return DataRecordSet([dr], [record_op_stats])\n"
  },
  {
    "path": "src/palimpzest/query/operators/search.py",
    "content": "import functools\nimport inspect\nimport os\nimport time\nfrom typing import Any\n\n# from mem0 import Memory\nfrom smolagents import CodeAgent, LiteLLMModel, tool\n\n# from palimpzest.agents.search_agents import DataDiscoveryAgent, SearchManagerAgent\nfrom palimpzest.core.data.context import Context\nfrom palimpzest.core.data.context_manager import ContextManager\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import GenerationStats, OperatorCostEstimates, RecordOpStats\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\n\ndef make_tool(bound_method):\n    # Get the original function and bound instance\n    func = bound_method.__func__\n    instance = bound_method.__self__\n    \n    # Get the signature and remove 'self'\n    sig = inspect.signature(func)\n    params = list(sig.parameters.values())[1:]  # skip 'self'\n    new_sig = inspect.Signature(parameters=params, return_annotation=sig.return_annotation)\n\n    # Create a wrapper function dynamically\n    @functools.wraps(func)\n    def wrapper(*args, **kwargs):\n        return func(instance, *args, **kwargs)\n\n    # Update the __signature__ to reflect the new one without 'self'\n    wrapper.__signature__ = new_sig\n\n    return wrapper\n\n\nclass SmolAgentsSearch(PhysicalOperator):\n    \"\"\"\n    Physical operator for searching with Smol Agents.\n    \"\"\"\n    def __init__(self, context_id: str, search_query: str, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n        self.context_id = context_id\n        self.search_query = search_query\n        # self.model_id = \"anthropic/claude-3-7-sonnet-latest\"\n        self.model_id = \"openai/gpt-4o-mini-2024-07-18\"\n        # self.model_id = \"openai/gpt-4o-2024-08-06\"\n        api_key = os.getenv(\"ANTHROPIC_API_KEY\") if \"anthropic\" in self.model_id else os.getenv(\"OPENAI_API_KEY\")\n        self.model = LiteLLMModel(model_id=self.model_id, 
api_key=api_key)\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Context ID: {self.context_id:20s}\\n\"\n        op += f\"    Search Query: {self.search_query:20s}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        return {\n            \"context_id\": self.context_id,\n            \"search_query\": self.search_query,\n            **id_params,\n        }\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\n            \"context_id\": self.context_id,\n            \"search_query\": self.search_query,\n            **op_params,\n        }\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        return OperatorCostEstimates(\n            cardinality=source_op_cost_estimates.cardinality,\n            time_per_record=100,\n            cost_per_record=1,\n            quality=1.0,\n        )\n\n    def _create_record_set(\n        self,\n        candidate: DataRecord,\n        generation_stats: GenerationStats,\n        total_time: float,\n        answer: dict[str, Any],\n    ) -> DataRecordSet:\n        \"\"\"\n        Given an input DataRecord and the answer produced by the search (a mapping from output\n        field names to values), construct the resulting DataRecordSet.\n        \"\"\"\n        # create new DataRecord\n        data_item = {field: answer[field] for field in self.output_schema.model_fields if field in answer}\n        dr = DataRecord.from_parent(self.output_schema, data_item, parent_record=candidate)\n\n        # create RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=dr._id,\n            record_parent_ids=dr._parent_ids,\n            record_source_indices=dr._source_indices,\n            record_state=dr.to_dict(include_bytes=False),\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            
op_name=self.op_name(),\n            time_per_record=total_time,\n            cost_per_record=generation_stats.cost_per_record,\n            model_name=self.get_model_name(),\n            input_text_tokens=generation_stats.input_text_tokens,\n            input_audio_tokens=generation_stats.input_audio_tokens,\n            input_image_tokens=generation_stats.input_image_tokens,\n            cache_read_tokens=generation_stats.cache_read_tokens,\n            cache_creation_tokens=generation_stats.cache_creation_tokens,\n            output_text_tokens=generation_stats.output_text_tokens,\n            embedding_input_tokens=generation_stats.embedding_input_tokens,\n            llm_call_duration_secs=generation_stats.llm_call_duration_secs,\n            fn_call_duration_secs=generation_stats.fn_call_duration_secs,\n            total_llm_calls=generation_stats.total_llm_calls,\n            total_embedding_llm_calls=generation_stats.total_embedding_llm_calls,\n            answer={k: v.description if isinstance(v, Context) else v for k, v in answer.items()},\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        return DataRecordSet([dr], [record_op_stats])\n\n    def __call__(self, candidate: DataRecord) -> Any:\n        start_time = time.time()\n\n        # get the input context object and its tools\n        input_context: Context = candidate.context\n        description = input_context.description\n        tools = [tool(make_tool(f)) for f in input_context.tools]\n\n        # # construct the full search query\n        # full_query = f\"Please execute the following search query. Output a **detailed** description of (1) which data you look at, and (2) what you find in that data. Avoid making overly broad statements such as \\\"What you're searching for is not present in the dataset\\\". 
Instead, make more precise statements like \\\"What you're searching for is not present in files A.txt, B.txt, and C.txt, but may be present elsewhere\\\".\\n\\nQUERY: {self.search_query}\"\n\n        # perform the computation\n        instructions = f\"\\n\\nHere is a description of the Context whose data you will be working with, as well as any previously computed results:\\n\\n{description}\"\n        agent = CodeAgent(\n            tools=tools,\n            model=self.model,\n            add_base_tools=False,\n            instructions=instructions,\n            return_full_result=True,\n            additional_authorized_imports=[\"pandas\", \"io\", \"os\"],\n        )\n        result = agent.run(self.search_query)\n        # NOTE: you can see the system prompt with `agent.memory.system_prompt.system_prompt`\n        # full_steps = agent.memory.get_full_steps()\n\n        # compute generation stats\n        response = result.output\n        input_tokens = result.token_usage.input_tokens\n        output_tokens = result.token_usage.output_tokens\n        cost_per_input_token = (3.0 / 1e6) if \"anthropic\" in self.model_id else (0.15 / 1e6) # (2.5 / 1e6) #\n        cost_per_output_token = (15.0 / 1e6) if \"anthropic\" in self.model_id else (0.6 / 1e6) # (10.0 / 1e6) #\n        input_cost = input_tokens * cost_per_input_token\n        output_cost = output_tokens * cost_per_output_token\n        generation_stats = GenerationStats(\n            model_name=self.model_id,\n            input_text_tokens=input_tokens,\n            output_text_tokens=output_tokens,\n            cost_per_record=input_cost + output_cost,\n            llm_call_duration_secs=time.time() - start_time,\n        )\n\n        # update the description of the Context to include the search result\n        new_description = f\"RESULT: {response}\\n\\n\"\n        cm = ContextManager()\n        cm.update_context(id=self.context_id, description=new_description)\n\n        # create and return record set\n  
      field_answers = {\n            \"context\": cm.get_context(id=self.context_id),\n        }\n        record_set = self._create_record_set(\n            candidate,\n            generation_stats,\n            time.time() - start_time,\n            field_answers,\n        )\n\n        return record_set\n\n\n# class SmolAgentsManagedSearch(PhysicalOperator):\n#     \"\"\"\n#     Physical operator for searching with Smol Agents using an Orchestrator and a Data Discovery Agent.\n#     \"\"\"\n#     def __init__(self, context_id: str, search_query: str, *args, **kwargs):\n#         super().__init__(*args, **kwargs)\n#         self.context_id = context_id\n#         self.search_query = search_query\n#         # self.model_id = \"anthropic/claude-3-7-sonnet-latest\"\n#         self.model_id = \"openai/gpt-4o-mini-2024-07-18\"\n#         # self.model_id = \"o1\"\n#         model_params = {\n#             \"model_id\": self.model_id,\n#             \"custom_role_conversions\": {\"tool-call\": \"assistant\", \"tool-response\": \"user\"},\n#             \"max_completion_tokens\": 8192,\n#         }\n#         if self.model_id == \"o1\":\n#             model_params[\"reasoning_effort\"] = \"high\"\n#         self.model = LiteLLMModel(**model_params)\n#         self.text_limit = 100000\n#         self.memory = Memory()\n\n#     def __str__(self):\n#         op = super().__str__()\n#         op += f\"    Context ID: {self.context_id:20s}\\n\"\n#         op += f\"    Search Query: {self.search_query:20s}\\n\"\n#         return op\n\n#     def get_id_params(self):\n#         id_params = super().get_id_params()\n#         return {\n#             \"context_id\": self.context_id,\n#             \"search_query\": self.search_query,\n#             **id_params,\n#         }\n\n#     def get_op_params(self):\n#         op_params = super().get_op_params()\n#         return {\n#             \"context_id\": self.context_id,\n#             \"search_query\": self.search_query,\n#           
  **op_params,\n#         }\n\n#     def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n#         return OperatorCostEstimates(\n#             cardinality=source_op_cost_estimates.cardinality,\n#             time_per_record=100,\n#             cost_per_record=1,\n#             quality=1.0,\n#         )\n\n#     def _create_record_set(\n#         self,\n#         candidate: DataRecord,\n#         generation_stats: GenerationStats,\n#         total_time: float,\n#         answer: dict[str, Any],\n#     ) -> DataRecordSet:\n#         \"\"\"\n#         Given an input DataRecord and a determination of whether it passed the filter or not,\n#         construct the resulting RecordSet.\n#         \"\"\"\n#         # create new DataRecord\n#         data_item = {field: answer[field] for field in self.output_schema.model_fields if field in answer}\n#         dr = DataRecord.from_parent(self.output_schema, data_item, parent_record=candidate)\n\n        # # create RecordOpStats object\n        # record_op_stats = RecordOpStats(\n        #     record_id=dr._id,\n        #     record_parent_ids=dr._parent_ids,\n        #     record_source_indices=dr._source_indices,\n        #     record_state=dr.to_dict(include_bytes=False),\n        #     full_op_id=self.get_full_op_id(),\n        #     logical_op_id=self.logical_op_id,\n        #     op_name=self.op_name(),\n        #     time_per_record=total_time,\n        #     cost_per_record=generation_stats.cost_per_record,\n        #     model_name=self.get_model_name(),\n        #     total_input_tokens=generation_stats.total_input_tokens,\n        #     total_output_tokens=generation_stats.total_output_tokens,\n        #     total_input_cost=generation_stats.total_input_cost,\n        #     total_output_cost=generation_stats.total_output_cost,\n        #     llm_call_duration_secs=generation_stats.llm_call_duration_secs,\n        #     
fn_call_duration_secs=generation_stats.fn_call_duration_secs,\n        #     total_llm_calls=generation_stats.total_llm_calls,\n        #     total_embedding_llm_calls=generation_stats.total_embedding_llm_calls,\n        #     answer={k: v.description if isinstance(v, Context) else v for k, v in answer.items()},\n        #     op_details={k: str(v) for k, v in self.get_id_params().items()},\n        # )\n\n#         return DataRecordSet([dr], [record_op_stats])\n\n#     def __call__(self, candidate: DataRecord) -> Any:\n#         start_time = time.time()\n\n#         # get the input context object and its tools\n#         input_context: Context = candidate.context\n#         description = input_context.description\n#         tools = [tool(make_tool(f)) for f in input_context.tools]\n\n#         # create a memory tool for accessing past searches\n#         @tool\n#         def tool_search_history(query: str) -> str:\n#             \"\"\"\n#             This tool enables the agent to search through its history of execution in previous sessions.\n#             Thus, the agent can learn more about what it has done in the past by invoking this tool with\n#             a query describing what past interactions the agent might be curious about.\n\n#             Args:\n#                 query (str): A description of what the agent wishes to search for in its execution history.\n\n#             Returns:\n#                 str: A summary of the agent execution history which is relevant to the query.\n#             \"\"\"\n#             memories = self.memory.search(query=query, user_id=\"data_discovery_agent\")\n#             memory_str = \"\"\n#             for idx, memory in enumerate(memories):\n#                 memory_str += f\"MEMORY {idx+1}: {memory['memory']}\"\n#             return memory_str\n\n#         # tools.append(tool_search_history)\n#         data_discovery_agent = CodeAgent(\n#             model=self.model,\n#             tools=tools,\n#             
max_steps=20,\n#             verbosity_level=2,\n#             planning_interval=4,\n#             name=\"data_discovery_agent\",\n#             description=\"\"\"A team member that will search a data repository to find files which help to answer your question.\n#         Ask him for all your questions that require searching a repository of relevant data.\n#         Provide him as much context as possible, in particular if you need to search on a specific timeframe!\n#         And don't hesitate to provide him with a complex search task, like finding a difference between two files.\n#         Your request must be a real sentence, not a keyword search! Like \"Find me this information (...)\" rather than a few keywords.\n#         \"\"\",\n#             provide_run_summary=True,\n#         )\n#         data_discovery_agent.prompt_templates[\"managed_agent\"][\"task\"] += f\"\"\"\\n\\nHere is a description of the context you will be working with: {description}\\n\\nSearch as many files as possible before returning your final answer.\\n\\nAdditionally, if after some searching you find out that you need more information to answer the question, you can use `final_answer` with your request for clarification as argument to request for more information.\"\"\"\n\n#         manager_agent = CodeAgent(\n#             model=self.model,\n#             tools=tools,\n#             max_steps=12,\n#             verbosity_level=2,\n#             additional_authorized_imports=[\"*\"],\n#             planning_interval=4,\n#             managed_agents=[data_discovery_agent],\n#             return_full_result=True,\n#         )\n\n#         # TODO: improve context descriptions and add memory from there; expand to multi-modal benchmark(s)\n#         # perform the computation\n#         result = manager_agent.run(self.search_query)\n\n#         # compute generation stats\n#         response = result.output\n#         input_tokens = result.token_usage.input_tokens\n#         output_tokens = 
result.token_usage.output_tokens\n#         cost_per_input_token = (3.0 / 1e6) if \"anthropic\" in self.model_id else (0.15 / 1e6) # (15.0 / 1e6)\n#         cost_per_output_token = (15.0 / 1e6) if \"anthropic\" in self.model_id else (0.6 / 1e6) # (60.0 / 1e6)\n#         input_cost = input_tokens * cost_per_input_token\n#         output_cost = output_tokens * cost_per_output_token\n#         generation_stats = GenerationStats(\n#             model_name=self.model_id,\n#             total_input_tokens=input_tokens,\n#             total_output_tokens=output_tokens,\n#             total_input_cost=input_cost,\n#             total_output_cost=output_cost,\n#             cost_per_record=input_cost + output_cost,\n#             llm_call_duration_secs=time.time() - start_time,\n#         )\n\n#         # update the description of the Context to include the search result\n#         new_description = f\"RESULT: {response}\\n\\n\"\n#         cm = ContextManager()\n#         cm.update_context(id=self.context_id, description=new_description)\n\n#         # create and return record set\n#         field_answers = {\n#             \"context\": cm.get_context(id=self.context_id),\n#         }\n#         record_set = self._create_record_set(\n#             candidate,\n#             generation_stats,\n#             time.time() - start_time,\n#             field_answers,\n#         )\n\n#         return record_set\n\n\n# class SmolAgentsCustomManagedSearch(PhysicalOperator):\n#     \"\"\"\n#     Physical operator for searching with Smol Agents using an Orchestrator and a Data Discovery Agent.\n#     \"\"\"\n#     def __init__(self, context_id: str, search_query: str, *args, **kwargs):\n#         super().__init__(*args, **kwargs)\n#         self.context_id = context_id\n#         self.search_query = search_query\n#         # self.model_id = \"anthropic/claude-3-7-sonnet-latest\"\n#         self.model_id = \"openai/gpt-4o-mini-2024-07-18\"\n#         # self.model_id = \"o1\"\n#         
model_params = {\n#             \"model_id\": self.model_id,\n#             \"custom_role_conversions\": {\"tool-call\": \"assistant\", \"tool-response\": \"user\"},\n#             \"max_completion_tokens\": 8192,\n#         }\n#         if self.model_id == \"o1\":\n#             model_params[\"reasoning_effort\"] = \"high\"\n#         self.model = LiteLLMModel(**model_params)\n#         self.text_limit = 100000\n\n#     def __str__(self):\n#         op = super().__str__()\n#         op += f\"    Context ID: {self.context_id:20s}\\n\"\n#         op += f\"    Search Query: {self.search_query:20s}\\n\"\n#         return op\n\n#     def get_id_params(self):\n#         id_params = super().get_id_params()\n#         return {\n#             \"context_id\": self.context_id,\n#             \"search_query\": self.search_query,\n#             **id_params,\n#         }\n\n#     def get_op_params(self):\n#         op_params = super().get_op_params()\n#         return {\n#             \"context_id\": self.context_id,\n#             \"search_query\": self.search_query,\n#             **op_params,\n#         }\n\n#     def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n#         return OperatorCostEstimates(\n#             cardinality=source_op_cost_estimates.cardinality,\n#             time_per_record=100,\n#             cost_per_record=1,\n#             quality=1.0,\n#         )\n\n#     def _create_record_set(\n#         self,\n#         candidate: DataRecord,\n#         generation_stats: GenerationStats,\n#         total_time: float,\n#         answer: dict[str, Any],\n#     ) -> DataRecordSet:\n#         \"\"\"\n#         Given an input DataRecord and a determination of whether it passed the filter or not,\n#         construct the resulting RecordSet.\n#         \"\"\"\n#         # create new DataRecord\n#         data_item = {field: answer[field] for field in self.output_schema.model_fields if field in answer}\n#       
  dr = DataRecord.from_parent(self.output_schema, data_item, parent_record=candidate)\n\n#         # create RecordOpStats object\n#         record_op_stats = RecordOpStats(\n#             record_id=dr._id,\n#             record_parent_ids=dr._parent_ids,\n#             record_source_indices=dr._source_indices,\n#             record_state=dr.to_dict(include_bytes=False),\n#             full_op_id=self.get_full_op_id(),\n#             logical_op_id=self.logical_op_id,\n#             op_name=self.op_name(),\n#             time_per_record=total_time,\n#             cost_per_record=generation_stats.cost_per_record,\n#             model_name=self.get_model_name(),\n#             total_input_tokens=generation_stats.total_input_tokens,\n#             total_output_tokens=generation_stats.total_output_tokens,\n#             total_input_cost=generation_stats.total_input_cost,\n#             total_output_cost=generation_stats.total_output_cost,\n#             llm_call_duration_secs=generation_stats.llm_call_duration_secs,\n#             fn_call_duration_secs=generation_stats.fn_call_duration_secs,\n#             total_llm_calls=generation_stats.total_llm_calls,\n#             total_embedding_llm_calls=generation_stats.total_embedding_llm_calls,\n#             answer={k: v.description if isinstance(v, Context) else v for k, v in answer.items()},\n#             op_details={k: str(v) for k, v in self.get_id_params().items()},\n#         )\n\n#         return DataRecordSet([dr], [record_op_stats])\n\n#     def __call__(self, candidate: DataRecord) -> Any:\n#         start_time = time.time()\n\n#         # get the input context object and its tools\n#         input_context: Context = candidate.context\n#         description = input_context.description\n#         tools = [tool(make_tool(f)) for f in input_context.tools]\n\n#         # TODO: add semantic operators to tools\n#         data_discovery_agent = DataDiscoveryAgent(self.context_id, description, model=self.model, 
tools=tools)\n#         search_manager_agent = SearchManagerAgent(self.context_id, description, model=self.model, tools=tools, managed_agents=[data_discovery_agent])\n\n#         # perform the computation\n#         result = search_manager_agent.run(self.search_query)\n\n#         # compute generation stats\n#         response = result.output\n#         input_tokens = result.token_usage.input_tokens\n#         output_tokens = result.token_usage.output_tokens\n#         cost_per_input_token = (3.0 / 1e6) if \"anthropic\" in self.model_id else (0.15 / 1e6) # (15.0 / 1e6)\n#         cost_per_output_token = (15.0 / 1e6) if \"anthropic\" in self.model_id else (0.6 / 1e6) # (60.0 / 1e6)\n#         input_cost = input_tokens * cost_per_input_token\n#         output_cost = output_tokens * cost_per_output_token\n#         generation_stats = GenerationStats(\n#             model_name=self.model_id,\n#             total_input_tokens=input_tokens,\n#             total_output_tokens=output_tokens,\n#             total_input_cost=input_cost,\n#             total_output_cost=output_cost,\n#             cost_per_record=input_cost + output_cost,\n#             llm_call_duration_secs=time.time() - start_time,\n#         )\n\n#         # update the description of the Context to include the search result\n#         new_description = f\"RESULT: {response}\\n\\n\"\n#         cm = ContextManager()\n#         cm.update_context(id=self.context_id, description=new_description)\n\n#         # create and return record set\n#         field_answers = {\n#             \"context\": cm.get_context(id=self.context_id),\n#         }\n#         record_set = self._create_record_set(\n#             candidate,\n#             generation_stats,\n#             time.time() - start_time,\n#             field_answers,\n#         )\n\n#         return record_set\n"
  },
  {
    "path": "src/palimpzest/query/operators/split.py",
    "content": "from __future__ import annotations\n\nimport math\n\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.constants import (\n    NAIVE_EST_NUM_INPUT_TOKENS,\n    NAIVE_EST_NUM_OUTPUT_TOKENS,\n    Cardinality,\n    PromptStrategy,\n)\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.models import GenerationStats, OperatorCostEstimates\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.operators.convert import LLMConvert\nfrom palimpzest.query.operators.filter import LLMFilter\n\n\nclass SplitConvert(LLMConvert):\n    def __init__(self, num_chunks: int = 2, min_size_to_chunk: int = 1000, *args, **kwargs):\n        kwargs[\"prompt_strategy\"] = None\n        super().__init__(*args, **kwargs)\n        self.num_chunks = num_chunks\n        self.min_size_to_chunk = min_size_to_chunk\n        self.split_generator = Generator(self.model, PromptStrategy.MAP_SPLIT_PROPOSER, self.reasoning_effort, self.cardinality, self.desc, self.verbose)\n        self.split_merge_generator = Generator(self.model, PromptStrategy.MAP_SPLIT_MERGER, self.reasoning_effort, self.cardinality, self.desc, self.verbose)\n\n        # crude adjustment factor for naive estimation in unoptimized setting\n        self.naive_quality_adjustment = 0.6\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Chunk Size: {str(self.num_chunks)}\\n\"\n        op += f\"    Min Size to Chunk: {str(self.min_size_to_chunk)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\"num_chunks\": self.num_chunks, \"min_size_to_chunk\": self.min_size_to_chunk, **id_params}\n\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"num_chunks\": self.num_chunks, \"min_size_to_chunk\": self.min_size_to_chunk, **op_params}\n\n    def naive_cost_estimates(self, source_op_cost_estimates: 
OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Update the cost per record and quality estimates produced by LLMConvert's naive estimates.\n        We adjust the cost per record to account for the reduced number of input tokens following\n        the retrieval of relevant chunks, and we make a crude estimate of the quality degradation\n        that results from using a downsized input (although this may in fact improve quality in\n        some cases).\n        \"\"\"\n        # get naive cost estimates from LLMConvert\n        naive_op_cost_estimates = super().naive_cost_estimates(source_op_cost_estimates)\n\n        # re-compute cost per record assuming we use fewer input tokens; naively assume a single input field\n        est_num_input_tokens = NAIVE_EST_NUM_INPUT_TOKENS\n        est_num_output_tokens = NAIVE_EST_NUM_OUTPUT_TOKENS\n        model_conversion_usd_per_record = (\n            self.model.get_usd_per_input_token() * est_num_input_tokens\n            + self.model.get_usd_per_output_token() * est_num_output_tokens\n        )\n\n        # set refined estimate of cost per record\n        naive_op_cost_estimates.cost_per_record = model_conversion_usd_per_record\n        naive_op_cost_estimates.cost_per_record_lower_bound = naive_op_cost_estimates.cost_per_record\n        naive_op_cost_estimates.cost_per_record_upper_bound = naive_op_cost_estimates.cost_per_record\n        naive_op_cost_estimates.quality = (naive_op_cost_estimates.quality) * self.naive_quality_adjustment\n        naive_op_cost_estimates.quality_lower_bound = naive_op_cost_estimates.quality\n        naive_op_cost_estimates.quality_upper_bound = naive_op_cost_estimates.quality\n\n        return naive_op_cost_estimates\n\n    def get_text_chunks(self, text: str, num_chunks: int) -> list[str]:\n        \"\"\"\n        Given a text string, chunk it into num_chunks substrings of roughly equal size.\n        \"\"\"\n        chunks = []\n\n        idx, chunk_size = 0, 
math.ceil(len(text) / num_chunks)\n        while idx + chunk_size < len(text):\n            chunks.append(text[idx : idx + chunk_size])\n            idx += chunk_size\n\n        if idx < len(text):\n            chunks.append(text[idx:])\n\n        return chunks\n\n    def get_chunked_candidate(self, candidate: DataRecord, input_fields: list[str]) -> list[DataRecord]:\n        \"\"\"\n        For each text field, chunk the content. If a field is shorter than `min_size_to_chunk`,\n        simply include the full field.\n        \"\"\"\n        # compute mapping from each field to its chunked content\n        field_name_to_chunked_content = {}\n        for field_name in input_fields:\n            field = candidate.get_field_type(field_name)\n            content = candidate[field_name]\n\n            # do not chunk this field if it is not a string or a list of strings\n            is_string_field = field.annotation in [str, str | None]\n            is_list_string_field = field.annotation in [list[str], list[str] | None]\n            if not (is_string_field or is_list_string_field):\n                field_name_to_chunked_content[field_name] = [content]\n                continue\n\n            # if this is a list of strings, join the strings\n            if is_list_string_field:\n                content = \"[\" + \", \".join(content) + \"]\"\n\n            # skip this field if its length is less than the min size to chunk\n            if len(content) < self.min_size_to_chunk:\n                field_name_to_chunked_content[field_name] = [content]\n                continue\n\n            # chunk the content\n            field_name_to_chunked_content[field_name] = self.get_text_chunks(content, self.num_chunks)\n\n        # compute the true number of chunks (may be 1 if no field is chunked)\n        num_chunks = max(len(chunks) for chunks in field_name_to_chunked_content.values())\n\n        # create the chunked candidates\n        candidates = []\n        for chunk_idx in 
range(num_chunks):\n            candidate_copy = candidate.copy()\n            for field_name in input_fields:\n                field_chunks = field_name_to_chunked_content[field_name]\n                candidate_copy[field_name] = field_chunks[chunk_idx] if len(field_chunks) > 1 else field_chunks[0]\n\n            candidates.append(candidate_copy)\n\n        return candidates\n\n    def convert(self, candidate: DataRecord, fields: dict[str, FieldInfo]) -> tuple[dict[str, list], GenerationStats]:\n        # get the set of input fields to use for the convert operation\n        input_fields = self.get_input_fields()\n\n        # split the input fields of (a copy of) the candidate into chunks\n        candidate_copy = candidate.copy()\n        chunked_candidates = self.get_chunked_candidate(candidate_copy, input_fields)\n\n        # construct kwargs for generation\n        gen_kwargs = {\"project_cols\": input_fields, \"output_schema\": self.output_schema}\n\n        # generate outputs for each chunk separately\n        chunk_outputs, chunk_generation_stats_lst = [], []\n        for candidate in chunked_candidates:\n            _, reasoning, chunk_generation_stats, _ = self.split_generator(candidate, fields, json_output=False, **gen_kwargs)\n            chunk_outputs.append(reasoning)\n            chunk_generation_stats_lst.append(chunk_generation_stats)\n\n        # call the merger\n        gen_kwargs = {\n            \"project_cols\": input_fields,\n            \"output_schema\": self.output_schema,\n            \"chunk_outputs\": chunk_outputs,\n        }\n        field_answers, _, merger_gen_stats, _ = self.split_merge_generator(candidate, fields, **gen_kwargs)\n\n        # compute the total generation stats\n        generation_stats = sum(chunk_generation_stats_lst) + merger_gen_stats\n\n        return field_answers, generation_stats\n\n\nclass SplitFilter(LLMFilter):\n    def __init__(self, num_chunks: int = 2, min_size_to_chunk: int = 1000, *args, **kwargs):\n 
       kwargs[\"prompt_strategy\"] = None\n        super().__init__(*args, **kwargs)\n        self.num_chunks = num_chunks\n        self.min_size_to_chunk = min_size_to_chunk\n        self.split_generator = Generator(self.model, PromptStrategy.FILTER_SPLIT_PROPOSER, self.reasoning_effort, Cardinality.ONE_TO_ONE, self.desc, self.verbose)\n        self.split_merge_generator = Generator(self.model, PromptStrategy.FILTER_SPLIT_MERGER, self.reasoning_effort, Cardinality.ONE_TO_ONE, self.desc, self.verbose)\n\n        # crude adjustment factor for naive estimation in no-sentinel setting\n        self.naive_quality_adjustment = 0.6\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Chunk Size: {str(self.num_chunks)}\\n\"\n        op += f\"    Min Size to Chunk: {str(self.min_size_to_chunk)}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\"num_chunks\": self.num_chunks, \"min_size_to_chunk\": self.min_size_to_chunk, **id_params}\n\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        return {\"num_chunks\": self.num_chunks, \"min_size_to_chunk\": self.min_size_to_chunk, **op_params}\n\n    def naive_cost_estimates(self, source_op_cost_estimates: OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Update the cost per record and quality estimates produced by LLMFilter's naive estimates.\n        We adjust the cost per record to account for the reduced number of input tokens following\n        the retrieval of relevant chunks, and we make a crude estimate of the quality degradation\n        that results from using a downsized input (although this may in fact improve quality in\n        some cases).\n        \"\"\"\n        # get naive cost estimates from LLMFilter\n        naive_op_cost_estimates = super().naive_cost_estimates(source_op_cost_estimates)\n\n        # re-compute cost per record 
assuming we use fewer input tokens; naively assume a single input field\n        est_num_input_tokens = NAIVE_EST_NUM_INPUT_TOKENS\n        est_num_output_tokens = NAIVE_EST_NUM_OUTPUT_TOKENS\n        model_conversion_usd_per_record = (\n            self.model.get_usd_per_input_token() * est_num_input_tokens\n            + self.model.get_usd_per_output_token() * est_num_output_tokens\n        )\n\n        # set refined estimate of cost per record\n        naive_op_cost_estimates.cost_per_record = model_conversion_usd_per_record\n        naive_op_cost_estimates.cost_per_record_lower_bound = naive_op_cost_estimates.cost_per_record\n        naive_op_cost_estimates.cost_per_record_upper_bound = naive_op_cost_estimates.cost_per_record\n        naive_op_cost_estimates.quality = (naive_op_cost_estimates.quality) * self.naive_quality_adjustment\n        naive_op_cost_estimates.quality_lower_bound = naive_op_cost_estimates.quality\n        naive_op_cost_estimates.quality_upper_bound = naive_op_cost_estimates.quality\n\n        return naive_op_cost_estimates\n\n    def get_text_chunks(self, text: str, num_chunks: int) -> list[str]:\n        \"\"\"\n        Given a text string, chunk it into num_chunks substrings of roughly equal size.\n        \"\"\"\n        chunks = []\n\n        idx, chunk_size = 0, math.ceil(len(text) / num_chunks)\n        while idx + chunk_size < len(text):\n            chunks.append(text[idx : idx + chunk_size])\n            idx += chunk_size\n\n        if idx < len(text):\n            chunks.append(text[idx:])\n\n        return chunks\n\n    def get_chunked_candidate(self, candidate: DataRecord, input_fields: list[str]) -> list[DataRecord]:\n        \"\"\"\n        For each text field, chunk the content. 
If a field is shorter than `min_size_to_chunk`,\n        simply include the full field.\n        \"\"\"\n        # compute mapping from each field to its chunked content\n        field_name_to_chunked_content = {}\n        for field_name in input_fields:\n            field = candidate.get_field_type(field_name)\n            content = candidate[field_name]\n\n            # do not chunk this field if it is not a string or a list of strings\n            is_string_field = field.annotation in [str, str | None]\n            is_list_string_field = field.annotation in [list[str], list[str] | None]\n            if not (is_string_field or is_list_string_field):\n                field_name_to_chunked_content[field_name] = [content]\n                continue\n\n            # if this is a list of strings, join the strings\n            if is_list_string_field:\n                content = \"[\" + \", \".join(content) + \"]\"\n\n            # skip this field if its length is less than the min size to chunk\n            if len(content) < self.min_size_to_chunk:\n                field_name_to_chunked_content[field_name] = [content]\n                continue\n\n            # chunk the content\n            field_name_to_chunked_content[field_name] = self.get_text_chunks(content, self.num_chunks)\n\n        # compute the true number of chunks (may be 1 if no field is chunked)\n        num_chunks = max(len(chunks) for chunks in field_name_to_chunked_content.values())\n\n        # create the chunked candidates\n        candidates = []\n        for chunk_idx in range(num_chunks):\n            candidate_copy = candidate.copy()\n            for field_name in input_fields:\n                field_chunks = field_name_to_chunked_content[field_name]\n                candidate_copy[field_name] = field_chunks[chunk_idx] if len(field_chunks) > 1 else field_chunks[0]\n\n            candidates.append(candidate_copy)\n\n        return candidates\n\n    def filter(self, candidate: DataRecord) -> 
tuple[dict[str, bool], GenerationStats]:\n        # get the set of input fields to use for the filter operation\n        input_fields = self.get_input_fields()\n\n        # construct output fields\n        fields = {\"passed_operator\": FieldInfo(annotation=bool, description=\"Whether the record passed the filter operation\")}\n\n        # split the input fields of (a copy of) the candidate into chunks\n        candidate_copy = candidate.copy()\n        chunked_candidates = self.get_chunked_candidate(candidate_copy, input_fields)\n\n        # construct kwargs for generation\n        gen_kwargs = {\"project_cols\": input_fields, \"filter_condition\": self.filter_obj.filter_condition}\n\n        # generate outputs for each chunk separately\n        chunk_outputs, chunk_generation_stats_lst = [], []\n        for candidate in chunked_candidates:\n            _, reasoning, chunk_generation_stats, _ = self.split_generator(candidate, fields, json_output=False, **gen_kwargs)\n            chunk_outputs.append(reasoning)\n            chunk_generation_stats_lst.append(chunk_generation_stats)\n\n        # call the merger\n        gen_kwargs = {\n            \"project_cols\": input_fields,\n            \"filter_condition\": self.filter_obj.filter_condition,\n            \"chunk_outputs\": chunk_outputs,\n        }\n        field_answers, _, merger_gen_stats, _ = self.split_merge_generator(candidate, fields, **gen_kwargs)\n\n        # compute the total generation stats\n        generation_stats = sum(chunk_generation_stats_lst) + merger_gen_stats\n\n        return field_answers, generation_stats\n"
  },
  {
    "path": "src/palimpzest/query/operators/topk.py",
    "content": "from __future__ import annotations\n\nimport os\nimport threading\nimport time\nfrom typing import Callable\n\nfrom chromadb.api.models.Collection import Collection\nfrom chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction\nfrom chromadb.utils.embedding_functions.openai_embedding_function import OpenAIEmbeddingFunction\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nfrom sentence_transformers import SentenceTransformer\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.models import GenerationStats, OperatorCostEstimates, RecordOpStats\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\n\nclass Singleton:\n     def __new__(cls, *args, **kw):\n         if not hasattr(cls, '_instance'):\n             orig = super(Singleton, cls)  # noqa: UP008\n             cls._instance = orig.__new__(cls, *args, **kw)\n         return cls._instance\n\nclass ClipModel(Singleton):\n    model = None\n    lock = threading.Lock()\n\n    @classmethod\n    def get_model(cls, model_name: str):\n        with cls.lock:\n            if cls.model is None:\n                cls.model = SentenceTransformer(model_name)\n            return cls.model\n\nclass TopKOp(PhysicalOperator):\n    def __init__(\n        self,\n        index: Collection,\n        search_attr: str,\n        output_attrs: list[dict] | type[BaseModel],\n        search_func: Callable | None,\n        k: int,\n        *args,\n        **kwargs,\n    ) -> None:\n        \"\"\"\n        Initialize the TopKOp object.\n        \n        Args:\n            index (Collection): The PZ index to use for retrieval.\n            search_attr (str): The attribute to search on.\n            output_attrs (list[dict]): The output fields containing the results of the search.\n            search_func (Callable | None): The function to use for searching the index. 
If None, the default search function will be used.\n            k (int): The number of top results to retrieve.\n        \"\"\"\n        super().__init__(*args, **kwargs)\n\n        # extract the field names from the output_attrs; check isinstance(..., type) first\n        # because issubclass() raises a TypeError when its first argument is not a class\n        if isinstance(output_attrs, type) and issubclass(output_attrs, BaseModel):\n            self.output_field_names = list(output_attrs.model_fields)\n        elif isinstance(output_attrs, list):\n            self.output_field_names = [attr[\"name\"] for attr in output_attrs]\n        else:\n            raise ValueError(\"`output_attrs` must be a list of dicts or a `pydantic.BaseModel` subclass.\")\n\n        if len(self.output_field_names) != 1 and search_func is None:\n            raise ValueError(\"If `search_func` is None, `output_attrs` must have a single field.\")\n\n        self.index = index\n        self.search_attr = search_attr\n        self.output_attrs = output_attrs\n        self.search_func = search_func if search_func is not None else self.default_search_func\n        self.k = k\n        self.clip_model = ClipModel()\n\n    def __str__(self):\n        op = super().__str__()\n        op += f\"    Top-K: {self.index.__class__.__name__} with k={self.k}\\n\"\n        return op\n\n    def get_id_params(self):\n        id_params = super().get_id_params()\n        id_params = {\n            \"index\": self.index.__class__.__name__,\n            \"search_attr\": self.search_attr,\n            \"output_attrs\": self.output_attrs,\n            \"k\": self.k,\n            **id_params,\n        }\n\n        return id_params\n\n    def get_op_params(self):\n        op_params = super().get_op_params()\n        op_params = {\n            \"index\": self.index,\n            \"search_func\": self.search_func,\n            \"search_attr\": self.search_attr,\n            \"output_attrs\": self.output_attrs,\n            \"k\": self.k,\n            **op_params,\n        }\n\n        return op_params\n\n    def naive_cost_estimates(self, source_op_cost_estimates: 
OperatorCostEstimates) -> OperatorCostEstimates:\n        \"\"\"\n        Compute naive cost estimates for the Top-K operation. These estimates assume\n        that the Top-K (1) has negligible cost and (2) has perfect quality.\n        \"\"\"\n        return OperatorCostEstimates(\n            cardinality=source_op_cost_estimates.cardinality,\n            time_per_record=0.01 * self.k,   # estimate 10 ms execution lookup per output\n            cost_per_record=0.001 * self.k,  # estimate small marginal cost of lookups\n            quality=1.0,\n        )\n\n    def default_search_func(self, index: Collection, query: list[str] | list[list[float]], k: int) -> dict[str, list[str] | list[list[str]]]:\n        \"\"\"\n        Default search function for the Top-K operation. This function uses the index to\n        retrieve the top-k results for the given query. The query will be a (possibly singleton)\n        list of strings or a list of lists of floats (i.e., embeddings). The function returns a dict\n        mapping the single output field name to the top-k results per-query in (descending) sorted\n        order. If the input is a singleton list,\n        then the results will be a list of strings. 
If the input contains multiple queries, then the results\n        will be a list of lists of strings.\n\n        Args:\n            index (Collection): The chromadb collection to use for retrieval.\n            query (list[str] | list[list[float]]): The query (or queries) to search for.\n            k (int): The maximum number of results the top-k operator will return.\n\n        Returns:\n            dict[str, list[str] | list[list[str]]]: A dict mapping the single output field name to\n                the top results in (descending) sorted order per query.\n        \"\"\"\n        # check whether the input contains a single query\n        is_singleton_list = len(query) == 1\n\n        if isinstance(index, Collection):\n            # if the index is a chromadb collection, use the query method\n            results = index.query(query, n_results=k)\n\n            # the results[\"documents\"] will be a list[list[str]]; if the input is a singleton list,\n            # then we output the list of strings (i.e., the first element of the list), otherwise\n            # we output the list of lists\n            final_results = results[\"documents\"][0] if is_singleton_list else results[\"documents\"]\n\n            # NOTE: self.output_field_names must be a singleton for default_search_func to be used\n            return {self.output_field_names[0]: final_results}\n\n        else:\n            raise ValueError(\"Unsupported index type. 
Must be a chromadb Collection.\")\n\n    def _create_record_set(\n        self,\n        candidate: DataRecord,\n        top_k_results: dict[str, list[str] | list[list[str]]] | None,\n        generation_stats: GenerationStats,\n        total_time: float,\n    ) -> DataRecordSet:\n        \"\"\"\n        Given an input DataRecord and the top_k_results, construct the resulting RecordSet.\n        \"\"\"\n        # create output DataRecord and set the output attribute\n        data_item = {\n            output_field_name: None if top_k_results is None else top_k_results[output_field_name]\n            for output_field_name in self.output_field_names\n        }\n        output_dr = DataRecord.from_parent(self.output_schema, data_item, parent_record=candidate)\n\n        # get the record_state and generated fields\n        record_state = output_dr.to_dict(include_bytes=False)\n\n        # NOTE: this should be equivalent to self.get_fields_to_generate()\n        generated_fields = self.output_field_names\n\n        # construct the RecordOpStats object\n        record_op_stats = RecordOpStats(\n            record_id=output_dr._id,\n            record_parent_ids=output_dr._parent_ids,\n            record_source_indices=output_dr._source_indices,\n            record_state=record_state,\n            full_op_id=self.get_full_op_id(),\n            logical_op_id=self.logical_op_id,\n            op_name=self.op_name(),\n            time_per_record=total_time,\n            cost_per_record=generation_stats.cost_per_record,\n            input_text_tokens=generation_stats.input_text_tokens,\n            input_audio_tokens=generation_stats.input_audio_tokens,\n            input_image_tokens=generation_stats.input_image_tokens,\n            cache_read_tokens=generation_stats.cache_read_tokens,\n            cache_creation_tokens=generation_stats.cache_creation_tokens,\n            output_text_tokens=generation_stats.output_text_tokens,\n            
embedding_input_tokens=generation_stats.embedding_input_tokens,\n            answer=data_item,\n            input_fields=list(self.input_schema.model_fields),\n            generated_fields=generated_fields,\n            fn_call_duration_secs=total_time - generation_stats.llm_call_duration_secs,\n            llm_call_duration_secs=generation_stats.llm_call_duration_secs,\n            total_llm_calls=generation_stats.total_llm_calls,\n            total_embedding_llm_calls=generation_stats.total_embedding_llm_calls,\n            op_details={k: str(v) for k, v in self.get_id_params().items()},\n        )\n\n        drs = [output_dr]\n        record_op_stats_lst = [record_op_stats]\n\n        # construct and return the record set\n        return DataRecordSet(drs, record_op_stats_lst)\n\n    def __call__(self, candidate: DataRecord) -> DataRecordSet:\n        start_time = time.time()\n\n        # check that query is a string or list of strings, otherwise return output with self.output_field_names set to None\n        query = getattr(candidate, self.search_attr)\n        query_is_str = isinstance(query, str)\n        query_is_list_of_str = isinstance(query, list) and all(isinstance(q, str) for q in query)\n        if not query_is_str and not query_is_list_of_str:\n            return self._create_record_set(\n                candidate=candidate,\n                top_k_results=None,\n                generation_stats=GenerationStats(),\n                total_time=time.time() - start_time,\n            )\n\n        # if query is a string, convert it to a list of strings\n        if query_is_str:\n            query = [query]\n\n        # compute input/query embedding(s) if the index is a chromadb collection\n        inputs, gen_stats = None, GenerationStats()\n        if isinstance(self.index, Collection):\n            uses_openai_embedding_fcn = isinstance(self.index._embedding_function, OpenAIEmbeddingFunction)\n            uses_clip_model = 
isinstance(self.index._embedding_function, SentenceTransformerEmbeddingFunction)\n            error_msg = \"ChromaDB index must use OpenAI or SentenceTransformer embedding function; see: https://docs.trychroma.com/integrations/embedding-models/openai\"\n            assert uses_openai_embedding_fcn or uses_clip_model, error_msg\n\n            model_name = self.index._embedding_function.model_name if uses_openai_embedding_fcn else \"clip-ViT-B-32\"\n            err_msg = f\"For Chromadb, we currently only support `text-embedding-3-small` and `clip-ViT-B-32`; your index uses: {model_name}\"\n            embedding_model_names = [model.value for model in Model.get_all_models() if model.is_embedding_model()]\n            assert model_name in embedding_model_names, err_msg\n\n            # compute embeddings\n            try:\n                embed_start_time = time.time()\n                total_input_tokens = 0.0\n                if uses_openai_embedding_fcn:\n                    client = OpenAI()\n                    response = client.embeddings.create(input=query, model=model_name)\n                    total_input_tokens = response.usage.total_tokens\n                    inputs = [item.embedding for item in response.data]\n\n                elif uses_clip_model:\n                    model = self.clip_model.get_model(model_name)\n                    inputs = model.encode(query)\n\n                embed_total_time = time.time() - embed_start_time\n\n                # compute cost of embedding(s)\n                emb_model = Model(model_name)\n                total_input_cost = emb_model.get_usd_per_input_token() * total_input_tokens\n                gen_stats = GenerationStats(\n                    model_name=model_name,\n                    embedding_input_tokens=total_input_tokens,\n                    cost_per_record=total_input_cost,\n                    llm_call_duration_secs=embed_total_time,\n                    total_llm_calls=1,\n                    
total_embedding_llm_calls=len(query),\n                )\n            except Exception:\n                query = None\n\n        # in the default case, pass string inputs rather than embeddings\n        if inputs is None:\n            inputs = query\n\n        try:\n            assert inputs is not None, \"Error: inputs is None (likely because embedding generation failed)\"\n            top_results = self.search_func(self.index, inputs, self.k)\n\n        except Exception:\n            top_results = [\"error-in-topk\"]\n            os.makedirs(\"topk-errors\", exist_ok=True)\n            ts = time.time()\n            with open(f\"topk-errors/error-{ts}.txt\", \"w\") as f:\n                f.write(str(query))\n\n        # TODO: the user is always right! let's drop this post-processing in the future\n        # filter top_results for the top_k_results\n        top_k_results = {output_field_name: [] for output_field_name in self.output_field_names}\n        for output_field_name in self.output_field_names:\n            if output_field_name in top_results:\n                if all([isinstance(result, list) for result in top_results[output_field_name]]):\n                    for result in top_results[output_field_name]:\n                        top_k_results[output_field_name].append(result[:self.k])\n                else:\n                    top_k_results[output_field_name] = top_results[output_field_name][:self.k]\n            else:\n                top_k_results[output_field_name] = []\n\n        if self.verbose:\n            print(f\"Top {self.k} results: {top_k_results}\")\n\n        # construct and return the record set\n        return self._create_record_set(\n            candidate=candidate,\n            top_k_results=top_k_results,\n            generation_stats=gen_stats,\n            total_time=time.time() - start_time,\n        )\n"
  },
  {
    "path": "src/palimpzest/query/optimizer/__init__.py",
    "content": "from palimpzest.query.optimizer.rules import AddContextsBeforeComputeRule as _AddContextsBeforeComputeRule\nfrom palimpzest.query.optimizer.rules import (\n    AggregateRule as _AggregateRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    BasicSubstitutionRule as _BasicSubstitutionRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    CritiqueAndRefineRule as _CritiqueAndRefineRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    EmbeddingJoinRule as _EmbeddingJoinRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    ImplementationRule as _ImplementationRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    LLMConvertBondedRule as _LLMConvertBondedRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    LLMFilterRule as _LLMFilterRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    MixtureOfAgentsRule as _MixtureOfAgentsRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    NestedLoopsJoinRule as _NestedLoopsJoinRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    NonLLMConvertRule as _NonLLMConvertRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    NonLLMFilterRule as _NonLLMFilterRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    PushDownFilter as _PushDownFilter,\n)\nfrom palimpzest.query.optimizer.rules import (\n    RAGRule as _RAGRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    RelationalJoinRule as _RelationalJoinRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    ReorderConverts as _ReorderConverts,\n)\nfrom palimpzest.query.optimizer.rules import (\n    Rule as _Rule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    SemanticAggregateRule as _SemanticAggregateRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    SplitRule as _SplitRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    TopKRule as _TopKRule,\n)\nfrom palimpzest.query.optimizer.rules import (\n    TransformationRule as 
_TransformationRule,\n)\n\nALL_RULES = [\n    _AddContextsBeforeComputeRule,\n    _AggregateRule,\n    _BasicSubstitutionRule,\n    _CritiqueAndRefineRule,\n    _EmbeddingJoinRule,\n    _ImplementationRule,\n    _LLMConvertBondedRule,\n    _LLMFilterRule,\n    _NestedLoopsJoinRule,\n    _MixtureOfAgentsRule,\n    _NonLLMConvertRule,\n    _NonLLMFilterRule,\n    _PushDownFilter,\n    _RAGRule,\n    _RelationalJoinRule,\n    _ReorderConverts,\n    _TopKRule,\n    _Rule,\n    _SemanticAggregateRule,\n    _SplitRule,\n    _TransformationRule,\n]\n\nIMPLEMENTATION_RULES = [\n    rule\n    for rule in ALL_RULES\n    if issubclass(rule, _ImplementationRule)\n    and rule not in [_ImplementationRule]\n]\n\nTRANSFORMATION_RULES = [\n    rule for rule in ALL_RULES if issubclass(rule, _TransformationRule) and rule not in [_TransformationRule]\n]\n\n__all__ = [\n    \"ALL_RULES\",\n    \"IMPLEMENTATION_RULES\",\n    \"TRANSFORMATION_RULES\",\n]\n"
  },
  {
    "path": "src/palimpzest/query/optimizer/cost_model.py",
    "content": "from __future__ import annotations\n\nimport logging\nimport warnings\n\nimport pandas as pd\n\nfrom palimpzest.constants import NAIVE_BYTES_PER_RECORD\nfrom palimpzest.core.models import OperatorCostEstimates, PlanCost, SentinelPlanStats\nfrom palimpzest.query.operators.join import JoinOp\nfrom palimpzest.query.operators.physical import PhysicalOperator\nfrom palimpzest.query.operators.scan import ContextScanOp, MarshalAndScanDataOp, ScanPhysicalOp\n\nwarnings.simplefilter(action='ignore', category=UserWarning)\n\nlogger = logging.getLogger(__name__)\n\nclass BaseCostModel:\n    \"\"\"\n    This base class contains the interface/abstraction that every CostModel must implement\n    in order to work with the Optimizer. In brief, the Optimizer expects the CostModel to\n    make a prediction about the runtime, cost, and quality of a physical operator.\n    \"\"\"\n    def __init__(self):\n        \"\"\"\n        CostModel constructor; the arguments for individual CostModels may vary depending\n        on the assumptions they make about the prevalance of historical execution data\n        and online vs. batch execution settings.\n        \"\"\"\n        pass\n\n    def get_costed_full_op_ids(self) -> set[str]:\n        \"\"\"\n        Return the set of full op ids which the cost model has cost estimates for.\n        \"\"\"\n        raise NotImplementedError(\"Calling get_costed_full_op_ids from abstract method\")\n\n    def __call__(self, operator: PhysicalOperator) -> PlanCost:\n        \"\"\"\n        The interface exposed by the CostModel to the Optimizer. 
Subclasses may require\n        additional arguments in order to make their predictions.\n        \"\"\"\n        raise NotImplementedError(\"Calling __call__ from abstract method\")\n\n\nclass SampleBasedCostModel:\n    \"\"\"\n    \"\"\"\n    def __init__(\n        self,\n        sentinel_plan_stats: SentinelPlanStats | None = None,\n        verbose: bool = False,\n        exp_name: str | None = None,\n    ):\n        # store verbose argument\n        self.verbose = verbose\n\n        # store experiment name if one is provided\n        self.exp_name = exp_name\n\n        # construct cost, time, quality, and selectivity matrices for each operator set;\n        self.operator_to_stats = self._compute_operator_stats(sentinel_plan_stats)\n        self.costed_full_op_ids = None if self.operator_to_stats is None else set([\n            full_op_id\n            for _, full_op_id_to_stats in self.operator_to_stats.items()\n            for full_op_id in full_op_id_to_stats\n        ])\n\n        # if there is a logical operator with no samples; add all of its op ids to costed_full_op_ids;\n        # this will lead to the cost model applying the naive cost estimates for all physical op ids\n        # in this logical operator (I think?)\n        # TODO\n\n        logger.info(f\"Initialized SampleBasedCostModel with verbose={self.verbose}\")\n        logger.debug(f\"Initialized SampleBasedCostModel with params: {self.__dict__}\")\n\n    def get_costed_full_op_ids(self):\n        return self.costed_full_op_ids\n\n    def _compute_operator_stats(self, sentinel_plan_stats: SentinelPlanStats | None) -> dict:\n        logger.debug(\"Computing operator statistics\")\n        # if no stats are provided, simply return None\n        if sentinel_plan_stats is None:\n            return None\n\n        # flatten the nested dictionary of execution data and pull out fields relevant to cost estimation\n        execution_record_op_stats = []\n        for unique_logical_op_id, 
full_op_id_to_op_stats in sentinel_plan_stats.operator_stats.items():\n            logger.debug(f\"Computing operator statistics for logical_op_id: {unique_logical_op_id}\")\n            # flatten the execution data into a list of RecordOpStats\n            op_set_execution_data = [\n                record_op_stats\n                for _, op_stats in full_op_id_to_op_stats.items()\n                for record_op_stats in op_stats.record_op_stats_lst\n            ]\n\n            # add entries from execution data into matrices\n            for record_op_stats in op_set_execution_data:\n                record_op_stats_dict = {\n                    \"unique_logical_op_id\": unique_logical_op_id,\n                    \"full_op_id\": record_op_stats.full_op_id,\n                    \"record_id\": record_op_stats.record_id,\n                    \"record_parent_ids\": record_op_stats.record_parent_ids,\n                    \"cost_per_record\": record_op_stats.cost_per_record,\n                    \"time_per_record\": record_op_stats.time_per_record,\n                    \"quality\": record_op_stats.quality,\n                    \"passed_operator\": record_op_stats.passed_operator,\n                    \"source_indices\": record_op_stats.record_source_indices,\n                    \"op_details\": record_op_stats.op_details,\n                    \"answer\": record_op_stats.answer,\n                    \"op_name\": record_op_stats.op_name,\n                }\n                execution_record_op_stats.append(record_op_stats_dict)\n\n        # convert flattened execution data into dataframe\n        operator_stats_df = pd.DataFrame(execution_record_op_stats)\n\n        # for each full_op_id, compute its average cost_per_record, time_per_record, selectivity, and quality\n        operator_to_stats = {}\n        for unique_logical_op_id, logical_op_df in operator_stats_df.groupby(\"unique_logical_op_id\"):\n            logger.debug(f\"Computing operator statistics for 
unique_logical_op_id: {unique_logical_op_id}\")\n            operator_to_stats[unique_logical_op_id] = {}\n\n            for full_op_id, physical_op_df in logical_op_df.groupby(\"full_op_id\"):\n                # compute the number of input records processed by this operator; use source_indices for scan operator(s)\n                num_source_records = (\n                    physical_op_df.record_parent_ids.apply(tuple).nunique()\n                    if not physical_op_df.record_parent_ids.isna().all()\n                    else physical_op_df.source_indices.apply(tuple).nunique()\n                )\n\n                # compute selectivity; for filters this may be exactly 1.0 on small samples, so we\n                # always use a value slightly less than 1.0 to ensure that filters are pushed down when possible\n                selectivity = physical_op_df.passed_operator.sum() / num_source_records\n                op_name = physical_op_df.op_name.iloc[0].lower()\n                if selectivity == 1.0 and \"filter\" in op_name:\n                    selectivity -= 1e-3\n\n                # compute quality; if all qualities are None then this will be NaN\n                quality = physical_op_df.quality.mean()\n\n                # set operator stats for this physical operator\n                operator_to_stats[unique_logical_op_id][full_op_id] = {\n                    \"cost\": physical_op_df.cost_per_record.mean(),\n                    \"time\": physical_op_df.time_per_record.mean(),\n                    \"quality\": 1.0 if pd.isna(quality) else quality,\n                    \"selectivity\": selectivity,\n                }\n\n        logger.debug(f\"Done computing operator statistics for {len(operator_to_stats)} operators!\")\n        return operator_to_stats\n\n    def _compute_naive_plan_cost(self, operator: PhysicalOperator, source_op_estimates: OperatorCostEstimates | None = None, right_source_op_estimates: OperatorCostEstimates | None = None) -> PlanCost:\n        # get 
identifier for operator which is unique within sentinel plan but consistent across sentinels\n        full_op_id = operator.get_full_op_id()\n        logger.debug(f\"Calling __call__ for {str(operator)} with full_op_id: {full_op_id}\")\n\n        # initialize estimates of operator metrics based on naive (but sometimes precise) logic\n        if isinstance(operator, MarshalAndScanDataOp):\n            # get handle to scan operator and pre-compute its size (number of records)\n            datasource_len = len(operator.datasource)\n\n            source_op_estimates = OperatorCostEstimates(\n                cardinality=datasource_len,\n                time_per_record=0.0,\n                cost_per_record=0.0,\n                quality=1.0,\n            )\n\n            op_estimates = operator.naive_cost_estimates(source_op_estimates, input_record_size_in_bytes=NAIVE_BYTES_PER_RECORD)\n\n        elif isinstance(operator, ContextScanOp):\n            source_op_estimates = OperatorCostEstimates(\n                cardinality=1.0,\n                time_per_record=0.0,\n                cost_per_record=0.0,\n                quality=1.0,\n            )\n\n            op_estimates = operator.naive_cost_estimates(source_op_estimates)\n\n        elif isinstance(operator, JoinOp):\n            op_estimates = operator.naive_cost_estimates(source_op_estimates, right_source_op_estimates)\n\n        else:\n            op_estimates = operator.naive_cost_estimates(source_op_estimates)\n\n        # compute estimates for this operator\n        est_input_cardinality = (\n            source_op_estimates.cardinality * right_source_op_estimates.cardinality\n            if isinstance(operator, JoinOp)\n            else source_op_estimates.cardinality\n        )\n        op_time = op_estimates.time_per_record * est_input_cardinality\n        op_cost = op_estimates.cost_per_record * est_input_cardinality\n        op_quality = op_estimates.quality\n\n        # create and return PlanCost object for 
this op's statistics\n        op_plan_cost = PlanCost(\n            cost=op_cost,\n            time=op_time,\n            quality=op_quality,\n            op_estimates=op_estimates,\n        )\n        logger.debug(f\"Done calling __call__ for {str(operator)} with full_op_id: {full_op_id}\")\n        logger.debug(f\"Plan cost: {op_plan_cost}\")\n\n        return op_plan_cost\n\n    def __call__(self, operator: PhysicalOperator, source_op_estimates: OperatorCostEstimates | None = None, right_source_op_estimates: OperatorCostEstimates | None = None) -> PlanCost:\n        # for non-sentinel execution, we use naive estimates\n        full_op_id = operator.get_full_op_id()\n        unique_logical_op_id = operator.unique_logical_op_id\n        if self.operator_to_stats is None or unique_logical_op_id not in self.operator_to_stats:\n            return self._compute_naive_plan_cost(operator, source_op_estimates, right_source_op_estimates)\n\n        # NOTE: some physical operators may not have any sample execution data in this cost model;\n        #       these physical operators are filtered out of the Optimizer, thus we can assume that\n        #       we will have execution data for each operator passed into __call__; nevertheless, we\n        #       still perform a sanity check\n        # look up physical and logical op ids associated with this physical operator\n        physical_op_to_stats = self.operator_to_stats.get(unique_logical_op_id)\n        assert physical_op_to_stats is not None, f\"No execution data for logical operator: {str(operator)}\"\n        assert physical_op_to_stats.get(full_op_id) is not None, f\"No execution data for physical operator: {str(operator)}\"\n        logger.debug(f\"Calling __call__ for {str(operator)}\")\n\n        # look up stats for this operation\n        est_cost_per_record = self.operator_to_stats[unique_logical_op_id][full_op_id][\"cost\"]\n        est_time_per_record = 
self.operator_to_stats[unique_logical_op_id][full_op_id][\"time\"]\n        est_quality = self.operator_to_stats[unique_logical_op_id][full_op_id][\"quality\"]\n        est_selectivity = self.operator_to_stats[unique_logical_op_id][full_op_id][\"selectivity\"]\n\n        # create source_op_estimates for scan operators if they are not provided\n        if isinstance(operator, ScanPhysicalOp):\n            # get handle to scan operator and pre-compute its size (number of records)\n            datasource_len = len(operator.datasource)\n\n            source_op_estimates = OperatorCostEstimates(\n                cardinality=datasource_len,\n                time_per_record=0.0,\n                cost_per_record=0.0,\n                quality=1.0,\n            )\n\n        # generate new set of OperatorCostEstimates\n        est_input_cardinality = (\n            source_op_estimates.cardinality * right_source_op_estimates.cardinality\n            if isinstance(operator, JoinOp)\n            else source_op_estimates.cardinality\n        )\n        op_estimates = OperatorCostEstimates(\n            cardinality=est_selectivity * est_input_cardinality,\n            time_per_record=est_time_per_record,\n            cost_per_record=est_cost_per_record,\n            quality=est_quality,\n        )\n\n        # compute estimates for this operator\n        op_time = op_estimates.time_per_record * est_input_cardinality\n        op_cost = op_estimates.cost_per_record * est_input_cardinality\n        op_quality = op_estimates.quality\n\n        # construct and return op estimates\n        plan_cost = PlanCost(cost=op_cost, time=op_time, quality=op_quality, op_estimates=op_estimates)\n        logger.debug(f\"Done calling __call__ for {str(operator)}\")\n        logger.debug(f\"Plan cost: {plan_cost}\")\n        return plan_cost\n"
  },
  {
    "path": "src/palimpzest/query/optimizer/optimizer.py",
    "content": "from __future__ import annotations\n\nimport logging\nfrom copy import deepcopy\n\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.data.dataset import Dataset\nfrom palimpzest.core.lib.schemas import get_schema_field_names\nfrom palimpzest.policy import Policy\nfrom palimpzest.query.execution.execution_strategy_type import ExecutionStrategyType\nfrom palimpzest.query.operators.logical import (\n    ComputeOperator,\n    ConvertScan,\n    Distinct,\n    FilteredScan,\n    JoinOp,\n    LimitScan,\n    Project,\n    SearchOperator,\n)\nfrom palimpzest.query.optimizer import (\n    IMPLEMENTATION_RULES,\n    TRANSFORMATION_RULES,\n)\nfrom palimpzest.query.optimizer.cost_model import BaseCostModel, SampleBasedCostModel\nfrom palimpzest.query.optimizer.optimizer_strategy_type import OptimizationStrategyType\nfrom palimpzest.query.optimizer.plan import PhysicalPlan\nfrom palimpzest.query.optimizer.primitives import Group, LogicalExpression\nfrom palimpzest.query.optimizer.rules import (\n    CritiqueAndRefineRule,\n    LLMConvertBondedRule,\n    MixtureOfAgentsRule,\n    RAGRule,\n    SplitRule,\n)\nfrom palimpzest.query.optimizer.tasks import (\n    ApplyRule,\n    ExploreGroup,\n    OptimizeGroup,\n    OptimizeLogicalExpression,\n    OptimizePhysicalExpression,\n)\n\nlogger = logging.getLogger(__name__)\n\n\nclass Optimizer:\n    \"\"\"\n    The optimizer is responsible for searching the space of possible physical plans\n    for a user's initial (logical) plan and selecting the one which is closest to\n    optimizing the user's policy objective.\n\n    This optimizer is modeled after the Cascades framework for top-down query optimization:\n    - Thesis describing Cascades implementation (Chapters 1-3):\n      https://15721.courses.cs.cmu.edu/spring2023/papers/17-optimizer2/xu-columbia-thesis1998.pdf\n\n    - Andy Pavlo lecture with walkthrough example: https://www.youtube.com/watch?v=PXS49-tFLcI\n\n   
 - Original Paper: https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/2015/Papers/Cascades-graefe.pdf\n    \"\"\"\n\n    def __init__(\n        self,\n        policy: Policy,\n        cost_model: BaseCostModel,\n        available_models: list[Model],\n        join_parallelism: int = 64,\n        reasoning_effort: str = \"default\",\n        verbose: bool = False,\n        allow_bonded_query: bool = True,\n        allow_rag_reduction: bool = False,\n        allow_mixtures: bool = True,\n        allow_critic: bool = False,\n        allow_split_merge: bool = False,\n        optimizer_strategy: OptimizationStrategyType = OptimizationStrategyType.PARETO,\n        execution_strategy: ExecutionStrategyType = ExecutionStrategyType.PARALLEL,\n        use_final_op_quality: bool = False,\n        **kwargs,\n    ):\n        # store the policy\n        self.policy = policy\n\n        # store the cost model\n        self.cost_model = cost_model\n\n        # mapping from each group id to its Group object\n        self.groups = {}\n\n        # mapping from each expression to its Expression object\n        self.expressions = {}\n\n        # the stack of tasks to perform during optimization\n        self.tasks_stack = []\n\n        # the lists of implementation and transformation rules that the optimizer can apply\n        self.implementation_rules = IMPLEMENTATION_RULES\n        self.transformation_rules = TRANSFORMATION_RULES\n\n        # get the strategy class associated with the optimizer strategy\n        optimizer_strategy_cls = optimizer_strategy.value\n        self.strategy = optimizer_strategy_cls()\n\n        # remove transformation rules for optimization strategies which do not require them\n        if optimizer_strategy.no_transformation():\n            self.transformation_rules = []\n\n        # if we are not performing optimization, set available models to be single model\n        # and remove all optimizations (except for bonded queries)\n        if 
optimizer_strategy == OptimizationStrategyType.NONE:\n            # NOTE: override the local arguments (not the attributes), because the attributes\n            #       are assigned from these arguments immediately below\n            allow_bonded_query = True\n            allow_rag_reduction = False\n            allow_mixtures = False\n            allow_critic = False\n            allow_split_merge = False\n            available_models = [available_models[0]]\n\n        # store optimization hyperparameters\n        self.verbose = verbose\n        self.available_models = available_models\n        self.join_parallelism = join_parallelism\n        self.reasoning_effort = reasoning_effort\n        self.allow_bonded_query = allow_bonded_query\n        self.allow_rag_reduction = allow_rag_reduction\n        self.allow_mixtures = allow_mixtures\n        self.allow_critic = allow_critic\n        self.allow_split_merge = allow_split_merge\n        self.optimizer_strategy = optimizer_strategy\n        self.execution_strategy = execution_strategy\n        self.use_final_op_quality = use_final_op_quality\n\n        # prune implementation rules based on boolean flags\n        if not self.allow_bonded_query:\n            self.implementation_rules = [\n                rule\n                for rule in self.implementation_rules\n                if rule not in [LLMConvertBondedRule]\n            ]\n\n        if not self.allow_rag_reduction:\n            self.implementation_rules = [\n                rule for rule in self.implementation_rules if not issubclass(rule, RAGRule)\n            ]\n\n        if not self.allow_mixtures:\n            self.implementation_rules = [\n                rule for rule in self.implementation_rules if not issubclass(rule, MixtureOfAgentsRule)\n            ]\n\n        if not self.allow_critic:\n            self.implementation_rules = [\n                rule for rule in self.implementation_rules if not issubclass(rule, CritiqueAndRefineRule)\n            ]\n\n        if not self.allow_split_merge:\n            self.implementation_rules = [\n                rule for rule in 
self.implementation_rules if not issubclass(rule, SplitRule)\n            ]\n\n        logger.info(f\"Initialized Optimizer with verbose={self.verbose}\")\n        logger.debug(f\"Initialized Optimizer with params: {self.__dict__}\")\n\n    def update_cost_model(self, cost_model: BaseCostModel):\n        self.cost_model = cost_model\n\n    def get_physical_op_params(self):\n        return {\n            \"verbose\": self.verbose,\n            \"available_models\": self.available_models,\n            \"join_parallelism\": self.join_parallelism,\n            \"reasoning_effort\": self.reasoning_effort,\n            \"is_validation\": self.optimizer_strategy == OptimizationStrategyType.SENTINEL,\n        }\n\n    def deepcopy_clean(self):\n        optimizer = Optimizer(\n            policy=self.policy,\n            cost_model=SampleBasedCostModel(),\n            verbose=self.verbose,\n            available_models=self.available_models,\n            join_parallelism=self.join_parallelism,\n            reasoning_effort=self.reasoning_effort,\n            allow_bonded_query=self.allow_bonded_query,\n            allow_rag_reduction=self.allow_rag_reduction,\n            allow_mixtures=self.allow_mixtures,\n            allow_critic=self.allow_critic,\n            allow_split_merge=self.allow_split_merge,\n            optimizer_strategy=self.optimizer_strategy,\n            execution_strategy=self.execution_strategy,\n            use_final_op_quality=self.use_final_op_quality,\n        )\n        return optimizer\n\n    def update_strategy(self, optimizer_strategy: OptimizationStrategyType):\n        # set the optimizer_strategy\n        self.optimizer_strategy = optimizer_strategy\n\n        # get the strategy class associated with the optimizer strategy\n        optimizer_strategy_cls = optimizer_strategy.value\n        self.strategy = optimizer_strategy_cls()\n\n        # remove transformation rules for optimization strategies which do not require them\n        if 
optimizer_strategy.no_transformation():\n            self.transformation_rules = []\n\n    def construct_group_tree(self, dataset: Dataset) -> tuple[int, dict[str, FieldInfo], dict[str, set[str]]]:\n        logger.debug(f\"Constructing group tree for dataset: {dataset}\")\n        ### convert node --> Group ###\n        # create the op for the given node\n        op = dataset._operator\n\n        # compute the input group id(s) and field(s) for this node\n        if len(dataset._sources) == 0:\n            input_group_ids, input_group_fields, input_group_properties = ([], {}, {})\n        elif len(dataset._sources) == 1:\n            input_group_id, input_group_fields, input_group_properties = self.construct_group_tree(dataset._sources[0])\n            input_group_ids = [input_group_id]\n        elif len(dataset._sources) == 2:\n            left_input_group_id, left_input_group_fields, left_input_group_properties = self.construct_group_tree(dataset._sources[0])\n            right_input_group_id, right_input_group_fields, right_input_group_properties = self.construct_group_tree(dataset._sources[1])\n            input_group_ids = [left_input_group_id, right_input_group_id]\n            input_group_fields = {**left_input_group_fields, **right_input_group_fields}\n            input_group_properties = deepcopy(left_input_group_properties)\n            for k, v in right_input_group_properties.items():\n                if k in input_group_properties:\n                    input_group_properties[k].update(v)\n                else:\n                    input_group_properties[k] = deepcopy(v)\n        else:\n            raise NotImplementedError(\"Constructing group trees for datasets with more than 2 sources is not supported.\")\n\n        # compute the fields added by this operation and all fields\n        input_group_short_field_names = list(\n            map(lambda full_field: full_field.split(\".\")[-1], input_group_fields.keys())\n        )\n        new_fields = {\n     
       field_name: op.output_schema.model_fields[field_name.split(\".\")[-1]]\n            for field_name in get_schema_field_names(op.output_schema, id=dataset.id)\n            if (field_name not in input_group_short_field_names) or (hasattr(op, \"udf\") and op.udf is not None)\n        }\n        all_fields = {**input_group_fields, **new_fields}\n\n        # compute the set of (short) field names this operation depends on\n        depends_on_field_names = (\n            set() if dataset.is_root else {field_name.split(\".\")[-1] for field_name in op.depends_on}\n        )\n\n        # NOTE: group_id is computed as the unique (sorted) set of fields and properties;\n        #       if an operation does not modify the fields (or modifies them in a way that\n        #       can create a field set identical to that of an earlier group) then we must add an\n        #       id from the operator to disambiguate the two groups.\n        # compute all properties, including this operation's\n        all_properties = deepcopy(input_group_properties)\n        if isinstance(op, ConvertScan) and sorted(op.input_schema.model_fields.keys()) == sorted(op.output_schema.model_fields.keys()):\n            model_fields_dict = {\n                k: {\"annotation\": v.annotation, \"default\": v.default, \"description\": v.description}\n                for k, v in op.output_schema.model_fields.items()\n            }\n            if \"maps\" in all_properties:\n                all_properties[\"maps\"].add(model_fields_dict)\n            else:\n                all_properties[\"maps\"] = set([model_fields_dict])\n\n        elif isinstance(op, FilteredScan):\n            # NOTE: we could use op.get_full_op_id() here, but storing filter strings makes\n            #       debugging a bit easier as you can read which filters are in the Group\n            op_filter_str = op.filter.get_filter_str()\n            if \"filters\" in all_properties:\n                all_properties[\"filters\"].add(op_filter_str)\n 
           else:\n                all_properties[\"filters\"] = set([op_filter_str])\n\n        elif isinstance(op, JoinOp):\n            unique_join_str = str(sorted(op.on)) if op.condition is None else op.condition\n            if \"joins\" in all_properties:\n                all_properties[\"joins\"].add(unique_join_str)\n            else:\n                all_properties[\"joins\"] = set([unique_join_str])\n\n        elif isinstance(op, LimitScan):\n            op_limit_str = op.get_logical_op_id()\n            if \"limits\" in all_properties:\n                all_properties[\"limits\"].add(op_limit_str)\n            else:\n                all_properties[\"limits\"] = set([op_limit_str])\n\n        elif isinstance(op, Project):\n            op_project_str = op.get_logical_op_id()\n            if \"projects\" in all_properties:\n                all_properties[\"projects\"].add(op_project_str)\n            else:\n                all_properties[\"projects\"] = set([op_project_str])\n\n        elif isinstance(op, Distinct):\n            op_distinct_str = op.get_logical_op_id()\n            if \"distincts\" in all_properties:\n                all_properties[\"distincts\"].add(op_distinct_str)\n            else:\n                all_properties[\"distincts\"] = set([op_distinct_str])\n\n        # TODO: temporary fix; perhaps use op_ids to identify group?\n        elif isinstance(op, ComputeOperator):\n            op_instruction = op.instruction\n            if \"instructions\" in all_properties:\n                all_properties[\"instructions\"].add(op_instruction)\n            else:\n                all_properties[\"instructions\"] = set([op_instruction])\n\n        elif isinstance(op, SearchOperator):\n            op_search_query = op.search_query\n            if \"search_queries\" in all_properties:\n                all_properties[\"search_queries\"].add(op_search_query)\n            else:\n                all_properties[\"search_queries\"] = 
set([op_search_query])\n\n        # construct the logical expression and group\n        logical_expression = LogicalExpression(\n            operator=op,\n            input_group_ids=input_group_ids,\n            input_fields=input_group_fields,\n            depends_on_field_names=depends_on_field_names,\n            generated_fields=new_fields,\n            group_id=None,\n        )\n        group = Group(\n            logical_expressions=[logical_expression],\n            fields=all_fields,\n            properties=all_properties,\n        )\n        logical_expression.set_group_id(group.group_id)\n\n        # add the expression and group to the optimizer's expressions and groups and return\n        self.expressions[logical_expression.expr_id] = logical_expression\n        self.groups[group.group_id] = group\n        logger.debug(f\"Constructed group tree for dataset: {dataset}\")\n        logger.debug(f\"Group: {group.group_id}, {all_fields}, {all_properties}\")\n\n        return group.group_id, all_fields, all_properties\n\n    def convert_query_plan_to_group_tree(self, dataset: Dataset) -> str:\n        logger.debug(f\"Converting query plan to group tree for dataset: {dataset}\")\n\n        # compute depends_on field for every node\n        short_to_full_field_name = {}\n        for node in dataset:\n            # update mapping from short to full field names\n            short_field_names = get_schema_field_names(node.schema)\n            full_field_names = get_schema_field_names(node.schema, id=node.id)\n            for short_field_name, full_field_name in zip(short_field_names, full_field_names):\n                # set mapping automatically if this is a new field\n                if short_field_name not in short_to_full_field_name or (hasattr(node._operator, \"udf\") and node._operator.udf is not None):\n                    short_to_full_field_name[short_field_name] = full_field_name\n\n            # if the node is a root Dataset, then skip\n            if 
node.is_root:\n                continue\n\n            # If the node already has depends_on specified, then resolve each field name to a full (unique) field name\n            if len(node._operator.depends_on) > 0:\n                node._operator.depends_on = list(map(lambda field: short_to_full_field_name[field], node._operator.depends_on))\n                continue\n\n            # otherwise, make the node depend on all upstream nodes\n            node._operator.depends_on = set()\n            upstream_nodes = node.get_upstream_datasets()\n            for upstream_node in upstream_nodes:\n                upstream_field_names = get_schema_field_names(upstream_node.schema, id=upstream_node.id)\n                node._operator.depends_on.update(upstream_field_names)\n            node._operator.depends_on = list(node._operator.depends_on)\n\n        # construct tree of groups\n        final_group_id, _, _ = self.construct_group_tree(dataset)\n\n        logger.debug(f\"Converted query plan to group tree for dataset: {dataset}\")\n        logger.debug(f\"Final group id: {final_group_id}\")\n\n        return final_group_id\n\n    def heuristic_optimization(self, group_id: int) -> None:\n        \"\"\"\n        Apply universally desirable transformations (e.g. 
filter/projection push-down).\n        \"\"\"\n        pass\n\n    def search_optimization_space(self, group_id: int) -> None:\n        logger.debug(f\"Searching optimization space for group_id: {group_id}\")\n\n        # begin the search for an optimal plan with a task to optimize the final group\n        initial_task = OptimizeGroup(group_id)\n        self.tasks_stack.append(initial_task)\n\n        # TODO: conditionally stop when X number of tasks have been executed to limit exhaustive search\n        while len(self.tasks_stack) > 0:\n            task = self.tasks_stack.pop(-1)\n\n            new_tasks = []\n            if isinstance(task, (OptimizeGroup, ExploreGroup)):\n                new_tasks = task.perform(self.groups)\n            elif isinstance(task, OptimizeLogicalExpression):\n                new_tasks = task.perform(self.transformation_rules, self.implementation_rules)\n            elif isinstance(task, ApplyRule):\n                context = {\"costed_full_op_ids\": self.cost_model.get_costed_full_op_ids()}\n                new_tasks = task.perform(\n                    self.groups, self.expressions, context=context, **self.get_physical_op_params(),\n                )\n            elif isinstance(task, OptimizePhysicalExpression):\n                context = {\"optimizer_strategy\": self.optimizer_strategy, \"execution_strategy\": self.execution_strategy}\n                new_tasks = task.perform(self.cost_model, self.groups, self.policy, context=context)\n\n            self.tasks_stack.extend(new_tasks)\n\n        logger.debug(f\"Done searching optimization space for group_id: {group_id}\")\n\n    def optimize(self, dataset: Dataset) -> list[PhysicalPlan]:\n        \"\"\"\n        The optimize function takes in an initial query plan and searches the space of\n        logical and physical plans in order to cost and produce a (near) optimal physical plan.\n        \"\"\"\n        logger.info(f\"Optimizing query plan: {dataset}\")\n        # compute the 
initial group tree for the user plan\n        dataset_copy = dataset.copy()\n        final_group_id = self.convert_query_plan_to_group_tree(dataset_copy)\n\n        # TODO\n        # # do heuristic based pre-optimization\n        # self.heuristic_optimization(final_group_id)\n\n        # search the optimization space by applying logical and physical transformations to the initial group tree\n        self.search_optimization_space(final_group_id)\n        logger.info(f\"Getting optimal plans for final group id: {final_group_id}\")\n\n        return self.strategy.get_optimal_plans(self.groups, final_group_id, self.policy, self.use_final_op_quality)\n"
  },
  {
    "path": "src/palimpzest/query/optimizer/optimizer_strategy.py",
    "content": "from __future__ import annotations\n\nimport logging\nfrom abc import ABC, abstractmethod\n\nfrom palimpzest.policy import Policy\nfrom palimpzest.query.optimizer.plan import PhysicalPlan, SentinelPlan\nfrom palimpzest.query.optimizer.primitives import Group\n\nlogger = logging.getLogger(__name__)\n\n\nclass OptimizationStrategy(ABC):\n    @abstractmethod\n    def get_optimal_plans(self, groups: dict, final_group_id: int, policy: Policy, use_final_op_quality: bool) -> list[PhysicalPlan] | list[SentinelPlan]:\n        \"\"\"Strategy decides how to search through the groups for optimal plan(s)\"\"\"\n        pass\n\n\nclass GreedyStrategy(OptimizationStrategy):\n    def _get_greedy_physical_plan(self, groups: dict, group_id: int) -> PhysicalPlan:\n        \"\"\"\n        Return the best plan with respect to the user provided policy.\n        \"\"\"\n        # get the best physical expression for this group\n        best_phys_expr = groups[group_id].best_physical_expression\n\n        # if this expression has no inputs (i.e. it is a BaseScan), create and return the physical plan\n        best_plan = None\n        if len(best_phys_expr.input_group_ids) == 0:\n            best_plan = PhysicalPlan(best_phys_expr.operator, subplans=None, plan_cost=best_phys_expr.plan_cost)\n\n        # otherwise, if this expression is not a join (i.e. it has one input)\n        elif len(best_phys_expr.input_group_ids) == 1:\n            # get the best physical plan for this group's input\n            input_group_id = best_phys_expr.input_group_ids[0]\n            input_best_phys_plan = self._get_greedy_physical_plan(groups, input_group_id)\n\n            # add this operator to best physical plan and return\n            best_plan = PhysicalPlan(best_phys_expr.operator, subplans=[input_best_phys_plan], plan_cost=best_phys_expr.plan_cost)\n\n        # otherwise, this expression is a join (i.e. 
it has two inputs)\n        elif len(best_phys_expr.input_group_ids) == 2:\n            left_input_group_id, right_input_group_id = best_phys_expr.input_group_ids\n\n            # get the best physical plan for the left input\n            left_best_phys_plan = self._get_greedy_physical_plan(groups, left_input_group_id)\n\n            # get the best physical plan for the right input\n            right_best_phys_plan = self._get_greedy_physical_plan(groups, right_input_group_id)\n\n            # add this operator to best physical plan and return\n            best_plan = PhysicalPlan(best_phys_expr.operator, subplans=[left_best_phys_plan, right_best_phys_plan], plan_cost=best_phys_expr.plan_cost)\n\n        # add this operator to best physical plan and return\n        return best_plan\n\n    def get_optimal_plans(self, groups: dict, final_group_id: int, policy: Policy, use_final_op_quality: bool) -> list[PhysicalPlan]:\n        logger.info(f\"Getting greedy optimal plans for final group id: {final_group_id}\")\n        plans = [self._get_greedy_physical_plan(groups, final_group_id)]\n        logger.info(f\"Done getting greedy optimal plans for final group id: {final_group_id}\")\n\n        return plans\n\n\nclass ParetoStrategy(OptimizationStrategy):\n    def _get_candidate_pareto_physical_plans(self, groups: dict, group_id: int, policy: Policy) -> list[PhysicalPlan]:\n        \"\"\"\n        Return a list of plans which will contain all of the pareto optimal plans (and some additional\n        plans which may not be pareto optimal).\n\n        TODO: can we cache group_id --> final_pareto_optimal_plans to avoid re-computing upstream\n        groups' pareto-optimal plans for each expression?\n        \"\"\"\n        # get the pareto optimal physical expressions for this group\n        pareto_optimal_phys_exprs = groups[group_id].pareto_optimal_physical_expressions\n\n        # construct list of pareto optimal plans\n        pareto_optimal_plans = []\n        for 
phys_expr in pareto_optimal_phys_exprs:\n            # if this expression has no inputs (i.e. it is a BaseScan), create and return the physical plan\n            if len(phys_expr.input_group_ids) == 0:\n                for plan_cost, _ in phys_expr.pareto_optimal_plan_costs:\n                    plan = PhysicalPlan(phys_expr.operator, subplans=None, plan_cost=plan_cost)\n                    pareto_optimal_plans.append(plan)\n\n            # otherwise, if this expression is not a join (i.e. it has one input)\n            elif len(phys_expr.input_group_ids) == 1:\n                # get the pareto optimal physical plan(s) for this group's inputs\n                input_group_id = phys_expr.input_group_ids[0]\n                pareto_optimal_phys_subplans = self._get_candidate_pareto_physical_plans(groups, input_group_id, policy)\n\n                # iterate over the input subplans and find the one(s) which combine with this physical expression\n                # to make a pareto-optimal plan\n                for plan_cost, (input_plan_cost, _) in phys_expr.pareto_optimal_plan_costs:\n                    for subplan in pareto_optimal_phys_subplans:\n                        if subplan.plan_cost == input_plan_cost:\n                            plan = PhysicalPlan(phys_expr.operator, subplans=[subplan], plan_cost=plan_cost)\n                            pareto_optimal_plans.append(plan)\n\n            # otherwise, this expression is a join (i.e. 
it has two inputs)\n            elif len(phys_expr.input_group_ids) == 2:\n                left_input_group_id, right_input_group_id = phys_expr.input_group_ids\n                pareto_optimal_left_subplans = self._get_candidate_pareto_physical_plans(groups, left_input_group_id, policy)\n                pareto_optimal_right_subplans = self._get_candidate_pareto_physical_plans(groups, right_input_group_id, policy)\n\n                # iterate over the input subplans and find the one(s) which combine with this physical expression\n                # to make a pareto-optimal plan\n                for plan_cost, (left_input_plan_cost, right_input_plan_cost) in phys_expr.pareto_optimal_plan_costs:\n                    for left_subplan in pareto_optimal_left_subplans:\n                        if left_subplan.plan_cost == left_input_plan_cost:\n                            for right_subplan in pareto_optimal_right_subplans:\n                                if right_subplan.plan_cost == right_input_plan_cost:\n                                    plan = PhysicalPlan(phys_expr.operator, subplans=[left_subplan, right_subplan], plan_cost=plan_cost)\n                                    pareto_optimal_plans.append(plan)\n\n        return pareto_optimal_plans\n\n    def get_optimal_plans(self, groups: dict, final_group_id: int, policy: Policy, use_final_op_quality: bool) -> list[PhysicalPlan]:\n        logger.info(f\"Getting pareto optimal plans for final group id: {final_group_id}\")\n        # compute all of the pareto optimal physical plans\n        plans = self._get_candidate_pareto_physical_plans(groups, final_group_id, policy)\n\n        # adjust plans' plan_cost.quality to reflect only the quality of the final operator\n        if use_final_op_quality:\n            for plan in plans:\n                plan.plan_cost.quality = plan.plan_cost.op_estimates.quality\n\n        # filter pareto optimal plans for ones which satisfy policy constraint (if at least one of them does)\n   
     if any([policy.constraint(plan.plan_cost) for plan in plans]):\n            plans = [plan for plan in plans if policy.constraint(plan.plan_cost)]\n\n        # select the plan which is best for the given policy\n        optimal_plan, plans = plans[0], plans[1:]\n        for plan in plans:\n            optimal_plan = optimal_plan if policy.choose(optimal_plan.plan_cost, plan.plan_cost) else plan\n\n        plans = [optimal_plan]\n        logger.info(f\"Done getting pareto optimal plans for final group id: {final_group_id}\")\n        return plans\n    \n\nclass SentinelStrategy(OptimizationStrategy):\n    def _get_sentinel_plan(self, groups: dict[str, Group], group_id: int) -> SentinelPlan:\n        \"\"\"\n        Create and return a SentinelPlan object.\n\n        NOTE: this strategy is only used to construct a SentinelPlan before performing optimization.\n              Currently, we do not perform any transformation rules when building the groups which\n              are fed into this function. Thus, every physical expression will correspond to the same\n              logical operator and share the same logical_op_id. Eventually we will want to consider\n              multiple logical re-orderings of operators in our SentinelPlan, but for now it is static.\n        \"\"\"\n        # get all the physical expressions for this group as well as their logical_op_id\n        phys_exprs = groups[group_id].physical_expressions\n        phys_op_set = [expr.operator for expr in phys_exprs]\n\n        # if this expression has no inputs (i.e. 
it is a scan operator), create and return the sentinel plan\n        best_phys_expr = groups[group_id].best_physical_expression\n        if len(best_phys_expr.input_group_ids) == 0:\n            return SentinelPlan(operator_set=phys_op_set, subplans=None)\n\n        # get the subplans\n        subplans = []\n        for input_group_id in best_phys_expr.input_group_ids:\n            subplan = self._get_sentinel_plan(groups, input_group_id)\n            subplans.append(subplan)\n\n        # compose the current physical operator set with its subplans\n        return SentinelPlan(operator_set=phys_op_set, subplans=subplans)\n\n    def get_optimal_plans(self, groups: dict, final_group_id: int, policy: Policy, use_final_op_quality: bool) -> list[SentinelPlan]:\n        logger.info(f\"Getting sentinel optimal plans for final group id: {final_group_id}\")\n        plans = [self._get_sentinel_plan(groups, final_group_id)]\n        logger.info(f\"Done getting sentinel optimal plans for final group id: {final_group_id}\")\n        return plans\n\n\nclass NoOptimizationStrategy(GreedyStrategy):\n    \"\"\"\n    NoOptimizationStrategy is used to intentionally construct a PhysicalPlan without applying any\n    logical transformations or optimizations. It uses the same get_optimal_plans logic as\n    GreedyStrategy.\n    \"\"\"\n"
  },
  {
    "path": "src/palimpzest/query/optimizer/optimizer_strategy_type.py",
    "content": "from enum import Enum\n\nfrom palimpzest.query.optimizer.optimizer_strategy import (\n    GreedyStrategy,\n    NoOptimizationStrategy,\n    ParetoStrategy,\n    SentinelStrategy,\n)\n\n\nclass OptimizationStrategyType(Enum):\n    \"\"\"\n    OptimizationStrategyType determines which (set of) plan(s) the Optimizer\n    will return to the Execution layer.\n    \"\"\"\n    GREEDY = GreedyStrategy\n    PARETO = ParetoStrategy\n    SENTINEL = SentinelStrategy\n    NONE = NoOptimizationStrategy\n\n    def no_transformation(self) -> bool:\n        \"\"\"\n        Return True if this optimization strategy does not transform the logical plan.\n        \"\"\"\n        return self in [OptimizationStrategyType.SENTINEL, OptimizationStrategyType.NONE]\n\n    def is_pareto(self) -> bool:\n        \"\"\"\n        Return True if this optimization strategy uses Pareto optimization.\n        \"\"\"\n        return self == OptimizationStrategyType.PARETO\n\n    def is_not_pareto(self) -> bool:\n        \"\"\"\n        Return True if this optimization strategy does not use Pareto optimization.\n        \"\"\"\n        return not self.is_pareto()\n"
  },
  {
    "path": "src/palimpzest/query/optimizer/plan.py",
    "content": "from __future__ import annotations\n\nfrom abc import ABC, abstractmethod\n\nfrom palimpzest.core.models import PlanCost\nfrom palimpzest.query.operators.aggregate import AggregateOp\nfrom palimpzest.query.operators.join import JoinOp\nfrom palimpzest.query.operators.limit import LimitScanOp\nfrom palimpzest.query.operators.physical import PhysicalOperator\nfrom palimpzest.query.operators.scan import ContextScanOp, MarshalAndScanDataOp\nfrom palimpzest.utils.hash_helpers import hash_for_id\n\n\nclass Plan(ABC):\n    @abstractmethod\n    def compute_plan_id(self) -> str:\n        pass\n\n    @abstractmethod\n    def __eq__(self, other) -> bool:\n        pass\n\n    @abstractmethod\n    def __hash__(self) -> int:\n        pass\n\n    @abstractmethod\n    def __repr__(self) -> str:\n        pass\n\n    @abstractmethod\n    def __str__(self) -> str:\n        pass\n\n    @abstractmethod\n    def __getitem__(self, slice) -> tuple:\n        pass\n\n    @abstractmethod\n    def __iter__(self) -> iter:\n        pass\n\n    @abstractmethod\n    def __len__(self) -> int:\n        pass\n\nclass PhysicalPlan(Plan):\n    def __init__(self, operator: PhysicalOperator, subplans: list[PhysicalPlan] | None, plan_cost: PlanCost | None = None):\n        self.operator = operator\n        self.subplans = [] if subplans is None else subplans\n        self.plan_cost = plan_cost if plan_cost is not None else PlanCost(cost=0.0, time=0.0, quality=1.0)\n        self.plan_id = self.compute_plan_id()\n\n        # NOTE: unique full_op_id is constructed as \"{topological_index}-{full_op_id}\" to\n        # differentiate between multiple instances of the same physical operator e.g. 
in self-joins\n\n        # compute mapping from unique full_op_id to next unique full_op_id in the plan\n        self.unique_full_op_id_to_next_unique_full_op_and_id = {}\n        current_idx, _ = self._compute_next_unique_full_op_map(self.unique_full_op_id_to_next_unique_full_op_and_id)\n        self.unique_full_op_id_to_next_unique_full_op_and_id[f\"{current_idx}-{self.operator.get_full_op_id()}\"] = (None, None)\n\n        # compute mapping from unique full_op_id to upstream unique full_op_ids\n        self.unique_full_op_id_to_upstream_full_op_ids = {}\n        self._compute_upstream_unique_full_op_ids_map(self.unique_full_op_id_to_upstream_full_op_ids)\n\n        # compute mapping from unique full_op_id to source unique full_op_ids\n        self.unique_full_op_id_to_source_full_op_ids = {}\n        self._compute_source_unique_full_op_ids_map(self.unique_full_op_id_to_source_full_op_ids)\n\n    def compute_plan_id(self) -> str:\n        \"\"\"\n        NOTE: This is NOT a universal ID.\n\n        Two different PhysicalPlan instances with identical lists of operators will have equivalent plan_ids.\n        \"\"\"\n        full_op_id = self.operator.get_full_op_id()\n        subplan_ids = [subplan.compute_plan_id() for subplan in self.subplans]\n        return hash_for_id(str((full_op_id,) + tuple(subplan_ids)))\n\n    def get_est_total_outputs(self, num_samples: int | None = None, current_idx: int | None = None, source_unique_full_op_ids_map: dict | None = None) -> tuple[dict[str, int], int]:\n        \"\"\"Return the estimated total number of output records for each operator in this plan, keyed by unique full_op_id.\"\"\"\n        # get the source map from the root of the entire plan; use this map throughout all recursive calls\n        # (if you call self.get_source_unique_full_op_ids() from a subplan, its topological indexes will be different)\n        if source_unique_full_op_ids_map is None:\n            source_unique_full_op_ids_map = 
self.unique_full_op_id_to_source_full_op_ids\n\n        # get the estimated total outputs from all subplans\n        # NOTE: this will be an empty dictionary for scans\n        all_subplan_total_outputs = {}\n        for subplan in self.subplans:\n            subplan_total_outputs, current_idx = subplan.get_est_total_outputs(num_samples, current_idx, source_unique_full_op_ids_map)\n            current_idx += 1\n            all_subplan_total_outputs.update(subplan_total_outputs)\n\n        # if current_idx is None, this is the first call, so we initialize it to 0\n        if current_idx is None:\n            current_idx = 0\n\n        # get total outputs for this operator\n        this_op_total_outputs = {}\n        this_unique_full_op_id = f\"{current_idx}-{self.operator.get_full_op_id()}\"\n\n        # if this operator is a scan, return the length of its datasource\n        if isinstance(self.operator, MarshalAndScanDataOp):\n            total = min(len(self.operator.datasource), num_samples) if num_samples is not None else len(self.operator.datasource)\n            this_op_total_outputs = {this_unique_full_op_id: total}\n\n        # if this operator is a context scan, return 1\n        elif isinstance(self.operator, ContextScanOp):  # noqa: SIM114\n            this_op_total_outputs = {this_unique_full_op_id: 1}\n\n        # if this operator is an aggregate, return 1\n        elif isinstance(self.operator, AggregateOp):\n            this_op_total_outputs = {this_unique_full_op_id: 1}\n\n        # if this operator is a limit scan, return its limit\n        elif isinstance(self.operator, LimitScanOp):\n            this_op_total_outputs = {this_unique_full_op_id: self.operator.limit}\n\n        # if this operator is a join, return the Cartesian product of the estimated outputs of its inputs\n        elif isinstance(self.operator, JoinOp):\n            # get estimated outputs for immediate left and right inputs\n            source_unique_full_op_ids = 
source_unique_full_op_ids_map[f\"{current_idx}-{self.operator.get_full_op_id()}\"]\n            left_unique_full_op_id, right_unique_full_op_id = source_unique_full_op_ids[0], source_unique_full_op_ids[1]\n            left_total_outputs = all_subplan_total_outputs[left_unique_full_op_id]\n            right_total_outputs = all_subplan_total_outputs[right_unique_full_op_id]\n            this_op_total_outputs = {this_unique_full_op_id: left_total_outputs * right_total_outputs}\n\n        # otherwise, return the number of outputs from the immediate input\n        else:\n            source_unique_full_op_ids = source_unique_full_op_ids_map[f\"{current_idx}-{self.operator.get_full_op_id()}\"]\n            source_unique_full_op_id = source_unique_full_op_ids[0]\n            this_op_total_outputs = {this_unique_full_op_id: all_subplan_total_outputs[source_unique_full_op_id]}\n\n        return {**this_op_total_outputs, **all_subplan_total_outputs}, current_idx\n\n    def _compute_next_unique_full_op_map(self, next_map: dict[str, str | None], current_idx: int | None = None) -> tuple[int, str]:\n        \"\"\"Compute a mapping from each operator's unique full_op_id to the next operator in the plan and its unique full_op_id.\n\n        The unique full_op_id is constructed as \"{topological_index}-{full_op_id}\" to differentiate between\n        multiple instances of the same physical operator in the plan (e.g., in self-joins).\n\n        Args:\n            next_map: A dictionary to populate with the mapping from unique full_op_id to next (operator, unique_full_op_id) pair.\n            current_idx: The current topological index in the plan. 
If None, starts at 0.\n\n        Returns:\n            A tuple containing:\n                - The current topological index after processing this plan.\n                - The unique full_op_id of this plan's root operator.\n        \"\"\"\n        # If there are subplans, compute their next maps first\n        subplan_topo_idx_op_id_pairs = []\n        for subplan in self.subplans:\n            current_idx, current_full_op_id = subplan._compute_next_unique_full_op_map(next_map, current_idx)\n            subplan_topo_idx_op_id_pairs.append((current_idx, current_full_op_id))\n            current_idx += 1  # increment after processing each subplan\n\n        # for each subplan's root operator, set its next to this plan's root operator\n        for topo_idx, full_op_id in subplan_topo_idx_op_id_pairs:\n            unique_op_id = f\"{topo_idx}-{full_op_id}\"\n            this_unique_op_id = f\"{current_idx}-{self.operator.get_full_op_id()}\"\n            next_map[unique_op_id] = (self.operator, this_unique_op_id)\n\n        # if this is the first call, initialize current_idx\n        if current_idx is None:\n            current_idx = 0\n\n        return current_idx, self.operator.get_full_op_id()\n\n    def get_next_unique_full_op_and_id(self, topo_idx: int, operator: PhysicalOperator) -> tuple[PhysicalOperator | None, str | None]:\n        \"\"\"Return the next operator in the plan after the given operator, or None if it is the last operator.\"\"\"\n        unique_full_op_id = f\"{topo_idx}-{operator.get_full_op_id()}\"\n        return self.unique_full_op_id_to_next_unique_full_op_and_id[unique_full_op_id]\n\n    def get_next_unique_full_op_id(self, topo_idx: int, operator: PhysicalOperator) -> str | None:\n        \"\"\"Return the full_op_id of the next operator in the plan after the given operator, or None if it is the last operator.\"\"\"\n        unique_full_op_id = f\"{topo_idx}-{operator.get_full_op_id()}\"\n        _, next_unique_full_op_id = 
self.unique_full_op_id_to_next_unique_full_op_and_id[unique_full_op_id]\n        return next_unique_full_op_id\n\n    def _compute_upstream_unique_full_op_ids_map(self, upstream_map: dict[str, list[str]], current_idx: int | None = None) -> tuple[int, str, list[str]]:\n        # set the upstream unique full_op_ids for this operator\n        subplan_topo_idx_upstream_unique_full_op_id_tuples = []\n        for subplan in self.subplans:\n            current_idx, full_op_id, subplan_upstream_unique_full_op_ids = subplan._compute_upstream_unique_full_op_ids_map(upstream_map, current_idx)\n            subplan_topo_idx_upstream_unique_full_op_id_tuples.append((current_idx, full_op_id, subplan_upstream_unique_full_op_ids))\n            current_idx += 1\n\n        # if current_idx is None, this is the first call, so we initialize it to 0\n        if current_idx is None:\n            current_idx = 0\n\n        # compute this operator's unique full_op_id\n        this_unique_full_op_id = f\"{current_idx}-{self.operator.get_full_op_id()}\"\n\n        # update the upstream_map for this operator\n        upstream_map[this_unique_full_op_id] = []\n        for topo_idx, full_op_id, upstream_unique_full_op_ids in subplan_topo_idx_upstream_unique_full_op_id_tuples:\n            subplan_upstream_unique_full_op_ids = [f\"{topo_idx}-{full_op_id}\"] + upstream_unique_full_op_ids\n            upstream_map[this_unique_full_op_id].extend(subplan_upstream_unique_full_op_ids)\n\n        # return the current index and the upstream unique full_op_ids for this operator\n        return current_idx, self.operator.get_full_op_id(), upstream_map[this_unique_full_op_id]\n\n    def get_upstream_unique_full_op_ids(self, unique_full_op_id: str) -> list[str]:\n        \"\"\"Return the list of unique full_op_ids for the upstream operators of the operator specified by `unique_full_op_id`.\"\"\"\n        return self.unique_full_op_id_to_upstream_full_op_ids[unique_full_op_id]\n\n    def 
_compute_source_unique_full_op_ids_map(self, source_map: dict[str, list[str]], current_idx: int | None = None) -> tuple[int, str]:\n        # get the topological index and full_op_id pairs for all subplans' root operators\n        subplan_topo_idx_op_id_pairs = []\n        for subplan in self.subplans:\n            current_idx, current_full_op_id = subplan._compute_source_unique_full_op_ids_map(source_map, current_idx)\n            subplan_topo_idx_op_id_pairs.append((current_idx, current_full_op_id))\n            current_idx += 1\n\n        # if current_idx is None, this is the first call, so we initialize it to 0\n        if current_idx is None:\n            current_idx = 0\n\n        # compute this operator's unique full_op_id\n        this_unique_full_op_id = f\"{current_idx}-{self.operator.get_full_op_id()}\"\n\n        # update the source_map for this operator\n        source_map[this_unique_full_op_id] = []\n        for topo_idx, full_op_id in subplan_topo_idx_op_id_pairs:\n            unique_full_op_id = f\"{topo_idx}-{full_op_id}\"\n            source_map[this_unique_full_op_id].append(unique_full_op_id)\n\n        # return the current unique full_op_id for this operator\n        return current_idx, self.operator.get_full_op_id()\n\n    def get_source_unique_full_op_ids(self, topo_idx: int, operator: PhysicalOperator) -> list[str]:\n        \"\"\"Return the list of unique full_op_ids for the input(s) to this operator.\"\"\"\n        unique_full_op_id = f\"{topo_idx}-{operator.get_full_op_id()}\"\n        return self.unique_full_op_id_to_source_full_op_ids[unique_full_op_id]\n\n    def __eq__(self, other):\n        return isinstance(other, PhysicalPlan) and self.plan_id == other.plan_id\n\n    def __hash__(self):\n        return int(self.plan_id, 16)\n\n    def __repr__(self) -> str:\n        return str(self)\n\n    def _get_str(self, idx: int = 0, indent: int = 0) -> str:\n        indent_str = \" \" * (indent * 2)\n        plan_str = f\"{indent_str}{idx}. 
{str(self.operator)}\\n\"\n        for subplan in self.subplans:\n            plan_str += subplan._get_str(idx=idx + 1, indent=indent + 1)\n\n        return plan_str\n\n    def __str__(self):\n        return self._get_str()\n\n    def __getitem__(self, slice):\n        ops = [op for op in self]\n        return ops[slice]\n\n    def __iter__(self):\n        for subplan in self.subplans:\n            yield from subplan\n        yield self.operator\n\n    def __len__(self):\n        return 1 + sum(len(subplan) for subplan in self.subplans)\n\n    @classmethod\n    def _from_ops(cls, ops: list[PhysicalOperator], plan_cost: PlanCost | None = None) -> PhysicalPlan:\n        \"\"\"\n        NOTE: Do not use this in production code. This is a convenience method for constructing PhysicalPlans in tests.\n        This method assumes a left-deep tree structure (i.e. pipeline), where each operator has at most one subplan.\n        The PlanCost is applied to all subplans, thus it is not a true representation of the cost of the plan.\n        \"\"\"\n        assert len(ops) > 0, \"ops must contain at least one PhysicalOperator\"\n\n        # build the PhysicalPlan from the list of operators\n        if len(ops) == 1:\n            return cls(operator=ops[0], subplans=None, plan_cost=plan_cost)\n\n        # recursively build subplans\n        subplan = cls._from_ops(ops[:-1], plan_cost=plan_cost)\n        return cls(operator=ops[-1], subplans=[subplan], plan_cost=plan_cost)\n\n\n# TODO(?): take list[PhysicalOperator] as input, but then store OpFrontier\nclass SentinelPlan(Plan):\n    def __init__(self, operator_set: list[PhysicalOperator], subplans: list[SentinelPlan] | None):\n        # store operator_set and logical_op_id; sort operator_set internally by full_op_id\n        self.operator_set = sorted(operator_set, key=lambda op: op.get_full_op_id())\n        self.logical_op_id = self.operator_set[0].logical_op_id\n        self.subplans = [] if subplans is None else subplans\n     
   self.plan_id = self.compute_plan_id()\n\n        # compute mapping from unique logical_op_id to next unique logical_op_id in the plan\n        self.unique_logical_op_id_to_next_unique_logical_op_id = {}\n        current_idx, _ = self._compute_next_unique_logical_op_id_map(self.unique_logical_op_id_to_next_unique_logical_op_id)\n        self.unique_logical_op_id_to_next_unique_logical_op_id[f\"{current_idx}-{self.logical_op_id}\"] = None\n\n        # compute mapping from unique logical_op_id to root dataset ids\n        self.unique_logical_op_id_to_root_dataset_ids = {}\n        self._compute_root_dataset_ids_map(self.unique_logical_op_id_to_root_dataset_ids)\n\n        # compute mapping from unique logical_op_id to source unique logical_op_ids\n        self.unique_logical_op_id_to_source_logical_op_ids = {}\n        self._compute_source_unique_logical_op_ids_map(self.unique_logical_op_id_to_source_logical_op_ids)\n\n    def compute_plan_id(self) -> str:\n        \"\"\"\n        NOTE: This is NOT a universal ID.\n\n        Two different SentinelPlan instances with the identical operator_sets will have equivalent plan_ids.\n        \"\"\"\n        full_id = (self.logical_op_id,) + tuple([op.get_full_op_id() for op in self.operator_set])\n        subplan_ids = [subplan.compute_plan_id() for subplan in self.subplans]\n        return hash_for_id(str((full_id,) + tuple(subplan_ids)))\n\n    def __eq__(self, other):\n        return isinstance(other, SentinelPlan) and self.plan_id == other.plan_id\n\n    def __hash__(self):\n        return int(self.plan_id, 16)\n\n    def __repr__(self) -> str:\n        return str(self)\n\n    def _get_str(self, idx: int = 0, indent: int = 0) -> str:\n        indent_str = \" \" * (indent * 2)\n        operator = self.operator_set[0]\n        inner_idx_str = \"\" if len(self.operator_set) == 1 else f\"1 - {len(self.operator_set)}.\"\n        plan_str = f\"{indent_str}{idx}.{inner_idx_str} {str(operator)}\\n\"\n        for subplan in 
self.subplans:\n            plan_str += subplan._get_str(idx=idx + 1, indent=indent + 1)\n\n        return plan_str\n\n    def __str__(self):\n        return self._get_str()\n\n    def __getitem__(self, slice):\n        op_set_tuples = [op_set_tuple for op_set_tuple in self]\n        return op_set_tuples[slice]\n\n    def __iter__(self):\n        for subplan in self.subplans:\n            yield from subplan\n        yield self.logical_op_id, self.operator_set\n\n    def __len__(self):\n        return 1 + sum(len(subplan) for subplan in self.subplans)\n    \n    def _compute_next_unique_logical_op_id_map(self, next_map: dict[str, str | None], current_idx: int | None = None) -> tuple[int, str]:\n        \"\"\"Compute a mapping from each operator's unique logical_op_id to the next operator's unique logical_op_id.\n\n        The unique logical_op_id is constructed as \"{topological_index}-{logical_op_id}\" to differentiate between\n        multiple instances of the same physical operator in the plan (e.g., in self-joins).\n\n        Args:\n            next_map: A dictionary to populate with the mapping from unique logical_op_id to next logical_op_id.\n            current_idx: The current topological index in the plan. 
If None, starts at 0.\n\n        Returns:\n            A tuple containing:\n                - The current topological index after processing this plan.\n                - The (unprefixed) logical_op_id of this plan's root logical operator; callers prepend the topological index to make it unique.\n        \"\"\"\n        # If there are subplans, compute their next maps first\n        subplan_topo_idx_op_id_pairs = []\n        for subplan in self.subplans:\n            current_idx, current_logical_op_id = subplan._compute_next_unique_logical_op_id_map(next_map, current_idx)\n            subplan_topo_idx_op_id_pairs.append((current_idx, current_logical_op_id))\n            current_idx += 1  # increment after processing each subplan\n\n        # for each subplan's root operator, set its next to this plan's root operator\n        for topo_idx, logical_op_id in subplan_topo_idx_op_id_pairs:\n            unique_logical_op_id = f\"{topo_idx}-{logical_op_id}\"\n            this_unique_logical_op_id = f\"{current_idx}-{self.logical_op_id}\"\n            next_map[unique_logical_op_id] = this_unique_logical_op_id\n\n        # if current_idx is still None, this plan is a leaf reached on the first call, so start at 0\n        if current_idx is None:\n            current_idx = 0\n\n        return current_idx, self.logical_op_id\n\n    def get_next_unique_logical_op_id(self, unique_logical_op_id: str) -> str | None:\n        \"\"\"Return the unique logical_op_id of the next operator in the plan after the given operator, or None if it is the last operator.\"\"\"\n        return self.unique_logical_op_id_to_next_unique_logical_op_id[unique_logical_op_id]\n\n    def _compute_root_dataset_ids_map(self, root_dataset_ids_map: dict[str, list[str]], current_idx: int | None = None) -> tuple[int, list[str]]:\n        # collect the root dataset ids from all subplans first\n        all_subplan_root_dataset_ids = []\n        for subplan in self.subplans:\n            current_idx, subplan_root_dataset_ids = subplan._compute_root_dataset_ids_map(root_dataset_ids_map, current_idx)\n            
all_subplan_root_dataset_ids.extend(subplan_root_dataset_ids)\n            current_idx += 1\n\n        # if current_idx is None, this is the first call, so we initialize it to 0\n        if current_idx is None:\n            current_idx = 0\n\n        # compute this operator's unique logical_op_id\n        this_unique_logical_op_id = f\"{current_idx}-{self.logical_op_id}\"\n\n        # if this operator is a root dataset scan, update root_dataset_ids\n        root_dataset_ids = []\n        if isinstance(self.operator_set[0], MarshalAndScanDataOp):\n            root_dataset_ids.append(self.operator_set[0].datasource.id)\n        elif isinstance(self.operator_set[0], ContextScanOp):\n            root_dataset_ids.append(self.operator_set[0].context.id)\n\n        # update the root_dataset_ids_map for this operator\n        root_dataset_ids_map[this_unique_logical_op_id] = root_dataset_ids + all_subplan_root_dataset_ids\n\n        # return the current index and the root dataset ids at or upstream of this operator\n        return current_idx, root_dataset_ids_map[this_unique_logical_op_id]\n\n    def get_root_dataset_ids(self, unique_logical_op_id: str) -> list[str]:\n        \"\"\"Return the list of root dataset ids which are upstream of this operator.\"\"\"\n        return self.unique_logical_op_id_to_root_dataset_ids[unique_logical_op_id]\n\n    def _compute_source_unique_logical_op_ids_map(self, source_map: dict[str, list[str]], current_idx: int | None = None) -> tuple[int, str]:\n        # get the topological index and logical_op_id pairs for all subplans' root operators\n        subplan_topo_idx_op_id_pairs = []\n        for subplan in self.subplans:\n            current_idx, current_logical_op_id = subplan._compute_source_unique_logical_op_ids_map(source_map, current_idx)\n            subplan_topo_idx_op_id_pairs.append((current_idx, current_logical_op_id))\n            current_idx += 1\n\n        # if current_idx is None, this is the first call, so we initialize 
it to 0\n        if current_idx is None:\n            current_idx = 0\n\n        # compute this operator's unique logical_op_id\n        this_unique_logical_op_id = f\"{current_idx}-{self.logical_op_id}\"\n\n        # update the source_map for this operator\n        source_map[this_unique_logical_op_id] = []\n        for topo_idx, logical_op_id in subplan_topo_idx_op_id_pairs:\n            unique_logical_op_id = f\"{topo_idx}-{logical_op_id}\"\n            source_map[this_unique_logical_op_id].append(unique_logical_op_id)\n\n        # return the current unique logical_op_id for this operator\n        return current_idx, self.logical_op_id\n\n    def get_source_unique_logical_op_ids(self, unique_logical_op_id: str) -> list[str]:\n        \"\"\"Return the list of unique logical_op_ids for the input(s) to this operator.\"\"\"\n        return self.unique_logical_op_id_to_source_logical_op_ids[unique_logical_op_id]\n"
  },
  {
    "path": "src/palimpzest/query/optimizer/primitives.py",
    "content": "from __future__ import annotations\n\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.query.operators.logical import LogicalOperator\nfrom palimpzest.query.operators.physical import PhysicalOperator\nfrom palimpzest.query.optimizer import rules\nfrom palimpzest.query.optimizer.plan import PlanCost\nfrom palimpzest.utils.hash_helpers import hash_for_id\n\n\nclass Expression:\n    \"\"\"\n    An Expression (technically a \"multi-expression\") consists of either a logical operator\n    (if it's a logical expression) or a physical operator (if it's a physical expression)\n    and the group ids which are inputs to this expression\n    \"\"\"\n\n    def __init__(\n        self,\n        operator: LogicalOperator | PhysicalOperator,\n        input_group_ids: list[int],\n        input_fields: dict[str, FieldInfo],\n        depends_on_field_names: set[str],\n        generated_fields: dict[str, FieldInfo],\n        group_id: int | None = None,\n    ):\n        self.operator = operator\n        self.input_group_ids = input_group_ids\n        self.input_fields = input_fields\n        self.depends_on_field_names = depends_on_field_names\n        self.generated_fields = generated_fields\n        self.group_id = group_id\n        self.rules_applied = set()\n\n        # NOTE: this will be the best possible plan cost achieved by this expression for some\n        # greedy definition of \"best\"\n        self.plan_cost: PlanCost | None = None\n\n        # NOTE: this will be a list of tuples where each tuple has a (pareto-optimal) plan cost\n        # and the tuple of input plan cost(s) for which that pareto-optimal plan cost is attainable;\n        # the tuple of input plan cost(s) is (input_plan_cost, None) for non-join operators and\n        # (left_input_plan_cost, right_input_plan_cost) for join operators\n        self.pareto_optimal_plan_costs: list[tuple[PlanCost, tuple[PlanCost, PlanCost]]] | None = None\n\n        # compute the expression id\n        
self.expr_id = self._compute_expr_id()\n\n    def __eq__(self, other):\n        return self.expr_id == other.expr_id\n\n    def __str__(self):\n        expr_str = f\"{self.__class__.__name__}(group_id={self.group_id}, expr_id={self.expr_id})\"\n        expr_str += f\"\\n  - input_group_ids: {self.input_group_ids}\"\n        expr_str += f\"\\n  - input_fields: {self.input_fields}\"\n        expr_str += f\"\\n  - depends_on_field_names: {self.depends_on_field_names}\"\n        expr_str += f\"\\n  - generated_fields: {self.generated_fields}\"\n        expr_str += f\"\\n  - operator:\\n{str(self.operator)}\"\n        return expr_str\n\n    def __hash__(self):\n        op_id = self.operator.get_logical_op_id() if isinstance(self.operator, LogicalOperator) else self.operator.get_full_op_id()\n        hash_str = str(tuple(sorted(self.input_group_ids)) + (op_id, str(self.__class__.__name__)))\n        hash_id = int(hash_for_id(hash_str), 16)\n        return hash_id\n\n    def _compute_expr_id(self) -> int:\n        return self.__hash__()\n\n    def add_applied_rule(self, rule: type[rules.Rule]):\n        self.rules_applied.add(rule.get_rule_id())\n\n    def set_group_id(self, group_id: int) -> None:\n        self.group_id = group_id\n\n\nclass LogicalExpression(Expression):\n    pass\n\n\nclass PhysicalExpression(Expression):\n    \n    @classmethod\n    def from_op_and_logical_expr(cls, op: PhysicalOperator, logical_expression: LogicalExpression) -> PhysicalExpression:\n        \"\"\"Construct a PhysicalExpression given a physical operator and a logical expression.\"\"\"\n        return cls(\n            operator=op,\n            input_group_ids=logical_expression.input_group_ids,\n            input_fields=logical_expression.input_fields,\n            depends_on_field_names=logical_expression.depends_on_field_names,\n            generated_fields=logical_expression.generated_fields,\n            group_id=logical_expression.group_id,\n        )\n\n\nclass Group:\n    
\"\"\"\n    A group is a set of logically equivalent expressions (both logical (query trees) and physical (execution plans)).\n    Represents the execution of an un-ordered set of logical operators.\n    Maintains a set of logical multi-expressions and physical multi-expressions.\n    \"\"\"\n\n    def __init__(self, logical_expressions: list[LogicalExpression], fields: dict[str, FieldInfo], properties: dict[str, set[str]]):\n        self.logical_expressions: set[LogicalExpression] = set(logical_expressions)\n        self.physical_expressions: set[PhysicalExpression] = set()\n        self.fields = fields\n        self.explored = False\n        self.best_physical_expression: PhysicalExpression | None = None\n        self.pareto_optimal_physical_expressions: list[PhysicalExpression] | None = None\n        self.optimized = False\n\n        # properties of the Group which distinguish it from groups w/identical fields,\n        # e.g. which filters, limits have been applied; is the output sorted, etc.\n        self.properties = properties\n\n        # compute the group id\n        self.group_id = self._compute_group_id()\n\n    def set_explored(self):\n        self.explored = True\n\n    def _compute_group_id(self) -> int:\n        # sort field names\n        sorted_fields = sorted(self.fields.keys())\n\n        # sort properties\n        sorted_properties = []\n        for key in sorted(self.properties.keys()):\n            sorted_properties.extend(sorted(self.properties[key]))\n\n        hash_str = str(tuple(sorted_fields + sorted_properties))\n        hash_id = int(hash_for_id(hash_str), 16)\n        return hash_id\n"
  },
  {
    "path": "src/palimpzest/query/optimizer/rules.py",
    "content": "import logging\nimport os\nfrom copy import deepcopy\nfrom itertools import combinations, product\n\nfrom palimpzest.constants import AggFunc, Model, PromptStrategy\nfrom palimpzest.core.data.context_manager import ContextManager\nfrom palimpzest.core.lib.schemas import (\n    AUDIO_FIELD_TYPES,\n    AUDIO_LIST_FIELD_TYPES,\n    IMAGE_FIELD_TYPES,\n    IMAGE_LIST_FIELD_TYPES,\n)\nfrom palimpzest.prompts import CONTEXT_SEARCH_PROMPT\nfrom palimpzest.query.operators.aggregate import (\n    ApplyGroupByOp,\n    AverageAggregateOp,\n    CountAggregateOp,\n    MaxAggregateOp,\n    MinAggregateOp,\n    SemanticAggregate,\n    SumAggregateOp,\n)\nfrom palimpzest.query.operators.compute import SmolAgentsCompute\nfrom palimpzest.query.operators.convert import LLMConvertBonded, NonLLMConvert\nfrom palimpzest.query.operators.critique_and_refine import CritiqueAndRefineConvert, CritiqueAndRefineFilter\nfrom palimpzest.query.operators.distinct import DistinctOp\nfrom palimpzest.query.operators.filter import LLMFilter, NonLLMFilter\nfrom palimpzest.query.operators.join import EmbeddingJoin, NestedLoopsJoin, RelationalJoin\nfrom palimpzest.query.operators.limit import LimitScanOp\nfrom palimpzest.query.operators.logical import (\n    Aggregate,\n    BaseScan,\n    ComputeOperator,\n    ContextScan,\n    ConvertScan,\n    Distinct,\n    FilteredScan,\n    GroupByAggregate,\n    JoinOp,\n    LimitScan,\n    Project,\n    SearchOperator,\n    TopKScan,\n)\nfrom palimpzest.query.operators.mixture_of_agents import MixtureOfAgentsConvert, MixtureOfAgentsFilter\nfrom palimpzest.query.operators.physical import PhysicalOperator\nfrom palimpzest.query.operators.project import ProjectOp\nfrom palimpzest.query.operators.rag import RAGConvert, RAGFilter\nfrom palimpzest.query.operators.scan import ContextScanOp, MarshalAndScanDataOp\nfrom palimpzest.query.operators.search import (\n    SmolAgentsSearch,  # SmolAgentsCustomManagedSearch,  # SmolAgentsManagedSearch\n)\nfrom 
palimpzest.query.operators.split import SplitConvert, SplitFilter\nfrom palimpzest.query.operators.topk import TopKOp\nfrom palimpzest.query.optimizer.primitives import Expression, Group, LogicalExpression, PhysicalExpression\nfrom palimpzest.utils.model_helpers import use_reasoning_prompt\n\nlogger = logging.getLogger(__name__)\n\n\nclass Rule:\n    \"\"\"\n    The abstract base class for transformation and implementation rules.\n    \"\"\"\n\n    @classmethod\n    def get_rule_id(cls):\n        return cls.__name__\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        raise NotImplementedError(\"Calling this method from an abstract base class.\")\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **kwargs: dict) -> set[Expression]:\n        raise NotImplementedError(\"Calling this method from an abstract base class.\")\n\n\nclass TransformationRule(Rule):\n    \"\"\"\n    Base class for transformation rules which convert a logical expression to another logical expression.\n    The substitute method for a TransformationRule should return all new expressions and all new groups\n    which are created during the substitution.\n    \"\"\"\n\n    @classmethod\n    def is_exploration_rule(cls) -> bool:\n        \"\"\"Returns True if this rule is an exploration rule and False otherwise. 
Default is False.\"\"\"\n        return False\n\n    @classmethod\n    def substitute(\n        cls, logical_expression: LogicalExpression, groups: dict[int, Group], expressions: dict[int, Expression], **kwargs\n    ) -> tuple[set[LogicalExpression], set[Group]]:\n        \"\"\"\n        This function applies the transformation rule to the logical expression, which\n        potentially creates new intermediate expressions and groups.\n\n        The function returns a tuple containing:\n        - the set of all new logical expressions created when applying the transformation\n        - the set of all new groups created when applying the transformation\n        \"\"\"\n        raise NotImplementedError(\"Calling this method from an abstract base class.\")\n\n\nclass ReorderConverts(TransformationRule):\n    \"\"\"\n    This rule is an exploration rule that returns new logical expressions by re-ordering a sequence of ConvertScans.\n    \"\"\"\n\n    @classmethod\n    def is_exploration_rule(cls) -> bool:\n        return True\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: Expression) -> bool:\n        is_match = isinstance(logical_expression.operator, ConvertScan)\n        logger.debug(f\"ReorderConverts matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(\n        cls, logical_expression: LogicalExpression, groups: dict[int, Group], expressions: dict[int, Expression], **kwargs: dict\n    ) -> tuple[set[LogicalExpression], set[Group]]:\n        logger.debug(f\"Substituting ReorderConverts for {logical_expression}\")\n\n        # initialize the sets of new logical expressions and groups to be returned\n        new_logical_expressions, new_groups = set(), set()\n\n        # for each input group, if this convert does not depend on an operator in that group:\n        # then swap the group with this convert\n        
convert_operator: ConvertScan = logical_expression.operator\n        for input_group_id in logical_expression.input_group_ids:\n            input_group = groups[input_group_id]\n\n            # if the convert's dependencies aren't contained within the input group's fields,\n            # then we can not push it down into this group\n            if any([field not in input_group.fields for field in convert_operator.depends_on]):\n                continue\n\n            # iterate over logical expressions\n            for expr in input_group.logical_expressions:\n                # if the expression operator is not a convert, we cannot swap\n                if not isinstance(expr.operator, ConvertScan):\n                    continue\n\n                # if this convert depends on a field generated by the expression we're trying to swap with, we can't swap\n                if any([field in expr.generated_fields for field in convert_operator.depends_on]):\n                    continue\n\n                # create new logical expression with convert pushed down to the input group's logical expression\n                new_input_group_ids = deepcopy(expr.input_group_ids)\n                new_input_fields = deepcopy(expr.input_fields)\n                new_depends_on_field_names = deepcopy(logical_expression.depends_on_field_names)\n                new_generated_fields = deepcopy(logical_expression.generated_fields)\n                new_convert_expr = LogicalExpression(\n                    convert_operator,\n                    input_group_ids=new_input_group_ids,\n                    input_fields=new_input_fields,\n                    depends_on_field_names=new_depends_on_field_names,\n                    generated_fields=new_generated_fields,\n                    group_id=None,\n                )\n\n                # add new_convert_expr to set of new expressions\n                new_logical_expressions.add(new_convert_expr)\n\n                # get or compute the group_id 
and group for this new expression\n                group_id, group = None, None\n\n                # if the expression already exists, lookup the group_id and group\n                if new_convert_expr.expr_id in expressions:\n                    group_id = expressions[new_convert_expr.expr_id].group_id\n                    new_convert_expr.set_group_id(group_id)\n                    group = groups[group_id]\n\n                # otherwise, lookup or create expression's group and add it to the new expressions\n                else:\n                    # first, compute the fields for the group\n                    all_fields = {**new_input_fields, **new_generated_fields}\n\n                    # next, compute the properties; the properties will be identical to those of the input group\n                    # EXCEPT for the maps which will change as a result of our swap\n                    new_group_properties = deepcopy(input_group.properties)\n\n                    # if the expression we're swapping with is a map (i.e. its input and output schemas\n                    # have the same field names), we need to remove its model fields from the input group properties\n                    if sorted(expr.operator.input_schema.model_fields.keys()) == sorted(expr.operator.output_schema.model_fields.keys()):\n                        model_fields_dict = {\n                            k: {\"annotation\": v.annotation, \"default\": v.default, \"description\": v.description}\n                            for k, v in expr.operator.output_schema.model_fields.items()\n                        }\n                        new_group_properties[\"maps\"].remove(model_fields_dict)\n\n                    # finally, if this expression is a map, add its model fields to the new group's properties\n                    if sorted(convert_operator.input_schema.model_fields.keys()) == sorted(convert_operator.output_schema.model_fields.keys()):\n                        model_fields_dict = {\n                            k: {\"annotation\": v.annotation, 
\"default\": v.default, \"description\": v.description}\n                            for k, v in convert_operator.output_schema.model_fields.items()\n                        }\n                        if \"maps\" in new_group_properties:\n                            new_group_properties[\"maps\"].add(model_fields_dict)\n                        else:\n                            new_group_properties[\"maps\"] = set([model_fields_dict])\n\n                    # create group for this new convert expression\n                    group = Group(\n                        logical_expressions=[new_convert_expr],\n                        fields=all_fields,\n                        properties=new_group_properties,\n                    )\n                    group_id = group.group_id\n                    new_convert_expr.set_group_id(group_id)\n\n                    # if the group already exists, add the expression to that group\n                    if group_id in groups:\n                        group = groups[group_id]\n                        group.logical_expressions.add(new_convert_expr)\n\n                    # otherwise, add this new group to groups and to the set of new groups\n                    else:\n                        groups[group_id] = group\n                        new_groups.add(group)\n\n                # create final new logical expression with expr's operator pulled up\n                new_expr = LogicalExpression(\n                    expr.operator.copy(),\n                    input_group_ids=[group_id] + [g_id for g_id in logical_expression.input_group_ids if g_id != input_group_id],\n                    input_fields=group.fields,\n                    depends_on_field_names=expr.depends_on_field_names,\n                    generated_fields=expr.generated_fields,\n                    group_id=logical_expression.group_id,\n                )\n\n                # add newly created expression to set of returned expressions\n                
new_logical_expressions.add(new_expr)\n\n        logger.debug(f\"Done substituting ReorderConverts for {logical_expression}\")\n\n        return new_logical_expressions, new_groups\n\n\nclass PushDownFilter(TransformationRule):\n    \"\"\"\n    If this operator is a filter, push the filter down below each convert, filter, or join\n    in its input group(s) that does not generate a field the filter depends on.\n    \"\"\"\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: Expression) -> bool:\n        is_match = isinstance(logical_expression.operator, FilteredScan)\n        logger.debug(f\"PushDownFilter matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(\n        cls, logical_expression: LogicalExpression, groups: dict[int, Group], expressions: dict[int, Expression], **kwargs: dict\n    ) -> tuple[set[LogicalExpression], set[Group]]:\n        logger.debug(f\"Substituting PushDownFilter for {logical_expression}\")\n\n        # initialize the sets of new logical expressions and groups to be returned\n        new_logical_expressions, new_groups = set(), set()\n\n        # for each input group, if this filter does not depend on an operator\n        # in that group: then swap the group with this filter\n        filter_operator: FilteredScan = logical_expression.operator\n        for input_group_id in logical_expression.input_group_ids:\n            input_group = groups[input_group_id]\n\n            # if the filter's dependencies aren't contained within the input group's fields,\n            # then we can not push it down into this group\n            if any([field not in input_group.fields for field in filter_operator.depends_on]):\n                continue\n\n            # iterate over logical expressions\n            # NOTE: we previously deepcopy'ed the logical expression to avoid modifying the original;\n            #       I think I've fixed this internally, but I'm leaving this NOTE as a reminder in case\n            #  
     we see a regression / bug in the future\n            for expr in input_group.logical_expressions:\n                # if the expression operator is not a convert, a filter, or a join, we cannot swap\n                if not (isinstance(expr.operator, (ConvertScan, FilteredScan, JoinOp))):\n                    continue\n\n                # if this filter depends on a field generated by the expression we're trying to swap with, we can't swap\n                if any([field in expr.generated_fields for field in filter_operator.depends_on]):\n                    continue\n\n                # create new logical expression with filter pushed down to the input group's logical expression\n                new_input_group_ids = deepcopy(expr.input_group_ids)\n                new_input_fields = deepcopy(expr.input_fields)\n                new_depends_on_field_names = deepcopy(logical_expression.depends_on_field_names)\n                new_generated_fields = deepcopy(logical_expression.generated_fields)\n                new_filter_expr = LogicalExpression(\n                    filter_operator,\n                    input_group_ids=new_input_group_ids,\n                    input_fields=new_input_fields,\n                    depends_on_field_names=new_depends_on_field_names,\n                    generated_fields=new_generated_fields,\n                    group_id=None,\n                )\n\n                # add new_filter_expr to set of new expressions\n                new_logical_expressions.add(new_filter_expr)\n\n                # get or compute the group_id and group for this new expression\n                group_id, group = None, None\n\n                # if the expression already exists, lookup the group_id and group\n                if new_filter_expr.expr_id in expressions:\n                    group_id = expressions[new_filter_expr.expr_id].group_id\n                    new_filter_expr.set_group_id(group_id)\n                    group = groups[group_id]\n\n                # 
otherwise, lookup or create expression's group and add it to the new expressions\n                else:\n                    # first, compute the fields for the group\n                    all_fields = {**new_input_fields, **new_generated_fields}\n\n                    # next, compute the properties; the properties will be identical to those of the input group\n                    # EXCEPT for the filters which will change as a result of our swap\n                    new_group_properties = deepcopy(input_group.properties)\n\n                    # if the expression we're swapping with is a FilteredScan,\n                    # we need to remove its filter from the input group properties\n                    if isinstance(expr.operator, FilteredScan):\n                        filter_str = expr.operator.filter.get_filter_str()\n                        new_group_properties[\"filters\"].remove(filter_str)\n\n                    # finally, add the pushed-down filter to the new group's properties\n                    filter_str = filter_operator.filter.get_filter_str()\n                    if \"filters\" in new_group_properties:\n                        new_group_properties[\"filters\"].add(filter_str)\n                    else:\n                        new_group_properties[\"filters\"] = set([filter_str])\n\n                    # create group for this new filter expression\n                    group = Group(\n                        logical_expressions=[new_filter_expr],\n                        fields=all_fields,\n                        properties=new_group_properties,\n                    )\n                    group_id = group.group_id\n                    new_filter_expr.set_group_id(group_id)\n\n                    # if the group already exists, add the expression to that group\n                    if group_id in groups:\n                        group = groups[group_id]\n                        group.logical_expressions.add(new_filter_expr)\n\n                    # 
otherwise, add this new group to groups and to the set of new groups\n                    else:\n                        groups[group_id] = group\n                        new_groups.add(group)\n\n                # create final new logical expression with expr's operator pulled up\n                new_expr = LogicalExpression(\n                    expr.operator.copy(),\n                    input_group_ids=[group_id] + [g_id for g_id in logical_expression.input_group_ids if g_id != input_group_id],\n                    input_fields=group.fields,\n                    depends_on_field_names=expr.depends_on_field_names,\n                    generated_fields=expr.generated_fields,\n                    group_id=logical_expression.group_id,\n                )\n\n                # add newly created expression to set of returned expressions\n                new_logical_expressions.add(new_expr)\n\n        logger.debug(f\"Done substituting PushDownFilter for {logical_expression}\")\n\n        return new_logical_expressions, new_groups\n\n\nclass ImplementationRule(Rule):\n    \"\"\"\n    Base class for implementation rules which convert a logical expression to a physical expression.\n    \"\"\"\n\n    @classmethod\n    def _get_image_fields(cls, logical_expression: LogicalExpression) -> set[str]:\n        \"\"\"Returns the set of fields which have an image (or list[image]) type.\"\"\"\n        return set([\n            field_name.split(\".\")[-1]\n            for field_name, field in logical_expression.input_fields.items()\n            if field.annotation in IMAGE_FIELD_TYPES and field_name.split(\".\")[-1] in logical_expression.depends_on_field_names\n        ])\n\n    @classmethod\n    def _get_list_image_fields(cls, logical_expression: LogicalExpression) -> set[str]:\n        \"\"\"Returns the set of fields which have a list[image] type.\"\"\"\n        return set([\n            field_name.split(\".\")[-1]\n            for field_name, field in 
logical_expression.input_fields.items()\n            if field.annotation in IMAGE_LIST_FIELD_TYPES and field_name.split(\".\")[-1] in logical_expression.depends_on_field_names\n        ])\n\n    @classmethod\n    def _get_audio_fields(cls, logical_expression: LogicalExpression) -> set[str]:\n        \"\"\"Returns the set of fields which have an audio (or list[audio]) type.\"\"\"\n        return set([\n            field_name.split(\".\")[-1]\n            for field_name, field in logical_expression.input_fields.items()\n            if field.annotation in AUDIO_FIELD_TYPES and field_name.split(\".\")[-1] in logical_expression.depends_on_field_names\n        ])\n\n    @classmethod\n    def _get_list_audio_fields(cls, logical_expression: LogicalExpression) -> set[str]:\n        \"\"\"Returns the set of fields which have a list[audio] type.\"\"\"\n        return set([\n            field_name.split(\".\")[-1]\n            for field_name, field in logical_expression.input_fields.items()\n            if field.annotation in AUDIO_LIST_FIELD_TYPES and field_name.split(\".\")[-1] in logical_expression.depends_on_field_names\n        ])\n\n    @classmethod\n    def _is_image_only_operation(cls, logical_expression: LogicalExpression) -> bool:\n        \"\"\"Returns True if the logical_expression processes only image input(s) and False otherwise.\"\"\"\n        return all([\n            field.annotation in IMAGE_FIELD_TYPES\n            for field_name, field in logical_expression.input_fields.items()\n            if field_name.split(\".\")[-1] in logical_expression.depends_on_field_names\n        ])\n\n    @classmethod\n    def _is_image_operation(cls, logical_expression: LogicalExpression) -> bool:\n        \"\"\"Returns True if the logical_expression processes image input(s) and False otherwise.\"\"\"\n        return any([\n            field.annotation in IMAGE_FIELD_TYPES\n            for field_name, field in logical_expression.input_fields.items()\n            if 
field_name.split(\".\")[-1] in logical_expression.depends_on_field_names\n        ])\n\n    @classmethod\n    def _is_audio_only_operation(cls, logical_expression: LogicalExpression) -> bool:\n        \"\"\"Returns True if the logical_expression processes only audio input(s) and False otherwise.\"\"\"\n        return all([\n            field.annotation in AUDIO_FIELD_TYPES\n            for field_name, field in logical_expression.input_fields.items()\n            if field_name.split(\".\")[-1] in logical_expression.depends_on_field_names\n        ])\n\n    @classmethod\n    def _is_audio_operation(cls, logical_expression: LogicalExpression) -> bool:\n        \"\"\"Returns True if the logical_expression processes audio input(s) and False otherwise.\"\"\"\n        return any([\n            field.annotation in AUDIO_FIELD_TYPES\n            for field_name, field in logical_expression.input_fields.items()\n            if field_name.split(\".\")[-1] in logical_expression.depends_on_field_names\n        ])\n\n    @classmethod\n    def _is_text_only_operation(cls, logical_expression: LogicalExpression) -> bool:\n        \"\"\"Returns True if the logical_expression processes only text input(s) and False otherwise.\"\"\"\n        return all([\n            field.annotation not in IMAGE_FIELD_TYPES + AUDIO_FIELD_TYPES\n            for field_name, field in logical_expression.input_fields.items()\n            if field_name.split(\".\")[-1] in logical_expression.depends_on_field_names\n        ])\n\n    @classmethod\n    def _is_text_operation(cls, logical_expression: LogicalExpression) -> bool:\n        \"\"\"Returns True if the logical_expression processes text input(s) and False otherwise.\"\"\"\n        return any([\n            field.annotation not in IMAGE_FIELD_TYPES + AUDIO_FIELD_TYPES\n            for field_name, field in logical_expression.input_fields.items()\n            if field_name.split(\".\")[-1] in logical_expression.depends_on_field_names\n        ])\n\n    # 
TODO: support powerset of text + image + audio (+ video) multi-modal operations\n    @classmethod\n    def _is_text_image_multimodal_operation(cls, logical_expression: LogicalExpression) -> bool:\n        \"\"\"Returns True if the logical_expression processes text and image inputs and False otherwise.\"\"\"\n        return cls._is_image_operation(logical_expression) and cls._is_text_operation(logical_expression)\n\n    @classmethod\n    def _is_text_audio_multimodal_operation(cls, logical_expression: LogicalExpression) -> bool:\n        \"\"\"Returns True if the logical_expression processes text and audio inputs and False otherwise.\"\"\"\n        return cls._is_audio_operation(logical_expression) and cls._is_text_operation(logical_expression)\n\n    @classmethod\n    def _model_matches_input(cls, model: Model, logical_expression: LogicalExpression) -> bool:\n        \"\"\"Returns True if the model is capable of processing the input and False otherwise.\"\"\"\n        # compute how many image fields are in the input, and whether any fields are list[image] fields\n        num_image_fields = len(cls._get_image_fields(logical_expression))\n        has_list_image_field = len(cls._get_list_image_fields(logical_expression)) > 0\n        num_audio_fields = len(cls._get_audio_fields(logical_expression))\n        has_list_audio_field = len(cls._get_list_audio_fields(logical_expression)) > 0\n\n        # corner-case: for now, all operators use text or vision models for processing inputs to __call__\n        if model.is_embedding_model():\n            return False\n\n        # corner-case: Llama vision models cannot handle multiple image inputs (at least using Together)\n        if model.is_llama_model() and model.is_vision_model() and (num_image_fields > 1 or has_list_image_field):\n            return False\n\n        # corner-case: Gemini models cannot handle multiple audio inputs\n        if model.is_provider_vertex_ai() and model.is_audio_model() and (num_audio_fields > 1 
or has_list_audio_field):\n            return False\n\n        # text-only input and text supporting model\n        if cls._is_text_only_operation(logical_expression) and model.is_text_model():\n            return True\n\n        # image-only input and image supporting model\n        if cls._is_image_only_operation(logical_expression) and model.is_vision_model():\n            return True\n\n        # audio-only input and audio supporting model\n        if cls._is_audio_only_operation(logical_expression) and model.is_audio_model():\n            return True\n\n        # text + image multi-modal input and text + image multi-modal supporting model\n        if cls._is_text_image_multimodal_operation(logical_expression) and model.is_text_image_multimodal_model():  # noqa: SIM103\n            return True\n\n        # text + audio multi-modal input and text + audio multi-modal supporting model\n        if cls._is_text_audio_multimodal_operation(logical_expression) and model.is_text_audio_multimodal_model():  # noqa: SIM103\n            return True\n\n        return False\n\n    @classmethod\n    def _embedding_model_matches_input(cls, model: Model, logical_expression: LogicalExpression) -> bool:\n        \"\"\"Returns True if the embedding model is capable of processing the input and False otherwise.\"\"\"\n        if cls._is_text_image_multimodal_operation(logical_expression) and model.is_text_image_multimodal_embedding_model():\n            return True\n\n        is_text_embedding_model = model.is_embedding_model() and not model.is_text_image_multimodal_embedding_model()\n        return cls._is_text_only_operation(logical_expression) and is_text_embedding_model\n\n    @classmethod\n    def _get_fixed_op_kwargs(cls, logical_expression: LogicalExpression, runtime_kwargs: dict) -> dict:\n        \"\"\"Get the fixed set of physical op kwargs provided by the logical expression and the runtime keyword arguments.\"\"\"\n        # get logical operator\n        logical_op = logical_expression.operator\n\n        # set initial set 
of parameters for physical op\n        op_kwargs = logical_op.get_logical_op_params()\n        op_kwargs.update(\n            {\n                \"verbose\": runtime_kwargs[\"verbose\"],\n                \"logical_op_id\": logical_op.get_logical_op_id(),\n                \"unique_logical_op_id\": logical_op.get_unique_logical_op_id(),\n                \"logical_op_name\": logical_op.logical_op_name(),\n            }\n        )\n\n        return op_kwargs\n\n    @classmethod\n    def _perform_substitution(\n        cls,\n        logical_expression: LogicalExpression,\n        physical_op_class: type[PhysicalOperator],\n        runtime_kwargs: dict,\n        variable_op_kwargs: list[dict] | dict | None = None,\n    ) -> set[PhysicalExpression]:\n        \"\"\"\n        This performs basic substitution logic which proceeds in four steps:\n\n            1. The basic kwargs for the physical operator are computed using the logical operator\n               and runtime kwargs.\n            2. If variable kwargs are provided, then they are merged with the basic kwargs and one\n               instance of the physical operator is created for each dictionary of variable kwargs.\n            3. A physical expression is created for each physical operator instance.\n            4. 
The unique set of physical expressions is returned.\n\n        Args:\n            logical_expression (LogicalExpression): The logical expression containing a logical operator.\n            physical_op_class (type[PhysicalOperator]): The class of the physical operator we wish to construct.\n            runtime_kwargs (dict): Keyword arguments which are provided at runtime.\n            variable_op_kwargs (list[dict] | dict | None): A (list of) variable kwargs to customize each\n                physical operator instance.\n\n        Returns:\n            set[PhysicalExpression]: The unique set of physical expressions produced by initializing the\n                physical_op_class with the provided keyword arguments.\n        \"\"\"\n        # get physical operator kwargs which are fixed for each instance of the physical operator\n        fixed_op_kwargs = cls._get_fixed_op_kwargs(logical_expression, runtime_kwargs)\n\n        # make variable_op_kwargs a list of dictionaries\n        if variable_op_kwargs is None:\n            variable_op_kwargs = [{}]\n        elif isinstance(variable_op_kwargs, dict):\n            variable_op_kwargs = [variable_op_kwargs]\n\n        # construct physical operators for each set of kwargs\n        physical_expressions = []\n        for var_op_kwargs in variable_op_kwargs:\n            # get kwargs for this physical operator instance\n            op_kwargs = {**fixed_op_kwargs, **var_op_kwargs}\n\n            # construct the physical operator\n            op = physical_op_class(**op_kwargs)\n\n            # construct physical expression and add to list of expressions\n            expression = PhysicalExpression.from_op_and_logical_expr(op, logical_expression)\n            physical_expressions.append(expression)\n\n        return set(physical_expressions)\n\n\nclass NonLLMConvertRule(ImplementationRule):\n    \"\"\"\n    Substitute a logical expression for a UDF ConvertScan with a NonLLMConvert physical implementation.\n    \"\"\"\n\n    
@classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        is_match = isinstance(logical_expression.operator, ConvertScan) and logical_expression.operator.udf is not None\n        logger.debug(f\"NonLLMConvertRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting NonLLMConvertRule for {logical_expression}\")\n        return cls._perform_substitution(logical_expression, NonLLMConvert, runtime_kwargs)\n\n\nclass LLMConvertBondedRule(ImplementationRule):\n    \"\"\"\n    Substitute a logical expression for a ConvertScan with a bonded convert physical implementation.\n    \"\"\"\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        is_match = isinstance(logical_expression.operator, ConvertScan) and logical_expression.operator.udf is None\n        logger.debug(f\"LLMConvertBondedRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting LLMConvertBondedRule for {logical_expression}\")\n\n        # create variable physical operator kwargs for each model which can implement this logical_expression\n        models = [model for model in runtime_kwargs[\"available_models\"] if cls._model_matches_input(model, logical_expression)]\n        variable_op_kwargs = []\n        for model in models:\n            reasoning_prompt_strategy = use_reasoning_prompt(runtime_kwargs[\"reasoning_effort\"])\n            prompt_strategy = PromptStrategy.MAP if reasoning_prompt_strategy else PromptStrategy.MAP_NO_REASONING\n            variable_op_kwargs.append(\n                {\n                    \"model\": 
model,\n                    \"prompt_strategy\": prompt_strategy,\n                    \"reasoning_effort\": runtime_kwargs[\"reasoning_effort\"],\n                }\n            )\n\n        return cls._perform_substitution(logical_expression, LLMConvertBonded, runtime_kwargs, variable_op_kwargs)\n\n\nclass RAGRule(ImplementationRule):\n    \"\"\"\n    Implementation rule for the RAG operators.\n    \"\"\"\n\n    num_chunks_per_fields = [1, 2, 4]\n    chunk_sizes = [1000, 2000, 4000]\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        logical_op = logical_expression.operator\n        is_map_match = isinstance(logical_op, ConvertScan) and cls._is_text_only_operation(logical_expression) and logical_op.udf is None\n        is_filter_match = isinstance(logical_op, FilteredScan) and cls._is_text_only_operation(logical_expression) and logical_op.filter.filter_fn is None\n        logger.debug(f\"RAGRule matches_pattern: {is_map_match or is_filter_match} for {logical_expression}\")\n        return is_map_match or is_filter_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting RAGRule for {logical_expression}\")\n        # select physical operator class based on whether this is a map or filter operation\n        phys_op_cls = RAGConvert if isinstance(logical_expression.operator, ConvertScan) else RAGFilter\n\n        # create variable physical operator kwargs for each (model, embedding_model) which can implement this logical_expression\n        provided_models: list[Model] = runtime_kwargs[\"available_models\"]\n        models = [model for model in provided_models if cls._model_matches_input(model, logical_expression)]\n        embedding_models = [model for model in provided_models if cls._embedding_model_matches_input(model, logical_expression)]\n\n        variable_op_kwargs = []\n        for (model, 
embedding_model) in product(models, embedding_models):\n            reasoning_prompt_strategy = use_reasoning_prompt(runtime_kwargs[\"reasoning_effort\"])\n            if phys_op_cls is RAGConvert:\n                reasoning = PromptStrategy.MAP\n                no_reasoning = PromptStrategy.MAP_NO_REASONING\n            elif phys_op_cls is RAGFilter:\n                reasoning = PromptStrategy.FILTER\n                no_reasoning = PromptStrategy.FILTER_NO_REASONING\n\n            prompt_strategy = reasoning if reasoning_prompt_strategy else no_reasoning\n            variable_op_kwargs.extend(\n                [\n                    {\n                        \"model\": model,\n                        \"embedding_model\": embedding_model,\n                        \"prompt_strategy\": prompt_strategy,\n                        \"num_chunks_per_field\": num_chunks_per_field,\n                        \"chunk_size\": chunk_size,\n                        \"reasoning_effort\": runtime_kwargs[\"reasoning_effort\"],\n                    }\n                    for num_chunks_per_field in cls.num_chunks_per_fields\n                    for chunk_size in cls.chunk_sizes\n                ]\n            )\n\n        return cls._perform_substitution(logical_expression, phys_op_cls, runtime_kwargs, variable_op_kwargs)\n\n\nclass MixtureOfAgentsRule(ImplementationRule):\n    \"\"\"\n    Implementation rule for the MixtureOfAgents operators.\n    \"\"\"\n\n    num_proposer_models = [1, 2, 3]\n    temperatures = [0.0, 0.4, 0.8]\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        logical_op = logical_expression.operator\n        is_map_match = isinstance(logical_op, ConvertScan) and logical_op.udf is None\n        is_filter_match = isinstance(logical_op, FilteredScan) and logical_op.filter.filter_fn is None\n        logger.debug(f\"MixtureOfAgentsRule matches_pattern: {is_map_match or is_filter_match} for {logical_expression}\")\n 
       return is_map_match or is_filter_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting MixtureOfAgentsRule for {logical_expression}\")\n        # select physical operator class based on whether this is a map or filter operation\n        phys_op_cls = MixtureOfAgentsConvert if isinstance(logical_expression.operator, ConvertScan) else MixtureOfAgentsFilter\n\n        # create variable physical operator kwargs for each model which can implement this logical_expression\n        proposer_model_set = {model for model in runtime_kwargs[\"available_models\"] if cls._model_matches_input(model, logical_expression)}\n        aggregator_model_set = {model for model in runtime_kwargs[\"available_models\"] if model.is_text_model()}\n        variable_op_kwargs = [\n            {\n                \"proposer_models\": list(proposer_models),\n                \"temperatures\": [temp] * len(proposer_models),\n                \"aggregator_model\": aggregator_model,\n                \"reasoning_effort\": runtime_kwargs[\"reasoning_effort\"],\n            }\n            for k in cls.num_proposer_models\n            for temp in cls.temperatures\n            for proposer_models in combinations(proposer_model_set, k)\n            for aggregator_model in aggregator_model_set\n        ]\n\n        return cls._perform_substitution(logical_expression, phys_op_cls, runtime_kwargs, variable_op_kwargs)\n\n\nclass CritiqueAndRefineRule(ImplementationRule):\n    \"\"\"\n    Implementation rule for the CritiqueAndRefine operators.\n    \"\"\"\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        logical_op = logical_expression.operator\n        is_map_match = isinstance(logical_op, ConvertScan) and logical_op.udf is None\n        is_filter_match = isinstance(logical_op, FilteredScan) and logical_op.filter.filter_fn is 
None\n        logger.debug(f\"CritiqueAndRefineRule matches_pattern: {is_map_match or is_filter_match} for {logical_expression}\")\n        return is_map_match or is_filter_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting CritiqueAndRefineRule for {logical_expression}\")\n        # select physical operator class based on whether this is a map or filter operation\n        phys_op_cls = CritiqueAndRefineConvert if isinstance(logical_expression.operator, ConvertScan) else CritiqueAndRefineFilter\n\n        # create variable physical operator kwargs for each model which can implement this logical_expression\n        models = [model for model in runtime_kwargs[\"available_models\"] if cls._model_matches_input(model, logical_expression)]\n        variable_op_kwargs = []\n        for model in models:\n            reasoning_prompt_strategy = use_reasoning_prompt(runtime_kwargs[\"reasoning_effort\"])\n            if phys_op_cls is CritiqueAndRefineConvert:\n                reasoning = PromptStrategy.MAP\n                no_reasoning = PromptStrategy.MAP_NO_REASONING\n            elif phys_op_cls is CritiqueAndRefineFilter:\n                reasoning = PromptStrategy.FILTER\n                no_reasoning = PromptStrategy.FILTER_NO_REASONING\n\n            prompt_strategy = reasoning if reasoning_prompt_strategy else no_reasoning\n            variable_op_kwargs.extend(\n                [\n                    {\n                        \"model\": model,\n                        \"critic_model\": critic_model,\n                        \"refine_model\": refine_model,\n                        \"prompt_strategy\": prompt_strategy,\n                        \"reasoning_effort\": runtime_kwargs[\"reasoning_effort\"],\n                    }\n                    for critic_model in models\n                    for refine_model in models\n                ]\n         
   )\n\n        return cls._perform_substitution(logical_expression, phys_op_cls, runtime_kwargs, variable_op_kwargs)\n\n\nclass SplitRule(ImplementationRule):\n    \"\"\"\n    Implementation rule for the Split operators.\n    \"\"\"\n    num_chunks = [2, 4, 6]\n    min_size_to_chunk = [1000, 4000]\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        logical_op = logical_expression.operator\n        is_map_match = isinstance(logical_op, ConvertScan) and cls._is_text_only_operation(logical_expression) and logical_op.udf is None\n        is_filter_match = isinstance(logical_op, FilteredScan) and cls._is_text_only_operation(logical_expression) and logical_op.filter.filter_fn is None\n        logger.debug(f\"SplitRule matches_pattern: {is_map_match or is_filter_match} for {logical_expression}\")\n        return is_map_match or is_filter_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting SplitRule for {logical_expression}\")\n        # select physical operator class based on whether this is a map or filter operation\n        phys_op_cls = SplitConvert if isinstance(logical_expression.operator, ConvertScan) else SplitFilter\n\n        # create variable physical operator kwargs for each model which can implement this logical_expression\n        models = [model for model in runtime_kwargs[\"available_models\"] if cls._model_matches_input(model, logical_expression)]\n        variable_op_kwargs = [\n            {\n                \"model\": model,\n                \"min_size_to_chunk\": min_size_to_chunk,\n                \"num_chunks\": num_chunks,\n                \"reasoning_effort\": runtime_kwargs[\"reasoning_effort\"],\n            }\n            for model in models\n            for min_size_to_chunk in cls.min_size_to_chunk\n            for num_chunks in cls.num_chunks\n        ]\n\n        
return cls._perform_substitution(logical_expression, phys_op_cls, runtime_kwargs, variable_op_kwargs)\n\n\nclass TopKRule(ImplementationRule):\n    \"\"\"\n    Substitute a logical expression for a TopKScan with a TopK physical implementation.\n    \"\"\"\n    k_budgets = [1, 3, 5, 10, 15, 20, 25]\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        is_match = isinstance(logical_expression.operator, TopKScan)\n        logger.debug(f\"TopKRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting TopKRule for {logical_expression}\")\n\n        # create variable physical operator kwargs for each candidate k: sweep k_budgets when the operator's k is -1, otherwise use the operator's k\n        ks = cls.k_budgets if logical_expression.operator.k == -1 else [logical_expression.operator.k]\n        variable_op_kwargs = [{\"k\": k} for k in ks]\n        return cls._perform_substitution(logical_expression, TopKOp, runtime_kwargs, variable_op_kwargs)\n\n\nclass NonLLMFilterRule(ImplementationRule):\n    \"\"\"\n    Substitute a logical expression for a FilteredScan with a non-llm filter physical implementation.\n    \"\"\"\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        logical_op = logical_expression.operator\n        is_match = isinstance(logical_op, FilteredScan) and logical_op.filter.filter_fn is not None\n        logger.debug(f\"NonLLMFilterRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting NonLLMFilterRule for {logical_expression}\")\n        return cls._perform_substitution(logical_expression, 
NonLLMFilter, runtime_kwargs)\n\n\nclass LLMFilterRule(ImplementationRule):\n    \"\"\"\n    Substitute a logical expression for a FilteredScan with an llm filter physical implementation.\n    \"\"\"\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        logical_op = logical_expression.operator\n        is_match = isinstance(logical_op, FilteredScan) and logical_op.filter.filter_fn is None\n        logger.debug(f\"LLMFilterRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting LLMFilterRule for {logical_expression}\")\n\n        # create variable physical operator kwargs for each model which can implement this logical_expression\n        models = [model for model in runtime_kwargs[\"available_models\"] if cls._model_matches_input(model, logical_expression)]\n        variable_op_kwargs = []\n        for model in models:\n            reasoning_prompt_strategy = use_reasoning_prompt(runtime_kwargs[\"reasoning_effort\"])\n            prompt_strategy = PromptStrategy.FILTER if reasoning_prompt_strategy else PromptStrategy.FILTER_NO_REASONING\n            variable_op_kwargs.append(\n                {\n                    \"model\": model,\n                    \"prompt_strategy\": prompt_strategy,\n                    \"reasoning_effort\": runtime_kwargs[\"reasoning_effort\"],\n                }\n            )\n\n        return cls._perform_substitution(logical_expression, LLMFilter, runtime_kwargs, variable_op_kwargs)\n\n\nclass RelationalJoinRule(ImplementationRule):\n    \"\"\"\n    Substitute a logical expression for a JoinOp with a RelationalJoin physical implementation.\n    \"\"\"\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        is_match = 
isinstance(logical_expression.operator, JoinOp) and logical_expression.operator.condition == \"\"\n        logger.debug(f\"RelationalJoinRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting RelationalJoinRule for {logical_expression}\")\n        return cls._perform_substitution(logical_expression, RelationalJoin, runtime_kwargs)\n\n\nclass NestedLoopsJoinRule(ImplementationRule):\n    \"\"\"\n    Substitute a logical expression for a JoinOp with an (LLM) NestedLoopsJoin physical implementation.\n    \"\"\"\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        is_match = isinstance(logical_expression.operator, JoinOp) and logical_expression.operator.condition != \"\"\n        logger.debug(f\"NestedLoopsJoinRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting NestedLoopsJoinRule for {logical_expression}\")\n\n        # create variable physical operator kwargs for each model which can implement this logical_expression\n        models = [model for model in runtime_kwargs[\"available_models\"] if cls._model_matches_input(model, logical_expression)]\n        variable_op_kwargs = []\n        for model in models:\n            reasoning_prompt_strategy = use_reasoning_prompt(runtime_kwargs[\"reasoning_effort\"])\n            prompt_strategy = PromptStrategy.JOIN if reasoning_prompt_strategy else PromptStrategy.JOIN_NO_REASONING\n            variable_op_kwargs.append(\n                {\n                    \"model\": model,\n                    \"prompt_strategy\": prompt_strategy,\n                    \"join_parallelism\": 
runtime_kwargs[\"join_parallelism\"],\n                    \"reasoning_effort\": runtime_kwargs[\"reasoning_effort\"],\n                    \"retain_inputs\": not runtime_kwargs[\"is_validation\"],\n                }\n            )\n\n        return cls._perform_substitution(logical_expression, NestedLoopsJoin, runtime_kwargs, variable_op_kwargs)\n\n\nclass EmbeddingJoinRule(ImplementationRule):\n    \"\"\"\n    Substitute a logical expression for a JoinOp with an EmbeddingJoin physical implementation.\n    \"\"\"\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        is_match = isinstance(logical_expression.operator, JoinOp) and logical_expression.operator.condition != \"\" and not cls._is_audio_operation(logical_expression)\n        logger.debug(f\"EmbeddingJoinRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting EmbeddingJoinRule for {logical_expression}\")\n\n        # create variable physical operator kwargs for each (model, embedding_model) which can implement this logical_expression\n        provided_models: list[Model] = runtime_kwargs[\"available_models\"]\n        models = [model for model in provided_models if cls._model_matches_input(model, logical_expression)]\n        embedding_models = [model for model in provided_models if cls._embedding_model_matches_input(model, logical_expression)]\n        variable_op_kwargs = []\n\n        for (model, embedding_model) in product(models, embedding_models):\n            reasoning_prompt_strategy = use_reasoning_prompt(runtime_kwargs[\"reasoning_effort\"])\n            prompt_strategy = PromptStrategy.JOIN if reasoning_prompt_strategy else PromptStrategy.JOIN_NO_REASONING\n            variable_op_kwargs.append(\n                {\n                    \"model\": 
model,\n                    \"embedding_model\": embedding_model,\n                    \"prompt_strategy\": prompt_strategy,\n                    \"join_parallelism\": runtime_kwargs[\"join_parallelism\"],\n                    \"reasoning_effort\": runtime_kwargs[\"reasoning_effort\"],\n                    \"retain_inputs\": not runtime_kwargs[\"is_validation\"],\n                    \"num_samples\": 10,  # TODO: iterate over different choices of num_samples\n                }\n            )\n\n        return cls._perform_substitution(logical_expression, EmbeddingJoin, runtime_kwargs, variable_op_kwargs)\n\n\nclass SemanticAggregateRule(ImplementationRule):\n    \"\"\"\n    Substitute a logical expression for a SemanticAggregate with an llm physical implementation.\n    \"\"\"\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        is_match = isinstance(logical_expression.operator, Aggregate) and logical_expression.operator.agg_str is not None\n        logger.debug(f\"SemanticAggregateRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting SemanticAggregateRule for {logical_expression}\")\n\n        # create variable physical operator kwargs for each model which can implement this logical_expression\n        models = [model for model in runtime_kwargs[\"available_models\"] if cls._model_matches_input(model, logical_expression) and not model.is_llama_model()]\n        variable_op_kwargs = []\n        for model in models:\n            reasoning_prompt_strategy = use_reasoning_prompt(runtime_kwargs[\"reasoning_effort\"])\n            prompt_strategy = PromptStrategy.AGG if reasoning_prompt_strategy else PromptStrategy.AGG_NO_REASONING\n            variable_op_kwargs.append(\n                {\n                    \"model\": 
model,\n                    \"prompt_strategy\": prompt_strategy,\n                    \"reasoning_effort\": runtime_kwargs[\"reasoning_effort\"],\n                }\n            )\n\n        return cls._perform_substitution(logical_expression, SemanticAggregate, runtime_kwargs, variable_op_kwargs)\n\n\nclass AggregateRule(ImplementationRule):\n    \"\"\"\n    Substitute the logical expression for an aggregate with its physical counterpart.\n    \"\"\"\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        is_match = isinstance(logical_expression.operator, Aggregate) and logical_expression.operator.agg_func is not None\n        logger.debug(f\"AggregateRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting AggregateRule for {logical_expression}\")\n\n        # get the physical op class based on the aggregation function\n        physical_op_class = None\n        if logical_expression.operator.agg_func == AggFunc.COUNT:\n            physical_op_class = CountAggregateOp\n        elif logical_expression.operator.agg_func == AggFunc.AVERAGE:\n            physical_op_class = AverageAggregateOp\n        elif logical_expression.operator.agg_func == AggFunc.SUM:\n            physical_op_class = SumAggregateOp\n        elif logical_expression.operator.agg_func == AggFunc.MIN:\n            physical_op_class = MinAggregateOp\n        elif logical_expression.operator.agg_func == AggFunc.MAX:\n            physical_op_class = MaxAggregateOp\n        else:\n            raise Exception(f\"Cannot support aggregate function: {logical_expression.operator.agg_func}\")\n\n        # perform the substitution\n        return cls._perform_substitution(logical_expression, physical_op_class, runtime_kwargs)\n\n\nclass 
AddContextsBeforeComputeRule(ImplementationRule):\n    \"\"\"\n    Searches the ContextManager for additional contexts which may be useful for the given computation.\n\n    TODO: track cost of generating search query\n    \"\"\"\n    k = 1\n    SEARCH_GENERATOR_PROMPT = CONTEXT_SEARCH_PROMPT\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        is_match = isinstance(logical_expression.operator, ComputeOperator)\n        logger.debug(f\"AddContextsBeforeComputeRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting AddContextsBeforeComputeRule for {logical_expression}\")\n\n        # load an LLM to generate a short search query\n        model = None\n        if os.getenv(\"OPENAI_API_KEY\"):\n            model = \"openai/gpt-4o-mini\"\n        elif os.getenv(\"ANTHROPIC_API_KEY\"):\n            model = \"anthropic/claude-3-5-sonnet-20241022\"\n        elif os.getenv(\"GEMINI_API_KEY\"):\n            model = \"vertex_ai/gemini-2.0-flash\"\n        elif os.getenv(\"TOGETHER_API_KEY\"):\n            model = \"together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo\"\n\n        # importing litellm here because importing above causes deprecation warning\n        import litellm\n\n        # retrieve any additional context which may be useful\n        cm = ContextManager()\n        response = litellm.completion(\n            model=model,\n            messages=[{\"role\": \"user\", \"content\": cls.SEARCH_GENERATOR_PROMPT.format(instruction=logical_expression.operator.instruction)}]\n        )\n        query = response.choices[0].message.content\n        variable_op_kwargs = {\"additional_contexts\": cm.search_context(query, k=cls.k, where={\"materialized\": True})}\n        return cls._perform_substitution(logical_expression, 
SmolAgentsCompute, runtime_kwargs, variable_op_kwargs)\n\n\nclass BasicSubstitutionRule(ImplementationRule):\n    \"\"\"\n    For logical operators with a single physical implementation, substitute the\n    logical expression with its physical counterpart.\n    \"\"\"\n\n    LOGICAL_OP_CLASS_TO_PHYSICAL_OP_CLASS_MAP = {\n        BaseScan: MarshalAndScanDataOp,\n        # ComputeOperator: SmolAgentsCompute,\n        SearchOperator: SmolAgentsSearch, # SmolAgentsManagedSearch, # SmolAgentsCustomManagedSearch\n        ContextScan: ContextScanOp,\n        Distinct: DistinctOp,\n        LimitScan: LimitScanOp,\n        Project: ProjectOp,\n        GroupByAggregate: ApplyGroupByOp,\n    }\n\n    @classmethod\n    def matches_pattern(cls, logical_expression: LogicalExpression) -> bool:\n        logical_op_class = logical_expression.operator.__class__\n        is_match = logical_op_class in cls.LOGICAL_OP_CLASS_TO_PHYSICAL_OP_CLASS_MAP\n        logger.debug(f\"BasicSubstitutionRule matches_pattern: {is_match} for {logical_expression}\")\n        return is_match\n\n    @classmethod\n    def substitute(cls, logical_expression: LogicalExpression, **runtime_kwargs) -> set[PhysicalExpression]:\n        logger.debug(f\"Substituting BasicSubstitutionRule for {logical_expression}\")\n        physical_op_class = cls.LOGICAL_OP_CLASS_TO_PHYSICAL_OP_CLASS_MAP[logical_expression.operator.__class__]\n        return cls._perform_substitution(logical_expression, physical_op_class, runtime_kwargs)\n"
  },
  {
    "path": "src/palimpzest/query/optimizer/tasks.py",
    "content": "from __future__ import annotations\n\nimport logging\nfrom typing import Any\n\nfrom palimpzest.core.models import PlanCost\nfrom palimpzest.policy import Policy\nfrom palimpzest.query.execution.execution_strategy_type import ExecutionStrategyType\nfrom palimpzest.query.operators.join import JoinOp\nfrom palimpzest.query.optimizer.cost_model import BaseCostModel\nfrom palimpzest.query.optimizer.optimizer_strategy_type import OptimizationStrategyType\nfrom palimpzest.query.optimizer.primitives import Expression, Group\nfrom palimpzest.query.optimizer.rules import ImplementationRule, Rule, TransformationRule\n\nlogger = logging.getLogger(__name__)\n\nclass Task:\n    \"\"\"\n    Base class for a task. Each task has a method called perform() which executes the task.\n    Examples of tasks include optimizing and exploring groups, optimizing expressions, applying\n    rules, and optimizing inputs / costing the full group tree.\n    \"\"\"\n\n    def perform(self, groups: dict[int, Group], context: dict[str, Any] | None = None) -> list[Task]:\n        \"\"\"\n        NOTE: At the moment we do not make use of the context, but in the future\n        this can be used to store required physical properties (e.g. sort conditions\n        for the query) and bounds (e.g. the operator should not cost more than X).\n        \"\"\"\n        raise NotImplementedError(\"Calling this method from an abstract base class.\")\n\n\nclass OptimizeGroup(Task):\n    \"\"\"\n    The task to optimize a group.\n\n    This task pushes optimization tasks for the group's current logical and physical\n    expressions onto the tasks stack. 
This will fully expand the space of possible\n    logical and physical expressions for the group, because OptimizeLogicalExpression\n    and OptimizePhysicalExpression tasks will indirectly schedule new tasks to apply\n    rules and to optimize input groups and expressions.\n    \"\"\"\n\n    def __init__(self, group_id: int):\n        self.group_id = group_id\n\n    def perform(self, groups: dict[int, Group], context: dict[str, Any] | None = None) -> list[Task]:\n        logger.debug(f\"Optimizing group {self.group_id}\")\n        # get updated instance of the group to be optimized\n        if context is None:\n            context = {}\n        group = groups[self.group_id]\n\n        # if this group has already been optimized, there's nothing more to do\n        if group.optimized:\n            return []\n\n        # otherwise, optimize all the logical expressions for the group\n        new_tasks = []\n        for logical_expr in group.logical_expressions:\n            task = OptimizeLogicalExpression(logical_expr)\n            new_tasks.append(task)\n\n        # and optimize all of the physical expressions in the group\n        for physical_expr in group.physical_expressions:\n            task = OptimizePhysicalExpression(physical_expr)\n            new_tasks.append(task)\n\n        # and first explore the group if it hasn't been explored yet\n        if not group.explored:\n            task = ExploreGroup(self.group_id)\n            new_tasks.append(task)\n\n        logger.debug(f\"Done optimizing group {self.group_id}\")\n        logger.debug(f\"New tasks: {len(new_tasks)}\")\n        return new_tasks\n\n\nclass ExploreGroup(Task):\n    \"\"\"\n    The task to explore a group and add additional logical expressions.\n    \"\"\"\n\n    def __init__(self, group_id: int):\n        self.group_id = group_id\n\n    def perform(self, groups: dict[int, Group], context: dict[str, Any] | None = None) -> list[Task]:\n        logger.debug(f\"Expanding group 
{self.group_id}\")\n\n        # fetch group\n        if context is None:\n            context = {}\n        group = groups[self.group_id]\n\n        # if the group has been explored before, return []\n        if group.explored:\n            return []\n\n        # for each logical_expr in the group, add a new OptimizeLogicalExpression() task to the queue\n        new_tasks = []\n        for logical_expr in group.logical_expressions:\n            task = OptimizeLogicalExpression(logical_expr, exploring=True)\n            new_tasks.append(task)\n\n        # but first (tasks are LIFO), we recursively explore input groups of logical expressions in this group\n        for logical_expr in group.logical_expressions:\n            for input_group_id in logical_expr.input_group_ids:\n                task = ExploreGroup(input_group_id)\n                new_tasks.append(task)\n\n        # mark the group as explored and return tasks\n        group.set_explored()\n\n        logger.debug(f\"Done expanding group {self.group_id}\")\n        logger.debug(f\"New tasks: {len(new_tasks)}\")\n        return new_tasks\n\n\nclass OptimizeLogicalExpression(Task):\n    \"\"\"\n    The task to optimize a (multi-)expression.\n\n    This task filters for the subset of rules which may be applied to the given logical expression\n    and schedules ApplyRule tasks for each rule.\n    \"\"\"\n\n    def __init__(self, logical_expression: Expression, exploring: bool = False):\n        self.logical_expression = logical_expression\n        self.exploring = exploring\n\n    def perform(\n        self,\n        transformation_rules: list[type[TransformationRule]],\n        implementation_rules: list[type[ImplementationRule]],\n        context: dict[str, Any] | None = None,\n    ) -> list[Task]:\n        logger.debug(f\"Optimizing logical expression {self.logical_expression}\")\n        if context is None:\n            context = {}\n\n        # if we're exploring, only apply transformation rules\n        
rules = (\n            [rule for rule in transformation_rules if rule.is_exploration_rule()]\n            if self.exploring\n            else transformation_rules + implementation_rules\n        )\n\n        # filter out rules that have already been applied to logical expression\n        rules = list(filter(lambda rule: rule.get_rule_id() not in self.logical_expression.rules_applied, rules))\n\n        # filter for rules that match on this logical expression\n        rules = list(filter(lambda rule: rule.matches_pattern(self.logical_expression), rules))\n\n        # TODO compute priority (i.e. \"promise\") of the rules and sort in order of priority\n\n        # apply rules, exploring the input group(s) of each pattern if necessary\n        new_tasks = []\n        for rule in rules:\n            # TODO: if necessary, expand the input groups of the logical expression to see if they need to be expanded\n            apply_rule_task = ApplyRule(rule, self.logical_expression, self.exploring)\n            new_tasks.append(apply_rule_task)\n\n        logger.debug(f\"Done optimizing logical expression {self.logical_expression}\")\n        logger.debug(f\"New tasks: {len(new_tasks)}\")\n        return new_tasks\n\n\nclass ApplyRule(Task):\n    \"\"\"\n    The task to apply a transformation or implementation rule to a (multi-)expression.\n\n    For TransformationRules, this task will:\n    - apply the substitution, receiving new expressions and groups\n    - filter the new expressions for ones which may already exist\n      - NOTE: we don't filter new groups because this implicitly must be done by the\n              transformation rule in order to assign the correct group_id to any\n              new expressions.\n    - add new expressions to their group's set of logical expressions\n    - schedule OptimizeGroup and OptimizeLogicalExpression tasks\n\n    For ImplementationRules, this task will:\n    - apply the substitution, receiving new expressions and groups\n    - filter 
the new expressions for ones which may already exist\n    - add new expressions to their group's set of physical expressions\n    - schedule OptimizePhysicalExpression tasks\n    \"\"\"\n\n    def __init__(self, rule: type[Rule], logical_expression: Expression, exploring: bool = False):\n        self.rule = rule\n        self.logical_expression = logical_expression\n        self.exploring = exploring\n\n    def perform(\n        self,\n        groups: dict[int, Group],\n        expressions: dict[int, Expression],\n        context: dict[str, Any] | None = None,\n        **physical_op_params,\n    ) -> tuple[list[Task], int]:\n        logger.debug(f\"Applying rule {self.rule} to logical expression {self.logical_expression}\")\n        if context is None:\n            context = {}\n\n        # check if rule has already been applied to this logical expression; return [] if so\n        if self.rule.get_rule_id() in self.logical_expression.rules_applied:\n            return []\n\n        # get the group of the logical expression\n        group_id = self.logical_expression.group_id\n        group = groups[group_id]\n\n        # process new expressions; update groups and create new tasks as needed\n        new_tasks = []\n        if issubclass(self.rule, TransformationRule):\n            # apply transformation rule\n            new_expressions, new_groups = self.rule.substitute(\n                self.logical_expression, groups, expressions, **physical_op_params\n            )\n\n            # filter out any expressions which are duplicates (i.e. 
they've been previously computed)\n            new_expressions = [expr for expr in new_expressions if expr.expr_id not in expressions]\n            expressions.update({expr.expr_id: expr for expr in new_expressions})\n\n            # add all new groups to the groups mapping\n            for group in new_groups:\n                groups[group.group_id] = group\n\n            # add new expressions to their respective groups\n            for expr in new_expressions:\n                group = groups[expr.group_id]\n                group.logical_expressions.add(expr)\n\n            # NOTE: we place new tasks for groups on the top of the stack so that they may be\n            #       optimized before we optimize expressions which take new groups as inputs\n            # create new tasks for optimizing new logical expressions\n            for expr in new_expressions:\n                task = OptimizeLogicalExpression(expr, self.exploring)\n                new_tasks.append(task)\n\n            # create new tasks for optimizing new groups\n            for group in new_groups:\n                task = OptimizeGroup(group.group_id)\n                new_tasks.append(task)\n\n        else:\n            # apply implementation rule\n            new_expressions = self.rule.substitute(self.logical_expression, **physical_op_params)\n            new_expressions = [expr for expr in new_expressions if expr.expr_id not in expressions]\n\n            # get the costed_full_op_ids from the context (if provided) and compute whether this\n            # logical expression has physical operators which have been costed\n            costed_full_op_ids = context.get('costed_full_op_ids')\n            logical_op_has_been_costed = costed_full_op_ids is not None and any([\n                op_id.split(\"-\")[0] == self.logical_expression.operator.get_logical_op_id()\n                for op_id in costed_full_op_ids\n            ])\n\n            if 
logical_op_has_been_costed:\n                new_expressions = [expr for expr in new_expressions if expr.operator.get_full_op_id() in costed_full_op_ids]\n            expressions.update({expr.expr_id: expr for expr in new_expressions})\n            group.physical_expressions.update(new_expressions)\n\n            # create new task\n            for expr in new_expressions:\n                task = OptimizePhysicalExpression(expr)\n                new_tasks.append(task)\n\n        # mark that the rule has been applied to the logical expression\n        self.logical_expression.add_applied_rule(self.rule)\n\n        logger.debug(f\"Done applying rule {self.rule} to logical expression {self.logical_expression}\")\n        logger.debug(f\"New tasks: {len(new_tasks)}\")\n        return new_tasks\n\n\nclass OptimizePhysicalExpression(Task):\n    \"\"\"\n    The task to optimize a physical expression and derive its cost.\n\n    This task computes the cost of input groups for the given physical expression (scheduling\n    OptimizeGroup tasks if needed), computes the cost of the given expression, and then updates\n    the expression's group depending on whether this expression is its `best_physical_expression`\n    or in its `pareto_optimal_physical_expressions`.\n    \"\"\"\n\n    def __init__(self, physical_expression: Expression, exploring: bool = False):\n        self.physical_expression = physical_expression\n        self.exploring = exploring\n\n    def update_best_physical_expression(self, group: Group, policy: Policy) -> Group:\n        \"\"\"\n        Update the best physical expression for the given group and policy (if necessary).\n        \"\"\"\n        # get the PlanCosts for the current best expression and this physical expression\n        best_plan_cost = (\n            group.best_physical_expression.plan_cost if group.best_physical_expression is not None else None\n        )\n        expr_plan_cost = self.physical_expression.plan_cost\n\n        # pre-compute 
whether or not this physical expression satisfies the policy constraint\n        expr_satisfies_constraint = policy.constraint(expr_plan_cost)\n\n        # if we do not have a best physical expression for the group, we set this to be the best expression\n        if group.best_physical_expression is None:\n            group.best_physical_expression = self.physical_expression\n            group.satisfies_constraint = expr_satisfies_constraint\n\n        # if the group currently satisfies the constraint, only update the best physical expression\n        # if this expression also satisfies the constraint and is more policy optimal\n        elif group.satisfies_constraint and expr_satisfies_constraint and policy.choose(expr_plan_cost, best_plan_cost):\n            group.best_physical_expression = self.physical_expression\n\n        # finally, if the group does not satisfy the constraint, update the best physical expression if\n        # this expression does satisfy the constraint, or if it is more policy optimal\n        elif not group.satisfies_constraint and (\n            expr_satisfies_constraint or policy.choose(expr_plan_cost, best_plan_cost)\n        ):\n            group.best_physical_expression = self.physical_expression\n            group.satisfies_constraint = expr_satisfies_constraint\n\n        return group\n\n    def _is_dominated(self, plan_cost: PlanCost, other_plan_cost: PlanCost, policy: Policy):\n        \"\"\"\n        Return True if plan_cost is dominated by other_plan_cost and False otherwise.\n\n        If plan costs are perfectly tied on dimensions of interest, other dimensions\n        will be used as a tiebreaker.\n        \"\"\"\n        # get the dictionary representation of this policy\n        policy_dict = policy.get_dict()\n\n        # get the metrics which matter for this policy\n        metrics_of_interest = {metric for metric, weight in policy_dict.items() if weight > 0.0}\n        remaining_metrics = {metric for metric, weight in 
policy_dict.items() if weight == 0.0}\n\n        # corner case: if the two plan costs are perfectly tied on all dimensions of interest,\n        # use other dimensions as tiebreaker\n        if (\n            all([getattr(plan_cost, metric) == getattr(other_plan_cost, metric) for metric in metrics_of_interest])\n            and plan_cost.op_estimates.cardinality == other_plan_cost.op_estimates.cardinality\n        ):\n            for metric in remaining_metrics:\n                if metric == \"cost\" and plan_cost.cost < other_plan_cost.cost:  # noqa: SIM114\n                    return False\n                elif metric == \"time\" and plan_cost.time < other_plan_cost.time:  # noqa: SIM114\n                    return False\n                elif metric == \"quality\" and plan_cost.quality > other_plan_cost.quality:\n                    return False\n\n            # if plan_cost is dominated by other_plan_cost on remaining metrics, return True\n            return True\n\n        # normal case: identify whether plan_cost is dominated by other_plan_cost\n        cost_dominated = True if policy_dict[\"cost\"] == 0.0 else other_plan_cost.cost <= plan_cost.cost\n        time_dominated = True if policy_dict[\"time\"] == 0.0 else other_plan_cost.time <= plan_cost.time\n        quality_dominated = True if policy_dict[\"quality\"] == 0.0 else other_plan_cost.quality >= plan_cost.quality\n        cardinality_dominated = other_plan_cost.op_estimates.cardinality <= plan_cost.op_estimates.cardinality\n\n        return cost_dominated and time_dominated and quality_dominated and cardinality_dominated\n\n    def _is_pareto_optimal(self, expr_plan_cost: PlanCost, pareto_optimal_physical_expressions: list[Expression], policy: Policy) -> bool:\n        \"\"\"\n        Return True if expr_plan_cost is pareto optimal and False otherwise.\n        \"\"\"\n        pareto_optimal = True\n        for pareto_phys_expr in pareto_optimal_physical_expressions:\n            for 
other_expr_plan_cost, _ in pareto_phys_expr.pareto_optimal_plan_costs:\n                if self._is_dominated(expr_plan_cost, other_expr_plan_cost, policy):\n                    pareto_optimal = False\n                    break\n\n        return pareto_optimal\n\n    def update_pareto_optimal_physical_expressions(self, group: Group, policy: Policy) -> Group:\n        \"\"\"\n        Update the pareto optimal physical expressions for the given group and policy (if necessary).\n        \"\"\"\n        for pareto_expr_plan_cost, _ in self.physical_expression.pareto_optimal_plan_costs:\n            # if the pareto optimal physical expressions are empty, set the pareto optimal\n            # physical expressions to be this expression\n            if group.pareto_optimal_physical_expressions is None:\n                group.pareto_optimal_physical_expressions = [self.physical_expression]\n\n            # otherwise, if this expression is pareto optimal, update the pareto frontier\n            elif self._is_pareto_optimal(pareto_expr_plan_cost, group.pareto_optimal_physical_expressions, policy):\n                all_physical_expressions = [self.physical_expression] + group.pareto_optimal_physical_expressions\n\n                # compute the pareto optimal set of expressions (or plan costs)\n                pareto_optimal_physical_expressions = []\n                for idx, expr in enumerate(all_physical_expressions):\n                    for plan_cost, _ in expr.pareto_optimal_plan_costs:\n                        pareto_optimal = True\n\n                        # check if any other_expr has a plan cost which dominates plan_cost\n                        for other_idx, other_expr in enumerate(all_physical_expressions):\n                            if idx == other_idx:\n                                continue\n\n                            # if plan is dominated by other_expr, set pareto_optimal = False and break\n                            for other_plan_cost, _ in 
other_expr.pareto_optimal_plan_costs:\n                                if self._is_dominated(plan_cost, other_plan_cost, policy):\n                                    pareto_optimal = False\n                                    break\n\n                            # break early if plan_cost is already dominated by another expression's plan_cost\n                            if not pareto_optimal:\n                                break\n\n                        # add expr to pareto frontier if it has at least one plan cost which is not dominated\n                        if pareto_optimal:\n                            pareto_optimal_physical_expressions.append(expr)\n\n                            # we can break now because we've identified that this expression has a plan on the pareto frontier\n                            break\n\n                # set pareto optimal physical expressions for the group\n                group.pareto_optimal_physical_expressions = pareto_optimal_physical_expressions\n\n        return group\n\n    def perform(\n        self,\n        cost_model: BaseCostModel,\n        groups: dict[int, Group],\n        policy: Policy,\n        context: dict[str, Any] | None = None,\n    ) -> list[Task]:\n        logger.debug(f\"Optimizing physical expression {self.physical_expression}\")\n\n        if context is None:\n            context = {}\n\n        # get the optimizer strategy (type) and the execution strategy (type) from the context\n        optimizer_strategy: OptimizationStrategyType = context['optimizer_strategy']\n        execution_strategy: ExecutionStrategyType = context['execution_strategy']\n\n        # return if we've already computed the cost of this physical expression\n        if optimizer_strategy.is_pareto() and self.physical_expression.pareto_optimal_plan_costs is not None:\n            return []\n\n        if optimizer_strategy.is_not_pareto() and self.physical_expression.plan_cost is not None:\n            return []\n\n        # 
for expressions with input group(s), compute the input plan cost(s)\n        best_input_plan_costs = {}\n        pareto_optimal_input_plan_costs = {}\n        if len(self.physical_expression.input_group_ids) > 0:\n            new_tasks = []\n            for input_group_id in self.physical_expression.input_group_ids:\n                # get the input group\n                input_group = groups[input_group_id]\n\n                # compute the input plan cost or list of input plan costs\n                if optimizer_strategy.is_not_pareto() and input_group.best_physical_expression is not None:\n                    # TODO: apply policy constraint here\n                    best_input_plan_costs[input_group_id] = input_group.best_physical_expression.plan_cost\n\n                elif optimizer_strategy.is_pareto() and input_group.pareto_optimal_physical_expressions is not None:\n                    # TODO: apply policy constraint here\n                    input_plan_costs = []\n                    for pareto_physical_expression in input_group.pareto_optimal_physical_expressions:\n                        plan_costs = list(map(lambda tup: tup[0], pareto_physical_expression.pareto_optimal_plan_costs))\n                        input_plan_costs.extend(plan_costs)\n\n                    # NOTE: this list will not necessarily be pareto-optimal, as a plan cost on the pareto frontier of\n                    # one pareto_optimal_physical_expression might be dominated by the plan cost on another physical\n                    # expression's pareto frontier; we handle this below by taking the pareto frontier of all_possible_plan_costs\n                    # de-duplicate equivalent plan costs; we will still reconstruct plans with equivalent cost in optimizer.py\n                    pareto_optimal_input_plan_costs[input_group_id] = list(set(input_plan_costs))\n\n                else:\n                    task = OptimizeGroup(input_group_id)\n                    new_tasks.append(task)\n\n 
           # if not all input groups have been costed, we need to compute these first and then retry this task\n            if len(new_tasks) > 0:\n                return [self] + new_tasks\n\n        # once all input groups have been costed, compute the cost of this physical expression\n        group = groups[self.physical_expression.group_id]\n        if optimizer_strategy.is_pareto():\n            # compute all possible plan costs for this physical expression given the pareto optimal input plan costs\n            all_possible_plan_costs = []\n            if isinstance(self.physical_expression.operator, JoinOp):\n                assert len(self.physical_expression.input_group_ids) == 2, \"Join operator must have exactly two input groups.\"\n\n                # get the best input plan costs for both inputs\n                left_input_group_id, right_input_group_id = self.physical_expression.input_group_ids\n                left_best_input_plan_cost = pareto_optimal_input_plan_costs[left_input_group_id]\n                right_best_input_plan_cost = pareto_optimal_input_plan_costs[right_input_group_id]\n                for left_input_plan_cost in left_best_input_plan_cost:\n                    for right_input_plan_cost in right_best_input_plan_cost:\n                        # compute the cost of this operator given the input plan costs\n                        op_plan_cost = cost_model(\n                            self.physical_expression.operator,\n                            left_input_plan_cost.op_estimates,\n                            right_input_plan_cost.op_estimates,\n                        )\n\n                        # compute the total cost for this physical expression by summing its operator's PlanCost\n                        # with the input groups' total PlanCost; also set the op_estimates for this expression's operator\n                        execution_strategy_str = \"parallel\" if execution_strategy.is_fully_parallel() else \"sequential\"\n      
                  full_plan_cost = op_plan_cost.join_add(left_input_plan_cost, right_input_plan_cost, execution_strategy_str)\n                        full_plan_cost.op_estimates = op_plan_cost.op_estimates\n                        all_possible_plan_costs.append((full_plan_cost, (left_input_plan_cost, right_input_plan_cost)))\n\n            else:\n                assert len(self.physical_expression.input_group_ids) < 2, \"Non-join operator must have zero or one input groups.\"\n\n                input_plan_costs = [PlanCost(cost=0, time=0, quality=1)]\n                if len(self.physical_expression.input_group_ids) == 1:\n                    input_group_id = self.physical_expression.input_group_ids[0]\n                    input_plan_costs = pareto_optimal_input_plan_costs[input_group_id]\n\n                # get the pareto-optimal input plan costs for the single input\n                for input_plan_cost in input_plan_costs:\n                    op_plan_cost = cost_model(self.physical_expression.operator, input_plan_cost.op_estimates)\n\n                    # compute the total cost for this physical expression by summing its operator's PlanCost\n                    # with the input groups' total PlanCost; also set the op_estimates for this expression's operator\n                    full_plan_cost = op_plan_cost + input_plan_cost\n                    full_plan_cost.op_estimates = op_plan_cost.op_estimates\n                    all_possible_plan_costs.append((full_plan_cost, (input_plan_cost, None)))\n\n            # reduce the set of possible plan costs to the subset which are pareto-optimal\n            pareto_optimal_plan_costs = []\n            for idx, (plan_cost, input_plan_cost_tuple) in enumerate(all_possible_plan_costs):\n                pareto_optimal = True\n\n                # check if any other_expr dominates expr\n                for other_idx, (other_plan_cost, _) in enumerate(all_possible_plan_costs):\n                    if idx == other_idx:\n        
                continue\n\n                    # if plan is dominated by other_expr, set pareto_optimal = False and break\n                    if self._is_dominated(plan_cost, other_plan_cost, policy):\n                        pareto_optimal = False\n                        break\n\n                # add expr to pareto frontier if it's not dominated\n                if pareto_optimal:\n                    pareto_optimal_plan_costs.append((plan_cost, input_plan_cost_tuple))\n\n            # set the pareto frontier of plan costs which can be obtained by this physical expression\n            self.physical_expression.pareto_optimal_plan_costs = pareto_optimal_plan_costs\n\n            # update the group's pareto optimal costs\n            group = self.update_pareto_optimal_physical_expressions(group, policy)\n\n        else:\n\n            # otherwise, compute the cost of this operator given the optimal input plan cost(s)\n            full_plan_cost = None\n            if isinstance(self.physical_expression.operator, JoinOp):\n                assert len(self.physical_expression.input_group_ids) == 2, \"Join operator must have exactly two input groups.\"\n\n                # get the best input plan costs for both inputs\n                left_input_group_id, right_input_group_id = self.physical_expression.input_group_ids\n                left_best_input_plan_cost = best_input_plan_costs[left_input_group_id]\n                right_best_input_plan_cost = best_input_plan_costs[right_input_group_id]\n\n                # compute the cost of this operator given the best input plan costs\n                op_plan_cost = cost_model(\n                    self.physical_expression.operator,\n                    left_best_input_plan_cost.op_estimates,\n                    right_best_input_plan_cost.op_estimates,\n                )\n\n                # compute the total cost for this physical expression by summing its operator's PlanCost\n                # with the input groups' 
total PlanCost; also set the op_estimates for this expression's operator\n                execution_strategy_str = \"parallel\" if execution_strategy.is_fully_parallel() else \"sequential\"\n                full_plan_cost = op_plan_cost.join_add(left_best_input_plan_cost, right_best_input_plan_cost, execution_strategy_str)\n                full_plan_cost.op_estimates = op_plan_cost.op_estimates\n\n            else:\n                assert len(self.physical_expression.input_group_ids) < 2, \"Non-join operator must have zero or one input groups.\"\n\n                # get the best input plan cost for the single input\n                best_input_plan_cost = PlanCost(cost=0, time=0, quality=1)\n                if len(self.physical_expression.input_group_ids) == 1:\n                    input_group_id = self.physical_expression.input_group_ids[0]\n                    best_input_plan_cost = best_input_plan_costs[input_group_id]\n\n                # compute the cost of this operator given the best input plan cost\n                op_plan_cost = cost_model(self.physical_expression.operator, best_input_plan_cost.op_estimates)\n\n                # compute the total cost for this physical expression by summing its operator's PlanCost\n                # with the input groups' total PlanCost; also set the op_estimates for this expression's operator\n                full_plan_cost = op_plan_cost + best_input_plan_cost\n                full_plan_cost.op_estimates = op_plan_cost.op_estimates\n\n            # set the plan cost for this physical expression\n            self.physical_expression.plan_cost = full_plan_cost\n\n            # update the best physical expression for the group\n            group = self.update_best_physical_expression(group, policy)\n\n        # set the group's optimized flag to True, store the updated group, and return\n        group.optimized = True\n        groups[self.physical_expression.group_id] = group\n\n        logger.debug(f\"Done optimizing 
physical expression {self.physical_expression}\")\n        return []\n"
  },
  {
    "path": "src/palimpzest/query/processor/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/query/processor/config.py",
    "content": "from __future__ import annotations\n\nfrom pydantic import BaseModel, ConfigDict, Field\n\nfrom palimpzest.constants import Model\nfrom palimpzest.policy import MaxQuality, Policy\n\n\n# TODO: Add description for each field.\nclass QueryProcessorConfig(BaseModel):\n    \"\"\"Shared context for query processors\"\"\"\n    model_config = ConfigDict(arbitrary_types_allowed=True)\n\n    # execution and optimization flags\n    execution_strategy: str = Field(default=\"parallel\")              # substituted with ExecutionStrategyType\n    sentinel_execution_strategy: str | None = Field(default=\"auto\")  # substituted with SentinelExecutionStrategyType\n    optimizer_strategy: str = Field(default=\"pareto\")                # substituted with OptimizationStrategyType\n\n    # general execution flags\n    policy: Policy = Field(default_factory=MaxQuality)\n    enforce_types: bool = Field(default=False)\n    scan_start_idx: int = Field(default=0)\n    num_samples: int | None = Field(default=None)\n    verbose: bool = Field(default=False)\n    progress: bool = Field(default=True)\n    available_models: list[Model | str] | None = Field(default=None)\n    remove_models: list[Model] | None = Field(default=None)\n    max_workers: int | None = Field(default=64)\n    join_parallelism: int = Field(default=64)\n    batch_size: int | None = Field(default=None)\n    reasoning_effort: str = Field(default=\"default\")  # Gemini: \"disable\", \"low\", \"medium\", \"high\"\n    use_vertex: bool = Field(default=False)  # Whether to use Vertex models for Gemini or Google models\n    use_azure: bool = Field(default=False)  # Whether to use Azure for OpenAI models\n    gemini_credentials_path: str | None = Field(default=None)  # Path to Gemini credentials file\n    azure_endpoint: str | None = Field(default=None)  # Azure endpoint URL (AZURE_API_BASE)\n    azure_api_version: str | None = Field(default=None)  # Azure API version\n\n    # operator flags\n    allow_bonded_query: 
bool = Field(default=True)\n    allow_model_selection: bool = Field(default=True)\n    allow_rag_reduction: bool = Field(default=True)\n    allow_mixtures: bool = Field(default=True)\n    allow_critic: bool = Field(default=True)\n    allow_split_merge: bool = Field(default=False)\n    use_final_op_quality: bool = Field(default=False)\n\n    # sentinel optimization flags\n    k: int = Field(default=6)\n    j: int = Field(default=4)\n    sample_budget: int = Field(default=100)\n    sample_cost_budget: float | None = Field(default=None)\n    seed: int = Field(default=42)\n    exp_name: str | None = Field(default=None)\n    priors: dict | None = Field(default=None)\n    dont_use_priors: bool = Field(default=False)\n\n    def to_dict(self) -> dict:\n        \"\"\"Convert the config to a dict representation.\"\"\"\n        return self.model_dump()\n\n    def copy(self) -> QueryProcessorConfig:\n        \"\"\"Create a copy of the config.\"\"\"\n        return QueryProcessorConfig(**self.to_dict())\n"
  },
  {
    "path": "src/palimpzest/query/processor/query_processor.py",
    "content": "import logging\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.data.dataset import Dataset\nfrom palimpzest.core.elements.records import DataRecord, DataRecordCollection\nfrom palimpzest.core.models import ExecutionStats, PlanStats\nfrom palimpzest.policy import Policy\nfrom palimpzest.query.execution.execution_strategy import ExecutionStrategy, SentinelExecutionStrategy\nfrom palimpzest.query.optimizer.cost_model import SampleBasedCostModel\nfrom palimpzest.query.optimizer.optimizer import Optimizer\nfrom palimpzest.query.optimizer.optimizer_strategy_type import OptimizationStrategyType\nfrom palimpzest.query.optimizer.plan import SentinelPlan\nfrom palimpzest.utils.hash_helpers import hash_for_id\nfrom palimpzest.validator.validator import Validator\n\nlogger = logging.getLogger(__name__)\n\nclass QueryProcessor:\n    \"\"\"\n    Processes queries through the complete pipeline:\n    1. Optimization phase: Plan generation and selection\n    2. Execution phase: Plan execution and result collection\n    3. 
Result phase: Statistics gathering and result formatting\n    \"\"\"\n    def __init__(\n        self,\n        dataset: Dataset,\n        optimizer: Optimizer,\n        execution_strategy: ExecutionStrategy,\n        sentinel_execution_strategy: SentinelExecutionStrategy | None,\n        num_samples: int | None = None,\n        train_dataset: dict[str, Dataset] | None = None,\n        validator: Validator | None = None,\n        scan_start_idx: int = 0,\n        verbose: bool = False,\n        progress: bool = True,\n        max_workers: int | None = None,\n        policy: Policy | None = None,\n        available_models: list[Model] | None = None,\n        **kwargs,  # needed in order to provide compatibility with QueryProcessorConfig\n    ):\n        \"\"\"\n        Initialize QueryProcessor with optional custom components.\n        \n        Args:\n            dataset: Dataset to process\n            TODO\n        \"\"\"\n        self.dataset = dataset\n        self.optimizer = optimizer\n        self.execution_strategy = execution_strategy\n        self.sentinel_execution_strategy = sentinel_execution_strategy\n        self.num_samples = num_samples\n        self.train_dataset = train_dataset\n        self.validator = validator\n        self.scan_start_idx = scan_start_idx\n        self.verbose = verbose\n        self.progress = progress\n        self.max_workers = max_workers\n        self.policy = policy\n        self.available_models = available_models\n\n        if self.verbose:\n            print(\"Available models: \", self.available_models)\n\n        logger.info(f\"Initialized QueryProcessor {self.__class__.__name__}\")\n        logger.debug(f\"QueryProcessor initialized with config: {self.__dict__}\")\n\n    def execution_id(self) -> str:\n        \"\"\"\n        Hash of the class parameters.\n        \"\"\"\n        id_str = \"\"\n        for attr, value in self.__dict__.items():\n            if not attr.startswith(\"_\"):\n                id_str += 
f\"{attr}={value},\"\n\n        return hash_for_id(id_str)\n\n    def _create_sentinel_plan(self, train_dataset: dict[str, Dataset] | None) -> SentinelPlan:\n        \"\"\"\n        Generates and returns a SentinelPlan for the given dataset.\n        \"\"\"\n        # create a new optimizer and update its strategy to SENTINEL\n        optimizer = self.optimizer.deepcopy_clean()\n        optimizer.update_strategy(OptimizationStrategyType.SENTINEL)\n\n        # create copy of dataset, but change its root Dataset(s) to the validation Dataset(s)\n        dataset = self.dataset.copy()\n        if train_dataset is not None:\n            dataset._set_root_datasets(train_dataset)\n            dataset._generate_unique_logical_op_ids()\n\n        # get the sentinel plan for the given dataset\n        sentinel_plans = optimizer.optimize(dataset)\n        sentinel_plan = sentinel_plans[0]\n\n        return sentinel_plan\n\n    def _execute_best_plan(self, dataset: Dataset, optimizer: Optimizer) -> tuple[list[DataRecord], list[PlanStats]]:\n        # get the optimal plan according to the optimizer\n        plans = optimizer.optimize(dataset)\n        final_plan = plans[0]\n\n        # execute the plan\n        records, plan_stats = self.execution_strategy.execute_plan(plan=final_plan)\n\n        # return the output records and plan stats\n        return records, [plan_stats]\n\n    def execute(self) -> DataRecordCollection:\n        logger.info(f\"Executing {self.__class__.__name__}\")\n\n        # create execution stats\n        execution_stats = ExecutionStats(execution_id=self.execution_id())\n        execution_stats.start()\n\n        # if the user provides a validator, we perform optimization\n        if self.validator is not None:\n            # create sentinel plan\n            sentinel_plan = self._create_sentinel_plan(self.train_dataset)\n\n            # generate sample execution data\n            if self.train_dataset is not None:\n                sentinel_plan_stats 
= self.sentinel_execution_strategy.execute_sentinel_plan(sentinel_plan, self.train_dataset, self.validator)\n\n            else:\n                train_dataset = self.dataset._get_root_datasets()\n                sentinel_plan_stats = self.sentinel_execution_strategy.execute_sentinel_plan(sentinel_plan, train_dataset, self.validator)\n\n            # update the execution stats to account for the work done in optimization\n            execution_stats.add_plan_stats(sentinel_plan_stats)\n            execution_stats.finish_optimization()\n\n            # (re-)initialize the optimizer\n            self.optimizer = self.optimizer.deepcopy_clean()\n\n            # construct the CostModel with any sample execution data we've gathered\n            cost_model = SampleBasedCostModel(sentinel_plan_stats, self.verbose)\n            self.optimizer.update_cost_model(cost_model)\n\n        # execute plan(s) according to the optimization strategy\n        records, plan_stats = self._execute_best_plan(self.dataset, self.optimizer)\n\n        # update the execution stats to account for the work to execute the final plan\n        execution_stats.add_plan_stats(plan_stats)\n        execution_stats.finish()\n\n        # construct and return the DataRecordCollection\n        result = DataRecordCollection(records, execution_stats=execution_stats)\n        logger.info(f\"Done executing {self.__class__.__name__}\")\n\n        return result\n"
  },
  {
    "path": "src/palimpzest/query/processor/query_processor_factory.py",
    "content": "import logging\nimport os\nfrom enum import Enum\n\nfrom dotenv import load_dotenv\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.data.dataset import Dataset\nfrom palimpzest.core.elements.records import DataRecordCollection\nfrom palimpzest.query.execution.execution_strategy import ExecutionStrategy, SentinelExecutionStrategy\nfrom palimpzest.query.execution.execution_strategy_type import ExecutionStrategyType, SentinelExecutionStrategyType\nfrom palimpzest.query.optimizer.cost_model import SampleBasedCostModel\nfrom palimpzest.query.optimizer.optimizer import Optimizer\nfrom palimpzest.query.optimizer.optimizer_strategy_type import OptimizationStrategyType\nfrom palimpzest.query.processor.config import QueryProcessorConfig\nfrom palimpzest.query.processor.query_processor import QueryProcessor\nfrom palimpzest.utils.model_helpers import get_optimal_models\nfrom palimpzest.validator.validator import Validator\n\nlogger = logging.getLogger(__name__)\n\n\nclass QueryProcessorFactory:\n\n    @classmethod\n    def _convert_to_enum(cls, enum_type: type[Enum], value: str) -> Enum:\n        value = value.upper().replace('-', '_')\n        try:\n            return enum_type[value]\n        except KeyError as e:\n            raise ValueError(f\"Unsupported {enum_type.__name__}: {value}\") from e\n\n    @classmethod\n    def _normalize_strategies(cls, config: QueryProcessorConfig):\n        \"\"\"\n        Convert the string representation of each strategy into its Enum equivalent and throw\n        an exception if the conversion fails.\n        \"\"\"\n        strategy_types = {\n            \"execution_strategy\": ExecutionStrategyType,\n            \"sentinel_execution_strategy\": SentinelExecutionStrategyType,\n            \"optimizer_strategy\": OptimizationStrategyType,\n        }\n        for strategy in [\"execution_strategy\", \"sentinel_execution_strategy\", \"optimizer_strategy\"]:\n            strategy_str = getattr(config, 
strategy)\n            strategy_type = strategy_types[strategy]\n            strategy_enum = None\n            if strategy_str is not None:\n                try:\n                    strategy_enum = cls._convert_to_enum(strategy_type, strategy_str)\n                except ValueError as e:\n                    raise ValueError(f\"\"\"Unsupported {strategy}: {strategy_str}.\n                                        The supported strategies are: {strategy_type.__members__.keys()}\"\"\") from e\n            setattr(config, strategy, strategy_enum)\n            logger.debug(f\"Normalized {strategy}: {strategy_enum}\")\n\n        return config\n    \n    @classmethod\n    def _normalize_models(cls, config: QueryProcessorConfig) -> QueryProcessorConfig:\n        \"\"\"\n        Validate and normalize available_models and remove_models; converts all model strings to Model objects.\n        \"\"\"\n        # get the current set of available_models (if provided by the user's config)\n        current_available_models = getattr(config, 'available_models', [])\n\n        # normalize all models to be pz.Model objects\n        if current_available_models is not None and len(current_available_models) > 0:\n            assert all(\n                isinstance(model, (Model, str)) for model in current_available_models\n            ), \"Must provide pz.Model or the model's full string identifier for each element in `available_models`\"\n            current_available_models = [\n                Model(model) if isinstance(model, str) else model for model in current_available_models\n            ]\n\n        # if the user does not explicitly set the available models, select the optimal models based on policy\n        if current_available_models is None or len(current_available_models) == 0:\n            current_available_models = get_optimal_models(\n                policy = config.policy,\n                use_vertex = config.use_vertex,\n                use_azure = config.use_azure,\n    
            gemini_credentials_path = config.gemini_credentials_path,\n                azure_endpoint = config.azure_endpoint,\n                azure_api_version = config.azure_api_version,\n            )\n\n        # get the list of models to remove (if provided by the user's config)\n        remove_models = getattr(config, 'remove_models', [])\n\n        # remove any models specified in the config\n        if remove_models is not None and len(remove_models) > 0:\n            assert all(\n                isinstance(model, (Model, str)) for model in remove_models\n            ), \"Must provide pz.Model or the model's full string identifier for each element in `remove_models`\"\n            remove_models = [\n                Model(model) if isinstance(model, str) else model for model in remove_models\n            ]\n\n            # filter remove_models out of current_available_models\n            current_available_models = [model for model in current_available_models if model not in remove_models]\n\n        logger.info(f\"Final set of available models: {current_available_models}\")\n        config.available_models = current_available_models\n        config.remove_models = remove_models\n\n        return config\n\n    @classmethod\n    def _config_validation_and_normalization(cls, config: QueryProcessorConfig, train_dataset: dict[str, Dataset] | None, validator : Validator | None):\n        if config.policy is None:\n            raise ValueError(\"Policy is required for optimizer\")\n        \n        # only one of progress or verbose can be set; we will default to progress=True\n        if config.progress and config.verbose:\n            print(\"WARNING: Both `progress` and `verbose` are set to True, but only one can be True at a time; defaulting to `progress=True`\")\n            config.verbose = False\n\n        # if the user provides a training dataset, but no validator, create a default validator\n        if train_dataset is not None and validator is None:\n    
        validator = Validator()\n            logger.info(\"No validator provided; using default Validator\")\n\n        # boolean flag for whether we're performing optimization or not\n        optimization = validator is not None\n\n        # handle \"auto\" default for sentinel execution strategies\n        if config.sentinel_execution_strategy == \"auto\":\n            config.sentinel_execution_strategy = \"mab\" if optimization else None\n\n        # convert the config values for processing, execution, and optimization strategies to enums\n        config = cls._normalize_strategies(config)\n        config = cls._normalize_models(config)\n\n        if len(config.available_models) == 0:\n            raise ValueError(\"No available models found.\")\n\n        openai_key = os.getenv(\"OPENAI_API_KEY\")\n        azure_key = os.getenv(\"AZURE_API_KEY\") or os.getenv(\"AZURE_OPENAI_API_KEY\")\n        anthropic_key = os.getenv(\"ANTHROPIC_API_KEY\")\n        together_key = os.getenv(\"TOGETHER_API_KEY\")\n        gemini_key = os.getenv(\"GEMINI_API_KEY\")\n        google_key = os.getenv(\"GOOGLE_API_KEY\")\n\n        vllm_models = [model for model in config.available_models if model.is_vllm_model()]\n        if len(vllm_models) > 1:\n            raise ValueError(\"Only one vLLM model can be used per run. 
Multiple vLLM models found in available_models.\")\n\n        for model in config.available_models:\n            if model.is_provider_openai() and not openai_key:\n                raise ValueError(\"OPENAI_API_KEY must be set to use OpenAI models.\")\n            if model.is_provider_azure() and not azure_key:\n                raise ValueError(\"AZURE_API_KEY or AZURE_OPENAI_API_KEY must be set to use Azure OpenAI models.\")\n            if model.is_provider_anthropic() and not anthropic_key:\n                raise ValueError(\"ANTHROPIC_API_KEY must be set to use Anthropic models.\")\n            if model.is_provider_together_ai() and not together_key:\n                raise ValueError(\"TOGETHER_API_KEY must be set to use Together models.\")\n            if model.is_provider_google_ai_studio() and not (gemini_key or google_key or config.gemini_credentials_path):\n                raise ValueError(\"GEMINI_API_KEY, GOOGLE_API_KEY, or gemini_credentials path must be set to use Google Gemini models.\")\n            if model.is_vllm_model() and model.api_base is None:\n                raise ValueError(\"api_base must be set on the Model instance to use vLLM models.\")\n        return config, validator\n\n    @classmethod\n    def _create_optimizer(cls, config: QueryProcessorConfig) -> Optimizer:\n        return Optimizer(cost_model=SampleBasedCostModel(), **config.to_dict())\n\n    @classmethod\n    def _create_execution_strategy(cls, dataset: Dataset, config: QueryProcessorConfig) -> ExecutionStrategy:\n        \"\"\"\n        Creates an execution strategy based on the configuration.\n        \"\"\"\n        # for parallel execution, set the batch size if there's a limit in the query\n        limit = dataset.get_limit()\n        if limit is not None and config.execution_strategy == ExecutionStrategyType.PARALLEL:\n            if config.batch_size is None:\n                config.batch_size = limit\n                logger.info(f\"Setting batch size to query limit: 
{limit}\")\n            elif config.batch_size > limit:\n                config.batch_size = limit\n                logger.info(f\"Setting batch size to query limit: {limit} since it was larger than the limit\")\n\n        # create the execution strategy\n        execution_strategy_cls = config.execution_strategy.value\n        return execution_strategy_cls(**config.to_dict())\n\n    @classmethod\n    def _create_sentinel_execution_strategy(cls, config: QueryProcessorConfig) -> SentinelExecutionStrategy:\n        \"\"\"\n        Creates an execution strategy based on the configuration.\n        \"\"\"\n        if config.sentinel_execution_strategy is None:\n            return None\n\n        sentinel_execution_strategy_cls = config.sentinel_execution_strategy.value\n        return sentinel_execution_strategy_cls(**config.to_dict())\n\n    @classmethod\n    def create_processor(\n        cls,\n        dataset: Dataset,\n        config: QueryProcessorConfig | None = None,\n        train_dataset: dict[str, Dataset] | None = None,\n        validator: Validator | None = None,\n    ) -> QueryProcessor:\n        \"\"\"\n        Creates a QueryProcessor with specified processing and execution strategies.\n\n        Args:\n            dataset: The dataset to process\n            config: The user-provided QueryProcessorConfig; if it is None, the default config will be used\n            kwargs: Additional keyword arguments to pass to the QueryProcessorConfig\n        \"\"\"\n        if config is None:\n            config = QueryProcessorConfig()\n\n        # make a copy of the config to avoid modifying the original\n        config = config.copy()\n\n        # apply any additional keyword arguments to the config and validate its contents\n        config, validator = cls._config_validation_and_normalization(config, train_dataset, validator)\n\n        # update the dataset's types if we're not enforcing types\n        if not config.enforce_types:\n            
dataset.relax_types()\n            if train_dataset is not None:\n                for ds in train_dataset.values():\n                    ds.relax_types()\n\n        # create the optimizer, execution strategies, and processor\n        optimizer = cls._create_optimizer(config)\n        config.execution_strategy = cls._create_execution_strategy(dataset, config)\n        config.sentinel_execution_strategy = cls._create_sentinel_execution_strategy(config)\n        processor = QueryProcessor(dataset, optimizer, train_dataset=train_dataset, validator=validator, **config.to_dict())\n\n        return processor\n\n    @classmethod\n    def create_and_run_processor(\n        cls,\n        dataset: Dataset,\n        config: QueryProcessorConfig | None = None,\n        train_dataset: dict[str, Dataset] | None = None,\n        validator: Validator | None = None,\n    ) -> DataRecordCollection:\n        load_dotenv(override=True)\n        logger.info(f\"Creating processor for dataset: {dataset}\")\n        processor = cls.create_processor(dataset, config, train_dataset, validator)\n        logger.info(f\"Created processor: {processor}\")\n\n        return processor.execute()\n"
  },
  {
    "path": "src/palimpzest/schemabuilder/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/schemabuilder/schema_builder.py",
    "content": "\"\"\"\nThis module is responsible for building schemas dynamically, taking an input file and generating a schema for it.\nThe class offers a general class method, from_file, that takes a file and generates a schema for it.\nThis method is a simple wrapper for different methods, e.g., from_csv, from_yml, etc.\n\n\"\"\"\n\nimport json\nimport os\nfrom typing import Any\n\nimport pandas as pd\nimport yaml\nfrom pydantic import BaseModel\nfrom pyld import jsonld\n\nimport palimpzest.core.lib.schemas as pz_schemas\nfrom palimpzest.core.lib.schemas import create_schema_from_fields\n\n\nclass SchemaBuilder:\n\n    @classmethod\n    def from_file(cls,\n        schema_file: str,\n        schema_name: str = \"\",\n        include_attributes: list = None,\n        exclude_attributes: list = None,\n        schema_type: BaseModel = None,\n        ):\n        \"\"\"\n        Inputs:\n            schema_file: str - the path to the file\n            name (optional): str - the name of the schema\n            include_attributes (optional): list - a list of attribute names to include in the schema. If None, all attributes are included.\n            exclude_attributes (optional): list - a list of attribute names to exclude from the schema. If None, no attributes are excluded.\n            schema_type (optional): BaseModel - the parent type of the schema to generate, e.g. ScientificPapers have a schema_type of PDFFile. 
If None, a generic Schema type is used.\n        Outputs:\n            A class object - the dynamically generated class\n        \"\"\"\n\n        # Get the file extension\n        filename = os.path.basename(schema_file)\n        basename, file_extension = os.path.splitext(filename)\n      \n        if file_extension == \".csv\":\n            schema_data = cls.from_csv(schema_file)\n        elif file_extension == \".json\":\n            schema_data = cls.from_json(schema_file)\n        elif file_extension == \".jsonld\":\n            schema_data = cls.from_jsonld(schema_file)\n        elif file_extension == \".yml\":\n            schema_data = cls.from_yml(schema_file)\n        else:\n            raise ValueError(\"Unsupported file format!\")\n\n        # If additional metadata is not provided, read it from the file. \n        # If not available, generate it from the filename\n        if not schema_name:\n            if schema_data['name']:\n                schema_name = schema_data['name']\n            else:\n                schema_name = \"\".join([word.capitalize() for word in basename.split(\"_\")])\n\n        if include_attributes is None:\n           include_attributes = []\n        if exclude_attributes is None:\n           exclude_attributes = []\n\n        if schema_type is None:\n            if schema_data.get('type', None):\n                # Find if the schema type is a valid class in pz\n                parsed_type = getattr(pz_schemas, schema_data['type'], BaseModel)\n                schema_type = parsed_type if issubclass(parsed_type, BaseModel) else BaseModel\n            else:\n                schema_type = BaseModel\n\n        # Generate the schema class dynamically\n        fields = [\n            {\"name\": field_name, \"description\": field.description, \"type\": field.annotation}\n            for field_name, field in schema_type.model_fields.items()\n        ]\n        include_attributes_lower = set([a.lower() for a in include_attributes])\n  
      exclude_attributes_lower = set([a.lower() for a in exclude_attributes])\n        for field in schema_data['fields']:\n            norm_name = field['name'].lower()\n            if len(include_attributes) and norm_name not in include_attributes_lower:\n                continue\n            if norm_name in exclude_attributes_lower:\n                continue\n            name = field['name']\n            description = field.get('description', '')\n            fields.append({\"name\": name, \"description\": description, \"type\": Any})\n\n        return create_schema_from_fields(fields)\n\n    @classmethod\n    def from_csv(\n        cls,\n        schema_file: str,\n    ) -> dict:\n        \"\"\"\n        The attributes are extracted from the column names of the CSV file.\n        Columns with a float or int dtype are typed as 'NumericField'; all other columns are typed as 'Field'.\n        TODO: Find a way to infer the description of the field.\n        \"\"\"\n        \n        # Use pandas to read the CSV file\n        df = pd.read_csv(schema_file)\n        columns = df.columns.tolist()\n\n        # Generate the schema class dynamically\n        fields = []\n        for col in columns:\n            field_type = df[col].dtype\n            if field_type == float or field_type == int:  # noqa\n                field_type = \"NumericField\"\n            else:\n                field_type = \"Field\"\n\n            fields.append({\"name\":col,\n                           \"description\":\"\",\n                           \"type\":field_type})\n        \n        return {\n            \"name\": '',\n            \"description\": '',\n            \"fields\": fields,\n        }\n\n    @classmethod\n    def from_jsonld(\n        cls,\n        schema_file: str,\n    ) -> dict:\n        \"\"\"JSON-LD schema builder.\n        The attributes are extracted from the JSON-LD objects of type 'rdfs:Class'.\n        If they contain a 'comment' field, this is used to populate a schema description.\n        If they contain a 
'rangeIncludes' field, this is used within the description to\n        signal the list of valid values.\n       \"\"\"\n\n        # Load the schema from the JSONLD file\n        with open(schema_file) as file:\n            jsonld_data = json.load(file)\n        context = jsonld_data.get(\"@context\")\n        compacted_data = jsonld.compact(jsonld_data, context)\n        compacted_graph = compacted_data[\"@graph\"]\n\n        fields = []\n\n        for node in compacted_graph:\n            if node.get(\"@type\") != \"rdfs:Class\":\n                continue\n            name = node.get(\"rdfs:label\")\n\n            values = []\n            if \"schema:rangeIncludes\" in node:\n                values = [val[\"@id\"].split(\":\")[-1] for val in node[\"schema:rangeIncludes\"]]\n            \n            description = node.get(\"rdfs:comment\", \"\")\n            if values:\n                description += \" The only valid values are: \" + \", \".join(values)\n            fields.append({\n                \"name\": name,\n                \"description\": description, \n                \"values\": values})\n\n        return {\n            \"name\": '',\n            \"description\": '',\n            \"fields\": fields,\n        }\n    \n    @classmethod\n    def from_json(\n            cls,\n            schema_file: str,\n            schema_name: str = \"\",\n            include_attributes: list = None,\n            schema_type: BaseModel = None,\n    )-> dict:\n        \"\"\"\n        The attributes are extracted from the JSON objects.\n        The format of the json has to be in the form:\n        {\n            \"attribute1\": {\n                \"description\": \"description\",\n            },\n            ...\n        }\n        \"\"\"\n\n        # Load the schema from the JSON file\n        with open(schema_file) as file:\n            schema_data = json.load(file)\n\n        return schema_data\n    \n    @classmethod\n    def from_yml(\n            cls,\n            
schema_file: str,\n    )-> dict:\n        \"\"\"\n        The attributes are extracted from the YAML file.\n        The schema name and description are extracted from the YAML file if it contains them, overwriting the input parameters.\n\n        The format of the yaml has to be in the form:\n        schema:\n          name:\n          description:\n          fields:\n            - name: attribute_name\n              description: description\n        ...\n        \"\"\"\n\n        # Load the schema from the YAML file\n        with open(schema_file) as file:\n            schema_data = yaml.safe_load(file)\n\n        schema_data = schema_data[\"schema\"]\n        return {\n            \"name\": schema_data.get(\"name\", \"\"),\n            \"description\": schema_data.get(\"description\", \"\"),\n            \"fields\": schema_data.get(\"fields\", []),\n            \"type\": schema_data.get(\"type\", \"\")\n        }"
  },
  {
    "path": "src/palimpzest/tools/README.md",
    "content": "# Palimpsest Tools Catalog\nThis is a directory of tools we have for the palimpzest.\n\n## Tools\nFor the pdf precessing tools, we have the following:\n- Papermage lambda service with parallel processing support\n- Cosmos literature pdf processing service with extraction of paper text, table, plot, diagrams and equation images\n\nFor the equation extraction (image to latex), we have the following:\n- GPT4V equation extraction service\n- SKEMA equation extraction service (low accuracy)\n\n\n\n"
  },
  {
    "path": "src/palimpzest/tools/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/tools/allenpdf.py",
    "content": "import modal\n\napp = modal.App(\"palimpzest.tools\")\npip_packs = [\n    \"papermage\",\n    \"tqdm\",\n    \"transformers\",\n    \"pdf2image\",\n    \"pdfplumber==0.7.4\",\n    \"requests\",\n    \"numpy>=1.23.2\",\n    \"scipy>=1.9.0\",\n    \"pandas<2\",\n    \"ncls==0.0.68\",\n    \"necessary>=0.3.2\",\n    \"grobid-client-python==0.0.5\",\n    \"charset-normalizer\",\n    \"torch>=1.10.0\",\n    \"smashed\",\n    \"layoutparser\",\n    \"pysbd\",\n    \"decontext\",\n    \"vila\",\n]\n\npdf_processing_image = (\n    modal.Image.debian_slim(python_version=\"3.11\")\n    .apt_install([\"ffmpeg\", \"pkg-config\", \"libpoppler-cpp-dev\", \"poppler-utils\"])\n    .pip_install([\"torch==2.1.1\", \"pkgconfig\", \"python-poppler\"] + pip_packs)\n)\n\n\n@app.function(image=pdf_processing_image)\ndef process_papermage_pdf(pdf_bytes_docs: list[bytes]):\n    \"\"\"Process a PDF file and return the text contents.\"\"\"\n    import json\n    import os\n\n    from papermage.recipes import CoreRecipe\n\n    os.makedirs(\"/tmp\", exist_ok=True)\n    results = []\n    for pdf_bytes in pdf_bytes_docs:\n        recipe = CoreRecipe()\n\n        with open(\"/tmp/papermage.pdf\", \"wb\") as file:\n            file.write(pdf_bytes)\n\n        doc = recipe.run(\"/tmp/papermage.pdf\")\n\n        os.remove(\"/tmp/papermage.pdf\")\n\n        results.append(json.dumps(doc.to_json()))\n\n    return results\n\n\n@app.local_entrypoint()\ndef main():\n    import json\n\n    from papermage import Document\n\n    with open(\"test.pdf\", \"rb\") as file:\n        pdf_bytes_1 = file.read()\n\n    results = process_papermage_pdf.local([pdf_bytes_1])\n    for idx, r in enumerate(results):\n        docdict = json.loads(r)\n        doc = Document.from_json(docdict)\n        print(idx, doc)\n        for p in doc.pages:\n            print(p)\n"
  },
  {
    "path": "src/palimpzest/tools/pdfparser.py",
    "content": "import hashlib\nimport io\nimport json\nimport os\nimport time\nfrom typing import BinaryIO\nfrom zipfile import ZipFile\n\nimport pandas as pd\nimport requests\nfrom fastapi import status\nfrom pypdf import PdfReader\n\nCOSMOS_ADDRESS = \"https://xdd.wisc.edu/cosmos_service\"\n\n\ndef get_md5(file_bytes: bytes) -> str:\n    if not isinstance(file_bytes, bytes):\n        file_bytes = file_bytes.encode()\n    return hashlib.md5(file_bytes).hexdigest()\n\n\n##\n# Function to extract a Cosmos parquet file to Cosmos JSON\n##\ndef cosmos_parquet_to_json(path):\n    parquet_df = pd.read_parquet(path)\n    parquet_json = parquet_df.to_json()\n    parquet_data = json.loads(parquet_json)\n\n    if len(parquet_data) > 0:\n        parquet_data_keys = list(parquet_data.keys())\n        num_data_rows = max([int(k) for k in parquet_data[parquet_data_keys[0]]])\n\n        row_order_parquet_data = [dict() for i in range(num_data_rows + 1)]\n        for field_key, row_data in parquet_data.items():\n            for row_idx, datum in row_data.items():\n                row_idx_num = int(row_idx)\n                row_order_parquet_data[row_idx_num][field_key] = datum\n\n        row_order_parquet_data.sort(\n            key=lambda d: (\n                d[\"page_num\"],\n                d[\"bounding_box\"][0] // 500,\n                d[\"bounding_box\"][1],\n            )\n        )\n\n        edits = list()\n        for e1, extraction1 in enumerate(row_order_parquet_data):\n            (ext1_x1, ext1_y1, ext1_x2, ext1_y2) = extraction1[\"bounding_box\"]\n            if ext1_x1 < 500:\n                continue\n\n            ext1_page_num = extraction1[\"page_num\"]\n            found_col_break = False\n            insertion_index = -1\n            t1 = e1\n            while t1 > 0:\n                extraction2 = row_order_parquet_data[t1 - 1]\n                ext2_page_num = extraction2[\"page_num\"]\n                if ext1_page_num > ext2_page_num:\n                    
break\n\n                (ext2_x1, ext2_y1, ext2_x2, ext2_y2) = extraction2[\"bounding_box\"]\n\n                if ext1_y2 <= ext2_y1:\n                    ext2_xspan = ext2_x2 - ext2_x1\n                    if ext2_xspan >= 800:\n                        found_col_break = True\n                        insertion_index = t1 - 1\n                t1 -= 1\n            if found_col_break:\n                edits.append(\n                    {\n                        \"del_idx\": e1,\n                        \"ins_idx\": insertion_index,\n                        \"val\": extraction1,\n                    }\n                )\n        for edit_dict in edits:\n            del row_order_parquet_data[edit_dict[\"del_idx\"]]\n            row_order_parquet_data.insert(edit_dict[\"ins_idx\"], edit_dict[\"val\"])\n        row_order_parquet_data.sort(key=lambda d: (d[\"pdf_name\"]))\n\n        name2results = dict()\n        for row_data in row_order_parquet_data:\n            if row_data[\"pdf_name\"] in name2results:\n                name2results[row_data[\"pdf_name\"]].append(row_data)\n            else:\n                name2results[row_data[\"pdf_name\"]] = [row_data]\n\n        return next(iter(name2results.items()))[1]\n\n\n##\n# Function to extract the text 'content' attribute from the Cosmos JSON data\n##\ndef cosmos_json_txt(cosmos_json):\n    # Initialize an empty list to store the content texts\n    content_texts = []\n\n    # Iterate over each item in the JSON data\n    for item in cosmos_json:\n        # Extract the 'content' attribute and add it to the list\n        content_texts.append(item.get(\"content\", \"\"))\n\n    return content_texts\n\n\ndef cosmos_client(name: str, data: BinaryIO, output_dir: str, delay=10):\n    files = [\n        (\"pdf\", (name, data, \"application/pdf\")),\n    ]\n    print(f\"Sending {name} to COSMOS\")\n    response = requests.post(f\"{COSMOS_ADDRESS}/process/\", files=files)\n    print(f\"Received response of  
{response.json()['status_endpoint']} from COSMOS: {response.status_code}\")\n    # get md5 of the data\n    md5 = get_md5(data)\n\n    if response.status_code == status.HTTP_202_ACCEPTED:\n        callback_endpoints = response.json()\n\n        for retry_num in range(400):\n            time.sleep(delay)\n            poll = requests.get(f\"{callback_endpoints['status_endpoint']}\")\n            print(f\"Polling COSMOS on retry num {retry_num + 1}\")\n            if poll.status_code == status.HTTP_200_OK:\n                poll_results = poll.json()\n                if poll_results[\"job_completed\"]:\n                    cosmos_response = requests.get(f\"{callback_endpoints['result_endpoint']}\")\n                    if cosmos_response.status_code == status.HTTP_200_OK:\n                        data = cosmos_response.content\n                        with ZipFile(io.BytesIO(data)) as z:\n                            output_subdir = os.path.join(\n                                output_dir, f\"COSMOS_{os.path.splitext(name)[0].replace(' ', '_')}_{md5}\"\n                            )\n                            os.makedirs(output_subdir, exist_ok=True)\n                            z.extractall(path=output_subdir)\n                            for file in os.listdir(output_subdir):\n                                if (\n                                    file.endswith(\".parquet\")\n                                    and not file.endswith(\"_figures.parquet\")\n                                    and not file.endswith(\"_pdfs.parquet\")\n                                    and not file.endswith(\"_tables.parquet\")\n                                    and not file.endswith(\"_sections.parquet\")\n                                    and not file.endswith(\"_equations.parquet\")\n                                ):\n                                    print(f\"Converting {file} to JSON\")\n                                    # if error while converting parquet to json, 
skip this file\n                                    try:\n                                        json_data = cosmos_parquet_to_json(os.path.join(output_subdir, file))\n                                        with open(\n                                            os.path.join(output_subdir, f\"{os.path.splitext(file)[0]}.json\"), \"w\"\n                                        ) as json_file:\n                                            json.dump(json_data, json_file)\n                                        with open(\n                                            os.path.join(output_subdir, f\"{os.path.splitext(file)[0]}.txt\"), \"w\"\n                                        ) as text_file:\n                                            text_file.write(\"\\n\".join(cosmos_json_txt(json_data)))\n                                        # print(f\"{file} : {json_data}\")\n\n                                    except Exception as e:\n                                        print(f\"Error while converting {file} to JSON: {e}\")\n                        return\n                        # raise RuntimeError(\"COSMOS data doesn't include document file for annotation\")\n\n                    else:\n                        # report the status of the result request, not the initial submission\n                        raise RuntimeError(\n                            f\"COSMOS Result Error - STATUS CODE: {cosmos_response.status_code} - {COSMOS_ADDRESS}\"\n                        )\n                # If not, just wait until the next iteration\n                else:\n                    pass\n\n        # If we reached this point, we timed out\n        raise TimeoutError(f\"Timed out waiting for COSMOS on retry num {retry_num + 1}\")\n\n    else:\n        raise RuntimeError(f\"COSMOS Error - STATUS CODE: {response.status_code} - {COSMOS_ADDRESS}\")\n\n\n##\n# Function to extract the text from a PDF file:\n# 1. Check if the text file already exists in the cache, if so, read from the cache\n# 2. 
If not, call the cosmos_client function to process the PDF file and cache the text file\n##\n\n# TODO(Jun): 1. cosmos returns 202 for me. 2. why only accept \"pypdf\" and \"cosmos\" as pdfprocessor?\ndef get_text_from_pdf(filename, pdf_bytes, pdfprocessor=\"pypdf\", enable_file_cache=True, file_cache_dir=\"/tmp\"):\n    pdf_filename = filename\n    file_name = os.path.basename(pdf_filename)\n    file_name_without_extension = os.path.splitext(file_name)[0]\n    text_file_name = f\"{file_name_without_extension}.txt\"\n\n    if pdfprocessor == \"pypdf\":\n        pdf = PdfReader(io.BytesIO(pdf_bytes))\n        all_text = \"\"\n        for page in pdf.pages:\n            all_text += page.extract_text() + \"\\n\"\n        return all_text\n\n    else:\n        # Get md5 of the pdf_bytes\n        md5 = get_md5(pdf_bytes)\n        cached_extraction_folder = f\"COSMOS_{os.path.splitext(file_name)[0].replace(' ', '_')}_{md5}\"\n\n        # Check if pz_file_cache_dir exists in the file system\n        pz_file_cache_dir = os.path.join(file_cache_dir, cached_extraction_folder)\n        if enable_file_cache and os.path.exists(pz_file_cache_dir):\n            print(f\"File {text_file_name} already exists in system tmp folder {pz_file_cache_dir}, reading from cache\")\n            text_file_path = os.path.join(pz_file_cache_dir, text_file_name)\n            with open(text_file_path) as file:\n                text_content = file.read()\n                return text_content\n\n        # Call the cosmos_client function\n        print(f\"Processing {file_name} through COSMOS\")\n        cosmos_client(file_name, pdf_bytes, file_cache_dir)\n        text_file_path = os.path.join(pz_file_cache_dir, text_file_name)\n        if not os.path.exists(text_file_path):\n            raise FileNotFoundError(f\"Text file {text_file_name} not found in {pz_file_cache_dir}/{text_file_name}\")\n        with open(text_file_path) as file:\n            text_content = file.read()\n            return 
text_content\n"
  },
  {
    "path": "src/palimpzest/tools/skema_tools.py",
    "content": "#####################################################\n#\n#####################################################\n# Description: This file contains the functions that are from ASKEM skema tools at endpoints:\n# https://api.askem.lum.ai/docs\n\nimport requests\n\n\ndef equations_to_latex(image_content):\n    url = \"https://api.askem.lum.ai/workflows/images/equations-to-latex\"\n    files = {\n        \"data\": image_content,\n    }\n    r = requests.post(url, files=files)\n    return r.text\n\n\ndef equations_to_latex_base64(image_content):\n    url = \"https://api.askem.lum.ai/workflows/images/base64/equations-to-latex\"\n    r = requests.post(url, data=image_content)\n    return r.text\n"
  },
  {
    "path": "src/palimpzest/utils/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/utils/env_helpers.py",
    "content": "import os\nimport sys\n\n\ndef load_env():\n    sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), \"..\")))\n\n    # read the env file\n    if os.path.exists(\".env\"):\n        with open(\".env\") as f:\n            for line in f:\n                key, value = line.strip().split(\"=\")\n                os.environ[key] = value\n"
  },
  {
    "path": "src/palimpzest/utils/hash_helpers.py",
    "content": "import hashlib\nimport json\n\nfrom palimpzest.constants import MAX_ID_CHARS\n\n\ndef hash_for_id(id_str: str, max_chars: int = MAX_ID_CHARS) -> str:\n    return hashlib.sha256(id_str.encode(\"utf-8\")).hexdigest()[:max_chars]\n\n\ndef hash_for_serialized_dict(dict_obj: dict) -> str:\n    return hash_for_id(json.dumps(dict_obj, sort_keys=True))\n"
  },
  {
    "path": "src/palimpzest/utils/model_helpers.py",
    "content": "import os\n\nfrom palimpzest.constants import MAX_AVAILABLE_MODELS, Model\nfrom palimpzest.core.models import PlanCost\nfrom palimpzest.policy import Policy\n\n\ndef get_models(include_embedding: bool = False, use_vertex: bool = False, use_azure: bool = False, gemini_credentials_path: str | None = None, azure_endpoint: str | None = None, azure_api_version: str | None = None) -> list[Model]:\n    \"\"\"\n    Return the set of models which the system has access to based on the set environment variables.\n    \n    Args:\n        include_embedding: Whether to include embedding models\n        use_vertex: Whether to use Vertex AI for Google models\n        use_azure: Whether to use Azure for OpenAI models\n        gemini_credentials_path: Path to Google credentials\n        azure_endpoint: Azure endpoint URL (optional, defaults to AZURE_API_BASE env var)\n        azure_api_version: Azure API version (optional, defaults to AZURE_API_VERSION env var)\n    \"\"\"\n    models = []\n    all_models = Model.get_all_models()\n\n    azure_key = os.getenv(\"AZURE_API_KEY\") or os.getenv(\"AZURE_OPENAI_API_KEY\")\n    if os.getenv(\"OPENAI_API_KEY\") not in [None, \"\"]:\n        openai_models = [model for model in all_models if model.is_provider_openai()]\n        if not include_embedding:\n            openai_models = [\n                model for model in openai_models if not model.is_embedding_model()\n            ]\n        models.extend(openai_models)\n\n    elif azure_key not in [None, \"\"] and use_azure:\n        azure_models = [model for model in all_models if model.is_provider_azure()]\n        if not include_embedding:\n            azure_models = [\n                model for model in azure_models if not model.is_embedding_model()\n            ]\n        models.extend(azure_models)\n\n    if os.getenv(\"TOGETHER_API_KEY\") not in [None, \"\"]:\n        together_models = [model for model in all_models if model.is_provider_together_ai()]\n        if not 
include_embedding:\n            together_models = [\n                model for model in together_models if not model.is_embedding_model()\n            ]\n        models.extend(together_models)\n\n    if os.getenv(\"ANTHROPIC_API_KEY\") not in [None, \"\"]:\n        anthropic_models = [model for model in all_models if model.is_provider_anthropic()]\n        if not include_embedding:\n            anthropic_models = [\n                model for model in anthropic_models if not model.is_embedding_model()\n            ]\n        models.extend(anthropic_models)\n\n    gemini_credentials_path = (\n        os.path.join(os.path.expanduser(\"~\"), \".config\", \"gcloud\", \"application_default_credentials.json\")\n        if gemini_credentials_path is None\n        else gemini_credentials_path\n    )\n    if os.getenv(\"GEMINI_API_KEY\") not in [None, \"\"] or (use_vertex and os.path.exists(gemini_credentials_path)):\n        vertex_models = [model for model in all_models if model.is_provider_vertex_ai()]\n        google_ai_studio_models = [model for model in all_models if model.is_provider_google_ai_studio()]\n        if not include_embedding:\n            vertex_models = [\n                model for model in vertex_models if not model.is_embedding_model()\n            ]\n            google_ai_studio_models = [\n                model for model in google_ai_studio_models if not model.is_embedding_model()\n            ]\n        if use_vertex:\n            models.extend(vertex_models)\n        else:\n            models.extend(google_ai_studio_models)\n\n    return models\n\ndef get_optimal_models(policy: Policy, include_embedding: bool = False, use_vertex: bool = False, use_azure: bool = False, gemini_credentials_path: str | None = None, azure_endpoint: str | None = None, azure_api_version: str | None = None) -> list[Model]:\n    \"\"\"\n    Selects the top models from the available list based on the user's policy.\n\n    Post-condition: This function will never return an 
empty list unless there are\n    no available models at all. If policy constraints filter out all models, it\n    falls back to returning the best model(s) based on the policy's primary metric.\n    \n    Args:\n        policy: Policy to use for model selection\n        include_embedding: Whether to include embedding models\n        use_vertex: Whether to use Vertex AI for Google models\n        use_azure: Whether to use Azure for OpenAI models\n        gemini_credentials_path: Path to Google credentials\n        azure_endpoint: Azure endpoint URL (optional, defaults to AZURE_API_BASE env var)\n        azure_api_version: Azure API version (optional, defaults to AZURE_API_VERSION env var)\n    \"\"\"\n    # gather available models\n    available_models = get_models(\n        include_embedding=include_embedding,\n        use_vertex=use_vertex,\n        use_azure=use_azure,\n        gemini_credentials_path=gemini_credentials_path,\n        azure_endpoint=azure_endpoint,\n        azure_api_version=azure_api_version,\n    )\n\n    if not available_models:\n        return []\n\n    # gather metrics for all models\n    all_model_metrics = []\n    for model in available_models:\n        quality_score = model.get_overall_score()\n        cost = model.get_usd_per_output_token()\n        time_val = model.get_seconds_per_output_token()\n\n        if quality_score is None:\n            quality_score = 0\n        if cost is None:\n            cost = float(\"inf\")\n        if time_val is None:\n            time_val = float(\"inf\")\n\n        all_model_metrics.append({\n            \"id\": model,\n            \"quality\": quality_score,\n            \"cost\": cost,\n            \"time\": time_val\n        })\n\n    # apply constraints\n    candidates = []\n    for model_data in all_model_metrics:\n        normalized_quality = model_data[\"quality\"] / 100.0\n        proxy_plan = PlanCost(cost=0.0, time=0.0, quality=normalized_quality)\n\n        if 
policy.constraint(proxy_plan):\n            candidates.append(model_data)\n\n    # fallback: If no models meet constraints, select best model(s) by primary metric\n    if not candidates:\n        primary_metric = policy.get_primary_metric()\n\n        if primary_metric == \"quality\":\n            # return the model with the highest quality score\n            best = max(all_model_metrics, key=lambda x: x[\"quality\"])\n        elif primary_metric == \"cost\":\n            # return the model with the lowest cost\n            best = min(all_model_metrics, key=lambda x: x[\"cost\"])\n        elif primary_metric == \"time\":\n            # return the model with the lowest latency\n            best = min(all_model_metrics, key=lambda x: x[\"time\"])\n        else:\n            # default to highest quality\n            best = max(all_model_metrics, key=lambda x: x[\"quality\"])\n\n        return [best[\"id\"]]\n\n    # normalize metrics using min-max normalization\n    quals = [c[\"quality\"] for c in candidates]\n    costs = [c[\"cost\"] for c in candidates]\n    times = [c[\"time\"] for c in candidates]\n\n    min_q, max_q = min(quals), max(quals)\n    min_c, max_c = min(costs), max(costs)\n    min_t, max_t = min(times), max(times)\n\n    def normalize(val, min_v, max_v, invert=False):\n        if max_v == min_v:\n            return 1.0\n        norm = (val - min_v) / (max_v - min_v)\n        return (1.0 - norm) if invert else norm\n\n    # get weight for each metric based on policy\n    weights = policy.get_dict()\n    w_q = weights.get(\"quality\", 0.0)\n    w_c = weights.get(\"cost\", 0.0)\n    w_t = weights.get(\"time\", 0.0)\n\n    scored_candidates = []\n    for cand in candidates:\n        n_q = normalize(cand[\"quality\"], min_q, max_q, invert=False)\n        n_c = normalize(cand[\"cost\"], min_c, max_c, invert=True)\n        n_t = normalize(cand[\"time\"], min_t, max_t, invert=True)\n\n        score = (w_q * n_q) + (w_c * n_c) + (w_t * n_t)\n\n        
scored_candidates.append((score, cand[\"id\"]))\n\n    # select the top-k candidates based on score\n    scored_candidates.sort(key=lambda x: x[0], reverse=True)\n    top_models = [model for _, model in scored_candidates[:MAX_AVAILABLE_MODELS]]\n\n    return top_models\n\n\ndef use_reasoning_prompt(reasoning_effort: str | None) -> bool:\n    \"\"\"\n    Determine whether to use the reasoning prompt based on the provided reasoning effort.\n    We use the reasoning prompt unless the reasoning_effort is one of \"disable\", \"minimal\", or \"low\".\n    Any other value, including None, enables the reasoning prompt.\n    \"\"\"\n    return reasoning_effort not in [\"disable\", \"minimal\", \"low\"]\n\n\ndef resolve_reasoning_effort(model: Model, reasoning_effort: str | None) -> str | None:\n    \"\"\"\n    Resolve the reasoning effort setting based on the model and provided reasoning effort.\n    \"\"\"\n    # check that model is a reasoning model, throw an assertion error otherwise\n    assert model.is_reasoning_model(), f\"Model {model} is not a reasoning model. Should only use resolve_reasoning_effort with reasoning models.\"\n\n    # if reasoning_effort is set to \"default\", set it to None to use model defaults\n    if reasoning_effort == \"default\":\n        reasoning_effort = None\n\n    # translate reasoning_effort into model-specific settings\n    if model.is_provider_vertex_ai() or model.is_provider_google_ai_studio():\n        if reasoning_effort is None and model in [Model.GEMINI_2_5_PRO, Model.GOOGLE_GEMINI_2_5_PRO]:\n            reasoning_effort = \"low\"\n        elif reasoning_effort is None:\n            reasoning_effort = \"disable\"\n    elif model.is_provider_openai() or model.is_provider_azure():\n        reasoning_effort = \"low\" if reasoning_effort in [None, \"disable\", \"minimal\", \"low\"] else reasoning_effort\n\n    return reasoning_effort\n"
  },
  {
    "path": "src/palimpzest/utils/model_info_helpers.py",
    "content": "import logging\nimport re\nfrom typing import Any\n\nimport requests\n\nlogger = logging.getLogger(__name__)\n\nPZ_MODEL_DATA_URL = \"https://palimpzest-research.s3.us-east-1.amazonaws.com/pz_models_information.json\"\n\n# Known MMLU-Pro scores (manually curated)\n# Keys should be canonical patterns that fuzzy matching will find\nMMLU_PRO_SCORES = {\n    # OpenAI\n    \"gpt-4o\": 74.1,\n    \"gpt-4o-mini\": 62.7,\n    \"gpt-4-turbo\": 70.6,\n    \"gpt-4\": 64.8,\n    \"gpt-3.5-turbo\": 49.2,\n    \"o1-preview\": 80.3,\n    \"o1-mini\": 80.0,\n    \"o3-mini\": 79.6,\n    \"o4-mini\": 80.6,\n    \"gpt-4.1\": 80.5,\n    \"gpt-4.1-mini\": 77.2,\n    \"gpt-4.1-nano\": 62.3,\n    \"gpt-5\": 87.0,\n    \"gpt-5-mini\": 82.5,\n    \"gpt-5-nano\": 77.9,\n    \"gpt-5.2\": 86.23,\n    # Anthropic\n    \"claude-3-5-sonnet\": 78.4,\n    \"claude-3-7-sonnet\": 80.7,\n    \"claude-3-opus\": 72.6,\n    \"claude-3-sonnet\": 68.5,\n    \"claude-3-haiku\": 55.7,\n    \"claude-3-5-haiku\": 64.1,\n    \"claude-sonnet-4\": 83.87,\n    \"claude-sonnet-4-5\": 87.36,\n    \"claude-haiku-4-5\": 78.72,\n    \"claude-opus-4-5\": 87.3,\n    # Google\n    \"gemini-1.5-pro\": 75.8,\n    \"gemini-1.5-flash\": 67.5,\n    \"gemini-2.0-flash\": 77.4,\n    \"gemini-2.5-flash\": 80.75,\n    \"gemini-2.5-flash-lite\": 79.1,\n    \"gemini-2.5-pro\": 84.1,\n    \"gemini-3-flash\": 87.63,\n    \"gemini-3-pro\": 90.1,\n    # Meta Llama (include version-specific entries)\n    \"llama-3-8b\": 44.25,\n    \"llama-3-70b\": 55.0,\n    \"llama-3.1-8b\": 44.25,\n    \"llama-3.1-70b\": 55.0,\n    \"llama-3.1-405b\": 73.3,\n    \"llama-3.2-1b\": 24.0,\n    \"llama-3.2-3b\": 36.5,\n    \"llama-3.2-11b\": 48.0,  # vision model\n    \"llama-3.2-90b\": 65.0,  # vision model\n    \"llama-3.3-70b\": 69.9,\n    \"llama-4-maverick\": 79.4,\n    \"llama-4-scout\": 75.0,\n    # Mistral\n    \"mistral-large\": 65.0,\n    \"mistral-medium\": 55.0,\n    \"mistral-small\": 50.0,\n    \"mistral-7b\": 45.0,\n    
\"mistral-nemo\": 55.0,\n    \"mixtral-8x7b\": 49.0,\n    \"mixtral-8x22b\": 58.0,\n    # DeepSeek\n    \"deepseek-v3\": 73.8,\n    \"deepseek-v2\": 65.0,\n    \"deepseek-coder\": 55.0,\n    \"deepseek-r1\": 85.0,\n    \"deepseek-r1-distill-qwen-1.5b\": 39.9,\n    \"deepseek-r1-distill-qwen-7b\": 52.0,\n    \"deepseek-r1-distill-qwen-32b\": 65.0,\n    \"deepseek-r1-distill-llama-8b\": 50.0,\n    \"deepseek-r1-distill-llama-70b\": 72.0,\n    # Qwen\n    \"qwen-2-0.5b\": 25.0,\n    \"qwen-2-1.5b\": 30.0,\n    \"qwen-2-7b\": 45.0,\n    \"qwen-2-72b\": 55.0,\n    \"qwen-2.5-0.5b\": 28.0,\n    \"qwen-2.5-1.5b\": 33.0,\n    \"qwen-2.5-3b\": 38.0,\n    \"qwen-2.5-7b\": 48.0,\n    \"qwen-2.5-14b\": 55.0,\n    \"qwen-2.5-32b\": 63.0,\n    \"qwen-2.5-72b\": 71.1,\n    \"qwen-2.5-coder\": 52.0,\n    \"qwen-vl\": 50.0,\n    # Phi\n    \"phi-2\": 35.0,\n    \"phi-3-mini\": 45.0,\n    \"phi-3-small\": 50.0,\n    \"phi-3-medium\": 55.0,\n    \"phi-3.5-mini\": 48.0,\n    \"phi-4\": 60.0,\n    # Yi\n    \"yi-1.5-6b\": 42.0,\n    \"yi-1.5-9b\": 48.0,\n    \"yi-1.5-34b\": 58.0,\n    \"yi-34b\": 55.0,\n    # Gemma\n    \"gemma-2b\": 30.0,\n    \"gemma-7b\": 42.0,\n    \"gemma-2-2b\": 35.0,\n    \"gemma-2-9b\": 50.0,\n    \"gemma-2-27b\": 60.0,\n    # InternLM\n    \"internlm2-7b\": 45.0,\n    \"internlm2-20b\": 55.0,\n    # Command-R\n    \"command-r\": 55.0,\n    \"command-r-plus\": 65.0,\n}\n\n# Known latency data (tokens per second) - used for cloud APIs\n# For local models, we'll estimate based on model size\nLATENCY_TPS_DATA = {\n    # OpenAI\n    \"gpt-4o\": 125.0,\n    \"gpt-4o-mini\": 63.0,\n    \"gpt-4-turbo\": 35.0,\n    \"o1-preview\": 15.0,\n    \"o1-mini\": 65.0,\n    \"gpt-4.1\": 132.0,\n    \"gpt-4.1-mini\": 62.0,\n    \"gpt-4.1-nano\": 167.0,\n    # Anthropic\n    \"claude-3-5-sonnet\": 65.0,\n    \"claude-3-opus\": 25.0,\n    \"claude-3-sonnet\": 60.0,\n    \"claude-3-haiku\": 110.0,\n    \"claude-3-5-haiku\": 53.0,\n    \"claude-sonnet-4\": 71.3,\n    
\"claude-sonnet-4-5\": 78.6,\n    \"claude-haiku-4-5\": 118.3,\n    # Google\n    \"gemini-1.5-pro\": 70.0,\n    \"gemini-1.5-flash\": 150.0,\n    \"gemini-2.0-flash\": 185.0,\n    \"gemini-2.5-flash\": 227.0,\n    \"gemini-2.5-pro\": 139.0,\n    \"gemini-3-flash\": 219.0,\n    \"gemini-3-pro\": 132.0,\n    # Meta Llama (cloud-hosted speeds)\n    \"llama-3-8b\": 200.0,\n    \"llama-3-70b\": 80.0,\n    \"llama-3.1-8b\": 200.0,\n    \"llama-3.1-70b\": 82.0,\n    \"llama-3.2-3b\": 127.0,\n    \"llama-3.3-70b\": 82.0,\n    # DeepSeek\n    \"deepseek-v3\": 88.0,\n    \"deepseek-r1\": 50.0,\n}\n\n# Model size to estimated TPS mapping for local inference (conservative estimates)\n# These are rough estimates assuming a single GPU setup\nLOCAL_MODEL_SIZE_TO_TPS = {\n    \"0.5b\": 300.0,\n    \"1b\": 250.0,\n    \"1.5b\": 220.0,\n    \"2b\": 200.0,\n    \"3b\": 150.0,\n    \"7b\": 80.0,\n    \"8b\": 75.0,\n    \"9b\": 70.0,\n    \"11b\": 60.0,\n    \"13b\": 50.0,\n    \"14b\": 45.0,\n    \"20b\": 35.0,\n    \"27b\": 30.0,\n    \"32b\": 25.0,\n    \"34b\": 23.0,\n    \"70b\": 12.0,\n    \"72b\": 11.0,\n    \"90b\": 8.0,\n    \"405b\": 3.0,\n}\n\n# Default values for when no match is found\nDEFAULT_QUALITY_SCORE = 40.0\nDEFAULT_SECONDS_PER_OUTPUT_TOKEN = 0.02  # Conservative default (~50 TPS)\n\n\ndef _normalize_model_name(name: str) -> str:\n    \"\"\"Normalize a model name for comparison by removing separators and lowercasing.\"\"\"\n    return name.lower().replace(\"-\", \"\").replace(\"_\", \"\").replace(\".\", \"\").replace(\" \", \"\")\n\n\ndef _extract_version_info(name: str) -> tuple[str, str | None, str | None]:\n    \"\"\"\n    Extract base model name, version, and size from a model name.\n    Returns (base_name, version, size) where version/size may be None.\n\n    Examples:\n        \"Llama-3.1-8B-Instruct\" -> (\"llama\", \"3.1\", \"8b\")\n        \"Qwen2.5-7B-Instruct\" -> (\"qwen\", \"2.5\", \"7b\")\n        \"deepseek-r1-distill-qwen-7b\" -> 
(\"deepseek-r1-distill-qwen\", None, \"7b\")\n    \"\"\"\n    name_lower = name.lower()\n\n    # Extract size (look for patterns like 7b, 70b, 0.5b, etc.)\n    size_match = re.search(r'(\\d+(?:\\.\\d+)?)\\s*b(?:illion)?(?:\\b|$|-|_)', name_lower)\n    size = size_match.group(1) + \"b\" if size_match else None\n\n    # Extract version (look for patterns like 3.1, 2.5, v3, etc.)\n    version_match = re.search(r'[-_]?(\\d+(?:\\.\\d+)?)\\s*[-_]', name_lower)\n    version = version_match.group(1) if version_match else None\n\n    # Extract base name (first recognizable model family)\n    base_patterns = [\n        r'(llama)', r'(qwen)', r'(mistral)', r'(mixtral)', r'(gemma)',\n        r'(phi)', r'(yi)', r'(deepseek)', r'(internlm)', r'(command)',\n        r'(claude)', r'(gpt)', r'(gemini)', r'(o\\d)', r'(falcon)',\n    ]\n    base_name = None\n    for pattern in base_patterns:\n        match = re.search(pattern, name_lower)\n        if match:\n            base_name = match.group(1)\n            break\n\n    return (base_name or name_lower, version, size)\n\n\ndef fuzzy_match_score(model_id: str, scores_dict: dict[str, float]) -> float | None:\n    \"\"\"\n    Fuzzy match a model ID against a dictionary of scores.\n\n    Matching strategy (in order of priority):\n    1. Exact key match\n    2. Key matches the model name portion (after last /)\n    3. Normalized substring matching, preferring longer (more specific) keys\n    4. 
Model family + size matching as fallback\n\n    Prefers the longest (most specific) matching key to avoid e.g.\n    \"llama-3-8b\" matching before \"llama-3.1-8b\".\n    \"\"\"\n    model_lower = model_id.lower()\n    # Extract just the model name (after provider prefix)\n    model_name = model_lower.split(\"/\")[-1] if \"/\" in model_lower else model_lower\n    model_normalized = _normalize_model_name(model_name)\n\n    # Pass 1: Check for exact key match\n    for key, score in scores_dict.items():\n        if key.lower() == model_name or key.lower() == model_lower:\n            return score\n\n    best_match = None\n    best_score = 0  # Track specificity score (higher = better match)\n\n    # Pass 2: Substring matching with specificity scoring\n    for key, score in scores_dict.items():\n        key_lower = key.lower()\n        key_normalized = _normalize_model_name(key)\n\n        # Check substring match\n        if key_lower in model_name or key_normalized in model_normalized:\n            # Score based on: length of match + bonus for version/size alignment\n            specificity = len(key_normalized)\n\n            # Bonus for matching version numbers\n            key_base, key_ver, key_size = _extract_version_info(key)\n            model_base, model_ver, model_size = _extract_version_info(model_name)\n\n            if key_ver and model_ver and key_ver == model_ver:\n                specificity += 10  # Version match bonus\n            if key_size and model_size and key_size == model_size:\n                specificity += 15  # Size match bonus\n\n            if specificity > best_score:\n                best_match = score\n                best_score = specificity\n\n    if best_match is not None:\n        return best_match\n\n    # Pass 3: Try matching by model family + size\n    model_base, model_ver, model_size = _extract_version_info(model_name)\n    if model_base:\n        for key, score in scores_dict.items():\n            key_base, key_ver, key_size = 
_extract_version_info(key)\n            if key_base and (model_base in key_base or key_base in model_base) \\\n                and (key_size and model_size and key_size == model_size):\n                    return score\n\n    return best_match\n\n\ndef _extract_model_size(model_id: str) -> str | None:\n    \"\"\"\n    Extract model size from model ID (e.g., \"7b\", \"70b\", \"0.5b\").\n    Returns the size string or None if not found.\n    \"\"\"\n    model_lower = model_id.lower()\n    # Match patterns like: 7b, 70b, 0.5b, 1.5b, 8b-instruct, etc.\n    size_match = re.search(r'(\\d+(?:\\.\\d+)?)\\s*b(?:illion)?(?:\\b|$|[-_])', model_lower)\n    if size_match:\n        return size_match.group(1) + \"b\"\n    return None\n\n\ndef derive_model_flags(model_id: str) -> dict[str, bool]:\n    \"\"\"\n    Derive boolean model flags from the model ID string.\n    Detects: is_llama_model, is_gpt_5_model, is_o_model, is_clip_model,\n             is_vision_model (from name patterns), is_reasoning_model,\n             is_embedding_model, is_text_model\n    \"\"\"\n    model_lower = model_id.lower()\n    model_name = model_lower.split(\"/\")[-1] if \"/\" in model_lower else model_lower\n    flags = {}\n\n    # Model family detection\n    if \"llama\" in model_lower:\n        flags[\"is_llama_model\"] = True\n    if \"gpt-5\" in model_lower or \"gpt5\" in model_lower:\n        flags[\"is_gpt_5_model\"] = True\n\n    if model_name.startswith((\"o1\", \"o3\", \"o4\")) and not model_name.startswith(\"openai\"):\n        flags[\"is_o_model\"] = True\n\n    if \"clip\" in model_lower:\n        flags[\"is_clip_model\"] = True\n\n    # Vision/multimodal detection from model name patterns\n    vision_patterns = [\n        \"-vision\", \"-vl\", \"vl-\", \"-v-\",  # Common suffixes/infixes\n        \"vision-\", \"visual\",  # Prefix patterns\n        \"llava\", \"cogvlm\", \"qwen-vl\", \"internvl\",  # Known vision model families\n        \"pixtral\", \"idefics\", \"fuyu\",  # More vision 
models\n    ]\n    if any(pattern in model_lower for pattern in vision_patterns):\n        flags[\"is_vision_model\"] = True\n\n    # Also detect vision for specific Llama 3.2 vision models (11B and 90B variants)\n    if \"llama\" in model_lower and \"3.2\" in model_lower:\n        size = _extract_model_size(model_id)\n        if size in (\"11b\", \"90b\"):\n            flags[\"is_vision_model\"] = True\n\n    # Reasoning model detection\n    reasoning_patterns = [\n        \"deepseek-r1\", \"o1-\", \"o3-\", \"o4-\",\n        \"-cot\", \"chain-of-thought\", \"reasoning\",\n    ]\n    if any(pattern in model_lower for pattern in reasoning_patterns):\n        flags[\"is_reasoning_model\"] = True\n\n    # Embedding model detection\n    embedding_patterns = [\n        \"embed\", \"e5-\", \"e5_\", \"/e5\",  # Common embedding model names\n        \"bge-\", \"bge_\", \"/bge\",  # BGE models\n        \"gte-\", \"gte_\", \"/gte\",  # GTE models\n        \"nomic-embed\", \"jina-embed\",  # Nomic and Jina\n        \"sentence-transformer\", \"sbert\",  # Sentence transformers\n        \"instructor-\", \"contriever\",  # Other embedding models\n    ]\n    if any(pattern in model_lower for pattern in embedding_patterns):\n        flags[\"is_embedding_model\"] = True\n        flags[\"is_text_model\"] = False  # Embedding models are not text generation models\n\n    return flags\n\n\ndef _estimate_tps_from_size(model_id: str) -> float | None:\n    \"\"\"\n    Estimate tokens per second based on model size.\n    Uses LOCAL_MODEL_SIZE_TO_TPS mapping for conservative local inference estimates.\n    \"\"\"\n    size = _extract_model_size(model_id)\n    if size and size in LOCAL_MODEL_SIZE_TO_TPS:\n        return LOCAL_MODEL_SIZE_TO_TPS[size]\n\n    # Try to find closest size match\n    if size:\n        try:\n            size_num = float(size.replace(\"b\", \"\"))\n            # Find closest known size\n            closest_size = None\n            closest_diff = float(\"inf\")\n      
      for known_size in LOCAL_MODEL_SIZE_TO_TPS:\n                known_num = float(known_size.replace(\"b\", \"\"))\n                diff = abs(known_num - size_num)\n                if diff < closest_diff:\n                    closest_diff = diff\n                    closest_size = known_size\n            if closest_size:\n                return LOCAL_MODEL_SIZE_TO_TPS[closest_size]\n        except ValueError:\n            pass\n\n    return None\n\n\ndef predict_local_model_metrics(model_id: str) -> dict[str, Any]:\n    \"\"\"\n    Predict quality and latency metrics for local/vLLM models.\n\n    Uses the following strategy:\n    1. Fuzzy match against known MMLU-Pro scores for quality\n    2. For latency:\n       a. Try fuzzy matching against known latency data\n       b. Fall back to model size-based estimation\n       c. Use conservative default if nothing matches\n\n    Returns a dict with MMLU_Pro_score, seconds_per_output_token, and any\n    detected capability flags.\n    \"\"\"\n    # Try to fuzzy match quality score\n    quality_score = fuzzy_match_score(model_id, MMLU_PRO_SCORES)\n    if quality_score is None:\n        # Try to estimate based on model size as a rough heuristic\n        size = _extract_model_size(model_id)\n        if size:\n            size_num = float(size.replace(\"b\", \"\"))\n            # Rough heuristic: larger models tend to score better\n            # This is a very rough estimate for unknown models\n            if size_num < 3:\n                quality_score = 30.0\n            elif size_num < 10:\n                quality_score = 45.0\n            elif size_num < 35:\n                quality_score = 55.0\n            elif size_num < 80:\n                quality_score = 65.0\n            else:\n                quality_score = 70.0\n        else:\n            quality_score = DEFAULT_QUALITY_SCORE\n\n    # Try to fuzzy match latency (tokens per second)\n    tps = fuzzy_match_score(model_id, LATENCY_TPS_DATA)\n\n    if tps is 
None:\n        # Fall back to size-based estimation for local models\n        tps = _estimate_tps_from_size(model_id)\n\n    seconds_per_output_token = round(1.0 / tps, 6) if tps is not None else DEFAULT_SECONDS_PER_OUTPUT_TOKEN\n\n    # Also derive capability flags from model name\n    flags = derive_model_flags(model_id)\n\n    return {\n        \"MMLU_Pro_score\": quality_score,\n        \"seconds_per_output_token\": seconds_per_output_token,\n        **flags,  # Include detected flags\n    }\n\n\nclass ModelMetricsManager:\n    \"\"\"\n    Manages fetching and caching of model metrics from an external source.\n    \"\"\"\n    _instance = None\n\n    def __new__(cls, *args, **kwargs):\n        if cls._instance is None:\n            cls._instance = super().__new__(cls)\n        return cls._instance\n\n    def __init__(self):\n        if getattr(self, \"_initialized\", False):\n            return\n        self.data_url = PZ_MODEL_DATA_URL\n        self._metrics_cache = None\n        self._initialized = True\n\n    def _load_data(self):\n        if self._metrics_cache is None:\n            logger.info(f\"Fetching data from URL: {self.data_url}\")\n            try:\n                self._metrics_cache = requests.get(self.data_url).json()\n            except Exception as e:\n                logger.error(f\"Error fetching data: {e}\")\n                self._metrics_cache = {}\n\n    def get_model_metrics(self, model_name) -> dict[str, Any]:\n        self._load_data()\n        return self._metrics_cache.get(model_name, {})\n\n    def refresh_data(self) -> None:\n        self._metrics_cache = None\n        self._load_data()\n"
  },
  {
    "path": "src/palimpzest/utils/progress.py",
    "content": "import time\nfrom abc import ABC, abstractmethod\nfrom dataclasses import dataclass\n\nfrom chromadb.api.models.Collection import Collection\nfrom rich.console import Console\nfrom rich.live import Live\nfrom rich.panel import Panel\nfrom rich.progress import (\n    BarColumn,\n    MofNCompleteColumn,\n    SpinnerColumn,\n    TaskProgressColumn,\n    TextColumn,\n    TimeElapsedColumn,\n    TimeRemainingColumn,\n)\nfrom rich.progress import Progress as RichProgress\nfrom rich.table import Table\n\nfrom palimpzest.query.operators.aggregate import AggregateOp\nfrom palimpzest.query.operators.convert import LLMConvert\nfrom palimpzest.query.operators.filter import LLMFilter\nfrom palimpzest.query.operators.join import JoinOp\nfrom palimpzest.query.operators.limit import LimitScanOp\nfrom palimpzest.query.operators.physical import PhysicalOperator\nfrom palimpzest.query.operators.topk import TopKOp\nfrom palimpzest.query.optimizer.plan import PhysicalPlan, SentinelPlan\n\n\n@dataclass\nclass ProgressStats:\n    \"\"\"Statistics tracked for progress reporting\"\"\"\n    start_time: float = 0.0\n    total_cost: float = 0.0\n    success_count: int = 0\n    failure_count: int = 0\n    current_operation: str = \"\"\n    memory_usage_mb: float = 0.0\n    recent_text: str = \"\"\n\ndef get_memory_usage() -> float:\n    \"\"\"Get current memory usage in MB\"\"\"\n    try:\n        import psutil\n        process = psutil.Process()\n        return process.memory_info().rss / 1024 / 1024\n    except Exception:\n        return 0.0\n\n# NOTE: right now we only need to support single plan execution; in a multi-plan setting, we will\n#       need to modify the semantics of the progress manager to support multiple plans\nclass ProgressManager(ABC):\n    \"\"\"Abstract base class for progress managers for plan execution\"\"\"\n\n    def __init__(self, plan: PhysicalPlan | SentinelPlan, num_samples: int | None = None):\n        \"\"\"\n        Initialize the progress 
manager for the given plan. This function takes in a plan,\n        the number of samples to process (if specified).\n\n        If `num_samples` is None, then the entire Dataset will be scanned.\n\n        For each operator which is not an `AggregateOp` or `LimitScanOp`, we set its task `total`\n        to the number of inputs to be processed by the plan. As intermediate operators process\n        their inputs, the ProgressManager will update the `total` for their downstream operators.\n        \"\"\"\n        # initialize progress object\n        self.progress = RichProgress(\n            SpinnerColumn(),\n            TextColumn(\"[bold blue]{task.description}\"),\n            BarColumn(),\n            TaskProgressColumn(),\n            MofNCompleteColumn(),\n            TimeElapsedColumn(),\n            TimeRemainingColumn(),\n            #TextColumn(\"[green]Success: {task.fields[success]}\"),\n            #TextColumn(\"[red]Failed: {task.fields[failed]}\"),\n            #TextColumn(\"[cyan]Mem: {task.fields[memory]:.1f}MB\"),\n            TextColumn(\"[green]Cost: ${task.fields[cost]:.4f}\"),\n            TextColumn(\"\\n[white]{task.fields[recent]}\"),  # Recent text on new line\n            refresh_per_second=10,\n            expand=True,   # Use full width\n        )\n\n        # initialize mapping from unique_full_op_id --> ProgressStats\n        self.unique_full_op_id_to_stats: dict[str, ProgressStats] = {}\n\n        # initialize mapping from unique_full_op_id --> task\n        self.unique_full_op_id_to_task = {}\n\n        # initialize start time\n        self.start_time = None\n\n        # TODO: store plan and use its methods within incr()\n        # create mapping from unique_full_op_id --> input unique_full_op_ids\n        self.unique_full_op_id_to_input_unique_full_op_ids: dict[str, list[str]] = {}\n        for topo_idx, op in enumerate(plan):\n            unique_full_op_id = f\"{topo_idx}-{op.get_full_op_id()}\"\n            input_unique_full_op_ids 
= plan.get_source_unique_full_op_ids(topo_idx, op)\n            self.unique_full_op_id_to_input_unique_full_op_ids[unique_full_op_id] = input_unique_full_op_ids\n\n        # create mapping from unique_full_op_id --> next_op\n        self.unique_full_op_id_to_next_op_and_id: dict[str, tuple[PhysicalOperator, str]] = {}\n        for topo_idx, op in enumerate(plan):\n            unique_full_op_id = f\"{topo_idx}-{op.get_full_op_id()}\"\n            next_op, next_unique_full_op_id = plan.get_next_unique_full_op_and_id(topo_idx, op)\n            self.unique_full_op_id_to_next_op_and_id[unique_full_op_id] = (next_op, next_unique_full_op_id)\n\n        # add a task to the progress manager for each operator in the plan\n        est_total_outputs, _ = plan.get_est_total_outputs(num_samples)\n        for topo_idx, op in enumerate(plan):\n            # get the op id and a short string representation of the op; (str(op) is too long)\n            op_str = f\"{op.op_name()} ({op.get_op_id()})\"\n            unique_full_op_id = f\"{topo_idx}-{op.get_full_op_id()}\"\n            self.add_task(unique_full_op_id, op_str, est_total_outputs[unique_full_op_id])\n\n    def get_task_total(self, unique_full_op_id: str) -> int:\n        \"\"\"Return the current total value for the given task.\"\"\"\n        task = self.unique_full_op_id_to_task[unique_full_op_id]\n        return self.progress._tasks[task].total\n\n    def get_task_description(self, unique_full_op_id: str) -> str:\n        \"\"\"Return the current description for the given task.\"\"\"\n        task = self.unique_full_op_id_to_task[unique_full_op_id]\n        return self.progress._tasks[task].description\n\n    @abstractmethod\n    def add_task(self, unique_full_op_id: str, op_str: str, total: int):\n        \"\"\"Initialize progress tracking for operator execution with total items\"\"\"\n        pass\n\n    @abstractmethod\n    def start(self):\n        \"\"\"Start the progress bar(s)\"\"\"\n        pass\n\n    
@abstractmethod\n    def incr(self, unique_full_op_id: str, num_inputs: int = 1, num_outputs: int = 1, display_text: str | None = None, **kwargs):\n        \"\"\"\n        Advance the progress bar for the given operator. Modify the downstream operators'\n        progress bar `total` to reflect the number of outputs produced by this operator.\n\n        NOTE: `num_outputs` specifies how many outputs were generated by the operator when processing\n        the `num_inputs` inputs for which `incr()` was called. E.g. a filter which filters one input record\n        will advance its progress bar by 1, but the next operator will now have 1 fewer inputs to process.\n        Alternatively, a convert which generates 3 `num_outputs` for 2 `num_inputs` will increase the inputs\n        for the next operator by `delta = num_outputs - num_inputs = 3 - 2 = 1`.\n        \"\"\"\n        pass\n\n    @abstractmethod\n    def finish(self):\n        \"\"\"Clean up and finalize progress tracking\"\"\"\n        pass\n\n\nclass MockProgressManager(ProgressManager):\n    \"\"\"Mock progress manager for testing purposes\"\"\"\n\n    def __init__(self, plan: PhysicalPlan | SentinelPlan, num_samples: int | None = None):\n        pass\n\n    def add_task(self, unique_full_op_id: str, op_str: str, total: int):\n        pass\n\n    def start(self):\n        pass\n\n    def incr(self, unique_full_op_id: str, num_inputs: int = 1, num_outputs: int = 1, display_text: str | None = None, **kwargs):\n        pass\n\n    def finish(self):\n        pass\n\n    def incr_overall_progress_cost(self, cost_delta: float):\n        pass\n \nclass PZProgressManager(ProgressManager):\n    \"\"\"Progress manager for command line interface using rich\"\"\"\n    \n    def __init__(self, plan: PhysicalPlan, num_samples: int | None = None):\n        super().__init__(plan, num_samples)\n        self.console = Console()\n\n    def add_task(self, unique_full_op_id: str, op_str: str, total: int):\n        \"\"\"Add a new 
task to the progress bar\"\"\"\n        task = self.progress.add_task(\n            f\"[blue]{op_str}\", \n            total=total,\n            cost=0.0,\n            success=0,\n            failed=0,\n            memory=0.0,\n            recent=\"\",\n        )\n\n        # store the mapping of operator ID to task ID\n        self.unique_full_op_id_to_task[unique_full_op_id] = task\n\n        # initialize the stats for this operation\n        self.unique_full_op_id_to_stats[unique_full_op_id] = ProgressStats(start_time=time.time())\n\n    def start(self):\n        # print a newline before starting to separate from previous output\n        print()\n\n        # set start time\n        self.start_time = time.time()\n\n        # start progress bar\n        self.progress.start()\n\n    def incr(self, unique_full_op_id: str, num_inputs: int = 1, num_outputs: int = 1, display_text: str | None = None, **kwargs):\n        # get the task for the given operation\n        task = self.unique_full_op_id_to_task.get(unique_full_op_id)\n\n        # update statistics with any additional keyword arguments\n        if kwargs != {}:\n            self.update_stats(unique_full_op_id, **kwargs)\n\n        # update progress bar and recent text in one update\n        if display_text is not None:\n            self.unique_full_op_id_to_stats[unique_full_op_id].recent_text = display_text\n\n        # update the downstream operators' progress bar total for any operator which is not an AggregateOp or LimitScanOp\n        delta = num_outputs - num_inputs\n        if delta != 0:\n            current_unique_full_op_id = unique_full_op_id\n            next_op, next_unique_full_op_id = self.unique_full_op_id_to_next_op_and_id[unique_full_op_id]\n            while next_op is not None:\n                if isinstance(next_op, (AggregateOp, LimitScanOp)):\n                    break\n\n                next_task = self.unique_full_op_id_to_task[next_unique_full_op_id]\n                multiplier = 1\n   
             if isinstance(next_op, JoinOp):\n                    # for joins, scale the delta by the number of inputs from the other side of the join\n                    left_input_unique_full_op_id, right_input_unique_input_op_id = self.unique_full_op_id_to_input_unique_full_op_ids[next_unique_full_op_id]\n                    if current_unique_full_op_id == left_input_unique_full_op_id:\n                        multiplier = self.get_task_total(right_input_unique_input_op_id)\n                    elif current_unique_full_op_id == right_input_unique_input_op_id:\n                        multiplier = self.get_task_total(left_input_unique_full_op_id)\n                    else:\n                        raise ValueError(f\"Current op ID {current_unique_full_op_id} not found in join inputs {left_input_unique_full_op_id}, {right_input_unique_input_op_id}\")\n                delta_adjusted = delta * multiplier\n                self.progress.update(next_task, total=self.get_task_total(next_unique_full_op_id) + delta_adjusted)\n\n                # move to the next operator in the plan\n                current_unique_full_op_id = next_unique_full_op_id\n                next_op, next_unique_full_op_id = self.unique_full_op_id_to_next_op_and_id[next_unique_full_op_id]\n\n        # advance the progress bar for this task\n        self.progress.update(\n            task,\n            advance=num_inputs,\n            description=f\"[bold blue]{self.get_task_description(unique_full_op_id)}\",\n            cost=self.unique_full_op_id_to_stats[unique_full_op_id].total_cost,\n            success=self.unique_full_op_id_to_stats[unique_full_op_id].success_count,\n            failed=self.unique_full_op_id_to_stats[unique_full_op_id].failure_count,\n            memory=get_memory_usage(),\n            recent=f\"{self.unique_full_op_id_to_stats[unique_full_op_id].recent_text}\" if display_text is not None else \"\",\n            refresh=True,\n        )\n\n    def finish(self):\n        
self.progress.stop()\n\n        # compute total cost, success, and failure\n        total_cost = sum(stats.total_cost for stats in self.unique_full_op_id_to_stats.values())\n        # success_count = sum(stats.success_count for stats in self.unique_full_op_id_to_stats.values())\n        # failure_count = sum(stats.failure_count for stats in self.unique_full_op_id_to_stats.values())\n\n        # Print final stats on new lines after progress display\n        print(f\"Total time: {time.time() - self.start_time:.2f}s\")\n        print(f\"Total cost: ${total_cost:.4f}\")\n        # print(f\"Success rate: {success_count}/{success_count + failure_count}\")\n\n    def update_stats(self, unique_full_op_id: str, **kwargs):\n        \"\"\"Update progress statistics\"\"\"\n        for key, value in kwargs.items():\n            if hasattr(self.unique_full_op_id_to_stats[unique_full_op_id], key):\n                if key != \"total_cost\":\n                    setattr(self.unique_full_op_id_to_stats[unique_full_op_id], key, value)\n                else:\n                    self.unique_full_op_id_to_stats[unique_full_op_id].total_cost += value\n        self.unique_full_op_id_to_stats[unique_full_op_id].memory_usage_mb = get_memory_usage()\n\nclass PZSentinelProgressManager(ProgressManager):\n    def __init__(self, plan: SentinelPlan, sample_budget: int | None, sample_cost_budget: float | None):\n        # overall progress bar\n        self.overall_progress = RichProgress(\n            SpinnerColumn(),\n            TextColumn(\"{task.description}\"),  # TODO: fixed string?\n            BarColumn(),\n            TaskProgressColumn(),\n            MofNCompleteColumn(),\n            TimeElapsedColumn(),\n            TimeRemainingColumn(),\n            TextColumn(\"[green]Cost: ${task.fields[cost]:.4f}\"),\n            TextColumn(\"\\n[white]{task.fields[recent]}\"),  # Recent text on new line\n            refresh_per_second=10,\n            expand=True,   # Use full width\n        
)\n        self.use_cost_budget = sample_cost_budget is not None\n        total = sample_cost_budget if self.use_cost_budget else sample_budget\n        self.overall_task_id = self.overall_progress.add_task(\"\", total=total, cost=0.0, recent=\"\")\n\n        # logical operator progress bars\n        self.op_progress = RichProgress(\n            SpinnerColumn(),\n            \"{task.description}\",\n            BarColumn(),\n            TaskProgressColumn(),\n            MofNCompleteColumn(),\n            TextColumn(\"[green]Cost: ${task.fields[cost]:.4f}\"),\n            TextColumn(\"\\n[white]{task.fields[recent]}\"),  # Recent text on new line\n            refresh_per_second=10,\n            expand=True,   # Use full width\n        )\n\n        # organize progress bars into nice display\n        self.progress_table = Table.grid()\n        self.progress_table.add_row(\n            Panel.fit(self.op_progress, title=\"[b]Sample Allocation\", border_style=\"red\", padding=(1, 2)),\n        )\n        self.progress_table.add_row(\n            Panel.fit(\n                self.overall_progress, title=\"Optimization Progress\", border_style=\"green\", padding=(2, 2)\n            )\n        )\n        self.live_display = Live(self.progress_table, refresh_per_second=10)\n\n        # initialize mapping from unique_logical_op_id --> ProgressStats\n        self.unique_logical_op_id_to_stats: dict[str, ProgressStats] = {}\n\n        # initialize mapping from unique_logical_op_id --> task\n        self.unique_logical_op_id_to_task = {}\n\n        # initialize start time\n        self.start_time = None\n\n        # initialize validation cost\n        self.validation_cost = 0.0\n\n        # add a task to the progress manager for each operator in the plan\n        for topo_idx, (logical_op_id, op_set) in enumerate(plan):\n            unique_logical_op_id = f\"{topo_idx}-{logical_op_id}\"\n            physical_op = op_set[0]\n            is_llm_convert = isinstance(physical_op, 
LLMConvert)\n            is_llm_filter = isinstance(physical_op, LLMFilter)\n            op_name = \"LLMConvert\" if is_llm_convert else \"LLMFilter\" if is_llm_filter else physical_op.op_name()\n            op_str = f\"{op_name} ({unique_logical_op_id})\"\n            total = sample_budget if self._is_llm_op(op_set[0]) else 0\n            self.add_task(unique_logical_op_id, op_str, total)\n\n        self.console = Console()\n\n    def _is_llm_op(self, physical_op: PhysicalOperator) -> bool:\n        is_llm_convert = isinstance(physical_op, LLMConvert)\n        is_llm_filter = isinstance(physical_op, LLMFilter)\n        is_llm_topk = isinstance(physical_op, TopKOp) and isinstance(physical_op.index, Collection)\n        is_llm_join = isinstance(physical_op, JoinOp)\n        return is_llm_convert or is_llm_filter or is_llm_topk or is_llm_join\n\n    def get_task_description(self, unique_logical_op_id: str) -> str:\n        \"\"\"Return the current description for the given task.\"\"\"\n        task = self.unique_logical_op_id_to_task[unique_logical_op_id]\n        return self.op_progress._tasks[task].description\n\n    def add_task(self, unique_logical_op_id: str, op_str: str, total: int):\n        \"\"\"Add a new task to the op progress bars\"\"\"\n        task = self.op_progress.add_task(\n            f\"[blue]{op_str}\", \n            total=total,\n            cost=0.0,\n            success=0,\n            failed=0,\n            memory=0.0,\n            recent=\"\",\n        )\n\n        # store the mapping of operator ID to task ID\n        self.unique_logical_op_id_to_task[unique_logical_op_id] = task\n\n        # initialize the stats for this operation\n        self.unique_logical_op_id_to_stats[unique_logical_op_id] = ProgressStats(start_time=time.time())\n\n    def start(self):\n        # print a newline before starting to separate from previous output\n        print()\n\n        # set start time\n        self.start_time = time.time()\n\n        # start 
progress bars\n        self.live_display.start()\n\n    def incr_overall_progress_cost(self, cost_delta: float):\n        \"\"\"Advance the overall progress bar by the given cost delta\"\"\"\n        self.validation_cost += cost_delta\n        self.overall_progress.update(\n            self.overall_task_id,\n            advance=cost_delta,\n            cost=sum(stats.total_cost for _, stats in self.unique_logical_op_id_to_stats.items()) + self.validation_cost,\n            refresh=True,\n        )\n\n        # force the live display to refresh\n        self.live_display.refresh()\n\n    def incr(self, unique_logical_op_id: str, num_samples: int, display_text: str | None = None, **kwargs):\n        # TODO: (above) organize progress bars into a Live / Table / Panel or something\n        # get the task for the given operation\n        task = self.unique_logical_op_id_to_task.get(unique_logical_op_id)\n\n        # store the cost before updating stats\n        previous_total_cost = self.unique_logical_op_id_to_stats[unique_logical_op_id].total_cost\n\n        # update statistics with any additional keyword arguments\n        if kwargs != {}:\n            self.update_stats(unique_logical_op_id, **kwargs)\n\n        # compute the cost delta\n        cost_delta = self.unique_logical_op_id_to_stats[unique_logical_op_id].total_cost - previous_total_cost\n\n        # update progress bar and recent text in one update\n        if display_text is not None:\n            self.unique_logical_op_id_to_stats[unique_logical_op_id].recent_text = display_text\n\n        # advance the op progress bar for this unique_logical_op_id\n        self.op_progress.update(\n            task,\n            advance=num_samples,\n            description=f\"[bold blue]{self.get_task_description(unique_logical_op_id)}\",\n            cost=self.unique_logical_op_id_to_stats[unique_logical_op_id].total_cost,\n            success=self.unique_logical_op_id_to_stats[unique_logical_op_id].success_count,\n     
       failed=self.unique_logical_op_id_to_stats[unique_logical_op_id].failure_count,\n            memory=get_memory_usage(),\n            recent=f\"{self.unique_logical_op_id_to_stats[unique_logical_op_id].recent_text}\" if display_text is not None else \"\",\n            refresh=True,\n        )\n\n        # advance the overall progress bar\n        advance = cost_delta if self.use_cost_budget else num_samples\n        self.overall_progress.update(\n            self.overall_task_id,\n            advance=advance,\n            cost=sum(stats.total_cost for _, stats in self.unique_logical_op_id_to_stats.items()) + self.validation_cost,\n            refresh=True,\n        )\n\n        # force the live display to refresh\n        self.live_display.refresh()\n\n    def finish(self):\n        self.live_display.stop()\n\n        # compute total cost, success, and failure\n        total_cost = sum(stats.total_cost for stats in self.unique_logical_op_id_to_stats.values())\n        # success_count = sum(stats.success_count for stats in self.unique_logical_op_id_to_stats.values())\n        # failure_count = sum(stats.failure_count for stats in self.unique_logical_op_id_to_stats.values())\n\n        # Print final stats on new lines after progress display\n        print(f\"Total opt. time: {time.time() - self.start_time:.2f}s\")\n        print(f\"Total opt. 
cost: ${total_cost:.4f}\")\n        # print(f\"Success rate: {success_count}/{success_count + failure_count}\")\n\n    def update_stats(self, unique_logical_op_id: str, **kwargs):\n        \"\"\"Update progress statistics\"\"\"\n        for key, value in kwargs.items():\n            if hasattr(self.unique_logical_op_id_to_stats[unique_logical_op_id], key):\n                if key != \"total_cost\":\n                    setattr(self.unique_logical_op_id_to_stats[unique_logical_op_id], key, value)\n                else:\n                    self.unique_logical_op_id_to_stats[unique_logical_op_id].total_cost += value\n        self.unique_logical_op_id_to_stats[unique_logical_op_id].memory_usage_mb = get_memory_usage()\n\ndef create_progress_manager(\n    plan: PhysicalPlan | SentinelPlan,\n    num_samples: int | None = None,\n    sample_budget: int | None = None,\n    sample_cost_budget: float | None = None,\n    progress: bool = True,\n) -> ProgressManager:\n    \"\"\"Factory function to create appropriate progress manager based on environment\"\"\"\n    if not progress:\n        return MockProgressManager(plan, num_samples)\n\n    if isinstance(plan, SentinelPlan):\n        assert sample_budget is not None or sample_cost_budget is not None, \"Sample budget must be specified for SentinelPlan progress manager\"\n        return PZSentinelProgressManager(plan, sample_budget, sample_cost_budget)\n\n    return PZProgressManager(plan, num_samples)\n"
  },
  {
    "path": "src/palimpzest/utils/pz_models_information.json",
    "content": "{\n    \"together_ai/meta-llama/Llama-3.2-3B-Instruct-Turbo\": {\n        \"usd_per_input_token\": 6e-08,\n        \"usd_per_output_token\": 6e-08,\n        \"seconds_per_output_token\": 0.0079,\n        \"MMLU_Pro_score\": 36.5,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_llama_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"together_ai\",\n        \"sources\": [\n            \"https://artificialanalysis.ai/models/llama-3-1-instruct-8b\",\n            \"https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/discussions/13\"\n        ]\n    },\n    \"together_ai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo\": {\n        \"usd_per_input_token\": 1.8e-07,\n        \"usd_per_output_token\": 1.8e-07,\n        \"seconds_per_output_token\": 0.005,\n        \"MMLU_Pro_score\": 44.25,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_llama_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"together_ai\",\n        \"sources\": null\n    },\n    \"together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo\": {\n        \"usd_per_input_token\": 8.8e-07,\n        \"usd_per_output_token\": 8.8e-07,\n        \"seconds_per_output_token\": 0.0122,\n        \"MMLU_Pro_score\": 69.9,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_llama_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        
\"supports_prompt_caching\": false,\n        \"provider\": \"together_ai\",\n        \"sources\": null\n    },\n    \"together_ai/meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo\": {\n        \"usd_per_input_token\": 1.2e-06,\n        \"usd_per_output_token\": 1.2e-06,\n        \"seconds_per_output_token\": 0.0303,\n        \"MMLU_Pro_score\": 65.0,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": false,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_llama_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"together_ai\",\n        \"sources\": null\n    },\n    \"together_ai/deepseek-ai/DeepSeek-V3\": {\n        \"usd_per_input_token\": 1.25e-06,\n        \"usd_per_output_token\": 1.25e-06,\n        \"seconds_per_output_token\": 0.0114,\n        \"MMLU_Pro_score\": 73.8,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"together_ai\",\n        \"sources\": null\n    },\n    \"together_ai/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B\": {\n        \"usd_per_input_token\": 1.8e-07,\n        \"usd_per_output_token\": 1.8e-07,\n        \"seconds_per_output_token\": 0.005,\n        \"MMLU_Pro_score\": 39.9,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"together_ai\",\n        \"sources\": [\n            
\"https://www.reddit.com/r/LocalLLaMA/comments/1iserf9/deepseek_r1_distilled_models_mmlu_pro_benchmarks/\"\n        ],\n        \"note\": \"seconds_per_output_token copied to be same as LLAMA3_1_8B_INSTRUCT_MODEL_CARD; need to update when we have data\"\n    },\n    \"together_ai/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B\": {\n        \"usd_per_input_token\": 6e-08,\n        \"usd_per_output_token\": 9e-08,\n        \"seconds_per_output_token\": 0.02,\n        \"MMLU_Pro_score\": 85.0,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"together_ai\",\n        \"sources\": [\n            \"https://llm-stats.com/models/compare/deepseek-r1-0528-vs-qwen3-vl-8b-instruct\",\n            \"https://artificialanalysis.ai/models/deepseek-r1-qwen3-8b\"\n        ]\n    },\n    \"deepseek-chat\": {\n        \"usd_per_input_token\": 0.28e-06,\n        \"usd_per_output_token\": 0.42e-06,\n        \"seconds_per_output_token\": 0,\n        \"MMLU_Pro_score\": 0,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_audio_model\": false,\n        \"is_vision_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"deepseek\",\n        \"sources\": [\n            \"https://api-docs.deepseek.com/quick_start/pricing\",\n            \"https://artificialanalysis.ai/models/deepseek-v3-2-reasoning\"\n        ],\n        \"note\": \"needs update when the newer deepseek-chat model comes out\"\n    },\n    \"openai/gpt-4o-2024-08-06\": {\n        \"usd_per_input_token\": 2.5e-06,\n        \"usd_per_output_token\": 1e-05,\n        \"usd_per_cache_read_token\": 1.25e-06,\n        \"usd_per_cache_creation_token\": 0,\n        \"seconds_per_output_token\": 0.008,\n        
\"MMLU_Pro_score\": 74.1,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"openai\",\n        \"sources\": null,\n        \"note\": \"it is unclear if the same ($ / token) costs can be applied for vision, or if we have to calculate this ourselves\"\n    },\n    \"openai/gpt-4o-mini-2024-07-18\": {\n        \"usd_per_input_token\": 1.5e-07,\n        \"usd_per_output_token\": 6e-07,\n        \"usd_per_cache_read_token\": 7.5e-08,\n        \"seconds_per_output_token\": 0.0159,\n        \"MMLU_Pro_score\": 62.7,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"openai\",\n        \"sources\": null\n    },\n    \"openai/gpt-4.1-2025-04-14\": {\n        \"usd_per_input_token\": 2e-06,\n        \"usd_per_output_token\": 8e-06,\n        \"usd_per_cache_read_token\": 5e-07,\n        \"seconds_per_output_token\": 0.0076,\n        \"MMLU_Pro_score\": 80.5,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"openai\",\n        \"sources\": null\n    },\n    \"openai/gpt-4.1-mini-2025-04-14\": {\n        \"usd_per_input_token\": 4e-07,\n        \"usd_per_output_token\": 1.6e-06,\n        \"usd_per_cache_read_token\": 1.0e-07,\n        \"seconds_per_output_token\": 0.0161,\n        
\"MMLU_Pro_score\": 77.2,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"openai\",\n        \"sources\": null\n    },\n    \"openai/gpt-4.1-nano-2025-04-14\": {\n        \"usd_per_input_token\": 1.0e-07,\n        \"usd_per_output_token\": 4.0e-07,\n        \"usd_per_cache_read_token\": 2.5e-08,\n        \"seconds_per_output_token\": 0.006,\n        \"MMLU_Pro_score\": 62.3,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"openai\",\n        \"sources\": null\n    },\n    \"openai/gpt-5-2025-08-07\": {\n        \"usd_per_input_token\": 1.25e-06,\n        \"usd_per_output_token\": 1e-05,\n        \"usd_per_cache_read_token\": 1.25e-07,\n        \"seconds_per_output_token\": 0.006,\n        \"MMLU_Pro_score\": 87.0,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_gpt_5_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"openai\",\n        \"sources\": null\n    },\n    \"openai/gpt-5-mini-2025-08-07\": {\n        \"usd_per_input_token\": 2.5e-07,\n        \"usd_per_output_token\": 2e-06,\n        \"usd_per_audio_input_token\": 0,\n        \"usd_per_cache_read_token\": 2.5e-08,\n        \"seconds_per_output_token\": 0.0135,\n        \"MMLU_Pro_score\": 82.5,\n        \"is_reasoning_model\": true,\n    
    \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_gpt_5_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"openai\",\n        \"sources\": [\"https://platform.openai.com/docs/pricing\"]\n    },\n    \"openai/gpt-5-nano-2025-08-07\": {\n        \"usd_per_input_token\": 5.0e-08,\n        \"usd_per_output_token\": 4.0e-07,\n        \"usd_per_cache_read_token\": 5e-09,\n        \"seconds_per_output_token\": 0.0055,\n        \"MMLU_Pro_score\": 77.9,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_gpt_5_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"openai\",\n        \"sources\": [\"https://platform.openai.com/docs/pricing\"]\n    },\n    \"openai/gpt-5.2-2025-12-11\": {\n        \"usd_per_input_token\": 1.75e-06,\n        \"usd_per_output_token\": 1.4e-05,\n        \"usd_per_audio_input_token\": 0,\n        \"usd_per_cache_read_token\": 1.75e-07,\n        \"seconds_per_output_token\": 0.01471,\n        \"MMLU_Pro_score\": 86.23,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_gpt_5_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"openai\",\n        \"sources\": [\"https://platform.openai.com/docs/pricing\"]\n    },\n    \"openai/o4-mini-2025-04-16\": {\n        \"usd_per_input_token\": 1.1e-06,\n        \"usd_per_output_token\": 4.4e-06,\n        \"usd_per_cache_read_token\": 
2.75e-07,\n        \"seconds_per_output_token\": 0.0092,\n        \"MMLU_Pro_score\": 80.6,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_o_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"openai\",\n        \"sources\": null\n    },\n    \"azure/gpt-4o-2024-08-06\": {\n        \"usd_per_input_token\": 2.5e-06,\n        \"usd_per_output_token\": 1e-05,\n        \"usd_per_cache_read_token\": 1.25e-06,\n        \"usd_per_cache_creation_token\": 0,\n        \"seconds_per_output_token\": 0.008,\n        \"MMLU_Pro_score\": 74.1,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"azure\",\n        \"sources\": null,\n        \"note\": \"Azure OpenAI variant of gpt-4o; pricing may vary by Azure region\"\n    },\n    \"azure/gpt-4o-mini-2024-07-18\": {\n        \"usd_per_input_token\": 1.5e-07,\n        \"usd_per_output_token\": 6e-07,\n        \"usd_per_cache_read_token\": 7.5e-08,\n        \"seconds_per_output_token\": 0.0159,\n        \"MMLU_Pro_score\": 62.7,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"azure\",\n        \"sources\": null\n    },\n    \"azure/gpt-4.1-2025-04-14\": {\n        \"usd_per_input_token\": 2e-06,\n        \"usd_per_output_token\": 8e-06,\n        
\"usd_per_cache_read_token\": 5e-07,\n        \"seconds_per_output_token\": 0.0076,\n        \"MMLU_Pro_score\": 80.5,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"azure\",\n        \"sources\": null\n    },\n    \"azure/gpt-4.1-mini-2025-04-14\": {\n        \"usd_per_input_token\": 4e-07,\n        \"usd_per_output_token\": 1.6e-06,\n        \"usd_per_cache_read_token\": 1.0e-07,\n        \"seconds_per_output_token\": 0.0161,\n        \"MMLU_Pro_score\": 77.2,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"azure\",\n        \"sources\": null\n    },\n    \"azure/gpt-4.1-nano-2025-04-14\": {\n        \"usd_per_input_token\": 1.0e-07,\n        \"usd_per_output_token\": 4.0e-07,\n        \"usd_per_cache_read_token\": 3e-08,\n        \"seconds_per_output_token\": 0.006,\n        \"MMLU_Pro_score\": 62.3,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"azure\",\n        \"sources\": null\n    },\n    \"azure/o4-mini-2025-04-16\": {\n        \"usd_per_input_token\": 1.1e-06,\n        \"usd_per_output_token\": 4.4e-06,\n        \"usd_per_cache_read_token\": 2.8e-07,\n        \"seconds_per_output_token\": 0.0092,\n        \"MMLU_Pro_score\": 80.6,\n        
\"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_o_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"azure\",\n        \"sources\": null\n    },\n    \"azure/gpt-4o-audio-preview\": {\n        \"usd_per_input_token\": 2.5e-06,\n        \"usd_per_output_token\": 1.0e-05,\n        \"usd_per_input_audio_token\": 40e-6,\n        \"usd_per_output_audio_token\": 80e-6,\n        \"seconds_per_output_token\": 0.008,\n        \"MMLU_Pro_score\": 74.1,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"azure\",\n        \"sources\": null\n    },\n    \"azure/gpt-4o-mini-audio-preview\": {\n        \"usd_per_input_token\": 0.6e-06,\n        \"usd_per_output_token\": 2.4e-06,\n        \"usd_per_audio_input_token\": 10e-06,\n        \"usd_per_audio_output_token\": 20e-06,\n        \"seconds_per_output_token\": 0.0159,\n        \"MMLU_Pro_score\": 62.7,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"provider\": \"azure\",\n        \"sources\": null\n    },\n    \"anthropic/claude-3-7-sonnet-20250219\": {\n        \"usd_per_input_token\": 3e-06,\n        \"usd_per_output_token\": 1.5e-05,\n        \"usd_per_cache_read_token\": 3e-07,\n        \"usd_per_cache_creation_token\": 3.75e-06,\n        \"seconds_per_output_token\": 0.0156,\n        \"MMLU_Pro_score\": 80.7,\n        
\"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"anthropic\",\n        \"sources\": [\"https://platform.claude.com/docs/en/about-claude/pricing\"]\n    },\n    \"anthropic/claude-sonnet-4-20250514\": {\n        \"usd_per_input_token\": 3e-06,\n        \"usd_per_output_token\": 1.5e-05,\n        \"usd_per_cache_read_token\": 3e-07,\n        \"usd_per_cache_creation_token\": 3.75e-06,\n        \"seconds_per_output_token\": 0.014025245441795233,\n        \"MMLU_Pro_score\": 83.87,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"anthropic\",\n        \"sources\": [\"https://platform.claude.com/docs/en/about-claude/pricing\"]\n    },\n    \"anthropic/claude-sonnet-4-5-20250929\": {\n        \"usd_per_input_token\": 3e-06,\n        \"usd_per_output_token\": 1.5e-05,\n        \"usd_per_cache_read_token\": 3e-07,\n        \"usd_per_cache_creation_token\": 3.75e-06,\n        \"seconds_per_output_token\": 0.012722646310432571,\n        \"MMLU_Pro_score\": 87.36,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"anthropic\",\n        \"sources\": [\"https://platform.claude.com/docs/en/about-claude/pricing\"]\n    },\n    \"anthropic/claude-3-5-haiku-20241022\": {\n        \"usd_per_input_token\": 
8e-07,\n        \"usd_per_output_token\": 4e-06,\n        \"usd_per_cache_read_token\": 8e-08,\n        \"usd_per_cache_creation_token\": 1e-06,\n        \"seconds_per_output_token\": 0.0189,\n        \"MMLU_Pro_score\": 64.1,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"anthropic\",\n        \"sources\": [\"https://platform.claude.com/docs/en/about-claude/pricing\"]\n    },\n    \"anthropic/claude-haiku-4-5-20251001\": {\n        \"usd_per_input_token\": 1e-06,\n        \"usd_per_output_token\": 5e-06,\n        \"usd_per_cache_read_token\": 1e-07,\n        \"usd_per_cache_creation_token\": 1.25e-06,\n        \"seconds_per_output_token\": 0.0084530853761623,\n        \"MMLU_Pro_score\": 78.72,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"anthropic\",\n        \"sources\": [\"https://platform.claude.com/docs/en/about-claude/pricing\"]\n    },\n    \"anthropic/claude-opus-4-5-20251101\": {\n        \"usd_per_input_token\": 5e-06,\n        \"usd_per_output_token\": 2.5e-05,\n        \"usd_per_cache_read_token\": 5e-07,\n        \"usd_per_cache_creation_token\": 6.25e-06,\n        \"seconds_per_output_token\": 0.01642,\n        \"MMLU_Pro_score\": 87.3,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": 
true,\n        \"provider\": \"anthropic\",\n        \"sources\": null\n    },\n    \"vertex_ai/gemini-3-pro-preview\": {\n        \"usd_per_input_token\": 2e-06,\n        \"usd_per_audio_input_token\": 2e-06,\n        \"usd_per_output_token\": 1.2e-05,\n        \"usd_per_image_output_token\": 120e-06,\n        \"usd_per_cache_read_token\": 2e-07,\n        \"usd_per_cached_token_per_hour\": 4.50e-06,\n        \"seconds_per_output_token\": 0.0075758,\n        \"MMLU_Pro_score\": 90.1,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"vertex_ai\",\n        \"sources\": [\"https://cloud.google.com/vertex-ai/generative-ai/pricing\"]\n    },\n    \"vertex_ai/gemini-3-flash-preview\": {\n        \"usd_per_input_token\": 0.5e-06,\n        \"usd_per_output_token\": 3e-06,\n        \"usd_per_audio_input_token\": 1e-06,\n        \"usd_per_cache_read_token\": 0.05e-06,\n        \"usd_per_audio_cache_read_token\": 0.1e-06,\n        \"usd_per_cached_token_per_hour\": 1e-06,\n        \"seconds_per_output_token\": 0.00457247,\n        \"MMLU_Pro_score\": 87.63,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"vertex_ai\",\n        \"sources\": [\"https://cloud.google.com/vertex-ai/generative-ai/pricing\"]\n    },\n    \"vertex_ai/gemini-2.0-flash\": {\n        \"usd_per_input_token\": 1.5e-07,\n        \"usd_per_output_token\": 6e-07,\n        \"usd_per_audio_input_token\": 1e-06,\n        \"usd_per_cached_token_per_hour\": 1e-06,\n        
\"seconds_per_output_token\": 0.0054,\n        \"MMLU_Pro_score\": 77.4,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"vertex_ai\",\n        \"sources\": [\"https://cloud.google.com/vertex-ai/generative-ai/pricing\"],\n        \"note\": \"MMLU-pro score interpolated between gemini 2.5 flash and gemini 2.0 flash\"\n    },\n    \"vertex_ai/gemini-2.5-flash-lite\": {\n        \"usd_per_input_token\": 0.1e-06,\n        \"usd_per_output_token\": 0.4e-06,\n        \"usd_per_audio_input_token\": 0.3e-06,\n        \"usd_per_cache_read_token\": 0.010e-06,\n        \"usd_per_cached_token_per_hour\": 1e-06,\n        \"usd_per_audio_cache_read_token\": 0.030e-06,\n        \"seconds_per_output_token\": 0.0034,\n        \"MMLU_Pro_score\": 79.1,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"vertex_ai\",\n        \"sources\": null,\n        \"note\": \"MMLU-pro score interpolated between gemini 2.5 flash and gemini 2.0 flash\"\n    },\n    \"vertex_ai/gemini-2.5-flash\": {\n        \"usd_per_input_token\": 0.30e-06,\n        \"usd_per_output_token\": 2.5e-06,\n        \"usd_per_image_output_token\": 30e-06,\n        \"usd_per_audio_input_token\": 1e-06,\n        \"usd_per_cache_read_token\": 3e-08,\n        \"usd_per_cached_token_per_hour\": 1e-06,\n        \"seconds_per_output_token\": 0.0044,\n        \"MMLU_Pro_score\": 80.75,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        
\"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"vertex_ai\",\n        \"sources\": [\"https://cloud.google.com/vertex-ai/generative-ai/pricing\"]\n    },\n    \"vertex_ai/gemini-2.5-pro\": {\n        \"usd_per_input_token\": 1.25e-06,\n        \"usd_per_output_token\": 1e-05,\n        \"usd_per_audio_input_token\": 1.25e-06,\n        \"usd_per_cache_read_token\": 1.25e-07,\n        \"usd_per_cached_token_per_hour\": 4.50e-06,\n        \"seconds_per_output_token\": 0.0072,\n        \"MMLU_Pro_score\": 84.1,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"vertex_ai\",\n        \"sources\": [\"https://cloud.google.com/vertex-ai/generative-ai/pricing\"]\n    },\n    \"gemini/gemini-3-pro-image-preview\": {\n        \"usd_per_input_token\": 2e-06,\n        \"usd_per_audio_input_token\": 2e-06,\n        \"usd_per_output_token\": 1.2e-05,\n        \"usd_per_image_output_token\": 120e-06,\n        \"usd_per_cache_read_token\": 2e-07,\n        \"seconds_per_output_token\": 0.0075758,\n        \"MMLU_Pro_score\": 90.1,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"gemini\",\n        \"sources\": [\"https://ai.google.dev/gemini-api/docs/pricing\"]\n    },\n    \"gemini/gemini-3-pro-preview\": {\n        \"usd_per_input_token\": 2e-06,\n        \"usd_per_audio_input_token\": 2e-06,\n        
\"usd_per_output_token\": 1.2e-05,\n        \"usd_per_cache_read_token\": 2e-07,\n        \"usd_per_cached_token_per_hour\": 4.50e-06,\n        \"seconds_per_output_token\": 0.0075758,\n        \"MMLU_Pro_score\": 90.1,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"gemini\",\n        \"sources\": [\"https://ai.google.dev/gemini-api/docs/pricing\"]\n    },\n    \"gemini/gemini-3-flash-preview\": {\n        \"usd_per_input_token\": 5e-07,\n        \"usd_per_output_token\": 3e-06,\n        \"usd_per_audio_input_token\": 1e-06,\n        \"usd_per_cache_read_token\": 5e-08,\n        \"usd_per_cached_token_per_hour\": 1e-06,\n        \"seconds_per_output_token\": 0.00457247,\n        \"MMLU_Pro_score\": 87.63,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"gemini\",\n        \"sources\": [\"https://ai.google.dev/gemini-api/docs/pricing\"]\n    },\n    \"gemini/gemini-2.5-flash\": {\n        \"usd_per_input_token\": 3e-07,\n        \"usd_per_output_token\": 2.5e-06,\n        \"usd_per_audio_input_token\": 1e-06,\n        \"usd_per_cache_read_token\": 3e-08,\n        \"usd_per_cached_token_per_hour\": 1e-06,\n        \"seconds_per_output_token\": 0.0044,\n        \"MMLU_Pro_score\": 80.75,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        
\"supports_prompt_caching\": true,\n        \"provider\": \"gemini\",\n        \"sources\": [\"https://ai.google.dev/gemini-api/docs/pricing\"]\n    },\n    \"gemini/gemini-2.5-flash-lite\": {\n        \"usd_per_input_token\": 1e-07,\n        \"usd_per_output_token\": 4e-07,\n        \"usd_per_cache_read_token\": 0.01e-06,\n        \"usd_per_audio_cache_read_token\": 0.03e-06,\n        \"usd_per_cached_token_per_hour\": 1e-06,\n        \"usd_per_audio_input_token\": 3e-07,\n        \"seconds_per_output_token\": 0.0034,\n        \"MMLU_Pro_score\": 79.1,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"gemini\",\n        \"sources\": [\"https://ai.google.dev/gemini-api/docs/pricing\"],\n        \"note\": \"MMLU-pro score interpolated between gemini 2.5 flash and gemini 2.0 flash\"\n    },\n    \"gemini/gemini-2.5-pro\": {\n        \"usd_per_input_token\": 1.25e-06,\n        \"usd_per_output_token\": 1e-05,\n        \"usd_per_audio_input_token\": 1.25e-06,\n        \"usd_per_cache_read_token\": 1.25e-07,\n        \"usd_per_cached_token_per_hour\": 4.50e-06,\n        \"seconds_per_output_token\": 0.0072,\n        \"MMLU_Pro_score\": 84.1,\n        \"is_reasoning_model\": true,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": true,\n        \"provider\": \"gemini\",\n        \"sources\": [\"https://ai.google.dev/gemini-api/docs/pricing\"]\n    },\n    \"vertex_ai/meta/llama-4-maverick-17b-128e-instruct-maas\": {\n        \"usd_per_input_token\": 3.5e-07,\n        \"usd_per_output_token\": 1.15e-06,\n        
\"seconds_per_output_token\": 0.0122,\n        \"MMLU_Pro_score\": 79.4,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": true,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"vertex_ai\",\n        \"sources\": null\n    },\n    \"openai/gpt-4o-audio-preview\": {\n        \"usd_per_input_token\": 2.5e-06,\n        \"usd_per_output_token\": 1.0e-05,\n        \"usd_per_input_audio_token\": 40e-6,\n        \"usd_per_output_audio_token\": 80e-6,\n        \"seconds_per_output_token\": 0.008,\n        \"MMLU_Pro_score\": 74.1,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"supports_prompt_caching\": false,\n        \"provider\": \"openai\",\n        \"sources\": [\"https://platform.openai.com/docs/models/gpt-4o-audio-preview\"]\n    },\n    \"openai/gpt-4o-mini-audio-preview\": {\n        \"usd_per_input_token\": 0.15e-06,\n        \"usd_per_output_token\": 0.60e-06,\n        \"usd_per_audio_input_token\": 10e-06,\n        \"usd_per_audio_output_token\": 20e-06,\n        \"seconds_per_output_token\": 0.0159,\n        \"MMLU_Pro_score\": 62.7,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"provider\": \"openai\",\n        \"sources\": [\"https://platform.openai.com/docs/models/gpt-4o-mini-audio-preview\"]\n    },\n    \"hosted_vllm/qwen/Qwen1.5-0.5B-Chat\": {\n        \"usd_per_input_token\": 0.0,\n        \"usd_per_output_token\": 0.0,\n        
\"seconds_per_output_token\": 0.1,\n        \"MMLU_Pro_score\": 30.0,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": true,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_vllm_model\": true,\n        \"is_embedding_model\": false,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"provider\": \"hosted_vllm\",\n        \"sources\": null,\n        \"note\": \"placeholder values; fill in seconds per output token and MMLU-pro score with better estimates\"\n    },\n    \"openai/text-embedding-3-small\": {\n        \"usd_per_input_token\": 2e-08,\n        \"usd_per_output_token\": null,\n        \"seconds_per_output_token\": 0.0098,\n        \"MMLU_Pro_score\": 63.09,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": false,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": true,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"provider\": \"openai\",\n        \"sources\": [\"https://platform.openai.com/docs/models/text-embedding-3-small\"],\n        \"note\": \"just copying GPT_4o_MINI_MODEL_CARD for now for overall and seconds per output token\"\n    },\n    \"openai/text-embedding-3-large\": {\n        \"usd_per_input_token\": 0.13e-06,\n        \"usd_per_output_token\": null,\n        \"seconds_per_output_token\": null,\n        \"MMLU_Pro_score\": null,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": false,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": true,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"provider\": \"openai\",\n        \"sources\": [\"https://platform.openai.com/docs/models/text-embedding-3-large\"],\n        \"note\": \"just copying GPT_4o_MINI_MODEL_CARD for now for overall and seconds per output token\"\n    },\n    \"clip-ViT-B-32\": {\n        \"usd_per_input_token\": 0.0,\n 
       \"usd_per_output_token\": null,\n        \"seconds_per_output_token\": 0.0098,\n        \"MMLU_Pro_score\": 63.3,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": false,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_clip_model\": true,\n        \"is_embedding_model\": true,\n        \"is_text_image_multimodal_embedding_model\": true,\n        \"provider\": \"together_ai\",\n        \"sources\": null,\n        \"note\": \"just copying TEXT_EMBEDDING_3_SMALL_MODEL_CARD for now for seconds per output token; imageNet top-1 accuracy\"\n    },\n    \"ollama/nomic-embed-text\": {\n        \"usd_per_input_token\": 0.0,\n        \"usd_per_output_token\": null,\n        \"seconds_per_output_token\": 0.01,\n        \"MMLU_Pro_score\": 50.0,\n        \"is_reasoning_model\": false,\n        \"is_text_model\": false,\n        \"is_vision_model\": false,\n        \"is_audio_model\": false,\n        \"is_embedding_model\": true,\n        \"is_text_image_multimodal_embedding_model\": false,\n        \"provider\": \"ollama\",\n        \"sources\": [],\n        \"note\": \"copied text-embedding-3-small model card and made numbers slightly worse to protect default behavior for max quality and min latency\"\n    }\n}"
  },
  {
    "path": "src/palimpzest/utils/udfs.py",
    "content": "\"\"\"\nThis file collects a sample of useful UDFs to convert schemata.\n\"\"\"\n\nimport io\nfrom datetime import datetime\n\nimport pandas as pd\nimport requests\n\nfrom palimpzest.constants import MAX_ROWS\n\n\ndef url_to_file(candidate: dict):\n    \"\"\"Function used to convert a DataRecord instance of URL to a File DataRecord.\"\"\"\n    url = candidate[\"url\"]\n    filename = url.split(\"/\")[-1]\n    timestamp = datetime.now().isoformat()\n    try:\n        contents = requests.get(url).content\n    except Exception as e:\n        print(f\"Error fetching URL {url}: {e}\")\n        contents = b\"\"\n\n    return {\"filename\": filename, \"timestamp\": timestamp, \"contents\": contents}\n\n\ndef file_to_xls(candidate: dict):\n    \"\"\"Function used to convert a DataRecord instance of File to a XLSFile DataRecord.\"\"\"\n    xls = pd.ExcelFile(io.BytesIO(candidate[\"contents\"]), engine=\"openpyxl\")\n    return {\"number_sheets\": len(xls.sheet_names), \"sheet_names\": xls.sheet_names}\n\n\ndef xls_to_tables(candidate: dict):\n    \"\"\"Function used to convert a DataRecord instance of XLSFile to a Table DataRecord.\"\"\"\n    xls_bytes = candidate[\"contents\"]\n    sheet_names = candidate[\"sheet_names\"]\n\n    records = []\n    for sheet_name in sheet_names:\n        dataframe = pd.read_excel(io.BytesIO(xls_bytes), sheet_name=sheet_name, engine=\"openpyxl\")\n\n        # TODO extend number of rows with dynamic sizing of context length\n        # construct data record\n        record = {}\n        rows = []\n        for row in dataframe.values[:100]:\n            row_record = [str(x) for x in row]\n            rows += [row_record]\n        record[\"rows\"] = rows[:MAX_ROWS]\n        record[\"filename\"] = candidate[\"filename\"]\n        record[\"header\"] = dataframe.columns.values.tolist()\n        record[\"name\"] = candidate[\"filename\"].split(\"/\")[-1] + \"_\" + sheet_name\n        records.append(record)\n\n    return records\n"
  },
  {
    "path": "src/palimpzest/validator/__init__.py",
    "content": ""
  },
  {
    "path": "src/palimpzest/validator/validator.py",
    "content": "import json\nimport time\n\nimport litellm\n\n# from colorama import Fore, Style\nfrom palimpzest.constants import Cardinality, Model, PromptStrategy\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.models import GenerationStats\nfrom palimpzest.prompts import (\n    FLAT_MAP_IMAGE_VALIDATOR_PROMPT,\n    FLAT_MAP_VALIDATOR_PROMPT,\n    MAP_IMAGE_VALIDATOR_PROMPT,\n    MAP_VALIDATOR_PROMPT,\n    RETRIEVE_VALIDATOR_PROMPT,\n    PromptFactory,\n)\nfrom palimpzest.query.generators.generators import get_json_from_answer\nfrom palimpzest.query.operators.convert import LLMConvert\nfrom palimpzest.query.operators.filter import LLMFilter\nfrom palimpzest.query.operators.join import JoinOp\nfrom palimpzest.query.operators.topk import TopKOp\n\n\nclass Validator:\n    \"\"\"\n    The Validator is used during optimization to score the output of physical operator(s) and physical plan(s).\n\n    TODO: support end-to-end labels; will likely require a different SentinelExecutionStrategy which\n          executes the full input to produce an output, evaluates the output, and then updates\n          intermediate operator(s) based on the evaluation.\n    \"\"\"\n    def __init__(self, model: Model = Model.o4_MINI):\n        self.model = model\n        self.filter_cache = {}\n        self.join_cache = {}\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        raise NotImplementedError(\"Validator.map_score_fn not implemented.\")\n\n    def flat_map_score_fn(self, fields: list[str], input_record: dict, output: list[dict]) -> float | None:\n        raise NotImplementedError(\"Validator.flat_map_score_fn not implemented.\")\n\n    def filter_score_fn(self, filter_str: str, input_record: dict, output: bool) -> float | None:\n        raise NotImplementedError(\"Validator.filter_score_fn not implemented.\")\n\n    def join_score_fn(self, condition: str, left_input_record: dict, 
right_input_record: dict, output: bool) -> float | None:\n        raise NotImplementedError(\"Validator.join_score_fn not implemented.\")\n\n    def topk_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        raise NotImplementedError(\"Validator.topk_score_fn not implemented.\")\n\n    def _get_gen_stats_from_completion(self, completion, start_time: float) -> GenerationStats:\n        \"\"\"\n        Extract generation stats from the given completion response.\n        \"\"\"\n        usage = completion.usage.model_dump()\n\n        # get cost per input/output token for the model and parse number of input and output tokens\n        usd_per_input_token = self.model.get_usd_per_input_token()\n        usd_per_output_token = self.model.get_usd_per_output_token()\n        input_tokens = usage[\"prompt_tokens\"]\n        output_tokens = usage[\"completion_tokens\"]\n\n        return GenerationStats(\n            model_name=self.model.value,\n            llm_call_duration_secs=time.time() - start_time,\n            fn_call_duration_secs=0.0,\n            input_text_tokens=input_tokens,\n            output_text_tokens=output_tokens,\n            cost_per_record=input_tokens * usd_per_input_token + output_tokens * usd_per_output_token,\n            total_llm_calls=1,\n        )\n\n    def _default_map_score_fn(self, op: LLMConvert, fields: list[str], input_record: DataRecord, output: dict) -> tuple[float | None, GenerationStats]:\n        \"\"\"\n        Compute the quality of the generated output for the given fields and input_record.\n        \"\"\"\n        # create prompt factory\n        factory = PromptFactory(PromptStrategy.MAP, self.model, Cardinality.ONE_TO_ONE)\n\n        # get the input messages; strip out the system message(s)\n        msg_kwargs = {\"output_schema\": op.output_schema, \"project_cols\": op.get_input_fields()}\n        messages = factory.create_messages(input_record, fields, **msg_kwargs)\n        
input_messages = [msg for msg in messages if msg[\"role\"] != \"system\"]\n        output = json.dumps(output, indent=2)\n        output_message = f\"OUTPUT:\\n--------\\n{output}\\n\\nEVALUATION: \"\n        # input_str = '\\n'.join(list(map(lambda d: d['content'], input_messages + [{\"role\": \"user\", \"content\": output_message}])))\n\n        # invoke the judge\n        score, gen_stats = None, GenerationStats()\n        try:\n            start_time = time.time()\n            validator_prompt = MAP_IMAGE_VALIDATOR_PROMPT if op.is_image_op() else MAP_VALIDATOR_PROMPT\n            val_messages = [{\"role\": \"system\", \"content\": validator_prompt}] + input_messages + [{\"role\": \"user\", \"content\": output_message}]\n            completion = litellm.completion(model=self.model.value, messages=val_messages)\n            completion_text = completion.choices[0].message.content\n            gen_stats = self._get_gen_stats_from_completion(completion, start_time)\n            # print(f\"INPUT:\\n{input_str}\")\n            # print(Fore.GREEN + f\"{completion_text}\\n\" + Style.RESET_ALL)\n\n            # parse the evaluation\n            eval_dict: dict = get_json_from_answer(completion_text, self.model, Cardinality.ONE_TO_ONE)\n            score = sum(eval_dict.values()) / len(eval_dict)\n\n        except Exception:\n            pass\n\n        return score, gen_stats\n\n    def _default_flat_map_score_fn(self, op: LLMConvert, fields: list[str], input_record: dict, output: list[dict]) -> tuple[float | None, GenerationStats]:\n        \"\"\"\n        Compute the quality for each record_op_stats object in the given record_set.\n        \"\"\"\n        # create prompt factory\n        factory = PromptFactory(PromptStrategy.MAP, self.model, Cardinality.ONE_TO_MANY)\n\n        # get the input messages; strip out the system message(s)\n        msg_kwargs = {\"output_schema\": op.output_schema, \"project_cols\": op.get_input_fields()}\n        messages = 
factory.create_messages(input_record, fields, **msg_kwargs)\n        input_messages = [msg for msg in messages if msg[\"role\"] != \"system\"]\n        output = json.dumps(output, indent=2)\n        output_message = f\"OUTPUTS:\\n--------\\n{output}\\n\\nEVALUATION: \"\n        # input_str = '\\n'.join(list(map(lambda d: d['content'], input_messages + [{\"role\": \"user\", \"content\": output_message}])))\n\n        # invoke the judge\n        score, gen_stats = None, GenerationStats()\n        try:\n            start_time = time.time()\n            validator_prompt = FLAT_MAP_IMAGE_VALIDATOR_PROMPT if op.is_image_op() else FLAT_MAP_VALIDATOR_PROMPT\n            val_messages = [{\"role\": \"system\", \"content\": validator_prompt}] + input_messages + [{\"role\": \"user\", \"content\": output_message}]\n            # use the configured validator model so that the cost stats computed from self.model match the model actually called\n            completion = litellm.completion(model=self.model.value, messages=val_messages)\n            completion_text = completion.choices[0].message.content\n            gen_stats = self._get_gen_stats_from_completion(completion, start_time)\n            # print(f\"INPUT:\\n{input_str}\")\n            # print(Fore.GREEN + f\"{completion_text}\\n\" + Style.RESET_ALL)\n\n            # parse the evaluation\n            eval_dicts: list[dict] = get_json_from_answer(completion_text, self.model, Cardinality.ONE_TO_MANY)\n            all_qualities = []\n            for record_eval_dict in eval_dicts:\n                all_qualities.extend(record_eval_dict.values())\n            score = sum(all_qualities) / len(all_qualities)\n\n        except Exception:\n            pass\n\n        return score, gen_stats\n\n    def _default_filter_score_fn(self, op: LLMFilter, filter_str: str, input_record: dict, output: bool) -> tuple[float | None, GenerationStats]:\n        \"\"\"\n        Compute the quality of the filter decision for the given filter_str and input_record.\n        \"\"\"\n        score, gen_stats = None, GenerationStats()\n        filter_input_hash = 
hash(f\"{filter_str}{hash(input_record)}\")\n        label = self.filter_cache.get(filter_input_hash, None)\n        if label is None:\n            validator_op: LLMFilter = op.copy()\n            validator_op.model = self.model\n            try:\n                target_record_set = validator_op(input_record)\n                label = target_record_set[0]._passed_operator\n                self.filter_cache[filter_input_hash] = label\n                score = float(label == output)\n                record_op_stats = target_record_set.record_op_stats[0]\n                gen_stats = GenerationStats(\n                    model_name=self.model.value,\n                    input_text_tokens=record_op_stats.input_text_tokens,\n                    input_audio_tokens=record_op_stats.input_audio_tokens,\n                    input_image_tokens=record_op_stats.input_image_tokens,\n                    cache_read_tokens=record_op_stats.cache_read_tokens,\n                    cache_creation_tokens=record_op_stats.cache_creation_tokens,\n                    output_text_tokens=record_op_stats.output_text_tokens,\n                    embedding_input_tokens=record_op_stats.embedding_input_tokens,\n                    cost_per_record=record_op_stats.cost_per_record,\n                    llm_call_duration_secs=record_op_stats.llm_call_duration_secs,\n                    fn_call_duration_secs=record_op_stats.fn_call_duration_secs,\n                    total_llm_calls=record_op_stats.total_llm_calls,\n                    total_embedding_llm_calls=record_op_stats.total_embedding_llm_calls,\n                )\n\n            except Exception:\n                pass\n\n        else:\n            score = float(label == output)\n\n        return score, gen_stats\n\n    def _default_join_score_fn(self, op: JoinOp, condition: str, left_input_record: DataRecord, right_input_record: DataRecord, output: bool) -> tuple[float | None, GenerationStats]:\n        score, gen_stats = None, GenerationStats()\n 
       join_input_hash = hash(f\"{condition}{hash(left_input_record)}{hash(right_input_record)}\")\n        label = self.join_cache.get(join_input_hash, None)\n        if label is None:\n            validator_op: JoinOp = op.copy()\n            validator_op.model = self.model\n            try:\n                target_record_set, _ = validator_op([left_input_record], [right_input_record])\n                label = target_record_set[0]._passed_operator\n                self.join_cache[join_input_hash] = label\n                score = float(label == output)\n                record_op_stats = target_record_set.record_op_stats[0]\n                gen_stats = GenerationStats(\n                    model_name=self.model.value,\n                    input_text_tokens=record_op_stats.input_text_tokens,\n                    input_audio_tokens=record_op_stats.input_audio_tokens,\n                    input_image_tokens=record_op_stats.input_image_tokens,\n                    cache_read_tokens=record_op_stats.cache_read_tokens,\n                    cache_creation_tokens=record_op_stats.cache_creation_tokens,\n                    output_text_tokens=record_op_stats.output_text_tokens,\n                    embedding_input_tokens=record_op_stats.embedding_input_tokens,\n                    cost_per_record=record_op_stats.cost_per_record,\n                    llm_call_duration_secs=record_op_stats.llm_call_duration_secs,\n                    fn_call_duration_secs=record_op_stats.fn_call_duration_secs,\n                    total_llm_calls=record_op_stats.total_llm_calls,\n                    total_embedding_llm_calls=record_op_stats.total_embedding_llm_calls,\n                )\n\n            except Exception:\n                pass\n\n        else:\n            score = float(label == output)\n\n        return score, gen_stats\n\n    def _default_topk_score_fn(self, op: TopKOp, fields: list[str], input_record: DataRecord, output: dict) -> tuple[float | None, GenerationStats]:\n        
\"\"\"\n        Compute the quality of the generated output for the given fields and input_record.\n        \"\"\"\n        # TODO: top-k k=25; score each item based on relevance; compute F1\n        # TODO: support retrieval over images\n        # create prompt factory\n        factory = PromptFactory(PromptStrategy.MAP, self.model, Cardinality.ONE_TO_ONE)\n\n        # get the input messages; strip out the system message(s)\n        msg_kwargs = {\"output_schema\": op.output_schema, \"project_cols\": op.get_input_fields()}\n        messages = factory.create_messages(input_record, fields, **msg_kwargs)\n        input_messages = [msg for msg in messages if msg[\"role\"] != \"system\"]\n        output = json.dumps(output, indent=2)\n        output_message = f\"OUTPUT:\\n--------\\n{output}\\n\\nEVALUATION: \"\n        # input_str = '\\n'.join(list(map(lambda d: d['content'], input_messages + [{\"role\": \"user\", \"content\": output_message}])))\n\n        # invoke the judge\n        score, gen_stats = None, GenerationStats()\n        try:\n            start_time = time.time()\n            # TODO: support retrieval over images\n            validator_prompt = RETRIEVE_VALIDATOR_PROMPT\n            val_messages = [{\"role\": \"system\", \"content\": validator_prompt}] + input_messages + [{\"role\": \"user\", \"content\": output_message}]\n            # use the configured validator model so that the cost stats computed from self.model match the model actually called\n            completion = litellm.completion(model=self.model.value, messages=val_messages)\n            completion_text = completion.choices[0].message.content\n            gen_stats = self._get_gen_stats_from_completion(completion, start_time)\n            # print(f\"INPUT:\\n{input_str}\")\n            # print(Fore.GREEN + f\"{completion_text}\\n\" + Style.RESET_ALL)\n\n            # parse the evaluation\n            eval_dict: dict = get_json_from_answer(completion_text, self.model, Cardinality.ONE_TO_ONE)\n            score = sum(eval_dict.values()) / len(eval_dict)\n\n        except Exception:\n            pass\n\n        
return score, gen_stats\n\n\n    def _score_map(self, op: LLMConvert, fields: list[str], input_record: DataRecord, output: dict, full_hash: str) -> tuple[float | None, GenerationStats, str]:\n        try:\n            out = self.map_score_fn(fields, input_record.to_dict(), output)\n            score, gen_stats = out if isinstance(out, tuple) else (out, GenerationStats())\n            return score, gen_stats, full_hash\n        except NotImplementedError:\n            score, gen_stats = self._default_map_score_fn(op, fields, input_record, output)\n            return score, gen_stats, full_hash\n\n    def _score_flat_map(self, op: LLMConvert, fields: list[str], input_record: DataRecord, output: list[dict], full_hash: str) -> tuple[float | None, GenerationStats, str]:\n        try:\n            out = self.flat_map_score_fn(fields, input_record.to_dict(), output)\n            score, gen_stats = out if isinstance(out, tuple) else (out, GenerationStats())\n            return score, gen_stats, full_hash\n        except NotImplementedError:\n            score, gen_stats = self._default_flat_map_score_fn(op, fields, input_record, output)\n            return score, gen_stats, full_hash\n\n    def _score_filter(self, op: LLMFilter, filter_str: str, input_record: DataRecord, output: bool, full_hash: str) -> tuple[float | None, GenerationStats, str]:\n        try:\n            out = self.filter_score_fn(filter_str, input_record.to_dict(), output)\n            score, gen_stats = out if isinstance(out, tuple) else (out, GenerationStats())\n            return score, gen_stats, full_hash\n        except NotImplementedError:\n            score, gen_stats = self._default_filter_score_fn(op, filter_str, input_record, output)\n            return score, gen_stats, full_hash\n\n    def _score_join(self, op: JoinOp, condition: str, left_input_record: DataRecord, right_input_record: DataRecord, output: bool, full_hash: str) -> tuple[float | None, GenerationStats, str]:\n        try:\n      
      out = self.join_score_fn(condition, left_input_record.to_dict(), right_input_record.to_dict(), output)\n            score, gen_stats = out if isinstance(out, tuple) else (out, GenerationStats())\n            return score, gen_stats, full_hash\n        except NotImplementedError:\n            score, gen_stats = self._default_join_score_fn(op, condition, left_input_record, right_input_record, output)\n            return score, gen_stats, full_hash\n\n    def _score_topk(self, op: TopKOp, fields: list[str], input_record: DataRecord, output: dict, full_hash: str) -> tuple[float | None, GenerationStats, str]:\n        try:\n            out = self.topk_score_fn(fields, input_record.to_dict(), output)\n            score, gen_stats = out if isinstance(out, tuple) else (out, GenerationStats())\n            return score, gen_stats, full_hash\n        except NotImplementedError:\n            score, gen_stats = self._default_topk_score_fn(op, fields, input_record, output)\n            return score, gen_stats, full_hash\n"
  },
  {
    "path": "testdata/README.md",
    "content": "## Note About Datasets Used in Evaluation\nEnron is run using the `enron-eval` dataset.\n\nReal Estate is run using the `real-estate-eval` dataset.\n\nFor the easy and hard code generation evaluations, we needed a range of dataset sizes based on the `real-estate-eval` dataset, so we created `real-estate-eval-5`, `real-estate-eval-10`, ..., `real-estate-eval-30`. Note that `real-estate-eval-15` should be equivalent to `real-estate-eval`.\n\nGroundtruth labels are stored in the `groundtruth` folder. The Enron and Real Estate groundtruth files match the evaluation directories of the same name. The codegen evaluations use slightly different criteria than Real Estate, so they have their own label files, e.g. `codegen-easy-eval-[5,30].csv`, which map to `real-estate-eval-[5,30]`.\n"
  },
  {
    "path": "testdata/download-testdata.sh",
    "content": "#!/bin/bash\n# This script can be used to download and extract the test data for the palimpzest demos\n# Usage (from the repository root): bash testdata/download-testdata.sh\n# Requirements: wget, tar\n\n# Move to the testdata directory\npushd testdata\n# Download the test data\necho \"Downloading the test data...\"\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/askem.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/askem-tiny.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/bdf-usecase3-pdf.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/bdf-usecase3-references.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/bdf-usecase3-references-pdf.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/bdf-usecase3-references-pdffull.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/bdf-usecase3-tiny.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/biofabric-html.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/biofabric-matching.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/biofabric-medium.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/biofabric-tiny.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/biofabric-tiny-filtered.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/biofabric-urls.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/enron-eval.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/enron-eval-tiny.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/enron-small.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/enron-tiny.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/equation-tiny.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/groundtruth.tar.gz\nwget -nc 
https://people.csail.mit.edu/gerarvit/PalimpzestData/images-tiny.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/pdfs-tiny.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/real-estate-eval.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/real-estate-eval-10.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/real-estate-eval-15.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/real-estate-eval-20.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/real-estate-eval-5.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/real-estate-eval-tiny.tar.gz\nwget -nc https://palimpzest-workloads.s3.us-east-1.amazonaws.com/real-estate-eval-100.tar\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/vldbdownload.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/real-estate-eval-25.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/real-estate-eval-30.tar.gz \nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/biofabric-pdf.tar.gz\nwget -nc https://people.csail.mit.edu/gerarvit/PalimpzestData/enron-tiny.csv\n\necho \"Extracting the test data...\"\n# Extract the test data\nfor f in *.tar.gz; do\n  tar -xzf $f\ndone\nrm *.tar.gz\npopd\necho \"Done!\"\n"
  },
  {
    "path": "testdata/enron-eval-medium-labels.json",
    "content": "{\"allen-p-inbox-45.txt\": [], \"crandell-s-inbox-241.txt\": [], \"blair-l-inbox-267.txt\": [], \"forney-j-inbox-149.txt\": [], \"ermis-f-inbox-149.txt\": [], \"bass-e-inbox-227.txt\": [], \"arora-h-inbox-receipts-2.txt\": [], \"donohoe-t-inbox-451.txt\": [], \"townsend-j-inbox-7-short.txt\": [], \"arnold-j-inbox-166.txt\": [], \"germany-c-inbox-3.txt\": [], \"ermis-f-inbox-3.txt\": [], \"derrick-j-inbox-103.txt\": [], \"carson-m-inbox-125.txt\": [], \"giron-d-inbox-241.txt\": [], \"derrick-j-inbox-75.txt\": [], \"carson-m-inbox-76.txt\": [], \"kaminski-v-technote-mail-projects-1.txt\": [{\"sender\": \"ron.baker@enron.com\", \"subject\": \"Valuation Methodology\"}], \"brawner-s-inbox-10.txt\": [], \"beck-s-inbox-202.txt\": [], \"ermis-f-inbox-565.txt\": [], \"gay-r-inbox-9.txt\": [], \"arnold-j-inbox-50.txt\": [], \"germany-c-inbox-125.txt\": [], \"blair-l-inbox-53.txt\": [], \"dorland-c-inbox-50.txt\": [], \"delainey-d-inbox-9.txt\": [], \"farmer-d-inbox-69.txt\": [], \"cuilla-m-inbox-20.txt\": [], \"causholli-m-inbox-109.txt\": [], \"fischer-m-inbox-9.txt\": [], \"arora-h-inbox-9.txt\": [], \"donoho-l-inbox-165.txt\": [], \"farmer-d-inbox-96.txt\": [], \"delainey-d-inbox-11.txt\": [], \"davis-d-inbox-wells-19.txt\": [], \"delainey-d-inbox-10.txt\": [], \"buy-r-inbox-710.txt\": [], \"davis-d-inbox-wells-18.txt\": [], \"derrick-j-inbox-248.txt\": [], \"corman-s-inbox-measurement-9.txt\": [], \"gilbertsmith-d-inbox-84.txt\": [], \"lay-k-inbox-667.txt\": [], \"gilbertsmith-d-inbox-53.txt\": [], \"delainey-d-inbox-8.txt\": [], \"cuilla-m-inbox-35.txt\": [], \"beck-s-inbox-983.txt\": [], \"forney-j-inbox-95.txt\": [], \"dorland-c-inbox-45.txt\": [], \"delainey-d-sent-295.txt\": [{\"sender\": \"david.delainey@enron.com\", \"subject\": \"AIG Fund\"}], \"germany-c-inbox-19.txt\": [], \"donohoe-t-inbox-9.txt\": [], \"giron-d-inbox-9.txt\": [], \"griffith-j-inbox-10.txt\": [], \"benson-r-inbox-53.txt\": [], \"parks-j-deleted-items-913-short.txt\": [], 
\"lay-k-inbox-1693.txt\": [], \"arora-h-inbox-receipts-3.txt\": [], \"corman-s-inbox-59.txt\": [], \"donohoe-t-inbox-336.txt\": [], \"kaminski-v-all-documents-1103.txt\": [{\"sender\": \"david.port@enron.com\", \"subject\": \"RE: Raptors\"}], \"gang-l-inbox-45.txt\": [], \"allen-p-inbox-87.txt\": [], \"dasovich-j-inbox-960.txt\": [], \"gay-r-inbox-10.txt\": [], \"bass-e-inbox-36.txt\": [], \"arora-h-inbox-84.txt\": [], \"beck-s-inbox-149.txt\": [], \"allen-p-inbox-78.txt\": [], \"geaccone-t-inbox-3.txt\": [], \"brawner-s-inbox-3.txt\": [], \"germany-c-inbox-287.txt\": [], \"benson-r-inbox-208.txt\": [], \"cash-m-inbox-208.txt\": [], \"dasovich-j-inbox-96.txt\": [], \"baughman-d-inbox-19.txt\": [], \"corman-s-inbox-67.txt\": [], \"kaminski-v-all-documents-2352.txt\": [{\"sender\": \"ron.baker@enron.com\", \"subject\": \"RE: Cross-Guarantees\"}], \"blair-l-inbox-264.txt\": [], \"buy-r-inbox-100.txt\": [], \"arora-h-inbox-receipts-1.txt\": [], \"arnold-j-inbox-165.txt\": [], \"brawner-s-inbox-100.txt\": [], \"carson-m-inbox-126.txt\": [], \"derrick-j-inbox-100.txt\": [], \"giron-d-inbox-13.txt\": [], \"arnold-j-inbox-84.txt\": [], \"carson-m-inbox-75.txt\": [], \"brawner-s-inbox-13.txt\": [], \"bass-e-inbox-185.txt\": [], \"lay-k-inbox-1083.txt\": [], \"arnold-j-inbox-53.txt\": [], \"ermis-f-inbox-566.txt\": [], \"dorland-c-inbox-53.txt\": [], \"martin-t-inbox-53-short.txt\": [], \"blair-l-inbox-50.txt\": [], \"cuilla-m-inbox-23.txt\": [], \"derrick-j-sent-items-536.txt\": [{\"sender\": \"james.derrick@enron.com\", \"subject\": \"FW: Raptor Notes\"}], \"beck-s-inbox-759.txt\": [], \"donohoe-t-inbox-19.txt\": [], \"donoho-l-inbox-166.txt\": [], \"delainey-d-inbox-12.txt\": [], \"farmer-d-inbox-95.txt\": [], \"forney-j-inbox-9.txt\": [], \"germany-c-inbox-319.txt\": [], \"delainey-d-inbox-13.txt\": [], \"donohoe-t-inbox-241.txt\": [], \"cuilla-m-inbox-36.txt\": [], \"gilbertsmith-d-inbox-50.txt\": [], \"dorland-c-inbox-46.txt\": [], \"giron-d-inbox-451.txt\": [], 
\"crandell-s-inbox-109.txt\": [], \"arnold-j-inbox-3.txt\": [], \"forney-j-inbox-96.txt\": [], \"benson-r-inbox-143.txt\": [], \"cuilla-m-inbox-9.txt\": [], \"kaminski-v-inbox-92.txt\": [{\"sender\": \"j.kaminski@enron.com\", \"subject\": \"FW: Raptors\"}], \"griffith-j-inbox-13.txt\": [], \"baughman-d-inbox-165.txt\": [], \"campbell-l-inbox-1256.txt\": [], \"cash-m-inbox-143.txt\": [], \"dean-c-inbox-838.txt\": [], \"benson-r-inbox-50.txt\": [], \"beck-s-inbox-412.txt\": [], \"campbell-l-inbox-166.txt\": [], \"geaccone-t-inbox-100.txt\": [], \"bass-e-notes-inbox-48-short.txt\": [], \"donoho-l-inbox-19.txt\": [], \"gang-l-inbox-46.txt\": [], \"allen-p-inbox-84.txt\": [], \"buy-r-inbox-667.txt\": [], \"bass-e-inbox-35.txt\": [], \"gay-r-inbox-13.txt\": [], \"arora-h-inbox-87.txt\": [], \"cash-m-inbox-96.txt\": [], \"beck-s-inbox-166.txt\": [], \"campbell-l-inbox-412.txt\": [], \"geaccone-t-inbox-474.txt\": [], \"whalley-g-merchant-investments-3.txt\": [{\"sender\": \"kevin.garland@enron.com\", \"subject\": \"Enron Principal Investments Update\"}], \"gilbertsmith-d-inbox-9.txt\": [], \"bass-e-inbox-19.txt\": [], \"gang-l-inbox-42.txt\": [], \"ermis-f-inbox-601.txt\": [], \"lay-k-inbox-1469.txt\": [], \"germany-c-inbox-241.txt\": [], \"beck-s-notes-inbox-166.txt\": [{\"sender\": \"shona.wilson@enron.com\", \"subject\": \"Re: Enron Raptor I P&L Reversal\"}], \"arnold-j-inbox-174.txt\": [], \"donoho-l-inbox-35.txt\": [], \"lay-k-inbox-264.txt\": [], \"buy-r-inbox-313.txt\": [], \"baughman-d-inbox-149.txt\": [], \"blair-l-inbox-96.txt\": [], \"arnold-j-inbox-95.txt\": [], \"bass-e-inbox-143.txt\": [], \"dorland-c-inbox-42.txt\": [], \"arnold-j-inbox-42.txt\": [], \"crandell-s-inbox-125.txt\": [], \"dean-c-inbox-960.txt\": [], \"farmer-d-inbox-53.txt\": [], \"delainey-d-inbox-17.txt\": [], \"giron-d-inbox-125.txt\": [], \"davis-d-inbox-wells-36.txt\": [], \"bass-e-inbox-9.txt\": [], \"gilbertsmith-d-inbox-96.txt\": [], \"corman-s-inbox-vacation_schedules-1.txt\": [], 
\"delainey-d-inbox-16.txt\": [], \"kaminski-v-inbox-291.txt\": [{\"sender\": \"baker@enron.com\", \"subject\": \"RE: Pricing of restriction on Enron stock\"}], \"cash-m-inbox-350.txt\": [], \"cuilla-m-inbox-27.txt\": [], \"farmer-d-inbox-46.txt\": [], \"donohoe-t-inbox-287.txt\": [], \"cash-m-inbox-185.txt\": [], \"lay-k-inbox-1722.txt\": [], \"benson-r-inbox-185.txt\": [], \"campbell-l-inbox-759.txt\": [], \"carson-m-inbox-59.txt\": [], \"forney-j-inbox-50.txt\": [], \"delainey-d-sent-318.txt\": [{\"sender\": \"david.delainey@enron.com\", \"subject\": \"Re: ENA Comp suggestions for Project Raptor\"}], \"buy-r-inbox-474.txt\": [], \"arnold-j-inbox-149.txt\": [], \"beck-s-sent-13.txt\": [{\"sender\": \"sally.beck@enron.com\", \"subject\": \"Re: Catalytica Write-down\"}], \"corman-s-inbox-9.txt\": [], \"bass-e-inbox-208.txt\": [], \"cash-m-inbox-50.txt\": [], \"geaccone-t-inbox-13.txt\": [], \"arora-h-inbox-41.txt\": [], \"ermis-f-inbox-166.txt\": [], \"forney-j-inbox-166.txt\": [], \"allen-p-inbox-9.txt\": [], \"blair-l-inbox-248.txt\": [], \"buy-r-inbox-1083.txt\": [], \"crandell-s-inbox-75.txt\": [], \"baughman-d-inbox-202.txt\": [], \"allen-p-inbox-42.txt\": [], \"delainey-d-sent-683.txt\": [{\"sender\": \"david.delainey@enron.com\", \"subject\": \"Re: Raptor\"}], \"geaccone-t-inbox-313.txt\": [], \"beck-s-inbox-601.txt\": [], \"gay-r-inbox-14.txt\": [], \"allen-p-inbox-83.txt\": [], \"gang-l-inbox-41.txt\": [], \"giron-d-inbox-287.txt\": [], \"donoho-l-inbox-36.txt\": [], \"campbell-l-inbox-149.txt\": [], \"dean-c-inbox-817.txt\": [], \"lay-k-inbox-313.txt\": [], \"arnold-j-inbox-96.txt\": [], \"germany-c-inbox-35.txt\": [], \"buy-r-inbox-264.txt\": [], \"dorland-c-inbox-41.txt\": [], \"causholli-m-inbox-3.txt\": [], \"blair-l-inbox-100.txt\": [], \"dorland-c-inbox-69.txt\": [], \"farmer-d-inbox-50.txt\": [], \"geaccone-t-inbox-503.txt\": [], \"campbell-l-inbox-565.txt\": [], \"davis-d-inbox-100.txt\": [], \"cuilla-m-inbox-19.txt\": [], 
\"dasovich-j-inbox-838.txt\": [], \"delainey-d-inbox-14.txt\": [], \"corman-s-inbox-vacation_schedules-3.txt\": [], \"gilbertsmith-d-inbox-95.txt\": [], \"blair-l-inbox-3.txt\": [], \"ermis-f-inbox-50.txt\": [], \"donohoe-t-inbox-509.txt\": [], \"corman-s-inbox-vacation_schedules-2.txt\": [], \"delainey-d-inbox-15.txt\": [], \"donoho-l-inbox-149.txt\": [], \"donohoe-t-inbox-36.txt\": [], \"cuilla-m-inbox-24.txt\": [], \"giron-d-sent-items-200.txt\": [], \"gilbertsmith-d-inbox-42.txt\": [], \"farmer-d-inbox-45.txt\": [], \"geaccone-t-inbox-264.txt\": [], \"causholli-m-inbox-125.txt\": [], \"buy-r-inbox-503.txt\": [], \"forney-j-inbox-53.txt\": [], \"griffith-j-inbox-29.txt\": [], \"fischer-m-inbox-48.txt\": [], \"lay-k-inbox-1655.txt\": [], \"carson-m-inbox-109.txt\": [], \"donohoe-t-inbox-125.txt\": [], \"baughman-d-inbox-3.txt\": [], \"arora-h-inbox-42.txt\": [], \"forney-j-inbox-165.txt\": [], \"kaminski-v-all-documents-2355.txt\": [{\"sender\": \"ron.baker@enron.com\", \"subject\": \"Raptor Position Reports for 12/28/00\"}], \"corman-s-inbox-48.txt\": [], \"allen-p-inbox-69.txt\": [], \"baughman-d-inbox-36.txt\": [], \"cash-m-inbox-227.txt\": [], \"benson-r-inbox-227.txt\": []}"
  },
  {
    "path": "testdata/target_matching.csv",
    "content": "target_attribute,li,cao,clark,dou,gilette,huang,krug,mcdermott,satpathy,vasaikar,wang\r\ncase_submitter_id,Patient_ID,case_id,case_id,Proteomics_Participant_ID,participant,case_id,Sample.ID,CPTAC Case ID,Participant,SampleID,case_id\r\nage_at_diagnosis,Age,age,age,age,age,age,Age.in.Months,\"Age in Months at Time of Tissue Procurement \t\",Age,Age,age\r\nrace,Race,race,race,race,ethnicity,missing,Ethnicity,Race,Ethnicity,missing,race\r\nethnicity,Ethnicity,missing,ethnicity_self_identify,ethnicity,missing,missing,missing,Ethnicity,missing,missing,ethnicity\r\ngender,Sex,sex,gender,gender,gender,gender,Gender,Gender,Gender,Gender,gender\r\nvital_status,\"Survival status (1, dead; 0, alive)\r\n\",vital_status,missing,missing,missing,missing,missing,missing,missing,Vital.Status,vital_status\r\najcc_pathologic_t,baseline/pathologic_staging_primary_tumor_pt,pathologic_staging_primary_tumor_pt,path_stage_primary_tumor_pt,Path_Stage_Primary_Tumor-pT,missing,patho_staging_pt,missing,missing,missing,PT,missing\r\najcc_pathologic_n,baseline/pathologic_staging_regional_lymph_nodes_pn,pathologic_staging_regional_lymph_nodes_pn,path_stage_reg_lymph_nodes_pN,Path_Stage_Reg_Lymph_Nodes-pN,missing,patho_staging_pn,missing,missing,missing,PN,missing\r\najcc_pathologic_stage,baseline/tumor_stage_pathological,tumor_stage_pathological,tumor_stage_pathological,tumor_Stage-Pathological,stage,tumor_stage,Tumor.Stage,Tumor Stage (Pathological) Ovary FIGO Staging System,Stage,Stage,missing\r\ntumor_grade,cptac_path/histologic_grade,missing,grade,Histologic_Grade_FIGO,missing,histologic_grade,missing,Tumor 
Grade,missing,missing,missing\r\ntumor_focality,baseline/tumor_focality,tumor_focality,tumor_focality,Tumor_Focality,missing,tumor_focality,missing,missing,missing,missing,missing\r\ntumor_largest_dimension_diameter,baseline/tumor_size_cm,tumor_size_cm,tumor_size_cm,Tumor_Size_cm,missing,tumor_size_cm,missing,missing,missing,missing,missing\r\nprimary_diagnosis,baseline/histologic_type,histology_diagnosis,histologic_type,histologic_type,Dominant.Histological.Subtype,histologic_type,missing,Histological Subtype,missing,paper,paper\r\nmorphology,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing\r\ntissue_or_organ_of_origin,baseline/tumor_site,paper,paper,Tumor_Site,paper,paper,paper,paper,paper,paper,paper\r\n"
  },
  {
    "path": "tests/pytest/README.md",
    "content": "## Testing with Pytest in Palimpzest\n- Tests live in `test_*.py` files.\n- Fixtures live in `conftest.py` and `fixtures/`; the modules under `fixtures/` are registered through `pytest_plugins` in `conftest.py`, so tests can request them by name.\n\n### Running tests\nRun the tests from within the `tests/pytest` directory:\n- `cd tests/pytest` first\n- `pytest` runs all tests\n- `pytest -k <expression>` runs the tests whose names match the expression\n- `pytest -m <marker>` runs the tests carrying a specific marker\n"
  },
  {
    "path": "tests/pytest/conftest.py",
    "content": "import pytest\n\nfrom palimpzest.policy import MaxQuality, MaxQualityAtFixedCost, MinCost, MinCostAtFixedQuality\n\npytest_plugins = [\n    \"fixtures.champion_outputs\",\n    \"fixtures.datasets\",\n    \"fixtures.execution_data\",\n    \"fixtures.expected_physical_plans\",\n    \"fixtures.expected_qualities\",\n    \"fixtures.expected_records\",\n    \"fixtures.models\",\n    \"fixtures.physical_plans\",\n    \"fixtures.operator_to_stats\",\n    \"fixtures.schemas\",\n    \"fixtures.side_effects\",\n    \"fixtures.workloads\",\n]\n\n# NOTE: these fixtures may grow to have long lists of arguments;\n#       the benefit of using fixtures here (which requires us to specify them\n#       as arguments) is that pytest will compute each fixture value once\n#       and cache the result. Thus, we minimize recomputation and don't\n#       need to, for example, re-register datasets for each individual test.\n@pytest.fixture\ndef dataset(request, enron_eval_tiny, real_estate_eval_tiny):\n    dataset_id = request.param\n    dataset_id_to_dataset = {\n        \"enron-eval-tiny\": enron_eval_tiny,\n        \"real-estate-eval-tiny\": real_estate_eval_tiny,\n    }\n    return dataset_id_to_dataset[dataset_id]\n\n\n@pytest.fixture\ndef workload(\n    request,\n    enron_workload,\n    real_estate_workload,\n    three_converts_workload,\n    one_filter_one_convert_workload,\n    two_converts_two_filters_workload,\n):\n    workload_id = request.param\n    workload_id_to_workload = {\n        \"enron-workload\": enron_workload,\n        \"real-estate-workload\": real_estate_workload,\n        \"three-converts\": three_converts_workload,\n        \"one-filter-one-convert\": one_filter_one_convert_workload,\n        \"two-converts-two-filters\": two_converts_two_filters_workload,\n    }\n    return workload_id_to_workload[workload_id]\n\n\n@pytest.fixture\ndef policy(request):\n    policy_id = request.param\n    policy_id_to_policy = {\n        \"mincost\": MinCost(),\n  
      \"maxquality\": MaxQuality(),\n        \"mincost@quality=0.8\": MinCostAtFixedQuality(0.8),\n        \"maxquality@cost=1.0\": MaxQualityAtFixedCost(1.0),\n    }\n    return policy_id_to_policy[policy_id]\n\n\n@pytest.fixture\ndef physical_plan(\n    request,\n    scan_only_plan,\n    non_llm_filter_plan,\n    llm_filter_plan,\n    bonded_llm_convert_plan,\n    rag_convert_plan,\n    image_convert_plan,\n    one_to_many_convert_plan,\n):\n    physical_plan_id = request.param\n    physical_plan_id_to_physical_plan = {\n        \"scan-only\": scan_only_plan,\n        \"non-llm-filter\": non_llm_filter_plan,\n        \"llm-filter\": llm_filter_plan,\n        \"bonded-llm-convert\": bonded_llm_convert_plan,\n        \"rag-convert\": rag_convert_plan,\n        \"image-convert\": image_convert_plan,\n        \"one-to-many-convert\": one_to_many_convert_plan,\n    }\n    return physical_plan_id_to_physical_plan[physical_plan_id]\n\n\n@pytest.fixture\ndef sentinel_plan(\n    request,\n    scan_convert_filter_sentinel_plan,\n    scan_multi_convert_multi_filter_sentinel_plan\n):\n    sentinel_plan_id = request.param\n    sentinel_plan_id_to_sentinel_plan = {\n        \"scf\": scan_convert_filter_sentinel_plan,\n        \"scffc\": scan_multi_convert_multi_filter_sentinel_plan,\n    }\n    return sentinel_plan_id_to_sentinel_plan[sentinel_plan_id]\n\n@pytest.fixture\ndef execution_data(\n    request,\n    scan_convert_filter_execution_data,\n    scan_convert_filter_varied_execution_data,\n    scan_multi_convert_multi_filter_execution_data,\n):\n    execution_data_id = request.param\n    execution_data_id_to_execution_data = {\n        \"scf\": scan_convert_filter_execution_data,\n        \"scf-varied\": scan_convert_filter_varied_execution_data,\n        \"scffc\": scan_multi_convert_multi_filter_execution_data,\n    }\n    return execution_data_id_to_execution_data[execution_data_id]\n\n@pytest.fixture\ndef expected_records(\n    request,\n    
enron_all_expected_records,\n    enron_filter_expected_records,\n    real_estate_all_expected_records,\n    real_estate_one_to_many_expected_records,\n    scan_convert_filter_expected_outputs,\n    scan_convert_filter_empty_expected_outputs,\n    scan_convert_filter_varied_expected_outputs,\n    scan_multi_convert_multi_filter_expected_outputs,\n):\n    records_id = request.param\n    records_id_to_expected_records = {\n        \"enron-all-records\": enron_all_expected_records,\n        \"enron-filtered-records\": enron_filter_expected_records,\n        \"real-estate-all-records\": real_estate_all_expected_records,\n        \"real-estate-one-to-many-records\": real_estate_one_to_many_expected_records,\n        \"scf\": scan_convert_filter_expected_outputs,\n        \"empty\": scan_convert_filter_empty_expected_outputs,\n        \"scf-varied\": scan_convert_filter_varied_expected_outputs,\n        \"scffc\": scan_multi_convert_multi_filter_expected_outputs,\n    }\n    return records_id_to_expected_records[records_id]\n\n\n@pytest.fixture\ndef champion_outputs(\n    request,\n    scan_convert_filter_champion_outputs,\n    scan_convert_filter_empty_champion_outputs,\n    scan_convert_filter_varied_champion_outputs,\n    scan_multi_convert_multi_filter_champion_outputs\n):\n    champion_outputs_id = request.param\n    champion_outputs_id_to_champion_outputs = {\n        \"scf\": scan_convert_filter_champion_outputs,\n        \"empty\": scan_convert_filter_empty_champion_outputs,\n        \"scf-varied\": scan_convert_filter_varied_champion_outputs,\n        \"scffc\": scan_multi_convert_multi_filter_champion_outputs,\n    }\n    return champion_outputs_id_to_champion_outputs[champion_outputs_id]\n\n\n@pytest.fixture\ndef expected_qualities(\n    request,\n    scan_convert_filter_qualities,\n    scan_convert_filter_empty_qualities,\n    scan_convert_filter_varied_qualities,\n    scan_convert_filter_varied_override_qualities,\n    
scan_multi_convert_multi_filter_qualities,\n):\n    expected_qualities_id = request.param\n    expected_qualities_id_to_expected_qualities = {\n        \"scf\": scan_convert_filter_qualities,\n        \"empty\": scan_convert_filter_empty_qualities,\n        \"scf-varied\": scan_convert_filter_varied_qualities,\n        \"scf-varied-override\": scan_convert_filter_varied_override_qualities,\n        \"scffc\": scan_multi_convert_multi_filter_qualities,\n    }\n    return expected_qualities_id_to_expected_qualities[expected_qualities_id]\n\n\n@pytest.fixture\ndef side_effect(\n    request,\n    enron_filter,\n    enron_convert,\n    real_estate_convert,\n    real_estate_one_to_many_convert,\n):\n    side_effect_id = request.param\n    side_effect_id_to_side_effect = {\n        None: None,\n        \"enron-filter\": enron_filter,\n        \"enron-convert\": enron_convert,\n        \"real-estate-convert\": real_estate_convert,\n        \"real-estate-one-to-many-convert\": real_estate_one_to_many_convert,\n    }\n    return side_effect_id_to_side_effect[side_effect_id]\n\n\n@pytest.fixture\ndef operator_to_stats(\n    request,\n    three_converts_min_cost_operator_to_stats,\n    three_converts_max_quality_operator_to_stats,\n    three_converts_min_cost_at_fixed_quality_operator_to_stats,\n    three_converts_max_quality_at_fixed_cost_operator_to_stats,\n    one_filter_one_convert_min_cost_operator_to_stats,\n    two_converts_two_filters_min_cost_operator_to_stats,\n    two_converts_two_filters_max_quality_operator_to_stats,\n    two_converts_two_filters_min_cost_at_fixed_quality_operator_to_stats,\n    two_converts_two_filters_max_quality_at_fixed_cost_operator_to_stats,\n):\n    operator_to_stats_id = request.param\n    operator_to_stats_id_to_operator_to_stats = {\n        \"3c-mincost\": three_converts_min_cost_operator_to_stats,\n        \"3c-maxquality\": three_converts_max_quality_operator_to_stats,\n        \"3c-mincost@quality=0.8\": 
three_converts_min_cost_at_fixed_quality_operator_to_stats,\n        \"3c-maxquality@cost=1.0\": three_converts_max_quality_at_fixed_cost_operator_to_stats,\n        \"1f-1c-mincost\": one_filter_one_convert_min_cost_operator_to_stats,\n        \"2c-2f-mincost\": two_converts_two_filters_min_cost_operator_to_stats,\n        \"2c-2f-maxquality\": two_converts_two_filters_max_quality_operator_to_stats,\n        \"2c-2f-mincost@quality=0.8\": two_converts_two_filters_min_cost_at_fixed_quality_operator_to_stats,\n        \"2c-2f-maxquality@cost=1.0\": two_converts_two_filters_max_quality_at_fixed_cost_operator_to_stats,\n    }\n\n    return operator_to_stats_id_to_operator_to_stats[operator_to_stats_id]\n\n\n@pytest.fixture\ndef expected_plan(\n    request,\n    three_converts_min_cost_expected_plan,\n    three_converts_max_quality_expected_plan,\n    three_converts_min_cost_at_fixed_quality_expected_plan,\n    three_converts_max_quality_at_fixed_cost_expected_plan,\n    one_filter_one_convert_min_cost_expected_plan,\n    two_converts_two_filters_min_cost_expected_plan,\n    two_converts_two_filters_max_quality_expected_plan,\n    two_converts_two_filters_min_cost_at_fixed_quality_expected_plan,\n    two_converts_two_filters_max_quality_at_fixed_cost_expected_plan,\n):\n    expected_plan_id = request.param\n    expected_plan_id_to_expected_plan = {\n        \"3c-mincost\": three_converts_min_cost_expected_plan,\n        \"3c-maxquality\": three_converts_max_quality_expected_plan,\n        \"3c-mincost@quality=0.8\": three_converts_min_cost_at_fixed_quality_expected_plan,\n        \"3c-maxquality@cost=1.0\": three_converts_max_quality_at_fixed_cost_expected_plan,\n        \"1f-1c-mincost\": one_filter_one_convert_min_cost_expected_plan,\n        \"2c-2f-mincost\": two_converts_two_filters_min_cost_expected_plan,\n        \"2c-2f-maxquality\": two_converts_two_filters_max_quality_expected_plan,\n        \"2c-2f-mincost@quality=0.8\": 
two_converts_two_filters_min_cost_at_fixed_quality_expected_plan,\n        \"2c-2f-maxquality@cost=1.0\": two_converts_two_filters_max_quality_at_fixed_cost_expected_plan,\n    }\n\n    return expected_plan_id_to_expected_plan[expected_plan_id]"
  },
  {
    "path": "tests/pytest/data/email_schema.json",
    "content": "{\n    \"name\": \"Email\",\n    \"type\": \"TextFile\",\n    \"description\": \"Represents an email, which in practice is usually from a text file\",\n    \"fields\": [\n        {\n            \"name\": \"sender\",\n            \"description\": \"The email address of the sender\",\n            \"required\": true\n        },\n        {\n            \"name\": \"subject\",\n            \"description\": \"The subject of the email\",\n            \"required\": true\n        }\n    ]\n}"
  },
  {
    "path": "tests/pytest/data/email_schema.yml",
    "content": "schema:\n  name: Email\n  type: TextFile\n  description: Represents an email, which in practice is usually from a text file\n  fields:\n    - name: sender\n      description: The email address of the sender\n    - name: subject\n      description: The subject of the email\n\n"
  },
  {
    "path": "tests/pytest/data/synapse_schema.csv",
    "content": "resourceType,dataType,dataSubtype,individualID,specimenID,cellType,assay,isCellLine,diagnosis,organ,platform,sex,species,tissue,consortium,fileFormat,fundingAgency,isStranded,libraryPrep,readLength,readPairOrientation,readStrandOrigin,runType,nf1Genotype,nf2Genotype,isMultiIndividual,isMultiSpecimen,isPrimaryCell,detailed_diagnosis,libraryPreparationMethod,readPair,nucleicAcidSource,bodyPart,dissociationMethod,individualIdSource,specimenIdSource,initiative,tumorType,pain,description\nexperimentalData,genomicVariants,raw,swn_patient_X,swn_patient_X_tumor_0,,whole genome sequencing,,Schwannomatosis,nerves,Illumina HiSeq X,Male,Homo sapiens,primary tumor,,fastq,CTF,,,123,,,pairedEnd,unknown,unknown,,,FALSE,Familial Schwannomatosis,TruSeq,,bulk cell,,none,Source 1,Source 1,Initiative 1,\"[\"\"Schwannoma\"\"]\",,"
  },
  {
    "path": "tests/pytest/data/synapse_schema.jsonld",
    "content": "{\n    \"@context\": {\n        \"bts\": \"http://schema.biothings.io/\",\n        \"rdf\": \"http://www.w3.org/1999/02/22-rdf-syntax-ns#\",\n        \"rdfs\": \"http://www.w3.org/2000/01/rdf-schema#\",\n        \"schema\": \"http://schema.org/\",\n        \"xsd\": \"http://www.w3.org/2001/XMLSchema#\"\n    },\n    \"@graph\": [\n        {\n            \"@id\": \"bts:BodyPartEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BodyPartEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BodyPartEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UnknownEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UnknownEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"UnknownEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SpecimenPreparationMethodEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SpecimenPreparationMethodEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"SpecimenPreparationMethodEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DataSubtypeEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DataSubtypeEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DataSubtypeEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FileFormatEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FileFormatEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FileFormatEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MultipleImagingDiagnosisEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Enumerations describing the manifestation of tumors confirmed via imaging with one or multiple sites being the most relevant result.\",\n            \"rdfs:label\": \"MultipleImagingDiagnosisEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MultipleImagingDiagnosisEnum\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SexEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SexEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SexEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MRISequenceEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MRISequenceEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MRISequenceEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AgeGroupEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AgeGroupEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"AgeGroupEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TumorTreatmentEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TumorTreatmentEnum\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"TumorTreatmentEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PlatformEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PlatformEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PlatformEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NF1Variant\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NF1Variant\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NF1Variant\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GenePerturbationEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GenePerturbationEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"GenePerturbationEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TestSummaryEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TestSummaryEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"TestSummaryEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SpinalNeuromaManifestationEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Enumerations describing the manifestation of spinal neurofibromas, combining information on presence and the cervical locations.\",\n            \"rdfs:label\": \"SpinalNeuromaManifestationEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SpinalNeuromaManifestationEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WorkflowEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"WorkflowEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"WorkflowEnum\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BooleanEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Boolean data as Yes/No enums\",\n            \"rdfs:label\": \"BooleanEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BooleanEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PubertyOnsetEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PubertyOnsetEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PubertyOnsetEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GenePerturbationMethodEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GenePerturbationMethodEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GenePerturbationMethodEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Data\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"What 
the data (file) contains.\",\n            \"rdfs:label\": \"Data\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Data\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PlexiformNeurofibromaManifestationEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Enumerations describing the manifestation of plexiform neurofibromas, combining information on presence and the certainty via MRI imaging technique.\",\n            \"rdfs:label\": \"PlexiformNeurofibromaManifestationEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PlexiformNeurofibromaManifestationEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BinaryImagingDiagnosisEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Enumerations describing the manifestation of tumors confirmed via imaging with mainly absent or present being the most relevant result.\",\n            \"rdfs:label\": \"BinaryImagingDiagnosisEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BinaryImagingDiagnosisEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        
},\n        {\n            \"@id\": \"bts:SpeciesEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SpeciesEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SpeciesEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DiagnosisEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DiagnosisEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DiagnosisEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProteinExtractSourceEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ProteinExtractSourceEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ProteinExtractSourceEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VitalStatusEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"VitalStatusEnum\",\n            \"rdfs:subClassOf\": [\n        
        {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"VitalStatusEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MosaicismEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MosaicismEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MosaicismEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WorkingDistanceUnitEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"WorkingDistanceUnitEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"WorkingDistanceUnitEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StrandednessEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StrandednessEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"StrandednessEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OrganismSubstance\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"This preferred root in the UBERON ontology is meant to cover organism-produced substances (bodily secretions and excreta) commonly used as assay specimens.\",\n            \"rdfs:label\": \"OrganismSubstance\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"OrganismSubstance\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BiologicalProcess\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BiologicalProcess\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BiologicalProcess\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ExpressionUnitEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Quantification units for gene expression.\",\n            \"rdfs:label\": \"ExpressionUnitEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"ExpressionUnitEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OrganEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OrganEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"OrganEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tumor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tumor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"tumor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WHOPerformanceStatusScores\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"//www.ncbi.nlm.nih.gov/books/NBK97482/\",\n            \"rdfs:label\": \"WHOPerformanceStatusScores\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"WHOPerformanceStatusScores\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ConcentrationUnit\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ConcentrationUnit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ConcentrationUnit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NucleicAcidSourceEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NucleicAcidSourceEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NucleicAcidSourceEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ReadPairOrientationEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ReadPairOrientationEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ReadPairOrientationEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LateralManifestationEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Enumerations describing the manifestation of tumors confirmed via imaging (MRI or other), where lateral information is most relevant.\",\n   
         \"rdfs:label\": \"LateralManifestationEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"LateralManifestationEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LibraryPrepEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LibraryPrepEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"LibraryPrepEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Percentiles-2SD\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Enumerations corresponding to 3 bins using 2 standard deviations as a cutoff.\",\n            \"rdfs:label\": \"Percentiles-2SD\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Percentiles-2SD\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tissue\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Tissue is a group of cells that have similar structure and that function together as a unit.\",\n            \"rdfs:label\": \"Tissue\",\n            \"rdfs:subClassOf\": 
[\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Tissue\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LibraryPreparationMethodEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LibraryPreparationMethodEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"LibraryPreparationMethodEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StudyStatusEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StudyStatusEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"StudyStatusEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Metadata\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Metadata\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"metadata\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DiseaseFocusEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DiseaseFocusEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DiseaseFocusEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ChannelEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ChannelEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ChannelEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FundingAgencyEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FundingAgencyEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FundingAgencyEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n    
    {\n            \"@id\": \"bts:Cell\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cell\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cell\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GenomicReferenceEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GenomicReferenceEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GenomicReferenceEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Genotype\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Genotype\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Genotype\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DrugScreen\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DrugScreen\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n        
        }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"drugScreen\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DissociationMethodEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DissociationMethodEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DissociationMethodEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TimeUnit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TimeUnit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"TimeUnit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Measurement\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Measurement\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Measurement\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Material\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Material\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Material\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ManifestationEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ManifestationEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ManifestationEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SchwannomaManifestationEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Enumerations describing the manifestation of schwannomas.\",\n            \"rdfs:label\": \"SchwannomaManifestationEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SchwannomaManifestationEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OpticGliomaManifestationEnum\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"Enumerations describing the manifestation of optic gliomas, combining information on presence, symptoms, and treatment.\",\n            \"rdfs:label\": \"OpticGliomaManifestationEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"OpticGliomaManifestationEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TransplantationType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Type of transplantation involved in the experiment, derived from MESH\",\n            \"rdfs:label\": \"TransplantationType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Allograft\"\n                },\n                {\n                    \"@id\": \"bts:Xenograft\"\n                },\n                {\n                    \"@id\": \"bts:Autograft\"\n                },\n                {\n                    \"@id\": \"bts:Isograft\"\n                }\n            ],\n            \"sms:displayName\": \"transplantationType\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RunTypeEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RunTypeEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n   
         \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"RunTypeEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NotApplicableEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NotApplicableEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NotApplicableEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CellLineModel\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CellLineModel\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CellLineModel\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Resource\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Resource\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Resource\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        
{\n            \"@id\": \"bts:MouseModel\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MouseModel\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MouseModel\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AssayEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AssayEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"AssayEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CuratedDataEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CuratedDataEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CuratedDataEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NonvestibularSchwannomaManifestationEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NonvestibularSchwannomaManifestationEnum\",\n            \"rdfs:subClassOf\": [\n            
    {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NonvestibularSchwannomaManifestationEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DataStatusEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DataStatusEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DataStatusEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Institution\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Link to affiliated institution.\",\n            \"rdfs:label\": \"Institution\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:AlbanyMedicalCollege\"\n                },\n                {\n                    \"@id\": \"bts:AlbertEinsteinCollegeofMedicine\"\n                },\n                {\n                    \"@id\": \"bts:AlliantInternationalUniversity\"\n                },\n                {\n                    \"@id\": \"bts:AmericanUniversity\"\n                },\n                {\n                    \"@id\": \"bts:ArizonaStateUniversity\"\n                },\n                {\n                    
\"@id\": \"bts:AuburnUniversity,Auburn\"\n                },\n                {\n                    \"@id\": \"bts:AugustaUniversity\"\n                },\n                {\n                    \"@id\": \"bts:BaylorCollegeofMedicine\"\n                },\n                {\n                    \"@id\": \"bts:BaylorUniversity\"\n                },\n                {\n                    \"@id\": \"bts:BoiseStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:BostonCollege\"\n                },\n                {\n                    \"@id\": \"bts:BostonUniversity\"\n                },\n                {\n                    \"@id\": \"bts:BowlingGreenStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:BrandeisUniversity\"\n                },\n                {\n                    \"@id\": \"bts:BrighamYoungUniversity,Provo\"\n                },\n                {\n                    \"@id\": \"bts:BrownUniversity\"\n                },\n                {\n                    \"@id\": \"bts:CUNY,CityCollege\"\n                },\n                {\n                    \"@id\": \"bts:CUNY,HunterCollege\"\n                },\n                {\n                    \"@id\": \"bts:CUNY,JohnJayCollegeofCriminalJustice\"\n                },\n                {\n                    \"@id\": \"bts:CUNY,QueensCollege\"\n                },\n                {\n                    \"@id\": \"bts:CaliforniaInstituteofTechnology\"\n                },\n                {\n                    \"@id\": \"bts:CaliforniaPolytechnicStateUniversity,SanLuisObispo\"\n                },\n                {\n                    \"@id\": \"bts:CaliforniaStateUniversity,LongBeach\"\n                },\n                {\n                    \"@id\": \"bts:CaliforniaStateUniversity,Northridge\"\n                },\n                {\n                    \"@id\": \"bts:CaliforniaStateUniversity,Sacramento\"\n       
         },\n                {\n                    \"@id\": \"bts:CarnegieMellonUniversity\"\n                },\n                {\n                    \"@id\": \"bts:CaseWesternReserveUniversity\"\n                },\n                {\n                    \"@id\": \"bts:CatholicUniversityofAmerica\"\n                },\n                {\n                    \"@id\": \"bts:CentralMichiganUniversity\"\n                },\n                {\n                    \"@id\": \"bts:ChapmanUniversity\"\n                },\n                {\n                    \"@id\": \"bts:Children'sHospitalofPhiladelphia\"\n                },\n                {\n                    \"@id\": \"bts:Children'sNational\"\n                },\n                {\n                    \"@id\": \"bts:CincinnatiChildren'sHospitalMedicalCenter\"\n                },\n                {\n                    \"@id\": \"bts:CityofHope\"\n                },\n                {\n                    \"@id\": \"bts:ClarksonUniversity\"\n                },\n                {\n                    \"@id\": \"bts:ClemsonUniversity\"\n                },\n                {\n                    \"@id\": \"bts:ClevelandStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:ColdSpringHarborLaboratory\"\n                },\n                {\n                    \"@id\": \"bts:ColoradoStateUniversity,FortCollins\"\n                },\n                {\n                    \"@id\": \"bts:ColumbiaU.intheCityofNewYork\"\n                },\n                {\n                    \"@id\": \"bts:CornellUniversity\"\n                },\n                {\n                    \"@id\": \"bts:CreightonUniversity\"\n                },\n                {\n                    \"@id\": \"bts:Dana-FarberCancerInstitute\"\n                },\n                {\n                    \"@id\": \"bts:DartmouthCollege\"\n                },\n                {\n                    \"@id\": 
\"bts:DelawareStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:DrexelUniversity\"\n                },\n                {\n                    \"@id\": \"bts:DukeUniversity\"\n                },\n                {\n                    \"@id\": \"bts:DuquesneUniversity\"\n                },\n                {\n                    \"@id\": \"bts:EastCarolinaUniversity\"\n                },\n                {\n                    \"@id\": \"bts:EastTennesseeStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:EasternVirginiaMedicalSchool\"\n                },\n                {\n                    \"@id\": \"bts:EmoryUniversity\"\n                },\n                {\n                    \"@id\": \"bts:EutropicsPharmaceuticals\"\n                },\n                {\n                    \"@id\": \"bts:FloridaA&MUniversity\"\n                },\n                {\n                    \"@id\": \"bts:FloridaAtlanticUniversity\"\n                },\n                {\n                    \"@id\": \"bts:FloridaInstituteofTechnology\"\n                },\n                {\n                    \"@id\": \"bts:FloridaInternationalUniversity\"\n                },\n                {\n                    \"@id\": \"bts:FloridaStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:FordhamUniversity\"\n                },\n                {\n                    \"@id\": \"bts:GeorgeMasonUniversity\"\n                },\n                {\n                    \"@id\": \"bts:GeorgeWashingtonUniversity\"\n                },\n                {\n                    \"@id\": \"bts:GeorgetownUniversity\"\n                },\n                {\n                    \"@id\": \"bts:GeorgiaInstituteofTechnology\"\n                },\n                {\n                    \"@id\": \"bts:GeorgiaStateUniversity\"\n                },\n                {\n                    
\"@id\": \"bts:HarvardUniversity\"\n                },\n                {\n                    \"@id\": \"bts:HenriMondorHospitalParisEstCreteilFrance\"\n                },\n                {\n                    \"@id\": \"bts:HowardUniversity\"\n                },\n                {\n                    \"@id\": \"bts:HumboldtStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:IcahnSchoolofMedicineatMountSinai\"\n                },\n                {\n                    \"@id\": \"bts:IdahoStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:IllinoisInstituteofTechnology\"\n                },\n                {\n                    \"@id\": \"bts:IllinoisStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:IndianaUniversity\"\n                },\n                {\n                    \"@id\": \"bts:IndianaUniversity,Bloomington\"\n                },\n                {\n                    \"@id\": \"bts:IndianaUniversity-PurdueUniversityatIndianapolis\"\n                },\n                {\n                    \"@id\": \"bts:Institutd'InvestigacióBiomédicadeBellvitge\"\n                },\n                {\n                    \"@id\": \"bts:Institutd'InvestigacióenCiènciesdelaSalutGermansTriasiPujol\"\n                },\n                {\n                    \"@id\": \"bts:Inserm\"\n                },\n                {\n                    \"@id\": \"bts:IMBA-InstituteofMolecularBiotechnology\"\n                },\n                {\n                    \"@id\": \"bts:IowaStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:JAX\"\n                },\n                {\n                    \"@id\": \"bts:JacksonStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:JohnsHopkinsUniversity\"\n                },\n                {\n                    \"@id\": 
\"bts:KansasStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:KentStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:LangstonUniversity\"\n                },\n                {\n                    \"@id\": \"bts:LehighUniversity\"\n                },\n                {\n                    \"@id\": \"bts:LeibnizInstituteonAging–FritzLipmannInstitute\"\n                },\n                {\n                    \"@id\": \"bts:LomaLindaUniversity\"\n                },\n                {\n                    \"@id\": \"bts:LongIslandUniversity\"\n                },\n                {\n                    \"@id\": \"bts:LouisianaStateUniversity,BatonRouge\"\n                },\n                {\n                    \"@id\": \"bts:LouisianaStateUniversity,HealthSciencesCenter,NewOrleans\"\n                },\n                {\n                    \"@id\": \"bts:LouisianaStateUniversity,HealthSciencesCenter,Shreveport\"\n                },\n                {\n                    \"@id\": \"bts:LouisianaTechUniversity\"\n                },\n                {\n                    \"@id\": \"bts:LoyolaUniversity,Chicago\"\n                },\n                {\n                    \"@id\": \"bts:L’InsermdansParisetl’Île-de-FranceCentreNord\"\n                },\n                {\n                    \"@id\": \"bts:MarquetteUniversity\"\n                },\n                {\n                    \"@id\": \"bts:MarshallUniversity\"\n                },\n                {\n                    \"@id\": \"bts:MassachusettsGeneralHospital\"\n                },\n                {\n                    \"@id\": \"bts:MassachusettsInstituteofTechnology\"\n                },\n                {\n                    \"@id\": \"bts:MayoClinic\"\n                },\n                {\n                    \"@id\": \"bts:MayoClinicinArizona\"\n                },\n                {\n                    
\"@id\": \"bts:MedicalCollegeofWisconsin\"\n                },\n                {\n                    \"@id\": \"bts:MedicalUniversityofSouthCarolina\"\n                },\n                {\n                    \"@id\": \"bts:MemorialSloanKetteringCancerCenter\"\n                },\n                {\n                    \"@id\": \"bts:MercerUniversity\"\n                },\n                {\n                    \"@id\": \"bts:MiamiUniversity\"\n                },\n                {\n                    \"@id\": \"bts:MichiganStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:MichiganTechnologicalUniversity\"\n                },\n                {\n                    \"@id\": \"bts:MississippiStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:MissouriUniversityofScienceandTechnology\"\n                },\n                {\n                    \"@id\": \"bts:MontanaStateUniversity,Bozeman\"\n                },\n                {\n                    \"@id\": \"bts:MontclairStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:MorehouseSchoolofMedicine\"\n                },\n                {\n                    \"@id\": \"bts:MorganStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:NCICenterforCancerResearch\"\n                },\n                {\n                    \"@id\": \"bts:NationalInstitutesofHealth\"\n                },\n                {\n                    \"@id\": \"bts:NewJerseyInstituteofTechnology\"\n                },\n                {\n                    \"@id\": \"bts:NewMexicoStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:NewYorkMedicalCollege\"\n                },\n                {\n                    \"@id\": \"bts:NewYorkUniversity\"\n                },\n                {\n                    \"@id\": 
\"bts:NorthCarolinaCentralUniversity\"\n                },\n                {\n                    \"@id\": \"bts:NorthCarolinaStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:NorthDakotaStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:NortheastOhioMedicalUniversity\"\n                },\n                {\n                    \"@id\": \"bts:NortheasternUniversity\"\n                },\n                {\n                    \"@id\": \"bts:NorthernArizonaUniversity\"\n                },\n                {\n                    \"@id\": \"bts:NorthernIllinoisUniversity\"\n                },\n                {\n                    \"@id\": \"bts:NorthwesternUniversity\"\n                },\n                {\n                    \"@id\": \"bts:NovaSoutheasternUniversity\"\n                },\n                {\n                    \"@id\": \"bts:OhioStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:OhioUniversity\"\n                },\n                {\n                    \"@id\": \"bts:OklahomaStateUniversity,Stillwater\"\n                },\n                {\n                    \"@id\": \"bts:OregonHealthandScienceUniversity\"\n                },\n                {\n                    \"@id\": \"bts:OregonStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:PacificNorthwestNationalLaboratory\"\n                },\n                {\n                    \"@id\": \"bts:PenningtonBiomedicalResearchCenter\"\n                },\n                {\n                    \"@id\": \"bts:PennsylvaniaStateUniversity,UniversityParkandHersheyMedicalCenter\"\n                },\n                {\n                    \"@id\": \"bts:PortlandStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:PrairieViewA&MUniversity\"\n                },\n                {\n            
        \"@id\": \"bts:PrincetonUniversity\"\n                },\n                {\n                    \"@id\": \"bts:PurdueUniversity,WestLafayette\"\n                },\n                {\n                    \"@id\": \"bts:PusanNationalUniversity\"\n                },\n                {\n                    \"@id\": \"bts:RensselaerPolytechnicInstitute\"\n                },\n                {\n                    \"@id\": \"bts:RiceUniversity\"\n                },\n                {\n                    \"@id\": \"bts:RochesterInstituteofTechnology\"\n                },\n                {\n                    \"@id\": \"bts:RockefellerUniversity\"\n                },\n                {\n                    \"@id\": \"bts:RosalindFranklinUniversityofMedicineandScience\"\n                },\n                {\n                    \"@id\": \"bts:RoyalNorthShoreHospital\"\n                },\n                {\n                    \"@id\": \"bts:RushUniversity\"\n                },\n                {\n                    \"@id\": \"bts:RutgersStateUniversityofNewJersey,NewBrunswick\"\n                },\n                {\n                    \"@id\": \"bts:RutgersStateUniversityofNewJersey,Newark\"\n                },\n                {\n                    \"@id\": \"bts:SUNY,BinghamtonUniversity\"\n                },\n                {\n                    \"@id\": \"bts:SUNY,DownstateHealthSciencesUniversity\"\n                },\n                {\n                    \"@id\": \"bts:SageBionetworks\"\n                },\n                {\n                    \"@id\": \"bts:SaintLouisUniversity\"\n                },\n                {\n                    \"@id\": \"bts:SanDiegoStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:SanFranciscoStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:SanJoseStateUniversity\"\n                },\n                {\n                    
\"@id\": \"bts:ScrippsResearchInstitute\"\n                },\n                {\n                    \"@id\": \"bts:SeattleChildren's\"\n                },\n                {\n                    \"@id\": \"bts:SouthDakotaStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:SouthernIllinoisUniversity,Carbondale\"\n                },\n                {\n                    \"@id\": \"bts:SouthernIllinoisUniversity,Edwardsville\"\n                },\n                {\n                    \"@id\": \"bts:SouthernMethodistUniversity\"\n                },\n                {\n                    \"@id\": \"bts:StanfordUniversity\"\n                },\n                {\n                    \"@id\": \"bts:StateUniversityofNewYorkPolytechnicInstitute\"\n                },\n                {\n                    \"@id\": \"bts:StateUniversityofNewYork,Albany\"\n                },\n                {\n                    \"@id\": \"bts:StateUniversityofNewYork,Buffalo\"\n                },\n                {\n                    \"@id\": \"bts:StateUniversityofNewYork,UpstateMedicalUniversity\"\n                },\n                {\n                    \"@id\": \"bts:StevensInstituteofTechnology\"\n                },\n                {\n                    \"@id\": \"bts:StonyBrookUniversity\"\n                },\n                {\n                    \"@id\": \"bts:SyracuseUniversity\"\n                },\n                {\n                    \"@id\": \"bts:TempleUniversity\"\n                },\n                {\n                    \"@id\": \"bts:TennesseeStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:TennesseeTechnologicalUniversity\"\n                },\n                {\n                    \"@id\": \"bts:TexasA&MUniversity\"\n                },\n                {\n                    \"@id\": \"bts:TexasChristianUniversity\"\n                },\n                {\n              
      \"@id\": \"bts:TexasStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:TexasTechUniversity\"\n                },\n                {\n                    \"@id\": \"bts:TexasTechUniversityofHealthSciencesCenter\"\n                },\n                {\n                    \"@id\": \"bts:ThomasJeffersonUniversity\"\n                },\n                {\n                    \"@id\": \"bts:TuftsUniversity\"\n                },\n                {\n                    \"@id\": \"bts:TulaneUniversity\"\n                },\n                {\n                    \"@id\": \"bts:TuskegeeUniversity\"\n                },\n                {\n                    \"@id\": \"bts:UniformedServicesUniversityoftheHealthSciences\"\n                },\n                {\n                    \"@id\": \"bts:UniversityCollegeofLondon\"\n                },\n                {\n                    \"@id\": \"bts:UniversityHealthNetwork\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofAkron\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofAlabama\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofAlabama,Birmingham\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofAlabama,Huntsville\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofAlabama,Tuscaloosa\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofAlaska,Anchorage\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofAlaska,Fairbanks\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofArizona\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofArkansasforMedicalSciences\"\n                },\n                {\n                    \"@id\": 
\"bts:UniversityofArkansas,Fayetteville\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofBaltimore\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCalifornia,Berkeley\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCalifornia,Davis\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCalifornia,Irvine\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCalifornia,LosAngeles\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCalifornia,Merced\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCalifornia,Riverside\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCalifornia,SanDiego\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCalifornia,SanFrancisco\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCalifornia,SantaBarbara\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCalifornia,SantaCruz\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCambridge\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCentralFlorida\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofChicago\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofCincinnati\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofColoradoBoulder\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofColoradoDenverandAnschutzMedicalCampus\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofConnecticut\"\n                },\n         
       {\n                    \"@id\": \"bts:UniversityofDayton\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofDelaware\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofDenver\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofFlorida\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofGeorgia\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofGlasgow\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofHawaii,Manoa\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofHouston\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofIdaho\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofIllinois,Chicago\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofIllinois,Urbana-Champaign\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofIowa\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofKansas\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofKentucky\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofLouisianaatLafayette\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofLouisville\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMaine\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMaryland,BaltimoreCounty\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMaryland,EasternShore\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMassachusetts,Amherst\"\n    
            },\n                {\n                    \"@id\": \"bts:UniversityofMassachusetts,Boston\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMassachusetts,Dartmouth\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMassachusetts,Lowell\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMassachusetts,MedicalSchool\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMemphis\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMiami\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMichigan,AnnArbor\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMinnesota,Duluth\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMinnesota,TwinCities\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMississippi\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMissouri,Columbia\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMissouri,KansasCity\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMissouri,SaintLouis\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofMontana,Missoula\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNebraska,Lincoln\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNebraska,MedicalCenter\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNebraska,Omaha\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNevada,LasVegas\"\n                },\n                {\n                    \"@id\": 
\"bts:UniversityofNevada,Reno\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNewHampshire\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNewMexico\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNewOrleans\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNorthCarolina,ChapelHill\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNorthCarolina,Charlotte\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNorthCarolina,Greensboro\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNorthCarolina,Wilmington\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNorthDakota\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNorthTexas,Denton\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNorthTexas,HealthScienceCenter\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofNotreDame\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofOklahoma\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofOregon\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofPennsylvania\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofPittsburgh,Pittsburgh\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofPlymouth\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofPuertoRico,Mayaguez\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofPuertoRico,MedicalSciencesCampus\"\n                },\n                {\n         
           \"@id\": \"bts:UniversityofPuertoRico,RioPiedras\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofRhodeIsland\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofRochester\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofSouthAlabama\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofSouthCarolina,Columbia\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofSouthDakota\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofSouthFlorida,SaintPetersburg\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofSouthFlorida,Tampa\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofSouthernCalifornia\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofSouthernMississippi\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTennessee,HealthScienceCenter\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTennessee,Knoxville\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTexasHealthScienceCenter,Houston\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTexasHealthScienceCenter,SanAntonio\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTexasHealthScienceCenter,Tyler\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTexasM.D.AndersonCancerCenter\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTexasMedicalBranchatGalveston\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTexasRioGrandeValley\"\n                },\n                
{\n                    \"@id\": \"bts:UniversityofTexasSouthwesternMedicalCenter\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTexas,Arlington\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTexas,Austin\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTexas,Dallas\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTexas,ElPaso\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTexas,SanAntonio\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofToledo\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTulsa\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofTurku\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofUtah\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofVermont\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofVirginia,Charlottesville\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofWashington,Seattle\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofWestFlorida\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofWisconsin-Madison\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofWisconsin-Milwaukee\"\n                },\n                {\n                    \"@id\": \"bts:UniversityofWyoming\"\n                },\n                {\n                    \"@id\": \"bts:UniversitéParis-EstCréteil\"\n                },\n                {\n                    \"@id\": \"bts:UtahStateUniversity\"\n                },\n                {\n                    \"@id\": 
\"bts:VanAndelResearchInstitute\"\n                },\n                {\n                    \"@id\": \"bts:VanderbiltUniversity\"\n                },\n                {\n                    \"@id\": \"bts:VillanovaUniversity\"\n                },\n                {\n                    \"@id\": \"bts:VirginiaCommonwealthUniversity\"\n                },\n                {\n                    \"@id\": \"bts:VirginiaPolytechnicInstituteandStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:WakeForestUniversity\"\n                },\n                {\n                    \"@id\": \"bts:WashingtonStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:WashingtonUniversityinSt.Louis\"\n                },\n                {\n                    \"@id\": \"bts:WayneStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:WestVirginiaUniversity\"\n                },\n                {\n                    \"@id\": \"bts:WesternMichiganUniversity\"\n                },\n                {\n                    \"@id\": \"bts:WichitaStateUniversity\"\n                },\n                {\n                    \"@id\": \"bts:William&Mary\"\n                },\n                {\n                    \"@id\": \"bts:WorcesterPolytechnicInstitute\"\n                },\n                {\n                    \"@id\": \"bts:YaleUniversity\"\n                }\n            ],\n            \"sms:displayName\": \"institution\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ReadStrandOriginEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ReadStrandOriginEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ReadStrandOriginEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PainEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PainEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PainEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:License\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Link to a license or name of license applicable for the resource.\",\n            \"rdfs:label\": \"License\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"license\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PresenceEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Enumeration for binary + unknown for very basic classification of a manifestation in clinical context.\",\n            \"rdfs:label\": \"PresenceEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"PresenceEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NeurofibromaManifestationEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Enumerations describing how neurofibromas can manifest more generally.\",\n            \"rdfs:label\": \"NeurofibromaManifestationEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NeurofibromaManifestationEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:InheritanceEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"InheritanceEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"InheritanceEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AccessTypeEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AccessTypeEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"AccessTypeEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n          
  \"@id\": \"bts:GenePerturbationTechnologyEnum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GenePerturbationTechnologyEnum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GenePerturbationTechnologyEnum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AlbanyMedicalCollege\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AlbanyMedicalCollege\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Albany Medical College\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AlbertEinsteinCollegeofMedicine\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AlbertEinsteinCollegeofMedicine\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Albert Einstein College of Medicine\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AlliantInternationalUniversity\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AlliantInternationalUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Alliant International University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AmericanUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AmericanUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"American University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ArizonaStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ArizonaStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Arizona State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AuburnUniversity,Auburn\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AuburnUniversity,Auburn\",\n            \"rdfs:subClassOf\": [\n                {\n             
       \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Auburn University, Auburn\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AugustaUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AugustaUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Augusta University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BaylorCollegeofMedicine\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BaylorCollegeofMedicine\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Baylor College of Medicine\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BaylorUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BaylorUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            
},\n            \"sms:displayName\": \"Baylor University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BoiseStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BoiseStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Boise State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BostonCollege\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BostonCollege\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Boston College\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BostonUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BostonUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Boston University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:BowlingGreenStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BowlingGreenStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Bowling Green State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BrandeisUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BrandeisUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Brandeis University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BrighamYoungUniversity,Provo\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BrighamYoungUniversity,Provo\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Brigham Young University, Provo\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BrownUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"BrownUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Brown University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CUNY,CityCollege\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CUNY,CityCollege\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CUNY, City College\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CUNY,HunterCollege\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CUNY,HunterCollege\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CUNY, Hunter College\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CUNY,JohnJayCollegeofCriminalJustice\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CUNY,JohnJayCollegeofCriminalJustice\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n      
      \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CUNY, John Jay College of Criminal Justice\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CUNY,QueensCollege\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CUNY,QueensCollege\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CUNY, Queens College\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CaliforniaInstituteofTechnology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CaliforniaInstituteofTechnology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"California Institute of Technology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CaliforniaPolytechnicStateUniversity,SanLuisObispo\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CaliforniaPolytechnicStateUniversity,SanLuisObispo\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"California Polytechnic State University, San Luis Obispo\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CaliforniaStateUniversity,LongBeach\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CaliforniaStateUniversity,LongBeach\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"California State University, Long Beach\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CaliforniaStateUniversity,Northridge\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CaliforniaStateUniversity,Northridge\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"California State University, Northridge\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CaliforniaStateUniversity,Sacramento\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CaliforniaStateUniversity,Sacramento\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"California State University, Sacramento\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CarnegieMellonUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CarnegieMellonUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Carnegie Mellon University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CaseWesternReserveUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CaseWesternReserveUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Case Western Reserve University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CatholicUniversityofAmerica\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CatholicUniversityofAmerica\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Catholic University of 
America\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CentralMichiganUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CentralMichiganUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Central Michigan University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ChapmanUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ChapmanUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Chapman University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Children'sHospitalofPhiladelphia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Children'sHospitalofPhiladelphia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Children's Hospital of Philadelphia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            
\"@id\": \"bts:Children'sNational\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Children'sNational\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Children's National\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CincinnatiChildren'sHospitalMedicalCenter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CincinnatiChildren'sHospitalMedicalCenter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cincinnati Children's Hospital Medical Center\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CityofHope\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CityofHope\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"City of Hope\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ClarksonUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"ClarksonUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Clarkson University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ClemsonUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ClemsonUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Clemson University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ClevelandStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ClevelandStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cleveland State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ColdSpringHarborLaboratory\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ColdSpringHarborLaboratory\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            
],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cold Spring Harbor Laboratory\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ColoradoStateUniversity,FortCollins\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ColoradoStateUniversity,FortCollins\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Colorado State University, Fort Collins\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ColumbiaU.intheCityofNewYork\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ColumbiaU.intheCityofNewYork\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Columbia U. 
in the City of New York\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CornellUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CornellUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cornell University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CreightonUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CreightonUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Creighton University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Dana-FarberCancerInstitute\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Dana-FarberCancerInstitute\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Dana-Farber Cancer Institute\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:DartmouthCollege\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DartmouthCollege\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Dartmouth College\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DelawareStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DelawareStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Delaware State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DrexelUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DrexelUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Drexel University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DukeUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DukeUniversity\",\n            \"rdfs:subClassOf\": [\n   
             {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Duke University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DuquesneUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DuquesneUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Duquesne University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:EastCarolinaUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"EastCarolinaUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"East Carolina University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:EastTennesseeStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"EastTennesseeStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"East Tennessee State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:EasternVirginiaMedicalSchool\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"EasternVirginiaMedicalSchool\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Eastern Virginia Medical School\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:EmoryUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"EmoryUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Emory University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:EutropicsPharmaceuticals\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"EutropicsPharmaceuticals\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Eutropics Pharmaceuticals\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FloridaA&MUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FloridaA&MUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Florida A&M University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FloridaAtlanticUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FloridaAtlanticUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Florida Atlantic University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FloridaInstituteofTechnology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FloridaInstituteofTechnology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Florida Institute of Technology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:FloridaInternationalUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FloridaInternationalUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Florida International University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FloridaStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FloridaStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Florida State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FordhamUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FordhamUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Fordham University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GeorgeMasonUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"GeorgeMasonUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"George Mason University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GeorgeWashingtonUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GeorgeWashingtonUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"George Washington University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GeorgetownUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GeorgetownUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Georgetown University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GeorgiaInstituteofTechnology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GeorgiaInstituteofTechnology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n         
       }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Georgia Institute of Technology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GeorgiaStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GeorgiaStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Georgia State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HarvardUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HarvardUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Harvard University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HenriMondorHospitalParisEstCreteilFrance\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HenriMondorHospitalParisEstCreteilFrance\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            
},\n            \"sms:displayName\": \"Henri Mondor Hospital Paris Est Creteil France\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HowardUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HowardUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Howard University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HumboldtStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HumboldtStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Humboldt State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IcahnSchoolofMedicineatMountSinai\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IcahnSchoolofMedicineatMountSinai\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Icahn School of Medicine at Mount Sinai\",\n            \"sms:required\": \"sms:false\",\n 
           \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IdahoStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IdahoStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Idaho State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IllinoisInstituteofTechnology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IllinoisInstituteofTechnology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illinois Institute of Technology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IllinoisStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IllinoisStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illinois State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IndianaUniversity\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IndianaUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Indiana University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IndianaUniversity,Bloomington\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IndianaUniversity,Bloomington\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Indiana University, Bloomington\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IndianaUniversity-PurdueUniversityatIndianapolis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IndianaUniversity-PurdueUniversityatIndianapolis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Indiana University-Purdue University at Indianapolis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Institutd'InvestigacióBiomédicadeBellvitge\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": 
\"TBD\",\n            \"rdfs:label\": \"Institutd'InvestigacióBiomédicadeBellvitge\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Institut d'Investigació Biomédica de Bellvitge\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Institutd'InvestigacióenCiènciesdelaSalutGermansTriasiPujol\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Institutd'InvestigacióenCiènciesdelaSalutGermansTriasiPujol\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Inserm\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Inserm\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Inserm\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IMBA-InstituteofMolecularBiotechnology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"IMBA-InstituteofMolecularBiotechnology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"IMBA - Institute of Molecular Biotechnology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IowaStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IowaStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Iowa State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JAX\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JAX\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"JAX\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JacksonStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JacksonStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n      
      \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Jackson State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JohnsHopkinsUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JohnsHopkinsUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Johns Hopkins University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:KansasStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"KansasStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Kansas State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:KentStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"KentStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Kent State University\",\n   
         \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LangstonUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LangstonUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Langston University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LehighUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LehighUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Lehigh University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LeibnizInstituteonAging–FritzLipmannInstitute\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LeibnizInstituteonAging–FritzLipmannInstitute\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Leibniz Institute on Aging – Fritz Lipmann Institute\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n         
   \"@id\": \"bts:LomaLindaUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LomaLindaUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Loma Linda University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LongIslandUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LongIslandUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Long Island University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LouisianaStateUniversity,BatonRouge\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LouisianaStateUniversity,BatonRouge\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Louisiana State University, Baton Rouge\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LouisianaStateUniversity,HealthSciencesCenter,NewOrleans\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LouisianaStateUniversity,HealthSciencesCenter,NewOrleans\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Louisiana State University, Health Sciences Center, New Orleans\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LouisianaStateUniversity,HealthSciencesCenter,Shreveport\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LouisianaStateUniversity,HealthSciencesCenter,Shreveport\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Louisiana State University, Health Sciences Center, Shreveport\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LouisianaTechUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LouisianaTechUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Louisiana Tech University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LoyolaUniversity,Chicago\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LoyolaUniversity,Chicago\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Loyola University, Chicago\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:L’InsermdansParisetl’Île-de-FranceCentreNord\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"L’InsermdansParisetl’Île-de-FranceCentreNord\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"L’Inserm dans Paris et l’Île-de-France Centre Nord\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MarquetteUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MarquetteUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Marquette University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MarshallUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"MarshallUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Marshall University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MassachusettsGeneralHospital\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MassachusettsGeneralHospital\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Massachusetts General Hospital\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MassachusettsInstituteofTechnology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MassachusettsInstituteofTechnology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Massachusetts Institute of Technology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MayoClinic\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MayoClinic\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n  
              }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Mayo Clinic\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MayoClinicinArizona\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MayoClinicinArizona\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Mayo Clinic in Arizona\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MedicalCollegeofWisconsin\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MedicalCollegeofWisconsin\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Medical College of Wisconsin\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MedicalUniversityofSouthCarolina\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MedicalUniversityofSouthCarolina\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n       
     \"sms:displayName\": \"Medical University of South Carolina\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MemorialSloanKetteringCancerCenter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MemorialSloanKetteringCancerCenter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Memorial Sloan Kettering Cancer Center\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MercerUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MercerUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Mercer University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MiamiUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MiamiUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Miami University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n     
   },\n        {\n            \"@id\": \"bts:MichiganStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MichiganStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Michigan State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MichiganTechnologicalUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MichiganTechnologicalUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Michigan Technological University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MississippiStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MississippiStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Mississippi State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MissouriUniversityofScienceandTechnology\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MissouriUniversityofScienceandTechnology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Missouri University of Science and Technology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MontanaStateUniversity,Bozeman\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MontanaStateUniversity,Bozeman\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Montana State University, Bozeman\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MontclairStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MontclairStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Montclair State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MorehouseSchoolofMedicine\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"MorehouseSchoolofMedicine\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Morehouse School of Medicine\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MorganStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MorganStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Morgan State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NCICenterforCancerResearch\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NCICenterforCancerResearch\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NCI Center for Cancer Research\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NationalInstitutesofHealth\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NationalInstitutesofHealth\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": 
\"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"National Institutes of Health\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NewJerseyInstituteofTechnology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NewJerseyInstituteofTechnology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"New Jersey Institute of Technology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NewMexicoStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NewMexicoStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"New Mexico State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NewYorkMedicalCollege\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NewYorkMedicalCollege\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"New York Medical College\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NewYorkUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NewYorkUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"New York University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NorthCarolinaCentralUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NorthCarolinaCentralUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"North Carolina Central University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NorthCarolinaStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NorthCarolinaStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"North Carolina State University\",\n            
\"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NorthDakotaStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NorthDakotaStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"North Dakota State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NortheastOhioMedicalUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NortheastOhioMedicalUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Northeast Ohio Medical University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NortheasternUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NortheasternUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Northeastern University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:NorthernArizonaUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NorthernArizonaUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Northern Arizona University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NorthernIllinoisUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NorthernIllinoisUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Northern Illinois University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NorthwesternUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NorthwesternUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Northwestern University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NovaSoutheasternUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"NovaSoutheasternUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nova Southeastern University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OhioStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OhioStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Ohio State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OhioUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OhioUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Ohio University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OklahomaStateUniversity,Stillwater\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OklahomaStateUniversity,Stillwater\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n          
      }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Oklahoma State University, Stillwater\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OregonHealthandScienceUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OregonHealthandScienceUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Oregon Health and Science University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OregonStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OregonStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Oregon State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PacificNorthwestNationalLaboratory\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PacificNorthwestNationalLaboratory\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Pacific Northwest National Laboratory\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PenningtonBiomedicalResearchCenter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PenningtonBiomedicalResearchCenter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Pennington Biomedical Research Center\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PennsylvaniaStateUniversity,UniversityParkandHersheyMedicalCenter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PennsylvaniaStateUniversity,UniversityParkandHersheyMedicalCenter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Pennsylvania State University, University Park and Hershey Medical Center\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PortlandStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PortlandStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Portland State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PrairieViewA&MUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PrairieViewA&MUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Prairie View A&M University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PrincetonUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PrincetonUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Princeton University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PurdueUniversity,WestLafayette\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PurdueUniversity,WestLafayette\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Purdue 
University, West Lafayette\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PusanNationalUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PusanNationalUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Pusan National University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RensselaerPolytechnicInstitute\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RensselaerPolytechnicInstitute\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Rensselaer Polytechnic Institute\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RiceUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RiceUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Rice University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            
\"@id\": \"bts:RochesterInstituteofTechnology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RochesterInstituteofTechnology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Rochester Institute of Technology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RockefellerUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RockefellerUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Rockefeller University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RosalindFranklinUniversityofMedicineandScience\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RosalindFranklinUniversityofMedicineandScience\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Rosalind Franklin University of Medicine and Science\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RoyalNorthShoreHospital\",\n            
\"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RoyalNorthShoreHospital\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Royal North Shore Hospital\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RushUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RushUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Rush University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RutgersStateUniversityofNewJersey,NewBrunswick\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RutgersStateUniversityofNewJersey,NewBrunswick\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Rutgers State University of New Jersey, New Brunswick\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RutgersStateUniversityofNewJersey,Newark\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"RutgersStateUniversityofNewJersey,Newark\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Rutgers State University of New Jersey, Newark\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SUNY,BinghamtonUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SUNY,BinghamtonUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SUNY, Binghamton University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SUNY,DownstateHealthSciencesUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SUNY,DownstateHealthSciencesUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SUNY, Downstate Health Sciences University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SageBionetworks\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SageBionetworks\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Sage Bionetworks\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SaintLouisUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SaintLouisUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Saint Louis University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SanDiegoStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SanDiegoStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"San Diego State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SanFranciscoStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SanFranciscoStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"San Francisco State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SanJoseStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SanJoseStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"San Jose State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ScrippsResearchInstitute\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ScrippsResearchInstitute\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Scripps Research Institute\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SeattleChildren's\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SeattleChildren's\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Seattle 
Children's\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SouthDakotaStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SouthDakotaStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"South Dakota State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SouthernIllinoisUniversity,Carbondale\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SouthernIllinoisUniversity,Carbondale\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Southern Illinois University, Carbondale\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SouthernIllinoisUniversity,Edwardsville\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SouthernIllinoisUniversity,Edwardsville\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Southern Illinois University, Edwardsville\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SouthernMethodistUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SouthernMethodistUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Southern Methodist University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StanfordUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StanfordUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Stanford University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StateUniversityofNewYorkPolytechnicInstitute\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StateUniversityofNewYorkPolytechnicInstitute\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"State University of New York Polytechnic Institute\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n      
      \"@id\": \"bts:StateUniversityofNewYork,Albany\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StateUniversityofNewYork,Albany\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"State University of New York, Albany\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StateUniversityofNewYork,Buffalo\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StateUniversityofNewYork,Buffalo\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"State University of New York, Buffalo\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StateUniversityofNewYork,UpstateMedicalUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StateUniversityofNewYork,UpstateMedicalUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"State University of New York, Upstate Medical University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            
\"@id\": \"bts:StevensInstituteofTechnology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StevensInstituteofTechnology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Stevens Institute of Technology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StonyBrookUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StonyBrookUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Stony Brook University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SyracuseUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SyracuseUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Syracuse University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TempleUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"TempleUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Temple University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TennesseeStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TennesseeStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Tennessee State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TennesseeTechnologicalUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TennesseeTechnologicalUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Tennessee Technological University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TexasA&MUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TexasA&MUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n         
       }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Texas A&M University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TexasChristianUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TexasChristianUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Texas Christian University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TexasStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TexasStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Texas State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TexasTechUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TexasTechUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"Texas Tech University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TexasTechUniversityofHealthSciencesCenter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TexasTechUniversityofHealthSciencesCenter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Texas Tech University of Health Sciences Center\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ThomasJeffersonUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ThomasJeffersonUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Thomas Jefferson University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TuftsUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TuftsUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Tufts University\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TulaneUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TulaneUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Tulane University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TuskegeeUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TuskegeeUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Tuskegee University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniformedServicesUniversityoftheHealthSciences\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniformedServicesUniversityoftheHealthSciences\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Uniformed Services University of the Health Sciences\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityCollegeofLondon\",\n     
       \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityCollegeofLondon\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University College of London\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityHealthNetwork\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityHealthNetwork\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University Health Network\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofAkron\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofAkron\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Akron\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofAlabama\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofAlabama\",\n            \"rdfs:subClassOf\": [\n 
               {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Alabama\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofAlabama,Birmingham\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofAlabama,Birmingham\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Alabama, Birmingham\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofAlabama,Huntsville\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofAlabama,Huntsville\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Alabama, Huntsville\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofAlabama,Tuscaloosa\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofAlabama,Tuscaloosa\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n        
    ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Alabama, Tuscaloosa\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofAlaska,Anchorage\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofAlaska,Anchorage\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Alaska, Anchorage\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofAlaska,Fairbanks\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofAlaska,Fairbanks\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Alaska, Fairbanks\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofArizona\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofArizona\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            
},\n            \"sms:displayName\": \"University of Arizona\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofArkansasforMedicalSciences\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofArkansasforMedicalSciences\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Arkansas for Medical Sciences\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofArkansas,Fayetteville\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofArkansas,Fayetteville\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Arkansas, Fayetteville\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofBaltimore\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofBaltimore\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Baltimore\",\n           
 \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofCalifornia,Berkeley\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCalifornia,Berkeley\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of California, Berkeley\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofCalifornia,Davis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCalifornia,Davis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of California, Davis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofCalifornia,Irvine\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCalifornia,Irvine\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of California, Irvine\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n 
       {\n            \"@id\": \"bts:UniversityofCalifornia,LosAngeles\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCalifornia,LosAngeles\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of California, Los Angeles\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofCalifornia,Merced\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCalifornia,Merced\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of California, Merced\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofCalifornia,Riverside\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCalifornia,Riverside\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of California, Riverside\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:UniversityofCalifornia,SanDiego\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCalifornia,SanDiego\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of California, San Diego\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofCalifornia,SanFrancisco\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCalifornia,SanFrancisco\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of California, San Francisco\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofCalifornia,SantaBarbara\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCalifornia,SantaBarbara\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of California, Santa Barbara\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofCalifornia,SantaCruz\",\n    
        \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCalifornia,SantaCruz\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of California, Santa Cruz\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofCambridge\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCambridge\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Cambridge\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofCentralFlorida\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCentralFlorida\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Central Florida\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofChicago\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"UniversityofChicago\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Chicago\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofCincinnati\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofCincinnati\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Cincinnati\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofColoradoBoulder\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofColoradoBoulder\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Colorado Boulder\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofColoradoDenverandAnschutzMedicalCampus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofColoradoDenverandAnschutzMedicalCampus\",\n            \"rdfs:subClassOf\": [\n                {\n        
            \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Colorado Denver and Anschutz Medical Campus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofConnecticut\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofConnecticut\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Connecticut\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofDayton\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofDayton\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Dayton\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofDelaware\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofDelaware\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Delaware\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofDenver\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofDenver\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Denver\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofFlorida\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofFlorida\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Florida\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofGeorgia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofGeorgia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Georgia\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofGlasgow\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofGlasgow\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Glasgow\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofHawaii,Manoa\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofHawaii,Manoa\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Hawaii, Manoa\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofHouston\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofHouston\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Houston\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofIdaho\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofIdaho\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Idaho\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofIllinois,Chicago\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofIllinois,Chicago\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Illinois, Chicago\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofIllinois,Urbana-Champaign\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofIllinois,Urbana-Champaign\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Illinois, Urbana-Champaign\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofIowa\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofIowa\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Iowa\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofKansas\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofKansas\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Kansas\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofKentucky\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofKentucky\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Kentucky\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofLouisianaatLafayette\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofLouisianaatLafayette\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": 
{\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Louisiana at Lafayette\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofLouisville\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofLouisville\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Louisville\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMaine\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMaine\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Maine\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMaryland,BaltimoreCounty\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMaryland,BaltimoreCounty\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of 
Maryland, Baltimore County\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMaryland,EasternShore\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMaryland,EasternShore\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Maryland, Eastern Shore\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMassachusetts,Amherst\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMassachusetts,Amherst\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Massachusetts, Amherst\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMassachusetts,Boston\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMassachusetts,Boston\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Massachusetts, Boston\",\n            
\"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMassachusetts,Dartmouth\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMassachusetts,Dartmouth\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Massachusetts, Dartmouth\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMassachusetts,Lowell\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMassachusetts,Lowell\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Massachusetts, Lowell\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMassachusetts,MedicalSchool\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMassachusetts,MedicalSchool\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Massachusetts, Medical School\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMemphis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMemphis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Memphis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMiami\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMiami\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Miami\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMichigan,AnnArbor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMichigan,AnnArbor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Michigan, Ann Arbor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMinnesota,Duluth\",\n            
\"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMinnesota,Duluth\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Minnesota, Duluth\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMinnesota,TwinCities\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMinnesota,TwinCities\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Minnesota, Twin Cities\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMississippi\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMississippi\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Mississippi\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMissouri,Columbia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"UniversityofMissouri,Columbia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Missouri, Columbia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMissouri,KansasCity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMissouri,KansasCity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Missouri, Kansas City\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMissouri,SaintLouis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMissouri,SaintLouis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Missouri, Saint Louis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofMontana,Missoula\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofMontana,Missoula\",\n            \"rdfs:subClassOf\": [\n        
        {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Montana, Missoula\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNebraska,Lincoln\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNebraska,Lincoln\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Nebraska, Lincoln\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNebraska,MedicalCenter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNebraska,MedicalCenter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Nebraska, Medical Center\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNebraska,Omaha\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNebraska,Omaha\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n      
      ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Nebraska, Omaha\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNevada,LasVegas\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNevada,LasVegas\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Nevada, Las Vegas\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNevada,Reno\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNevada,Reno\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Nevada, Reno\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNewHampshire\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNewHampshire\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n         
   \"sms:displayName\": \"University of New Hampshire\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNewMexico\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNewMexico\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of New Mexico\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNewOrleans\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNewOrleans\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of New Orleans\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNorthCarolina,ChapelHill\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNorthCarolina,ChapelHill\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of North Carolina, Chapel Hill\",\n            \"sms:required\": \"sms:false\",\n          
  \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNorthCarolina,Charlotte\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNorthCarolina,Charlotte\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of North Carolina, Charlotte\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNorthCarolina,Greensboro\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNorthCarolina,Greensboro\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of North Carolina, Greensboro\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNorthCarolina,Wilmington\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNorthCarolina,Wilmington\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of North Carolina, Wilmington\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": 
[]\n        },\n        {\n            \"@id\": \"bts:UniversityofNorthDakota\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNorthDakota\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of North Dakota\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNorthTexas,Denton\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNorthTexas,Denton\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of North Texas, Denton\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofNorthTexas,HealthScienceCenter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNorthTexas,HealthScienceCenter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of North Texas, Health Science Center\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:UniversityofNotreDame\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofNotreDame\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Notre Dame\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofOklahoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofOklahoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Oklahoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofOregon\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofOregon\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Oregon\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofPennsylvania\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofPennsylvania\",\n    
        \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Pennsylvania\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofPittsburgh,Pittsburgh\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofPittsburgh,Pittsburgh\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Pittsburgh, Pittsburgh\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofPlymouth\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofPlymouth\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Plymouth\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofPuertoRico,Mayaguez\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofPuertoRico,Mayaguez\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n         
       }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Puerto Rico, Mayaguez\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofPuertoRico,MedicalSciencesCampus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofPuertoRico,MedicalSciencesCampus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Puerto Rico, Medical Sciences Campus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofPuertoRico,RioPiedras\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofPuertoRico,RioPiedras\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Puerto Rico, Rio Piedras\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofRhodeIsland\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofRhodeIsland\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Rhode Island\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofRochester\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofRochester\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Rochester\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofSouthAlabama\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofSouthAlabama\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of South Alabama\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofSouthCarolina,Columbia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofSouthCarolina,Columbia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"University of South Carolina, Columbia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofSouthDakota\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofSouthDakota\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of South Dakota\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofSouthFlorida,SaintPetersburg\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofSouthFlorida,SaintPetersburg\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of South Florida, Saint Petersburg\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofSouthFlorida,Tampa\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofSouthFlorida,Tampa\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of South Florida, Tampa\",\n     
       \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofSouthernCalifornia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofSouthernCalifornia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Southern California\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofSouthernMississippi\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofSouthernMississippi\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Southern Mississippi\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTennessee,HealthScienceCenter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTennessee,HealthScienceCenter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Tennessee, Health Science Center\",\n            \"sms:required\": \"sms:false\",\n    
        \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTennessee,Knoxville\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTennessee,Knoxville\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Tennessee, Knoxville\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTexasHealthScienceCenter,Houston\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexasHealthScienceCenter,Houston\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas Health Science Center, Houston\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTexasHealthScienceCenter,SanAntonio\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexasHealthScienceCenter,SanAntonio\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas Health Science Center, San Antonio\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTexasHealthScienceCenter,Tyler\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexasHealthScienceCenter,Tyler\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas Health Science Center, Tyler\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTexasM.D.AndersonCancerCenter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexasM.D.AndersonCancerCenter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas M. D. 
Anderson Cancer Center\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTexasMedicalBranchatGalveston\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexasMedicalBranchatGalveston\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas Medical Branch at Galveston\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTexasRioGrandeValley\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexasRioGrandeValley\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas Rio Grande Valley\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTexasSouthwesternMedicalCenter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexasSouthwesternMedicalCenter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas 
Southwestern Medical Center\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTexas,Arlington\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexas,Arlington\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas, Arlington\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTexas,Austin\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexas,Austin\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas, Austin\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTexas,Dallas\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexas,Dallas\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas, Dallas\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        
},\n        {\n            \"@id\": \"bts:UniversityofTexas,ElPaso\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexas,ElPaso\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas, El Paso\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTexas,SanAntonio\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTexas,SanAntonio\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Texas, San Antonio\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofToledo\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofToledo\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Toledo\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTulsa\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": 
\"TBD\",\n            \"rdfs:label\": \"UniversityofTulsa\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Tulsa\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofTurku\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofTurku\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Turku\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofUtah\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofUtah\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Utah\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofVermont\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofVermont\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            
],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Vermont\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofVirginia,Charlottesville\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofVirginia,Charlottesville\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Virginia, Charlottesville\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofWashington,Seattle\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofWashington,Seattle\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Washington, Seattle\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofWestFlorida\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofWestFlorida\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of West Florida\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofWisconsin-Madison\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofWisconsin-Madison\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Wisconsin-Madison\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofWisconsin-Milwaukee\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofWisconsin-Milwaukee\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Wisconsin-Milwaukee\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversityofWyoming\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversityofWyoming\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"University of Wyoming\",\n     
       \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UniversitéParis-EstCréteil\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UniversitéParis-EstCréteil\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Université Paris-Est Créteil\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UtahStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UtahStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Utah State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VanAndelResearchInstitute\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"VanAndelResearchInstitute\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Van Andel Research Institute\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:VanderbiltUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"VanderbiltUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Vanderbilt University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VillanovaUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"VillanovaUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Villanova University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VirginiaCommonwealthUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"VirginiaCommonwealthUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Virginia Commonwealth University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VirginiaPolytechnicInstituteandStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"VirginiaPolytechnicInstituteandStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Virginia Polytechnic Institute and State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WakeForestUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"WakeForestUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Wake Forest University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WashingtonStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"WashingtonStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Washington State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WashingtonUniversityinSt.Louis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"WashingtonUniversityinSt.Louis\",\n            \"rdfs:subClassOf\": [\n       
         {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Washington University in St. Louis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WayneStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"WayneStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Wayne State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WestVirginiaUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"WestVirginiaUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"West Virginia University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WesternMichiganUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"WesternMichiganUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n               
 \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Western Michigan University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WichitaStateUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"WichitaStateUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Wichita State University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:William&Mary\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"William&Mary\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"William & Mary\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WorcesterPolytechnicInstitute\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"WorcesterPolytechnicInstitute\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Worcester Polytechnic Institute\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:YaleUniversity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"YaleUniversity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Institution\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Yale University\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProportionCoverage30x\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Proportion of all reference bases for whole genome sequencing, or targeted bases for whole exome and targeted sequencing, that achieves 30X or greater coverage from Picard Tools\",\n            \"rdfs:label\": \"ProportionCoverage30x\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"proportionCoverage30x\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"num\"\n            ]\n        },\n        {\n            \"@id\": \"bts:Description\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Text describing a resource.\",\n            \"rdfs:label\": \"Description\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"description\",\n            
\"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CompoundDoseUnit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A unit associated with the value(s) in compoundDose.\",\n            \"rdfs:label\": \"CompoundDoseUnit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"compoundDoseUnit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IsStranded\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Whether or not the library is stranded (Yes; No)\",\n            \"rdfs:label\": \"IsStranded\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Yes\"\n                },\n                {\n                    \"@id\": \"bts:No\"\n                }\n            ],\n            \"sms:displayName\": \"isStranded\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Yes\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Yes\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:IsStranded\"\n                },\n                {\n                    \"@id\": \"bts:IsCellLine\"\n                },\n                {\n                    \"@id\": 
\"bts:IsMultiIndividual\"\n                },\n                {\n                    \"@id\": \"bts:IsPairedEnd\"\n                },\n                {\n                    \"@id\": \"bts:IsMultiSpecimen\"\n                },\n                {\n                    \"@id\": \"bts:IsPrimaryCell\"\n                },\n                {\n                    \"@id\": \"bts:IsXenograft\"\n                },\n                {\n                    \"@id\": \"bts:FailedQC\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Yes\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:No\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"No\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:IsStranded\"\n                },\n                {\n                    \"@id\": \"bts:IsCellLine\"\n                },\n                {\n                    \"@id\": \"bts:IsMultiIndividual\"\n                },\n                {\n                    \"@id\": \"bts:IsPairedEnd\"\n                },\n                {\n                    \"@id\": \"bts:IsMultiSpecimen\"\n                },\n                {\n                    \"@id\": \"bts:IsPrimaryCell\"\n                },\n                {\n                    \"@id\": \"bts:IsXenograft\"\n                },\n                {\n                    \"@id\": \"bts:FailedQC\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"No\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:GenomicReferenceLink\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Link to genome reference data file used for alignment in processing workflow\",\n            \"rdfs:label\": \"GenomicReferenceLink\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"genomicReferenceLink\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProgrammingLanguage\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A computer programming language\",\n            \"rdfs:label\": \"ProgrammingLanguage\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Python\"\n                },\n                {\n                    \"@id\": \"bts:R\"\n                },\n                {\n                    \"@id\": \"bts:MATLAB\"\n                },\n                {\n                    \"@id\": \"bts:Java\"\n                },\n                {\n                    \"@id\": \"bts:C\"\n                },\n                {\n                    \"@id\": \"bts:C++\"\n                },\n                {\n                    \"@id\": \"bts:C#\"\n                },\n                {\n                    \"@id\": \"bts:Javascript\"\n                },\n                {\n                    \"@id\": \"bts:Bash\"\n                }\n            ],\n            \"sms:displayName\": \"programmingLanguage\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Python\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Python\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgrammingLanguage\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Python\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:R\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"R\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgrammingLanguage\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"R\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MATLAB\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MATLAB\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgrammingLanguage\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MATLAB\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Java\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Java\",\n            \"rdfs:subClassOf\": [\n                {\n         
           \"@id\": \"bts:ProgrammingLanguage\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Java\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:C\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"C\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CaptureArea\"\n                },\n                {\n                    \"@id\": \"bts:ProgrammingLanguage\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"C\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:C++\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"C++\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgrammingLanguage\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"C++\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:C#\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"C#\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgrammingLanguage\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"C#\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Javascript\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Javascript\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgrammingLanguage\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Javascript\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bash\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bash\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgrammingLanguage\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bash\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SlideID\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Unique identifier printed on the label of each Visium slide. 
The serial number starts with V followed by a number which can range between one through five and ends with a dash and a three digit number, such as 123.\\n\",\n            \"rdfs:label\": \"SlideID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"slideID\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SourceName\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Intended for non-biological samples. If the sample is a nanoparticle sample or some chemical substance not derived from a biological material, the corresponding source name should refer to the starting sample that was modified by a protocol for the assay.\\n\",\n            \"rdfs:label\": \"SourceName\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sourceName\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mosaicism\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Whether individual is mosaic.\",\n            \"rdfs:label\": \"Mosaicism\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Mosaic\"\n                },\n            
    {\n                    \"@id\": \"bts:Notmosaic\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"mosaicism\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mosaic\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mosaic\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Mosaicism\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mosaic\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Notmosaic\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Notmosaic\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Mosaicism\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"not mosaic\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Unknown\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Unknown\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Mosaicism\"\n                },\n                {\n                    \"@id\": \"bts:DermalSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:BreastCancer\"\n                },\n                {\n                    \"@id\": 
\"bts:OtherTumors\"\n                },\n                {\n                    \"@id\": \"bts:GIST\"\n                },\n                {\n                    \"@id\": \"bts:NonopticGlioma\"\n                },\n                {\n                    \"@id\": \"bts:PeripheralNeuropathy\"\n                },\n                {\n                    \"@id\": \"bts:Scoliosis\"\n                },\n                {\n                    \"@id\": \"bts:IntellectualDisability\"\n                },\n                {\n                    \"@id\": \"bts:MPNSTCharacterization\"\n                },\n                {\n                    \"@id\": \"bts:AqueductalStenosis\"\n                },\n                {\n                    \"@id\": \"bts:IrisLischNodules\"\n                },\n                {\n                    \"@id\": \"bts:HeartDefect\"\n                },\n                {\n                    \"@id\": \"bts:Pheochromocytoma\"\n                },\n                {\n                    \"@id\": \"bts:Inheritance\"\n                },\n                {\n                    \"@id\": \"bts:LearningDisability\"\n                },\n                {\n                    \"@id\": \"bts:LenticularOpacity\"\n                },\n                {\n                    \"@id\": \"bts:CafeaulaitMacules\"\n                },\n                {\n                    \"@id\": \"bts:LongBoneDysplasia\"\n                },\n                {\n                    \"@id\": \"bts:GlomusTumor\"\n                },\n                {\n                    \"@id\": \"bts:SphenoidDysplasia\"\n                },\n                {\n                    \"@id\": \"bts:SkinFoldFreckling\"\n                },\n                {\n                    \"@id\": \"bts:Leukemia\"\n                },\n                {\n                    \"@id\": \"bts:VascularDisease\"\n                },\n                {\n                    \"@id\": \"bts:AttentionDeficitDisorder\"\n                
},\n                {\n                    \"@id\": \"bts:VitalStatus\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:DiffuseDermalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:NumberOfSchwannomas\"\n                },\n                {\n                    \"@id\": \"bts:PlexiformNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:DermalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:NonvestibularCranialSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:NonvestibularSchwannomas\"\n                },\n                {\n                    \"@id\": \"bts:SubcutaneousNodularNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:GliomaOrEpendymoma\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:SpinalSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:OpticGlioma\"\n                },\n                {\n                    \"@id\": \"bts:VestibularSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:DiagnosisAgeGroup\"\n                },\n                {\n                    \"@id\": \"bts:Meningioma\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Unknown\",\n          
  \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ReadPairOrientation\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The relative orientation of the reads in a paired-end protocol\",\n            \"rdfs:label\": \"ReadPairOrientation\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Fr-firststrand\"\n                },\n                {\n                    \"@id\": \"bts:Inward\"\n                },\n                {\n                    \"@id\": \"bts:Matching\"\n                },\n                {\n                    \"@id\": \"bts:Outward\"\n                }\n            ],\n            \"sms:displayName\": \"readPairOrientation\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Fr-firststrand\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Fr-firststrand\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ReadPairOrientation\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"fr-firststrand\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Inward\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Inward\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": 
\"bts:ReadPairOrientation\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"inward\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Matching\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Matching\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ReadPairOrientation\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"matching\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Outward\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Outward\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ReadPairOrientation\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"outward\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FileSize\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Size of file in bytes.\",\n            \"rdfs:label\": \"FileSize\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"fileSize\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DermalSchwannoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Dermal schwannoma.\",\n            \"rdfs:label\": \"DermalSchwannoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"dermalSchwannoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Absent\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Absent\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DermalSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:OpticGlioma\"\n                },\n                {\n                    \"@id\": \"bts:BreastCancer\"\n                },\n                {\n                    \"@id\": \"bts:OtherTumors\"\n                },\n                {\n                    \"@id\": \"bts:DiffuseDermalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:GIST\"\n                },\n                {\n                    \"@id\": \"bts:NonopticGlioma\"\n                },\n                {\n                    \"@id\": \"bts:PeripheralNeuropathy\"\n                },\n                {\n              
      \"@id\": \"bts:Scoliosis\"\n                },\n                {\n                    \"@id\": \"bts:IntellectualDisability\"\n                },\n                {\n                    \"@id\": \"bts:MPNSTCharacterization\"\n                },\n                {\n                    \"@id\": \"bts:AqueductalStenosis\"\n                },\n                {\n                    \"@id\": \"bts:IrisLischNodules\"\n                },\n                {\n                    \"@id\": \"bts:HeartDefect\"\n                },\n                {\n                    \"@id\": \"bts:Pheochromocytoma\"\n                },\n                {\n                    \"@id\": \"bts:LearningDisability\"\n                },\n                {\n                    \"@id\": \"bts:LenticularOpacity\"\n                },\n                {\n                    \"@id\": \"bts:CafeaulaitMacules\"\n                },\n                {\n                    \"@id\": \"bts:DermalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:LongBoneDysplasia\"\n                },\n                {\n                    \"@id\": \"bts:GlomusTumor\"\n                },\n                {\n                    \"@id\": \"bts:SphenoidDysplasia\"\n                },\n                {\n                    \"@id\": \"bts:SkinFoldFreckling\"\n                },\n                {\n                    \"@id\": \"bts:Leukemia\"\n                },\n                {\n                    \"@id\": \"bts:SubcutaneousNodularNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:VascularDisease\"\n                },\n                {\n                    \"@id\": \"bts:AttentionDeficitDisorder\"\n                },\n                {\n                    \"@id\": \"bts:SpinalNeurofibromas\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n     
       \"sms:displayName\": \"absent\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Present\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Present\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:PlexiformNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:DermalSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:BreastCancer\"\n                },\n                {\n                    \"@id\": \"bts:OtherTumors\"\n                },\n                {\n                    \"@id\": \"bts:GIST\"\n                },\n                {\n                    \"@id\": \"bts:NonopticGlioma\"\n                },\n                {\n                    \"@id\": \"bts:PeripheralNeuropathy\"\n                },\n                {\n                    \"@id\": \"bts:Scoliosis\"\n                },\n                {\n                    \"@id\": \"bts:IntellectualDisability\"\n                },\n                {\n                    \"@id\": \"bts:MPNSTCharacterization\"\n                },\n                {\n                    \"@id\": \"bts:AqueductalStenosis\"\n                },\n                {\n                    \"@id\": \"bts:IrisLischNodules\"\n                },\n                {\n                    \"@id\": \"bts:HeartDefect\"\n                },\n                {\n                    \"@id\": \"bts:Pheochromocytoma\"\n                },\n                {\n                    \"@id\": \"bts:LearningDisability\"\n                },\n                {\n                    \"@id\": \"bts:LenticularOpacity\"\n                },\n                {\n                    \"@id\": \"bts:CafeaulaitMacules\"\n                },\n                {\n                    \"@id\": 
\"bts:LongBoneDysplasia\"\n                },\n                {\n                    \"@id\": \"bts:GlomusTumor\"\n                },\n                {\n                    \"@id\": \"bts:SphenoidDysplasia\"\n                },\n                {\n                    \"@id\": \"bts:SkinFoldFreckling\"\n                },\n                {\n                    \"@id\": \"bts:Leukemia\"\n                },\n                {\n                    \"@id\": \"bts:VascularDisease\"\n                },\n                {\n                    \"@id\": \"bts:AttentionDeficitDisorder\"\n                },\n                {\n                    \"@id\": \"bts:NonvestibularCranialSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:GliomaOrEpendymoma\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"present\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BenefactorId\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The id of the resource from which access control is inherited.\",\n            \"rdfs:label\": \"BenefactorId\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"benefactorId\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SpinalSchwannoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Spinal schwannoma.\",\n            \"rdfs:label\": \"SpinalSchwannoma\",\n            \"rdfs:subClassOf\": [\n  
              {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Notimaged\"\n                },\n                {\n                    \"@id\": \"bts:Absentbyimaging\"\n                },\n                {\n                    \"@id\": \"bts:Single\"\n                },\n                {\n                    \"@id\": \"bts:Multiple\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"spinalSchwannoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Notimaged\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Notimaged\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpinalSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:SpinalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:VestibularSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:NonvestibularCranialSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:Meningioma\"\n                },\n                {\n                    \"@id\": \"bts:GliomaOrEpendymoma\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"not imaged\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Absentbyimaging\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Absentbyimaging\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpinalSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:VestibularSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:NonvestibularCranialSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:Meningioma\"\n                },\n                {\n                    \"@id\": \"bts:GliomaOrEpendymoma\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"absent by imaging\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Single\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Single\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:NumberOfSchwannomas\"\n                },\n                {\n                    \"@id\": \"bts:SpinalSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:Meningioma\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"single\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Multiple\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Multiple\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpinalSchwannoma\"\n                },\n                {\n                
    \"@id\": \"bts:Meningioma\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Multiple\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AccessType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Indicates access type / possible procedures needed for access to the resource.\",\n            \"rdfs:label\": \"AccessType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:PublicAccess\"\n                },\n                {\n                    \"@id\": \"bts:OpenAccess\"\n                },\n                {\n                    \"@id\": \"bts:ControlledAccess\"\n                },\n                {\n                    \"@id\": \"bts:PrivateAccess\"\n                }\n            ],\n            \"sms:displayName\": \"accessType\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PublicAccess\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PublicAccess\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:AccessType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Public Access\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            
\"@id\": \"bts:OpenAccess\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OpenAccess\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:AccessType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Open Access\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ControlledAccess\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ControlledAccess\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:AccessType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Controlled Access\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PrivateAccess\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PrivateAccess\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:AccessType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Private Access\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SpinalNeurofibromas\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Spinal neurofibromas.\",\n            \"rdfs:label\": \"SpinalNeurofibromas\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Notimaged\"\n                },\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Levels1-3\"\n                },\n                {\n                    \"@id\": \"bts:Alllevels\"\n                }\n            ],\n            \"sms:displayName\": \"spinalNeurofibromas\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Levels1-3\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Levels1-3\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpinalNeurofibromas\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"levels 1-3\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Alllevels\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Alllevels\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpinalNeurofibromas\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"all levels\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:OpticGlioma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Optic glioma.\",\n            \"rdfs:label\": \"OpticGlioma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present-asymptomatic\"\n                },\n                {\n                    \"@id\": \"bts:Present-symptomatic-nottreated\"\n                },\n                {\n                    \"@id\": \"bts:Present-symptomatic-treated\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"opticGlioma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Present-asymptomatic\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Present-asymptomatic\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:OpticGlioma\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"present - asymptomatic\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Present-symptomatic-nottreated\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Present-symptomatic-nottreated\",\n            \"rdfs:subClassOf\": [\n          
      {\n                    \"@id\": \"bts:OpticGlioma\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"present - symptomatic - not treated\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Present-symptomatic-treated\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Present-symptomatic-treated\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:OpticGlioma\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"present - symptomatic - treated\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CreatedOn\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Refers to when the resource was created.\",\n            \"rdfs:label\": \"CreatedOn\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"createdOn\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProtocolPurpose\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Brief description of the protocol purpose.\",\n            \"rdfs:label\": \"ProtocolPurpose\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"protocolPurpose\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GrantDOI\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Doi of a grant (e.g. in ProposalCentral) that can be associated with the entity.\",\n            \"rdfs:label\": \"GrantDOI\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"grantDOI\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BreastCancer\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BreastCancer\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"Breast Cancer\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ReporterSubstance\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A biological material (clone, oligo, etc.) 
on an array which will report on some biosequence or biosequences.\",\n            \"rdfs:label\": \"ReporterSubstance\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"reporterSubstance\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SlideVersion\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Version of imaging slide used. Slide version is critical for the analysis of the sequencing data as different slides have different capture area layouts.\",\n            \"rdfs:label\": \"SlideVersion\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:V1\"\n                },\n                {\n                    \"@id\": \"bts:V2\"\n                },\n                {\n                    \"@id\": \"bts:V3\"\n                },\n                {\n                    \"@id\": \"bts:V4\"\n                }\n            ],\n            \"sms:displayName\": \"slideVersion\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:V1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"V1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SlideVersion\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"V1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:V2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"V2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SlideVersion\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"V2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:V3\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"V3\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SlideVersion\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"V3\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:V4\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"V4\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SlideVersion\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"V4\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ReporterGene\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A gene which produces an 
easily assayed phenotype. Often used for expression studies of heterologous promoters.\",\n            \"rdfs:label\": \"ReporterGene\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"reporterGene\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1Genotype\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Genotype of NF1 gene in the biospecimen from which the data were derived, if known.\",\n            \"rdfs:label\": \"Nf1Genotype\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:-/-\"\n                },\n                {\n                    \"@id\": \"bts:+/-\"\n                },\n                {\n                    \"@id\": \"bts:+/+\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"nf1Genotype\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:-/-\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"-/-\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                }\n            ],\n            \"schema:isPartOf\": {\n  
              \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"-/-\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:+/-\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"+/-\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"+/-\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:+/+\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"+/+\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"+/+\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Diagnosis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Diagnosis for the individual given signs and symptoms. 
Use the most specific diagnosis term that applies.\",\n            \"rdfs:label\": \"Diagnosis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Neurofibromatosistype1\"\n                },\n                {\n                    \"@id\": \"bts:Schwannomatosis\"\n                },\n                {\n                    \"@id\": \"bts:NF2-relatedschwannomatosis\"\n                },\n                {\n                    \"@id\": \"bts:SMARCB1-relatedschwannomatosis\"\n                },\n                {\n                    \"@id\": \"bts:LZTR1-relatedschwannomatosis\"\n                },\n                {\n                    \"@id\": \"bts:22q-relatedschwannomatosis\"\n                },\n                {\n                    \"@id\": \"bts:Schwannomatosis-NOS\"\n                },\n                {\n                    \"@id\": \"bts:Schwannomatosis-NEC\"\n                },\n                {\n                    \"@id\": \"bts:SporadicSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:NoonanSyndrome\"\n                },\n                {\n                    \"@id\": \"bts:NotApplicable\"\n                }\n            ],\n            \"sms:displayName\": \"diagnosis\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Neurofibromatosistype1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Neurofibromatosistype1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Neurofibromatosis type 1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Schwannomatosis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Schwannomatosis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Schwannomatosis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NF2-relatedschwannomatosis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NF2-relatedschwannomatosis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NF2-related schwannomatosis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SMARCB1-relatedschwannomatosis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SMARCB1-relatedschwannomatosis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SMARCB1-related 
schwannomatosis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LZTR1-relatedschwannomatosis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LZTR1-relatedschwannomatosis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"LZTR1-related schwannomatosis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:22q-relatedschwannomatosis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"22q-relatedschwannomatosis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"22q-related schwannomatosis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Schwannomatosis-NOS\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Schwannomatosis-NOS\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Schwannomatosis-NOS\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            
\"@id\": \"bts:Schwannomatosis-NEC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Schwannomatosis-NEC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Schwannomatosis-NEC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SporadicSchwannoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SporadicSchwannoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Sporadic Schwannoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NoonanSyndrome\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NoonanSyndrome\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Noonan Syndrome\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NotApplicable\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NotApplicable\",\n            \"rdfs:subClassOf\": [\n                
{\n                    \"@id\": \"bts:Channel\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:ProgressReportNumber\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Not Applicable\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TargetDepth\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The targeted read depth prior to sequencing.\",\n            \"rdfs:label\": \"TargetDepth\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"targetDepth\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AccessRequirements\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Statement describing access requirements for an entity.\",\n            \"rdfs:label\": \"AccessRequirements\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"accessRequirements\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Creator\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"An entity responsible for making the resource.\",\n            \"rdfs:label\": \"Creator\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"creator\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Age\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A numeric value representing age of the individual. Use with `ageUnit`.\",\n            \"rdfs:label\": \"Age\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"age\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                }\n            ],\n            \"sms:validationRules\": [\n                \"num\"\n            ]\n        },\n        {\n            \"@id\": \"bts:AgeUnit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A time unit that can be used with a given age value, e.g. 
years.\",\n            \"rdfs:label\": \"AgeUnit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Seconds\"\n                },\n                {\n                    \"@id\": \"bts:Minutes\"\n                },\n                {\n                    \"@id\": \"bts:Hours\"\n                },\n                {\n                    \"@id\": \"bts:Days\"\n                },\n                {\n                    \"@id\": \"bts:Weeks\"\n                },\n                {\n                    \"@id\": \"bts:Months\"\n                },\n                {\n                    \"@id\": \"bts:Years\"\n                }\n            ],\n            \"sms:displayName\": \"ageUnit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IsCellLine\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Whether or not sample source is a cell line (Yes; No)\",\n            \"rdfs:label\": \"IsCellLine\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Yes\"\n                },\n                {\n                    \"@id\": \"bts:No\"\n                }\n            ],\n            \"sms:displayName\": \"isCellLine\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LibraryPreparationMethod\",\n            
\"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Method by which library was prepared\",\n            \"rdfs:label\": \"LibraryPreparationMethod\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:10x\"\n                },\n                {\n                    \"@id\": \"bts:CEL-seq\"\n                },\n                {\n                    \"@id\": \"bts:Drop-Seq\"\n                },\n                {\n                    \"@id\": \"bts:GTAC@WUSTLin-houseprep\"\n                },\n                {\n                    \"@id\": \"bts:IDTxGenExomeResearchPanel\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaTruSeqDNANano\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaTn5Transposase\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaRibo-ZeroPlus\"\n                },\n                {\n                    \"@id\": \"bts:KAPAHyperPrepKitPCR-free\"\n                },\n                {\n                    \"@id\": \"bts:KAPARNAHyperPrepKitwithRiboErase(HMR)\"\n                },\n                {\n                    \"@id\": \"bts:KAPAmRNAHyperPrepKit\"\n                },\n                {\n                    \"@id\": \"bts:NEBNextmRNALibraryPrepReagentSetforIllumina\"\n                },\n                {\n                    \"@id\": \"bts:Omni-ATAC\"\n                },\n                {\n                    \"@id\": \"bts:QuantSeqFWDV2withUDI\"\n                },\n                {\n                    \"@id\": \"bts:Smart-seq2\"\n                },\n                {\n                    \"@id\": \"bts:Smart-seq4\"\n                },\n      
          {\n                    \"@id\": \"bts:TruSeq\"\n                },\n                {\n                    \"@id\": \"bts:TruSeqstandardtotalRNAlibrarykit\"\n                },\n                {\n                    \"@id\": \"bts:OxfordNanoporeDirectRNASequencingKit\"\n                },\n                {\n                    \"@id\": \"bts:QIAseqFXDNALibraryKit\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"libraryPreparationMethod\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:10x\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"10x\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"10x\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CEL-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CEL-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CEL-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Drop-Seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Drop-Seq\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Drop-Seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GTAC@WUSTLin-houseprep\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GTAC@WUSTLin-houseprep\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GTAC@WUSTL in-house prep\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IDTxGenExomeResearchPanel\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IDTxGenExomeResearchPanel\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"IDT xGen Exome Research Panel\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaTruSeqDNANano\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaTruSeqDNANano\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n 
           ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina TruSeq DNA Nano\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaTn5Transposase\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaTn5Transposase\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina Tn5 Transposase\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaRibo-ZeroPlus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaRibo-ZeroPlus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina Ribo-Zero Plus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:KAPAHyperPrepKitPCR-free\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"KAPAHyperPrepKitPCR-free\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n          
  },\n            \"sms:displayName\": \"KAPA HyperPrep Kit PCR-free\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:KAPARNAHyperPrepKitwithRiboErase(HMR)\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"KAPARNAHyperPrepKitwithRiboErase(HMR)\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"KAPA RNA HyperPrep Kit with RiboErase (HMR)\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:KAPAmRNAHyperPrepKit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"KAPAmRNAHyperPrepKit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"KAPA mRNA HyperPrep Kit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NEBNextmRNALibraryPrepReagentSetforIllumina\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NEBNextmRNALibraryPrepReagentSetforIllumina\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"NEBNext mRNA Library Prep Reagent Set for Illumina\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Omni-ATAC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Omni-ATAC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Omni-ATAC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:QuantSeqFWDV2withUDI\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"QuantSeqFWDV2withUDI\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"QuantSeq FWD V2 with UDI\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Smart-seq2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Smart-seq2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Smart-seq2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            
\"@id\": \"bts:Smart-seq4\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Smart-seq4\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Smart-seq4\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TruSeq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TruSeq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"TruSeq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TruSeqstandardtotalRNAlibrarykit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TruSeqstandardtotalRNAlibrarykit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"TruSeq standard total RNA library kit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OxfordNanoporeDirectRNASequencingKit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"OxfordNanoporeDirectRNASequencingKit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Oxford Nanopore Direct RNA Sequencing Kit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:QIAseqFXDNALibraryKit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"QIAseqFXDNALibraryKit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"QIAseq FX DNA Library Kit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Publisher\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"An entity responsible for making the resource available.\",\n            \"rdfs:label\": \"Publisher\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"publisher\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AverageBaseQuality\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Average base quality collected from samtools\",\n            \"rdfs:label\": \"AverageBaseQuality\",\n            \"rdfs:subClassOf\": 
[\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"averageBaseQuality\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sex\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Phenotypic expression of chromosomal makeup that defines a study subject as male, female, or other.\",\n            \"rdfs:label\": \"Sex\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Male\"\n                },\n                {\n                    \"@id\": \"bts:Female\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                },\n                {\n                    \"@id\": \"bts:NotApplicable\"\n                }\n            ],\n            \"sms:displayName\": \"sex\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Male\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Male\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Sex\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Male\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:Female\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Female\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Sex\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Female\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CreatedBy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Refers to the user who created the resource.\",\n            \"rdfs:label\": \"CreatedBy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"createdBy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OtherTumors\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of other tumors.\",\n            \"rdfs:label\": \"OtherTumors\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"otherTumors\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GenePerturbationType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Specific way in which a single gene was perturbed in a sample\",\n            \"rdfs:label\": \"GenePerturbationType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"genePerturbationType\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PainStatus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Pain status rating.\",\n            \"rdfs:label\": \"PainStatus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Notaproblem\"\n                },\n                {\n                    \"@id\": \"bts:Occasional\"\n                },\n                {\n                    \"@id\": \"bts:Disabling\"\n                }\n            ],\n            \"sms:displayName\": \"painStatus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Notaproblem\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Notaproblem\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:PainStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n            
    \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"not a problem\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Occasional\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Occasional\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:PainStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"occasional\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Disabling\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Disabling\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:PainStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"disabling\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CompoundDose\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A dose quantity for the treatment compound. 
To be used with compoundDoseUnit.\",\n            \"rdfs:label\": \"CompoundDose\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"compoundDose\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:CompoundDoseUnit\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ReadDepth\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"If available, the coverage statistic as output from bedtools coverage or samtools stats.\",\n            \"rdfs:label\": \"ReadDepth\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"readDepth\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"int\"\n            ]\n        },\n        {\n            \"@id\": \"bts:InChIKey\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"//pubchem.ncbi.nlm.nih.gov/compound/10127622#section=InChI-Key).  
This is a more reliable identifier than the compound name and should be used if available.\\n\",\n            \"rdfs:label\": \"InChIKey\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"InChIKey\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ConcentrationMaterial\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Numeric value for concentration of the material\",\n            \"rdfs:label\": \"ConcentrationMaterial\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"concentrationMaterial\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"num\"\n            ]\n        },\n        {\n            \"@id\": \"bts:FileCount\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Number of files in the resource collection.\",\n            \"rdfs:label\": \"FileCount\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"fileCount\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AssayTarget\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Target of the assay such as a HUGO gene symbol, 
cell type, or tissue region depending on the capabilities of the assay.\",\n            \"rdfs:label\": \"AssayTarget\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"assayTarget\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ReadLength\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Number of base pairs (bp) sequenced for a read\",\n            \"rdfs:label\": \"ReadLength\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"readLength\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"int\"\n            ]\n        },\n        {\n            \"@id\": \"bts:TumorTreatmentStatus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Tumor treatment status for the individual.\",\n            \"rdfs:label\": \"TumorTreatmentStatus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Nospecifictherapy\"\n                },\n                {\n                    \"@id\": \"bts:Chemotherapy\"\n                },\n                {\n                    \"@id\": \"bts:Surgery\"\n                },\n                {\n                    
\"@id\": \"bts:Radiation\"\n                },\n                {\n                    \"@id\": \"bts:Targetedtherapy\"\n                },\n                {\n                    \"@id\": \"bts:Clinicaltrial\"\n                }\n            ],\n            \"sms:displayName\": \"tumorTreatmentStatus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nospecifictherapy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nospecifictherapy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorTreatmentStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"no specific therapy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Chemotherapy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Chemotherapy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorTreatmentStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"chemotherapy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Surgery\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Surgery\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorTreatmentStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"surgery\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Radiation\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Radiation\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorTreatmentStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"radiation\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Targetedtherapy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Targetedtherapy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorTreatmentStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"targeted therapy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Clinicaltrial\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Clinicaltrial\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorTreatmentStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"clinical trial\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            
\"@id\": \"bts:StudyId\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Id of study.\",\n            \"rdfs:label\": \"StudyId\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"studyId\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Filename\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The name of the file.\",\n            \"rdfs:label\": \"Filename\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Filename\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DiffuseDermalNeurofibromas\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Diffuse dermal neurofibromas.\",\n            \"rdfs:label\": \"DiffuseDermalNeurofibromas\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Scattered\"\n                },\n                {\n                    \"@id\": \"bts:Dense\"\n                },\n                {\n                    \"@id\": 
\"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"diffuseDermalNeurofibromas\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Scattered\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Scattered\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DiffuseDermalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:NumberOfSchwannomas\"\n                },\n                {\n                    \"@id\": \"bts:DermalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:SubcutaneousNodularNeurofibromas\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"scattered\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Dense\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Dense\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DiffuseDermalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:NumberOfSchwannomas\"\n                },\n                {\n                    \"@id\": \"bts:DermalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:SubcutaneousNodularNeurofibromas\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"dense\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        
{\n            \"@id\": \"bts:ReadStrandOrigin\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The strand from which the read originates in a strand-specific protocol\",\n            \"rdfs:label\": \"ReadStrandOrigin\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Forward\"\n                },\n                {\n                    \"@id\": \"bts:Reverse\"\n                }\n            ],\n            \"sms:displayName\": \"readStrandOrigin\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Forward\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Forward\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ReadStrandOrigin\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"forward\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Reverse\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Reverse\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ReadStrandOrigin\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"reverse\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Series\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Title of a custom series that this resource is part of, if any.\",\n            \"rdfs:label\": \"Series\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"series\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Seconds\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Seconds\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"seconds\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Minutes\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Minutes\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"minutes\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:Hours\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Hours\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hours\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Days\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Days\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"days\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Weeks\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Weeks\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"weeks\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Months\",\n            \"@type\": \"rdfs:Class\",\n  
          \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Months\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"months\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Years\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Years\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"years\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProgressReportNumber\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \" if submitting data for the 6-month milestone report for NTAP, progressReportNumber=1.  
Also if submitting data associated with first milestone, progressReportNumber =1\",\n            \"rdfs:label\": \"ProgressReportNumber\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:1\"\n                },\n                {\n                    \"@id\": \"bts:2\"\n                },\n                {\n                    \"@id\": \"bts:3\"\n                },\n                {\n                    \"@id\": \"bts:4\"\n                },\n                {\n                    \"@id\": \"bts:5\"\n                },\n                {\n                    \"@id\": \"bts:6\"\n                },\n                {\n                    \"@id\": \"bts:7\"\n                },\n                {\n                    \"@id\": \"bts:8\"\n                },\n                {\n                    \"@id\": \"bts:9\"\n                },\n                {\n                    \"@id\": \"bts:10\"\n                },\n                {\n                    \"@id\": \"bts:NotApplicable\"\n                }\n            ],\n            \"sms:displayName\": \"progressReportNumber\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgressReportNumber\"\n                },\n                {\n                    \"@id\": \"bts:WHOPerformanceStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n    
        \"sms:displayName\": \"1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgressReportNumber\"\n                },\n                {\n                    \"@id\": \"bts:WHOPerformanceStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:3\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"3\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgressReportNumber\"\n                },\n                {\n                    \"@id\": \"bts:WHOPerformanceStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"3\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:4\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"4\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgressReportNumber\"\n                },\n                {\n                    \"@id\": \"bts:WHOPerformanceStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"4\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:5\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"5\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgressReportNumber\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"5\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:6\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"6\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgressReportNumber\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"6\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:7\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"7\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgressReportNumber\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"7\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:8\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"8\",\n            \"rdfs:subClassOf\": [\n                {\n    
                \"@id\": \"bts:ProgressReportNumber\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"8\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:9\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"9\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgressReportNumber\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"9\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:10\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"10\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProgressReportNumber\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"10\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SpecimenPreparationMethod\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Term that represents preservation of the sample before usage in, e.g. 
sequencing\",\n            \"rdfs:label\": \"SpecimenPreparationMethod\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Freshcollected\"\n                },\n                {\n                    \"@id\": \"bts:Flashfrozen\"\n                },\n                {\n                    \"@id\": \"bts:FFPE\"\n                },\n                {\n                    \"@id\": \"bts:Cryopreserved\"\n                },\n                {\n                    \"@id\": \"bts:OCT\"\n                },\n                {\n                    \"@id\": \"bts:RNAlater\"\n                },\n                {\n                    \"@id\": \"bts:Formalin-fixed\"\n                },\n                {\n                    \"@id\": \"bts:Ethanol\"\n                },\n                {\n                    \"@id\": \"bts:Viablyfrozen\"\n                }\n            ],\n            \"sms:displayName\": \"specimenPreparationMethod\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Freshcollected\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Freshcollected\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Fresh collected\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Flashfrozen\",\n            
\"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Flashfrozen\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Flash frozen\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FFPE\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FFPE\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FFPE\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cryopreserved\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cryopreserved\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cryopreserved\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OCT\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OCT\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                
}\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"OCT\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RNAlater\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RNAlater\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"RNAlater\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Formalin-fixed\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Formalin-fixed\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"formalin-fixed\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Ethanol\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Ethanol\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ethanol\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Viablyfrozen\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Viablyfrozen\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Viably frozen\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IsMultiIndividual\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Whether or not a file has data for multiple individuals (Yes; No)\",\n            \"rdfs:label\": \"IsMultiIndividual\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Yes\"\n                },\n                {\n                    \"@id\": \"bts:No\"\n                }\n            ],\n            \"sms:displayName\": \"isMultiIndividual\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LibraryStrand\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Strandedness of paired-end RNA-Sequencing data. 
This is an important parameter for RNA-seq analysis.\",\n            \"rdfs:label\": \"LibraryStrand\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:FirstStranded\"\n                },\n                {\n                    \"@id\": \"bts:SecondStranded\"\n                },\n                {\n                    \"@id\": \"bts:Unstranded\"\n                },\n                {\n                    \"@id\": \"bts:NotApplicable\"\n                }\n            ],\n            \"sms:displayName\": \"libraryStrand\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FirstStranded\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FirstStranded\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FirstStranded\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SecondStranded\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SecondStranded\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SecondStranded\",\n     
       \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Unstranded\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Unstranded\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Unstranded\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GIST\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Gastrointestinal stromal tumor (GIST).\",\n            \"rdfs:label\": \"GIST\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"GIST\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Immersion\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Immersion medium\",\n            \"rdfs:label\": \"Immersion\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Air\"\n                },\n                {\n                    \"@id\": \"bts:Oil\"\n                },\n                {\n                    \"@id\": \"bts:Water\"\n                },\n                {\n                    \"@id\": \"bts:Other\"\n                }\n            ],\n            \"sms:displayName\": \"immersion\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Air\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Air\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Immersion\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"air\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Oil\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Oil\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Immersion\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"oil\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Water\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Water\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Immersion\"\n                }\n            ],\n        
    \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"water\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Other\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Other\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataCollectionMode\"\n                },\n                {\n                    \"@id\": \"bts:Immersion\"\n                },\n                {\n                    \"@id\": \"bts:ExpressionUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Other\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CaptureArea\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"  A1, B1, C1, D1 for Visium slides with 6.5 mm Capture Area and A, B for CytAssist slides with 11 mm Capture Area.  
Both CytAssist slides with 6.5 mm Capture Area and Gateway Slides contain only two slide areas, A1 and D1.\\n\",\n            \"rdfs:label\": \"CaptureArea\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:A\"\n                },\n                {\n                    \"@id\": \"bts:B\"\n                },\n                {\n                    \"@id\": \"bts:C\"\n                },\n                {\n                    \"@id\": \"bts:D\"\n                },\n                {\n                    \"@id\": \"bts:A1\"\n                },\n                {\n                    \"@id\": \"bts:B1\"\n                },\n                {\n                    \"@id\": \"bts:C1\"\n                },\n                {\n                    \"@id\": \"bts:D1\"\n                }\n            ],\n            \"sms:displayName\": \"captureArea\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:A\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"A\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CaptureArea\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"A\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:B\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"B\",\n            \"rdfs:subClassOf\": [\n          
      {\n                    \"@id\": \"bts:CaptureArea\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"B\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:D\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"D\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CaptureArea\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"D\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:A1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"A1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CaptureArea\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"A1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:B1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"B1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CaptureArea\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"B1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n    
    {\n            \"@id\": \"bts:C1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"C1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CaptureArea\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"C1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:D1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"D1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CaptureArea\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"D1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ExperimentalFactor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"An ontology concept for experimental factor measured with this data.\",\n            \"rdfs:label\": \"ExperimentalFactor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Braingrowthmeasurement\"\n                },\n                {\n                    \"@id\": \"bts:Brainvolumemeasurement\"\n                },\n                {\n                    \"@id\": \"bts:Bodyweight\"\n                },\n                {\n                    \"@id\": 
\"bts:Clinicallaboratorymeasurement\"\n                },\n                {\n                    \"@id\": \"bts:Cognitivefunctionmeasurement\"\n                },\n                {\n                    \"@id\": \"bts:Gaitmeasurement\"\n                },\n                {\n                    \"@id\": \"bts:Motordevelopmentmeasurement\"\n                },\n                {\n                    \"@id\": \"bts:Painmeasurement\"\n                }\n            ],\n            \"sms:displayName\": \"experimentalFactor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Braingrowthmeasurement\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Braingrowthmeasurement\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExperimentalFactor\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"brain growth measurement\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Brainvolumemeasurement\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Brainvolumemeasurement\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExperimentalFactor\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"brain volume measurement\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bodyweight\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": 
\"TBD\",\n            \"rdfs:label\": \"Bodyweight\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExperimentalFactor\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"body weight\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Clinicallaboratorymeasurement\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Clinicallaboratorymeasurement\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExperimentalFactor\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"clinical laboratory measurement\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cognitivefunctionmeasurement\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cognitivefunctionmeasurement\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExperimentalFactor\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cognitive function measurement\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Gaitmeasurement\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Gaitmeasurement\",\n            \"rdfs:subClassOf\": [\n                {\n               
     \"@id\": \"bts:ExperimentalFactor\"\n                },\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"gait measurement\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Motordevelopmentmeasurement\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Motordevelopmentmeasurement\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExperimentalFactor\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"motor development measurement\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Painmeasurement\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Painmeasurement\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExperimentalFactor\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"pain measurement\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Stature\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Stature of the individual.\",\n            \"rdfs:label\": \"Stature\",\n            \"rdfs:subClassOf\": [\n                
{\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:<5thcentile\"\n                },\n                {\n                    \"@id\": \"bts:5th-95thcentile\"\n                },\n                {\n                    \"@id\": \"bts:>95thcentile\"\n                }\n            ],\n            \"sms:displayName\": \"stature\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:<5thcentile\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"<5thcentile\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Stature\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"< 5th centile\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:5th-95thcentile\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"5th-95thcentile\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Stature\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"5th-95th centile\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:>95thcentile\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\">95thcentile\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Stature\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"> 95th centile\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ReadPair\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The read of origin, Read 1 or Read 2\",\n            \"rdfs:label\": \"ReadPair\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"readPair\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"inRange 1 2\"\n            ]\n        },\n        {\n            \"@id\": \"bts:WorkingDistanceUnit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Unit for working distance.\",\n            \"rdfs:label\": \"WorkingDistanceUnit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Angstrom\"\n                },\n                {\n                    \"@id\": \"bts:Nanometer\"\n                },\n                {\n                    \"@id\": \"bts:Micrometer\"\n                },\n                {\n                    \"@id\": \"bts:Millimeter\"\n                },\n                {\n                    \"@id\": \"bts:Centimeter\"\n                
}\n            ],\n            \"sms:displayName\": \"workingDistanceUnit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Angstrom\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Angstrom\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:WorkingDistanceUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"angstrom\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nanometer\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nanometer\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:WorkingDistanceUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"nanometer\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Micrometer\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Micrometer\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:WorkingDistanceUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"micrometer\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Millimeter\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Millimeter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:WorkingDistanceUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"millimeter\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Centimeter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Centimeter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:WorkingDistanceUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"centimeter\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NonopticGlioma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Nonoptic glioma.\",\n            \"rdfs:label\": \"NonopticGlioma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"nonopticGlioma\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GermlineMutationIndicator\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Indicates a summary testing result for individual's germline mutation only.\",\n            \"rdfs:label\": \"GermlineMutationIndicator\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Nottested\"\n                },\n                {\n                    \"@id\": \"bts:Testedbutunknown\"\n                }\n            ],\n            \"sms:displayName\": \"germlineMutationIndicator\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nottested\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nottested\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:GermlineMutationIndicator\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"not tested\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Testedbutunknown\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Testedbutunknown\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:GermlineMutationIndicator\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n          
  },\n            \"sms:displayName\": \"tested but unknown\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AntibodyID\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Antibody ID such as RRID if available, otherwise use vendor ID.\",\n            \"rdfs:label\": \"AntibodyID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"antibodyID\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SpecimenIDSource\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Optional annotation describing where the specimen ID source derived from, e.g. 
the biobank or lab that provided the samples.\",\n            \"rdfs:label\": \"SpecimenIDSource\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"specimenIDSource\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VestibularSchwannoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Vestibular schwannoma.\",\n            \"rdfs:label\": \"VestibularSchwannoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Notimaged\"\n                },\n                {\n                    \"@id\": \"bts:Absentbyimaging\"\n                },\n                {\n                    \"@id\": \"bts:Unilateral\"\n                },\n                {\n                    \"@id\": \"bts:Bilateral\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"vestibularSchwannoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Unilateral\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Unilateral\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:VestibularSchwannoma\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"unilateral\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bilateral\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bilateral\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:VestibularSchwannoma\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bilateral\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IsPairedEnd\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"(Legacy/deprecated annotation) Whether or not is paired-end sequencing (Yes; No).\",\n            \"rdfs:label\": \"IsPairedEnd\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Yes\"\n                },\n                {\n                    \"@id\": \"bts:No\"\n                }\n            ],\n            \"sms:displayName\": \"isPairedEnd\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PlateName\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"User-specified identifier of the plate used to prepare the sample for analysis.\",\n            \"rdfs:label\": \"PlateName\",\n            \"rdfs:subClassOf\": [\n                
{\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"plateName\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NumberOfSchwannomas\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Number of schwannomas.\",\n            \"rdfs:label\": \"NumberOfSchwannomas\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Single\"\n                },\n                {\n                    \"@id\": \"bts:Scattered\"\n                },\n                {\n                    \"@id\": \"bts:Dense\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"numberOfSchwannomas\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DataStatus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Overall status of data in a study.\",\n            \"rdfs:label\": \"DataStatus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:DataNotExpected\"\n                },\n                {\n                    \"@id\": 
\"bts:DataPending\"\n                },\n                {\n                    \"@id\": \"bts:UnderEmbargo\"\n                },\n                {\n                    \"@id\": \"bts:RollingRelease\"\n                },\n                {\n                    \"@id\": \"bts:PartiallyAvailable\"\n                },\n                {\n                    \"@id\": \"bts:Available\"\n                }\n            ],\n            \"sms:displayName\": \"dataStatus\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DataNotExpected\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DataNotExpected\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Data Not Expected\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DataPending\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DataPending\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Data Pending\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UnderEmbargo\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UnderEmbargo\",\n            \"rdfs:subClassOf\": [\n                {\n                    
\"@id\": \"bts:DataStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Under Embargo\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RollingRelease\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RollingRelease\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Rolling Release\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PartiallyAvailable\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PartiallyAvailable\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Partially Available\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Available\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Available\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Available\",\n            
\"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ContentSize\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"(Files only) File size, usually calculated by the backend.\",\n            \"rdfs:label\": \"ContentSize\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"contentSize\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TargetCaptureKitID\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A unique identifier for the kit used to construct a genomic library using target capture-based techniques, which should be composed of the vendor name, kit name and kit version.\\n\",\n            \"rdfs:label\": \"TargetCaptureKitID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"targetCaptureKitID\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AliquotID\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A unique identifier (non-PII) that represents the aliquots used for e.g. replicate runs. 
This is linked to the specimenID.\",\n            \"rdfs:label\": \"AliquotID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"aliquotID\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Initiative\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Refers to a funding initiative. Typically handled by the DCC.\",\n            \"rdfs:label\": \"Initiative\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"initiative\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NominalMagnification\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"magnification of the lens as specified by the manufacturer - i.e. 
'60' is a 60X lens.\",\n            \"rdfs:label\": \"NominalMagnification\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"nominalMagnification\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"num\"\n            ]\n        },\n        {\n            \"@id\": \"bts:GenePerturbationTechnology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Technology used to perturb gene\",\n            \"rdfs:label\": \"GenePerturbationTechnology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:RNAi\"\n                },\n                {\n                    \"@id\": \"bts:CRISPR\"\n                },\n                {\n                    \"@id\": \"bts:CRERecombinase\"\n                }\n            ],\n            \"sms:displayName\": \"genePerturbationTechnology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RNAi\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RNAi\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:GenePerturbationTechnology\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"RNAi\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CRISPR\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CRISPR\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:GenePerturbationTechnology\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CRISPR\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CRERecombinase\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CRERecombinase\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:GenePerturbationTechnology\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CRE Recombinase\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RelatedResource\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A related resource.\",\n            \"rdfs:label\": \"RelatedResource\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"relatedResource\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SpecimenID\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A unique identifier 
(non-PII) that represents the subspecimen (subsample) from which the data came,  e.g. an ID that distinguishes between different parts of the same parent tumor specimen.\\n\",\n            \"rdfs:label\": \"SpecimenID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"specimenID\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Comments\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Brief free-text comments that may also be important to understanding the resource.\",\n            \"rdfs:label\": \"Comments\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"comments\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FileFormat\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Defined format of the data file, typically corresponding to extension, but sometimes indicating more general group of files produced by the same tool or software\",\n            \"rdfs:label\": \"FileFormat\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:7z\"\n                },\n                {\n                    
\"@id\": \"bts:DICOM\"\n                },\n                {\n                    \"@id\": \"bts:MATLABdata\"\n                },\n                {\n                    \"@id\": \"bts:MATLABscript\"\n                },\n                {\n                    \"@id\": \"bts:NWB\"\n                },\n                {\n                    \"@id\": \"bts:PAR\"\n                },\n                {\n                    \"@id\": \"bts:Pythonscript\"\n                },\n                {\n                    \"@id\": \"bts:Rscript\"\n                },\n                {\n                    \"@id\": \"bts:RCC\"\n                },\n                {\n                    \"@id\": \"bts:RData\"\n                },\n                {\n                    \"@id\": \"bts:REC\"\n                },\n                {\n                    \"@id\": \"bts:SDAT\"\n                },\n                {\n                    \"@id\": \"bts:SPAR\"\n                },\n                {\n                    \"@id\": \"bts:Sentrixdescriptorfile\"\n                },\n                {\n                    \"@id\": \"bts:Ab1\"\n                },\n                {\n                    \"@id\": \"bts:Abf\"\n                },\n                {\n                    \"@id\": \"bts:Ai\"\n                },\n                {\n                    \"@id\": \"bts:Avi\"\n                },\n                {\n                    \"@id\": \"bts:Bai\"\n                },\n                {\n                    \"@id\": \"bts:Bam\"\n                },\n                {\n                    \"@id\": \"bts:Bashscript\"\n                },\n                {\n                    \"@id\": \"bts:Bcf\"\n                },\n                {\n                    \"@id\": \"bts:Bed\"\n                },\n                {\n                    \"@id\": \"bts:BedbroadPeak\"\n                },\n                {\n                    \"@id\": \"bts:BedgappedPeak\"\n                },\n                
{\n                    \"@id\": \"bts:BednarrowPeak\"\n                },\n                {\n                    \"@id\": \"bts:Bedgraph\"\n                },\n                {\n                    \"@id\": \"bts:Bgzip\"\n                },\n                {\n                    \"@id\": \"bts:Bigwig\"\n                },\n                {\n                    \"@id\": \"bts:Bmp\"\n                },\n                {\n                    \"@id\": \"bts:Bpm\"\n                },\n                {\n                    \"@id\": \"bts:Cel\"\n                },\n                {\n                    \"@id\": \"bts:Chp\"\n                },\n                {\n                    \"@id\": \"bts:Cnn\"\n                },\n                {\n                    \"@id\": \"bts:Cnr\"\n                },\n                {\n                    \"@id\": \"bts:Cns\"\n                },\n                {\n                    \"@id\": \"bts:Cram\"\n                },\n                {\n                    \"@id\": \"bts:Crai\"\n                },\n                {\n                    \"@id\": \"bts:Csi\"\n                },\n                {\n                    \"@id\": \"bts:Csv\"\n                },\n                {\n                    \"@id\": \"bts:Ctab\"\n                },\n                {\n                    \"@id\": \"bts:Czi\"\n                },\n                {\n                    \"@id\": \"bts:Dat\"\n                },\n                {\n                    \"@id\": \"bts:Dna\"\n                },\n                {\n                    \"@id\": \"bts:Doc\"\n                },\n                {\n                    \"@id\": \"bts:Dockerimage\"\n                },\n                {\n                    \"@id\": \"bts:Dup\"\n                },\n                {\n                    \"@id\": \"bts:Edat3\"\n                },\n                {\n                    \"@id\": \"bts:Excel\"\n                },\n                {\n                  
  \"@id\": \"bts:Fasta\"\n                },\n                {\n                    \"@id\": \"bts:Fastq\"\n                },\n                {\n                    \"@id\": \"bts:Fcs\"\n                },\n                {\n                    \"@id\": \"bts:Fig\"\n                },\n                {\n                    \"@id\": \"bts:Flagstat\"\n                },\n                {\n                    \"@id\": \"bts:Gb\"\n                },\n                {\n                    \"@id\": \"bts:Gct\"\n                },\n                {\n                    \"@id\": \"bts:Gff3\"\n                },\n                {\n                    \"@id\": \"bts:Gtf\"\n                },\n                {\n                    \"@id\": \"bts:Gzip\"\n                },\n                {\n                    \"@id\": \"bts:Hdf5\"\n                },\n                {\n                    \"@id\": \"bts:Hdr\"\n                },\n                {\n                    \"@id\": \"bts:Hic\"\n                },\n                {\n                    \"@id\": \"bts:Html\"\n                },\n                {\n                    \"@id\": \"bts:Hyperlink\"\n                },\n                {\n                    \"@id\": \"bts:Idat\"\n                },\n                {\n                    \"@id\": \"bts:Idx\"\n                },\n                {\n                    \"@id\": \"bts:Img\"\n                },\n                {\n                    \"@id\": \"bts:Jpg\"\n                },\n                {\n                    \"@id\": \"bts:Js\"\n                },\n                {\n                    \"@id\": \"bts:Json\"\n                },\n                {\n                    \"@id\": \"bts:Lif\"\n                },\n                {\n                    \"@id\": \"bts:Locs\"\n                },\n                {\n                    \"@id\": \"bts:Maf\"\n                },\n                {\n                    \"@id\": \"bts:Md\"\n             
   },\n                {\n                    \"@id\": \"bts:Mov\"\n                },\n                {\n                    \"@id\": \"bts:MPEG-4\"\n                },\n                {\n                    \"@id\": \"bts:Msf\"\n                },\n                {\n                    \"@id\": \"bts:Mtx\"\n                },\n                {\n                    \"@id\": \"bts:MzML\"\n                },\n                {\n                    \"@id\": \"bts:Nii\"\n                },\n                {\n                    \"@id\": \"bts:Ome-tiff\"\n                },\n                {\n                    \"@id\": \"bts:Pdf\"\n                },\n                {\n                    \"@id\": \"bts:Plink\"\n                },\n                {\n                    \"@id\": \"bts:Png\"\n                },\n                {\n                    \"@id\": \"bts:Powerpoint\"\n                },\n                {\n                    \"@id\": \"bts:Pzfx\"\n                },\n                {\n                    \"@id\": \"bts:Psydat\"\n                },\n                {\n                    \"@id\": \"bts:Raw\"\n                },\n                {\n                    \"@id\": \"bts:Rds\"\n                },\n                {\n                    \"@id\": \"bts:Recal\"\n                },\n                {\n                    \"@id\": \"bts:Rmd\"\n                },\n                {\n                    \"@id\": \"bts:Sam\"\n                },\n                {\n                    \"@id\": \"bts:Sav\"\n                },\n                {\n                    \"@id\": \"bts:Sdf\"\n                },\n                {\n                    \"@id\": \"bts:Seg\"\n                },\n                {\n                    \"@id\": \"bts:Sf\"\n                },\n                {\n                    \"@id\": \"bts:Sif\"\n                },\n                {\n                    \"@id\": \"bts:Sqlite\"\n                },\n                {\n    
                \"@id\": \"bts:Sra\"\n                },\n                {\n                    \"@id\": \"bts:Svg\"\n                },\n                {\n                    \"@id\": \"bts:Svs\"\n                },\n                {\n                    \"@id\": \"bts:TagAlign\"\n                },\n                {\n                    \"@id\": \"bts:Tar\"\n                },\n                {\n                    \"@id\": \"bts:Tbi\"\n                },\n                {\n                    \"@id\": \"bts:Tif\"\n                },\n                {\n                    \"@id\": \"bts:Tom\"\n                },\n                {\n                    \"@id\": \"bts:Tranches\"\n                },\n                {\n                    \"@id\": \"bts:Tsv\"\n                },\n                {\n                    \"@id\": \"bts:Txt\"\n                },\n                {\n                    \"@id\": \"bts:Vcf\"\n                },\n                {\n                    \"@id\": \"bts:Wiggle\"\n                },\n                {\n                    \"@id\": \"bts:Xml\"\n                },\n                {\n                    \"@id\": \"bts:Yaml\"\n                },\n                {\n                    \"@id\": \"bts:Zip\"\n                }\n            ],\n            \"sms:displayName\": \"fileFormat\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:7z\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"7z\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"7z\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        
},\n        {\n            \"@id\": \"bts:DICOM\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DICOM\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DICOM\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MATLABdata\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MATLABdata\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MATLAB data\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MATLABscript\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MATLABscript\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MATLAB script\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NWB\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NWB\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n          
      }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NWB\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PAR\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PAR\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PAR\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Pythonscript\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Pythonscript\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Python script\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Rscript\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Rscript\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"R script\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n          
  \"@id\": \"bts:RCC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RCC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"RCC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RData\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RData\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"RData\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:REC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"REC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"REC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SDAT\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SDAT\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n            
    \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SDAT\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SPAR\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SPAR\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SPAR\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sentrixdescriptorfile\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sentrixdescriptorfile\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Sentrix descriptor file\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Ab1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Ab1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ab1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Abf\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Abf\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"abf\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Ai\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Ai\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ai\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Avi\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Avi\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"avi\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bai\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bai\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            
},\n            \"sms:displayName\": \"bai\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bam\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bam\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bam\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bashscript\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bashscript\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bash script\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bcf\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bcf\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bcf\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bed\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bed\",\n      
      \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bed\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BedbroadPeak\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BedbroadPeak\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bed broadPeak\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BedgappedPeak\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BedgappedPeak\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bed gappedPeak\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BednarrowPeak\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BednarrowPeak\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"bed narrowPeak\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bedgraph\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bedgraph\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bedgraph\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bgzip\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bgzip\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bgzip\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bigwig\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bigwig\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bigwig\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bmp\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bmp\",\n   
         \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bmp\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bpm\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bpm\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bpm\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cel\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cel\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cel\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Chp\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Chp\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"chp\",\n            \"sms:required\": \"sms:false\",\n    
        \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cnn\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cnn\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cnn\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cnr\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cnr\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cnr\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cns\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cns\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cns\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cram\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cram\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                
}\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cram\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Crai\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Crai\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"crai\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Csi\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Csi\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"csi\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Csv\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Csv\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"csv\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Ctab\",\n            
\"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Ctab\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ctab\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Czi\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Czi\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"czi\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Dat\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Dat\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"dat\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Dna\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Dna\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"dna\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Doc\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Doc\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"doc\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Dockerimage\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Dockerimage\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"docker image\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Dup\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Dup\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"dup\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Edat3\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": 
\"TBD\",\n            \"rdfs:label\": \"Edat3\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"edat3\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Excel\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Excel\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"excel\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Fasta\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Fasta\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"fasta\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Fastq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Fastq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"fastq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Fcs\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Fcs\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"fcs\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Fig\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Fig\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"fig\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Flagstat\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Flagstat\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"flagstat\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Gb\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Gb\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"gb\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Gct\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Gct\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"gct\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Gff3\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Gff3\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"gff3\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Gtf\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Gtf\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"gtf\",\n            \"sms:required\": \"sms:false\",\n           
 \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Gzip\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Gzip\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"gzip\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Hdf5\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Hdf5\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hdf5\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Hdr\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Hdr\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hdr\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Hic\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Hic\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n   
         ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hic\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Html\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Html\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"html\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Hyperlink\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Hyperlink\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hyperlink\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Idat\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Idat\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"idat\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Idx\",\n    
        \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Idx\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"idx\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Img\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Img\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"img\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Jpg\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Jpg\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"jpg\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Js\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Js\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"js\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Json\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Json\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"json\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Lif\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Lif\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"lif\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Locs\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Locs\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"locs\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Maf\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"Maf\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"maf\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Md\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Md\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"md\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mov\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mov\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mov\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MPEG-4\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MPEG-4\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MPEG-4\",\n            
\"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Msf\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Msf\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"msf\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mtx\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mtx\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mtx\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MzML\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MzML\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mzML\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nii\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nii\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": 
\"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"nii\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Ome-tiff\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Ome-tiff\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ome-tiff\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Pdf\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Pdf\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"pdf\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Plink\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Plink\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"plink\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        
{\n            \"@id\": \"bts:Png\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Png\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"png\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Powerpoint\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Powerpoint\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"powerpoint\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Pzfx\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Pzfx\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"pzfx\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Psydat\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Psydat\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"psydat\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Raw\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Raw\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"raw\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Rds\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Rds\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"rds\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Recal\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Recal\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"recal\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n 
       {\n            \"@id\": \"bts:Rmd\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Rmd\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"rmd\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sam\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sam\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sam\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sav\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sav\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sav\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sdf\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sdf\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": 
{\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sdf\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Seg\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Seg\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"seg\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sf\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sf\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sf\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sif\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sif\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sif\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sqlite\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": 
\"TBD\",\n            \"rdfs:label\": \"Sqlite\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sqlite\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sra\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sra\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sra\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Svg\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Svg\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"svg\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Svs\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Svs\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"svs\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TagAlign\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TagAlign\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"tagAlign\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tar\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tar\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"tar\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tbi\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tbi\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"tbi\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tif\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tif\",\n            \"rdfs:subClassOf\": [\n              
  {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"tif\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tom\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tom\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"tom\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tranches\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tranches\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"tranches\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tsv\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tsv\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"tsv\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Txt\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Txt\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"txt\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Vcf\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Vcf\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"vcf\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Wiggle\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Wiggle\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"wiggle\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Xml\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Xml\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n 
           ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"xml\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Yaml\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Yaml\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"yaml\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Zip\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Zip\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:FileFormat\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"zip\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf2Genotype\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Genotype of NF2 gene in the biospecimen from which the data were derived, if known\",\n            \"rdfs:label\": \"Nf2Genotype\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:-/-\"\n                
},\n                {\n                    \"@id\": \"bts:+/-\"\n                },\n                {\n                    \"@id\": \"bts:+/+\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"nf2Genotype\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PeripheralNeuropathy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Peripheral neuropathy.\",\n            \"rdfs:label\": \"PeripheralNeuropathy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"peripheralNeuropathy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Component\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Type of metadata template; provide the same one for all items/rows.\",\n            \"rdfs:label\": \"Component\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Component\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": 
[]\n        },\n        {\n            \"@id\": \"bts:Scoliosis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Scoliosis.\",\n            \"rdfs:label\": \"Scoliosis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"scoliosis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Species\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The name of a species (typically a taxonomic group) of organism.\",\n            \"rdfs:label\": \"Species\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Rattusnorvegicus\"\n                },\n                {\n                    \"@id\": \"bts:Gallusgallus\"\n                },\n                {\n                    \"@id\": \"bts:Pantroglodytes\"\n                },\n                {\n                    \"@id\": \"bts:Musmusculus(humanized)\"\n                },\n                {\n                    \"@id\": \"bts:Homosapiens\"\n                },\n                {\n                    \"@id\": \"bts:Daniorerio\"\n         
       },\n                {\n                    \"@id\": \"bts:Drosophilamelanogaster\"\n                },\n                {\n                    \"@id\": \"bts:Rhesusmacaque\"\n                },\n                {\n                    \"@id\": \"bts:Susscrofa\"\n                },\n                {\n                    \"@id\": \"bts:Oryctolaguscuniculus\"\n                },\n                {\n                    \"@id\": \"bts:Musmusculus\"\n                }\n            ],\n            \"sms:displayName\": \"species\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Rattusnorvegicus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Rattusnorvegicus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Species\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Rattus norvegicus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Gallusgallus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Gallusgallus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Species\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Gallus gallus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Pantroglodytes\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Pantroglodytes\",\n        
    \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Species\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Pan troglodytes\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Musmusculus(humanized)\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Musmusculus(humanized)\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Species\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Mus musculus (humanized)\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Homosapiens\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Homosapiens\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Species\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Homo sapiens\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Daniorerio\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Daniorerio\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Species\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            
},\n            \"sms:displayName\": \"Danio rerio\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Drosophilamelanogaster\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Drosophilamelanogaster\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Species\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Drosophila melanogaster\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Rhesusmacaque\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Rhesusmacaque\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Species\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Rhesus macaque\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Susscrofa\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Susscrofa\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Species\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Sus scrofa\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Oryctolaguscuniculus\",\n            
\"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Oryctolaguscuniculus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Species\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Oryctolagus cuniculus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Musmusculus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Musmusculus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Species\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Mus musculus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ConcreteType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Refers to the class model the data platform uses for representing the resource. 
This is a low-level field set by the platform and is not a user annotation.\",\n            \"rdfs:label\": \"ConcreteType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"concreteType\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Allograft\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Allograft\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"allograft\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Xenograft\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Xenograft\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"xenograft\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Autograft\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Autograft\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationType\"\n                }\n         
   ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"autograft\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Isograft\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Isograft\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"isograft\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:YearPublished\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Year in which the resource was published/generally made available.\",\n            \"rdfs:label\": \"YearPublished\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"yearPublished\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"int\"\n            ]\n        },\n        {\n            \"@id\": \"bts:FundingAgency\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Refers to the funding organization for the generated resource. 
This annotation is handled by the DCC.\",\n            \"rdfs:label\": \"FundingAgency\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"fundingAgency\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IntellectualDisability\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Intellectual disability.\",\n            \"rdfs:label\": \"IntellectualDisability\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"intellectualDisability\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PlexiformNeurofibromas\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Plexiform neurofibromas.\",\n            \"rdfs:label\": \"PlexiformNeurofibromas\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:AbsentbyMRI\"\n                },\n                {\n                    \"@id\": \"bts:Absentclinically-noMRI\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"plexiformNeurofibromas\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AbsentbyMRI\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AbsentbyMRI\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:PlexiformNeurofibromas\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"absent by MRI\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Absentclinically-noMRI\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Absentclinically-noMRI\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:PlexiformNeurofibromas\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"absent clinically - no MRI\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MPNSTCharacterization\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of MPNST.\",\n            \"rdfs:label\": 
\"MPNSTCharacterization\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"MPNSTCharacterization\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NucleicAcidSource\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Source of the extracted nucleic acid used in the experiment\",\n            \"rdfs:label\": \"NucleicAcidSource\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Bulkcell\"\n                },\n                {\n                    \"@id\": \"bts:Bulknuclei\"\n                },\n                {\n                    \"@id\": \"bts:Mitochondria\"\n                },\n                {\n                    \"@id\": \"bts:Singlecell\"\n                },\n                {\n                    \"@id\": \"bts:Singlenucleus\"\n                }\n            ],\n            \"sms:displayName\": \"nucleicAcidSource\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bulkcell\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bulkcell\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bulk cell\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bulknuclei\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bulknuclei\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bulk nuclei\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mitochondria\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mitochondria\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:ProteinExtractSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mitochondria\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Singlecell\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Singlecell\",\n            \"rdfs:subClassOf\": [\n                {\n                    
\"@id\": \"bts:NucleicAcidSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"single cell\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Singlenucleus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Singlenucleus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"single nucleus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MeanCoverage\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Mean coverage for whole genome sequencing, or mean target coverage for whole exome and targeted sequencing, collected from Picard Tools\",\n            \"rdfs:label\": \"MeanCoverage\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"meanCoverage\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Channel\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Color channel used to generate data file.\",\n            \"rdfs:label\": \"Channel\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Cy3\"\n                },\n                {\n                    \"@id\": \"bts:Cy5\"\n                },\n                {\n                    \"@id\": \"bts:NotApplicable\"\n                }\n            ],\n            \"sms:displayName\": \"channel\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cy3\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cy3\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Channel\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cy3\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cy5\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cy5\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Channel\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cy5\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GenePerturbed\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The HUGO gene symbol for the gene that is perturbed.\",\n            \"rdfs:label\": \"GenePerturbed\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n               
 }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"genePerturbed\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Type\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Refers to the type of the resource on the platform, e.g. “file”.\",\n            \"rdfs:label\": \"Type\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"type\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Etag\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Synapse employs an Optimistic Concurrency Control (OCC) scheme to handle concurrent updates. The E-Tag changes every time an entity is updated it is used to detect when a client's current representation of an entity is out-of-date.\",\n            \"rdfs:label\": \"Etag\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"etag\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RuntimePlatform\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Runtime platform or script interpreter dependencies (e.g. 
Java v1, Python 2.3).\",\n            \"rdfs:label\": \"RuntimePlatform\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"runtimePlatform\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1SomaticMutation\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"NF1 somatic mutation, i.e. in tumor samples.\",\n            \"rdfs:label\": \"Nf1SomaticMutation\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"nf1SomaticMutation\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Manifestation\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"An associated phenotype characteristic.\",\n            \"rdfs:label\": \"Manifestation\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"manifestation\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AqueductalStenosis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of aqueductal stenosis.\",\n            \"rdfs:label\": \"AqueductalStenosis\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"aqueductalStenosis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IrisLischNodules\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Iris Lisch Nodules.\",\n            \"rdfs:label\": \"IrisLischNodules\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"IrisLischNodules\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ConcentrationNaCl\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Numeric value for NaCl concentration\",\n            \"rdfs:label\": \"ConcentrationNaCl\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n    
        ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"concentrationNaCl\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ResourceType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The type of resource being stored and annotated\",\n            \"rdfs:label\": \"ResourceType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:ExperimentalData\"\n                },\n                {\n                    \"@id\": \"bts:Result\"\n                },\n                {\n                    \"@id\": \"bts:Tool\"\n                },\n                {\n                    \"@id\": \"bts:Workflowreport\"\n                },\n                {\n                    \"@id\": \"bts:Report\"\n                },\n                {\n                    \"@id\": \"bts:Metadata\"\n                },\n                {\n                    \"@id\": \"bts:Protocol\"\n                },\n                {\n                    \"@id\": \"bts:Weblink\"\n                }\n            ],\n            \"sms:displayName\": \"resourceType\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ExperimentalData\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ExperimentalData\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ResourceType\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"experimentalData\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Result\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Result\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ResourceType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"result\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tool\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tool\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ResourceType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"tool\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Workflowreport\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Workflowreport\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ResourceType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"workflow report\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:Report\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Report\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ResourceType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"report\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Protocol\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Protocol\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ResourceType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"protocol\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Weblink\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Weblink\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ResourceType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"weblink\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DissociationMethod\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Procedure by which a biological specimen is dissociated into individual cells or a cell suspension\",\n            \"rdfs:label\": \"DissociationMethod\",\n            \"rdfs:subClassOf\": [\n   
             {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:10xV2\"\n                },\n                {\n                    \"@id\": \"bts:FACS\"\n                },\n                {\n                    \"@id\": \"bts:FluidigmC1\"\n                },\n                {\n                    \"@id\": \"bts:Drop-seq\"\n                },\n                {\n                    \"@id\": \"bts:InDrop\"\n                },\n                {\n                    \"@id\": \"bts:Mouthpipette\"\n                },\n                {\n                    \"@id\": \"bts:Enzymatic\"\n                },\n                {\n                    \"@id\": \"bts:Mechanical\"\n                }\n            ],\n            \"sms:displayName\": \"dissociationMethod\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:10xV2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"10xV2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DissociationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"10x_v2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FACS\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FACS\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DissociationMethod\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FACS\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FluidigmC1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FluidigmC1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DissociationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Fluidigm C1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Drop-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Drop-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DissociationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"drop-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:InDrop\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"InDrop\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DissociationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"inDrop\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            
\"@id\": \"bts:Mouthpipette\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mouthpipette\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DissociationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mouth pipette\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Enzymatic\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Enzymatic\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DissociationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"enzymatic\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mechanical\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mechanical\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DissociationMethod\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mechanical\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MaterialType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Type of material in the characterization\",\n            \"rdfs:label\": \"MaterialType\",\n            \"rdfs:subClassOf\": [\n               
 {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Nanoparticles\"\n                },\n                {\n                    \"@id\": \"bts:Polymericnanoparticles\"\n                },\n                {\n                    \"@id\": \"bts:Smallmolecule\"\n                },\n                {\n                    \"@id\": \"bts:DNA\"\n                }\n            ],\n            \"sms:displayName\": \"materialType\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nanoparticles\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nanoparticles\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:MaterialType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"nanoparticles\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Polymericnanoparticles\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Polymericnanoparticles\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:MaterialType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Polymeric nanoparticles\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:Smallmolecule\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Smallmolecule\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:MaterialType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"small molecule\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:MaterialType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DNA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AuxiliaryAsset\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"URI to supplemental asset(s), e.g. 
QC reports or other auxiliary files to support the processing, analysis, or interpretation of the current entity.\\n\",\n            \"rdfs:label\": \"AuxiliaryAsset\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"auxiliaryAsset\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HeartDefect\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Heart defect.\",\n            \"rdfs:label\": \"HeartDefect\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"heartDefect\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Doi\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Digital object identifier of the resource.\",\n            \"rdfs:label\": \"Doi\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"doi\",\n           
 \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ReadsDuplicatedPercent\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Percent of duplicated reads collected from samtools\",\n            \"rdfs:label\": \"ReadsDuplicatedPercent\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"readsDuplicatedPercent\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Pheochromocytoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Pheochromocytoma.\",\n            \"rdfs:label\": \"Pheochromocytoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"pheochromocytoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IsMultiSpecimen\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Whether or not a file has data for multiple specimens (Yes; No)\",\n            \"rdfs:label\": \"IsMultiSpecimen\",\n            \"rdfs:subClassOf\": [\n                {\n  
                  \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Yes\"\n                },\n                {\n                    \"@id\": \"bts:No\"\n                }\n            ],\n            \"sms:displayName\": \"isMultiSpecimen\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MRISequence\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A mode used in MRI imaging\",\n            \"rdfs:label\": \"MRISequence\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:PD-weighted\"\n                },\n                {\n                    \"@id\": \"bts:ShortTauInversionRecovery\"\n                },\n                {\n                    \"@id\": \"bts:T1-weighted\"\n                },\n                {\n                    \"@id\": \"bts:T2-weighted\"\n                }\n            ],\n            \"sms:displayName\": \"MRISequence\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PD-weighted\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PD-weighted\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:MRISequence\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n        
    },\n            \"sms:displayName\": \"PD-weighted\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ShortTauInversionRecovery\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ShortTauInversionRecovery\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:MRISequence\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Short Tau Inversion Recovery\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:T1-weighted\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"T1-weighted\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:MRISequence\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"T1-weighted\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:T2-weighted\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"T2-weighted\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:MRISequence\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"T2-weighted\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:DatasetSizeInBytes\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Size of dataset entity in bytes. Auto-calculated by Synapse.\",\n            \"rdfs:label\": \"DatasetSizeInBytes\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"datasetSizeInBytes\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DiagnosisAgeGroup\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Age group of the individual at the time of diagnosis.\",\n            \"rdfs:label\": \"DiagnosisAgeGroup\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Infancy\"\n                },\n                {\n                    \"@id\": \"bts:Childhood\"\n                },\n                {\n                    \"@id\": \"bts:Adolescence\"\n                },\n                {\n                    \"@id\": \"bts:Adulthood\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"diagnosisAgeGroup\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Infancy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Infancy\",\n            \"rdfs:subClassOf\": [\n                {\n                    
\"@id\": \"bts:DiagnosisAgeGroup\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"infancy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Childhood\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Childhood\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DiagnosisAgeGroup\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"childhood\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Adolescence\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Adolescence\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DiagnosisAgeGroup\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"adolescence\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Adulthood\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Adulthood\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DiagnosisAgeGroup\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"adulthood\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:YearProcessed\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Year in which the resource was processed/derived, if applicable.  This is only required for data processed by NF-OSI for tracking purposes, optional for community-contributed datasets. \\n\",\n            \"rdfs:label\": \"YearProcessed\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"yearProcessed\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"int\"\n            ]\n        },\n        {\n            \"@id\": \"bts:ReferenceSequence\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Syntactic sequences that has a role as reference of an annotation.\",\n            \"rdfs:label\": \"ReferenceSequence\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"referenceSequence\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Title\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Title of a resource.\",\n            \"rdfs:label\": \"Title\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"title\",\n 
           \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Documentation\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"URL to any documentation describing the resource and its use.\",\n            \"rdfs:label\": \"Documentation\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"documentation\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RelatedDataset\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Reference to a relevant dataset entity.\",\n            \"rdfs:label\": \"RelatedDataset\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"relatedDataset\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Inheritance\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Describes whether known inheritance from a parent.\",\n            \"rdfs:label\": \"Inheritance\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Parentaffected\"\n                },\n                {\n                    
\"@id\": \"bts:Parentnotaffected\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"inheritance\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Parentaffected\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Parentaffected\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Inheritance\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"parent affected\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Parentnotaffected\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Parentnotaffected\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Inheritance\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"parent not affected\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ConcentrationMaterialUnit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Unit used for the material concentration, e.g. 
mg/mL\",\n            \"rdfs:label\": \"ConcentrationMaterialUnit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Mg/mL\"\n                },\n                {\n                    \"@id\": \"bts:MM\"\n                },\n                {\n                    \"@id\": \"bts:Particles/mL\"\n                }\n            ],\n            \"sms:displayName\": \"concentrationMaterialUnit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mg/mL\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mg/mL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ConcentrationMaterialUnit\"\n                },\n                {\n                    \"@id\": \"bts:ConcentrationNaClUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mg/mL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MM\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MM\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ConcentrationMaterialUnit\"\n                },\n                {\n                    \"@id\": \"bts:ConcentrationNaClUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"mM\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Particles/mL\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Particles/mL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ConcentrationMaterialUnit\"\n                },\n                {\n                    \"@id\": \"bts:ConcentrationNaClUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"particles/mL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProportionCoverage10x\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Proportion of all reference bases for whole genome sequencing, or targeted bases for whole exome and targeted sequencing, that achieves 10X or greater coverage from Picard Tools\",\n            \"rdfs:label\": \"ProportionCoverage10x\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"proportionCoverage10x\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"num\"\n            ]\n        },\n        {\n            \"@id\": \"bts:StudyName\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Name of a study.\",\n            \"rdfs:label\": \"StudyName\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n 
               \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"studyName\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Author\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The author of the resource; preferably use an ORCID ID, GitHub profile link, etc., if available and a text name if not.\",\n            \"rdfs:label\": \"Author\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"author\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Summary\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A short description (an abstract).\",\n            \"rdfs:label\": \"Summary\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"summary\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IsPrimaryCell\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Whether or not cellType is primary (Yes; No)\",\n            \"rdfs:label\": \"IsPrimaryCell\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n       
         {\n                    \"@id\": \"bts:Yes\"\n                },\n                {\n                    \"@id\": \"bts:No\"\n                }\n            ],\n            \"sms:displayName\": \"isPrimaryCell\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Citation\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Citation (e.g. doi) that usage of data or resource should be cited with.\",\n            \"rdfs:label\": \"Citation\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"citation\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SampleType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Type of sample used\",\n            \"rdfs:label\": \"SampleType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sampleType\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RecordingSource\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Source of electrophysiology recording.\",\n            \"rdfs:label\": \"RecordingSource\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n          
  },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Electrode\"\n                },\n                {\n                    \"@id\": \"bts:Tetrode\"\n                },\n                {\n                    \"@id\": \"bts:Shank\"\n                },\n                {\n                    \"@id\": \"bts:Utaharray\"\n                },\n                {\n                    \"@id\": \"bts:Otherelectrodearray\"\n                }\n            ],\n            \"sms:displayName\": \"recordingSource\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Electrode\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Electrode\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:RecordingSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"electrode\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tetrode\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tetrode\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:RecordingSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"tetrode\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Shank\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Shank\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:RecordingSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"shank\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Utaharray\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Utaharray\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:RecordingSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Utah array\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Otherelectrodearray\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Otherelectrodearray\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:RecordingSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"other electrode array\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Workflow\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Name and version of the workflow used to generate/analyze the data\",\n            \"rdfs:label\": \"Workflow\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                
\"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:CNVkit\"\n                },\n                {\n                    \"@id\": \"bts:DeepVariant\"\n                },\n                {\n                    \"@id\": \"bts:DESeq2\"\n                },\n                {\n                    \"@id\": \"bts:FastQC\"\n                },\n                {\n                    \"@id\": \"bts:FreeBayes\"\n                },\n                {\n                    \"@id\": \"bts:GATKBaseRecalibration\"\n                },\n                {\n                    \"@id\": \"bts:GATKMarkDuplicates\"\n                },\n                {\n                    \"@id\": \"bts:MultiQC\"\n                },\n                {\n                    \"@id\": \"bts:Mutect2\"\n                },\n                {\n                    \"@id\": \"bts:Sarek\"\n                },\n                {\n                    \"@id\": \"bts:STARandSalmon\"\n                },\n                {\n                    \"@id\": \"bts:Strelka2\"\n                },\n                {\n                    \"@id\": \"bts:StringTie\"\n                },\n                {\n                    \"@id\": \"bts:TrimGalore\"\n                }\n            ],\n            \"sms:displayName\": \"workflow\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:WorkflowLink\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CNVkit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CNVkit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                
\"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CNVkit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DeepVariant\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DeepVariant\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DeepVariant\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DESeq2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DESeq2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DESeq2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FastQC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FastQC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FastQC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FreeBayes\",\n            \"@type\": \"rdfs:Class\",\n        
    \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FreeBayes\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FreeBayes\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GATKBaseRecalibration\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GATKBaseRecalibration\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GATK BaseRecalibration\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GATKMarkDuplicates\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GATKMarkDuplicates\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GATK MarkDuplicates\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MultiQC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MultiQC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MultiQC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mutect2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mutect2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Mutect2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sarek\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sarek\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Sarek\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:STARandSalmon\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"STARandSalmon\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"STAR and Salmon\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:Strelka2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Strelka2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Strelka2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StringTie\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StringTie\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"StringTie\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TrimGalore\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TrimGalore\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Workflow\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"TrimGalore\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WorkflowLink\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Workflow URL reference\",\n            \"rdfs:label\": \"WorkflowLink\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n   
         ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"workflowLink\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PH\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Numeric value for pH (range 0-14)\",\n            \"rdfs:label\": \"PH\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"pH\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"num\"\n            ]\n        },\n        {\n            \"@id\": \"bts:LibraryKitID\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Library kit ID.\",\n            \"rdfs:label\": \"LibraryKitID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"libraryKitID\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PairsOnDifferentChr\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Pairs on different chromosomes collected from samtools\",\n            \"rdfs:label\": \"PairsOnDifferentChr\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"pairsOnDifferentChr\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"int\"\n            ]\n        },\n        {\n            \"@id\": \"bts:ConcentrationNaClUnit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Unit used for the NaCl concentration, e.g. mM\",\n            \"rdfs:label\": \"ConcentrationNaClUnit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Mg/mL\"\n                },\n                {\n                    \"@id\": \"bts:MM\"\n                },\n                {\n                    \"@id\": \"bts:Particles/mL\"\n                }\n            ],\n            \"sms:displayName\": \"concentrationNaClUnit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TransplantationRecipientTissue\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Tissue into which a xenograph sample is transplanted\",\n            \"rdfs:label\": \"TransplantationRecipientTissue\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Cerebralcortex\"\n                },\n                {\n                    \"@id\": \"bts:Bonemarrow\"\n                },\n                {\n                    \"@id\": \"bts:Plasma\"\n                },\n                {\n                    \"@id\": 
\"bts:DorsalRootGanglion\"\n                },\n                {\n                    \"@id\": \"bts:Opticnerve\"\n                },\n                {\n                    \"@id\": \"bts:Tumor-adjacentnormal\"\n                },\n                {\n                    \"@id\": \"bts:Serum\"\n                },\n                {\n                    \"@id\": \"bts:Spheroid\"\n                },\n                {\n                    \"@id\": \"bts:Sciaticnerve\"\n                },\n                {\n                    \"@id\": \"bts:Meninges\"\n                },\n                {\n                    \"@id\": \"bts:Nervetissue\"\n                },\n                {\n                    \"@id\": \"bts:BuccalMucosa\"\n                },\n                {\n                    \"@id\": \"bts:Embryonictissue\"\n                },\n                {\n                    \"@id\": \"bts:Wholebrain\"\n                },\n                {\n                    \"@id\": \"bts:Microtissue\"\n                },\n                {\n                    \"@id\": \"bts:Retina\"\n                },\n                {\n                    \"@id\": \"bts:CDXtissue\"\n                },\n                {\n                    \"@id\": \"bts:Organoid\"\n                },\n                {\n                    \"@id\": \"bts:Splenocyte\"\n                },\n                {\n                    \"@id\": \"bts:Blood\"\n                },\n                {\n                    \"@id\": \"bts:Connectivetissue\"\n                },\n                {\n                    \"@id\": \"bts:Primarytumor\"\n                },\n                {\n                    \"@id\": \"bts:PDXtissue\"\n                },\n                {\n                    \"@id\": \"bts:BuffyCoat\"\n                }\n            ],\n            \"sms:displayName\": \"transplantationRecipientTissue\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n  
      {\n            \"@id\": \"bts:Cerebralcortex\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cerebralcortex\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cerebral cortex\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bonemarrow\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bonemarrow\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bone marrow\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Plasma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Plasma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n       
     \"sms:displayName\": \"plasma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DorsalRootGanglion\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DorsalRootGanglion\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Dorsal Root Ganglion\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Opticnerve\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Opticnerve\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"optic nerve\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tumor-adjacentnormal\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tumor-adjacentnormal\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"tumor-adjacent normal\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Serum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Serum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"serum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Spheroid\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Spheroid\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"spheroid\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sciaticnerve\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sciaticnerve\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": 
\"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sciatic nerve\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Meninges\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Meninges\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"meninges\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nervetissue\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nervetissue\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"nerve tissue\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BuccalMucosa\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BuccalMucosa\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n        
        },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Buccal Mucosa\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Embryonictissue\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Embryonictissue\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"embryonic tissue\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Wholebrain\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Wholebrain\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"whole brain\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Microtissue\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Microtissue\",\n            \"rdfs:subClassOf\": [\n                
{\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"microtissue\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Retina\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Retina\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"retina\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CDXtissue\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CDXtissue\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CDX tissue\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Organoid\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Organoid\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"organoid\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Splenocyte\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Splenocyte\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"splenocyte\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Blood\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Blood\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"blood\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Connectivetissue\",\n          
  \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Connectivetissue\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"connective tissue\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Primarytumor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Primarytumor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"primary tumor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PDXtissue\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PDXtissue\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PDX tissue\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n  
      },\n        {\n            \"@id\": \"bts:BuffyCoat\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BuffyCoat\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Buffy Coat\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SpecimenType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The type of a material sample taken from a biological entity for testing, diagnostic, propagation, treatment or research purposes. This includes particular types of cellular molecules, cells, tissues, organs, body fluids, embryos, and body excretory substances.\\n\",\n            \"rdfs:label\": \"SpecimenType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Cerebralcortex\"\n                },\n                {\n                    \"@id\": \"bts:Bonemarrow\"\n                },\n                {\n                    \"@id\": \"bts:Plasma\"\n                },\n                {\n                    \"@id\": \"bts:DorsalRootGanglion\"\n                },\n                {\n                    \"@id\": \"bts:Opticnerve\"\n                },\n                {\n                    \"@id\": \"bts:Tumor-adjacentnormal\"\n                },\n                {\n               
     \"@id\": \"bts:Serum\"\n                },\n                {\n                    \"@id\": \"bts:Spheroid\"\n                },\n                {\n                    \"@id\": \"bts:Sciaticnerve\"\n                },\n                {\n                    \"@id\": \"bts:Meninges\"\n                },\n                {\n                    \"@id\": \"bts:Nervetissue\"\n                },\n                {\n                    \"@id\": \"bts:BuccalMucosa\"\n                },\n                {\n                    \"@id\": \"bts:Embryonictissue\"\n                },\n                {\n                    \"@id\": \"bts:Wholebrain\"\n                },\n                {\n                    \"@id\": \"bts:Microtissue\"\n                },\n                {\n                    \"@id\": \"bts:Retina\"\n                },\n                {\n                    \"@id\": \"bts:CDXtissue\"\n                },\n                {\n                    \"@id\": \"bts:Organoid\"\n                },\n                {\n                    \"@id\": \"bts:Splenocyte\"\n                },\n                {\n                    \"@id\": \"bts:Blood\"\n                },\n                {\n                    \"@id\": \"bts:Connectivetissue\"\n                },\n                {\n                    \"@id\": \"bts:Primarytumor\"\n                },\n                {\n                    \"@id\": \"bts:PDXtissue\"\n                },\n                {\n                    \"@id\": \"bts:BuffyCoat\"\n                },\n                {\n                    \"@id\": \"bts:Saliva\"\n                },\n                {\n                    \"@id\": \"bts:Mucus\"\n                },\n                {\n                    \"@id\": \"bts:Urine\"\n                },\n                {\n                    \"@id\": \"bts:Stool\"\n                },\n                {\n                    \"@id\": \"bts:Sweat\"\n                }\n            ],\n            
\"sms:displayName\": \"specimenType\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Saliva\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Saliva\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"saliva\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mucus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mucus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mucus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Urine\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Urine\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"urine\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Stool\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Stool\",\n    
        \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"stool\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sweat\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sweat\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sweat\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DatasetItemCount\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Count of files in dataset. 
Auto-calculated by Synapse.\",\n            \"rdfs:label\": \"DatasetItemCount\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"datasetItemCount\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DataFileHandleId\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"(Files only) Refers to the id of the file.\",\n            \"rdfs:label\": \"DataFileHandleId\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"dataFileHandleId\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IndividualIdSource\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Database or repository to which individual ID maps\",\n            \"rdfs:label\": \"IndividualIdSource\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"individualIdSource\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TotalReads\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"If available, the total number of reads collected from samtools.\",\n            \"rdfs:label\": \"TotalReads\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"totalReads\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"int\"\n            ]\n        },\n        {\n            \"@id\": \"bts:Contributor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"An entity responsible for making contributions to the resource.\",\n            \"rdfs:label\": \"Contributor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"contributor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DataType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Links an entity to data types that the entity represents/contains. This is closely tied to the assay property. 
For example, a file of dataType `genomicVariants` might have an assay value of `whole genome sequencing`.\\n\",\n            \"rdfs:label\": \"DataType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Immunoassay\"\n                },\n                {\n                    \"@id\": \"bts:Behaviorprocess\"\n                },\n                {\n                    \"@id\": \"bts:Clinical\"\n                },\n                {\n                    \"@id\": \"bts:Demographics\"\n                },\n                {\n                    \"@id\": \"bts:Particlecharacterization\"\n                },\n                {\n                    \"@id\": \"bts:DrugCombinationScreen\"\n                },\n                {\n                    \"@id\": \"bts:IsoformExpression\"\n                },\n                {\n                    \"@id\": \"bts:Proteomics\"\n                },\n                {\n                    \"@id\": \"bts:SurveyData\"\n                },\n                {\n                    \"@id\": \"bts:Kinomics\"\n                },\n                {\n                    \"@id\": \"bts:SomaticVariants\"\n                },\n                {\n                    \"@id\": \"bts:Volume\"\n                },\n                {\n                    \"@id\": \"bts:Characteristic\"\n                },\n                {\n                    \"@id\": \"bts:DrugScreen\"\n                },\n                {\n                    \"@id\": \"bts:GeneExpression\"\n                },\n                {\n                    \"@id\": \"bts:AlignedReads\"\n                },\n                {\n                    \"@id\": \"bts:GenomicFeatures\"\n                },\n            
    {\n                    \"@id\": \"bts:GenomicVariants\"\n                },\n                {\n                    \"@id\": \"bts:Rawcounts\"\n                },\n                {\n                    \"@id\": \"bts:RawIntensities\"\n                },\n                {\n                    \"@id\": \"bts:NormalizedIntensities\"\n                },\n                {\n                    \"@id\": \"bts:PharmacokineticStudy\"\n                },\n                {\n                    \"@id\": \"bts:Maskimage\"\n                },\n                {\n                    \"@id\": \"bts:ChromatinActivity\"\n                },\n                {\n                    \"@id\": \"bts:StructuralVariants\"\n                },\n                {\n                    \"@id\": \"bts:GermlineVariants\"\n                },\n                {\n                    \"@id\": \"bts:CopyNumberVariants\"\n                },\n                {\n                    \"@id\": \"bts:Image\"\n                },\n                {\n                    \"@id\": \"bts:Network\"\n                },\n                {\n                    \"@id\": \"bts:CellularPhysiology\"\n                },\n                {\n                    \"@id\": \"bts:Metabolomics\"\n                },\n                {\n                    \"@id\": \"bts:AnnotatedSomaticVariants\"\n                },\n                {\n                    \"@id\": \"bts:AnnotatedGermlineVariants\"\n                },\n                {\n                    \"@id\": \"bts:Weight\"\n                },\n                {\n                    \"@id\": \"bts:Electrophysiology\"\n                },\n                {\n                    \"@id\": \"bts:Audiotranscript\"\n                },\n                {\n                    \"@id\": \"bts:DataIndex\"\n                },\n                {\n                    \"@id\": \"bts:DescriptiveMetadata\"\n                },\n                {\n                    \"@id\": 
\"bts:StructuralMetadata\"\n                },\n                {\n                    \"@id\": \"bts:AdministrativeMetadata\"\n                },\n                {\n                    \"@id\": \"bts:ReferenceMetadata\"\n                },\n                {\n                    \"@id\": \"bts:StatisticalMetadata\"\n                },\n                {\n                    \"@id\": \"bts:LegalMetadata\"\n                },\n                {\n                    \"@id\": \"bts:WorkflowMetadata\"\n                }\n            ],\n            \"sms:displayName\": \"dataType\",\n            \"sms:required\": \"sms:true\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Immunoassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Immunoassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"immunoassay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Behaviorprocess\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Behaviorprocess\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"behavior process\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Clinical\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Clinical\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"clinical\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Demographics\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Demographics\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"demographics\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Particlecharacterization\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Particlecharacterization\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"particle characterization\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:DrugCombinationScreen\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DrugCombinationScreen\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"drugCombinationScreen\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IsoformExpression\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IsoformExpression\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"isoformExpression\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Proteomics\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Proteomics\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"proteomics\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SurveyData\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SurveyData\",\n            \"rdfs:subClassOf\": [\n                {\n                    
\"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"surveyData\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Kinomics\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Kinomics\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"kinomics\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SomaticVariants\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SomaticVariants\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SomaticVariants\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Volume\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Volume\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Volume\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Characteristic\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Characteristic\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"characteristic\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GeneExpression\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GeneExpression\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"geneExpression\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AlignedReads\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AlignedReads\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"AlignedReads\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GenomicFeatures\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GenomicFeatures\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"genomicFeatures\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GenomicVariants\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GenomicVariants\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"genomicVariants\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Rawcounts\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Rawcounts\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"raw counts\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RawIntensities\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RawIntensities\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"RawIntensities\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NormalizedIntensities\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NormalizedIntensities\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NormalizedIntensities\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PharmacokineticStudy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PharmacokineticStudy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Pharmacokinetic Study\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Maskimage\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Maskimage\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mask image\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ChromatinActivity\",\n            
\"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ChromatinActivity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"chromatinActivity\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StructuralVariants\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StructuralVariants\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"StructuralVariants\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GermlineVariants\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GermlineVariants\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GermlineVariants\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CopyNumberVariants\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CopyNumberVariants\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": 
\"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CopyNumberVariants\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Image\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Image\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"image\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Network\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Network\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"network\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CellularPhysiology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CellularPhysiology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cellularPhysiology\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Metabolomics\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Metabolomics\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"metabolomics\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AnnotatedSomaticVariants\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AnnotatedSomaticVariants\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"AnnotatedSomaticVariants\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AnnotatedGermlineVariants\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AnnotatedGermlineVariants\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"AnnotatedGermlineVariants\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Weight\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"Weight\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Weight\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Electrophysiology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Electrophysiology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"electrophysiology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Audiotranscript\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Audiotranscript\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"audio transcript\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DataIndex\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DataIndex\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"dataIndex\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DescriptiveMetadata\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DescriptiveMetadata\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Descriptive Metadata\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StructuralMetadata\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StructuralMetadata\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Structural Metadata\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AdministrativeMetadata\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AdministrativeMetadata\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Administrative Metadata\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n       
 },\n        {\n            \"@id\": \"bts:ReferenceMetadata\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ReferenceMetadata\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Reference Metadata\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StatisticalMetadata\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"StatisticalMetadata\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Statistical Metadata\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LegalMetadata\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LegalMetadata\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Legal Metadata\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WorkflowMetadata\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"WorkflowMetadata\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Workflow Metadata\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DataSubtype\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Further qualification of dataType, which may be used to indicate the state of processing of the data, aggregation of the data, or presence of metadata.\",\n            \"rdfs:label\": \"DataSubtype\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Normalized\"\n                },\n                {\n                    \"@id\": \"bts:DataMatrix\"\n                },\n                {\n                    \"@id\": \"bts:Raw\"\n                },\n                {\n                    \"@id\": \"bts:Processed\"\n                },\n                {\n                    \"@id\": \"bts:Metadata\"\n                },\n                {\n                    \"@id\": \"bts:Representative\"\n                }\n            ],\n            \"sms:displayName\": \"dataSubtype\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TimepointUnit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"For timed experiments this represents the unit of time measured\",\n            \"rdfs:label\": \"TimepointUnit\",\n            \"rdfs:subClassOf\": [\n                {\n             
       \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Seconds\"\n                },\n                {\n                    \"@id\": \"bts:Minutes\"\n                },\n                {\n                    \"@id\": \"bts:Hours\"\n                },\n                {\n                    \"@id\": \"bts:Days\"\n                },\n                {\n                    \"@id\": \"bts:Weeks\"\n                },\n                {\n                    \"@id\": \"bts:Months\"\n                },\n                {\n                    \"@id\": \"bts:Years\"\n                }\n            ],\n            \"sms:displayName\": \"timepointUnit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WHOPerformanceStatus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Score on the WHO scale describing patient's functional abilities.\",\n            \"rdfs:label\": \"WHOPerformanceStatus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:0\"\n                },\n                {\n                    \"@id\": \"bts:1\"\n                },\n                {\n                    \"@id\": \"bts:2\"\n                },\n                {\n                    \"@id\": \"bts:3\"\n                },\n                {\n                    \"@id\": \"bts:4\"\n                }\n            ],\n            \"sms:displayName\": \"WHOPerformanceStatus\",\n            
\"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:0\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"0\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:WHOPerformanceStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"0\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CurrentVersion\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"(Versionable entities only) The current version number of the resource.\",\n            \"rdfs:label\": \"CurrentVersion\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"currentVersion\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GenomicReference\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Version of genome reference used for alignment in processing workflow\",\n            \"rdfs:label\": \"GenomicReference\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"genomicReference\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:Normalized\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Normalized\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"normalized\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DataMatrix\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DataMatrix\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"dataMatrix\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Processed\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Processed\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"processed\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Representative\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Representative\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                
}\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"representative\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Organ\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A unique macroscopic (gross) anatomic structure that performs specific functions. It is composed of various tissues. An organ is part of an anatomic system or a body region.\",\n            \"rdfs:label\": \"Organ\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Kidney\"\n                },\n                {\n                    \"@id\": \"bts:Ovary\"\n                },\n                {\n                    \"@id\": \"bts:Lung\"\n                },\n                {\n                    \"@id\": \"bts:Bonemarrow\"\n                },\n                {\n                    \"@id\": \"bts:Prostate\"\n                },\n                {\n                    \"@id\": \"bts:Breast\"\n                },\n                {\n                    \"@id\": \"bts:Mesentery\"\n                },\n                {\n                    \"@id\": \"bts:Mammarygland\"\n                },\n                {\n                    \"@id\": \"bts:Colon\"\n                },\n                {\n                    \"@id\": \"bts:Spleen\"\n                },\n                {\n                    \"@id\": \"bts:BursaOfFabricius\"\n                },\n                {\n                    \"@id\": \"bts:Nose\"\n                },\n                {\n                    \"@id\": \"bts:Brain\"\n       
         },\n                {\n                    \"@id\": \"bts:Pancreas\"\n                },\n                {\n                    \"@id\": \"bts:Liver\"\n                },\n                {\n                    \"@id\": \"bts:Blood\"\n                },\n                {\n                    \"@id\": \"bts:Lymphnode\"\n                },\n                {\n                    \"@id\": \"bts:Nerves\"\n                },\n                {\n                    \"@id\": \"bts:Skin\"\n                },\n                {\n                    \"@id\": \"bts:Eye\"\n                }\n            ],\n            \"sms:displayName\": \"organ\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Kidney\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Kidney\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"kidney\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Ovary\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Ovary\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ovary\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Lung\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n  
          \"rdfs:label\": \"Lung\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"lung\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Prostate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Prostate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"prostate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Breast\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Breast\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"breast\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mesentery\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mesentery\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"mesentery\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mammarygland\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mammarygland\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mammary gland\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Colon\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Colon\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"colon\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Spleen\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Spleen\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"spleen\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BursaOfFabricius\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BursaOfFabricius\",\n     
       \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Bursa Of Fabricius\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nose\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nose\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"nose\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Brain\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Brain\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"brain\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Pancreas\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Pancreas\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"pancreas\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Liver\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Liver\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"liver\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Lymphnode\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Lymphnode\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"lymph node\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nerves\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nerves\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"nerves\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Skin\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Skin\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": 
\"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"skin\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Eye\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Eye\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Organ\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"eye\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LearningDisability\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Learning disability.\",\n            \"rdfs:label\": \"LearningDisability\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"learningDisability\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Id\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The entity id for the resource automatically assigned by the 
platform.\",\n            \"rdfs:label\": \"Id\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"id\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IsXenograft\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Whether or not sample source is a xenograft (Yes; No)\",\n            \"rdfs:label\": \"IsXenograft\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Yes\"\n                },\n                {\n                    \"@id\": \"bts:No\"\n                }\n            ],\n            \"sms:displayName\": \"isXenograft\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ResourceId\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A UUID for a Resource from the NF Research Tools Database\",\n            \"rdfs:label\": \"ResourceId\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Resource_id\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LenticularOpacity\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Lenticular opacity.\",\n            \"rdfs:label\": \"LenticularOpacity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"lenticularOpacity\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DrugScreenType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"String describing general class of drug screen\",\n            \"rdfs:label\": \"DrugScreenType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Combinationlibraryscreen\"\n                },\n                {\n                    \"@id\": \"bts:Combinationscreen\"\n                },\n                {\n                    \"@id\": \"bts:Singlemolecule\"\n                },\n                {\n                    \"@id\": \"bts:Smallmoleculelibraryscreen\"\n                }\n            ],\n            \"sms:displayName\": \"drugScreenType\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:Combinationlibraryscreen\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Combinationlibraryscreen\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DrugScreenType\"\n                },\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"combination library screen\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Combinationscreen\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Combinationscreen\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DrugScreenType\"\n                },\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"combination screen\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Singlemolecule\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Singlemolecule\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DrugScreenType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"single molecule\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Smallmoleculelibraryscreen\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Smallmoleculelibraryscreen\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DrugScreenType\"\n                },\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"small molecule library screen\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ExperimentalCondition\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A free-text description of the experimental condition (e.g. 
5 mM doxorubicin).\",\n            \"rdfs:label\": \"ExperimentalCondition\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"experimentalCondition\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CafeaulaitMacules\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of cafe-au-lait macules.\",\n            \"rdfs:label\": \"CafeaulaitMacules\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"cafeaulaitMacules\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ModelSystemName\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"  HEK293 (cell line), Minnesota5 (swine strain), DXL (poultry strain), RB51 (vaccine strain of Brucella abortus)\",\n            \"rdfs:label\": \"ModelSystemName\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": 
[\n                {\n                    \"@id\": \"bts:HEK293NF1-/-withWTmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:SZ-NF4\"\n                },\n                {\n                    \"@id\": \"bts:Nf1-/-HEK293\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-Exon52R2550X#5\"\n                },\n                {\n                    \"@id\": \"bts:HBE135-E6E7\"\n                },\n                {\n                    \"@id\": \"bts:NCC-MPNST5-C1\"\n                },\n                {\n                    \"@id\": \"bts:Sc93.1\"\n                },\n                {\n                    \"@id\": \"bts:JHU2-079-PDX\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-withR192XmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:IcNF98.4c\"\n                },\n                {\n                    \"@id\": \"bts:ST88-14\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1ipNF95.6\"\n                },\n                {\n                    \"@id\": \"bts:IcNF99.1\"\n                },\n                {\n                    \"@id\": \"bts:CNF00.10a\"\n                },\n                {\n                    \"@id\": \"bts:IPSCY489C;Exon13crypticsplice\"\n                },\n                {\n                    \"@id\": \"bts:3PNFSiPSsvMM11\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1ipnNF09.4\"\n                },\n                {\n                    \"@id\": \"bts:S520\"\n                },\n                {\n                    \"@id\": \"bts:SchwanncellNF1-/-withR681XmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:JH-2-002-CL\"\n                },\n                {\n                    \"@id\": \"bts:NCC-MPNST2-C1\"\n                },\n                {\n                    
\"@id\": \"bts:HEK293NF1-/-withR1306XmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:WTES\"\n                },\n                {\n                    \"@id\": \"bts:Fb93.1\"\n                },\n                {\n                    \"@id\": \"bts:CNF04.9a\"\n                },\n                {\n                    \"@id\": \"bts:JHU2-103-PDX\"\n                },\n                {\n                    \"@id\": \"bts:5PNFTDiPSsvPM6\"\n                },\n                {\n                    \"@id\": \"bts:NCC-MPNST1-C1\"\n                },\n                {\n                    \"@id\": \"bts:7PNFSiPSrvPM12\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-withR681XmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:GM23338\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1ipnNF95.11c\"\n                },\n                {\n                    \"@id\": \"bts:T265\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-clone2\"\n                },\n                {\n                    \"@id\": \"bts:SNF02.2\"\n                },\n                {\n                    \"@id\": \"bts:IPSCNF1WT\"\n                },\n                {\n                    \"@id\": \"bts:IcNF98.4d\"\n                },\n                {\n                    \"@id\": \"bts:NF2-/-AC007-hTERT\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-withWTtaggedmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:NCC-MPNST3-C1\"\n                },\n                {\n                    \"@id\": \"bts:HEK293\"\n                },\n                {\n                    \"@id\": \"bts:Dh5alpha\"\n                },\n                {\n                    \"@id\": \"bts:IcNF97.2a\"\n                },\n                {\n                
    \"@id\": \"bts:SchwanncellNF1-/-(iPN97.4#24)\"\n                },\n                {\n                    \"@id\": \"bts:CNF97.5\"\n                },\n                {\n                    \"@id\": \"bts:NCC-MPNST4-C1\"\n                },\n                {\n                    \"@id\": \"bts:NMS-2\"\n                },\n                {\n                    \"@id\": \"bts:CNF99.1\"\n                },\n                {\n                    \"@id\": \"bts:HiPSC\"\n                },\n                {\n                    \"@id\": \"bts:AC007-hTERT\"\n                },\n                {\n                    \"@id\": \"bts:ScienCellSchwanncells\"\n                },\n                {\n                    \"@id\": \"bts:IcNF04.9a\"\n                },\n                {\n                    \"@id\": \"bts:90-8\"\n                },\n                {\n                    \"@id\": \"bts:SchwanncellNF1-/-withWTtaggedmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:IPSCNF1+/-BJFF.6bkgd\"\n                },\n                {\n                    \"@id\": \"bts:JHU2-079-CL\"\n                },\n                {\n                    \"@id\": \"bts:CNF97.2b\"\n                },\n                {\n                    \"@id\": \"bts:JHU2-002-CL\"\n                },\n                {\n                    \"@id\": \"bts:NF1\"\n                },\n                {\n                    \"@id\": \"bts:SZ-NF1\"\n                },\n                {\n                    \"@id\": \"bts:3PNFFiPSsvPM2\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1ipNF05.5(Mixedclones)\"\n                },\n                {\n                    \"@id\": \"bts:HTERTipn02.32λ\"\n                },\n                {\n                    \"@id\": \"bts:JHU2-103-CL\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-Exon17#A15G629Rcrypticsplice\"\n                },\n     
           {\n                    \"@id\": \"bts:Lis42NF11N\"\n                },\n                {\n                    \"@id\": \"bts:ELK-TADLuciferaseReporterHEK293StableNF1-/-\"\n                },\n                {\n                    \"@id\": \"bts:Humanforeskinfibroblasts\"\n                },\n                {\n                    \"@id\": \"bts:IcNF00.10a\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-withR1947XmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:Nf1-/-Epitheliallungcells\"\n                },\n                {\n                    \"@id\": \"bts:Nf2-/-SchwannSC(mouse)(PMID26554010)\"\n                },\n                {\n                    \"@id\": \"bts:STS-26T\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-withR461XmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:I28cNF\"\n                },\n                {\n                    \"@id\": \"bts:SNF94.3\"\n                },\n                {\n                    \"@id\": \"bts:Nf1-/-skin-derivedprecursorcells\"\n                },\n                {\n                    \"@id\": \"bts:HTERTipn02.8\"\n                },\n                {\n                    \"@id\": \"bts:Dhh-Cre;NF1Arg681*/floxSchwannCells\"\n                },\n                {\n                    \"@id\": \"bts:KCL024\"\n                },\n                {\n                    \"@id\": \"bts:IcNF97.2b\"\n                },\n                {\n                    \"@id\": \"bts:HeLaSilenciXNF1\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1ipNF95.11bC/T\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-Exon47insT#14\"\n                },\n                {\n                    \"@id\": \"bts:HS-PSS\"\n                },\n                {\n                    \"@id\": 
\"bts:YST-1\"\n                },\n                {\n                    \"@id\": \"bts:M3MPNST\"\n                },\n                {\n                    \"@id\": \"bts:S462.TY\"\n                },\n                {\n                    \"@id\": \"bts:S462\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-Exon17#B48G629Rcrypticsplice\"\n                },\n                {\n                    \"@id\": \"bts:GM11602\"\n                },\n                {\n                    \"@id\": \"bts:Lis47NF12N\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1sipnNF95.12B\"\n                },\n                {\n                    \"@id\": \"bts:KCL025\"\n                },\n                {\n                    \"@id\": \"bts:SchwanncellNF1-/-withR816XmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:HiPSC-SCP\"\n                },\n                {\n                    \"@id\": \"bts:SNF96.2\"\n                },\n                {\n                    \"@id\": \"bts:CNF98.4c\"\n                },\n                {\n                    \"@id\": \"bts:NF1-R68XEmbryoniccells\"\n                },\n                {\n                    \"@id\": \"bts:BJFF.6\"\n                },\n                {\n                    \"@id\": \"bts:Ben-Men-1\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-withR816XmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1ipNF05.5\"\n                },\n                {\n                    \"@id\": \"bts:I18cNF\"\n                },\n                {\n                    \"@id\": \"bts:HTERTSCipn97.4\"\n                },\n                {\n                    \"@id\": \"bts:I21cNF\"\n                },\n                {\n                    \"@id\": \"bts:CNF98.4d\"\n                },\n                {\n                    \"@id\": 
\"bts:28cNF\"\n                },\n                {\n                    \"@id\": \"bts:SC4[Mouseschwannoma]\"\n                },\n                {\n                    \"@id\": \"bts:CNF18.1a\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Arg681*/Arg681*ES\"\n                },\n                {\n                    \"@id\": \"bts:GM11601\"\n                },\n                {\n                    \"@id\": \"bts:HEK293NF1-/-withR2550XmNf1cDNA\"\n                },\n                {\n                    \"@id\": \"bts:JH-2-103-CL\"\n                },\n                {\n                    \"@id\": \"bts:HS-Sch-2\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1ipn02.32λ\"\n                },\n                {\n                    \"@id\": \"bts:ELK-TADLuciferaseReporterHEK293Stable\"\n                },\n                {\n                    \"@id\": \"bts:6PNFSiPSrvPM2\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1ipNF03.3\"\n                },\n                {\n                    \"@id\": \"bts:JH-2-079-CL\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1ipNF04.4\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Arg681*/Arg681*MEFs\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Arg681*/+ES\"\n                },\n                {\n                    \"@id\": \"bts:CNF97.2a\"\n                },\n                {\n                    \"@id\": \"bts:Schwanncelli28cNFNF1-/-(#14)\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1ipNF00.6\"\n                },\n                {\n                    \"@id\": \"bts:NCC-MPNST3-X2-C1\"\n                },\n                {\n                    \"@id\": \"bts:JHU2-002-PDX\"\n                },\n                {\n                    \"@id\": 
\"bts:HTERTNF1ipNF95.11bC\"\n                },\n                {\n                    \"@id\": \"bts:5PNFTDiPSsvMM4\"\n                },\n                {\n                    \"@id\": \"bts:HTERTNF1ipn06.2A\"\n                },\n                {\n                    \"@id\": \"bts:HTERTSCipn02.8\"\n                },\n                {\n                    \"@id\": \"bts:SZ-NF2\"\n                },\n                {\n                    \"@id\": \"bts:SMPNST\"\n                },\n                {\n                    \"@id\": \"bts:HS02\"\n                },\n                {\n                    \"@id\": \"bts:HS05\"\n                },\n                {\n                    \"@id\": \"bts:B6;129S2-Trp53tm1TyjNf1tm1Tyj/J\"\n                },\n                {\n                    \"@id\": \"bts:Prss56Cre;R26mT\"\n                },\n                {\n                    \"@id\": \"bts:B6.129S1-Nf1tm1Cbr/J\"\n                },\n                {\n                    \"@id\": \"bts:Nf1-OPG\"\n                },\n                {\n                    \"@id\": \"bts:C57BL/6J\"\n                },\n                {\n                    \"@id\": \"bts:NRG\"\n                },\n                {\n                    \"@id\": \"bts:Nf1-OPG-Arg816\"\n                },\n                {\n                    \"@id\": \"bts:NODscidgamma\"\n                },\n                {\n                    \"@id\": \"bts:B6.129S2-Nf1tm1Tyj/J\"\n                },\n                {\n                    \"@id\": \"bts:B6;129-Trp53tm1TyjNf1tm1TyjSuz12Gt(Betageo)1Khe/KcichJ\"\n                },\n                {\n                    \"@id\": \"bts:B6.129(Cg)-Nf1tm1Par/J\"\n                },\n                {\n                    \"@id\": \"bts:GFAP-Cre;Nf1-G848R/Flox\"\n                },\n                {\n                    \"@id\": \"bts:GFAP-Cre;Nf1-R681X/Flox\"\n                },\n                {\n                    \"@id\": 
\"bts:GFAP-Cre;Nf1-C383X/Flox\"\n                },\n                {\n                    \"@id\": \"bts:Nf1-C848R/Flox\"\n                },\n                {\n                    \"@id\": \"bts:Nf1-R681X/Flox\"\n                },\n                {\n                    \"@id\": \"bts:Nf1-C383X/Flox\"\n                },\n                {\n                    \"@id\": \"bts:Nf1flox/flox\"\n                }\n            ],\n            \"sms:displayName\": \"modelSystemName\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"list like\"\n            ]\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-withWTmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-withWTmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- with WT mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SZ-NF4\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SZ-NF4\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SZ-NF4\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1-/-HEK293\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"Nf1-/-HEK293\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf1-/- HEK 293\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-Exon52R2550X#5\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-Exon52R2550X#5\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- Exon 52 R2550X #5\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HBE135-E6E7\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HBE135-E6E7\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HBE135-E6E7\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NCC-MPNST5-C1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NCC-MPNST5-C1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NCC-MPNST5-C1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sc93.1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sc93.1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sc93.1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JHU2-079-PDX\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JHU2-079-PDX\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"JHU 2-079-PDX\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-withR192XmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-withR192XmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- with R192X mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n      
      \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IcNF98.4c\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IcNF98.4c\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"icNF98.4c\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ST88-14\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ST88-14\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ST88-14\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipNF95.6\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1ipNF95.6\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 ipNF95.6\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IcNF99.1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IcNF99.1\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"icNF99.1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CNF00.10a\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CNF00.10a\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cNF00.10a\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IPSCY489C;Exon13crypticsplice\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IPSCY489C;Exon13crypticsplice\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"iPSC Y489C; Exon 13 cryptic splice\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:3PNFSiPSsvMM11\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"3PNFSiPSsvMM11\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"3PNF_SiPSsv_MM_11\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipnNF09.4\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1ipnNF09.4\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 ipnNF09.4\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:S520\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"S520\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"S520\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SchwanncellNF1-/-withR681XmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SchwanncellNF1-/-withR681XmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Schwann cell NF1 -/- with R681X mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JH-2-002-CL\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JH-2-002-CL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"JH-2-002-CL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NCC-MPNST2-C1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NCC-MPNST2-C1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NCC-MPNST2-C1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-withR1306XmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-withR1306XmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- with R1306X mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WTES\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"WTES\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"WT ES\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Fb93.1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Fb93.1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Fb93.1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CNF04.9a\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CNF04.9a\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cNF04.9a\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JHU2-103-PDX\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JHU2-103-PDX\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            
},\n            \"sms:displayName\": \"JHU 2-103-PDX\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:5PNFTDiPSsvPM6\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"5PNFTDiPSsvPM6\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"5PNF_TDiPSsv_PM_6\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NCC-MPNST1-C1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NCC-MPNST1-C1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NCC-MPNST1-C1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:7PNFSiPSrvPM12\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"7PNFSiPSrvPM12\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"7PNF_SiPSrv_PM_12\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:HEK293NF1-/-withR681XmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-withR681XmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- with R681X mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GM23338\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GM23338\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GM23338\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipnNF95.11c\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1ipnNF95.11c\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 ipnNF95.11c\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:T265\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"T265\",\n            \"rdfs:subClassOf\": [\n       
         {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"T265\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-clone2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-clone2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- clone 2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SNF02.2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SNF02.2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sNF02.2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IPSCNF1WT\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IPSCNF1WT\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"iPSC NF1 WT\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IcNF98.4d\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IcNF98.4d\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"icNF98.4d\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NF2-/-AC007-hTERT\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NF2-/-AC007-hTERT\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NF2-/- AC007-hTERT\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-withWTtaggedmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-withWTtaggedmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- with WT tagged mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:NCC-MPNST3-C1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NCC-MPNST3-C1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NCC-MPNST3-C1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Dh5alpha\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Dh5alpha\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Dh5 alpha\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IcNF97.2a\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IcNF97.2a\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n               
 }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"icNF97.2a\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SchwanncellNF1-/-(iPN97.4#24)\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SchwanncellNF1-/-(iPN97.4#24)\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Schwann cell NF1 -/- (iPN97.4 #24)\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CNF97.5\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CNF97.5\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cNF97.5\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NCC-MPNST4-C1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NCC-MPNST4-C1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NCC-MPNST4-C1\",\n            
\"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NMS-2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NMS-2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NMS-2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CNF99.1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CNF99.1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cNF99.1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HiPSC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HiPSC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hiPSC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AC007-hTERT\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AC007-hTERT\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"AC007-hTERT\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ScienCellSchwanncells\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ScienCellSchwanncells\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ScienCell Schwann cells\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IcNF04.9a\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IcNF04.9a\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"icNF04.9a\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:90-8\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"90-8\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            
},\n            \"sms:displayName\": \"90-8\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SchwanncellNF1-/-withWTtaggedmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SchwanncellNF1-/-withWTtaggedmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Schwann cell NF1 -/- with WT tagged mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IPSCNF1+/-BJFF.6bkgd\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IPSCNF1+/-BJFF.6bkgd\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"iPSC NF1 +/- BJFF.6 bkgd\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JHU2-079-CL\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JHU2-079-CL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"JHU 2-079-CL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": 
[]\n        },\n        {\n            \"@id\": \"bts:CNF97.2b\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CNF97.2b\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cNF97.2b\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JHU2-002-CL\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JHU2-002-CL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"JHU 2-002-CL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NF1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NF1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NF1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SZ-NF1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SZ-NF1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": 
\"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SZ-NF1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:3PNFFiPSsvPM2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"3PNFFiPSsvPM2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"3PNF_FiPSsv_PM_2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipNF05.5(Mixedclones)\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1ipNF05.5(Mixedclones)\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 ipNF05.5 (Mixed clones)\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTipn02.32λ\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTipn02.32λ\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"hTERT ipn02.3 2λ\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JHU2-103-CL\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JHU2-103-CL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"JHU 2-103-CL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-Exon17#A15G629Rcrypticsplice\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-Exon17#A15G629Rcrypticsplice\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- Exon 17 #A15 G629R cryptic splice\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Lis42NF11N\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Lis42NF11N\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Lis42_NF1_1N\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n  
          \"@id\": \"bts:ELK-TADLuciferaseReporterHEK293StableNF1-/-\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ELK-TADLuciferaseReporterHEK293StableNF1-/-\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ELK-TAD Luciferase Reporter HEK293 Stable NF1 -/-\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Humanforeskinfibroblasts\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Humanforeskinfibroblasts\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"human foreskin fibroblasts\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IcNF00.10a\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IcNF00.10a\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"icNF00.10a\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-withR1947XmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n           
 \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-withR1947XmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- with R1947X mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1-/-Epitheliallungcells\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf1-/-Epitheliallungcells\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf1-/- Epithelial lung cells\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf2-/-SchwannSC(mouse)(PMID26554010)\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf2-/-SchwannSC(mouse)(PMID26554010)\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf2-/- Schwann SC (mouse) (PMID26554010)\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:STS-26T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"STS-26T\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"STS-26T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-withR461XmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-withR461XmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- with R461X mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:I28cNF\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"I28cNF\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"i28cNF\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SNF94.3\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SNF94.3\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sNF94.3\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1-/-skin-derivedprecursorcells\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf1-/-skin-derivedprecursorcells\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf1-/- skin-derived precursor cells\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTipn02.8\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTipn02.8\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT ipn02.8\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Dhh-Cre;NF1Arg681*/floxSchwannCells\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Dhh-Cre;NF1Arg681*/floxSchwannCells\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Dhh-Cre; NF1Arg681*/flox Schwann Cells\",\n           
 \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:KCL024\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"KCL024\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"KCL024\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IcNF97.2b\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IcNF97.2b\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"icNF97.2b\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HeLaSilenciXNF1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HeLaSilenciXNF1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HeLa SilenciX NF1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipNF95.11bC/T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"HTERTNF1ipNF95.11bC/T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 ipNF95.11b C/T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-Exon47insT#14\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-Exon47insT#14\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- Exon 47 insT #14\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HS-PSS\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HS-PSS\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HS-PSS\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:YST-1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"YST-1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"YST-1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:M3MPNST\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"M3MPNST\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"M3 MPNST\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:S462.TY\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"S462.TY\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"S462.TY\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:S462\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"S462\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"S462\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:HEK293NF1-/-Exon17#B48G629Rcrypticsplice\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-Exon17#B48G629Rcrypticsplice\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- Exon 17 #B48 G629R cryptic splice\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GM11602\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GM11602\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GM11602\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Lis47NF12N\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Lis47NF12N\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Lis47_NF1_2N\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1sipnNF95.12B\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1sipnNF95.12B\",\n   
         \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 sipnNF95.12B\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:KCL025\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"KCL025\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"KCL025\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SchwanncellNF1-/-withR816XmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SchwanncellNF1-/-withR816XmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Schwann cell NF1 -/- with R816X mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HiPSC-SCP\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HiPSC-SCP\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n      
          \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hiPSC-SCP\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SNF96.2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SNF96.2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sNF96.2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CNF98.4c\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CNF98.4c\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cNF98.4c\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NF1-R68XEmbryoniccells\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NF1-R68XEmbryoniccells\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NF1-R68X Embryonic cells\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n           
 \"@id\": \"bts:BJFF.6\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BJFF.6\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BJFF.6\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Ben-Men-1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Ben-Men-1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Ben-Men-1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-withR816XmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-withR816XmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- with R816X mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipNF05.5\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1ipNF05.5\",\n            \"rdfs:subClassOf\": [\n        
        {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 ipNF05.5\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:I18cNF\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"I18cNF\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"i18cNF\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTSCipn97.4\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTSCipn97.4\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT SC ipn97.4\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:I21cNF\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"I21cNF\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"i21cNF\",\n   
         \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CNF98.4d\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CNF98.4d\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cNF98.4d\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:28cNF\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"28cNF\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"28cNF\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SC4[Mouseschwannoma]\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SC4[Mouseschwannoma]\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SC4 [Mouse schwannoma]\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CNF18.1a\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"CNF18.1a\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cNF18.1a\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1Arg681*/Arg681*ES\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf1Arg681*/Arg681*ES\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf1Arg681*/Arg681* ES\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GM11601\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GM11601\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GM11601\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HEK293NF1-/-withR2550XmNf1cDNA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HEK293NF1-/-withR2550XmNf1cDNA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HEK293 NF1 -/- with R2550X mNf1 cDNA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JH-2-103-CL\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JH-2-103-CL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"JH-2-103-CL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HS-Sch-2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HS-Sch-2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HS-Sch-2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipn02.32λ\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1ipn02.32λ\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 ipn02.3 2λ\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ELK-TADLuciferaseReporterHEK293Stable\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ELK-TADLuciferaseReporterHEK293Stable\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ELK-TAD Luciferase Reporter HEK293 Stable\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:6PNFSiPSrvPM2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"6PNFSiPSrvPM2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"6PNF_SiPSrv_PM_2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipNF03.3\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1ipNF03.3\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 ipNF03.3\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JH-2-079-CL\",\n            \"@type\": \"rdfs:Class\",\n        
    \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JH-2-079-CL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"JH-2-079-CL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipNF04.4\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1ipNF04.4\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 ipNF04.4\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1Arg681*/Arg681*MEFs\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf1Arg681*/Arg681*MEFs\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf1Arg681*/Arg681* MEFs\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1Arg681*/+ES\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf1Arg681*/+ES\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n       
         }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf1Arg681*/+ ES\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CNF97.2a\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CNF97.2a\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cNF97.2a\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Schwanncelli28cNFNF1-/-(#14)\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Schwanncelli28cNFNF1-/-(#14)\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Schwann cell i28cNF NF1 -/- (#14)\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipNF00.6\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1ipNF00.6\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 
ipNF00.6\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NCC-MPNST3-X2-C1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NCC-MPNST3-X2-C1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NCC-MPNST3-X2-C1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JHU2-002-PDX\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JHU2-002-PDX\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"JHU 2-002-PDX\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipNF95.11bC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1ipNF95.11bC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 ipNF95.11b C\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:5PNFTDiPSsvMM4\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"5PNFTDiPSsvMM4\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"5PNF_TDiPSsv_MM_4\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTNF1ipn06.2A\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTNF1ipn06.2A\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT NF1 ipn06.2 A\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HTERTSCipn02.8\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HTERTSCipn02.8\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"hTERT SC ipn02.8\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SZ-NF2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SZ-NF2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n           
     }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SZ-NF2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SMPNST\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SMPNST\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sMPNST\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HS02\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HS02\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HS02\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HS05\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HS05\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HS05\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            
\"@id\": \"bts:B6;129S2-Trp53tm1TyjNf1tm1Tyj/J\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"B6;129S2-Trp53tm1TyjNf1tm1Tyj/J\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"B6;129S2-Trp53tm1Tyj Nf1tm1Tyj/J\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Prss56Cre;R26mT\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Prss56Cre;R26mT\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Prss56Cre;R26mT\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:B6.129S1-Nf1tm1Cbr/J\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"B6.129S1-Nf1tm1Cbr/J\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"B6.129S1-Nf1tm1Cbr/J\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1-OPG\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf1-OPG\",\n  
          \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf1-OPG\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:C57BL/6J\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"C57BL/6J\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"C57BL/6J\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NRG\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NRG\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NRG\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1-OPG-Arg816\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf1-OPG-Arg816\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"Nf1-OPG-Arg816\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NODscidgamma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NODscidgamma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NOD scid gamma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:B6.129S2-Nf1tm1Tyj/J\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"B6.129S2-Nf1tm1Tyj/J\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"B6.129S2-Nf1tm1Tyj/J\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:B6;129-Trp53tm1TyjNf1tm1TyjSuz12Gt(Betageo)1Khe/KcichJ\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"B6;129-Trp53tm1TyjNf1tm1TyjSuz12Gt(Betageo)1Khe/KcichJ\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"B6;129-Trp53tm1Tyj Nf1tm1Tyj Suz12Gt(Betageo)1Khe/KcichJ\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:B6.129(Cg)-Nf1tm1Par/J\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"B6.129(Cg)-Nf1tm1Par/J\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"B6.129(Cg)-Nf1tm1Par/J\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GFAP-Cre;Nf1-G848R/Flox\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GFAP-Cre;Nf1-G848R/Flox\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GFAP-Cre; Nf1-G848R/Flox\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GFAP-Cre;Nf1-R681X/Flox\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GFAP-Cre;Nf1-R681X/Flox\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GFAP-Cre; Nf1-R681X/Flox\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GFAP-Cre;Nf1-C383X/Flox\",\n         
   \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GFAP-Cre;Nf1-C383X/Flox\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GFAP-Cre; Nf1-C383X/Flox\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1-C848R/Flox\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf1-C848R/Flox\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf1-C848R/Flox\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1-R681X/Flox\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf1-R681X/Flox\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf1-R681X/Flox\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1-C383X/Flox\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf1-C383X/Flox\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": 
\"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf1-C383X/Flox\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1flox/flox\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nf1flox/flox\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nf1flox/flox\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DermalNeurofibromas\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Dermal neurofibromas.\",\n            \"rdfs:label\": \"DermalNeurofibromas\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Scattered\"\n                },\n                {\n                    \"@id\": \"bts:Dense\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"dermalNeurofibromas\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:PlateWell\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"User-specified identifier for the specific well of the plate used to prepare the sample for analysis.\",\n            \"rdfs:label\": \"PlateWell\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"plateWell\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ChipID\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"User-specified identifier for the chip used to perform the methylation microarray.\",\n            \"rdfs:label\": \"ChipID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"chipID\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProteinExtractSource\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Source of the extracted protein used in the experiment\",\n            \"rdfs:label\": \"ProteinExtractSource\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Celllysate\"\n                },\n                {\n                    \"@id\": \"bts:Cytoplasm\"\n                },\n                {\n   
                 \"@id\": \"bts:Mitochondria\"\n                },\n                {\n                    \"@id\": \"bts:Nuclearextract\"\n                }\n            ],\n            \"sms:displayName\": \"proteinExtractSource\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Celllysate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Celllysate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProteinExtractSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cell lysate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cytoplasm\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cytoplasm\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProteinExtractSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cytoplasm\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nuclearextract\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nuclearextract\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProteinExtractSource\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"nuclear 
extract\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DataCollectionMode\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Mode of data collection in tandem MS assays. Either DDA (data-dependent acquisition) or DIA (data-independent acquisition).\",\n            \"rdfs:label\": \"DataCollectionMode\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:DDA\"\n                },\n                {\n                    \"@id\": \"bts:DIA\"\n                },\n                {\n                    \"@id\": \"bts:Other\"\n                }\n            ],\n            \"sms:displayName\": \"dataCollectionMode\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DDA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DDA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataCollectionMode\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DDA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DIA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DIA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:DataCollectionMode\"\n                }\n            ],\n  
          \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DIA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BodySite\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Sample location referring to a named area of the body, inclusive of gross anatomical structures and organ parts.\",\n            \"rdfs:label\": \"BodySite\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Axilla\"\n                },\n                {\n                    \"@id\": \"bts:Groin\"\n                },\n                {\n                    \"@id\": \"bts:Leg\"\n                },\n                {\n                    \"@id\": \"bts:Thoracicspine\"\n                },\n                {\n                    \"@id\": \"bts:Forearm\"\n                },\n                {\n                    \"@id\": \"bts:Acetabulum\"\n                },\n                {\n                    \"@id\": \"bts:Muscle\"\n                },\n                {\n                    \"@id\": \"bts:Finger\"\n                },\n                {\n                    \"@id\": \"bts:Iliacspine\"\n                },\n                {\n                    \"@id\": \"bts:Head\"\n                },\n                {\n                    \"@id\": \"bts:Neck\"\n                },\n                {\n                    \"@id\": \"bts:Shoulder\"\n                },\n                {\n                    \"@id\": \"bts:Back\"\n                },\n                {\n                    \"@id\": \"bts:Spine\"\n                },\n 
               {\n                    \"@id\": \"bts:Scalp\"\n                },\n                {\n                    \"@id\": \"bts:Scapula\"\n                },\n                {\n                    \"@id\": \"bts:Pelvis\"\n                },\n                {\n                    \"@id\": \"bts:Dorsolateralprefrontalcortex\"\n                },\n                {\n                    \"@id\": \"bts:Occcipitallobe\"\n                }\n            ],\n            \"sms:displayName\": \"bodySite\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Axilla\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Axilla\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"axilla\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Groin\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Groin\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"groin\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Leg\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Leg\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n    
            }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"leg\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Thoracicspine\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Thoracicspine\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"thoracic spine\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Forearm\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Forearm\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"forearm\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Acetabulum\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Acetabulum\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"acetabulum\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        
},\n        {\n            \"@id\": \"bts:Muscle\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Muscle\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"muscle\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Finger\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Finger\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"finger\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Iliacspine\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Iliacspine\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"iliac spine\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Head\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Head\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            
],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"head\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Neck\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Neck\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"neck\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Shoulder\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Shoulder\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"shoulder\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Back\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Back\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"back\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Spine\",\n            
\"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Spine\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"spine\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Scalp\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Scalp\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"scalp\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Scapula\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Scapula\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"scapula\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Pelvis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Pelvis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"pelvis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Dorsolateralprefrontalcortex\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Dorsolateralprefrontalcortex\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"dorsolateral prefrontal cortex\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Occcipitallobe\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Occcipitallobe\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BodySite\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"occcipital lobe\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LongBoneDysplasia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Long bone dysplasia.\",\n            \"rdfs:label\": \"LongBoneDysplasia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": 
\"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"longBoneDysplasia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LibraryPrep\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The general strategy by which the library was prepared\",\n            \"rdfs:label\": \"LibraryPrep\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:LncRNAenrichment\"\n                },\n                {\n                    \"@id\": \"bts:MiRNAenrichment\"\n                },\n                {\n                    \"@id\": \"bts:PolyAselection\"\n                },\n                {\n                    \"@id\": \"bts:RRNAdepletion\"\n                }\n            ],\n            \"sms:displayName\": \"libraryPrep\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LncRNAenrichment\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LncRNAenrichment\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"lncRNAenrichment\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MiRNAenrichment\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MiRNAenrichment\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"miRNAenrichment\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PolyAselection\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PolyAselection\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"polyAselection\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RRNAdepletion\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RRNAdepletion\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"rRNAdepletion\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TumorType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The type of tumor that the biospecimen used to generate the data were 
collected from.\",\n            \"rdfs:label\": \"TumorType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:DiffuseAstrocytoma\"\n                },\n                {\n                    \"@id\": \"bts:SubcutaneousNeurofibroma\"\n                },\n                {\n                    \"@id\": \"bts:PilomyxoidAstrocytoma\"\n                },\n                {\n                    \"@id\": \"bts:JuvenileMyelomonocyticLeukemia\"\n                },\n                {\n                    \"@id\": \"bts:AnaplasticAstrocytoma\"\n                },\n                {\n                    \"@id\": \"bts:MalignantPeripheralNerveSheathTumor\"\n                },\n                {\n                    \"@id\": \"bts:LocalizedNeurofibroma\"\n                },\n                {\n                    \"@id\": \"bts:CutaneousNeurofibroma\"\n                },\n                {\n                    \"@id\": \"bts:ColorectalAdenocarcinoma\"\n                },\n                {\n                    \"@id\": \"bts:AnaplasticPilocyticAstrocytoma\"\n                },\n                {\n                    \"@id\": \"bts:Schwannoma\"\n                },\n                {\n                    \"@id\": \"bts:HemorrhagicNeoplasm\"\n                },\n                {\n                    \"@id\": \"bts:NotApplicable\"\n                },\n                {\n                    \"@id\": \"bts:PilocyticAstrocytoma\"\n                },\n                {\n                    \"@id\": \"bts:Ganglioglioma\"\n                },\n                {\n                    \"@id\": \"bts:PlexiformNeurofibroma\"\n                },\n                {\n                    \"@id\": 
\"bts:NF2-AssociatedTumor\"\n                },\n                {\n                    \"@id\": \"bts:Fibrosarcoma\"\n                },\n                {\n                    \"@id\": \"bts:Low-GradeGliomaNOS\"\n                },\n                {\n                    \"@id\": \"bts:Glioblastoma\"\n                },\n                {\n                    \"@id\": \"bts:AnaplasticPleomorphicXanthoastrocytoma\"\n                },\n                {\n                    \"@id\": \"bts:GlioblastomaMultiforme\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                },\n                {\n                    \"@id\": \"bts:AtypicalPilocyticAstrocytoma\"\n                },\n                {\n                    \"@id\": \"bts:OpticPathwayGlioma\"\n                },\n                {\n                    \"@id\": \"bts:DiffuseInfiltratingNeurofibroma\"\n                },\n                {\n                    \"@id\": \"bts:Fibromatosis\"\n                },\n                {\n                    \"@id\": \"bts:NeurofibromawithDegenerativeAtypia\"\n                },\n                {\n                    \"@id\": \"bts:NecroticNeoplasm\"\n                },\n                {\n                    \"@id\": \"bts:PleomorphicXanthoastrocytoma\"\n                },\n                {\n                    \"@id\": \"bts:Teratoma\"\n                },\n                {\n                    \"@id\": \"bts:Sarcoma\"\n                },\n                {\n                    \"@id\": \"bts:MassiveSoftTissueNeurofibroma\"\n                },\n                {\n                    \"@id\": \"bts:Glioma\"\n                },\n                {\n                    \"@id\": \"bts:Oligoastrocytoma\"\n                },\n                {\n                    \"@id\": \"bts:ColorectalCarcinoma\"\n                },\n                {\n                    \"@id\": \"bts:Meningioma\"\n                },\n                {\n 
                   \"@id\": \"bts:ANNUBP\"\n                },\n                {\n                    \"@id\": \"bts:High-GradeGliomaNOS\"\n                },\n                {\n                    \"@id\": \"bts:NF1-AssociatedTumor\"\n                },\n                {\n                    \"@id\": \"bts:AtypicalNeurofibroma\"\n                },\n                {\n                    \"@id\": \"bts:CellularNeurofibroma\"\n                },\n                {\n                    \"@id\": \"bts:Neurofibroma\"\n                },\n                {\n                    \"@id\": \"bts:RecurrentMPNST\"\n                },\n                {\n                    \"@id\": \"bts:AnaplasticGanglioglioma\"\n                },\n                {\n                    \"@id\": \"bts:Tumor\"\n                },\n                {\n                    \"@id\": \"bts:Metastatictumor\"\n                },\n                {\n                    \"@id\": \"bts:Metastatic/recurrenttumor\"\n                },\n                {\n                    \"@id\": \"bts:Recurrenttumor\"\n                },\n                {\n                    \"@id\": \"bts:Melanoma\"\n                },\n                {\n                    \"@id\": \"bts:NodularNeurofibroma\"\n                }\n            ],\n            \"sms:displayName\": \"tumorType\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DiffuseAstrocytoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DiffuseAstrocytoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Diffuse Astrocytoma\",\n            \"sms:required\": \"sms:false\",\n 
           \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SubcutaneousNeurofibroma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SubcutaneousNeurofibroma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Subcutaneous Neurofibroma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PilomyxoidAstrocytoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PilomyxoidAstrocytoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Pilomyxoid Astrocytoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JuvenileMyelomonocyticLeukemia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JuvenileMyelomonocyticLeukemia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Juvenile Myelomonocytic Leukemia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AnaplasticAstrocytoma\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AnaplasticAstrocytoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Anaplastic Astrocytoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MalignantPeripheralNerveSheathTumor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MalignantPeripheralNerveSheathTumor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Malignant Peripheral Nerve Sheath Tumor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LocalizedNeurofibroma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LocalizedNeurofibroma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Localized Neurofibroma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CutaneousNeurofibroma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CutaneousNeurofibroma\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cutaneous Neurofibroma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ColorectalAdenocarcinoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ColorectalAdenocarcinoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Colorectal Adenocarcinoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AnaplasticPilocyticAstrocytoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AnaplasticPilocyticAstrocytoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Anaplastic Pilocytic Astrocytoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Schwannoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Schwannoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                },\n                {\n                    \"@id\": 
\"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"schwannoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HemorrhagicNeoplasm\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HemorrhagicNeoplasm\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Hemorrhagic Neoplasm\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PilocyticAstrocytoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PilocyticAstrocytoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Pilocytic Astrocytoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Ganglioglioma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Ganglioglioma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"Ganglioglioma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PlexiformNeurofibroma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PlexiformNeurofibroma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Plexiform Neurofibroma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NF2-AssociatedTumor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NF2-AssociatedTumor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NF2-Associated Tumor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Fibrosarcoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Fibrosarcoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Fibrosarcoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Low-GradeGliomaNOS\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Low-GradeGliomaNOS\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Low-Grade Glioma NOS\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Glioblastoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Glioblastoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Glioblastoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AnaplasticPleomorphicXanthoastrocytoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AnaplasticPleomorphicXanthoastrocytoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Anaplastic Pleomorphic Xanthoastrocytoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GlioblastomaMultiforme\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GlioblastomaMultiforme\",\n            \"rdfs:subClassOf\": [\n      
          {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Glioblastoma Multiforme\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AtypicalPilocyticAstrocytoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AtypicalPilocyticAstrocytoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Atypical Pilocytic Astrocytoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OpticPathwayGlioma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OpticPathwayGlioma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Optic Pathway Glioma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DiffuseInfiltratingNeurofibroma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DiffuseInfiltratingNeurofibroma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n         
       \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Diffuse Infiltrating Neurofibroma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Fibromatosis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Fibromatosis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Fibromatosis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NeurofibromawithDegenerativeAtypia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NeurofibromawithDegenerativeAtypia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Neurofibroma with Degenerative Atypia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NecroticNeoplasm\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NecroticNeoplasm\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Necrotic Neoplasm\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PleomorphicXanthoastrocytoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PleomorphicXanthoastrocytoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Pleomorphic Xanthoastrocytoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Teratoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Teratoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"teratoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sarcoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sarcoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Sarcoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MassiveSoftTissueNeurofibroma\",\n           
 \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MassiveSoftTissueNeurofibroma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Massive Soft Tissue Neurofibroma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Glioma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Glioma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Glioma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Oligoastrocytoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Oligoastrocytoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Oligoastrocytoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ColorectalCarcinoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ColorectalCarcinoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": 
\"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Colorectal Carcinoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Meningioma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Meningioma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Notimaged\"\n                },\n                {\n                    \"@id\": \"bts:Absentbyimaging\"\n                },\n                {\n                    \"@id\": \"bts:Single\"\n                },\n                {\n                    \"@id\": \"bts:Multiple\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"meningioma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ANNUBP\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ANNUBP\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ANNUBP\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:High-GradeGliomaNOS\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"High-GradeGliomaNOS\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"High-Grade Glioma NOS\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NF1-AssociatedTumor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NF1-AssociatedTumor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NF1-Associated Tumor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AtypicalNeurofibroma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AtypicalNeurofibroma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Atypical Neurofibroma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CellularNeurofibroma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n      
      \"rdfs:label\": \"CellularNeurofibroma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cellular Neurofibroma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Neurofibroma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Neurofibroma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Neurofibroma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RecurrentMPNST\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RecurrentMPNST\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Recurrent MPNST\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AnaplasticGanglioglioma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AnaplasticGanglioglioma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": 
{\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Anaplastic Ganglioglioma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Metastatictumor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Metastatictumor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"metastatic tumor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Metastatic/recurrenttumor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Metastatic/recurrenttumor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"metastatic/recurrent tumor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Recurrenttumor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Recurrenttumor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"recurrent tumor\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Melanoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Melanoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Melanoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NodularNeurofibroma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NodularNeurofibroma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TumorType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nodular Neurofibroma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WorkingDistance\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Working distance of the lens expressed as a floating point number > 0.\",\n            \"rdfs:label\": \"WorkingDistance\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"workingDistance\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:WorkingDistanceUnit\"\n                }\n            ],\n            
\"sms:validationRules\": [\n                \"num\"\n            ]\n        },\n        {\n            \"@id\": \"bts:NonvestibularCranialSchwannoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Nonvestibular cranial schwannoma.\",\n            \"rdfs:label\": \"NonvestibularCranialSchwannoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Notimaged\"\n                },\n                {\n                    \"@id\": \"bts:Absentbyimaging\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"nonvestibularCranialSchwannoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ExpressionUnit\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Measure used for transcript expression quantification\",\n            \"rdfs:label\": \"ExpressionUnit\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:TPM\"\n                },\n                {\n                    \"@id\": \"bts:RPKM\"\n                },\n                {\n                    \"@id\": \"bts:FPKM\"\n                },\n                {\n                    \"@id\": 
\"bts:Counts\"\n                },\n                {\n                    \"@id\": \"bts:RawCounts\"\n                },\n                {\n                    \"@id\": \"bts:NormalizedCounts\"\n                },\n                {\n                    \"@id\": \"bts:Other\"\n                }\n            ],\n            \"sms:displayName\": \"expressionUnit\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TPM\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TPM\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExpressionUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"TPM\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RPKM\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RPKM\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExpressionUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"RPKM\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FPKM\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FPKM\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExpressionUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n           
 },\n            \"sms:displayName\": \"FPKM\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Counts\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Counts\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExpressionUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Counts\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RawCounts\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RawCounts\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExpressionUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Raw Counts\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NormalizedCounts\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NormalizedCounts\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ExpressionUnit\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Normalized Counts\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PubertyOnset\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"Puberty onset.\",\n            \"rdfs:label\": \"PubertyOnset\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Prebubertal\"\n                },\n                {\n                    \"@id\": \"bts:Precocious\"\n                },\n                {\n                    \"@id\": \"bts:Normal\"\n                },\n                {\n                    \"@id\": \"bts:Late\"\n                }\n            ],\n            \"sms:displayName\": \"pubertyOnset\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Prebubertal\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Prebubertal\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:PubertyOnset\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"prebubertal\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Precocious\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Precocious\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:PubertyOnset\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"precocious\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Normal\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Normal\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:PubertyOnset\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"normal\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Late\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Late\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:PubertyOnset\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"late\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ReadsMappedPercent\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Percent of mapped reads collected from samtools\",\n            \"rdfs:label\": \"ReadsMappedPercent\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"readsMappedPercent\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"num\"\n            ]\n        },\n        {\n            \"@id\": \"bts:GlomusTumor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the 
manifestation of Glomus tumor.\",\n            \"rdfs:label\": \"GlomusTumor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"glomusTumor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StudyStatus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Status of a study.\",\n            \"rdfs:label\": \"StudyStatus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Active\"\n                },\n                {\n                    \"@id\": \"bts:Completed\"\n                },\n                {\n                    \"@id\": \"bts:Withdrawn\"\n                }\n            ],\n            \"sms:displayName\": \"studyStatus\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Active\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Active\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:StudyStatus\"\n                }\n            ],\n     
       \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Active\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Completed\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Completed\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:StudyStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Completed\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Withdrawn\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Withdrawn\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:StudyStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Withdrawn\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SphenoidDysplasia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Sphenoid dysplasia.\",\n            \"rdfs:label\": \"SphenoidDysplasia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n        
        },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"sphenoidDysplasia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NonvestibularSchwannomas\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Nonvestibular schwannomas.\",\n            \"rdfs:label\": \"NonvestibularSchwannomas\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Onlybyimagingevidence\"\n                },\n                {\n                    \"@id\": \"bts:1pathogicallyconfirmed\"\n                },\n                {\n                    \"@id\": \"bts:2ormore,atleast1pathogicallyconfirmed\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"nonvestibularSchwannomas\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Onlybyimagingevidence\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Onlybyimagingevidence\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:NonvestibularSchwannomas\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"only by imaging 
evidence\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:1pathogicallyconfirmed\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"1pathogicallyconfirmed\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:NonvestibularSchwannomas\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"1 pathogically confirmed\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:2ormore,atleast1pathogicallyconfirmed\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"2ormore,atleast1pathogicallyconfirmed\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:NonvestibularSchwannomas\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"2 or more, at least 1 pathogically confirmed\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GermlineMutation\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The individual's actual germline mutation.\",\n            \"rdfs:label\": \"GermlineMutation\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"germlineMutation\",\n            \"sms:required\": \"sms:false\",\n          
  \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SkinFoldFreckling\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of skin fold freckling.\",\n            \"rdfs:label\": \"SkinFoldFreckling\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"skinFoldFreckling\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ContentType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Refers to the type of content.\",\n            \"rdfs:label\": \"ContentType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"contentType\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Objective\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Microscope objective.\",\n            \"rdfs:label\": \"Objective\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"objective\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Leukemia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Leukemia.\",\n            \"rdfs:label\": \"Leukemia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"leukemia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SubcutaneousNodularNeurofibromas\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Subcutaneous nodular neurofibromas.\",\n            \"rdfs:label\": \"SubcutaneousNodularNeurofibromas\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Scattered\"\n                },\n                {\n                    \"@id\": \"bts:Dense\"\n                },\n                {\n                    \"@id\": 
\"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"subcutaneousNodularNeurofibromas\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AverageInsertSize\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Average insert size as reported by samtools\",\n            \"rdfs:label\": \"AverageInsertSize\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"averageInsertSize\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CellID\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Also known as cell barcode, this value can be added for single-cell experiments to identify data at the cell level.\",\n            \"rdfs:label\": \"CellID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cellID\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BatchID\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Batch identifier, can be used in any context where added batch information is helpful, such as different sequencing runs or collection times.\",\n            \"rdfs:label\": \"BatchID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"batchID\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IndividualID\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A unique identifier (non-PII) that represents the individual from which the data came. This could be a patient or animal ID.\",\n            \"rdfs:label\": \"IndividualID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"individualID\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": [\n                \"list like\"\n            ]\n        },\n        {\n            \"@id\": \"bts:FailedQC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Whether the sample or data failed QC checks (Yes, No)\",\n            \"rdfs:label\": \"FailedQC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Yes\"\n                },\n                {\n                    \"@id\": \"bts:No\"\n                }\n            ],\n            \"sms:displayName\": \"failedQC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VascularDisease\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Vascular 
disease.\",\n            \"rdfs:label\": \"VascularDisease\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"vascularDisease\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CompoundName\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"//pubchem.ncbi.nlm.nih.gov/compound/10127622)\",\n            \"rdfs:label\": \"CompoundName\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"compoundName\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nf1GermlineMutation\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"NF1 germline mutation.\",\n            \"rdfs:label\": \"Nf1GermlineMutation\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"nf1GermlineMutation\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        
},\n        {\n            \"@id\": \"bts:IsFilteredReads\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Whether the reads in the processed result has been filtered by adding a 'PASS' filter or other filters as determined by the data generator\",\n            \"rdfs:label\": \"IsFilteredReads\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"isFilteredReads\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RunType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Sequencing run type.\",\n            \"rdfs:label\": \"RunType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:PairedEnd\"\n                },\n                {\n                    \"@id\": \"bts:SingleEnd\"\n                }\n            ],\n            \"sms:displayName\": \"runType\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PairedEnd\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PairedEnd\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:RunType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"pairedEnd\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SingleEnd\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SingleEnd\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:RunType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"singleEnd\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LensAperture\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Numerical aperture of the lens. Floating point value > 0.\",\n            \"rdfs:label\": \"LensAperture\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"lensAperture\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"num\"\n            ]\n        },\n        {\n            \"@id\": \"bts:ReferenceSet\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A set of references (e.g., canonical assembled contigs) which defines a common coordinate space for comparing reference-aligned experimental data.\",\n            \"rdfs:label\": \"ReferenceSet\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"referenceSet\",\n            
\"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AttentionDeficitDisorder\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Characterization of the manifestation of Attention Deficit Disorder.\",\n            \"rdfs:label\": \"AttentionDeficitDisorder\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Absent\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"attentionDeficitDisorder\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ParentId\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The id of the parent resource, i.e. the parent folder on the platform. 
\",\n            \"rdfs:label\": \"ParentId\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"parentId\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DiseaseFocus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Disease that acts as the main topic.\",\n            \"rdfs:label\": \"DiseaseFocus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"diseaseFocus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TransplantationRecipientSpecies\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Species into which donor  was grown\",\n            \"rdfs:label\": \"TransplantationRecipientSpecies\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Human\"\n                },\n                {\n                    \"@id\": \"bts:Mouse\"\n                }\n            ],\n            \"sms:displayName\": \"transplantationRecipientSpecies\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Human\",\n            
\"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Human\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientSpecies\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Human\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Mouse\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Mouse\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:TransplantationRecipientSpecies\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Mouse\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AcknowledgementStatements\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Statement describing how resource use should be acknowledged.\",\n            \"rdfs:label\": \"AcknowledgementStatements\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"acknowledgementStatements\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ExperimentId\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"When applicable, an optional identifier that can be used to distinguish or group the experiments 
that generated the data; also can be used to denote internal batch reference if needed.\",\n            \"rdfs:label\": \"ExperimentId\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"experimentId\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CellType\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A cell type is a distinct morphological or functional form of cell.\",\n            \"rdfs:label\": \"CellType\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Monocytes\"\n                },\n                {\n                    \"@id\": \"bts:Macrophages\"\n                },\n                {\n                    \"@id\": \"bts:IPSC-derivedneuron\"\n                },\n                {\n                    \"@id\": \"bts:Lymphoblast\"\n                },\n                {\n                    \"@id\": \"bts:IPSC\"\n                },\n                {\n                    \"@id\": \"bts:DRG/nerverootneurospherecell\"\n                },\n                {\n                    \"@id\": \"bts:CD138+\"\n                },\n                {\n                    \"@id\": \"bts:Schwannoma\"\n                },\n                {\n                    \"@id\": \"bts:IPSC-derivedtelencephalicorganoids\"\n                },\n                {\n                    \"@id\": \"bts:Monocyte-derivedmicroglia\"\n                },\n         
       {\n                    \"@id\": \"bts:Microglia\"\n                },\n                {\n                    \"@id\": \"bts:SH-SY5Y\"\n                },\n                {\n                    \"@id\": \"bts:CNON\"\n                },\n                {\n                    \"@id\": \"bts:NeuN+\"\n                },\n                {\n                    \"@id\": \"bts:CulturedMullerglia\"\n                },\n                {\n                    \"@id\": \"bts:B-lymphocytes\"\n                },\n                {\n                    \"@id\": \"bts:Round\"\n                },\n                {\n                    \"@id\": \"bts:Epithelial\"\n                },\n                {\n                    \"@id\": \"bts:Epithelial-like\"\n                },\n                {\n                    \"@id\": \"bts:CD8+T-Cells\"\n                },\n                {\n                    \"@id\": \"bts:GLUtamatergicneurons\"\n                },\n                {\n                    \"@id\": \"bts:Arachnoid\"\n                },\n                {\n                    \"@id\": \"bts:GABAergicneurons\"\n                },\n                {\n                    \"@id\": \"bts:Schwann\"\n                },\n                {\n                    \"@id\": \"bts:IPSC-derivedglia\"\n                },\n                {\n                    \"@id\": \"bts:IPSC-derivedastrocytes\"\n                },\n                {\n                    \"@id\": \"bts:IPSC-derivedneuronalprogenitorcell\"\n                },\n                {\n                    \"@id\": \"bts:Oligodendrocyte\"\n                },\n                {\n                    \"@id\": \"bts:Fibroblast\"\n                },\n                {\n                    \"@id\": \"bts:Astrocytes\"\n                },\n                {\n                    \"@id\": \"bts:Schwanncellprecursor\"\n                },\n                {\n                    \"@id\": \"bts:NeuN-\"\n                },\n            
    {\n                    \"@id\": \"bts:Embryonicstemcells\"\n                },\n                {\n                    \"@id\": \"bts:Teratoma\"\n                },\n                {\n                    \"@id\": \"bts:Meningioma\"\n                }\n            ],\n            \"sms:displayName\": \"cellType\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": [\n                \"list like\"\n            ]\n        },\n        {\n            \"@id\": \"bts:Monocytes\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Monocytes\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"monocytes\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Macrophages\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Macrophages\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"macrophages\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IPSC-derivedneuron\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IPSC-derivedneuron\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                
\"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"iPSC-derived neuron\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Lymphoblast\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Lymphoblast\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"lymphoblast\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IPSC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IPSC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"iPSC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DRG/nerverootneurospherecell\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DRG/nerverootneurospherecell\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DRG/nerve root neurosphere cell\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            
\"@id\": \"bts:CD138+\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CD138+\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CD138+\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IPSC-derivedtelencephalicorganoids\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IPSC-derivedtelencephalicorganoids\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"iPSC-derived telencephalic organoids\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Monocyte-derivedmicroglia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Monocyte-derivedmicroglia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"monocyte-derived microglia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Microglia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Microglia\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"microglia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SH-SY5Y\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SH-SY5Y\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SH-SY5Y\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CNON\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CNON\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CNON\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NeuN+\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NeuN+\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NeuN+\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CulturedMullerglia\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CulturedMullerglia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cultured Muller glia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:B-lymphocytes\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"B-lymphocytes\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"B-lymphocytes\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Round\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Round\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"round\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Epithelial\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Epithelial\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"epithelial\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Epithelial-like\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Epithelial-like\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"epithelial-like\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CD8+T-Cells\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CD8+T-Cells\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CD8+ T-Cells\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GLUtamatergicneurons\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GLUtamatergicneurons\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n        
    \"sms:displayName\": \"GLUtamatergic neurons\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Arachnoid\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Arachnoid\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"arachnoid\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GABAergicneurons\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GABAergicneurons\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GABAergic neurons\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Schwann\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Schwann\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"schwann\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IPSC-derivedglia\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IPSC-derivedglia\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"iPSC-derived glia\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IPSC-derivedastrocytes\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IPSC-derivedastrocytes\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"iPSC-derived astrocytes\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IPSC-derivedneuronalprogenitorcell\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IPSC-derivedneuronalprogenitorcell\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"iPSC-derived neuronal progenitor cell\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Oligodendrocyte\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Oligodendrocyte\",\n            \"rdfs:subClassOf\": [\n                {\n                  
  \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"oligodendrocyte\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Fibroblast\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Fibroblast\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"fibroblast\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Astrocytes\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Astrocytes\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"astrocytes\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Schwanncellprecursor\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Schwanncellprecursor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Schwann cell precursor\",\n            
\"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NeuN-\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NeuN-\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NeuN-\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Embryonicstemcells\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Embryonicstemcells\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:CellType\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Embryonic stem cells\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ModifiedBy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Refers to a user who last modified the resource on the platform.\",\n            \"rdfs:label\": \"ModifiedBy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"modifiedBy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ParentSpecimenID\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"A unique identifier (non-PII) that represents the parent specimen (sample) from which the data came from, e.g. the single parent tumor that was subsectioned into several samples.  The parentSpecimenID can be the same as specimenID when there is no subsectioning.\\n\",\n            \"rdfs:label\": \"ParentSpecimenID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"parentSpecimenID\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ExperimentalTimepoint\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The numeric value indicating the time elapsed from the beginning of the experiment at which the specimen was collected. Use in tandem with timePointUnit\",\n            \"rdfs:label\": \"ExperimentalTimepoint\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"experimentalTimepoint\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                }\n            ],\n            \"sms:validationRules\": [\n                \"num\"\n            ]\n        },\n        {\n            \"@id\": \"bts:VitalStatus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Vital status of the patient.\",\n            \"rdfs:label\": \"VitalStatus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                
}\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Alive\"\n                },\n                {\n                    \"@id\": \"bts:Deceased\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"vitalStatus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Alive\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Alive\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:VitalStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"alive\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Deceased\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Deceased\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:VitalStatus\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"deceased\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProtocolAssay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Main assay type that this protocol is related to, e.g. this is a prep protocol for single-cell RNA-seq assay. 
This is especially helpful for newly-developed or in-house assays.\\n\",\n            \"rdfs:label\": \"ProtocolAssay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:2DAlamarBlueabsorbance\"\n                },\n                {\n                    \"@id\": \"bts:2DAlamarBluefluorescence\"\n                },\n                {\n                    \"@id\": \"bts:3Dconfocalimaging\"\n                },\n                {\n                    \"@id\": \"bts:3Delectronmicroscopy\"\n                },\n                {\n                    \"@id\": \"bts:3Dimaging\"\n                },\n                {\n                    \"@id\": \"bts:3Dmicrotissueviability\"\n                },\n                {\n                    \"@id\": \"bts:Actigraphy\"\n                },\n                {\n                    \"@id\": \"bts:AlgometRxNociometer\"\n                },\n                {\n                    \"@id\": \"bts:Auditorybrainstemresponse\"\n                },\n                {\n                    \"@id\": \"bts:ATAC-seq\"\n                },\n                {\n                    \"@id\": \"bts:ATPaseactivityassay\"\n                },\n                {\n                    \"@id\": \"bts:BrdUproliferationassay\"\n                },\n                {\n                    \"@id\": \"bts:CAPP-seq\"\n                },\n                {\n                    \"@id\": \"bts:CUT&RUN\"\n                },\n                {\n                    \"@id\": \"bts:ChIP-seq\"\n                },\n                {\n                    \"@id\": \"bts:ChildBehaviorChecklistforAges1.5-5\"\n                },\n                {\n                    \"@id\": 
\"bts:ChildBehaviorChecklistforAges6-18\"\n                },\n                {\n                    \"@id\": \"bts:CODEX\"\n                },\n                {\n                    \"@id\": \"bts:Confocalmicroscopy\"\n                },\n                {\n                    \"@id\": \"bts:Corsiblocks\"\n                },\n                {\n                    \"@id\": \"bts:Currentclampassay\"\n                },\n                {\n                    \"@id\": \"bts:DiffusionMRI\"\n                },\n                {\n                    \"@id\": \"bts:Distortionproductotoacousticemissions\"\n                },\n                {\n                    \"@id\": \"bts:DNAopticalmapping\"\n                },\n                {\n                    \"@id\": \"bts:ELISA\"\n                },\n                {\n                    \"@id\": \"bts:ERRbisulfitesequencing\"\n                },\n                {\n                    \"@id\": \"bts:EdUproliferationassay\"\n                },\n                {\n                    \"@id\": \"bts:FIA-MSMS\"\n                },\n                {\n                    \"@id\": \"bts:FLIPRhigh-throughputcellularscreening\"\n                },\n                {\n                    \"@id\": \"bts:FluorescenceInSituHybridization\"\n                },\n                {\n                    \"@id\": \"bts:Focusgroup\"\n                },\n                {\n                    \"@id\": \"bts:FTIRspectroscopy\"\n                },\n                {\n                    \"@id\": \"bts:HI-C\"\n                },\n                {\n                    \"@id\": \"bts:HPLC\"\n                },\n                {\n                    \"@id\": \"bts:Interview\"\n                },\n                {\n                    \"@id\": \"bts:ISO-seq\"\n                },\n                {\n                    \"@id\": \"bts:MIB/MS\"\n                },\n                {\n                    \"@id\": 
\"bts:Matrigel-basedtumorigenesisassay\"\n                },\n                {\n                    \"@id\": \"bts:MudPIT\"\n                },\n                {\n                    \"@id\": \"bts:NIHToolbox\"\n                },\n                {\n                    \"@id\": \"bts:NOMe-seq\"\n                },\n                {\n                    \"@id\": \"bts:RNAarray\"\n                },\n                {\n                    \"@id\": \"bts:RNA-seq\"\n                },\n                {\n                    \"@id\": \"bts:RPPA\"\n                },\n                {\n                    \"@id\": \"bts:RiccardiandAblonscales\"\n                },\n                {\n                    \"@id\": \"bts:SNParray\"\n                },\n                {\n                    \"@id\": \"bts:SUSHI\"\n                },\n                {\n                    \"@id\": \"bts:Sangersequencing\"\n                },\n                {\n                    \"@id\": \"bts:SocialResponsivenessScale\"\n                },\n                {\n                    \"@id\": \"bts:SocialResponsivenessScale,SecondEdition\"\n                },\n                {\n                    \"@id\": \"bts:Tcellreceptorrepertoiresequencing\"\n                },\n                {\n                    \"@id\": \"bts:TIDE\"\n                },\n                {\n                    \"@id\": \"bts:TMTquantitation\"\n                },\n                {\n                    \"@id\": \"bts:TriKineticsactivitymonitoring\"\n                },\n                {\n                    \"@id\": \"bts:VonFreytest\"\n                },\n                {\n                    \"@id\": \"bts:Activeavoidancelearningbehaviorassay\"\n                },\n                {\n                    \"@id\": \"bts:Array\"\n                },\n                {\n                    \"@id\": \"bts:Atomicforcemicroscopy\"\n                },\n                {\n                    \"@id\": 
\"bts:Autoradiography\"\n                },\n                {\n                    \"@id\": \"bts:Bisulfitesequencing\"\n                },\n                {\n                    \"@id\": \"bts:Bloodchemistrymeasurement\"\n                },\n                {\n                    \"@id\": \"bts:BluenativePAGE\"\n                },\n                {\n                    \"@id\": \"bts:Bodysizetraitmeasurement\"\n                },\n                {\n                    \"@id\": \"bts:Bonehistomorphometry\"\n                },\n                {\n                    \"@id\": \"bts:Brightfieldmicroscopy\"\n                },\n                {\n                    \"@id\": \"bts:CAMP-GloMaxAssay\"\n                },\n                {\n                    \"@id\": \"bts:Calciumretentioncapacityassay\"\n                },\n                {\n                    \"@id\": \"bts:Cellcompetition\"\n                },\n                {\n                    \"@id\": \"bts:Cellcount\"\n                },\n                {\n                    \"@id\": \"bts:Cellpainting\"\n                },\n                {\n                    \"@id\": \"bts:Cellproliferation\"\n                },\n                {\n                    \"@id\": \"bts:Cellviabilityassay\"\n                },\n                {\n                    \"@id\": \"bts:Clinicaldata\"\n                },\n                {\n                    \"@id\": \"bts:CNF-Skindex\"\n                },\n                {\n                    \"@id\": \"bts:Cognitiveassessment\"\n                },\n                {\n                    \"@id\": \"bts:Combinationlibraryscreen\"\n                },\n                {\n                    \"@id\": \"bts:Combinationscreen\"\n                },\n                {\n                    \"@id\": \"bts:ComplexIIenzymeactivityassay\"\n                },\n                {\n                    \"@id\": \"bts:Compoundscreen\"\n                },\n                {\n            
        \"@id\": \"bts:Contextualconditioningbehaviorassay\"\n                },\n                {\n                    \"@id\": \"bts:ConventionalMRI\"\n                },\n                {\n                    \"@id\": \"bts:Children'sDermatologyLifeQualityIndexQuestionnaire\"\n                },\n                {\n                    \"@id\": \"bts:Differentialscanningcalorimetry\"\n                },\n                {\n                    \"@id\": \"bts:Dynamiclightscattering\"\n                },\n                {\n                    \"@id\": \"bts:Electrochemiluminescence\"\n                },\n                {\n                    \"@id\": \"bts:Electrophoreticlightscattering\"\n                },\n                {\n                    \"@id\": \"bts:Elevatedplusmazetest\"\n                },\n                {\n                    \"@id\": \"bts:FACE-QAppearance-relatedDistress\"\n                },\n                {\n                    \"@id\": \"bts:Flowcytometry\"\n                },\n                {\n                    \"@id\": \"bts:Focusformingassay\"\n                },\n                {\n                    \"@id\": \"bts:FunctionalMRI\"\n                },\n                {\n                    \"@id\": \"bts:Gaitmeasurement\"\n                },\n                {\n                    \"@id\": \"bts:Gelfiltrationchromatography\"\n                },\n                {\n                    \"@id\": \"bts:Gelpermeationchromatography\"\n                },\n                {\n                    \"@id\": \"bts:Genotyping\"\n                },\n                {\n                    \"@id\": \"bts:Highcontentscreen\"\n                },\n                {\n                    \"@id\": \"bts:Highfrequencyultrasound\"\n                },\n                {\n                    \"@id\": \"bts:High-performanceliquidchromatography/tandemmassspectrometry\"\n                },\n                {\n                    \"@id\": 
\"bts:Immunoassay\"\n                },\n                {\n                    \"@id\": \"bts:Immunocytochemistry\"\n                },\n                {\n                    \"@id\": \"bts:Immunofluorescence\"\n                },\n                {\n                    \"@id\": \"bts:Immunohistochemistry\"\n                },\n                {\n                    \"@id\": \"bts:Insilicosynthesis\"\n                },\n                {\n                    \"@id\": \"bts:Invitrotumorigenesis\"\n                },\n                {\n                    \"@id\": \"bts:InvivoPDXviability\"\n                },\n                {\n                    \"@id\": \"bts:Invivobioluminescence\"\n                },\n                {\n                    \"@id\": \"bts:Invivotumorgrowth\"\n                },\n                {\n                    \"@id\": \"bts:Jumpinglibrary\"\n                },\n                {\n                    \"@id\": \"bts:Labelfreemassspectrometry\"\n                },\n                {\n                    \"@id\": \"bts:Laserspeckleimaging\"\n                },\n                {\n                    \"@id\": \"bts:Lightscatteringassay\"\n                },\n                {\n                    \"@id\": \"bts:Liquidchromatography-electrochemicaldetection\"\n                },\n                {\n                    \"@id\": \"bts:Liquidchromatography/massspectrometry\"\n                },\n                {\n                    \"@id\": \"bts:Liquidchromatography/tandemmassspectrometry\"\n                },\n                {\n                    \"@id\": \"bts:LncRNA-seq\"\n                },\n                {\n                    \"@id\": \"bts:Localfieldpotentialrecording\"\n                },\n                {\n                    \"@id\": \"bts:Longtermpotentiationassay\"\n                },\n                {\n                    \"@id\": \"bts:MRNAcounts\"\n                },\n                {\n                    \"@id\": 
\"bts:Magneticresonanceangiography\"\n                },\n                {\n                    \"@id\": \"bts:Magneticresonancespectroscopy\"\n                },\n                {\n                    \"@id\": \"bts:Magnetization-PreparedRapidGradientEchoMRI\"\n                },\n                {\n                    \"@id\": \"bts:Massspectrometry\"\n                },\n                {\n                    \"@id\": \"bts:Massivelyparallelreporterassay\"\n                },\n                {\n                    \"@id\": \"bts:Metabolicscreening\"\n                },\n                {\n                    \"@id\": \"bts:Methylationarray\"\n                },\n                {\n                    \"@id\": \"bts:MiRNAarray\"\n                },\n                {\n                    \"@id\": \"bts:MiRNA-seq\"\n                },\n                {\n                    \"@id\": \"bts:Microrheology\"\n                },\n                {\n                    \"@id\": \"bts:Skindex-16\"\n                },\n                {\n                    \"@id\": \"bts:Multi-electrodearray\"\n                },\n                {\n                    \"@id\": \"bts:Nanoparticletrackinganalysis\"\n                },\n                {\n                    \"@id\": \"bts:NanoStringnCounterAnalysisSystem\"\n                },\n                {\n                    \"@id\": \"bts:N-backtask\"\n                },\n                {\n                    \"@id\": \"bts:Neuropsychologicalassessment\"\n                },\n                {\n                    \"@id\": \"bts:Nextgenerationtargetedsequencing\"\n                },\n                {\n                    \"@id\": \"bts:Noveltyresponsebehaviorassay\"\n                },\n                {\n                    \"@id\": \"bts:Openfieldtest\"\n                },\n                {\n                    \"@id\": \"bts:Opticaltomography\"\n                },\n                {\n                    \"@id\": 
\"bts:Opticalcoherencetomography\"\n                },\n                {\n                    \"@id\": \"bts:Optokineticreflexassay\"\n                },\n                {\n                    \"@id\": \"bts:Oscillatoryrheology\"\n                },\n                {\n                    \"@id\": \"bts:OxBS-seq\"\n                },\n                {\n                    \"@id\": \"bts:Oxygenconsumptionassay\"\n                },\n                {\n                    \"@id\": \"bts:Patternelectroretinogram\"\n                },\n                {\n                    \"@id\": \"bts:Perineurialcellthickness\"\n                },\n                {\n                    \"@id\": \"bts:PharmocokineticADMEassay\"\n                },\n                {\n                    \"@id\": \"bts:Phase-contrastmicroscopy\"\n                },\n                {\n                    \"@id\": \"bts:Photograph\"\n                },\n                {\n                    \"@id\": \"bts:Polymerasechainreaction\"\n                },\n                {\n                    \"@id\": \"bts:Polysomnography\"\n                },\n                {\n                    \"@id\": \"bts:Positronemissiontomography\"\n                },\n                {\n                    \"@id\": \"bts:PROMISCognitiveFunction\"\n                },\n                {\n                    \"@id\": \"bts:Proximityextensionassay\"\n                },\n                {\n                    \"@id\": \"bts:Puretoneaverage\"\n                },\n                {\n                    \"@id\": \"bts:QuantitativePCR\"\n                },\n                {\n                    \"@id\": \"bts:Questionnaire\"\n                },\n                {\n                    \"@id\": \"bts:Reactiveoxygenspeciesassay\"\n                },\n                {\n                    \"@id\": \"bts:Reportergeneassay\"\n                },\n                {\n                    \"@id\": \"bts:Rheometry\"\n                },\n  
              {\n                    \"@id\": \"bts:Ribo-seq\"\n                },\n                {\n                    \"@id\": \"bts:Rotarodperformancetest\"\n                },\n                {\n                    \"@id\": \"bts:SandwichELISA\"\n                },\n                {\n                    \"@id\": \"bts:ScCGI-seq\"\n                },\n                {\n                    \"@id\": \"bts:Scale\"\n                },\n                {\n                    \"@id\": \"bts:SaferSeqS\"\n                },\n                {\n                    \"@id\": \"bts:Singlemoleculedrugscreenassay\"\n                },\n                {\n                    \"@id\": \"bts:Single-cellRNA-seq\"\n                },\n                {\n                    \"@id\": \"bts:SinglecellATAC-seq\"\n                },\n                {\n                    \"@id\": \"bts:Single-nucleusRNA-seq\"\n                },\n                {\n                    \"@id\": \"bts:Smallmoleculelibraryscreen\"\n                },\n                {\n                    \"@id\": \"bts:Sorbitoldehydrogenaseactivitylevelassay\"\n                },\n                {\n                    \"@id\": \"bts:Spatialfrequencydomainimaging\"\n                },\n                {\n                    \"@id\": \"bts:Spatialtranscriptomics\"\n                },\n                {\n                    \"@id\": \"bts:Statichistomorphometry\"\n                },\n                {\n                    \"@id\": \"bts:Staticlightscattering\"\n                },\n                {\n                    \"@id\": \"bts:Survival\"\n                },\n                {\n                    \"@id\": \"bts:Targetedexomesequencing\"\n                },\n                {\n                    \"@id\": \"bts:Tractionforcemicroscopy\"\n                },\n                {\n                    \"@id\": \"bts:Twinspotassay\"\n                },\n                {\n                    \"@id\": 
\"bts:Ultrahigh-performanceliquidchromatography/tandemmassspectrometry\"\n                },\n                {\n                    \"@id\": \"bts:Westernblot\"\n                },\n                {\n                    \"@id\": \"bts:Wholeexomesequencing\"\n                },\n                {\n                    \"@id\": \"bts:Wholegenomesequencing\"\n                },\n                {\n                    \"@id\": \"bts:Whole-cellpatchclamp\"\n                },\n                {\n                    \"@id\": \"bts:Wordrecognitionscore\"\n                },\n                {\n                    \"@id\": \"bts:STRprofile\"\n                }\n            ],\n            \"sms:displayName\": \"protocolAssay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:2DAlamarBlueabsorbance\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"2DAlamarBlueabsorbance\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"2D AlamarBlue absorbance\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:2DAlamarBluefluorescence\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"2DAlamarBluefluorescence\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n   
             \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"2D AlamarBlue fluorescence\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:3Dconfocalimaging\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"3Dconfocalimaging\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"3D confocal imaging\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:3Delectronmicroscopy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"3Delectronmicroscopy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"3D electron microscopy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:3Dimaging\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"3Dimaging\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n   
         ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"3D imaging\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:3Dmicrotissueviability\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"3Dmicrotissueviability\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"3D microtissue viability\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Actigraphy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Actigraphy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"actigraphy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AlgometRxNociometer\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AlgometRxNociometer\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    
\"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"AlgometRx Nociometer\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Auditorybrainstemresponse\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Auditorybrainstemresponse\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"auditory brainstem response\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ATAC-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ATAC-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ATAC-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ATPaseactivityassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ATPaseactivityassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n       
         },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ATPase activity assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BrdUproliferationassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BrdUproliferationassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BrdU proliferation assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CAPP-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CAPP-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CAPP-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CUT&RUN\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CUT&RUN\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": 
\"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CUT&RUN\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ChIP-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ChIP-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ChIP-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ChildBehaviorChecklistforAges1.5-5\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ChildBehaviorChecklistforAges1.5-5\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Child Behavior Checklist for Ages 1.5-5\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ChildBehaviorChecklistforAges6-18\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"ChildBehaviorChecklistforAges6-18\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Child Behavior Checklist for Ages 6-18\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CODEX\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CODEX\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CODEX\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Confocalmicroscopy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Confocalmicroscopy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"confocal microscopy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Corsiblocks\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Corsiblocks\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Corsi blocks\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Currentclampassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Currentclampassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"current clamp assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DiffusionMRI\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DiffusionMRI\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"diffusion MRI\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Distortionproductotoacousticemissions\",\n         
   \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Distortionproductotoacousticemissions\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"distortion product otoacoustic emissions\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:DNAopticalmapping\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"DNAopticalmapping\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"DNA optical mapping\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ELISA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ELISA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ELISA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n   
         \"@id\": \"bts:ERRbisulfitesequencing\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ERRbisulfitesequencing\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ERR bisulfite sequencing\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:EdUproliferationassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"EdUproliferationassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"EdU proliferation assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FIA-MSMS\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FIA-MSMS\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FIA-MSMS\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FLIPRhigh-throughputcellularscreening\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FLIPRhigh-throughputcellularscreening\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FLIPR high-throughput cellular screening\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FluorescenceInSituHybridization\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FluorescenceInSituHybridization\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Fluorescence In Situ Hybridization\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Focusgroup\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Focusgroup\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Focus group\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FTIRspectroscopy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FTIRspectroscopy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FTIR spectroscopy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HI-C\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HI-C\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HI-C\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HPLC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HPLC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HPLC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Interview\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Interview\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Interview\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ISO-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ISO-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ISO-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MIB/MS\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MIB/MS\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n           
 },\n            \"sms:displayName\": \"MIB/MS\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Matrigel-basedtumorigenesisassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Matrigel-basedtumorigenesisassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Matrigel-based tumorigenesis assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MudPIT\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MudPIT\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MudPIT\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NIHToolbox\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NIHToolbox\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NIH Toolbox\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NOMe-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NOMe-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NOMe-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RNAarray\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RNAarray\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"RNA array\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RNA-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RNA-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n 
           },\n            \"sms:displayName\": \"RNA-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RPPA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RPPA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"RPPA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RiccardiandAblonscales\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RiccardiandAblonscales\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Riccardi and Ablon scales\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SNParray\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SNParray\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n    
        },\n            \"sms:displayName\": \"SNP array\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SUSHI\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SUSHI\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SUSHI\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sangersequencing\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sangersequencing\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Sanger sequencing\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SocialResponsivenessScale\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SocialResponsivenessScale\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Social Responsiveness Scale\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SocialResponsivenessScale,SecondEdition\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SocialResponsivenessScale,SecondEdition\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Social Responsiveness Scale, Second Edition\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tcellreceptorrepertoiresequencing\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tcellreceptorrepertoiresequencing\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"T cell receptor repertoire sequencing\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TIDE\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TIDE\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n       
         {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"TIDE\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TMTquantitation\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TMTquantitation\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"TMT quantitation\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TriKineticsactivitymonitoring\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TriKineticsactivitymonitoring\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"TriKinetics activity monitoring\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VonFreytest\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"VonFreytest\",\n            \"rdfs:subClassOf\": [\n                {\n                    
\"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Von Frey test\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Activeavoidancelearningbehaviorassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Activeavoidancelearningbehaviorassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"active avoidance learning behavior assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Array\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Array\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"array\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Atomicforcemicroscopy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"Atomicforcemicroscopy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"atomic force microscopy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Autoradiography\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Autoradiography\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"autoradiography\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bisulfitesequencing\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bisulfitesequencing\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bisulfite sequencing\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bloodchemistrymeasurement\",\n            \"@type\": \"rdfs:Class\",\n   
         \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bloodchemistrymeasurement\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"blood chemistry measurement\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BluenativePAGE\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BluenativePAGE\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"blue native PAGE\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Bodysizetraitmeasurement\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bodysizetraitmeasurement\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"body size trait measurement\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n           
 \"@id\": \"bts:Bonehistomorphometry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Bonehistomorphometry\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bone histomorphometry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Brightfieldmicroscopy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Brightfieldmicroscopy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"brightfield microscopy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CAMP-GloMaxAssay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CAMP-GloMaxAssay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cAMP-Glo Max Assay\",\n            \"sms:required\": \"sms:false\",\n       
     \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Calciumretentioncapacityassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Calciumretentioncapacityassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"calcium retention capacity assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cellcompetition\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cellcompetition\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cell competition\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cellcount\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cellcount\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cell 
count\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cellpainting\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cellpainting\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cell painting\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cellproliferation\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cellproliferation\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cell proliferation\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cellviabilityassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cellviabilityassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"cell viability assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Clinicaldata\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Clinicaldata\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"clinical data\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CNF-Skindex\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CNF-Skindex\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"cNF-Skindex\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cognitiveassessment\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cognitiveassessment\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n       
     },\n            \"sms:displayName\": \"cognitive assessment\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ComplexIIenzymeactivityassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ComplexIIenzymeactivityassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"complex II enzyme activity assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Compoundscreen\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Compoundscreen\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"compound screen\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Contextualconditioningbehaviorassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Contextualconditioningbehaviorassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n  
          ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"contextual conditioning behavior assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ConventionalMRI\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ConventionalMRI\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"conventional MRI\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Children'sDermatologyLifeQualityIndexQuestionnaire\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Children'sDermatologyLifeQualityIndexQuestionnaire\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Children's Dermatology Life Quality Index Questionnaire\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Differentialscanningcalorimetry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Differentialscanningcalorimetry\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"differential scanning calorimetry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Dynamiclightscattering\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Dynamiclightscattering\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"dynamic light scattering\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Electrochemiluminescence\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Electrochemiluminescence\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"electrochemiluminescence\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Electrophoreticlightscattering\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Electrophoreticlightscattering\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"electrophoretic light scattering\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Elevatedplusmazetest\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Elevatedplusmazetest\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"elevated plus maze test\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FACE-QAppearance-relatedDistress\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FACE-QAppearance-relatedDistress\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FACE-Q Appearance-related Distress\",\n            \"sms:required\": \"sms:false\",\n   
         \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Flowcytometry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Flowcytometry\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"flow cytometry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Focusformingassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Focusformingassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"focus forming assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FunctionalMRI\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"FunctionalMRI\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"functional MRI\",\n            
\"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Gelfiltrationchromatography\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Gelfiltrationchromatography\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"gel filtration chromatography\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Gelpermeationchromatography\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Gelpermeationchromatography\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"gel permeation chromatography\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Genotyping\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Genotyping\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"genotyping\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Highcontentscreen\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Highcontentscreen\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"high content screen\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Highfrequencyultrasound\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Highfrequencyultrasound\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"high frequency ultrasound\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:High-performanceliquidchromatography/tandemmassspectrometry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"High-performanceliquidchromatography/tandemmassspectrometry\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n              
  {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"high-performance liquid chromatography/tandem mass spectrometry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Immunocytochemistry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Immunocytochemistry\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"immunocytochemistry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Immunofluorescence\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Immunofluorescence\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"immunofluorescence\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Immunohistochemistry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Immunohistochemistry\",\n            \"rdfs:subClassOf\": 
[\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"immunohistochemistry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Insilicosynthesis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Insilicosynthesis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"in silico synthesis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Invitrotumorigenesis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Invitrotumorigenesis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"in vitro tumorigenesis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:InvivoPDXviability\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"InvivoPDXviability\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"in vivo PDX viability\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Invivobioluminescence\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Invivobioluminescence\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"in vivo bioluminescence\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Invivotumorgrowth\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Invivotumorgrowth\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"in vivo tumor growth\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Jumpinglibrary\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Jumpinglibrary\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"jumping library\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Labelfreemassspectrometry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Labelfreemassspectrometry\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"label free mass spectrometry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Laserspeckleimaging\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Laserspeckleimaging\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"laser speckle imaging\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n       
 {\n            \"@id\": \"bts:Lightscatteringassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Lightscatteringassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"light scattering assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Liquidchromatography-electrochemicaldetection\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Liquidchromatography-electrochemicaldetection\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"liquid chromatography-electrochemical detection\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Liquidchromatography/massspectrometry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Liquidchromatography/massspectrometry\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"liquid chromatography/mass spectrometry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Liquidchromatography/tandemmassspectrometry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Liquidchromatography/tandemmassspectrometry\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"liquid chromatography/tandem mass spectrometry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LncRNA-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LncRNA-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"lncRNA-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Localfieldpotentialrecording\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Localfieldpotentialrecording\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n         
       {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"local field potential recording\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Longtermpotentiationassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Longtermpotentiationassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"long term potentiation assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MRNAcounts\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MRNAcounts\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mRNA counts\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Magneticresonanceangiography\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Magneticresonanceangiography\",\n            \"rdfs:subClassOf\": [\n           
     {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"magnetic resonance angiography\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Magneticresonancespectroscopy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Magneticresonancespectroscopy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"magnetic resonance spectroscopy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Magnetization-PreparedRapidGradientEchoMRI\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Magnetization-PreparedRapidGradientEchoMRI\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Magnetization-Prepared Rapid Gradient Echo MRI\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Massspectrometry\",\n    
        \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Massspectrometry\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"mass spectrometry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Massivelyparallelreporterassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Massivelyparallelreporterassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"massively parallel reporter assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Metabolicscreening\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Metabolicscreening\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"metabolic screening\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Methylationarray\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Methylationarray\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"methylation array\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MiRNAarray\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MiRNAarray\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"miRNA array\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MiRNA-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MiRNA-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"miRNA-seq\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Microrheology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Microrheology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"microrheology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Skindex-16\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Skindex-16\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Skindex-16\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Multi-electrodearray\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Multi-electrodearray\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"multi-electrode array\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nanoparticletrackinganalysis\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nanoparticletrackinganalysis\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"nanoparticle tracking analysis\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NanoStringnCounterAnalysisSystem\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NanoStringnCounterAnalysisSystem\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NanoString nCounter Analysis System\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:N-backtask\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"N-backtask\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"n-back task\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Neuropsychologicalassessment\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Neuropsychologicalassessment\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"neuropsychological assessment\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Nextgenerationtargetedsequencing\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Nextgenerationtargetedsequencing\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"next generation targeted sequencing\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Noveltyresponsebehaviorassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Noveltyresponsebehaviorassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n               
 {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"novelty response behavior assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Openfieldtest\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Openfieldtest\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"open field test\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Opticaltomography\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Opticaltomography\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"optical tomography\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Opticalcoherencetomography\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Opticalcoherencetomography\",\n            \"rdfs:subClassOf\": [\n                {\n                  
  \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"optical coherence tomography\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Optokineticreflexassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Optokineticreflexassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"optokinetic reflex assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Oscillatoryrheology\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Oscillatoryrheology\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"oscillatory rheology\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OxBS-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OxBS-seq\",\n          
  \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"oxBS-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Oxygenconsumptionassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Oxygenconsumptionassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"oxygen consumption assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Patternelectroretinogram\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Patternelectroretinogram\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"pattern electroretinogram\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Perineurialcellthickness\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Perineurialcellthickness\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"perineurial cell thickness\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PharmocokineticADMEassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PharmocokineticADMEassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"pharmocokinetic ADME assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Phase-contrastmicroscopy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Phase-contrastmicroscopy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"phase-contrast microscopy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n     
   {\n            \"@id\": \"bts:Photograph\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Photograph\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"photograph\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Polymerasechainreaction\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Polymerasechainreaction\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"polymerase chain reaction\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Polysomnography\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Polysomnography\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"polysomnography\",\n            \"sms:required\": \"sms:false\",\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Positronemissiontomography\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Positronemissiontomography\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"positron emission tomography\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PROMISCognitiveFunction\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PROMISCognitiveFunction\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PROMIS Cognitive Function\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Proximityextensionassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Proximityextensionassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n        
    \"sms:displayName\": \"proximity extension assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Puretoneaverage\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Puretoneaverage\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"pure tone average\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:QuantitativePCR\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"QuantitativePCR\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"quantitative PCR\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Questionnaire\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Questionnaire\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"questionnaire\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Reactiveoxygenspeciesassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Reactiveoxygenspeciesassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"reactive oxygen species assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Reportergeneassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Reportergeneassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"reporter gene assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Rheometry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Rheometry\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n   
         \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"rheometry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Ribo-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Ribo-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ribo-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Rotarodperformancetest\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Rotarodperformancetest\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"rotarod performance test\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SandwichELISA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SandwichELISA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                
}\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sandwich ELISA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ScCGI-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ScCGI-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"scCGI-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Scale\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Scale\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Scale\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SaferSeqS\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SaferSeqS\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n       
             \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SaferSeqS\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Singlemoleculedrugscreenassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Singlemoleculedrugscreenassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"single molecule drug screen assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Single-cellRNA-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Single-cellRNA-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"single-cell RNA-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SinglecellATAC-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SinglecellATAC-seq\",\n            \"rdfs:subClassOf\": [\n                {\n              
      \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"single cell ATAC-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Single-nucleusRNA-seq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Single-nucleusRNA-seq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"single-nucleus RNA-seq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Sorbitoldehydrogenaseactivitylevelassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Sorbitoldehydrogenaseactivitylevelassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"sorbitol dehydrogenase activity level assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Spatialfrequencydomainimaging\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Spatialfrequencydomainimaging\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"spatial frequency domain imaging\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Spatialtranscriptomics\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Spatialtranscriptomics\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"spatial transcriptomics\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Statichistomorphometry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Statichistomorphometry\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"static histomorphometry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n       
 {\n            \"@id\": \"bts:Staticlightscattering\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Staticlightscattering\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"static light scattering\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Survival\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Survival\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"survival\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Targetedexomesequencing\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Targetedexomesequencing\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"targeted exome sequencing\",\n            \"sms:required\": \"sms:false\",\n       
     \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Tractionforcemicroscopy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Tractionforcemicroscopy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"traction force microscopy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Twinspotassay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Twinspotassay\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"twin spot assay\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Ultrahigh-performanceliquidchromatography/tandemmassspectrometry\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Ultrahigh-performanceliquidchromatography/tandemmassspectrometry\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ultra high-performance liquid chromatography/tandem mass spectrometry\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Westernblot\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Westernblot\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"western blot\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Wholeexomesequencing\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Wholeexomesequencing\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"whole exome sequencing\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Wholegenomesequencing\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Wholegenomesequencing\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": 
\"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"whole genome sequencing\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Whole-cellpatchclamp\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Whole-cellpatchclamp\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"whole-cell patch clamp\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Wordrecognitionscore\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Wordrecognitionscore\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"word recognition score\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:STRprofile\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"STRprofile\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n        
        },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"STR profile\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BisulfiteConversionKitID\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Name of kit used in bisulfite conversion.\",\n            \"rdfs:label\": \"BisulfiteConversionKitID\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"bisulfiteConversionKitID\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ChipPosition\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"User-specified identifier for the specific position on the chip that the sample was loaded into to perform the methylation microarray.\",\n            \"rdfs:label\": \"ChipPosition\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"chipPosition\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Assay\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"The technology used to generate the data in this file.\",\n            \"rdfs:label\": \"Assay\",\n            \"rdfs:subClassOf\": [\n                
{\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:2DAlamarBlueabsorbance\"\n                },\n                {\n                    \"@id\": \"bts:2DAlamarBluefluorescence\"\n                },\n                {\n                    \"@id\": \"bts:3Dconfocalimaging\"\n                },\n                {\n                    \"@id\": \"bts:3Delectronmicroscopy\"\n                },\n                {\n                    \"@id\": \"bts:3Dimaging\"\n                },\n                {\n                    \"@id\": \"bts:3Dmicrotissueviability\"\n                },\n                {\n                    \"@id\": \"bts:Actigraphy\"\n                },\n                {\n                    \"@id\": \"bts:AlgometRxNociometer\"\n                },\n                {\n                    \"@id\": \"bts:Auditorybrainstemresponse\"\n                },\n                {\n                    \"@id\": \"bts:ATAC-seq\"\n                },\n                {\n                    \"@id\": \"bts:ATPaseactivityassay\"\n                },\n                {\n                    \"@id\": \"bts:BrdUproliferationassay\"\n                },\n                {\n                    \"@id\": \"bts:CAPP-seq\"\n                },\n                {\n                    \"@id\": \"bts:CUT&RUN\"\n                },\n                {\n                    \"@id\": \"bts:ChIP-seq\"\n                },\n                {\n                    \"@id\": \"bts:ChildBehaviorChecklistforAges1.5-5\"\n                },\n                {\n                    \"@id\": \"bts:ChildBehaviorChecklistforAges6-18\"\n                },\n                {\n                    \"@id\": \"bts:CODEX\"\n                },\n                {\n                    \"@id\": 
\"bts:Confocalmicroscopy\"\n                },\n                {\n                    \"@id\": \"bts:Corsiblocks\"\n                },\n                {\n                    \"@id\": \"bts:Currentclampassay\"\n                },\n                {\n                    \"@id\": \"bts:DiffusionMRI\"\n                },\n                {\n                    \"@id\": \"bts:Distortionproductotoacousticemissions\"\n                },\n                {\n                    \"@id\": \"bts:DNAopticalmapping\"\n                },\n                {\n                    \"@id\": \"bts:ELISA\"\n                },\n                {\n                    \"@id\": \"bts:ERRbisulfitesequencing\"\n                },\n                {\n                    \"@id\": \"bts:EdUproliferationassay\"\n                },\n                {\n                    \"@id\": \"bts:FIA-MSMS\"\n                },\n                {\n                    \"@id\": \"bts:FLIPRhigh-throughputcellularscreening\"\n                },\n                {\n                    \"@id\": \"bts:FluorescenceInSituHybridization\"\n                },\n                {\n                    \"@id\": \"bts:Focusgroup\"\n                },\n                {\n                    \"@id\": \"bts:FTIRspectroscopy\"\n                },\n                {\n                    \"@id\": \"bts:HI-C\"\n                },\n                {\n                    \"@id\": \"bts:HPLC\"\n                },\n                {\n                    \"@id\": \"bts:Interview\"\n                },\n                {\n                    \"@id\": \"bts:ISO-seq\"\n                },\n                {\n                    \"@id\": \"bts:MIB/MS\"\n                },\n                {\n                    \"@id\": \"bts:Matrigel-basedtumorigenesisassay\"\n                },\n                {\n                    \"@id\": \"bts:MudPIT\"\n                },\n                {\n                    \"@id\": \"bts:NIHToolbox\"\n            
    },\n                {\n                    \"@id\": \"bts:NOMe-seq\"\n                },\n                {\n                    \"@id\": \"bts:RNAarray\"\n                },\n                {\n                    \"@id\": \"bts:RNA-seq\"\n                },\n                {\n                    \"@id\": \"bts:RPPA\"\n                },\n                {\n                    \"@id\": \"bts:RiccardiandAblonscales\"\n                },\n                {\n                    \"@id\": \"bts:SNParray\"\n                },\n                {\n                    \"@id\": \"bts:SUSHI\"\n                },\n                {\n                    \"@id\": \"bts:Sangersequencing\"\n                },\n                {\n                    \"@id\": \"bts:SocialResponsivenessScale\"\n                },\n                {\n                    \"@id\": \"bts:SocialResponsivenessScale,SecondEdition\"\n                },\n                {\n                    \"@id\": \"bts:Tcellreceptorrepertoiresequencing\"\n                },\n                {\n                    \"@id\": \"bts:TIDE\"\n                },\n                {\n                    \"@id\": \"bts:TMTquantitation\"\n                },\n                {\n                    \"@id\": \"bts:TriKineticsactivitymonitoring\"\n                },\n                {\n                    \"@id\": \"bts:VonFreytest\"\n                },\n                {\n                    \"@id\": \"bts:Activeavoidancelearningbehaviorassay\"\n                },\n                {\n                    \"@id\": \"bts:Array\"\n                },\n                {\n                    \"@id\": \"bts:Atomicforcemicroscopy\"\n                },\n                {\n                    \"@id\": \"bts:Autoradiography\"\n                },\n                {\n                    \"@id\": \"bts:Bisulfitesequencing\"\n                },\n                {\n                    \"@id\": \"bts:Bloodchemistrymeasurement\"\n                
},\n                {\n                    \"@id\": \"bts:BluenativePAGE\"\n                },\n                {\n                    \"@id\": \"bts:Bodysizetraitmeasurement\"\n                },\n                {\n                    \"@id\": \"bts:Bonehistomorphometry\"\n                },\n                {\n                    \"@id\": \"bts:Brightfieldmicroscopy\"\n                },\n                {\n                    \"@id\": \"bts:CAMP-GloMaxAssay\"\n                },\n                {\n                    \"@id\": \"bts:Calciumretentioncapacityassay\"\n                },\n                {\n                    \"@id\": \"bts:Cellcompetition\"\n                },\n                {\n                    \"@id\": \"bts:Cellcount\"\n                },\n                {\n                    \"@id\": \"bts:Cellpainting\"\n                },\n                {\n                    \"@id\": \"bts:Cellproliferation\"\n                },\n                {\n                    \"@id\": \"bts:Cellviabilityassay\"\n                },\n                {\n                    \"@id\": \"bts:Clinicaldata\"\n                },\n                {\n                    \"@id\": \"bts:CNF-Skindex\"\n                },\n                {\n                    \"@id\": \"bts:Cognitiveassessment\"\n                },\n                {\n                    \"@id\": \"bts:Combinationlibraryscreen\"\n                },\n                {\n                    \"@id\": \"bts:Combinationscreen\"\n                },\n                {\n                    \"@id\": \"bts:ComplexIIenzymeactivityassay\"\n                },\n                {\n                    \"@id\": \"bts:Compoundscreen\"\n                },\n                {\n                    \"@id\": \"bts:Contextualconditioningbehaviorassay\"\n                },\n                {\n                    \"@id\": \"bts:ConventionalMRI\"\n                },\n                {\n                    \"@id\": 
\"bts:Children'sDermatologyLifeQualityIndexQuestionnaire\"\n                },\n                {\n                    \"@id\": \"bts:Differentialscanningcalorimetry\"\n                },\n                {\n                    \"@id\": \"bts:Dynamiclightscattering\"\n                },\n                {\n                    \"@id\": \"bts:Electrochemiluminescence\"\n                },\n                {\n                    \"@id\": \"bts:Electrophoreticlightscattering\"\n                },\n                {\n                    \"@id\": \"bts:Elevatedplusmazetest\"\n                },\n                {\n                    \"@id\": \"bts:FACE-QAppearance-relatedDistress\"\n                },\n                {\n                    \"@id\": \"bts:Flowcytometry\"\n                },\n                {\n                    \"@id\": \"bts:Focusformingassay\"\n                },\n                {\n                    \"@id\": \"bts:FunctionalMRI\"\n                },\n                {\n                    \"@id\": \"bts:Gaitmeasurement\"\n                },\n                {\n                    \"@id\": \"bts:Gelfiltrationchromatography\"\n                },\n                {\n                    \"@id\": \"bts:Gelpermeationchromatography\"\n                },\n                {\n                    \"@id\": \"bts:Genotyping\"\n                },\n                {\n                    \"@id\": \"bts:Highcontentscreen\"\n                },\n                {\n                    \"@id\": \"bts:Highfrequencyultrasound\"\n                },\n                {\n                    \"@id\": \"bts:High-performanceliquidchromatography/tandemmassspectrometry\"\n                },\n                {\n                    \"@id\": \"bts:Immunoassay\"\n                },\n                {\n                    \"@id\": \"bts:Immunocytochemistry\"\n                },\n                {\n                    \"@id\": \"bts:Immunofluorescence\"\n                },\n          
      {\n                    \"@id\": \"bts:Immunohistochemistry\"\n                },\n                {\n                    \"@id\": \"bts:Insilicosynthesis\"\n                },\n                {\n                    \"@id\": \"bts:Invitrotumorigenesis\"\n                },\n                {\n                    \"@id\": \"bts:InvivoPDXviability\"\n                },\n                {\n                    \"@id\": \"bts:Invivobioluminescence\"\n                },\n                {\n                    \"@id\": \"bts:Invivotumorgrowth\"\n                },\n                {\n                    \"@id\": \"bts:Jumpinglibrary\"\n                },\n                {\n                    \"@id\": \"bts:Labelfreemassspectrometry\"\n                },\n                {\n                    \"@id\": \"bts:Laserspeckleimaging\"\n                },\n                {\n                    \"@id\": \"bts:Lightscatteringassay\"\n                },\n                {\n                    \"@id\": \"bts:Liquidchromatography-electrochemicaldetection\"\n                },\n                {\n                    \"@id\": \"bts:Liquidchromatography/massspectrometry\"\n                },\n                {\n                    \"@id\": \"bts:Liquidchromatography/tandemmassspectrometry\"\n                },\n                {\n                    \"@id\": \"bts:LncRNA-seq\"\n                },\n                {\n                    \"@id\": \"bts:Localfieldpotentialrecording\"\n                },\n                {\n                    \"@id\": \"bts:Longtermpotentiationassay\"\n                },\n                {\n                    \"@id\": \"bts:MRNAcounts\"\n                },\n                {\n                    \"@id\": \"bts:Magneticresonanceangiography\"\n                },\n                {\n                    \"@id\": \"bts:Magneticresonancespectroscopy\"\n                },\n                {\n                    \"@id\": 
\"bts:Magnetization-PreparedRapidGradientEchoMRI\"\n                },\n                {\n                    \"@id\": \"bts:Massspectrometry\"\n                },\n                {\n                    \"@id\": \"bts:Massivelyparallelreporterassay\"\n                },\n                {\n                    \"@id\": \"bts:Metabolicscreening\"\n                },\n                {\n                    \"@id\": \"bts:Methylationarray\"\n                },\n                {\n                    \"@id\": \"bts:MiRNAarray\"\n                },\n                {\n                    \"@id\": \"bts:MiRNA-seq\"\n                },\n                {\n                    \"@id\": \"bts:Microrheology\"\n                },\n                {\n                    \"@id\": \"bts:Skindex-16\"\n                },\n                {\n                    \"@id\": \"bts:Multi-electrodearray\"\n                },\n                {\n                    \"@id\": \"bts:Nanoparticletrackinganalysis\"\n                },\n                {\n                    \"@id\": \"bts:NanoStringnCounterAnalysisSystem\"\n                },\n                {\n                    \"@id\": \"bts:N-backtask\"\n                },\n                {\n                    \"@id\": \"bts:Neuropsychologicalassessment\"\n                },\n                {\n                    \"@id\": \"bts:Nextgenerationtargetedsequencing\"\n                },\n                {\n                    \"@id\": \"bts:Noveltyresponsebehaviorassay\"\n                },\n                {\n                    \"@id\": \"bts:Openfieldtest\"\n                },\n                {\n                    \"@id\": \"bts:Opticaltomography\"\n                },\n                {\n                    \"@id\": \"bts:Opticalcoherencetomography\"\n                },\n                {\n                    \"@id\": \"bts:Optokineticreflexassay\"\n                },\n                {\n                    \"@id\": 
\"bts:Oscillatoryrheology\"\n                },\n                {\n                    \"@id\": \"bts:OxBS-seq\"\n                },\n                {\n                    \"@id\": \"bts:Oxygenconsumptionassay\"\n                },\n                {\n                    \"@id\": \"bts:Patternelectroretinogram\"\n                },\n                {\n                    \"@id\": \"bts:Perineurialcellthickness\"\n                },\n                {\n                    \"@id\": \"bts:PharmocokineticADMEassay\"\n                },\n                {\n                    \"@id\": \"bts:Phase-contrastmicroscopy\"\n                },\n                {\n                    \"@id\": \"bts:Photograph\"\n                },\n                {\n                    \"@id\": \"bts:Polymerasechainreaction\"\n                },\n                {\n                    \"@id\": \"bts:Polysomnography\"\n                },\n                {\n                    \"@id\": \"bts:Positronemissiontomography\"\n                },\n                {\n                    \"@id\": \"bts:PROMISCognitiveFunction\"\n                },\n                {\n                    \"@id\": \"bts:Proximityextensionassay\"\n                },\n                {\n                    \"@id\": \"bts:Puretoneaverage\"\n                },\n                {\n                    \"@id\": \"bts:QuantitativePCR\"\n                },\n                {\n                    \"@id\": \"bts:Questionnaire\"\n                },\n                {\n                    \"@id\": \"bts:Reactiveoxygenspeciesassay\"\n                },\n                {\n                    \"@id\": \"bts:Reportergeneassay\"\n                },\n                {\n                    \"@id\": \"bts:Rheometry\"\n                },\n                {\n                    \"@id\": \"bts:Ribo-seq\"\n                },\n                {\n                    \"@id\": \"bts:Rotarodperformancetest\"\n                },\n                {\n 
                   \"@id\": \"bts:SandwichELISA\"\n                },\n                {\n                    \"@id\": \"bts:ScCGI-seq\"\n                },\n                {\n                    \"@id\": \"bts:Scale\"\n                },\n                {\n                    \"@id\": \"bts:SaferSeqS\"\n                },\n                {\n                    \"@id\": \"bts:Singlemoleculedrugscreenassay\"\n                },\n                {\n                    \"@id\": \"bts:Single-cellRNA-seq\"\n                },\n                {\n                    \"@id\": \"bts:SinglecellATAC-seq\"\n                },\n                {\n                    \"@id\": \"bts:Single-nucleusRNA-seq\"\n                },\n                {\n                    \"@id\": \"bts:Smallmoleculelibraryscreen\"\n                },\n                {\n                    \"@id\": \"bts:Sorbitoldehydrogenaseactivitylevelassay\"\n                },\n                {\n                    \"@id\": \"bts:Spatialfrequencydomainimaging\"\n                },\n                {\n                    \"@id\": \"bts:Spatialtranscriptomics\"\n                },\n                {\n                    \"@id\": \"bts:Statichistomorphometry\"\n                },\n                {\n                    \"@id\": \"bts:Staticlightscattering\"\n                },\n                {\n                    \"@id\": \"bts:Survival\"\n                },\n                {\n                    \"@id\": \"bts:Targetedexomesequencing\"\n                },\n                {\n                    \"@id\": \"bts:Tractionforcemicroscopy\"\n                },\n                {\n                    \"@id\": \"bts:Twinspotassay\"\n                },\n                {\n                    \"@id\": \"bts:Ultrahigh-performanceliquidchromatography/tandemmassspectrometry\"\n                },\n                {\n                    \"@id\": \"bts:Westernblot\"\n                },\n                {\n                  
  \"@id\": \"bts:Wholeexomesequencing\"\n                },\n                {\n                    \"@id\": \"bts:Wholegenomesequencing\"\n                },\n                {\n                    \"@id\": \"bts:Whole-cellpatchclamp\"\n                },\n                {\n                    \"@id\": \"bts:Wordrecognitionscore\"\n                },\n                {\n                    \"@id\": \"bts:STRprofile\"\n                }\n            ],\n            \"sms:displayName\": \"assay\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StudyFileviewId\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Links a Synapse project to its fileview.\",\n            \"rdfs:label\": \"StudyFileviewId\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"studyFileviewId\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AverageReadLength\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Average read length collected from samtools\",\n            \"rdfs:label\": \"AverageReadLength\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"averageReadLength\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GliomaOrEpendymoma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": 
\"Characterization of the manifestation of Glioma or ependymoma.\",\n            \"rdfs:label\": \"GliomaOrEpendymoma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:Notimaged\"\n                },\n                {\n                    \"@id\": \"bts:Absentbyimaging\"\n                },\n                {\n                    \"@id\": \"bts:Present\"\n                },\n                {\n                    \"@id\": \"bts:Unknown\"\n                }\n            ],\n            \"sms:displayName\": \"gliomaOrEpendymoma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RelatedStudies\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Studies similar to the current study. 
Subproperty of `relatedResource`.\",\n            \"rdfs:label\": \"RelatedStudies\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"relatedStudies\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Platform\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A sequencing platform, microscope, spectroscope/plate reader, or other platform for collecting data.\",\n            \"rdfs:label\": \"Platform\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"schema:rangeIncludes\": [\n                {\n                    \"@id\": \"bts:7TBrukerBiospec\"\n                },\n                {\n                    \"@id\": \"bts:10xVisiumSpatialGeneExpression\"\n                },\n                {\n                    \"@id\": \"bts:2DCellTiter-Glo\"\n                },\n                {\n                    \"@id\": \"bts:2DIncucyte\"\n                },\n                {\n                    \"@id\": \"bts:AffymetrixGenome-WideHumanSNP5.0Array\"\n                },\n                {\n                    \"@id\": \"bts:AffymetrixGenome-WideHumanSNP6.0Array\"\n                },\n                {\n                    \"@id\": \"bts:AffymetrixHumanGene1.0STArray\"\n                },\n                {\n                    \"@id\": \"bts:AffymetrixHumanGenomeU133Plus2.0Array\"\n                },\n                {\n                    \"@id\": \"bts:AffymetrixU133AB\"\n                },\n                {\n               
     \"@id\": \"bts:Agilent44Karray\"\n                },\n                {\n                    \"@id\": \"bts:BDFACSCalibur\"\n                },\n                {\n                    \"@id\": \"bts:BDFACSymphony\"\n                },\n                {\n                    \"@id\": \"bts:BGISEQ-500\"\n                },\n                {\n                    \"@id\": \"bts:BionanoIrys\"\n                },\n                {\n                    \"@id\": \"bts:Caliper\"\n                },\n                {\n                    \"@id\": \"bts:CherryImagingFACEPlatform\"\n                },\n                {\n                    \"@id\": \"bts:CherryImagingTRACEPlatform\"\n                },\n                {\n                    \"@id\": \"bts:ChromiumX\"\n                },\n                {\n                    \"@id\": \"bts:EnVision2103MultiplateReader\"\n                },\n                {\n                    \"@id\": \"bts:GEDiscoveryMR7503T\"\n                },\n                {\n                    \"@id\": \"bts:GEOptimaMR450W1.5T\"\n                },\n                {\n                    \"@id\": \"bts:GESignaHDxt1.5T\"\n                },\n                {\n                    \"@id\": \"bts:GESignaGenesis1.5T\"\n                },\n                {\n                    \"@id\": \"bts:GESignaHDxt3T\"\n                },\n                {\n                    \"@id\": \"bts:GESignaPremier3T\"\n                },\n                {\n                    \"@id\": \"bts:GESignaExcite1.5T\"\n                },\n                {\n                    \"@id\": \"bts:HitachiEchelon1.5T\"\n                },\n                {\n                    \"@id\": \"bts:HitachiOasis1.2T\"\n                },\n                {\n                    \"@id\": \"bts:IVISSpectrumInVivoImagingSystem\"\n                },\n                {\n                    \"@id\": \"bts:Illumina1M\"\n                },\n                {\n                    \"@id\": 
\"bts:IlluminaGenomeAnalyzerIIx\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaHiSeq2000\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaHiSeq2500\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaHiSeq3000\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaHiSeq4000\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaHiSeqX\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaHuman660W-Quadv1.0BeadChip\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaHumanHap300\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaHumanMethylation450\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaHumanOmni1-Quadv1.0\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaHumanOmniExpress-24v1.0BeadChip\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaHumanOmniExpress-24v1.2BeadChip\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaInfiniumMethylationEPICBeadChipv1.0(850k)\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaInfiniumMethylationEPICBeadChipv2.0(935k)\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaMiSeq\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaMouseWG-6v2.0expressionbeadchip\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaNextSeq1000\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaNextSeq2000\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaNextSeq500\"\n                },\n                {\n                    \"@id\": 
\"bts:IlluminaNextSeq550\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaNovaSeq6000\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaNovaSeqX\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaNovaSeqXPlus\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaOmni2pt5M\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaOmni5M\"\n                },\n                {\n                    \"@id\": \"bts:IlluminaWholeGenomeDASL\"\n                },\n                {\n                    \"@id\": \"bts:Illuminah650\"\n                },\n                {\n                    \"@id\": \"bts:InfiniumHumanOmniExpressExome\"\n                },\n                {\n                    \"@id\": \"bts:LI-COROdysseyCLx\"\n                },\n                {\n                    \"@id\": \"bts:LTQOrbitrapXL\"\n                },\n                {\n                    \"@id\": \"bts:LeicaAperioAT2\"\n                },\n                {\n                    \"@id\": \"bts:LeicaMZ16\"\n                },\n                {\n                    \"@id\": \"bts:LeicaS9Stereomicroscope\"\n                },\n                {\n                    \"@id\": \"bts:LifeVizInfinitySystem\"\n                },\n                {\n                    \"@id\": \"bts:LifeVizMicroSystem\"\n                },\n                {\n                    \"@id\": \"bts:MGIT-series\"\n                },\n                {\n                    \"@id\": \"bts:MalvernZetasizer\"\n                },\n                {\n                    \"@id\": \"bts:NanoFCM\"\n                },\n                {\n                    \"@id\": \"bts:NanoStringHumannCounterPanCancerIO360Panel\"\n                },\n                {\n                    \"@id\": \"bts:NanostringCounter\"\n                },\n                {\n                
    \"@id\": \"bts:NanostringGeoMx\"\n                },\n                {\n                    \"@id\": \"bts:NotApplicable\"\n                },\n                {\n                    \"@id\": \"bts:OlympusDP80\"\n                },\n                {\n                    \"@id\": \"bts:OlympusIX73\"\n                },\n                {\n                    \"@id\": \"bts:OrbitrapFusionLumosTribrid\"\n                },\n                {\n                    \"@id\": \"bts:OtherPlatform\"\n                },\n                {\n                    \"@id\": \"bts:OxfordNanopore\"\n                },\n                {\n                    \"@id\": \"bts:PacBioRSII\"\n                },\n                {\n                    \"@id\": \"bts:PacBioSequelIISystem\"\n                },\n                {\n                    \"@id\": \"bts:PacBioSequelIIeSystem\"\n                },\n                {\n                    \"@id\": \"bts:Pannoramic250Flash\"\n                },\n                {\n                    \"@id\": \"bts:Perlegen300Karray\"\n                },\n                {\n                    \"@id\": \"bts:PhilipsAchieva1.5T\"\n                },\n                {\n                    \"@id\": \"bts:PhilipsAchieva3T\"\n                },\n                {\n                    \"@id\": \"bts:PhilipsInteraAchieva3T\"\n                },\n                {\n                    \"@id\": \"bts:PhilipsIngenia1.5T\"\n                },\n                {\n                    \"@id\": \"bts:PhilipsIngenia3T\"\n                },\n                {\n                    \"@id\": \"bts:PhilipsPanorama1.0T\"\n                },\n                {\n                    \"@id\": \"bts:PromegaGloMaxDiscover\"\n                },\n                {\n                    \"@id\": \"bts:QExativeHF\"\n                },\n                {\n                    \"@id\": \"bts:Scale\"\n                },\n                {\n                    \"@id\": 
\"bts:SiemensAvanto1.5T\"\n                },\n                {\n                    \"@id\": \"bts:SiemensAvantoFit1.5T\"\n                },\n                {\n                    \"@id\": \"bts:SiemensMagnetomAera1.5T\"\n                },\n                {\n                    \"@id\": \"bts:SiemensMagnetomEspree1.5T\"\n                },\n                {\n                    \"@id\": \"bts:SiemensMagnetomPrisma3T\"\n                },\n                {\n                    \"@id\": \"bts:SiemensMagnetomSkyra3T\"\n                },\n                {\n                    \"@id\": \"bts:SiemensMagnetomTrio3T\"\n                },\n                {\n                    \"@id\": \"bts:SiemensMagnetomVerio3T\"\n                },\n                {\n                    \"@id\": \"bts:SiemensMagnetomPrismaFit3T\"\n                },\n                {\n                    \"@id\": \"bts:SpectramaxMSeries\"\n                },\n                {\n                    \"@id\": \"bts:TOOsonixSystemONE-M\"\n                },\n                {\n                    \"@id\": \"bts:ToshibaVantageTitan1.5T\"\n                },\n                {\n                    \"@id\": \"bts:VarioskanLUX\"\n                },\n                {\n                    \"@id\": \"bts:VectraH13DImagingSystem\"\n                },\n                {\n                    \"@id\": \"bts:VentanaBenchmarkXT\"\n                },\n                {\n                    \"@id\": \"bts:Vevo3100ImagingSystem\"\n                },\n                {\n                    \"@id\": \"bts:XF24ExtracellularFluxAnalyzer\"\n                },\n                {\n                    \"@id\": \"bts:ZeissLSM\"\n                },\n                {\n                    \"@id\": \"bts:ZeissLSM700\"\n                },\n                {\n                    \"@id\": \"bts:ZeissLSM980\"\n                },\n                {\n                    \"@id\": \"bts:ZenoElectronicWalkway\"\n                
},\n                {\n                    \"@id\": \"bts:ZetaView\"\n                }\n            ],\n            \"sms:displayName\": \"platform\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:7TBrukerBiospec\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"7TBrukerBiospec\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"7T Bruker Biospec\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:10xVisiumSpatialGeneExpression\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"10xVisiumSpatialGeneExpression\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"10x Visium Spatial Gene Expression\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:2DCellTiter-Glo\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"2DCellTiter-Glo\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"2D CellTiter-Glo\",\n            \"sms:required\": 
\"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:2DIncucyte\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"2DIncucyte\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"2D Incucyte\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AffymetrixGenome-WideHumanSNP5.0Array\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AffymetrixGenome-WideHumanSNP5.0Array\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Affymetrix Genome-Wide Human SNP 5.0 Array\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AffymetrixGenome-WideHumanSNP6.0Array\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AffymetrixGenome-WideHumanSNP6.0Array\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Affymetrix Genome-Wide Human SNP 6.0 Array\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:AffymetrixHumanGene1.0STArray\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AffymetrixHumanGene1.0STArray\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Affymetrix Human Gene 1.0 ST Array\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AffymetrixHumanGenomeU133Plus2.0Array\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AffymetrixHumanGenomeU133Plus2.0Array\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Affymetrix Human Genome U133 Plus 2.0 Array\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AffymetrixU133AB\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"AffymetrixU133AB\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Affymetrix U133AB\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Agilent44Karray\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n       
     \"rdfs:label\": \"Agilent44Karray\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Agilent 44Karray\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BDFACSCalibur\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BDFACSCalibur\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BD FACS Calibur\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BDFACSymphony\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BDFACSymphony\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BD FACSymphony\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BGISEQ-500\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BGISEQ-500\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BGISEQ-500\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BionanoIrys\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"BionanoIrys\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Bionano Irys\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Caliper\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Caliper\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Caliper\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CherryImagingFACEPlatform\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CherryImagingFACEPlatform\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cherry Imaging FACE Platform\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:CherryImagingTRACEPlatform\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CherryImagingTRACEPlatform\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cherry Imaging TRACE Platform\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ChromiumX\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ChromiumX\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Chromium X\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:EnVision2103MultiplateReader\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"EnVision2103MultiplateReader\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"EnVision 2103 Multiplate Reader\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GEDiscoveryMR7503T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GEDiscoveryMR7503T\",\n            
\"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GE Discovery MR750 3T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GEOptimaMR450W1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GEOptimaMR450W1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GE Optima MR450W 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GESignaHDxt1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GESignaHDxt1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GE Signa HDxt 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GESignaGenesis1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GESignaGenesis1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GE Signa Genesis 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GESignaHDxt3T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GESignaHDxt3T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GE Signa HDxt 3T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GESignaPremier3T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GESignaPremier3T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GE Signa Premier 3T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GESignaExcite1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GESignaExcite1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GE Signa Excite 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n     
       \"@id\": \"bts:HitachiEchelon1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HitachiEchelon1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Hitachi Echelon 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HitachiOasis1.2T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HitachiOasis1.2T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Hitachi Oasis 1.2T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IVISSpectrumInVivoImagingSystem\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IVISSpectrumInVivoImagingSystem\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"IVIS Spectrum In Vivo Imaging System\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Illumina1M\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Illumina1M\",\n     
       \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina 1M\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaGenomeAnalyzerIIx\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaGenomeAnalyzerIIx\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina Genome Analyzer IIx\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaHiSeq2000\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaHiSeq2000\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina HiSeq 2000\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaHiSeq2500\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaHiSeq2500\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina HiSeq 2500\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaHiSeq3000\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaHiSeq3000\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina HiSeq 3000\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaHiSeq4000\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaHiSeq4000\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina HiSeq 4000\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaHiSeqX\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaHiSeqX\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina HiSeq X\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n    
        \"@id\": \"bts:IlluminaHuman660W-Quadv1.0BeadChip\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaHuman660W-Quadv1.0BeadChip\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina Human660W-Quad v1.0 BeadChip\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaHumanHap300\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaHumanHap300\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina HumanHap300\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaHumanMethylation450\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaHumanMethylation450\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina HumanMethylation450\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaHumanOmni1-Quadv1.0\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": 
\"TBD\",\n            \"rdfs:label\": \"IlluminaHumanOmni1-Quadv1.0\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina HumanOmni1-Quadv1.0\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaHumanOmniExpress-24v1.0BeadChip\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaHumanOmniExpress-24v1.0BeadChip\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina HumanOmniExpress-24 v1.0 BeadChip\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaHumanOmniExpress-24v1.2BeadChip\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaHumanOmniExpress-24v1.2BeadChip\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina HumanOmniExpress-24 v1.2 BeadChip\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaInfiniumMethylationEPICBeadChipv1.0(850k)\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"IlluminaInfiniumMethylationEPICBeadChipv1.0(850k)\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina Infinium MethylationEPIC BeadChip v1.0 (850k)\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaInfiniumMethylationEPICBeadChipv2.0(935k)\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaInfiniumMethylationEPICBeadChipv2.0(935k)\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina Infinium MethylationEPIC BeadChip v2.0 (935k)\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaMiSeq\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaMiSeq\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina MiSeq\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaMouseWG-6v2.0expressionbeadchip\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"IlluminaMouseWG-6v2.0expressionbeadchip\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina MouseWG-6 v2.0 expression beadchip\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaNextSeq1000\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaNextSeq1000\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina NextSeq 1000\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaNextSeq2000\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaNextSeq2000\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina NextSeq 2000\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaNextSeq500\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaNextSeq500\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n       
     ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina NextSeq 500\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaNextSeq550\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaNextSeq550\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina NextSeq 550\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaNovaSeq6000\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaNovaSeq6000\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina NovaSeq 6000\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaNovaSeqX\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaNovaSeqX\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina NovaSeq X\",\n            
\"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaNovaSeqXPlus\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaNovaSeqXPlus\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina NovaSeq X Plus\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaOmni2pt5M\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaOmni2pt5M\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina Omni2pt5M\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaOmni5M\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaOmni5M\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina Omni5M\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IlluminaWholeGenomeDASL\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IlluminaWholeGenomeDASL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina WholeGenome DASL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Illuminah650\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Illuminah650\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Illumina h650\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:InfiniumHumanOmniExpressExome\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"InfiniumHumanOmniExpressExome\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Infinium HumanOmniExpressExome\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LI-COROdysseyCLx\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LI-COROdysseyCLx\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n 
               }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"LI-COR Odyssey CLx\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LTQOrbitrapXL\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LTQOrbitrapXL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"LTQ Orbitrap XL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LeicaAperioAT2\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LeicaAperioAT2\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Leica Aperio AT2\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LeicaMZ16\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LeicaMZ16\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Leica MZ16\",\n            \"sms:required\": \"sms:false\",\n           
 \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LeicaS9Stereomicroscope\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LeicaS9Stereomicroscope\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Leica S9 Stereomicroscope\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LifeVizInfinitySystem\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LifeVizInfinitySystem\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"LifeViz Infinity System\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LifeVizMicroSystem\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LifeVizMicroSystem\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"LifeViz Micro System\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MGIT-series\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n   
         \"rdfs:label\": \"MGIT-series\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MGI T-series\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MalvernZetasizer\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MalvernZetasizer\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Malvern Zetasizer\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NanoFCM\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NanoFCM\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NanoFCM\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NanoStringHumannCounterPanCancerIO360Panel\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NanoStringHumannCounterPanCancerIO360Panel\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            
\"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NanoString Human nCounter PanCancer IO360 Panel\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NanostringCounter\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NanostringCounter\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nanostring Counter\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NanostringGeoMx\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NanostringGeoMx\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Nanostring GeoMx\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OlympusDP80\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OlympusDP80\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Olympus DP80\",\n            \"sms:required\": \"sms:false\",\n         
   \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OlympusIX73\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OlympusIX73\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Olympus IX73\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OrbitrapFusionLumosTribrid\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OrbitrapFusionLumosTribrid\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Orbitrap Fusion Lumos Tribrid\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OtherPlatform\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"OtherPlatform\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Other Platform\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:OxfordNanopore\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": 
\"OxfordNanopore\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Oxford Nanopore\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PacBioRSII\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PacBioRSII\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PacBio RS II\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PacBioSequelIISystem\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PacBioSequelIISystem\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PacBio Sequel II System\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PacBioSequelIIeSystem\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PacBioSequelIIeSystem\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PacBio Sequel IIe System\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Pannoramic250Flash\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Pannoramic250Flash\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Pannoramic 250 Flash\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Perlegen300Karray\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Perlegen300Karray\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Perlegen 300Karray\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PhilipsAchieva1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PhilipsAchieva1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Philips Achieva 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        
},\n        {\n            \"@id\": \"bts:PhilipsAchieva3T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PhilipsAchieva3T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Philips Achieva 3T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PhilipsInteraAchieva3T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PhilipsInteraAchieva3T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Philips Intera Achieva 3T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PhilipsIngenia1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PhilipsIngenia1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Philips Ingenia 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PhilipsIngenia3T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PhilipsIngenia3T\",\n  
          \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Philips Ingenia 3T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PhilipsPanorama1.0T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PhilipsPanorama1.0T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Philips Panorama 1.0T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PromegaGloMaxDiscover\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PromegaGloMaxDiscover\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Promega GloMax Discover\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:QExativeHF\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"QExativeHF\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Q Exative HF\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SiemensAvanto1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SiemensAvanto1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Siemens Avanto 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SiemensAvantoFit1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SiemensAvantoFit1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Siemens Avanto Fit 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SiemensMagnetomAera1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SiemensMagnetomAera1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Siemens Magnetom Aera 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": 
[]\n        },\n        {\n            \"@id\": \"bts:SiemensMagnetomEspree1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SiemensMagnetomEspree1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Siemens Magnetom Espree 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SiemensMagnetomPrisma3T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SiemensMagnetomPrisma3T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Siemens Magnetom Prisma 3T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SiemensMagnetomSkyra3T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SiemensMagnetomSkyra3T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Siemens Magnetom Skyra 3T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SiemensMagnetomTrio3T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": 
\"TBD\",\n            \"rdfs:label\": \"SiemensMagnetomTrio3T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Siemens Magnetom Trio 3T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SiemensMagnetomVerio3T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SiemensMagnetomVerio3T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Siemens Magnetom Verio 3T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SiemensMagnetomPrismaFit3T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SiemensMagnetomPrismaFit3T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Siemens Magnetom Prisma Fit 3T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SpectramaxMSeries\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SpectramaxMSeries\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": 
\"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Spectramax M Series\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:TOOsonixSystemONE-M\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"TOOsonixSystemONE-M\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"TOOsonix System ONE-M\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ToshibaVantageTitan1.5T\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ToshibaVantageTitan1.5T\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Toshiba Vantage Titan 1.5T\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VarioskanLUX\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"VarioskanLUX\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"Varioskan LUX\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VectraH13DImagingSystem\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"VectraH13DImagingSystem\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Vectra H1 3D Imaging System\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VentanaBenchmarkXT\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"VentanaBenchmarkXT\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Ventana Benchmark XT\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Vevo3100ImagingSystem\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Vevo3100ImagingSystem\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Vevo 3100 Imaging System\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:XF24ExtracellularFluxAnalyzer\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"XF24ExtracellularFluxAnalyzer\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"XF24 Extracellular Flux Analyzer\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ZeissLSM\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ZeissLSM\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Zeiss LSM\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ZeissLSM700\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ZeissLSM700\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Zeiss LSM 700\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ZeissLSM980\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ZeissLSM980\",\n            \"rdfs:subClassOf\": [\n                {\n                  
  \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Zeiss LSM 980\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ZenoElectronicWalkway\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ZenoElectronicWalkway\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Zeno Electronic Walkway\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ZetaView\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ZetaView\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Platform\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ZetaView\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:StudyLeads\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Individuals with lead roles in a study.\",\n            \"rdfs:label\": \"StudyLeads\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"studyLeads\",\n            \"sms:required\": \"sms:true\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MassSpecAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for raw mass spec-based proteomics data.\",\n            \"rdfs:label\": \"MassSpecAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProteinAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MassSpecAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n     
               \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:ProteinExtractSource\"\n                },\n                {\n                    \"@id\": \"bts:DataCollectionMode\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProteinAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Abstract template for data from some assay of protein structure and function. 
Data should be instantiated with more specific template.\",\n            \"rdfs:label\": \"ProteinAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ProteinAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n              
  },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:ProteinExtractSource\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ClinicalAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"General template for typically tabular **individual-level** data. This can include repeated measures and a drug treatment context.\\n\",\n            \"rdfs:label\": \"ClinicalAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ClinicalAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n              
      \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalFactor\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalCondition\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                },\n                {\n                    \"@id\": \"bts:CompoundName\"\n                },\n                {\n                    \"@id\": \"bts:CompoundDose\"\n                },\n                {\n                    \"@id\": \"bts:CompoundDoseUnit\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BiologicalAssayDataTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A template defining basic metadata on deposited data artifacts (i.e. files) from experimental assays involving biosamples.  
This is an abstract template; \\\"real\\\" template subclasses define additional properties appropriate for the type of data file (e.g. imaging vs sequencing).\\n\",\n            \"rdfs:label\": \"BiologicalAssayDataTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Template\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BiologicalAssayDataTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n      
          },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ScRNASeqTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for describing raw data from single-cell RNA-seq.\",\n            \"rdfs:label\": \"ScRNASeqTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ScSequencingAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ScRNASeqTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": 
\"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:CellType\"\n                },\n                {\n                    \"@id\": \"bts:IsCellLine\"\n                },\n                {\n                    \"@id\": \"bts:CellID\"\n                },\n                {\n                    \"@id\": \"bts:DissociationMethod\"\n                },\n                {\n                    \"@id\": \"bts:RunType\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:LibraryKitID\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n     
           },\n                {\n                    \"@id\": \"bts:ReadPair\"\n                },\n                {\n                    \"@id\": \"bts:ReadLength\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TargetDepth\"\n                },\n                {\n                    \"@id\": \"bts:BatchID\"\n                },\n                {\n                    \"@id\": \"bts:AuxiliaryAsset\"\n                },\n                {\n                    \"@id\": \"bts:GenePerturbed\"\n                },\n                {\n                    \"@id\": \"bts:GenePerturbationType\"\n                },\n                {\n                    \"@id\": \"bts:GenePerturbationTechnology\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalCondition\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ScSequencingAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"General template for raw RNA/DNA data, i.e. 
sequence data from a sequencing assay.\",\n            \"rdfs:label\": \"ScSequencingAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:GeneticsAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ScSequencingAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n        
        {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:CellType\"\n                },\n                {\n                    \"@id\": \"bts:IsCellLine\"\n                },\n                {\n                    \"@id\": \"bts:CellID\"\n                },\n                {\n                    \"@id\": \"bts:DissociationMethod\"\n                },\n                {\n                    \"@id\": \"bts:RunType\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:LibraryKitID\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:ReadPair\"\n                },\n                {\n                    \"@id\": \"bts:ReadLength\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TargetDepth\"\n                },\n                {\n                    \"@id\": \"bts:BatchID\"\n                },\n                {\n                    
\"@id\": \"bts:AuxiliaryAsset\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GeneralMeasureDataTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"General template for data in tabular form that aggregates tissue-level or cellular-level data.\\n\",\n            \"rdfs:label\": \"GeneralMeasureDataTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GeneralMeasureDataTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                  
  \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalFactor\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalCondition\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalTimepoint\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                },\n                {\n                    \"@id\": \"bts:CompoundName\"\n                },\n                {\n                    \"@id\": \"bts:CompoundDose\"\n                },\n                {\n                    \"@id\": \"bts:CompoundDoseUnit\"\n                },\n                {\n                    \"@id\": \"bts:GenePerturbed\"\n                },\n                {\n                    \"@id\": \"bts:GenePerturbationType\"\n                },\n                {\n                    \"@id\": \"bts:GenePerturbationTechnology\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MaterialScienceAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"General template for describing data for a materials science assay.\",\n      
      \"rdfs:label\": \"MaterialScienceAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:NonBiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MaterialScienceAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:MaterialType\"\n                },\n                {\n                    \"@id\": \"bts:ConcentrationMaterial\"\n                },\n                {\n                    \"@id\": \"bts:ConcentrationMaterialUnit\"\n                },\n                {\n                    \"@id\": \"bts:ConcentrationNaCl\"\n                },\n                {\n                    \"@id\": \"bts:ConcentrationNaClUnit\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NonBiologicalAssayDataTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Describes data artifacts (i.e. 
files) from an experimental/physical-sciences assay where input specimens are more at the level of synthesized molecules or inorganic materials.\\n\",\n            \"rdfs:label\": \"NonBiologicalAssayDataTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Template\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NonBiologicalAssayDataTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RNASeqTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for describing raw data from (bulk) RNA-sequencing\",\n            \"rdfs:label\": \"RNASeqTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BulkSequencingAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"RNASeqTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": 
\"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": 
\"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                },\n                {\n                    \"@id\": \"bts:RunType\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:ReadPair\"\n                },\n                {\n                    \"@id\": \"bts:ReadLength\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TargetDepth\"\n                },\n                {\n                    \"@id\": \"bts:BatchID\"\n                },\n                {\n                    \"@id\": \"bts:GenePerturbed\"\n                },\n                {\n                    \"@id\": \"bts:GenePerturbationType\"\n                },\n                {\n                    \"@id\": \"bts:GenePerturbationTechnology\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalCondition\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BulkSequencingAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"General template for raw (level 1) RNA/DNA data, i.e. 
sequence data from a sequencing assay.\",\n            \"rdfs:label\": \"BulkSequencingAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:GeneticsAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BulkSequencingAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n    
            {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                },\n                {\n                    \"@id\": \"bts:RunType\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:ReadPair\"\n                },\n                {\n                    \"@id\": \"bts:ReadLength\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TargetDepth\"\n                },\n                {\n                    \"@id\": \"bts:BatchID\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Template\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A collection of fields representing some entity.\",\n            \"rdfs:label\": \"Template\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n    
            }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Template\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:BiospecimenTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Specimen-level data, whether from human or animal.\",\n            \"rdfs:label\": \"BiospecimenTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"BiospecimenTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:BodySite\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalCondition\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PartialTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A template for collecting a subset of contextual data; not intended to be a standalone template but rather a subdocument.\",\n            \"rdfs:label\": \"PartialTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                   
 \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PartialTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PortalStudy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"//nf.synapse.org/Explore/Studies.\\n\",\n            \"rdfs:label\": \"PortalStudy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PortalStudy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:StudyId\"\n                },\n                {\n                    \"@id\": \"bts:StudyName\"\n                },\n                {\n                    \"@id\": \"bts:StudyLeads\"\n                },\n                {\n                    \"@id\": \"bts:Summary\"\n                },\n                {\n                    \"@id\": \"bts:Institutions\"\n                },\n                {\n                    \"@id\": \"bts:FundingAgency\"\n                },\n                {\n                    \"@id\": \"bts:Initiative\"\n                },\n                {\n                    \"@id\": \"bts:StudyStatus\"\n                },\n                {\n                    \"@id\": \"bts:DataStatus\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:Manifestation\"\n                },\n                {\n                    \"@id\": \"bts:DiseaseFocus\"\n                },\n                {\n    
                \"@id\": \"bts:RelatedStudies\"\n                },\n                {\n                    \"@id\": \"bts:GrantDOI\"\n                },\n                {\n                    \"@id\": \"bts:GrantStartDate\"\n                },\n                {\n                    \"@id\": \"bts:GrantEndDate\"\n                },\n                {\n                    \"@id\": \"bts:EmbargoEndDate\"\n                },\n                {\n                    \"@id\": \"bts:AccessRequirements\"\n                },\n                {\n                    \"@id\": \"bts:AcknowledgementStatements\"\n                },\n                {\n                    \"@id\": \"bts:StudyFileviewId\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Institutions\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Institutions\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"institutions\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GrantStartDate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GrantStartDate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"grantStartDate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": 
\"bts:GrantEndDate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GrantEndDate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"grantEndDate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:EmbargoEndDate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"EmbargoEndDate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"embargoEndDate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ImagingAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"General template for describing imaging data.\",\n            \"rdfs:label\": \"ImagingAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ImagingAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    
\"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:AssayTarget\"\n                },\n                {\n                    \"@id\": \"bts:AuxiliaryAsset\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GenomicsAssayTemplateExtended\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Genomics assay template but with additional experiment data.\",\n            
\"rdfs:label\": \"GenomicsAssayTemplateExtended\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:GenomicsAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GenomicsAssayTemplateExtended\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": 
\"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                },\n                {\n                    \"@id\": \"bts:RunType\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:ReadPair\"\n                },\n                {\n                    \"@id\": \"bts:ReadLength\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TargetDepth\"\n                },\n                {\n                    \"@id\": \"bts:BatchID\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalCondition\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalTimepoint\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                },\n                {\n                    \"@id\": \"bts:GenePerturbed\"\n                },\n                {\n                    \"@id\": 
\"bts:GenePerturbationTechnology\"\n                },\n                {\n                    \"@id\": \"bts:GenePerturbationType\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GenomicsAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Alias to BulkSequencingAssayTemplate, use for sequence data on a large scale when there is no template available that is more specific.\",\n            \"rdfs:label\": \"GenomicsAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BulkSequencingAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GenomicsAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                
},\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                },\n                {\n                    \"@id\": \"bts:RunType\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:ReadPair\"\n                },\n                {\n                    \"@id\": \"bts:ReadLength\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TargetDepth\"\n                },\n                {\n   
                 \"@id\": \"bts:BatchID\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProteinArrayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for array- and immuno-based proteomics data.\",\n            \"rdfs:label\": \"ProteinArrayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ProteinAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ProteinArrayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n       
         {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:ProteinExtractSource\"\n                },\n                {\n                    \"@id\": \"bts:AntibodyID\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PdxGenomicsAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Raw genomics data from patient-derived xenograft (PDX) experiment, with additional PDX-relevant metadata.\",\n            \"rdfs:label\": \"PdxGenomicsAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:GenomicsAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PdxGenomicsAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": 
\"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n    
                \"@id\": \"bts:SpecimenType\"\n                },\n                {\n                    \"@id\": \"bts:RunType\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:ReadPair\"\n                },\n                {\n                    \"@id\": \"bts:ReadLength\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TargetDepth\"\n                },\n                {\n                    \"@id\": \"bts:BatchID\"\n                },\n                {\n                    \"@id\": \"bts:TransplantationType\"\n                },\n                {\n                    \"@id\": \"bts:TransplantationRecipientSpecies\"\n                },\n                {\n                    \"@id\": \"bts:TransplantationRecipientTissue\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalCondition\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalTimepoint\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LightScatteringAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for dynamic or static light scattering data adapted from ISA-TAB-Nano specs.\",\n            \"rdfs:label\": \"LightScatteringAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:MaterialScienceAssayTemplate\"\n                }\n            ],\n       
     \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"LightScatteringAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:MaterialType\"\n                },\n                {\n                    \"@id\": \"bts:ConcentrationMaterial\"\n                },\n                {\n                    \"@id\": \"bts:ConcentrationMaterialUnit\"\n                },\n                {\n                    \"@id\": \"bts:ConcentrationNaCl\"\n                },\n                {\n                    \"@id\": \"bts:ConcentrationNaClUnit\"\n                },\n                {\n                    \"@id\": \"bts:PH\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProcessedAlignedReadsTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for describing aligned reads (e.g. BAM/CRAM files) from a sequencing assay. 
The QC meta are extracted from samtools stats when available and are the same metrics preferred by GDC. \\n\",\n            \"rdfs:label\": \"ProcessedAlignedReadsTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ProcessedAlignedReadsTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                
{\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:GenomicReference\"\n                },\n                {\n                    \"@id\": \"bts:GenomicReferenceLink\"\n                },\n                {\n                    \"@id\": \"bts:AverageInsertSize\"\n                },\n                {\n                    \"@id\": \"bts:AverageReadLength\"\n                },\n                {\n                    \"@id\": \"bts:AverageBaseQuality\"\n                },\n                {\n                    \"@id\": \"bts:PairsOnDifferentChr\"\n                },\n                {\n                    \"@id\": \"bts:ReadsDuplicatedPercent\"\n                },\n                {\n                    \"@id\": \"bts:ReadsMappedPercent\"\n                },\n                {\n                    \"@id\": \"bts:MeanCoverage\"\n                },\n                {\n                    \"@id\": \"bts:ProportionCoverage10x\"\n                },\n                {\n                    \"@id\": \"bts:ProportionCoverage30x\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TotalReads\"\n                },\n                {\n                    \"@id\": \"bts:Workflow\"\n                },\n                {\n                    \"@id\": \"bts:WorkflowLink\"\n                },\n                {\n                    \"@id\": \"bts:AuxiliaryAsset\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HumanCohortTemplate\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"Data of biosamples from human patients. Adapted from Table 2 of http://nrs.harvard.edu/urn-3:HUL.InstRepos:32725809. This template should be used when biosamples are from NF patients to provide a more valuable characterization and make additional gene-phenotype insights possible.\\n\",\n            \"rdfs:label\": \"HumanCohortTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HumanCohortTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:DiagnosisAgeGroup\"\n                },\n                {\n                    \"@id\": \"bts:Inheritance\"\n                },\n                {\n                    \"@id\": \"bts:Mosaicism\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:GermlineMutationIndicator\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:VitalStatus\"\n                },\n                {\n                    \"@id\": 
\"bts:WHOPerformanceStatus\"\n                },\n                {\n                    \"@id\": \"bts:PainStatus\"\n                },\n                {\n                    \"@id\": \"bts:TumorTreatmentStatus\"\n                },\n                {\n                    \"@id\": \"bts:CafeaulaitMacules\"\n                },\n                {\n                    \"@id\": \"bts:SkinFoldFreckling\"\n                },\n                {\n                    \"@id\": \"bts:IrisLischNodules\"\n                },\n                {\n                    \"@id\": \"bts:DermalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:SubcutaneousNodularNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:DiffuseDermalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:SpinalNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:PlexiformNeurofibromas\"\n                },\n                {\n                    \"@id\": \"bts:OpticGlioma\"\n                },\n                {\n                    \"@id\": \"bts:HeartDefect\"\n                },\n                {\n                    \"@id\": \"bts:VascularDisease\"\n                },\n                {\n                    \"@id\": \"bts:PubertyOnset\"\n                },\n                {\n                    \"@id\": \"bts:Stature\"\n                },\n                {\n                    \"@id\": \"bts:PeripheralNeuropathy\"\n                },\n                {\n                    \"@id\": \"bts:AqueductalStenosis\"\n                },\n                {\n                    \"@id\": \"bts:LongBoneDysplasia\"\n                },\n                {\n                    \"@id\": \"bts:SphenoidDysplasia\"\n                },\n                {\n                    \"@id\": \"bts:Scoliosis\"\n                },\n                {\n                    \"@id\": 
\"bts:IntellectualDisability\"\n                },\n                {\n                    \"@id\": \"bts:LearningDisability\"\n                },\n                {\n                    \"@id\": \"bts:AttentionDeficitDisorder\"\n                },\n                {\n                    \"@id\": \"bts:Pheochromocytoma\"\n                },\n                {\n                    \"@id\": \"bts:GlomusTumor\"\n                },\n                {\n                    \"@id\": \"bts:MPNSTCharacterization\"\n                },\n                {\n                    \"@id\": \"bts:NonopticGlioma\"\n                },\n                {\n                    \"@id\": \"bts:GIST\"\n                },\n                {\n                    \"@id\": \"bts:Leukemia\"\n                },\n                {\n                    \"@id\": \"bts:BreastCancer\"\n                },\n                {\n                    \"@id\": \"bts:OtherTumors\"\n                },\n                {\n                    \"@id\": \"bts:VestibularSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:Meningioma\"\n                },\n                {\n                    \"@id\": \"bts:GliomaOrEpendymoma\"\n                },\n                {\n                    \"@id\": \"bts:SpinalSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:DermalSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:NonvestibularCranialSchwannoma\"\n                },\n                {\n                    \"@id\": \"bts:Lenticularopacity\"\n                },\n                {\n                    \"@id\": \"bts:NonvestibularSchwannomas\"\n                },\n                {\n                    \"@id\": \"bts:NumberOfSchwannomas\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Lenticularopacity\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Lenticularopacity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"lenticularopacity\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProtocolTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for describing a protocol document.\",\n            \"rdfs:label\": \"ProtocolTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Template\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ProtocolTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Title\"\n                },\n                {\n                    \"@id\": \"bts:Author\"\n                },\n                {\n                    \"@id\": \"bts:Citation\"\n                },\n                {\n                    \"@id\": \"bts:License\"\n                },\n                {\n                    \"@id\": \"bts:ProtocolAssay\"\n                },\n                {\n                    \"@id\": \"bts:ProtocolPurpose\"\n                },\n                {\n                    \"@id\": \"bts:SampleType\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:EpigenomiscAssayTemplate\",\n            \"@type\": 
\"rdfs:Class\",\n            \"rdfs:comment\": \"Template for describing raw data from epigenetics sequencing assays such as bisulfite sequencing.\",\n            \"rdfs:label\": \"EpigenomiscAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BulkSequencingAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"EpigenomiscAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n         
       },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                },\n                {\n                    \"@id\": \"bts:RunType\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:ReadPair\"\n                },\n                {\n                    \"@id\": \"bts:ReadLength\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TargetDepth\"\n                },\n                {\n                    \"@id\": \"bts:BatchID\"\n                },\n                {\n                    \"@id\": \"bts:BisulfiteConversionKitID\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProcessedExpressionTemplate\",\n            \"@type\": \"rdfs:Class\",\n  
          \"rdfs:comment\": \"Template for quantified gene/protein expression data that are still represented as one file per sample.\",\n            \"rdfs:label\": \"ProcessedExpressionTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ProcessedExpressionTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                
},\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ExpressionUnit\"\n                },\n                {\n                    \"@id\": \"bts:Workflow\"\n                },\n                {\n                    \"@id\": \"bts:WorkflowLink\"\n                },\n                {\n                    \"@id\": \"bts:AuxiliaryAsset\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GenomicsArrayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"A template for describing raw data from array-based genomics/epigenomics, e.g. CEL files.\",\n            \"rdfs:label\": \"GenomicsArrayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:GeneticsAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GenomicsArrayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                
},\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:Channel\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GeneticsAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for relatively raw data 
of RNA/DNA structure and expression.  This is an abstract template encapsulating data from low-throughput to high-throughput assays,  sequencing-based or non-sequencing based (e.g. microarrays, optical genome mapping).  In practice, data are more specifically typed and matched to one of the templates below.\\n\",\n            \"rdfs:label\": \"GeneticsAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GeneticsAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n      
          {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProteomicsAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Alias to MassSpecAssayTemplate for backwards-compatibility.\",\n            \"rdfs:label\": \"ProteomicsAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:MassSpecAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ProteomicsAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n               
 {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:ProteinExtractSource\"\n                },\n                {\n                    \"@id\": \"bts:DataCollectionMode\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n   
         \"@id\": \"bts:WGSTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for describing raw data from Whole Genome Sequencing (WGS)\",\n            \"rdfs:label\": \"WGSTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BulkSequencingAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"WGSTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n  
              },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                },\n                {\n                    \"@id\": \"bts:RunType\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:ReadPair\"\n                },\n                {\n                    \"@id\": \"bts:ReadLength\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TargetDepth\"\n                },\n                {\n                    \"@id\": \"bts:BatchID\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProcessedMergedDataTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Further processed data with multiple samples aggregated into one 
file. This may also be known as level-4 data. Unlike level-2 and level-3 data, individual-level attributes such as age and sex are no longer surfaced on the data file directly.\\n\",\n            \"rdfs:label\": \"ProcessedMergedDataTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ProcessedMergedDataTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Workflow\"\n                },\n                {\n                    \"@id\": \"bts:WorkflowLink\"\n                },\n                {\n                    \"@id\": \"bts:AuxiliaryAsset\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UpdateMilestoneReport\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Metadata template for updating milestone report values in NF studies -- currently a supported feature for NTAP and GFF.\",\n            \"rdfs:label\": \"UpdateMilestoneReport\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": 
\"bts:PartialTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"UpdateMilestoneReport\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:ProgressReportNumber\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MethylationArrayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for raw data files (idat) from DNA methylation arrays.\",\n            \"rdfs:label\": \"MethylationArrayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:GenomicsArrayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MethylationArrayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": 
\"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:Channel\"\n                },\n                {\n                    \"@id\": \"bts:ChipID\"\n                },\n                {\n                    \"@id\": \"bts:ChipPosition\"\n                },\n                {\n                    \"@id\": \"bts:PlateName\"\n                },\n                {\n                    \"@id\": \"bts:PlateWell\"\n                }\n            ],\n           
 \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PharmacokineticsAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Generic template for describing data from a pharmacokinetics assay.\",\n            \"rdfs:label\": \"PharmacokineticsAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PharmacokineticsAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": 
\"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:CompoundName\"\n                },\n                {\n                    \"@id\": \"bts:CompoundDose\"\n                },\n                {\n                    \"@id\": \"bts:CompoundDoseUnit\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalCondition\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalTimepoint\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ElectrophysiologyAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for raw electrophysiology data (electrical recordings).\",\n            \"rdfs:label\": \"ElectrophysiologyAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ElectrophysiologyAssayTemplate\",\n          
  \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:BodySite\"\n                },\n                {\n                    \"@id\": \"bts:CellType\"\n                },\n                
{\n                    \"@id\": \"bts:RecordingSource\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalTimepoint\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalCondition\"\n                },\n                {\n                    \"@id\": \"bts:AuxiliaryAsset\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MicroscopyAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for microscopy data.\",\n            \"rdfs:label\": \"MicroscopyAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ImagingAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MicroscopyAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                   
 \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:AssayTarget\"\n                },\n                {\n                    \"@id\": \"bts:AuxiliaryAsset\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Objective\"\n                },\n                {\n                    \"@id\": \"bts:NominalMagnification\"\n                },\n                {\n                    \"@id\": \"bts:LensAperture\"\n                },\n                {\n                    \"@id\": \"bts:WorkingDistance\"\n                },\n                {\n                    \"@id\": \"bts:WorkingDistanceUnit\"\n                },\n                {\n                    \"@id\": \"bts:Immersion\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n          
  \"@id\": \"bts:ImmunoMicroscopyTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for describing immunofluorescence or immunohistochemistry images.\",\n            \"rdfs:label\": \"ImmunoMicroscopyTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:MicroscopyAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ImmunoMicroscopyTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n               
     \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:AssayTarget\"\n                },\n                {\n                    \"@id\": \"bts:AuxiliaryAsset\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Objective\"\n                },\n                {\n                    \"@id\": \"bts:NominalMagnification\"\n                },\n                {\n                    \"@id\": \"bts:LensAperture\"\n                },\n                {\n                    \"@id\": \"bts:WorkingDistance\"\n                },\n                {\n                    \"@id\": \"bts:WorkingDistanceUnit\"\n                },\n                {\n                    \"@id\": \"bts:Immersion\"\n                },\n                {\n                    \"@id\": \"bts:AntibodyID\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:AnimalIndividualTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for non-human individual-level data.\",\n            \"rdfs:label\": \"AnimalIndividualTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                
\"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"AnimalIndividualTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:DiagnosisAgeGroup\"\n                },\n                {\n                    \"@id\": \"bts:Inheritance\"\n                },\n                {\n                    \"@id\": \"bts:Mosaicism\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:GermlineMutation\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PlateBasedReporterAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Generic template for describing data from a plate-based reporter assay.\",\n            \"rdfs:label\": \"PlateBasedReporterAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"PlateBasedReporterAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n      
              \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:AssayTarget\"\n                },\n                {\n                    \"@id\": \"bts:CompoundName\"\n                },\n                {\n                    \"@id\": \"bts:CompoundDose\"\n                },\n                {\n                    \"@id\": \"bts:CompoundDoseUnit\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalCondition\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalTimepoint\"\n                },\n                {\n                    \"@id\": \"bts:TimepointUnit\"\n                },\n                {\n                    \"@id\": \"bts:ReporterGene\"\n                },\n                {\n                    \"@id\": \"bts:ReporterSubstance\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MRIAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for describing MRI data.\",\n            \"rdfs:label\": \"MRIAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:ImagingAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MRIAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": 
\"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:AssayTarget\"\n                },\n                {\n                    \"@id\": \"bts:AuxiliaryAsset\"\n                },\n                {\n                    \"@id\": \"bts:BodySite\"\n                },\n                {\n                    \"@id\": \"bts:MRISequence\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalCondition\"\n                },\n                {\n                    \"@id\": \"bts:ExperimentalTimepoint\"\n                },\n                {\n     
               \"@id\": \"bts:TimepointUnit\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WESTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for describing raw data from Whole Exome Sequencing (WES/WXS)\",\n            \"rdfs:label\": \"WESTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BulkSequencingAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"WESTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n      
          {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                },\n                {\n                    \"@id\": \"bts:RunType\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:ReadPair\"\n                },\n                {\n                    \"@id\": \"bts:ReadLength\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TargetDepth\"\n                },\n                {\n                    \"@id\": \"bts:BatchID\"\n                },\n                {\n                    \"@id\": \"bts:TargetCaptureKitID\"\n                }\n            ],\n            
\"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SourceCodeTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for describing scripts or software code.\",\n            \"rdfs:label\": \"SourceCodeTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Template\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SourceCodeTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Title\"\n                },\n                {\n                    \"@id\": \"bts:Author\"\n                },\n                {\n                    \"@id\": \"bts:Citation\"\n                },\n                {\n                    \"@id\": \"bts:License\"\n                },\n                {\n                    \"@id\": \"bts:ProgrammingLanguage\"\n                },\n                {\n                    \"@id\": \"bts:RuntimePlatform\"\n                },\n                {\n                    \"@id\": \"bts:Documentation\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:EpigeneticsAssayTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Alias for EpigenomiscAssayTemplate for backwards-compatibility.\",\n            \"rdfs:label\": \"EpigeneticsAssayTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:EpigenomiscAssayTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"EpigeneticsAssayTemplate\",\n            \"sms:required\": \"sms:false\",\n 
           \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:ParentSpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                },\n                {\n                    \"@id\": \"bts:AliquotID\"\n                },\n                {\n                    
\"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:NucleicAcidSource\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenType\"\n                },\n                {\n                    \"@id\": \"bts:RunType\"\n                },\n                {\n                    \"@id\": \"bts:LibraryStrand\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPrep\"\n                },\n                {\n                    \"@id\": \"bts:LibraryPreparationMethod\"\n                },\n                {\n                    \"@id\": \"bts:ReadPair\"\n                },\n                {\n                    \"@id\": \"bts:ReadLength\"\n                },\n                {\n                    \"@id\": \"bts:ReadDepth\"\n                },\n                {\n                    \"@id\": \"bts:TargetDepth\"\n                },\n                {\n                    \"@id\": \"bts:BatchID\"\n                },\n                {\n                    \"@id\": \"bts:BisulfiteConversionKitID\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ProcessedVariantCallsTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for describing either simple germline/somatic variant calls output data (VCF/MAF) as well as structural variants (e.g. 
CNVs).\",\n            \"rdfs:label\": \"ProcessedVariantCallsTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ProcessedVariantCallsTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n                    \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n       
             \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:IsFilteredReads\"\n                },\n                {\n                    \"@id\": \"bts:Workflow\"\n                },\n                {\n                    \"@id\": \"bts:WorkflowLink\"\n                },\n                {\n                    \"@id\": \"bts:AuxiliaryAsset\"\n                },\n                {\n                    \"@id\": \"bts:SpecimenID\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PortalDataset\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"//nf.synapse.org/Explore/Datasets. \\n\",\n            \"rdfs:label\": \"PortalDataset\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"PortalDataset\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Title\"\n                },\n                {\n                    \"@id\": \"bts:Creator\"\n                },\n                {\n                    \"@id\": \"bts:Contributor\"\n                },\n                {\n                    \"@id\": \"bts:Description\"\n                },\n                {\n                    \"@id\": \"bts:AccessType\"\n                },\n                {\n                    \"@id\": \"bts:License\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": 
\"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:StudyId\"\n                },\n                {\n                    \"@id\": \"bts:Manifestation\"\n                },\n                {\n                    \"@id\": \"bts:DiseaseFocus\"\n                },\n                {\n                    \"@id\": \"bts:FundingAgency\"\n                },\n                {\n                    \"@id\": \"bts:Series\"\n                },\n                {\n                    \"@id\": \"bts:VisualizeDataOn\"\n                },\n                {\n                    \"@id\": \"bts:YearProcessed\"\n                },\n                {\n                    \"@id\": \"bts:YearPublished\"\n                },\n                {\n                    \"@id\": \"bts:IncludedInDataCatalog\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VisualizeDataOn\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"VisualizeDataOn\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"visualizeDataOn\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:IncludedInDataCatalog\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"IncludedInDataCatalog\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": 
\"includedInDataCatalog\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:FlowCytometryTemplate\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template for flow cytometry assay\",\n            \"rdfs:label\": \"FlowCytometryTemplate\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:BiologicalAssayDataTemplate\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"FlowCytometryTemplate\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:Component\"\n                },\n                {\n                    \"@id\": \"bts:Filename\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:DataType\"\n                },\n                {\n                    \"@id\": \"bts:DataSubtype\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:IndividualID\"\n                },\n                {\n                    \"@id\": \"bts:Species\"\n                },\n                {\n                    \"@id\": \"bts:Sex\"\n                },\n                {\n                    \"@id\": \"bts:Age\"\n                },\n                {\n                    \"@id\": \"bts:AgeUnit\"\n                },\n                {\n                    \"@id\": \"bts:Diagnosis\"\n                },\n                {\n                    \"@id\": \"bts:Nf1Genotype\"\n                },\n                {\n  
                  \"@id\": \"bts:Nf2Genotype\"\n                },\n                {\n                    \"@id\": \"bts:TumorType\"\n                },\n                {\n                    \"@id\": \"bts:ModelSystemName\"\n                },\n                {\n                    \"@id\": \"bts:Organ\"\n                },\n                {\n                    \"@id\": \"bts:Comments\"\n                },\n                {\n                    \"@id\": \"bts:Platform\"\n                },\n                {\n                    \"@id\": \"bts:CellType\"\n                },\n                {\n                    \"@id\": \"bts:AuxiliaryAsset\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:WorkflowReport\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"Template used for miscellaneous workflow reports and accessory files\",\n            \"rdfs:label\": \"WorkflowReport\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Template\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"WorkflowReport\",\n            \"sms:required\": \"sms:false\",\n            \"sms:requiresDependency\": [\n                {\n                    \"@id\": \"bts:ResourceType\"\n                },\n                {\n                    \"@id\": \"bts:Assay\"\n                },\n                {\n                    \"@id\": \"bts:FileFormat\"\n                },\n                {\n                    \"@id\": \"bts:RelatedDataset\"\n                },\n                {\n                    \"@id\": \"bts:Workflow\"\n                },\n                {\n                    \"@id\": \"bts:WorkflowLink\"\n                }\n            ],\n            \"sms:validationRules\": []\n        },\n        {\n   
         \"@id\": \"bts:CTF\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CTF\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CTF\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:SMN\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"SMN\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"SMN\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CCBY-ND\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CCBY-ND\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CC BY-ND\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:UNKNOWN\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"UNKNOWN\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n            
    \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"UNKNOWN\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Genesymbol\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Genesymbol\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"gene symbol\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CC-0\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CC-0\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CC-0\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GRCh38\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GRCh38\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GRCh38\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:LowGradeGlioma\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"LowGradeGlioma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Low Grade Glioma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MPNST\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MPNST\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MPNST\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Knockout\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Knockout\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"knockout\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Atrophy\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Atrophy\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n       
     },\n            \"sms:displayName\": \"atrophy\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Cognition\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Cognition\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Cognition\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GRCh37\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GRCh37\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GRCh37\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Drugtoxicity\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Drugtoxicity\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"drug toxicity\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GFF\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            
\"rdfs:label\": \"GFF\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GFF\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:MMUL1.0\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"MMUL1.0\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"MMUL1.0\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Braindevelopment\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Braindevelopment\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"brain development\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Independent\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Independent\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"Independent\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CCBY-NC-SA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CCBY-NC-SA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CC BY-NC-SA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:QualityofLife\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"QualityofLife\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Quality of Life\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:GRCh38VerilyV1\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"GRCh38VerilyV1\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"GRCh38_Verily_v1\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ODC-BY\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": 
\"TBD\",\n            \"rdfs:label\": \"ODC-BY\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ODC-BY\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:JMML\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"JMML\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"JMML\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Knockdown\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Knockdown\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"knockdown\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ODC-ODbL\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ODC-ODbL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            
\"sms:displayName\": \"ODC-ODbL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Druginteraction\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Druginteraction\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"drug interaction\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HighGradeGlioma\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HighGradeGlioma\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"High Grade Glioma\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CCBY-SA\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CCBY-SA\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CC BY-SA\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Chemicaldescriptor\",\n            \"@type\": \"rdfs:Class\",\n            
\"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Chemicaldescriptor\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"chemical descriptor\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:NTAP\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"NTAP\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"NTAP\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Drugresistance\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Drugresistance\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"drug resistance\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CCBY-NC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CCBY-NC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CC BY-NC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CC-BY\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CC-BY\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CC-BY\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Memory\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Memory\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Memory\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Pain\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Pain\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Pain\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Genefunction\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n     
       \"rdfs:label\": \"Genefunction\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"gene function\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:ODC-PDDL\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"ODC-PDDL\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"ODC-PDDL\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:RESTRICTED-USE\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"RESTRICTED-USE\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"RESTRICTED-USE\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Non-targetingcontrol\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Non-targetingcontrol\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"non-targeting control\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Overexpression\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Overexpression\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"overexpression\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:CCBY-NC-ND\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"CCBY-NC-ND\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"CC BY-NC-ND\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:PublicDomain\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"PublicDomain\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Public Domain\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:VisionLoss\",\n         
   \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"VisionLoss\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Vision Loss\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:Behavioral\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"Behavioral\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"Behavioral\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:R816X\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"R816X\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": \"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"R816X\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        },\n        {\n            \"@id\": \"bts:HRC\",\n            \"@type\": \"rdfs:Class\",\n            \"rdfs:comment\": \"TBD\",\n            \"rdfs:label\": \"HRC\",\n            \"rdfs:subClassOf\": [\n                {\n                    \"@id\": \"bts:Thing\"\n                }\n            ],\n            \"schema:isPartOf\": {\n                \"@id\": 
\"http://schema.biothings.io\"\n            },\n            \"sms:displayName\": \"HRC\",\n            \"sms:required\": \"sms:false\",\n            \"sms:validationRules\": []\n        }\n    ],\n    \"@id\": \"http://schema.biothings.io/#0.1\"\n}"
  },
  {
    "path": "tests/pytest/fixtures/champion_outputs.py",
    "content": "import pytest\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.lib.schemas import TextFile\n\n\n# NOTE: this relies on knowledge of the fixtures in fixtures/execution_data.py\n@pytest.fixture\ndef scan_convert_filter_champion_outputs(scan_convert_filter_sentinel_plan, foobar_schema):\n    logical_op_ids = scan_convert_filter_sentinel_plan.logical_op_ids\n    scan_logical_op_id = logical_op_ids[0]\n    convert_logical_op_id = logical_op_ids[1]\n    filter_logical_op_id = logical_op_ids[2]\n    champion_outputs = {logical_op_id: {} for logical_op_id in logical_op_ids}\n\n    # compute scan champion_outputs\n    for source_idx in range(10):\n        scan_dr = DataRecord(TextFile, [source_idx], parent_ids=None)\n        scan_dr.filename = f\"file{source_idx}\"\n        scan_dr.contents = None\n        champion_outputs[scan_logical_op_id][source_idx] = DataRecordSet([scan_dr], None)\n\n    # add convert champion outputs\n    for source_idx in range(10):\n        convert_dr = DataRecord(foobar_schema, [source_idx])\n        convert_dr.filename = f\"file{source_idx}\"\n        convert_dr.contents = None\n        convert_dr.foo = f\"foo{source_idx}\"\n        convert_dr.bar = f\"bar{source_idx}\"\n        champion_outputs[convert_logical_op_id][source_idx] = DataRecordSet([convert_dr], None)\n\n    # add filter champion outputs\n    for source_idx in range(10):\n        filter_dr = DataRecord(foobar_schema, [source_idx])\n        filter_dr.filename = f\"file{source_idx}\"\n        filter_dr.contents = None\n        filter_dr.foo = f\"foo{source_idx}\"\n        filter_dr.bar = f\"bar{source_idx}\"\n        filter_dr._passed_operator = bool(source_idx % 2)\n        champion_outputs[filter_logical_op_id][source_idx] = DataRecordSet([filter_dr], None)\n\n    return champion_outputs\n\n\n@pytest.fixture\ndef 
scan_convert_filter_empty_champion_outputs(scan_convert_filter_sentinel_plan, foobar_schema):\n    logical_op_ids = scan_convert_filter_sentinel_plan.logical_op_ids\n    scan_logical_op_id = logical_op_ids[0]\n    convert_logical_op_id = logical_op_ids[1]\n    filter_logical_op_id = logical_op_ids[2]\n    champion_outputs = {logical_op_id: {} for logical_op_id in logical_op_ids}\n\n    # compute scan champion_outputs\n    for source_idx in range(10):\n        scan_dr = DataRecord(TextFile, [source_idx], parent_ids=None)\n        scan_dr.filename = f\"file{source_idx}\"\n        scan_dr.contents = None\n        champion_outputs[scan_logical_op_id][source_idx] = DataRecordSet([scan_dr], None)\n\n    # add convert champion outputs\n    for source_idx in range(10):\n        convert_dr = DataRecord(foobar_schema, [source_idx])\n        convert_dr.filename = f\"file{source_idx}\"\n        convert_dr.contents = None\n        convert_dr.foo = f\"foo{source_idx}\"\n        convert_dr.bar = f\"bar{source_idx}\"\n        champion_outputs[convert_logical_op_id][source_idx] = DataRecordSet([convert_dr], None)\n\n    # add filter champion outputs\n    for source_idx in range(10):\n        filter_dr = DataRecord(foobar_schema, [source_idx])\n        filter_dr.filename = f\"file{source_idx}\"\n        filter_dr.contents = None\n        filter_dr.foo = f\"foo{source_idx}\"\n        filter_dr.bar = f\"bar{source_idx}\"\n        filter_dr._passed_operator = False\n        champion_outputs[filter_logical_op_id][source_idx] = DataRecordSet([filter_dr], None)\n\n    return champion_outputs\n\n\n@pytest.fixture\ndef scan_convert_filter_varied_champion_outputs(scan_convert_filter_sentinel_plan, foobar_schema):\n    logical_op_ids = scan_convert_filter_sentinel_plan.logical_op_ids\n    scan_logical_op_id = logical_op_ids[0]\n    convert_logical_op_id = logical_op_ids[1]\n    filter_logical_op_id = logical_op_ids[2]\n    champion_outputs = {logical_op_id: {} for logical_op_id in 
logical_op_ids}\n\n    # compute scan champion_outputs\n    for source_idx in range(10):\n        scan_dr = DataRecord(TextFile, [source_idx], parent_ids=None)\n        scan_dr.filename = f\"file{source_idx}\"\n        scan_dr.contents = None\n        champion_outputs[scan_logical_op_id][source_idx] = DataRecordSet([scan_dr], None)\n\n    # add convert champion outputs\n    for source_idx in range(10):\n        convert_dr = DataRecord(foobar_schema, [source_idx])\n        convert_dr.filename = f\"file{source_idx}\"\n        convert_dr.contents = None\n        convert_dr.foo = f\"foo{source_idx}\"\n        convert_dr.bar = f\"bar{source_idx}-{str(Model.GPT_4o)}\"\n        champion_outputs[convert_logical_op_id][source_idx] = DataRecordSet([convert_dr], None)\n\n    # add filter champion outputs\n    for source_idx in range(10):\n        filter_dr = DataRecord(foobar_schema, [source_idx])\n        filter_dr.filename = f\"file{source_idx}\"\n        filter_dr.contents = None\n        filter_dr.foo = f\"foo{source_idx}\"\n        filter_dr.bar = f\"bar{source_idx}-{str(Model.GPT_4o)}\"\n        filter_dr._passed_operator = bool(source_idx % 2)\n        champion_outputs[filter_logical_op_id][source_idx] = DataRecordSet([filter_dr], None)\n\n    return champion_outputs\n\n\n@pytest.fixture\ndef scan_multi_convert_multi_filter_champion_outputs(scan_multi_convert_multi_filter_sentinel_plan, foobar_schema, baz_schema):\n    \"\"\"\n    Champion outputs agree with GPT-4.\n    \"\"\"\n    logical_op_ids = scan_multi_convert_multi_filter_sentinel_plan.logical_op_ids\n    scan_logical_op_id = logical_op_ids[0]\n    convert1_logical_op_id = logical_op_ids[1]\n    filter1_logical_op_id = logical_op_ids[2]\n    filter2_logical_op_id = logical_op_ids[3]\n    convert2_logical_op_id = logical_op_ids[4]\n    champion_outputs = {logical_op_id: {} for logical_op_id in logical_op_ids}\n\n    # compute scan champion_outputs\n    for source_idx in range(10):\n        scan_dr = 
DataRecord(TextFile, [source_idx], parent_ids=None)\n        scan_dr.filename = f\"file{source_idx}\"\n        scan_dr.contents = None\n        champion_outputs[scan_logical_op_id][source_idx] = DataRecordSet([scan_dr], None)\n\n    # add first convert champion outputs\n    for source_idx in range(10):\n        drs = []\n        for one_to_many_idx in range(2):\n            convert_dr = DataRecord(foobar_schema, [source_idx])\n            convert_dr.filename = f\"file{source_idx}\"\n            convert_dr.contents = None\n            convert_dr.foo = f\"foo{source_idx}-one-to-many-{one_to_many_idx}\"\n            convert_dr.bar = f\"bar{source_idx}-{str(Model.GPT_4o)}\"\n            drs.append(convert_dr)\n\n        champion_outputs[convert1_logical_op_id][source_idx] = DataRecordSet(drs, None)\n\n    # add first filter champion outputs\n    for source_idx in range(10):\n        for one_to_many_idx in range(2):\n            filter_dr = DataRecord(foobar_schema, [source_idx])\n            filter_dr.filename = f\"file{source_idx}\"\n            filter_dr.contents = None\n            filter_dr.foo = f\"foo{source_idx}-one-to-many-{one_to_many_idx}\"\n            filter_dr.bar = f\"bar{source_idx}-{str(Model.GPT_4o)}\"\n            filter_dr._passed_operator = bool(source_idx < 7)\n            champion_outputs[filter1_logical_op_id][source_idx] = DataRecordSet([filter_dr], None)\n\n    # add second filter champion outputs\n    for source_idx in range(7):\n        for one_to_many_idx in range(2):\n            filter_dr = DataRecord(foobar_schema, [source_idx])\n            filter_dr.filename = f\"file{source_idx}\"\n            filter_dr.contents = None\n            filter_dr.foo = f\"foo{source_idx}-one-to-many-{one_to_many_idx}\"\n            filter_dr.bar = f\"bar{source_idx}-{str(Model.GPT_4o)}\"\n            filter_dr._passed_operator = bool(source_idx < 5)\n            champion_outputs[filter2_logical_op_id][source_idx] = DataRecordSet([filter_dr], None)\n\n    # 
add second convert champion outputs\n    for source_idx in range(5):\n        for one_to_many_idx in range(2):\n            convert_dr = DataRecord(baz_schema, [source_idx])\n            convert_dr.filename = f\"file{source_idx}\"\n            convert_dr.contents = None\n            convert_dr.foo = f\"foo{source_idx}-one-to-many-{one_to_many_idx}\"\n            convert_dr.bar = f\"bar{source_idx}-{str(Model.GPT_4o)}\"\n            convert_dr.baz = f\"baz{str(Model.GPT_4o)}\"\n            champion_outputs[convert2_logical_op_id][source_idx] = DataRecordSet([convert_dr], None)\n\n    return champion_outputs\n"
  },
  {
    "path": "tests/pytest/fixtures/datasets.py",
    "content": "import os\nfrom pathlib import Path\n\nimport pytest\n\nfrom palimpzest.core.data.iter_dataset import IterDataset, TextFileDataset\nfrom palimpzest.core.lib.schemas import ImageFilepath\n\n### Raw IterDatasets ###\nreal_estate_listing_cols = [\n    {\"name\": \"listing\", \"type\": str, \"desc\": \"The name of the listing\"},\n    {\"name\": \"text_content\", \"type\": str, \"desc\": \"The content of the listing's text description\"},\n    {\"name\": \"image_filepaths\", \"type\": list[ImageFilepath], \"desc\": \"A list of the filepaths for each image of the listing\"},\n]\n\nclass RealEstateListingDataset(IterDataset):\n    def __init__(self, listings_dir):\n        super().__init__(id=\"real-estate\", schema=real_estate_listing_cols)\n        self.listings_dir = listings_dir\n        self.listings = [\n            listing\n            for listing in sorted(os.listdir(self.listings_dir))\n            if os.path.isdir(os.path.join(self.listings_dir, listing))\n        ]\n        if len(self.listings) == 0:\n            raise ValueError(f\"No listings found in directory: {self.listings_dir}\")\n\n    def __len__(self):\n        return len(self.listings)\n    \n    def __getitem__(self, idx: int):\n        # get listing\n        listing = self.listings[idx]\n\n        # get fields\n        image_filepaths, text_content = [], None\n        listing_dir = os.path.join(self.listings_dir, listing)\n        for file in os.listdir(listing_dir):\n            if file.endswith(\".txt\"):\n                with open(os.path.join(listing_dir, file), \"rb\") as f:\n                    text_content = f.read().decode(\"utf-8\")\n            elif file.endswith(\".png\"):\n                image_filepaths.append(os.path.join(listing_dir, file))\n\n        # construct and return dictionary with fields\n        return {\"listing\": listing, \"text_content\": text_content, \"image_filepaths\": image_filepaths}\n\n\nclass CostModelTestDataset(IterDataset):\n    def 
__init__(self):\n        super().__init__(id=\"test\", schema=[{\"name\": \"value\", \"type\": int, \"desc\": \"A number\"}])\n        self.numbers = [1, 2, 3]\n\n    def __len__(self):\n        return len(self.numbers)\n\n    def __getitem__(self, idx: int):\n        # fetch number\n        number = self.numbers[idx]\n\n        # create and return item\n        return {\"value\": number}\n\n\n@pytest.fixture(scope=\"session\")\ndef project_root() -> Path:\n    return Path(__file__).resolve().parent.parent.parent.parent\n\n### DATA PATH FIXTURES ###\n@pytest.fixture\ndef enron_eval_tiny_data_path(project_root) -> str:\n    return str(project_root / \"testdata/enron-eval-tiny\")\n\n\n@pytest.fixture\ndef real_estate_eval_tiny_data_path(project_root) -> str:\n    return str(project_root / \"testdata/real-estate-eval-tiny\")\n\n\n### ROOT DATASET FIXTURES ###\n@pytest.fixture\ndef enron_eval_tiny(enron_eval_tiny_data_path):\n    return TextFileDataset(id=\"enron-eval-tiny\", path=enron_eval_tiny_data_path)\n\n\n@pytest.fixture\ndef real_estate_eval_tiny(real_estate_eval_tiny_data_path):\n    return RealEstateListingDataset(real_estate_eval_tiny_data_path)\n\n\n@pytest.fixture\ndef cost_model_test_dataset():\n    return CostModelTestDataset()\n"
  },
  {
    "path": "tests/pytest/fixtures/execution_data.py",
    "content": "import pytest\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.lib.schemas import TextFile\nfrom palimpzest.core.models import RecordOpStats\n\n\n# NOTE: technically the filter should process 10 outputs 3x times\n@pytest.fixture\ndef scan_convert_filter_execution_data(scan_convert_filter_sentinel_plan, foobar_schema):\n    # initialize execution data\n    op_sets = scan_convert_filter_sentinel_plan.operator_sets\n    logical_op_ids = scan_convert_filter_sentinel_plan.logical_op_ids\n    scan_logical_op_id = logical_op_ids[0]\n    convert_logical_op_id = logical_op_ids[1]\n    filter_logical_op_id = logical_op_ids[2]\n    execution_data = {logical_op_id: {} for logical_op_id in logical_op_ids}\n\n    # create data records first\n    scan_drs, convert_drs, filter_drs = [], [], []\n    for source_idx in range(10):\n        scan_dr = DataRecord(TextFile, [source_idx], parent_ids=None)\n        scan_dr.filename = f\"file{source_idx}\"\n        scan_dr.contents = None\n        scan_drs.append(scan_dr)\n\n    # create convert data records\n    for idx in range(30):\n        data_item = {\"foo\": f\"foo{idx % 10}\", \"bar\": f\"bar{idx % 10}\"}\n        convert_dr = DataRecord.from_parent(foobar_schema, data_item, scan_drs[idx % 10])\n        convert_drs.append(convert_dr)\n\n    # create filter data records\n    for idx in range(30):\n        filter_dr = DataRecord.from_parent(foobar_schema, {}, convert_drs[idx])\n        filter_drs.append(filter_dr)\n\n    # create execution data entries for scan operator\n    for scan_dr in scan_drs:\n        full_op_id = op_sets[0][0].get_full_op_id()\n        source_idx = scan_dr._source_indices[0]\n        record_op_stats = RecordOpStats(\n            record_id=scan_dr._id,\n            record_parent_ids=scan_dr._parent_ids,\n            record_source_indices=scan_dr._source_indices,\n            full_op_id=full_op_id,\n            
op_name=\"MarshalAndScanDataOp\",\n            time_per_record=1.0,\n            cost_per_record=0.0,\n            logical_op_id=\"scan1-logical\",\n            record_state=scan_dr.to_dict(),\n            passed_operator=None,\n            generated_fields=None,\n        )\n        scan_record_set = DataRecordSet([scan_dr], [record_op_stats])\n        execution_data[scan_logical_op_id][source_idx] = [scan_record_set]\n\n    # create execution data entries for convert operator\n    for op_idx, op in enumerate(op_sets[1]):\n        full_op_id = op.get_full_op_id()\n        for source_idx in range(10):\n            record_idx = op_idx * 10 + source_idx\n            convert_dr = convert_drs[record_idx]\n            record_op_stats = RecordOpStats(\n                record_id=convert_dr._id,\n                record_parent_ids=convert_dr._parent_ids,\n                record_source_indices=convert_dr._source_indices,\n                full_op_id=full_op_id,\n                op_name=\"LLMConvertBonded\",\n                time_per_record=1.0,\n                cost_per_record=1.0,\n                logical_op_id=\"convert1-logical\",\n                record_state=convert_dr.to_dict(),\n                passed_operator=None,\n                generated_fields=[\"foo\", \"bar\"],\n            )\n            convert_record_set = DataRecordSet([convert_dr], [record_op_stats])\n            if source_idx not in execution_data[convert_logical_op_id]:\n                execution_data[convert_logical_op_id][source_idx] = [convert_record_set]\n            else:\n                execution_data[convert_logical_op_id][source_idx].append(convert_record_set)\n\n    # create execution data entries for filter operator\n    for op_idx, op in enumerate(op_sets[2]):\n        full_op_id = op.get_full_op_id()\n        for source_idx in range(10):\n            record_idx = op_idx * 10 + source_idx\n            filter_dr = filter_drs[record_idx]\n            record_op_stats = 
RecordOpStats(\n                record_id=filter_dr._id,\n                record_parent_ids=filter_dr._parent_ids,\n                record_source_indices=filter_dr._source_indices,\n                full_op_id=full_op_id,\n                op_name=\"LLMFilter\",\n                time_per_record=1.0,\n                cost_per_record=1.0,\n                logical_op_id=\"filter1-logical\",\n                record_state=filter_dr.to_dict(),\n                passed_operator=bool(source_idx % 2), # odd examples pass filter\n                generated_fields=None,\n            )\n            filter_record_set = DataRecordSet([filter_dr], [record_op_stats])\n            if source_idx not in execution_data[filter_logical_op_id]:\n                execution_data[filter_logical_op_id][source_idx] = [filter_record_set]\n            else:\n                execution_data[filter_logical_op_id][source_idx].append(filter_record_set)\n\n    return execution_data\n\n\n# NOTE: technically the filter should process 10 outputs 3x times\n@pytest.fixture\ndef scan_convert_filter_varied_execution_data(scan_convert_filter_sentinel_plan, foobar_schema):\n    # initialize execution data\n    logical_op_ids = scan_convert_filter_sentinel_plan.logical_op_ids\n    scan_logical_op_id = logical_op_ids[0]\n    convert_logical_op_id = logical_op_ids[1]\n    filter_logical_op_id = logical_op_ids[2]\n    execution_data = {logical_op_id: {} for logical_op_id in logical_op_ids}\n\n    # create data records first\n    scan_drs, convert_drs, filter_drs = [], [], []\n    for source_idx in range(10):\n        scan_dr = DataRecord(TextFile, [source_idx], parent_ids=None)\n        scan_dr.filename = f\"file{source_idx}\"\n        scan_dr.contents = None\n        scan_drs.append(scan_dr)\n\n    # create convert data records\n    models = [Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_1_8B]\n    for model in models:\n        for idx in range(10):\n            data_item = {\"foo\": f\"foo{idx}\", \"bar\": 
f\"bar{idx}-{str(model)}\"}\n            convert_dr = DataRecord.from_parent(foobar_schema, data_item, scan_drs[idx])\n            convert_drs.append(convert_dr)\n\n    # create filter data records\n    for idx in range(30):\n        filter_dr = DataRecord.from_parent(foobar_schema, {}, convert_drs[idx])\n        filter_drs.append(filter_dr)\n\n    # create execution data entries for scan operator\n    for scan_dr in scan_drs:\n        source_idx = scan_dr._source_indices[0]\n        record_op_stats = RecordOpStats(\n            record_id=scan_dr._id,\n            record_parent_ids=scan_dr._parent_ids,\n            record_source_indices=scan_dr._source_indices,\n            full_op_id=\"scan1-phys\",\n            op_name=\"MarshalAndScanDataOp\",\n            time_per_record=1.0,\n            cost_per_record=0.0,\n            logical_op_id=\"scan1-logical\",\n            record_state=scan_dr.to_dict(),\n            passed_operator=None,\n            generated_fields=None,\n        )\n        scan_record_set = DataRecordSet([scan_dr], [record_op_stats])\n        execution_data[scan_logical_op_id][source_idx] = [scan_record_set]\n\n    # create execution data entries for convert operator\n    for idx, convert_dr in enumerate(convert_drs):\n        source_idx = convert_dr._source_indices[0]\n        model = models[idx // 10]\n        record_op_stats = RecordOpStats(\n            record_id=convert_dr._id,\n            record_parent_ids=convert_dr._parent_ids,\n            record_source_indices=convert_dr._source_indices,\n            full_op_id=f\"convert1-phys-{str(model)}\",\n            op_name=\"LLMConvertBonded\",\n            time_per_record=1.0,\n            cost_per_record=1.0,\n            logical_op_id=\"convert1-logical\",\n            record_state=convert_dr.to_dict(),\n            passed_operator=None,\n            generated_fields=[\"foo\", \"bar\"],\n        )\n        convert_record_set = DataRecordSet([convert_dr], [record_op_stats])\n        if 
source_idx not in execution_data[convert_logical_op_id]:\n            execution_data[convert_logical_op_id][source_idx] = [convert_record_set]\n        else:\n            execution_data[convert_logical_op_id][source_idx].append(convert_record_set)\n\n    # create execution data entries for filter operator\n    for idx, filter_dr in enumerate(filter_drs):\n        source_idx = filter_dr._source_indices[0]\n        model = models[idx // 10]\n\n        # GPT-4o passes odd examples\n        # GPT-4o-mini passes even examples\n        # LLAMA3_1_8B passes all examples\n        passed_operator = None\n        if model == Model.GPT_4o:\n            passed_operator = bool(source_idx % 2)\n        elif model == Model.GPT_4o_MINI:\n            passed_operator = not bool(source_idx % 2)\n        elif model == Model.LLAMA3_1_8B:\n            passed_operator = True\n\n        record_op_stats = RecordOpStats(\n            record_id=filter_dr._id,\n            record_parent_ids=filter_dr._parent_ids,\n            record_source_indices=filter_dr._source_indices,\n            full_op_id=f\"filter1-phys-{str(model)}\",\n            op_name=\"LLMFilter\",\n            time_per_record=1.0,\n            cost_per_record=1.0,\n            logical_op_id=\"filter1-logical\",\n            record_state=filter_dr.to_dict(),\n            passed_operator=passed_operator,\n            generated_fields=None,\n        )\n        filter_record_set = DataRecordSet([filter_dr], [record_op_stats])\n        if source_idx not in execution_data[filter_logical_op_id]:\n            execution_data[filter_logical_op_id][source_idx] = [filter_record_set]\n        else:\n            execution_data[filter_logical_op_id][source_idx].append(filter_record_set)\n\n    return execution_data\n\n\n# TODO: are we still using this?\n@pytest.fixture\ndef scan_multi_convert_multi_filter_execution_data(scan_multi_convert_multi_filter_sentinel_plan, foobar_schema, baz_schema):\n    # initialize execution data\n    
logical_op_ids = scan_multi_convert_multi_filter_sentinel_plan.logical_op_ids\n    scan_logical_op_id = logical_op_ids[0]\n    convert1_logical_op_id = logical_op_ids[1]\n    filter1_logical_op_id = logical_op_ids[2]\n    filter2_logical_op_id = logical_op_ids[3]\n    convert2_logical_op_id = logical_op_ids[4]\n    execution_data = {logical_op_id: {} for logical_op_id in logical_op_ids}\n\n    # create data records first\n    scan_drs, convert1_drs, convert2_drs, filter1_drs, filter2_drs = [], [], [], [], []\n    for source_idx in range(10):\n        scan_dr = DataRecord(TextFile, [source_idx], parent_ids=None)\n        scan_dr.filename = f\"file{source_idx}\"\n        scan_dr.contents = None\n        scan_drs.append(scan_dr)\n\n    # create first convert data records\n    models = [Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_1_8B]\n    for model in models:\n        for source_idx in range(10):\n            for one_to_many_idx in range(2):\n                data_item = {\"foo\": f\"foo{source_idx}-one-to-many-{one_to_many_idx}\", \"bar\": f\"bar{source_idx}-{str(model)}\"}\n                convert_dr = DataRecord.from_parent(foobar_schema, data_item, scan_drs[source_idx])\n                convert1_drs.append(convert_dr)\n\n    # create first filter data records\n    for _ in models:\n        for gpt4_convert_dr in convert1_drs[:20]:\n            filter_dr = DataRecord.from_parent(foobar_schema, {}, gpt4_convert_dr)\n            filter1_drs.append(filter_dr)\n\n    # NOTE: assume GPT-4 in filter1 filtered out last 6 out of 20 records\n    # create second filter data records\n    for _ in models:\n        for gpt4_filter_dr in filter1_drs[:14]:\n            filter_dr = DataRecord.from_parent(foobar_schema, {}, gpt4_filter_dr)\n            filter2_drs.append(filter_dr)\n\n    # NOTE: assume GPT-4 in filter2 filtered out last 4 out of 14 records\n    # create second convert data records (second half of records will be filtered out)\n    for model in models:\n        
for gpt4_filter_dr in filter2_drs[:10]:\n            data_item = {\"baz\": f\"baz{str(model)}\"}\n            convert_dr = DataRecord.from_parent(baz_schema, data_item, gpt4_filter_dr)\n            convert2_drs.append(convert_dr)\n\n    # create execution data entries for scan operator\n    for scan_dr in scan_drs:\n        source_idx = scan_dr._source_indices[0]\n        record_op_stats = RecordOpStats(\n            record_id=scan_dr._id,\n            record_parent_ids=scan_dr._parent_ids,\n            record_source_indices=scan_dr._source_indices,\n            full_op_id=\"scan1-phys\",\n            op_name=\"MarshalAndScanDataOp\",\n            time_per_record=1.0,\n            cost_per_record=0.0,\n            logical_op_id=\"scan1-logical\",\n            record_state=scan_dr.to_dict(),\n            passed_operator=None,\n            generated_fields=None,\n        )\n        scan_record_set = DataRecordSet([scan_dr], [record_op_stats])\n        execution_data[scan_logical_op_id][source_idx] = [scan_record_set]\n\n    # create execution data entries for first convert operator\n    for model_idx in range(3):\n        for record_idx in range(10):\n            drs, record_op_stats_lst = [], []\n            for one_to_many_idx in range(2):\n                abs_idx = model_idx * 20 + record_idx * 2 + one_to_many_idx\n                convert_dr = convert1_drs[abs_idx]\n                source_idx = convert_dr._source_indices[0]\n                record_op_stats = RecordOpStats(\n                    record_id=convert_dr._id,\n                    record_parent_ids=convert_dr._parent_ids,\n                    record_source_indices=convert_dr._source_indices,\n                    full_op_id=f\"convert1-phys-{str(models[model_idx])}\",\n                    op_name=\"LLMConvertBonded\",\n                    time_per_record=1.0,\n                    cost_per_record=1.0,\n                    logical_op_id=\"convert1-logical\",\n                    
record_state=convert_dr.to_dict(),\n                    passed_operator=None,\n                    generated_fields=[\"foo\", \"bar\"],\n                )\n                drs.append(convert_dr)\n                record_op_stats_lst.append(record_op_stats)\n            convert_record_set = DataRecordSet(drs, record_op_stats_lst)\n            if source_idx not in execution_data[convert1_logical_op_id]:\n                execution_data[convert1_logical_op_id][source_idx] = [convert_record_set]\n            else:\n                execution_data[convert1_logical_op_id][source_idx].append(convert_record_set)\n\n    # create execution data entries for first filter operator\n    for model_idx in range(3):\n        for record_idx in range(10):\n            for one_to_many_idx in range(2):\n                abs_idx = model_idx * 20 + record_idx * 2 + one_to_many_idx\n                filter_dr = filter1_drs[abs_idx]\n                source_idx = filter_dr._source_indices[0]\n                model = models[model_idx]\n\n                # GPT-4o filters out the final 6 records it sees (source_idx 7, 8, 9)\n                passed_operator = True\n                if model_idx == 0 and source_idx > 6:  # noqa: SIM114\n                    passed_operator = False\n\n                # TODO: are we still using this?\n                # GPT-4o-mini filters out all records with one_to_many_idx == 1\n                elif model_idx == 1 and one_to_many_idx == 1:\n                    passed_operator = False\n\n                # LLAMA3_1_8B passes all records\n\n                record_op_stats = RecordOpStats(\n                    record_id=filter_dr._id,\n                    record_parent_ids=filter_dr._parent_ids,\n                    record_source_indices=filter_dr._source_indices,\n                    full_op_id=f\"filter1-phys-{str(model)}\",\n                    op_name=\"LLMFilter\",\n                    time_per_record=1.0,\n                    cost_per_record=1.0,\n                    logical_op_id=\"filter1-logical\",\n  
                  record_state=filter_dr.to_dict(),\n                    passed_operator=passed_operator,\n                    generated_fields=None,\n                )\n                filter_record_set = DataRecordSet([filter_dr], [record_op_stats])\n                if source_idx not in execution_data[filter1_logical_op_id]:\n                    execution_data[filter1_logical_op_id][source_idx] = [filter_record_set]\n                else:\n                    execution_data[filter1_logical_op_id][source_idx].append(filter_record_set)\n\n    # create execution data entries for second filter operator\n    for model_idx in range(3):\n        for record_idx in range(7):\n            for one_to_many_idx in range(2):\n                abs_idx = model_idx * 14 + record_idx * 2 + one_to_many_idx\n                filter_dr = filter2_drs[abs_idx]\n                source_idx = filter_dr._source_indices[0]\n                model = models[model_idx]\n\n                # TODO: this makes # of records seen by convert2 more complicated\n                # GPT-4o filters out the final 4 records it sees (source_idx 5, 6)\n                passed_operator = True\n                if model_idx == 0 and source_idx > 4:  # noqa: SIM114\n                    passed_operator = False\n\n                # GPT-4o-mini filters out all records with one_to_many_idx == 1\n                elif model_idx == 1 and one_to_many_idx == 1:\n                    passed_operator = False\n\n                # LLAMA3_1_8B passes all records\n\n                # filter out records with abs_idx >= 30\n                record_op_stats = RecordOpStats(\n                    record_id=filter_dr._id,\n                    record_parent_ids=filter_dr._parent_ids,\n                    record_source_indices=filter_dr._source_indices,\n                    full_op_id=f\"filter2-phys-{str(model)}\",\n                    op_name=\"LLMFilter\",\n                    time_per_record=1.0,\n                    cost_per_record=1.0,\n                    
logical_op_id=\"filter2-logical\",\n                    record_state=filter_dr.to_dict(),\n                    passed_operator=passed_operator,\n                    generated_fields=None,\n                )\n                filter_record_set = DataRecordSet([filter_dr], [record_op_stats])\n                if source_idx not in execution_data[filter2_logical_op_id]:\n                    execution_data[filter2_logical_op_id][source_idx] = [filter_record_set]\n                else:\n                    execution_data[filter2_logical_op_id][source_idx].append(filter_record_set)\n\n    # create execution data entries for second convert operator\n    for model_idx in range(3):\n        for record_idx in range(5):\n            for one_to_many_idx in range(2):\n                abs_idx = model_idx * 10 + record_idx * 2 + one_to_many_idx\n                convert_dr = convert2_drs[abs_idx]\n                source_idx = convert_dr._source_indices[0]\n                record_op_stats = RecordOpStats(\n                    record_id=convert_dr._id,\n                    record_parent_ids=convert_dr._parent_ids,\n                    record_source_indices=convert_dr._source_indices,\n                    full_op_id=f\"convert2-phys-{str(models[model_idx])}\",\n                    op_name=\"LLMConvertBonded\",\n                    time_per_record=1.0,\n                    cost_per_record=1.0,\n                    logical_op_id=\"convert2-logical\",\n                    record_state=convert_dr.to_dict(),\n                    passed_operator=None,\n                    generated_fields=[\"baz\"],\n                )\n                convert_record_set = DataRecordSet([convert_dr], [record_op_stats])\n                if source_idx not in execution_data[convert2_logical_op_id]:\n                    execution_data[convert2_logical_op_id][source_idx] = [convert_record_set]\n                else:\n                    
execution_data[convert2_logical_op_id][source_idx].append(convert_record_set)\n\n    return execution_data\n"
  },
  {
    "path": "tests/pytest/fixtures/expected_physical_plans.py",
    "content": "from copy import deepcopy\n\nimport pytest\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.elements.filters import Filter\nfrom palimpzest.core.lib.schemas import TextFile, get_schema_field_names, union_schemas\nfrom palimpzest.core.models import PlanCost\nfrom palimpzest.query.operators.convert import LLMConvertBonded\nfrom palimpzest.query.operators.filter import LLMFilter\nfrom palimpzest.query.operators.logical import BaseScan, ConvertScan, FilteredScan\nfrom palimpzest.query.operators.scan import MarshalAndScanDataOp\nfrom palimpzest.query.optimizer.plan import PhysicalPlan\n\n\n### THREE CONVERTS PHYSICAL PLANS ###\ndef get_three_converts_plan(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema, models, expected_cost, expected_time, expected_quality):\n    # extract node id's from workload Datasets\n    scan_node_id = three_converts_workload._sources[0]._sources[0]._sources[0].id\n    first_convert_node_id = three_converts_workload._sources[0]._sources[0].id\n    second_convert_node_id = three_converts_workload._sources[0].id\n\n    # create physical op for scan operator\n    scan_logical_op = BaseScan(datasource=enron_eval_tiny, output_schema=TextFile)\n    scan_op = MarshalAndScanDataOp(output_schema=TextFile, datasource=enron_eval_tiny, logical_op_id=scan_logical_op.get_logical_op_id())\n\n    # create physical op for first convert operator\n    first_convert_schema = union_schemas([TextFile, email_schema])\n    depends_on = set(get_schema_field_names(scan_logical_op.output_schema, id=scan_node_id))\n    first_convert_logical_op = ConvertScan(input_schema=TextFile, output_schema=first_convert_schema, depends_on=list(depends_on))\n    first_convert_op = LLMConvertBonded(output_schema=first_convert_schema, input_schema=TextFile, model=models[0], depends_on=list(depends_on), logical_op_id=first_convert_logical_op.get_logical_op_id())\n\n    # get physical op id for second convert operators\n    
second_convert_schema = union_schemas([first_convert_schema, foobar_schema])\n    depends_on.update(get_schema_field_names(first_convert_logical_op.output_schema, id=first_convert_node_id))\n    second_convert_logical_op = ConvertScan(input_schema=first_convert_schema, output_schema=second_convert_schema, depends_on=list(depends_on))\n    second_convert_op = LLMConvertBonded(output_schema=second_convert_schema, input_schema=first_convert_schema, model=models[1], depends_on=list(depends_on), logical_op_id=second_convert_logical_op.get_logical_op_id())\n\n    # get physical op id for third convert operators\n    third_convert_schema = union_schemas([second_convert_schema, baz_schema])\n    depends_on.update(get_schema_field_names(second_convert_logical_op.output_schema, id=second_convert_node_id))\n    third_convert_logical_op = ConvertScan(input_schema=second_convert_schema, output_schema=third_convert_schema, depends_on=list(depends_on))\n    third_convert_op = LLMConvertBonded(output_schema=third_convert_schema, input_schema=second_convert_schema, model=models[2], depends_on=list(depends_on), logical_op_id=third_convert_logical_op.get_logical_op_id())\n\n    plan = PhysicalPlan._from_ops(\n        ops=[scan_op, first_convert_op, second_convert_op, third_convert_op],\n        plan_cost=PlanCost(cost=expected_cost, time=expected_time, quality=expected_quality),\n    )\n    return plan\n\n@pytest.fixture\ndef three_converts_min_cost_expected_plan(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    cardinality = len(enron_eval_tiny)\n    expected_cost = 0.3 * cardinality\n    expected_time = 4.0 * cardinality\n    expected_quality = 1.0\n\n    return get_three_converts_plan(\n        three_converts_workload=three_converts_workload,\n        enron_eval_tiny=enron_eval_tiny,\n        email_schema=email_schema,\n        foobar_schema=foobar_schema,\n        baz_schema=baz_schema,\n        models=[Model.LLAMA3_3_70B, Model.GPT_4o_MINI, 
Model.LLAMA3_3_70B],\n        expected_cost=expected_cost,\n        expected_time=expected_time,\n        expected_quality=expected_quality,\n    )\n\n@pytest.fixture\ndef three_converts_max_quality_expected_plan(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    cardinality = len(enron_eval_tiny)\n    expected_cost = 3.0 * cardinality\n    expected_time = 4.0 * cardinality\n    expected_quality = 0.81\n\n    return get_three_converts_plan(\n        three_converts_workload=three_converts_workload,\n        enron_eval_tiny=enron_eval_tiny,\n        email_schema=email_schema,\n        foobar_schema=foobar_schema,\n        baz_schema=baz_schema,\n        models=[Model.GPT_4o, Model.GPT_4o_MINI, Model.GPT_4o],\n        expected_cost=expected_cost,\n        expected_time=expected_time,\n        expected_quality=expected_quality,\n    )\n\n@pytest.fixture\ndef three_converts_min_cost_at_fixed_quality_expected_plan(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    cardinality = len(enron_eval_tiny)\n    expected_cost = 2.0 * cardinality\n    expected_time = 4.0 * cardinality\n    expected_quality = 0.81\n\n    return get_three_converts_plan(\n        three_converts_workload=three_converts_workload,\n        enron_eval_tiny=enron_eval_tiny,\n        email_schema=email_schema,\n        foobar_schema=foobar_schema,\n        baz_schema=baz_schema,\n        models=[Model.LLAMA3_3_70B, Model.GPT_4o_MINI, Model.GPT_4o],\n        expected_cost=expected_cost,\n        expected_time=expected_time,\n        expected_quality=expected_quality,\n    )\n\n@pytest.fixture\ndef three_converts_max_quality_at_fixed_cost_expected_plan(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    cardinality = len(enron_eval_tiny)\n    expected_cost = (0.9 / cardinality) * cardinality\n    expected_time = 4.0 * cardinality\n    expected_quality = 0.72\n\n    return 
get_three_converts_plan(\n        three_converts_workload=three_converts_workload,\n        enron_eval_tiny=enron_eval_tiny,\n        email_schema=email_schema,\n        foobar_schema=foobar_schema,\n        baz_schema=baz_schema,\n        models=[Model.GPT_4o_MINI, Model.GPT_4o_MINI, Model.GPT_4o],\n        expected_cost=expected_cost,\n        expected_time=expected_time,\n        expected_quality=expected_quality,\n    )\n\n### ONE FILTER ONE CONVERT PHYSICAL PLANS ###\ndef get_one_filter_one_convert_plan(one_filter_one_convert_workload, enron_eval_tiny, email_schema, models, expected_cost, expected_time, expected_quality):\n    dataset_nodes = []\n    node = deepcopy(one_filter_one_convert_workload)\n    while not node.is_root:\n        dataset_nodes.append(node)\n        node = node._sources[0]\n    dataset_nodes.append(node)\n    dataset_nodes = list(reversed(dataset_nodes))\n\n    # extract node id's from workload Datasets\n    scan_node_id = dataset_nodes[0].id\n    first_filter_node_id = dataset_nodes[1].id\n\n    # create physical op for scan operator\n    scan_logical_op = BaseScan(datasource=enron_eval_tiny, output_schema=TextFile)\n    scan_op = MarshalAndScanDataOp(output_schema=TextFile, datasource=enron_eval_tiny, logical_op_id=scan_logical_op.get_logical_op_id())\n\n    # get physical op id for first filter operator\n    depends_on = set(get_schema_field_names(scan_logical_op.output_schema, id=scan_node_id))\n    first_filter_logical_op = FilteredScan(input_schema=TextFile, output_schema=TextFile, filter=Filter(\"filter1\"), depends_on=list(depends_on))\n    first_filter_op = LLMFilter(output_schema=TextFile, input_schema=TextFile, filter=Filter(\"filter1\"), model=models[0], depends_on=list(depends_on), logical_op_id=first_filter_logical_op.get_logical_op_id())\n\n    # create physical op for first convert operator\n    first_convert_schema = union_schemas([TextFile, email_schema])\n    depends_on = 
depends_on.union(set(get_schema_field_names(first_filter_logical_op.output_schema, id=first_filter_node_id)))\n    first_convert_logical_op = ConvertScan(input_schema=TextFile, output_schema=first_convert_schema, depends_on=list(depends_on))\n    first_convert_op = LLMConvertBonded(output_schema=first_convert_schema, input_schema=TextFile, model=models[1], depends_on=list(depends_on), logical_op_id=first_convert_logical_op.get_logical_op_id())\n\n    plan = PhysicalPlan._from_ops(\n        ops=[scan_op, first_filter_op, first_convert_op],\n        plan_cost=PlanCost(cost=expected_cost, time=expected_time, quality=expected_quality),\n    )\n    return plan\n\n@pytest.fixture\ndef one_filter_one_convert_min_cost_expected_plan(one_filter_one_convert_workload, enron_eval_tiny, email_schema):\n    cardinality = len(enron_eval_tiny)\n    expected_cost = (\n        1.0 * 1.0 * cardinality\n        + 1.0 * 0.5 * cardinality\n    )\n    expected_time = (\n        2.0 * cardinality\n        + 1.0 * 0.5 * cardinality\n    )\n    expected_quality = 1.0\n\n    return get_one_filter_one_convert_plan(\n        one_filter_one_convert_workload=one_filter_one_convert_workload,\n        enron_eval_tiny=enron_eval_tiny,\n        email_schema=email_schema,\n        models=[Model.LLAMA3_3_70B, Model.GPT_4o],\n        expected_cost=expected_cost,\n        expected_time=expected_time,\n        expected_quality=expected_quality,\n    )\n\n### TWO CONVERTS TWO FILTERS PHYSICAL PLANS ###\ndef get_two_converts_two_filters_plan(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema, first_filter_str, models, expected_cost, expected_time, expected_quality):\n    # extract node id's from workload Datasets\n    scan_node_id = two_converts_two_filters_workload._sources[0]._sources[0]._sources[0]._sources[0].id\n    first_convert_node_id = two_converts_two_filters_workload._sources[0]._sources[0]._sources[0].id\n\n    # create physical op for scan operator\n    
scan_logical_op = BaseScan(datasource=enron_eval_tiny, output_schema=TextFile)\n    scan_op = MarshalAndScanDataOp(output_schema=TextFile, datasource=enron_eval_tiny, logical_op_id=scan_logical_op.get_logical_op_id())\n\n    # create physical op for first convert operator\n    first_convert_schema = union_schemas([TextFile, email_schema])\n    depends_on = set(get_schema_field_names(scan_logical_op.output_schema, id=scan_node_id))\n    first_convert_logical_op = ConvertScan(input_schema=TextFile, output_schema=first_convert_schema, depends_on=list(depends_on))\n    first_convert_op = LLMConvertBonded(output_schema=first_convert_schema, input_schema=TextFile, model=models[0], depends_on=list(depends_on), logical_op_id=first_convert_logical_op.get_logical_op_id())\n\n    # get physical op id for second convert operators\n    second_convert_schema = union_schemas([first_convert_schema, foobar_schema])\n    depends_on.update(get_schema_field_names(first_convert_logical_op.output_schema, id=first_convert_node_id))\n    second_convert_logical_op = ConvertScan(input_schema=first_convert_schema, output_schema=second_convert_schema, depends_on=list(depends_on))\n    second_convert_op = LLMConvertBonded(output_schema=second_convert_schema, input_schema=first_convert_schema, model=models[1], depends_on=list(depends_on), logical_op_id=second_convert_logical_op.get_logical_op_id())\n\n    # get physical op id for first filter operator\n    depends_on = [field for field in get_schema_field_names(first_convert_logical_op.output_schema, id=first_convert_node_id) if \"sender\" in field]\n    first_filter_logical_op = FilteredScan(input_schema=second_convert_schema, output_schema=second_convert_schema, filter=Filter(\"filter1\"), depends_on=list(depends_on))\n    first_filter_op = LLMFilter(output_schema=second_convert_schema, input_schema=second_convert_schema, filter=Filter(\"filter1\"), model=models[2], depends_on=list(depends_on), 
logical_op_id=first_filter_logical_op.get_logical_op_id())\n\n    # get physical op id for second filter operator\n    depends_on = [field for field in get_schema_field_names(first_convert_logical_op.output_schema, id=first_convert_node_id) if \"subject\" in field]\n    second_filter_logical_op = FilteredScan(input_schema=second_convert_schema, output_schema=second_convert_schema, filter=Filter(\"filter2\"), depends_on=list(depends_on))\n    second_filter_op = LLMFilter(output_schema=second_convert_schema, input_schema=second_convert_schema, filter=Filter(\"filter2\"), model=models[3], depends_on=list(depends_on), logical_op_id=second_filter_logical_op.get_logical_op_id())\n\n    plan = PhysicalPlan._from_ops(\n        ops=(\n            [scan_op, first_convert_op, first_filter_op, second_filter_op, second_convert_op]\n            if first_filter_str == \"filter1\"\n            else [scan_op, first_convert_op, second_filter_op, first_filter_op, second_convert_op]\n        ),\n        plan_cost=PlanCost(cost=expected_cost, time=expected_time, quality=expected_quality),\n    )\n    return plan\n\n@pytest.fixture\ndef two_converts_two_filters_min_cost_expected_plan(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema):\n    cardinality = len(enron_eval_tiny)\n    expected_cost = (\n        0.1 * 1.0 * cardinality\n        + 1.0 * 1.0 * cardinality\n        + 1.0 * (1 / 3) * cardinality\n        + 0.1 * (1 / 3) * 0.5 * cardinality\n    )\n    expected_time = (\n        3.0 * cardinality\n        + (1 / 3) * cardinality\n        + (1 / 3) * 0.5 * cardinality\n    )\n    expected_quality = 1.0\n\n    return get_two_converts_two_filters_plan(\n        two_converts_two_filters_workload=two_converts_two_filters_workload,\n        enron_eval_tiny=enron_eval_tiny,\n        email_schema=email_schema,\n        foobar_schema=foobar_schema,\n        first_filter_str=\"filter2\",\n        # models: [first convert, second convert, filter1, filter2]\n      
  models=[Model.LLAMA3_3_70B, Model.GPT_4o_MINI, Model.GPT_4o_MINI, Model.LLAMA3_3_70B],\n        expected_cost=expected_cost,\n        expected_time=expected_time,\n        expected_quality=expected_quality,\n    )\n\n@pytest.fixture\ndef two_converts_two_filters_max_quality_expected_plan(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema):\n    cardinality = len(enron_eval_tiny)\n    expected_cost = (\n        0.5 * cardinality\n        + 1.0 * cardinality\n        + 0.75 * 0.5 * cardinality\n        + 1.0 * 0.5 * 0.75 * cardinality\n    )\n    expected_time = (\n        3.0 * cardinality\n        + 0.5 * cardinality\n        + 0.5 * 0.75 * cardinality\n    )\n    expected_quality = 0.81\n\n    return get_two_converts_two_filters_plan(\n        two_converts_two_filters_workload=two_converts_two_filters_workload,\n        enron_eval_tiny=enron_eval_tiny,\n        email_schema=email_schema,\n        foobar_schema=foobar_schema,\n        first_filter_str=\"filter2\",\n        # models: [first convert, second convert, filter1, filter2]\n        models=[Model.LLAMA3_3_70B, Model.GPT_4o_MINI, Model.LLAMA3_3_70B, Model.GPT_4o],\n        expected_cost=expected_cost,\n        expected_time=expected_time,\n        expected_quality=expected_quality,\n    )\n\n@pytest.fixture\ndef two_converts_two_filters_min_cost_at_fixed_quality_expected_plan(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema):\n    cardinality = len(enron_eval_tiny)\n    expected_cost = (\n        0.5 * cardinality\n        + 1.0 * cardinality\n        + 0.5 * 0.5 * cardinality\n        + 0.3 * 0.5 * 0.75 * cardinality\n    )\n    expected_time = (\n        3.0 * cardinality\n        + 0.5 * cardinality\n        + 0.5 * 0.75 * cardinality\n    )\n    expected_quality = 0.81\n\n    return get_two_converts_two_filters_plan(\n        two_converts_two_filters_workload=two_converts_two_filters_workload,\n        enron_eval_tiny=enron_eval_tiny,\n        
email_schema=email_schema,\n        foobar_schema=foobar_schema,\n        first_filter_str=\"filter1\",\n        # models: [first convert, second convert, filter1, filter2]\n        models=[Model.GPT_4o_MINI, Model.LLAMA3_3_70B, Model.GPT_4o_MINI, Model.LLAMA3_3_70B],\n        expected_cost=expected_cost,\n        expected_time=expected_time,\n        expected_quality=expected_quality,\n    )\n\n@pytest.fixture\ndef two_converts_two_filters_max_quality_at_fixed_cost_expected_plan(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema):\n    cardinality = len(enron_eval_tiny)\n    expected_cost = (\n        (0.3 / cardinality) * cardinality\n        + (0.3 / cardinality) * cardinality\n        + (0.3 / cardinality) * 0.5 * cardinality\n        + (0.2 / cardinality) * 0.5 * cardinality\n    )\n    expected_time = (\n        3.0 * cardinality\n        + 0.5 * cardinality\n        + 0.5 * cardinality\n    )\n    expected_quality = 0.81\n\n    return get_two_converts_two_filters_plan(\n        two_converts_two_filters_workload=two_converts_two_filters_workload,\n        enron_eval_tiny=enron_eval_tiny,\n        email_schema=email_schema,\n        foobar_schema=foobar_schema,\n        first_filter_str=\"filter2\",\n        # models: [first convert, second convert, filter1, filter2]\n        models=[Model.GPT_4o_MINI, Model.LLAMA3_3_70B, Model.GPT_4o, Model.GPT_4o],\n        expected_cost=expected_cost,\n        expected_time=expected_time,\n        expected_quality=expected_quality,\n    )\n"
  },
  {
    "path": "tests/pytest/fixtures/expected_qualities.py",
    "content": "import pytest\n\nfrom palimpzest.constants import Model\n\n\n# NOTE: this relies on knowledge of the fixtures in fixtures/execution_data.py\n@pytest.fixture\ndef scan_convert_filter_qualities(scan_convert_filter_execution_data):\n    expected_qualities = {\n        logical_op_id: {\n            source_idx: [[1.0] for _ in record_sets]\n            for source_idx, record_sets in source_idx_to_record_sets.items()\n        }\n        for logical_op_id, source_idx_to_record_sets in scan_convert_filter_execution_data.items()\n    }\n    return expected_qualities\n\n@pytest.fixture\ndef scan_convert_filter_empty_qualities(scan_convert_filter_execution_data):\n    expected_qualities = {}\n    for logical_op_id, source_idx_to_record_sets in scan_convert_filter_execution_data.items():\n        expected_qualities[logical_op_id] = {}\n        for source_idx, record_sets in source_idx_to_record_sets.items():\n            expected_qualities[logical_op_id][source_idx] = []\n            for record_set in record_sets:\n                record_set_expected_qualities = []\n                for record_op_stats in record_set.record_op_stats:\n                    quality = None\n                    if record_op_stats.logical_op_id == \"scan1-logical\":  # noqa: SIM114\n                        quality = 1.0\n                    elif record_op_stats.logical_op_id == \"convert1-logical\":\n                        quality = 1.0\n                    elif record_op_stats.logical_op_id == \"filter1-logical\":\n                        # by construction, champion model expects no outputs but models output odd records,\n                        # so odd records get quality 0.0 and even records get quality 1.0\n                        quality = int(not bool(source_idx % 2))\n\n                    record_set_expected_qualities.append(quality)\n                expected_qualities[logical_op_id][source_idx].append(record_set_expected_qualities)\n\n    return 
expected_qualities\n\n@pytest.fixture\ndef scan_convert_filter_varied_qualities(scan_convert_filter_varied_execution_data):\n    expected_qualities = {}\n    for logical_op_id, source_idx_to_record_sets in scan_convert_filter_varied_execution_data.items():\n        expected_qualities[logical_op_id] = {}\n        for source_idx, record_sets in source_idx_to_record_sets.items():\n            expected_qualities[logical_op_id][source_idx] = []\n            for record_set in record_sets:\n                record_set_expected_qualities = []\n                for record_op_stats in record_set.record_op_stats:\n                    quality = None\n                    if record_op_stats.logical_op_id == \"scan1-logical\":\n                        quality = 1.0\n                    elif record_op_stats.logical_op_id == \"convert1-logical\":\n                        quality = 1.0 if str(Model.GPT_4o) in record_op_stats.full_op_id else 0.5\n                    elif record_op_stats.logical_op_id == \"filter1-logical\":\n                        if str(Model.GPT_4o) in record_op_stats.full_op_id:\n                            quality = 1.0\n                        elif str(Model.GPT_4o_MINI) in record_op_stats.full_op_id:\n                            # by construction, champion model expects odd record outputs but GPT-4o-mini outputs even records,\n                            # so all records get quality 0.0\n                            quality = 0.0\n                        elif str(Model.LLAMA3_1_8B) in record_op_stats.full_op_id:\n                            # by construction, champion model expects odd record outputs but LLAMA3_1_8B outputs all records,\n                            # so even records get quality 0.0 and odd records get quality 1.0\n                            quality = int(bool(source_idx % 2))\n\n                    record_set_expected_qualities.append(quality)\n                expected_qualities[logical_op_id][source_idx].append(record_set_expected_qualities)\n\n    
return expected_qualities\n\n@pytest.fixture\ndef scan_convert_filter_varied_override_qualities(scan_convert_filter_varied_execution_data):\n    \"\"\"\n    NOTE: this test in particular kind of sucks, it is really hard to verify what correct behavior is\n\n    The score_quality() function will use expected_output record for scoring quality when (a subset\n    of) its record state matches a record_op_stats object perfectly. If no match is found, then the\n    champion model is used. Qualities here are computed accordingly.\n    \"\"\"\n    expected_qualities = {}\n    for logical_op_id, source_idx_to_record_sets in scan_convert_filter_varied_execution_data.items():\n        expected_qualities[logical_op_id] = {}\n        for source_idx, record_sets in source_idx_to_record_sets.items():\n            expected_qualities[logical_op_id][source_idx] = []\n            for record_set in record_sets:\n                record_set_expected_qualities = []\n                for record_op_stats in record_set.record_op_stats:\n                    quality = None\n                    if record_op_stats.logical_op_id == \"scan1-logical\":\n                        quality = 1.0\n                    elif record_op_stats.logical_op_id == \"convert1-logical\":\n                        # by construction, expected output is used to score records with idx % 3 > 0, i.e. records 1, 2, 4, 5, 7, 8\n                        # for expected outputs w/record idx < 6 (i.e. records 1, 2, 4, 5) the expected `bar` value is f\"bar{idx}-{str(Model.GPT_4o_MINI)}\";\n                        # for expected outputs w/record idx >= 6 (i.e. 
records 7, 8) the expected `bar` value is f\"bar{idx}-{str(Model.LLAMA3_1_8B)}\";\n                        # for records 0, 3, 6, 9; the champion model expects outputs f\"bar{idx}-{str(Model.GPT_4o)}\"\n                        if source_idx % 3 > 0 and source_idx < 6:\n                            quality = 1.0 if str(Model.GPT_4o_MINI) in record_op_stats.full_op_id else 0.5\n                        elif source_idx % 3 > 0:\n                            quality = 1.0 if str(Model.LLAMA3_1_8B) in record_op_stats.full_op_id else 0.5\n                        else:\n                            quality = 1.0 if str(Model.GPT_4o) in record_op_stats.full_op_id else 0.5\n\n                    elif record_op_stats.logical_op_id == \"filter1-logical\":\n                        # by construction, expected output passes all records with idx % 3 > 0, i.e. records 1, 2, 4, 5, 7, 8\n                        # - it expects GPT-3.5 for records with idx < 6 (i.e. records 1, 2, 4, 5)\n                        # - it expects LLAMA3_1_8B for records with idx >= 6 (i.e. 
records 7, 8)\n                        # champion model passes all odd records\n\n                        # using expected_record and match found\n                        if source_idx % 3 > 0 and source_idx < 6 and str(Model.GPT_4o_MINI) in record_op_stats.full_op_id:  # noqa: SIM114\n                            quality = int(record_op_stats.passed_operator)\n                        \n                        elif source_idx % 3 > 0 and source_idx >= 6 and str(Model.LLAMA3_1_8B) in record_op_stats.full_op_id:  # noqa: SIM114\n                            quality = int(record_op_stats.passed_operator)\n\n                        # using champion record and it thinks record should pass\n                        elif source_idx % 2:\n                            quality = int(record_op_stats.passed_operator)\n\n                        # using champion record and it thinks record should not pass\n                        else:\n                            quality = int(not record_op_stats.passed_operator)\n\n                    record_set_expected_qualities.append(quality)\n                expected_qualities[logical_op_id][source_idx].append(record_set_expected_qualities)\n\n    return expected_qualities\n\n\n@pytest.fixture\ndef scan_multi_convert_multi_filter_qualities(scan_multi_convert_multi_filter_execution_data):\n    expected_qualities = {}\n    for logical_op_id, source_idx_to_record_sets in scan_multi_convert_multi_filter_execution_data.items():\n        expected_qualities[logical_op_id] = {}\n        for source_idx, record_sets in source_idx_to_record_sets.items():\n            expected_qualities[logical_op_id][source_idx] = []\n            for record_set in record_sets:\n                record_set_expected_qualities = []\n                for one_to_many_idx, record_op_stats in enumerate(record_set.record_op_stats):\n                    quality = None\n                    if record_op_stats.logical_op_id == \"scan1-logical\":\n                        quality = 
1.0\n                    elif record_op_stats.logical_op_id == \"convert1-logical\":\n                        # by construction, expected output is used to score records with source_idx < 7\n                        # the second output (one_to_many_idx == 1) for source_idx == 0 is not expected\n                        # the expected `bar` value is f\"bar{source_idx}-{str(Model.GPT_4o_MINI)}\";\n                        # for records with source_idx >= 7; the champion model expects outputs f\"bar{idx}-{str(Model.GPT_4o)}\"\n                        if source_idx == 0 and one_to_many_idx == 1:\n                            quality = 0.0\n                        elif source_idx < 7:\n                            quality = 1.0 if str(Model.GPT_4o_MINI) in record_op_stats.full_op_id else 0.5\n                        else:\n                            quality = 1.0 if str(Model.GPT_4o) in record_op_stats.full_op_id else 0.5\n\n                    elif record_op_stats.logical_op_id == \"filter1-logical\":\n                        # by construction, expected output passes all records with source_idx < 7\n                        # champion model also passes all records with source_idx < 7\n                        if source_idx < 7:\n                            quality = int(record_op_stats.passed_operator)\n                        else:\n                            quality = int(not record_op_stats.passed_operator)\n\n                    elif record_op_stats.logical_op_id == \"filter2-logical\":\n                        # by construction, expected output passes all records with source_idx < 7\n                        # champion model passes all records with source_idx < 5\n\n                        # the champion model and expected output agree on the filter decision for all records with source_idx < 5\n                        if source_idx < 5:  # noqa: SIM114\n                            quality = int(record_op_stats.passed_operator)\n\n                        # for records 
with source_idxs in [5, 6] if the *convert* model used was GPT_4o_MINI, then the\n                        # expected record will match and we expect the record to pass\n                        elif source_idx < 7 and str(Model.GPT_4o_MINI) in record_op_stats.record_state['bar']:\n                            quality = int(record_op_stats.passed_operator)\n\n                        # for records with source_idxs in [5, 6] if the model used was *not* GPT_4o_MINI, then the champion model will be used\n                        # and it does *not* expect the record to pass\n                        elif source_idx < 7:\n                            quality = int(not record_op_stats.passed_operator)\n\n                        # for all records with source_idxs >= 7, the champion is used and it does not pass the record\n                        else:\n                            quality = int(not record_op_stats.passed_operator)\n\n                    elif record_op_stats.logical_op_id == \"convert2-logical\":\n                        # by construction, expected output is used to score records with source_idx < 7\n                        # the second output (one_to_many_idx == 1) for source_idx == 0 is not expected\n                        # the expected `bar` value is f\"bar{source_idx}-{str(Model.GPT_4o_MINI)}\";\n                        # for records with source_idx >= 7; the champion model expects outputs f\"bar{idx}-{str(Model.GPT_4o)}\"\n                        if source_idx == 0 and one_to_many_idx == 1:\n                            quality = 0.0\n                        else:\n                            quality = 1.0 if str(Model.GPT_4o_MINI) in record_op_stats.full_op_id else 0.0\n\n                    record_set_expected_qualities.append(quality)\n                expected_qualities[logical_op_id][source_idx].append(record_set_expected_qualities)\n\n    return expected_qualities\n"
  },
  {
    "path": "tests/pytest/fixtures/expected_records.py",
    "content": "import os\n\nimport pytest\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.elements.records import DataRecord, DataRecordSet\nfrom palimpzest.core.lib.schemas import File\n\n\n### EXPECTED RECORDS ###\n@pytest.fixture\ndef enron_all_expected_records(enron_eval_tiny_data_path):\n    data_records = []\n    for source_idx, file in enumerate(sorted(os.listdir(enron_eval_tiny_data_path))):\n        with open(os.path.join(enron_eval_tiny_data_path, file), \"rb\") as f:\n            contents = f.read()\n        data_item = File(filename=file, contents=contents)\n        dr = DataRecord(data_item=data_item, source_indices=[source_idx])\n        data_records.append(dr)\n\n    return data_records\n\n\n@pytest.fixture\ndef enron_filter_expected_records(enron_all_expected_records):\n    data_records = [\n        record\n        for record in enron_all_expected_records\n        if record.filename in [\"buy-r-inbox-628.txt\", \"buy-r-inbox-749.txt\", \"zipper-a-espeed-28.txt\"]\n    ]\n    return data_records\n\n\n@pytest.fixture\ndef real_estate_all_expected_records(real_estate_eval_tiny_data_path, image_real_estate_listing_schema):\n    expected_listings = sorted(os.listdir(real_estate_eval_tiny_data_path))\n    listing_to_modern_and_attractive = {\"listing1\": True, \"listing2\": False, \"listing3\": False}\n    listing_to_has_natural_sunlight = {\"listing1\": True, \"listing2\": True, \"listing3\": False}\n\n    data_records = []\n    for source_idx, listing in enumerate(expected_listings):\n        if listing == \".DS_Store\":\n            continue\n        data_item = image_real_estate_listing_schema(\n            listing=listing,\n            text_content=\"\",\n            image_filepaths=[],\n            is_modern_and_attractive=listing_to_modern_and_attractive[listing],\n            has_natural_sunlight=listing_to_has_natural_sunlight[listing],\n        )\n        dr = DataRecord(data_item=data_item, source_indices=[source_idx])\n        
data_records.append(dr)\n\n    return data_records\n\n\n@pytest.fixture\ndef real_estate_one_to_many_expected_records(real_estate_eval_tiny_data_path, room_real_estate_listing_schema):\n    expected_listings = sorted(os.listdir(real_estate_eval_tiny_data_path))\n    listing_to_rooms = {\n        \"listing1\": [\"other\", \"living_room\", \"kitchen\"],\n        \"listing2\": [\"other\", \"living_room\", \"living_room\"],\n        \"listing3\": [\"other\", \"living_room\", \"other\"],\n    }\n\n    data_records = []\n    for source_idx, listing in enumerate(expected_listings):\n        if listing == \".DS_Store\":\n            continue\n        for room in listing_to_rooms[listing]:\n            data_item = room_real_estate_listing_schema(\n                listing=listing,\n                text_content=\"\",\n                image_filepaths=[],\n                room=room,\n            )\n            dr = DataRecord(data_item=data_item, source_indices=[source_idx])\n            data_records.append(dr)\n\n    return data_records\n\n# NOTE: this relies on knowledge of the fixtures in fixtures/execution_data.py\n@pytest.fixture\ndef scan_convert_filter_expected_outputs(foobar_schema):\n    # create expected outputs to match execution data and champion outputs\n    expected_outputs = {}\n    for source_idx in range(10):\n        if source_idx % 2:\n            data_item = foobar_schema(\n                filename=f\"file{source_idx}\",\n                contents=None,\n                foo=f\"foo{source_idx}\",\n                bar=f\"bar{source_idx}\",\n            )\n            dr = DataRecord(data_item, [source_idx])\n            dr._passed_operator = True # bool(source_idx % 2)\n            expected_outputs[source_idx] = DataRecordSet([dr], None)\n\n    return expected_outputs\n\n@pytest.fixture\ndef scan_convert_filter_empty_expected_outputs():\n    return {}\n\n@pytest.fixture\ndef scan_convert_filter_varied_expected_outputs(foobar_schema):\n    # create expected 
outputs to differ from champion outputs;\n    # - champion outputs pass odd records\n    # - champion outputs always expect bar=f\"bar{idx}-{str(Model.GPT_4o)}\"\n    expected_outputs = {}\n    for source_idx in range(10):\n        if source_idx % 3 > 0:\n            data_item = foobar_schema(\n                filename=f\"file{source_idx}\",\n                contents=None,\n                foo=f\"foo{source_idx}\",\n                bar=f\"bar{source_idx}-{str(Model.GPT_4o_MINI)}\" if source_idx < 6 else f\"bar{source_idx}-{str(Model.LLAMA3_1_8B)}\",\n            )\n            dr = DataRecord(data_item, [source_idx])\n            dr._passed_operator = True\n            expected_outputs[source_idx] = DataRecordSet([dr], None)\n\n    return expected_outputs\n\n\n@pytest.fixture\ndef scan_multi_convert_multi_filter_expected_outputs(foobar_schema, baz_schema):\n    # create expected outputs to differ from champion outputs;\n    # - champion outputs pass source_idx < 5\n    # - champion outputs always expect GPT_4o outputs\n    # expected outputs:\n    # - pass source_idx < 7\n    # - always expect GPT_4o_MINI outputs\n    # - do not expect second one-to-many output for source_idx == 0\n    expected_outputs = {}\n    for source_idx in range(7):\n        drs = []\n        for one_to_many_idx in range(2):\n            if source_idx == 0 and one_to_many_idx == 1:\n                continue\n\n            data_item = foobar_schema(\n                filename=f\"file{source_idx}\",\n                contents=None,\n                foo=f\"foo{source_idx}-one-to-many-{one_to_many_idx}\",\n                bar=f\"bar{source_idx}-{str(Model.GPT_4o_MINI)}\",\n                baz=f\"baz{str(Model.GPT_4o_MINI)}\",\n            )\n            dr = DataRecord(data_item, [source_idx])\n            dr._passed_operator = True\n            drs.append(dr)\n\n        expected_outputs[source_idx] = DataRecordSet(drs, None)\n\n    return expected_outputs"
  },
  {
    "path": "tests/pytest/fixtures/models.py",
    "content": "from os import getenv\n\nimport pytest\n\nfrom palimpzest.constants import Model\n\n\n@pytest.fixture\ndef embedding_text_only_model():\n    return Model.NOMIC_EMBED_TEXT if getenv(\"TESTS_USE_OLLAMA_FOR_EMBEDDING\") else Model.TEXT_EMBEDDING_3_SMALL\n"
  },
  {
    "path": "tests/pytest/fixtures/operator_to_stats.py",
    "content": "from copy import deepcopy\n\nimport pytest\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.elements.filters import Filter\nfrom palimpzest.core.lib.schemas import TextFile, get_schema_field_names, union_schemas\nfrom palimpzest.query.operators.convert import LLMConvertBonded\nfrom palimpzest.query.operators.filter import LLMFilter\nfrom palimpzest.query.operators.logical import BaseScan, ConvertScan, FilteredScan\nfrom palimpzest.query.operators.scan import MarshalAndScanDataOp\n\n\n### THREE CONVERTS OPERATOR-TO-STATS ###\ndef get_three_converts_logical_and_full_op_ids(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    # extract node id's from workload Datasets\n    scan_node_id = three_converts_workload._sources[0]._sources[0]._sources[0].id\n    first_convert_node_id = three_converts_workload._sources[0]._sources[0].id\n    second_convert_node_id = three_converts_workload._sources[0].id\n    # third_convert_node_id = three_converts_workload.id\n\n    # get full and logical op id for scan operator\n    scan_logical_op = BaseScan(datasource=enron_eval_tiny, output_schema=TextFile)\n    scan_logical_op_id = scan_logical_op.get_logical_op_id()\n    scan_full_op_id = MarshalAndScanDataOp(logical_op_id=scan_logical_op_id, output_schema=TextFile, datasource=enron_eval_tiny).get_full_op_id()\n\n    # get full op ids for first convert operators\n    depends_on = set(get_schema_field_names(scan_logical_op.output_schema, id=scan_node_id))\n    first_convert_logical_op = ConvertScan(input_schema=TextFile, output_schema=email_schema, depends_on=list(depends_on))\n    first_convert_logical_op_id = first_convert_logical_op.get_logical_op_id()\n    first_convert_gpt4o_full_op_id = LLMConvertBonded(logical_op_id=first_convert_logical_op_id, output_schema=email_schema, input_schema=TextFile, model=Model.GPT_4o, depends_on=list(depends_on)).get_full_op_id()\n    first_convert_gpt4o_mini_full_op_id = 
LLMConvertBonded(logical_op_id=first_convert_logical_op_id, output_schema=email_schema, input_schema=TextFile, model=Model.GPT_4o_MINI, depends_on=list(depends_on)).get_full_op_id()\n    first_convert_llama_full_op_id = LLMConvertBonded(logical_op_id=first_convert_logical_op_id, output_schema=email_schema, input_schema=TextFile, model=Model.LLAMA3_3_70B, depends_on=list(depends_on)).get_full_op_id()\n\n    # get full op ids for second convert operators\n    depends_on.update(get_schema_field_names(first_convert_logical_op.output_schema, id=first_convert_node_id))\n    second_output_schema = union_schemas([email_schema, foobar_schema])\n    second_convert_logical_op = ConvertScan(input_schema=email_schema, output_schema=second_output_schema, depends_on=list(depends_on))\n    second_convert_logical_op_id = second_convert_logical_op.get_logical_op_id()\n    second_convert_gpt4o_full_op_id = LLMConvertBonded(logical_op_id=second_convert_logical_op_id, output_schema=second_output_schema, input_schema=email_schema, model=Model.GPT_4o, depends_on=list(depends_on)).get_full_op_id()\n    second_convert_gpt4o_mini_full_op_id = LLMConvertBonded(logical_op_id=second_convert_logical_op_id, output_schema=second_output_schema, input_schema=email_schema, model=Model.GPT_4o_MINI, depends_on=list(depends_on)).get_full_op_id()\n    second_convert_llama_full_op_id = LLMConvertBonded(logical_op_id=second_convert_logical_op_id, output_schema=second_output_schema, input_schema=email_schema, model=Model.LLAMA3_3_70B, depends_on=list(depends_on)).get_full_op_id()\n\n    # get full op ids for third convert operators\n    depends_on.update(get_schema_field_names(second_convert_logical_op.output_schema, id=second_convert_node_id))\n    third_output_schema = union_schemas([second_output_schema, baz_schema])\n    third_convert_logical_op = ConvertScan(input_schema=second_output_schema, output_schema=third_output_schema, depends_on=list(depends_on))\n    third_convert_logical_op_id = 
third_convert_logical_op.get_logical_op_id()\n    third_convert_gpt4o_full_op_id = LLMConvertBonded(logical_op_id=third_convert_logical_op_id, output_schema=third_output_schema, input_schema=second_output_schema, model=Model.GPT_4o, depends_on=list(depends_on)).get_full_op_id()\n    third_convert_gpt4o_mini_full_op_id = LLMConvertBonded(logical_op_id=third_convert_logical_op_id, output_schema=third_output_schema, input_schema=second_output_schema, model=Model.GPT_4o_MINI, depends_on=list(depends_on)).get_full_op_id()\n    third_convert_llama_full_op_id = LLMConvertBonded(logical_op_id=third_convert_logical_op_id, output_schema=third_output_schema, input_schema=second_output_schema, model=Model.LLAMA3_3_70B, depends_on=list(depends_on)).get_full_op_id()\n\n    return {\n        \"scan_logical_op_id\": scan_logical_op_id,\n        \"scan_full_op_id\": scan_full_op_id,\n        \"first_convert_logical_op_id\": first_convert_logical_op_id,\n        \"first_convert_gpt4o_full_op_id\": first_convert_gpt4o_full_op_id,\n        \"first_convert_gpt4o_mini_full_op_id\": first_convert_gpt4o_mini_full_op_id,\n        \"first_convert_llama_full_op_id\": first_convert_llama_full_op_id,\n        \"second_convert_logical_op_id\": second_convert_logical_op_id,\n        \"second_convert_gpt4o_full_op_id\": second_convert_gpt4o_full_op_id,\n        \"second_convert_gpt4o_mini_full_op_id\": second_convert_gpt4o_mini_full_op_id,\n        \"second_convert_llama_full_op_id\": second_convert_llama_full_op_id,\n        \"third_convert_logical_op_id\": third_convert_logical_op_id,\n        \"third_convert_gpt4o_full_op_id\": third_convert_gpt4o_full_op_id,\n        \"third_convert_gpt4o_mini_full_op_id\": third_convert_gpt4o_mini_full_op_id,\n        \"third_convert_llama_full_op_id\": third_convert_llama_full_op_id,\n    }\n\n@pytest.fixture\ndef three_converts_min_cost_operator_to_stats(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    # get logical 
and full op ids\n    op_ids = get_three_converts_logical_and_full_op_ids(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema)\n\n    # construct operator_to_stats\n    operator_to_stats = {\n        op_ids['scan_logical_op_id']: {\n            op_ids['scan_full_op_id']: {\"cost\": 0.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0}\n        },\n        op_ids['first_convert_logical_op_id']: {\n            op_ids['first_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['first_convert_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['first_convert_llama_full_op_id']: {\"cost\": 0.1, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n        },\n        op_ids['second_convert_logical_op_id']: {\n            op_ids['second_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['second_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.1, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['second_convert_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n        },\n        op_ids['third_convert_logical_op_id']: {\n            op_ids['third_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['third_convert_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['third_convert_llama_full_op_id']: {\"cost\": 0.1, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n        },\n    }\n\n    return operator_to_stats\n\n@pytest.fixture\ndef three_converts_max_quality_operator_to_stats(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    # get logical and full op ids\n    op_ids = 
get_three_converts_logical_and_full_op_ids(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema)\n\n    # construct operator_to_stats\n    operator_to_stats = {\n        op_ids['scan_logical_op_id']: {\n            op_ids['scan_full_op_id']: {\"cost\": 0.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0}\n        },\n        op_ids['first_convert_logical_op_id']: {\n            op_ids['first_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['first_convert_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.8, \"selectivity\": 1.0},\n            op_ids['first_convert_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.8, \"selectivity\": 1.0},\n        },\n        op_ids['second_convert_logical_op_id']: {\n            op_ids['second_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.8, \"selectivity\": 1.0},\n            op_ids['second_convert_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['second_convert_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.8, \"selectivity\": 1.0},\n        },\n        op_ids['third_convert_logical_op_id']: {\n            op_ids['third_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['third_convert_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['third_convert_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n        },\n    }\n\n    return operator_to_stats\n\n@pytest.fixture\ndef three_converts_min_cost_at_fixed_quality_operator_to_stats(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    # get logical and full op ids\n    op_ids = 
get_three_converts_logical_and_full_op_ids(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema)\n\n    # construct operator_to_stats\n    operator_to_stats = {\n        op_ids['scan_logical_op_id']: {\n            op_ids['scan_full_op_id']: {\"cost\": 0.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0}\n        },\n        op_ids['first_convert_logical_op_id']: {\n            op_ids['first_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['first_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.3, \"time\": 1.0, \"quality\": 0.8, \"selectivity\": 1.0},\n            op_ids['first_convert_llama_full_op_id']: {\"cost\": 0.5, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n        },\n        op_ids['second_convert_logical_op_id']: {\n            op_ids['second_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['second_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.5, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['second_convert_llama_full_op_id']: {\"cost\": 0.3, \"time\": 1.0, \"quality\": 0.8, \"selectivity\": 1.0},\n        },\n        op_ids['third_convert_logical_op_id']: {\n            op_ids['third_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['third_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.5, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['third_convert_llama_full_op_id']: {\"cost\": 0.3, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n        },\n    }\n\n    return operator_to_stats\n\n@pytest.fixture\ndef three_converts_max_quality_at_fixed_cost_operator_to_stats(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    # get logical and full op ids\n    op_ids = 
get_three_converts_logical_and_full_op_ids(three_converts_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema)\n\n    # normalize costs by cardinality; needs to cost less than 1.0 per record\n    cardinality = len(enron_eval_tiny)\n\n    # construct operator_to_stats\n    operator_to_stats = {\n        op_ids['scan_logical_op_id']: {\n            op_ids['scan_full_op_id']: {\"cost\": 0.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0}\n        },\n        op_ids['first_convert_logical_op_id']: {\n            op_ids['first_convert_gpt4o_full_op_id']: {\"cost\": 2.0 / cardinality, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['first_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.3 / cardinality, \"time\": 1.0, \"quality\": 0.8, \"selectivity\": 1.0},\n            op_ids['first_convert_llama_full_op_id']: {\"cost\": 0.5 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n        },\n        op_ids['second_convert_logical_op_id']: {\n            op_ids['second_convert_gpt4o_full_op_id']: {\"cost\": 2.0 / cardinality, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['second_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.3 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['second_convert_llama_full_op_id']: {\"cost\": 0.2 / cardinality, \"time\": 1.0, \"quality\": 0.8, \"selectivity\": 1.0},\n        },\n        op_ids['third_convert_logical_op_id']: {\n            op_ids['third_convert_gpt4o_full_op_id']: {\"cost\": 0.3 / cardinality, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['third_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.5 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['third_convert_llama_full_op_id']: {\"cost\": 0.25 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n        },\n    }\n\n    return operator_to_stats\n\n### ONE 
FILTER ONE CONVERT OPERATOR-TO-STATS ###\ndef get_one_filter_one_convert_logical_and_full_op_ids(one_filter_one_convert_workload, enron_eval_tiny, email_schema):\n    dataset_nodes = []\n    node = deepcopy(one_filter_one_convert_workload)\n    while not node.is_root:\n        dataset_nodes.append(node)\n        node = node._sources[0]\n    dataset_nodes.append(node)\n    dataset_nodes = list(reversed(dataset_nodes))\n\n    # extract node id's from workload Datasets\n    scan_node_id = dataset_nodes[0].id\n    first_filter_node_id = dataset_nodes[1].id\n    # first_convert_node_id = dataset_nodes[2].id\n\n    # get full and logical op id for scan operator\n    scan_logical_op = BaseScan(datasource=enron_eval_tiny, output_schema=TextFile)\n    scan_logical_op_id = scan_logical_op.get_logical_op_id()\n    scan_full_op_id = MarshalAndScanDataOp(logical_op_id=scan_logical_op_id, output_schema=TextFile, datasource=enron_eval_tiny).get_full_op_id()\n\n    # get full op ids for first filter operator\n    depends_on = set(get_schema_field_names(scan_logical_op.output_schema, id=scan_node_id))\n    first_filter_logical_op = FilteredScan(input_schema=TextFile, output_schema=TextFile, filter=Filter(\"filter1\"), depends_on=list(depends_on))\n    first_filter_logical_op_id = first_filter_logical_op.get_logical_op_id()\n    first_filter_gpt4o_full_op_id = LLMFilter(logical_op_id=first_filter_logical_op_id, output_schema=TextFile, input_schema=TextFile, filter=Filter(\"filter1\"), model=Model.GPT_4o, depends_on=list(depends_on)).get_full_op_id()\n    first_filter_gpt4o_mini_full_op_id = LLMFilter(logical_op_id=first_filter_logical_op_id, output_schema=TextFile, input_schema=TextFile, filter=Filter(\"filter1\"), model=Model.GPT_4o_MINI, depends_on=list(depends_on)).get_full_op_id()\n    first_filter_llama_full_op_id = LLMFilter(logical_op_id=first_filter_logical_op_id, output_schema=TextFile, input_schema=TextFile, filter=Filter(\"filter1\"), model=Model.LLAMA3_3_70B, 
depends_on=list(depends_on)).get_full_op_id()\n\n    # get full op ids for first convert operator\n    depends_on = depends_on.union(set(get_schema_field_names(first_filter_logical_op.output_schema, id=first_filter_node_id)))\n    output_schema = union_schemas([TextFile, email_schema])\n    first_convert_logical_op = ConvertScan(input_schema=TextFile, output_schema=output_schema, depends_on=list(depends_on))\n    first_convert_logical_op_id = first_convert_logical_op.get_logical_op_id()\n    first_convert_gpt4o_full_op_id = LLMConvertBonded(logical_op_id=first_convert_logical_op_id, output_schema=output_schema, input_schema=TextFile, model=Model.GPT_4o, depends_on=list(depends_on)).get_full_op_id()\n    first_convert_gpt4o_mini_full_op_id = LLMConvertBonded(logical_op_id=first_convert_logical_op_id, output_schema=output_schema, input_schema=TextFile, model=Model.GPT_4o_MINI, depends_on=list(depends_on)).get_full_op_id()\n    first_convert_llama_full_op_id = LLMConvertBonded(logical_op_id=first_convert_logical_op_id, output_schema=output_schema, input_schema=TextFile, model=Model.LLAMA3_3_70B, depends_on=list(depends_on)).get_full_op_id()\n\n    return {\n        \"scan_logical_op_id\": scan_logical_op_id,\n        \"scan_full_op_id\": scan_full_op_id,\n        \"first_filter_logical_op_id\": first_filter_logical_op_id,\n        \"first_filter_gpt4o_full_op_id\": first_filter_gpt4o_full_op_id,\n        \"first_filter_gpt4o_mini_full_op_id\": first_filter_gpt4o_mini_full_op_id,\n        \"first_filter_llama_full_op_id\": first_filter_llama_full_op_id,\n        \"first_convert_logical_op_id\": first_convert_logical_op_id,\n        \"first_convert_gpt4o_full_op_id\": first_convert_gpt4o_full_op_id,\n        \"first_convert_gpt4o_mini_full_op_id\": first_convert_gpt4o_mini_full_op_id,\n        \"first_convert_llama_full_op_id\": first_convert_llama_full_op_id,\n    }\n\n@pytest.fixture\ndef 
one_filter_one_convert_min_cost_operator_to_stats(one_filter_one_convert_workload, enron_eval_tiny, email_schema):\n    # get logical and full op ids\n    op_ids = get_one_filter_one_convert_logical_and_full_op_ids(one_filter_one_convert_workload, enron_eval_tiny, email_schema)\n\n    # construct operator_to_stats\n    operator_to_stats = {\n        op_ids['scan_logical_op_id']: {\n            op_ids['scan_full_op_id']: {\"cost\": 0.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0}\n        },\n        op_ids['first_filter_logical_op_id']: {\n            op_ids['first_filter_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['first_filter_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['first_filter_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 0.5},\n        },\n        op_ids['first_convert_logical_op_id']: {\n            op_ids['first_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['first_convert_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['first_convert_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n        },\n    }\n\n    return operator_to_stats\n\n### TWO CONVERTS TWO FILTERS OPERATOR-TO-STATS ###\ndef get_two_converts_two_filters_logical_and_full_op_ids(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    # extract node id's from workload Datasets\n    scan_node_id = two_converts_two_filters_workload._sources[0]._sources[0]._sources[0]._sources[0].id\n    first_convert_node_id = two_converts_two_filters_workload._sources[0]._sources[0]._sources[0].id\n    # second_convert_node_id = two_converts_two_filters_workload._sources[0]._sources[0].id\n    # 
first_filter_node_id = two_converts_two_filters_workload._sources[0].id\n    # second_filter_node_id = two_converts_two_filters_workload.id\n\n    # get full and logical op id for scan operator\n    scan_logical_op = BaseScan(datasource=enron_eval_tiny, output_schema=TextFile)\n    scan_logical_op_id = scan_logical_op.get_logical_op_id()\n    scan_full_op_id = MarshalAndScanDataOp(logical_op_id=scan_logical_op_id, output_schema=TextFile, datasource=enron_eval_tiny).get_full_op_id()\n\n    # get full op ids for first convert operators\n    depends_on = set(get_schema_field_names(scan_logical_op.output_schema, id=scan_node_id))\n    first_convert_logical_op = ConvertScan(input_schema=TextFile, output_schema=email_schema, depends_on=list(depends_on))\n    first_convert_logical_op_id = first_convert_logical_op.get_logical_op_id()\n    first_convert_gpt4o_full_op_id = LLMConvertBonded(logical_op_id=first_convert_logical_op_id, output_schema=email_schema, input_schema=TextFile, model=Model.GPT_4o, depends_on=list(depends_on)).get_full_op_id()\n    first_convert_gpt4o_mini_full_op_id = LLMConvertBonded(logical_op_id=first_convert_logical_op_id, output_schema=email_schema, input_schema=TextFile, model=Model.GPT_4o_MINI, depends_on=list(depends_on)).get_full_op_id()\n    first_convert_llama_full_op_id = LLMConvertBonded(logical_op_id=first_convert_logical_op_id, output_schema=email_schema, input_schema=TextFile, model=Model.LLAMA3_3_70B, depends_on=list(depends_on)).get_full_op_id()\n\n    # get full op ids for second convert operators\n    depends_on.update(get_schema_field_names(first_convert_logical_op.output_schema, id=first_convert_node_id))\n    output_schema = union_schemas([email_schema, foobar_schema])\n    second_convert_logical_op = ConvertScan(input_schema=email_schema, output_schema=output_schema, depends_on=list(depends_on))\n    second_convert_logical_op_id = second_convert_logical_op.get_logical_op_id()\n    second_convert_gpt4o_full_op_id = 
LLMConvertBonded(logical_op_id=second_convert_logical_op_id, output_schema=output_schema, input_schema=email_schema, model=Model.GPT_4o, depends_on=list(depends_on)).get_full_op_id()\n    second_convert_gpt4o_mini_full_op_id = LLMConvertBonded(logical_op_id=second_convert_logical_op_id, output_schema=output_schema, input_schema=email_schema, model=Model.GPT_4o_MINI, depends_on=list(depends_on)).get_full_op_id()\n    second_convert_llama_full_op_id = LLMConvertBonded(logical_op_id=second_convert_logical_op_id, output_schema=output_schema, input_schema=email_schema, model=Model.LLAMA3_3_70B, depends_on=list(depends_on)).get_full_op_id()\n\n    # get full op ids for first filter operators\n    depends_on = [field for field in get_schema_field_names(first_convert_logical_op.output_schema, id=first_convert_node_id) if \"sender\" in field]\n    first_filter_logical_op = FilteredScan(input_schema=output_schema, output_schema=output_schema, filter=Filter(\"filter1\"), depends_on=list(depends_on))\n    first_filter_logical_op_id = first_filter_logical_op.get_logical_op_id()\n    first_filter_gpt4o_full_op_id = LLMFilter(logical_op_id=first_filter_logical_op_id, output_schema=output_schema, input_schema=output_schema, filter=Filter(\"filter1\"), model=Model.GPT_4o, depends_on=list(depends_on)).get_full_op_id()\n    first_filter_gpt4o_mini_full_op_id = LLMFilter(logical_op_id=first_filter_logical_op_id, output_schema=output_schema, input_schema=output_schema, filter=Filter(\"filter1\"), model=Model.GPT_4o_MINI, depends_on=list(depends_on)).get_full_op_id()\n    first_filter_llama_full_op_id = LLMFilter(logical_op_id=first_filter_logical_op_id, output_schema=output_schema, input_schema=output_schema, filter=Filter(\"filter1\"), model=Model.LLAMA3_3_70B, depends_on=list(depends_on)).get_full_op_id()\n\n    # get full op ids for second filter operators\n    depends_on = [field for field in get_schema_field_names(first_convert_logical_op.output_schema, id=first_convert_node_id) 
if \"subject\" in field]\n    second_filter_logical_op = FilteredScan(input_schema=output_schema, output_schema=output_schema, filter=Filter(\"filter2\"), depends_on=list(depends_on))\n    second_filter_logical_op_id = second_filter_logical_op.get_logical_op_id()\n    second_filter_gpt4o_full_op_id = LLMFilter(logical_op_id=second_filter_logical_op_id, output_schema=output_schema, input_schema=output_schema, filter=Filter(\"filter2\"), model=Model.GPT_4o, depends_on=list(depends_on)).get_full_op_id()\n    second_filter_gpt4o_mini_full_op_id = LLMFilter(logical_op_id=second_filter_logical_op_id, output_schema=output_schema, input_schema=output_schema, filter=Filter(\"filter2\"), model=Model.GPT_4o_MINI, depends_on=list(depends_on)).get_full_op_id()\n    second_filter_llama_full_op_id = LLMFilter(logical_op_id=second_filter_logical_op_id, output_schema=output_schema, input_schema=output_schema, filter=Filter(\"filter2\"), model=Model.LLAMA3_3_70B, depends_on=list(depends_on)).get_full_op_id()\n\n    return {\n        \"scan_logical_op_id\": scan_logical_op_id,\n        \"scan_full_op_id\": scan_full_op_id,\n        \"first_convert_logical_op_id\": first_convert_logical_op_id,\n        \"first_convert_gpt4o_full_op_id\": first_convert_gpt4o_full_op_id,\n        \"first_convert_gpt4o_mini_full_op_id\": first_convert_gpt4o_mini_full_op_id,\n        \"first_convert_llama_full_op_id\": first_convert_llama_full_op_id,\n        \"second_convert_logical_op_id\": second_convert_logical_op_id,\n        \"second_convert_gpt4o_full_op_id\": second_convert_gpt4o_full_op_id,\n        \"second_convert_gpt4o_mini_full_op_id\": second_convert_gpt4o_mini_full_op_id,\n        \"second_convert_llama_full_op_id\": second_convert_llama_full_op_id,\n        \"first_filter_logical_op_id\": first_filter_logical_op_id,\n        \"first_filter_gpt4o_full_op_id\": first_filter_gpt4o_full_op_id,\n        \"first_filter_gpt4o_mini_full_op_id\": first_filter_gpt4o_mini_full_op_id,\n        
\"first_filter_llama_full_op_id\": first_filter_llama_full_op_id,\n        \"second_filter_logical_op_id\": second_filter_logical_op_id,\n        \"second_filter_gpt4o_full_op_id\": second_filter_gpt4o_full_op_id,\n        \"second_filter_gpt4o_mini_full_op_id\": second_filter_gpt4o_mini_full_op_id,\n        \"second_filter_llama_full_op_id\": second_filter_llama_full_op_id,\n    }\n\n@pytest.fixture\ndef two_converts_two_filters_min_cost_operator_to_stats(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    # get logical and full op ids\n    op_ids = get_two_converts_two_filters_logical_and_full_op_ids(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema)\n\n    # construct operator_to_stats\n    operator_to_stats = {\n        op_ids['scan_logical_op_id']: {\n            op_ids['scan_full_op_id']: {\"cost\": 0.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0}\n        },\n        op_ids['first_convert_logical_op_id']: {\n            op_ids['first_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['first_convert_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['first_convert_llama_full_op_id']: {\"cost\": 0.1, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n        },\n        op_ids['second_convert_logical_op_id']: {\n            op_ids['second_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['second_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.1, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['second_convert_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n        },\n        op_ids['first_filter_logical_op_id']: {\n            op_ids['first_filter_gpt4o_full_op_id']: {\"cost\": 1.0, 
\"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['first_filter_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 0.5},\n            op_ids['first_filter_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n        },\n        op_ids['second_filter_logical_op_id']: {\n            op_ids['second_filter_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['second_filter_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['second_filter_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1 / 3},\n        },\n    }\n\n    return operator_to_stats\n\n@pytest.fixture\ndef two_converts_two_filters_max_quality_operator_to_stats(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    # get logical and full op ids\n    op_ids = get_two_converts_two_filters_logical_and_full_op_ids(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema)\n\n    # construct operator_to_stats\n    operator_to_stats = {\n        op_ids['scan_logical_op_id']: {\n            op_ids['scan_full_op_id']: {\"cost\": 0.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0}\n        },\n        op_ids['first_convert_logical_op_id']: {\n            op_ids['first_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['first_convert_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.8, \"selectivity\": 1.0},\n            op_ids['first_convert_llama_full_op_id']: {\"cost\": 0.5, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n        },\n        op_ids['second_convert_logical_op_id']: {\n            op_ids['second_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, 
\"quality\": 0.8, \"selectivity\": 1.0},\n            op_ids['second_convert_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['second_convert_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.8, \"selectivity\": 1.0},\n        },\n        op_ids['first_filter_logical_op_id']: {\n            op_ids['first_filter_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 0.75},\n            op_ids['first_filter_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 0.5},\n            op_ids['first_filter_llama_full_op_id']: {\"cost\": 0.75, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 0.75},\n        },\n        op_ids['second_filter_logical_op_id']: {\n            op_ids['second_filter_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 0.5},\n            op_ids['second_filter_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['second_filter_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n        },\n    }\n\n    return operator_to_stats\n\n@pytest.fixture\ndef two_converts_two_filters_min_cost_at_fixed_quality_operator_to_stats(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    # get logical and full op ids\n    op_ids = get_two_converts_two_filters_logical_and_full_op_ids(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema)\n\n    # construct operator_to_stats\n    operator_to_stats = {\n        op_ids['scan_logical_op_id']: {\n            op_ids['scan_full_op_id']: {\"cost\": 0.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0}\n        },\n        op_ids['first_convert_logical_op_id']: {\n            op_ids['first_convert_gpt4o_full_op_id']: {\"cost\": 10.0, \"time\": 1.0, 
\"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['first_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.5, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0}, # pick 1st\n            op_ids['first_convert_llama_full_op_id']: {\"cost\": 10.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n        },\n        op_ids['second_convert_logical_op_id']: {\n            op_ids['second_convert_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['second_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.5, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['second_convert_llama_full_op_id']: {\"cost\": 0.3, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0}, # pick 4th\n        },\n        op_ids['first_filter_logical_op_id']: {\n            op_ids['first_filter_gpt4o_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": (1 / 3)},\n            op_ids['first_filter_gpt4o_mini_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 0.5}, # pick 2nd\n            op_ids['first_filter_llama_full_op_id']: {\"cost\": 1.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n        },\n        op_ids['second_filter_logical_op_id']: {\n            op_ids['second_filter_gpt4o_full_op_id']: {\"cost\": 10.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['second_filter_gpt4o_mini_full_op_id']: {\"cost\": 10.0, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 0.5},\n            op_ids['second_filter_llama_full_op_id']: {\"cost\": 0.5, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 0.75}, # pick 3rd\n        },\n    }\n\n    return operator_to_stats\n\n@pytest.fixture\ndef two_converts_two_filters_max_quality_at_fixed_cost_operator_to_stats(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    # get logical and full op ids\n    op_ids = 
get_two_converts_two_filters_logical_and_full_op_ids(two_converts_two_filters_workload, enron_eval_tiny, email_schema, foobar_schema, baz_schema)\n\n    # normalize costs by cardinality; needs to cost less than 1.0 per record\n    cardinality = len(enron_eval_tiny)\n\n    # construct operator_to_stats\n    operator_to_stats = {\n        op_ids['scan_logical_op_id']: {\n            op_ids['scan_full_op_id']: {\"cost\": 0.0, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0}\n        },\n        op_ids['first_convert_logical_op_id']: {\n            op_ids['first_convert_gpt4o_full_op_id']: {\"cost\": 2.0 / cardinality, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['first_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.3 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0}, # pick 1st\n            op_ids['first_convert_llama_full_op_id']: {\"cost\": 0.5 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n        },\n        op_ids['second_convert_logical_op_id']: {\n            op_ids['second_convert_gpt4o_full_op_id']: {\"cost\": 2.0 / cardinality, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0},\n            op_ids['second_convert_gpt4o_mini_full_op_id']: {\"cost\": 0.3 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['second_convert_llama_full_op_id']: {\"cost\": 0.2 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0}, # pick 4th\n        },\n        op_ids['first_filter_logical_op_id']: {\n            op_ids['first_filter_gpt4o_full_op_id']: {\"cost\": 0.3 / cardinality, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 1.0}, # pick 3rd\n            op_ids['first_filter_gpt4o_mini_full_op_id']: {\"cost\": 0.5 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['first_filter_llama_full_op_id']: {\"cost\": 0.1 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 0.5}, \n        },\n 
       op_ids['second_filter_logical_op_id']: {\n            op_ids['second_filter_gpt4o_full_op_id']: {\"cost\": 0.3 / cardinality, \"time\": 1.0, \"quality\": 1.0, \"selectivity\": 0.5}, # pick 2nd\n            op_ids['second_filter_gpt4o_mini_full_op_id']: {\"cost\": 0.5 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 1.0},\n            op_ids['second_filter_llama_full_op_id']: {\"cost\": 0.2 / cardinality, \"time\": 1.0, \"quality\": 0.9, \"selectivity\": 0.5},\n        },\n    }\n\n    return operator_to_stats\n"
  },
  {
    "path": "tests/pytest/fixtures/physical_plans.py",
    "content": "import pytest\n\nfrom palimpzest.constants import Cardinality, Model\nfrom palimpzest.core.data.iter_dataset import MemoryDataset\nfrom palimpzest.core.elements.filters import Filter\nfrom palimpzest.core.lib.schemas import File, TextFile\nfrom palimpzest.query.operators.convert import LLMConvertBonded\nfrom palimpzest.query.operators.filter import LLMFilter, NonLLMFilter\nfrom palimpzest.query.operators.rag import RAGConvert\nfrom palimpzest.query.operators.scan import MarshalAndScanDataOp\nfrom palimpzest.query.optimizer.plan import PhysicalPlan, SentinelPlan\n\n\n### PHYSICAL PLANS ###\n@pytest.fixture\ndef scan_only_plan(enron_eval_tiny):\n    scan_op = MarshalAndScanDataOp(output_schema=File, datasource=enron_eval_tiny, logical_op_id=\"scan1\")\n    plan = PhysicalPlan._from_ops(ops=[scan_op])\n    return plan\n\n\n@pytest.fixture\ndef non_llm_filter_plan(enron_eval_tiny):\n    scan_op = MarshalAndScanDataOp(output_schema=File, datasource=enron_eval_tiny, logical_op_id=\"scan1\")\n\n    def filter_emails(record: dict):\n        return record[\"filename\"] in [\"buy-r-inbox-628.txt\", \"buy-r-inbox-749.txt\", \"zipper-a-espeed-28.txt\"]\n\n    filter = Filter(filter_fn=filter_emails)\n    filter_op = NonLLMFilter(input_schema=File, output_schema=File, filter=filter, logical_op_id=\"filter1\")\n    plan = PhysicalPlan._from_ops(ops=[scan_op, filter_op])\n    return plan\n\n\n@pytest.fixture\ndef llm_filter_plan(enron_eval_tiny):\n    scan_op = MarshalAndScanDataOp(output_schema=File, datasource=enron_eval_tiny, logical_op_id=\"scan1\")\n    filter = Filter(\"This filter will be mocked out\")\n    filter_op = LLMFilter(\n        input_schema=File,\n        output_schema=File,\n        filter=filter,\n        model=Model.GPT_4o_MINI,\n        logical_op_id=\"filter1\",\n    )\n    plan = PhysicalPlan._from_ops(ops=[scan_op, filter_op])\n    return plan\n\n\n@pytest.fixture\ndef bonded_llm_convert_plan(email_schema, enron_eval_tiny):\n    scan_op = 
MarshalAndScanDataOp(output_schema=TextFile, datasource=enron_eval_tiny, logical_op_id=\"scan1\")\n    convert_op_llm = LLMConvertBonded(\n        input_schema=TextFile,\n        output_schema=email_schema,\n        model=Model.GPT_4o_MINI,\n        logical_op_id=\"convert1\",\n    )\n    plan = PhysicalPlan._from_ops(ops=[scan_op, convert_op_llm])\n    return plan\n\n\n@pytest.fixture\ndef rag_convert_plan(email_schema, enron_eval_tiny, embedding_text_only_model):\n    scan_op = MarshalAndScanDataOp(output_schema=TextFile, datasource=enron_eval_tiny, logical_op_id=\"scan1\")\n    convert_op_llm = RAGConvert(\n        input_schema=TextFile,\n        output_schema=email_schema,\n        model=Model.GPT_4o_MINI,\n        embedding_model=embedding_text_only_model,\n        num_chunks_per_field=1,\n        chunk_size=1000,\n        logical_op_id=\"rag_convert1\",\n    )\n    plan = PhysicalPlan._from_ops(ops=[scan_op, convert_op_llm])\n    return plan\n\n\n@pytest.fixture\ndef image_convert_plan(real_estate_listing_files_schema, image_real_estate_listing_schema, real_estate_eval_tiny):\n    scan_op = MarshalAndScanDataOp(output_schema=real_estate_listing_files_schema, datasource=real_estate_eval_tiny, logical_op_id=\"scan1\")\n    convert_op_llm = LLMConvertBonded(\n        input_schema=real_estate_listing_files_schema,\n        output_schema=image_real_estate_listing_schema,\n        model=Model.GPT_4o_MINI,\n        logical_op_id=\"convert1\",\n    )\n    plan = PhysicalPlan._from_ops(ops=[scan_op, convert_op_llm])\n    return plan\n\n\n@pytest.fixture\ndef one_to_many_convert_plan(real_estate_listing_files_schema, room_real_estate_listing_schema, real_estate_eval_tiny):\n    scan_op = MarshalAndScanDataOp(output_schema=real_estate_listing_files_schema, datasource=real_estate_eval_tiny, logical_op_id=\"scan1\")\n    convert_op_llm = LLMConvertBonded(\n        input_schema=real_estate_listing_files_schema,\n        output_schema=room_real_estate_listing_schema,\n      
  model=Model.GPT_4o_MINI,\n        cardinality=Cardinality.ONE_TO_MANY,\n        logical_op_id=\"convert1\",\n    )\n    plan = PhysicalPlan._from_ops(ops=[scan_op, convert_op_llm])\n    return plan\n\n\n@pytest.fixture\ndef scan_convert_filter_sentinel_plan(foobar_schema):\n    datasource = MemoryDataset(id=\"test\", vals=[1, 2, 3, 4, 5, 6])\n    scan_op = MarshalAndScanDataOp(output_schema=TextFile, datasource=datasource, logical_op_id=\"scan1\")\n    convert_ops = [\n        LLMConvertBonded(\n            input_schema=TextFile,\n            output_schema=foobar_schema,\n            model=model,\n            logical_op_id=\"convert1\",\n        )\n        for model in [Model.GPT_4o_MINI, Model.GPT_4o, Model.LLAMA3_1_8B]\n    ]\n    filter_ops = [\n        LLMFilter(\n            input_schema=foobar_schema,\n            output_schema=foobar_schema,\n            filter=Filter(\"hello\"),\n            model=model,\n            logical_op_id=\"filter1\",\n        )\n        for model in [Model.GPT_4o_MINI, Model.GPT_4o, Model.LLAMA3_1_8B]\n    ]\n    plan = SentinelPlan(operator_sets=[[scan_op], convert_ops, filter_ops])\n    return plan\n\n\n@pytest.fixture\ndef scan_multi_convert_multi_filter_sentinel_plan(foobar_schema, baz_schema):\n    datasource = MemoryDataset(id=\"test\", vals=[1, 2, 3, 4, 5, 6])\n    scan_op = MarshalAndScanDataOp(output_schema=TextFile, datasource=datasource, logical_op_id=\"scan1\")\n    convert_ops1 = [\n        LLMConvertBonded(\n            input_schema=TextFile,\n            output_schema=foobar_schema,\n            model=model,\n            logical_op_id=\"convert1\",\n        )\n        for model in [Model.GPT_4o_MINI, Model.GPT_4o, Model.LLAMA3_1_8B]\n    ]\n    filter_ops1 = [\n        LLMFilter(\n            input_schema=foobar_schema,\n            output_schema=foobar_schema,\n            filter=Filter(\"hello\"),\n            model=model,\n            logical_op_id=\"filter1\",\n        )\n        for model in 
[Model.GPT_4o_MINI, Model.GPT_4o, Model.LLAMA3_1_8B]\n    ]\n    filter_ops2 = [\n        LLMFilter(\n            input_schema=foobar_schema,\n            output_schema=foobar_schema,\n            filter=Filter(\"world\"),\n            model=model,\n            logical_op_id=\"filter2\",\n        )\n        for model in [Model.GPT_4o_MINI, Model.GPT_4o, Model.LLAMA3_1_8B]\n    ]\n    convert_ops2 = [\n        LLMConvertBonded(\n            input_schema=foobar_schema,\n            output_schema=baz_schema,\n            model=model,\n            logical_op_id=\"convert2\",\n        )\n        for model in [Model.GPT_4o_MINI, Model.GPT_4o, Model.LLAMA3_1_8B]\n    ]\n    plan = SentinelPlan(operator_sets=[[scan_op], convert_ops1, filter_ops1, filter_ops2, convert_ops2])\n    return plan\n"
  },
  {
    "path": "tests/pytest/fixtures/schemas.py",
    "content": "from typing import Any\n\nimport pytest\nfrom pydantic import BaseModel, Field\n\nfrom palimpzest.core.lib.schemas import ImageFilepath, TextFile\n\n\n### SCHEMAS ###\n@pytest.fixture\ndef email_schema():\n    class Email(TextFile):\n        \"\"\"Represents an email, which in practice is usually from a text file\"\"\"\n        sender: str = Field(description=\"The email address of the sender\")\n        subject: str = Field(description=\"The subject of the email\")\n\n    return Email\n\n\n@pytest.fixture\ndef real_estate_listing_files_schema():\n    class RealEstateListingFiles(BaseModel):\n        \"\"\"The source text and image data for a real estate listing.\"\"\"\n        listing: str = Field(description=\"The name of the listing\")\n        text_content: str = Field(description=\"The content of the listing's text description\")\n        image_filepaths: list[ImageFilepath] = Field(description=\"A list of the filepaths for each image of the listing\")\n\n    return RealEstateListingFiles\n\n\n@pytest.fixture\ndef text_real_estate_listing_schema(real_estate_listing_files_schema):\n    class TextRealEstateListing(real_estate_listing_files_schema):\n        \"\"\"Represents a real estate listing with specific fields extracted from its text.\"\"\"\n        address: str = Field(description=\"The address of the property\")\n        price: int | float = Field(description=\"The listed price of the property\")\n\n    return TextRealEstateListing\n\n\n@pytest.fixture\ndef image_real_estate_listing_schema(real_estate_listing_files_schema):\n    class ImageRealEstateListing(real_estate_listing_files_schema):\n        \"\"\"Represents a real estate listing with specific fields extracted from its text and images.\"\"\"\n\n        is_modern_and_attractive: bool = Field(\n            description=\"True if the home interior design is modern and attractive and False otherwise\"\n        )\n        has_natural_sunlight: bool = Field(\n            
description=\"True if the home interior has lots of natural sunlight and False otherwise\"\n        )\n\n    return ImageRealEstateListing\n\n\n@pytest.fixture\ndef room_real_estate_listing_schema(real_estate_listing_files_schema):\n    class RoomRealEstateListing(real_estate_listing_files_schema):\n        \"\"\"Represents a room shown in the image of a real estate listing.\"\"\"\n\n        room: str = Field(\n            description='The room shown in an image. Room can be one of [\"living_room\", \"kitchen\", \"bedroom\", \"other\"]',\n        )\n\n    return RoomRealEstateListing\n\n\n@pytest.fixture\ndef case_data_schema():\n    class CaseData(BaseModel):\n        \"\"\"An individual row extracted from a table containing medical study data.\"\"\"\n\n        case_submitter_id: Any = Field(description=\"The ID of the case\")\n        age_at_diagnosis: Any = Field(description=\"The age of the patient at the time of diagnosis\")\n        race: Any = Field(\n            description=\"An arbitrary classification of a taxonomic group that is a division of a species.\",\n        )\n        ethnicity: Any = Field(\n            description=\"Whether an individual describes themselves as Hispanic or Latino or not.\",\n        )\n        gender: Any = Field(description=\"Text designations that identify gender.\")\n        vital_status: Any = Field(description=\"The vital status of the patient\")\n        ajcc_pathologic_t: Any = Field(description=\"The AJCC pathologic T\")\n        ajcc_pathologic_n: Any = Field(description=\"The AJCC pathologic N\")\n        ajcc_pathologic_stage: Any = Field(description=\"The AJCC pathologic stage\")\n        tumor_grade: Any = Field(description=\"The tumor grade\")\n        tumor_focality: Any = Field(description=\"The tumor focality\")\n        tumor_largest_dimension_diameter: Any = Field(description=\"The tumor largest dimension diameter\")\n        primary_diagnosis: Any = Field(description=\"The primary diagnosis\")\n        
morphology: Any = Field(description=\"The morphology\")\n        tissue_or_organ_of_origin: Any = Field(description=\"The tissue or organ of origin\")\n        filename: Any = Field(description=\"The name of the file the record was extracted from\")\n        study: Any = Field(\n            description=\"The last name of the author of the study, from the table name\",\n        )\n\n    return CaseData\n\n@pytest.fixture\ndef foobar_schema():\n    class FooBar(BaseModel):\n        foo: Any = Field(\"foo\")\n        bar: Any = Field(\"bar\")\n\n    return FooBar\n\n@pytest.fixture\ndef baz_schema():\n    class Baz(BaseModel):\n        baz: Any = Field(\"baz\")\n\n    return Baz\n"
  },
  {
    "path": "tests/pytest/fixtures/side_effects.py",
    "content": "import pytest\n\nfrom palimpzest.core.models import GenerationStats\n\n\n### Side-Effects for Mocking LLM Calls ###\n@pytest.fixture\ndef enron_filter():\n    def mock_filter(candidate):\n        # determine the answer based on the record filename\n        field_answers = {\"passed_operator\": candidate.filename in [\"buy-r-inbox-628.txt\", \"buy-r-inbox-749.txt\", \"zipper-a-espeed-28.txt\"]}\n        generation_stats = GenerationStats(cost_per_record=1.0)\n\n        return field_answers, generation_stats\n\n    return mock_filter\n\n\n@pytest.fixture\ndef enron_convert(email_schema):\n    def mock_convert(candidate, fields):\n        filename_to_sender = {\n            \"buy-r-inbox-628.txt\": \"sherron.watkins@enron.com\",\n            \"buy-r-inbox-749.txt\": \"david.port@enron.com\",\n            \"kaminski-v-deleted-items-1902.txt\": \"vkaminski@aol.com\",\n            \"martin-t-inbox-96-short.txt\": \"sarah.palmer@enron.com\",\n            \"skilling-j-inbox-1109.txt\": \"gary@cioclub.com\",\n            \"zipper-a-espeed-28.txt\": \"travis.mccullough@enron.com\",\n        }\n        filename_to_subject = {\n            \"buy-r-inbox-628.txt\": \"RE: portrac\",\n            \"buy-r-inbox-749.txt\": \"RE: NewPower\",\n            \"kaminski-v-deleted-items-1902.txt\": \"Fwd: FYI\",\n            \"martin-t-inbox-96-short.txt\": \"Enron Mentions -- 01/18/02\",\n            \"skilling-j-inbox-1109.txt\": \"Information Security Executive -092501\",\n            \"zipper-a-espeed-28.txt\": \"Redraft of the Exclusivity Agreement\",\n        }\n\n        # determine the answer based on the record filename\n        field_answers = {\n            \"sender\": [filename_to_sender[candidate.filename]],\n            \"subject\": [filename_to_subject[candidate.filename]],\n        }\n        generation_stats = GenerationStats(cost_per_record=1.0)\n\n        return field_answers, generation_stats\n\n    return mock_convert\n\n\n@pytest.fixture\ndef 
real_estate_convert(image_real_estate_listing_schema):\n    def mock_convert(candidate, fields):\n        listing_to_modern_and_attractive = {\"listing1\": True, \"listing2\": False, \"listing3\": False}\n        listing_to_has_natural_sunlight = {\"listing1\": True, \"listing2\": True, \"listing3\": False}\n\n        # determine the answer based on the record listing\n        field_answers = {\n            \"is_modern_and_attractive\": [listing_to_modern_and_attractive[candidate.listing]],\n            \"has_natural_sunlight\": [listing_to_has_natural_sunlight[candidate.listing]],\n        }\n        generation_stats = GenerationStats(cost_per_record=1.0)\n\n        return field_answers, generation_stats\n\n    return mock_convert\n\n\n@pytest.fixture\ndef real_estate_one_to_many_convert(room_real_estate_listing_schema):\n    def mock_convert(candidate, fields):\n        listing_to_rooms = {\n            \"listing1\": [\"other\", \"living_room\", \"kitchen\"],\n            \"listing2\": [\"other\", \"living_room\", \"living_room\"],\n            \"listing3\": [\"other\", \"living_room\", \"other\"],\n        }\n\n        # determine the answers based on the record listing\n        field_answers = {\"room\": listing_to_rooms[candidate.listing]}\n        generation_stats = GenerationStats(cost_per_record=1.0)\n\n        return field_answers, generation_stats\n\n    return mock_convert\n"
  },
  {
    "path": "tests/pytest/fixtures/workloads.py",
    "content": "import pytest\n\n\n### UDFs ###\ndef within_two_miles_of_mit(record):\n    # NOTE: I'm using this hard-coded function so that folks w/out a\n    #       Geocoding API key from google can still run this example\n    far_away_addrs = [\n        \"Melcher St\",\n        \"Sleeper St\",\n        \"437 D St\",\n        \"Seaport Blvd\",\n        \"50 Liberty Dr\",\n        \"Telegraph St\",\n        \"Columbia Rd\",\n        \"E 6th St\",\n        \"E 7th St\",\n        \"E 5th St\",\n    ]\n    try:\n        return not any([street.lower() in record.address.lower() for street in far_away_addrs])\n    except Exception:\n        return False\n\n\ndef in_price_range(record):\n    try:\n        price = record.price\n        if isinstance(price, str):\n            price = price.strip()\n            price = int(price.replace(\"$\", \"\").replace(\",\", \"\"))\n        return 6e5 < price <= 2e6\n    except Exception:\n        return False\n\n\n### WORKLOADS ###\n@pytest.fixture\ndef enron_workload(enron_eval_tiny, email_schema):\n    emails = enron_eval_tiny\n    emails = emails.sem_add_columns(email_schema)\n    emails = emails.sem_filter(\n        'The email refers to a fraudulent scheme (i.e., \"Raptor\", \"Deathstar\", \"Chewco\", and/or \"Fat Boy\"'\n    )\n    emails = emails.sem_filter(\n        \"The email is not quoting from a news article or an article written by someone outside of Enron\"\n    )\n    return emails\n\n\n@pytest.fixture\ndef small_real_estate_workload(\n    real_estate_eval_tiny,\n    text_real_estate_listing_schema,\n    image_real_estate_listing_schema,\n):\n    listings = real_estate_eval_tiny\n    listings = listings.sem_add_columns(text_real_estate_listing_schema, depends_on=\"text_content\")\n    listings = listings.sem_add_columns(image_real_estate_listing_schema, depends_on=\"image_filepaths\")\n    listings = listings.sem_filter(\n        \"The interior is modern and attractive, and has lots of natural sunlight\",\n        
depends_on=[\"is_modern_and_attractive\", \"has_natural_sunlight\"],\n    )\n    return listings\n\n\n@pytest.fixture\ndef real_estate_workload(\n    real_estate_eval_tiny,\n    text_real_estate_listing_schema,\n    image_real_estate_listing_schema,\n):\n    listings = real_estate_eval_tiny\n    listings = listings.sem_add_columns(text_real_estate_listing_schema, depends_on=\"text_content\")\n    listings = listings.sem_add_columns(image_real_estate_listing_schema, depends_on=\"image_filepaths\")\n    listings = listings.sem_filter(\n        \"The interior is modern and attractive, and has lots of natural sunlight\",\n        depends_on=[\"is_modern_and_attractive\", \"has_natural_sunlight\"],\n    )\n    listings = listings.filter(\n        within_two_miles_of_mit,\n        depends_on=\"address\",\n    )\n    listings = listings.filter(\n        in_price_range,\n        depends_on=\"price\",\n    )\n    return listings\n\n\n@pytest.fixture\ndef three_converts_workload(enron_eval_tiny, email_schema, foobar_schema, baz_schema):\n    # construct plan with three converts\n    dataset = enron_eval_tiny\n    dataset = dataset.sem_add_columns(email_schema)\n    dataset = dataset.sem_add_columns(foobar_schema)\n    dataset = dataset.sem_add_columns(baz_schema)\n\n    return dataset\n\n@pytest.fixture\ndef one_filter_one_convert_workload(enron_eval_tiny, email_schema):\n    # construct plan with one filter followed by one convert\n    dataset = enron_eval_tiny\n    dataset = dataset.sem_filter(\"filter1\")\n    dataset = dataset.sem_add_columns(email_schema)\n\n    return dataset\n\n@pytest.fixture\ndef two_converts_two_filters_workload(enron_eval_tiny, email_schema, foobar_schema):\n    # construct plan with two converts and two filters\n    dataset = enron_eval_tiny\n    dataset = dataset.sem_add_columns(email_schema)\n    dataset = dataset.sem_add_columns(foobar_schema)\n    dataset = dataset.sem_filter(\"filter1\", depends_on=[\"sender\"])\n    dataset = 
dataset.sem_filter(\"filter2\", depends_on=[\"subject\"])\n\n    return dataset\n"
  },
  {
    "path": "tests/pytest/test_aggregate.py",
    "content": "\"\"\"This script contains tests for physical operators for semantic aggregation.\"\"\"\n\nimport os\n\nimport pytest\nfrom pydantic import BaseModel, Field\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.lib.schemas import AudioFilepath, ImageFilepath, union_schemas\nfrom palimpzest.core.models import GenerationStats\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.operators.aggregate import SemanticAggregate\n\nif not os.environ.get(\"OPENAI_API_KEY\"):\n    from palimpzest.utils.env_helpers import load_env\n\n    load_env()\n\n\nclass TextInputSchema(BaseModel):\n    text: str = Field(description=\"Description of an animal\")\n    age: int = Field(description=\"The age of the animal in years\")\n\nclass ImageInputSchema(BaseModel):\n    image_file: ImageFilepath = Field(description=\"File path to an image of an animal\")\n    height: float = Field(description=\"The estimated height of the animal in cm\")\n\nclass AudioInputSchema(BaseModel):\n    audio_file: AudioFilepath = Field(description=\"File path to an audio recording of an animal\")\n    year: float = Field(description=\"The year the recording was made\")\n\nTextImageInputSchema = union_schemas([TextInputSchema, ImageInputSchema])\nTextAudioInputSchema = union_schemas([TextInputSchema, AudioInputSchema])\nImageAudioInputSchema = union_schemas([ImageInputSchema, AudioInputSchema])\nTextImageAudioInputSchema = union_schemas([TextInputSchema, ImageInputSchema, AudioInputSchema])\n\nclass OutputSchema(BaseModel):\n    num_elephants: int = Field(description=\"The number of (possibly duplicate) elephants in the input\")\n\ndef create_input_record(input_schema: type[BaseModel], idx: int) -> DataRecord:\n    idx_to_elephant_name = {0: \"Dumbo\", 1: \"Ella\", 2: \"Babar\"}\n    idx_to_elephant_height = {0: 250.0, 1: 300.5, 2: 350.2}\n    idx_to_elephant_year = {0: 2018, 1: 2019, 2: 2020}\n  
  data_item = {}\n    if all(field in input_schema.model_fields for field in TextInputSchema.model_fields):\n        data_item['text'] = f\"This record contains the age of an elephant named {idx_to_elephant_name[idx]}.\"\n        data_item['age'] = idx + 1\n    if all(field in input_schema.model_fields for field in ImageInputSchema.model_fields):\n        data_item['image_file'] = \"tests/pytest/data/elephant.png\"\n        data_item['height'] = idx_to_elephant_height[idx]\n    if all(field in input_schema.model_fields for field in AudioInputSchema.model_fields):\n        data_item['audio_file'] = \"tests/pytest/data/elephant.wav\"\n        data_item['year'] = idx_to_elephant_year[idx]\n\n    return DataRecord(input_schema(**data_item), source_indices=[idx])\n\n\ndef mock_generator_call(candidate, fields, right_candidate=None, json_output=True, **kwargs):\n    field_answers = {\"num_elephants\": [3]}\n    reasoning = \"The input shows three elephants.\"\n    generation_stats = GenerationStats(cost_per_record=1.0, time_per_record=1.0, num_input_tokens=10, num_output_tokens=10)\n    messages = []\n    return field_answers, reasoning, generation_stats, messages\n\n\n@pytest.mark.parametrize(\n    \"input_schema\",\n    [TextInputSchema, ImageInputSchema, AudioInputSchema, TextImageInputSchema, TextAudioInputSchema, ImageAudioInputSchema, TextImageAudioInputSchema],\n    ids=[\"text-only\", \"image-only\", \"audio-only\", \"text-image\", \"text-audio\", \"image-audio\", \"text-image-audio\"],\n)\n@pytest.mark.parametrize(\n    \"physical_op_class\",\n    [SemanticAggregate],\n    ids=[\"semantic-aggregate\"],\n)\ndef test_aggregate(mocker, input_schema, physical_op_class):\n    \"\"\"Test aggregate operators on simple input\"\"\"\n    if os.getenv(\"NO_GEMINI\") and input_schema in [AudioInputSchema, TextAudioInputSchema, ImageAudioInputSchema, TextImageAudioInputSchema]:\n        pytest.skip(\"Skipping multi-modal audio tests on CI which does not have access to gemini 
models\")\n\n    model = Model.GPT_5_MINI if os.getenv(\"NO_GEMINI\") else Model.GEMINI_2_5_FLASH\n\n    # construct the kwargs for the physical operator\n    physical_op_kwargs = {\n        \"input_schema\": input_schema,\n        \"output_schema\": OutputSchema,\n        \"agg_str\": \"The number of (possibly duplicate) elephants in the input\",\n        \"logical_op_id\": \"test-aggregate\",\n        \"model\": model,\n    }\n\n    # create aggregate operator\n    agg_op = physical_op_class(**physical_op_kwargs)\n\n    # create input records\n    input_records = [create_input_record(input_schema, idx) for idx in range(3)]\n\n    # mock out LLM calls unless RUN_LLM_TESTS is set\n    if not os.getenv(\"RUN_LLM_TESTS\"):\n        mocker.patch.object(Generator, \"__call__\", side_effect=mock_generator_call)\n\n    # apply aggregate operator to the input\n    data_record_set = agg_op(input_records)\n\n    # check for single output record with expected fields\n    assert len(data_record_set) == 1\n    output_record = data_record_set[0]\n\n    assert list(output_record.schema.model_fields) == [\"num_elephants\"]\n    assert output_record.num_elephants == 3\n"
  },
  {
    "path": "tests/pytest/test_convert.py",
    "content": "\"\"\"This testing class is an integration test suite.\nWhat it does is consider one of the demo scenarios and test whether we can obtain the same results with the refactored code\n\"\"\"\n\nimport os\n\nimport pytest\n\nfrom palimpzest.constants import Model, PromptStrategy\nfrom palimpzest.core.lib.schemas import File, TextFile, union_schemas\nfrom palimpzest.query.operators.convert import LLMConvertBonded\nfrom palimpzest.query.operators.scan import MarshalAndScanDataOp\n\nif not os.environ.get(\"OPENAI_API_KEY\"):\n    from palimpzest.utils.env_helpers import load_env\n\n    load_env()\n\n\n@pytest.mark.parametrize(\n    argnames=(\"convert_op\", \"side_effect\"),\n    argvalues=[\n        pytest.param(LLMConvertBonded, \"enron-convert\", id=\"bonded-llm-convert\"),\n    ],\n    indirect=[\"side_effect\"],\n)\ndef test_convert(mocker, convert_op, side_effect, email_schema, enron_eval_tiny):\n    \"\"\"Test whether convert operators\"\"\"\n    scan_op = MarshalAndScanDataOp(datasource=enron_eval_tiny, output_schema=TextFile, logical_op_id=\"test_scan\")\n    convert_op = convert_op(\n        input_schema=File,\n        output_schema=email_schema,\n        model=Model.GPT_4o,\n        prompt_strategy=PromptStrategy.MAP,\n        logical_op_id=\"test_convert\",\n    )\n\n    # mock out calls to generators used by the plans which parameterize this test\n    mocker.patch.object(LLMConvertBonded, \"convert\", side_effect=side_effect)\n\n    # run scan and convert operators\n    source_idx = 0\n    record_op_stats_lst, outputs = [], []\n    for record in scan_op(source_idx):\n        record_set = convert_op(record)\n        record_op_stats_lst.extend(record_set.record_op_stats)\n        outputs.extend(record_set.data_records)\n\n    assert len(outputs) == 1\n    assert outputs[0].schema == union_schemas([email_schema, TextFile])\n    assert sorted(outputs[0].get_field_names()) == [\"contents\", \"filename\", \"sender\", \"subject\"]\n"
  },
  {
    "path": "tests/pytest/test_dataset.py",
    "content": "import pandas as pd\nimport pytest\n\nfrom palimpzest.core.data.dataset import Dataset\nfrom palimpzest.core.data.iter_dataset import MemoryDataset\nfrom palimpzest.query.operators.logical import ConvertScan, FilteredScan\n\n\n# Test data\n@pytest.fixture\ndef sample_df():\n    return pd.DataFrame({\n        'id': [1, 2, 3],\n        'name': ['Alice', 'Bob', 'Charlie'],\n        'age': [25, 30, 35]\n    })\n\n\ndef test_dataset_initialization(sample_df):\n    ds = MemoryDataset(\"test\", sample_df)\n    assert isinstance(ds, Dataset)\n    assert sorted(ds.schema.model_fields) == ['age', 'id', 'name']\n\n\ndef test_dataset_filter(sample_df):\n    ds = MemoryDataset(\"test\", sample_df)\n    \n    # Test callable filter\n    filtered_ds = ds.filter(lambda x: x['age'] > 30)\n    assert isinstance(filtered_ds, Dataset)\n    assert isinstance(filtered_ds._operator, FilteredScan)\n    \n    # Test semantic filter\n    sem_filtered_ds = ds.sem_filter(\"age > 30\")\n    assert isinstance(sem_filtered_ds, Dataset)\n    assert isinstance(filtered_ds._operator, FilteredScan)\n\n\ndef test_dataset_add_columns(sample_df):\n    ds = MemoryDataset(\"test\", sample_df)\n\n    # Test UDF add_columns\n    def add_greeting(df):\n        df['greeting'] = 'Hello ' + df['name']\n        return df\n    \n    new_ds = ds.add_columns(udf=add_greeting, cols=[{'name': 'greeting', 'desc': 'Greeting message', 'type': str}])\n    assert isinstance(new_ds, Dataset)\n    assert isinstance(new_ds._operator, ConvertScan)\n    assert new_ds._operator.udf is not None\n    assert sorted(new_ds.schema.model_fields) == ['age', 'greeting', 'id', 'name']\n    greeting_field = new_ds.schema.model_fields['greeting'] \n    assert greeting_field.annotation is str\n    assert greeting_field.description == 'Greeting message'\n\n    # Test semantic add_columns\n    new_cols = [{'name': 'greeting', 'type': str, 'desc': 'Greeting message'},\n                {'name': 'score', 'type': int | float, 
'desc': 'Score'}]\n    sem_new_ds = ds.sem_add_columns(new_cols)\n    assert isinstance(sem_new_ds, Dataset)\n    assert isinstance(sem_new_ds._operator, ConvertScan)\n    assert sorted(sem_new_ds.schema.model_fields) == ['age', 'greeting', 'id', 'name', 'score']\n    greeting_field = sem_new_ds.schema.model_fields['greeting']\n    assert greeting_field.annotation is str\n    assert greeting_field.description == 'Greeting message'\n\n    score_field = sem_new_ds.schema.model_fields['score']\n    assert score_field.annotation == int | float\n    assert score_field.description == 'Score'\n\n    with pytest.raises(ValueError, match=\"`udf` must be provided for add_columns.\"):\n        ds.add_columns(udf=None, cols=None)\n"
  },
  {
    "path": "tests/pytest/test_distinct.py",
    "content": "\"\"\"This testing class is an integration test suite.\nWhat it does is consider one of the demo scenarios and test whether we can obtain the same results with the refactored code\n\"\"\"\n\nimport os\n\nimport pandas as pd\nimport pytest\n\nimport palimpzest as pz\n\nif not os.environ.get(\"OPENAI_API_KEY\"):\n    from palimpzest.utils.env_helpers import load_env\n\n    load_env()\n\n\n# Test data\n@pytest.fixture\ndef sample_df():\n    return pd.DataFrame({\n        'person_id': [1, 1, 2, 3, 4],\n        'name': ['Alice', 'Alice', 'Bob', 'Bob', 'Charlie'],\n        'age': [25, 25, 30, 30, 35]\n    })\n\n\n@pytest.mark.parametrize(\"execution_strategy\", [\"sequential\", \"parallel\"])\ndef test_distinct(sample_df, execution_strategy):\n    ds = pz.MemoryDataset(\"test\", sample_df)\n    ds = ds.distinct()\n    output = ds.run(config=pz.QueryProcessorConfig(execution_strategy=execution_strategy))\n    output_df = output.to_df()\n    assert len(output_df) == 4\n    assert sorted(output_df.columns) == ['age', 'name', 'person_id']\n\n\n@pytest.mark.parametrize(\"execution_strategy\", [\"sequential\", \"parallel\"])\ndef test_dataset_with_distinct_cols(sample_df, execution_strategy):\n    ds = pz.MemoryDataset(\"test\", sample_df)\n    ds = ds.distinct(distinct_cols=['name', 'age'])\n    output = ds.run(config=pz.QueryProcessorConfig(execution_strategy=execution_strategy))\n    output_df = output.to_df()\n    assert len(output_df) == 3\n    assert sorted(output_df.columns) == ['age', 'name', 'person_id']\n\n\n@pytest.mark.parametrize(\"execution_strategy\", [\"sequential\", \"parallel\"])\ndef test_dataset_with_distinct_cols_and_limit(sample_df, execution_strategy):\n    ds = pz.MemoryDataset(\"test\", sample_df)\n    ds = ds.distinct(distinct_cols=['name', 'age']).limit(2)\n    output = ds.run(config=pz.QueryProcessorConfig(execution_strategy=execution_strategy))\n    output_df = output.to_df()\n    assert len(output_df) == 2\n    assert 
sorted(output_df.columns) == ['age', 'name', 'person_id']\n\n\n@pytest.mark.parametrize(\"execution_strategy\", [\"sequential\", \"parallel\"])\ndef test_dataset_with_distinct_cols_and_filter(sample_df, execution_strategy):\n    ds = pz.MemoryDataset(\"test\", sample_df)\n    ds = ds.distinct(distinct_cols=['name', 'age']).filter(lambda row: row['age'] > 30)\n    output = ds.run(config=pz.QueryProcessorConfig(execution_strategy=execution_strategy))\n    output_df = output.to_df()\n    assert len(output_df) == 1\n    assert sorted(output_df.columns) == ['age', 'name', 'person_id']\n"
  },
  {
    "path": "tests/pytest/test_dynamic_models.py",
    "content": "\"\"\"\nTest suite for Model class and model helper functions in Palimpzest.\n\nThis module tests:\n- Model instantiation with curated model IDs\n- Model properties and methods\n- Cost and performance metric retrieval\n- Model registry and get_all_models()\n- Model helper functions (get_models, get_optimal_models, resolve_reasoning_settings)\n- Integration with Generator and QueryProcessor\n- End-to-end pipeline execution\n\"\"\"\n\nimport os\nfrom unittest.mock import MagicMock, patch\n\nimport pandas as pd\nimport pytest\nfrom pydantic import BaseModel, Field\nfrom pydantic.fields import FieldInfo\n\nimport palimpzest as pz\nfrom palimpzest.constants import Model, PromptStrategy\nfrom palimpzest.core.data.dataset import Dataset\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.policy import MaxQuality, MinCost, MinCostAtFixedQuality, MinTime\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.processor.config import QueryProcessorConfig\nfrom palimpzest.query.processor.query_processor_factory import QueryProcessorFactory\nfrom palimpzest.utils.model_helpers import (\n    get_models,\n    get_optimal_models,\n)\nfrom palimpzest.utils.model_info_helpers import (\n    DEFAULT_QUALITY_SCORE,\n    DEFAULT_SECONDS_PER_OUTPUT_TOKEN,\n    MMLU_PRO_SCORES,\n    derive_model_flags,\n    fuzzy_match_score,\n    predict_local_model_metrics,\n)\n\n# =============================================================================\n# FIXTURES\n# =============================================================================\n\n@pytest.fixture\ndef input_schema():\n    \"\"\"Basic input schema for tests.\"\"\"\n    class InputSchema(BaseModel):\n        text: str = Field(description=\"Input text\")\n    return InputSchema\n\n\n@pytest.fixture\ndef output_schema():\n    \"\"\"Basic output schema for tests.\"\"\"\n    class OutputSchema(BaseModel):\n        result: str = Field(description=\"Result field\")\n    
return OutputSchema\n\n\n@pytest.fixture\ndef sample_record(input_schema):\n    \"\"\"A sample DataRecord for generator tests.\"\"\"\n    return DataRecord(input_schema(text=\"Hello\"), source_indices=[1])\n\n\n@pytest.fixture\ndef mock_litellm_response():\n    \"\"\"Standard mock response for litellm.completion.\"\"\"\n    mock_response = MagicMock()\n    mock_response.usage.model_dump.return_value = {\n        \"completion_tokens\": 10,\n        \"prompt_tokens\": 20,\n        \"total_tokens\": 30\n    }\n    mock_response.choices[0].message.content = '{\"result\": \"Test Answer\"}'\n    return mock_response\n\n# =============================================================================\n# TEST CLASS: Model Class Instantiation\n# =============================================================================\n\nclass TestModelInstantiation:\n    \"\"\"Tests for Model class instantiation.\"\"\"\n\n    def test_known_model_instantiation(self):\n        \"\"\"Test that a known model can be instantiated.\"\"\"\n        model = Model.GPT_4o\n        assert model is not None\n        assert model.value == \"openai/gpt-4o-2024-08-06\"\n\n    def test_model_instantiation_with_string(self):\n        \"\"\"Test Model instantiation with a valid model string.\"\"\"\n        # This should work if the model exists in the curated JSON\n        model = Model(\"openai/gpt-4o-2024-08-06\")\n        assert model.value == \"openai/gpt-4o-2024-08-06\"\n        assert model.provider == \"openai\"\n\n    def test_unknown_model_raises_error(self):\n        \"\"\"Test that unknown model IDs raise ValueError.\"\"\"\n        with pytest.raises(ValueError, match=\"does not contain information\"):\n            Model(\"unknown-provider/nonexistent-model-xyz\")\n\n    def test_model_properties_from_specs(self):\n        \"\"\"Test that model properties are correctly loaded from specs.\"\"\"\n        model = Model.GPT_4o\n\n        assert model.is_text_model() is True\n        assert 
model.is_embedding_model() is False\n        assert isinstance(model.get_usd_per_input_token(), float)\n        assert model.get_usd_per_input_token() > 0\n\n    def test_model_provider_property(self):\n        \"\"\"Test that the provider property returns the correct string.\"\"\"\n        model = Model.GPT_4o\n        assert model.provider == \"openai\"\n\n        model_anthropic = Model.CLAUDE_3_7_SONNET\n        assert model_anthropic.provider == \"anthropic\"\n\n    def test_model_api_base_parameter(self):\n        \"\"\"Test that api_base parameter creates a local/vLLM model.\"\"\"\n        model = Model(\"hosted_vllm/qwen/Qwen1.5-0.5B-Chat\", api_base=\"http://localhost:8000/v1\")\n        assert model.value == \"hosted_vllm/qwen/Qwen1.5-0.5B-Chat\"\n        assert model.api_base == \"http://localhost:8000/v1\"\n        assert model.is_vllm_model() is True\n\n\n# =============================================================================\n# TEST CLASS: Model Registry\n# =============================================================================\n\nclass TestModelRegistry:\n    \"\"\"Tests for Model registry functionality.\"\"\"\n\n    def test_models_registered_on_creation(self):\n        \"\"\"Test that models are registered in _registry on creation.\"\"\"\n        # The predefined models should be in the registry\n        all_models = Model.get_all_models()\n        assert len(all_models) > 0\n\n        # Check that GPT_4o is in the registry\n        model_values = [m.value for m in all_models]\n        assert \"openai/gpt-4o-2024-08-06\" in model_values\n\n    def test_get_all_models_returns_list(self):\n        \"\"\"Test that get_all_models returns a list of Model instances.\"\"\"\n        all_models = Model.get_all_models()\n        assert isinstance(all_models, list)\n        assert all(isinstance(m, Model) for m in all_models)\n\n    def test_registry_contains_expected_models(self):\n        \"\"\"Test that the registry contains expected 
predefined models.\"\"\"\n        all_models = Model.get_all_models()\n        model_values = [m.value for m in all_models]\n\n        # Check for some expected models\n        expected_models = [\n            \"openai/gpt-4o-2024-08-06\",\n            \"anthropic/claude-3-7-sonnet-20250219\",\n            \"together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo\",\n        ]\n        for expected in expected_models:\n            assert expected in model_values, f\"Expected {expected} in registry\"\n\n# =============================================================================\n# TEST CLASS: Model Equality and Hashing\n# =============================================================================\n\nclass TestModelEqualityAndHashing:\n    \"\"\"Tests for Model equality and hashing.\"\"\"\n\n    def test_model_equality_same_instance(self):\n        \"\"\"Test that the same model instance is equal to itself.\"\"\"\n        model = Model.GPT_4o\n        assert model == model\n\n    def test_model_equality_same_value(self):\n        \"\"\"Test that models with the same value are equal.\"\"\"\n        model1 = Model(\"openai/gpt-4o-2024-08-06\")\n        model2 = Model(\"openai/gpt-4o-2024-08-06\")\n        assert model1 == model2\n\n    def test_model_equality_with_string(self):\n        \"\"\"Test that a model equals its string value.\"\"\"\n        model = Model.GPT_4o\n        assert model == \"openai/gpt-4o-2024-08-06\"\n\n    def test_model_inequality(self):\n        \"\"\"Test that different models are not equal.\"\"\"\n        assert Model.GPT_4o != Model.CLAUDE_3_7_SONNET\n\n    def test_model_hash_consistency(self):\n        \"\"\"Test that model hash is consistent.\"\"\"\n        model1 = Model(\"openai/gpt-4o-2024-08-06\")\n        model2 = Model(\"openai/gpt-4o-2024-08-06\")\n        assert hash(model1) == hash(model2)\n\n    def test_model_usable_in_set(self):\n        \"\"\"Test that models can be used in sets.\"\"\"\n        model_set = {Model.GPT_4o, 
Model.GPT_4o, Model.CLAUDE_3_7_SONNET}\n        assert len(model_set) == 2\n\n    def test_model_usable_as_dict_key(self):\n        \"\"\"Test that models can be used as dictionary keys.\"\"\"\n        model_dict = {Model.GPT_4o: \"gpt4\", Model.CLAUDE_3_7_SONNET: \"claude\"}\n        assert model_dict[Model.GPT_4o] == \"gpt4\"\n\n    def test_model_str_repr(self):\n        \"\"\"Test string representation of Model.\"\"\"\n        model = Model.GPT_4o\n        assert str(model) == \"openai/gpt-4o-2024-08-06\"\n        assert repr(model) == \"openai/gpt-4o-2024-08-06\"\n\n    def test_model_lt_comparison(self):\n        \"\"\"Test less-than comparison for sorting.\"\"\"\n        models = [Model.GPT_4o, Model.CLAUDE_3_7_SONNET, Model.LLAMA3_1_8B]\n        sorted_models = sorted(models)\n        # Should be sortable without error\n        assert len(sorted_models) == 3\n\n\n# =============================================================================\n# TEST CLASS: Model Helper Functions\n# =============================================================================\n\nclass TestModelHelperFunctions:\n    \"\"\"Tests for model helper functions.\"\"\"\n\n    def test_get_models_with_openai_key(self):\n        \"\"\"Test get_models returns OpenAI models when key is set.\"\"\"\n        with patch.dict(os.environ, {\"OPENAI_API_KEY\": \"test-key\"}, clear=False):\n            models = get_models()\n            openai_models = [m for m in models if m.provider == \"openai\"]\n            assert len(openai_models) > 0\n\n    def test_get_models_excludes_embedding_by_default(self):\n        \"\"\"Test that embedding models are excluded by default.\"\"\"\n        with patch.dict(os.environ, {\"OPENAI_API_KEY\": \"test-key\"}, clear=False):\n            models = get_models(include_embedding=False)\n            embedding_models = [m for m in models if m.is_embedding_model()]\n            assert len(embedding_models) == 0\n\n    def 
test_get_models_includes_embedding_when_requested(self):\n        \"\"\"Test that embedding models are included when requested.\"\"\"\n        with patch.dict(os.environ, {\"OPENAI_API_KEY\": \"test-key\"}, clear=False):\n            models = get_models(include_embedding=True)\n            embedding_models = [m for m in models if m.is_embedding_model()]\n            assert len(embedding_models) > 0\n\n    def test_get_models_empty_without_keys(self):\n        \"\"\"Test that get_models returns empty list without API keys.\"\"\"\n        with patch.dict(os.environ, {\n            \"OPENAI_API_KEY\": \"\",\n            \"ANTHROPIC_API_KEY\": \"\",\n            \"TOGETHER_API_KEY\": \"\",\n            \"GEMINI_API_KEY\": \"\",\n        }, clear=True):\n            models = get_models()\n            assert len(models) == 0\n\n    def test_get_optimal_models_returns_top_models(self):\n        \"\"\"Test that get_optimal_models returns top models based on policy.\"\"\"\n        with patch.dict(os.environ, {\"OPENAI_API_KEY\": \"test-key\"}, clear=False):\n            models = get_optimal_models(policy=MinCost())\n            assert len(models) <= 5  # Should return at most 5\n\n    def test_get_optimal_models_respects_policy(self):\n        \"\"\"Test that optimal models selection respects the policy.\"\"\"\n        with patch.dict(os.environ, {\n            \"OPENAI_API_KEY\": \"test-key\",\n            \"ANTHROPIC_API_KEY\": \"test-key\",\n        }, clear=False):\n            cost_models = get_optimal_models(policy=MinCost())\n            quality_models = get_optimal_models(policy=MaxQuality())\n\n            # Both should return models\n            assert len(cost_models) > 0\n            assert len(quality_models) > 0\n\n    def test_get_optimal_models_never_returns_empty_with_available_models(self):\n        \"\"\"Test that get_optimal_models never returns empty when models are available.\"\"\"\n        with patch.dict(os.environ, {\"OPENAI_API_KEY\": 
\"test-key\"}, clear=False):\n            # Use a very high quality constraint that no model can meet (0.95 = 95% MMLU-Pro)\n            policy = MinCostAtFixedQuality(min_quality=0.95)\n            models = get_optimal_models(policy=policy)\n\n            # Should still return at least one model (the best by primary metric)\n            assert len(models) >= 1\n\n    def test_get_optimal_models_fallback_returns_best_by_primary_metric(self):\n        \"\"\"Test that fallback returns best model according to primary metric.\"\"\"\n        with patch.dict(os.environ, {\"OPENAI_API_KEY\": \"test-key\"}, clear=False):\n            # MinCostAtFixedQuality has primary_metric=\"cost\"\n            # With impossible constraint, should return cheapest model\n            policy_cost = MinCostAtFixedQuality(min_quality=0.99)\n            cost_models = get_optimal_models(policy=policy_cost)\n            assert len(cost_models) >= 1\n\n            # MaxQuality has primary_metric=\"quality\"\n            # Even with no constraint issues, verify it returns models\n            policy_quality = MaxQuality()\n            quality_models = get_optimal_models(policy=policy_quality)\n            assert len(quality_models) >= 1\n\n    def test_get_optimal_models_fallback_with_time_policy(self):\n        \"\"\"Test that fallback works with time-based policy.\"\"\"\n        with patch.dict(os.environ, {\"OPENAI_API_KEY\": \"test-key\"}, clear=False):\n            # MinTime has primary_metric=\"time\"\n            policy = MinTime()\n            models = get_optimal_models(policy=policy)\n\n            # Should return models (fastest ones)\n            assert len(models) >= 1\n\n\n# =============================================================================\n# TEST CLASS: Generator Integration\n# =============================================================================\n\nclass TestGeneratorIntegration:\n    \"\"\"Tests for Generator integration with Model class.\"\"\"\n\n    
@patch(\"palimpzest.query.generators.generators.litellm.completion\")\n    def test_generator_uses_model_value(\n        self, mock_completion, sample_record, output_schema, mock_litellm_response\n    ):\n        \"\"\"Test that Generator uses model.value for litellm calls.\"\"\"\n        mock_completion.return_value = mock_litellm_response\n\n        model = Model.GPT_4o\n        generator = Generator(\n            model=model,\n            prompt_strategy=PromptStrategy.MAP,\n            reasoning_effort=\"default\",\n            verbose=True\n        )\n\n        fields = {k: FieldInfo.from_annotation(v) for k, v in output_schema.model_fields.items()}\n        generator(\n            candidate=sample_record,\n            fields=fields,\n            prompt=\"Test prompt\",\n            parse_answer=lambda x: x,\n            output_schema=output_schema\n        )\n\n        _, kwargs = mock_completion.call_args\n        assert kwargs[\"model\"] == \"openai/gpt-4o-2024-08-06\"\n\n    @patch(\"palimpzest.query.generators.generators.litellm.completion\")\n    def test_generator_with_different_providers(\n        self, mock_completion, sample_record, output_schema, mock_litellm_response\n    ):\n        \"\"\"Test Generator works with models from different providers.\"\"\"\n        mock_completion.return_value = mock_litellm_response\n\n        for model in [Model.GPT_4o, Model.CLAUDE_3_7_SONNET, Model.LLAMA3_3_70B]:\n            generator = Generator(\n                model=model,\n                prompt_strategy=PromptStrategy.MAP,\n                reasoning_effort=\"default\"\n            )\n\n            fields = {k: FieldInfo.from_annotation(v) for k, v in output_schema.model_fields.items()}\n            generator(\n                candidate=sample_record,\n                fields=fields,\n                prompt=\"Test\",\n                parse_answer=lambda x: x,\n                output_schema=output_schema\n            )\n\n            _, kwargs = 
mock_completion.call_args\n            assert kwargs[\"model\"] == model.value\n\n\n# =============================================================================\n# TEST CLASS: QueryProcessor Integration\n# =============================================================================\n\nclass TestQueryProcessorIntegration:\n    \"\"\"Tests for QueryProcessor integration.\"\"\"\n\n    @patch(\"palimpzest.query.processor.query_processor_factory.QueryProcessor\")\n    def test_factory_accepts_model_list(self, mock_processor_cls):\n        \"\"\"Test that QueryProcessorFactory accepts available_models.\"\"\"\n        mock_dataset = MagicMock(spec=Dataset)\n        mock_dataset.schema = MagicMock()\n        mock_dataset.get_limit.return_value = None\n\n        config = QueryProcessorConfig(\n            policy=MinCost(),\n            available_models=[Model.GPT_4o, Model.CLAUDE_3_7_SONNET],\n            verbose=True,\n        )\n\n        with patch.dict(os.environ, {\"OPENAI_API_KEY\": \"fake-key\", \"ANTHROPIC_API_KEY\": \"fake-key\"}), \\\n             patch.object(QueryProcessorFactory, \"_create_optimizer\"), \\\n             patch.object(QueryProcessorFactory, \"_create_execution_strategy\"), \\\n             patch.object(QueryProcessorFactory, \"_create_sentinel_execution_strategy\"):\n            QueryProcessorFactory.create_processor(mock_dataset, config=config)\n\n        # Verify processor was created\n        mock_processor_cls.assert_called_once()\n\n    def test_factory_auto_selects_models_when_none_provided(self):\n        \"\"\"Test that factory calls get_optimal_models when available_models is empty.\"\"\"\n        mock_dataset = MagicMock(spec=Dataset)\n        mock_dataset.schema = MagicMock()\n        mock_dataset.get_limit.return_value = None\n\n        config = QueryProcessorConfig(\n            policy=MinCost(),\n            available_models=[],  # Empty list\n            verbose=True,\n        )\n\n        # Mock get_optimal_models to return 
some models and verify it's called\n        with patch.dict(os.environ, {\"OPENAI_API_KEY\": \"fake-key\"}), \\\n             patch(\"palimpzest.query.processor.query_processor_factory.get_optimal_models\",\n                   return_value=[Model.GPT_4o, Model.GPT_4o_MINI]) as mock_get_optimal, \\\n             patch(\"palimpzest.query.processor.query_processor_factory.QueryProcessor\"), \\\n             patch.object(QueryProcessorFactory, \"_create_optimizer\"), \\\n             patch.object(QueryProcessorFactory, \"_create_execution_strategy\"), \\\n             patch.object(QueryProcessorFactory, \"_create_sentinel_execution_strategy\"):\n            QueryProcessorFactory.create_processor(mock_dataset, config=config)\n            # Verify get_optimal_models was called with correct policy\n            mock_get_optimal.assert_called_once()\n            call_kwargs = mock_get_optimal.call_args\n            assert call_kwargs[1][\"policy\"] == config.policy\n\n# =============================================================================\n# TEST CLASS: End-to-End Integration\n# =============================================================================\n\nclass TestEndToEndIntegration:\n    \"\"\"End-to-end integration tests for the palimpzest pipeline.\"\"\"\n\n    @pytest.mark.skipif(\n        not os.environ.get(\"OPENAI_API_KEY\"),\n        reason=\"OPENAI_API_KEY not set\"\n    )\n    def test_simple_sem_map_pipeline(self):\n        \"\"\"Test a simple semantic map pipeline end-to-end.\"\"\"\n        # Create a simple dataset\n        df = pd.DataFrame({\n            \"question\": [\"What is 2 + 2?\", \"What is the capital of France?\"]\n        })\n        dataset = pz.MemoryDataset(\"test_e2e\", df)\n\n        # Define output schema\n        class Answer(BaseModel):\n            response: str = Field(description=\"The answer to the question\")\n\n        # Create pipeline\n        plan = dataset.sem_map(\n            cols=Answer,\n            desc=\"Answer 
the question concisely\"\n        )\n\n        # Configure and run\n        config = QueryProcessorConfig(\n            policy=MinCost(),\n            available_models=[Model.GPT_4o_MINI],\n            execution_strategy=\"sequential\",\n            progress=False,\n            verbose=False,\n        )\n\n        # Execute the pipeline\n        results = plan.run(config)\n        result_df = results.to_df()\n\n        # Verify results\n        assert len(result_df) == 2\n        assert \"response\" in result_df.columns\n\n        # Check that we got meaningful answers\n        answers = result_df[\"response\"].astype(str).str.lower().tolist()\n        assert any(\"4\" in a for a in answers), \"Expected answer containing '4'\"\n        assert any(\"paris\" in a for a in answers), \"Expected answer containing 'paris'\"\n\n    @pytest.mark.skipif(\n        not os.environ.get(\"OPENAI_API_KEY\"),\n        reason=\"OPENAI_API_KEY not set\"\n    )\n    def test_pipeline_with_filter(self):\n        \"\"\"Test a pipeline with semantic filter end-to-end.\"\"\"\n        # Create dataset with mixed content\n        df = pd.DataFrame({\n            \"text\": [\n                \"The sky is blue.\",\n                \"Python is a programming language.\",\n                \"Water boils at 100 degrees Celsius.\",\n                \"JavaScript runs in browsers.\",\n            ]\n        })\n        dataset = pz.MemoryDataset(\"test_filter\", df)\n\n        # Filter for programming-related content\n        filtered = dataset.sem_filter(\"text is about programming\")\n\n        # Configure and run\n        config = QueryProcessorConfig(\n            policy=MinCost(),\n            available_models=[Model.GPT_4o_MINI],\n            execution_strategy=\"sequential\",\n            progress=False,\n            verbose=False,\n        )\n\n        results = filtered.run(config)\n        result_df = results.to_df()\n\n        # Should have filtered to programming-related rows\n        
assert len(result_df) >= 1\n        assert len(result_df) <= 2  # Should be Python and/or JavaScript rows\n\n    @pytest.mark.skipif(\n        not os.environ.get(\"OPENAI_API_KEY\"),\n        reason=\"OPENAI_API_KEY not set\"\n    )\n    def test_pipeline_with_auto_model_selection(self):\n        \"\"\"Test that pipeline works with automatic model selection.\"\"\"\n        df = pd.DataFrame({\"input\": [\"Hello, world!\"]})\n        dataset = pz.MemoryDataset(\"test_auto\", df)\n\n        class Output(BaseModel):\n            greeting: str = Field(description=\"A friendly greeting response\")\n\n        plan = dataset.sem_map(cols=Output, desc=\"Respond with a greeting\")\n\n        # Don't specify available_models - let the system auto-select\n        config = QueryProcessorConfig(\n            policy=MinCost(),\n            execution_strategy=\"sequential\",\n            progress=False,\n            verbose=False,\n        )\n\n        results = plan.run(config)\n        result_df = results.to_df()\n\n        assert len(result_df) == 1\n        assert \"greeting\" in result_df.columns\n\n\n# =============================================================================\n# TEST CLASS: vLLM / Local Model Support\n# =============================================================================\n\nclass TestVLLMModelSupport:\n    \"\"\"Tests for local/vLLM model creation, metrics, flags, and validation.\"\"\"\n\n    # --- Model Creation ---\n\n    def test_vllm_model_creation_with_api_base(self):\n        \"\"\"Test that a vLLM model can be created with api_base.\"\"\"\n        model = Model(\"hosted_vllm/qwen/Qwen1.5-0.5B-Chat\", api_base=\"http://localhost:8000/v1\")\n        assert model.value == \"hosted_vllm/qwen/Qwen1.5-0.5B-Chat\"\n        assert model.api_base == \"http://localhost:8000/v1\"\n\n    def test_vllm_model_stores_extra_kwargs(self):\n        \"\"\"Test that extra kwargs are stored as vllm_kwargs.\"\"\"\n        model = 
Model(\"openai/Qwen/Qwen2.5-1.5B-Instruct\", api_base=\"http://localhost:8000/v1\", max_tokens=128)\n        assert model.vllm_kwargs == {\"max_tokens\": 128}\n\n    def test_vllm_model_without_api_base_raises(self):\n        \"\"\"Test that a model without api_base and not in curated JSON raises ValueError.\"\"\"\n        with pytest.raises(ValueError, match=\"does not contain information\"):\n            Model(\"hosted_vllm/totally-fake/NonexistentModel-v999\")\n\n    # --- Cost is Zero for Local Models ---\n\n    def test_vllm_model_cost_is_zero(self):\n        \"\"\"Test that all cost metrics are 0 for local/vLLM models.\"\"\"\n        model = Model(\"hosted_vllm/qwen/Qwen1.5-0.5B-Chat\", api_base=\"http://localhost:8000/v1\")\n        assert model.get_usd_per_input_token() == 0.0\n        assert model.get_usd_per_output_token() == 0.0\n        assert model.get_usd_per_audio_input_token() == 0.0\n        assert model.get_usd_per_cache_read_token() == 0.0\n        assert model.get_usd_per_cache_creation_token() == 0.0\n        assert model.get_usd_per_audio_cache_creation_token() == 0.0\n\n    # --- Quality and Latency Predictions ---\n\n    def test_predict_local_model_metrics_known_model(self):\n        \"\"\"Test predict_local_model_metrics for a model with known scores.\"\"\"\n        metrics = predict_local_model_metrics(\"meta-llama/Llama-3.1-8B-Instruct\")\n        assert metrics[\"MMLU_Pro_score\"] == 44.25\n        assert metrics[\"seconds_per_output_token\"] == round(1.0 / 200.0, 6)\n\n    def test_predict_local_model_metrics_unknown_model(self):\n        \"\"\"Test predict_local_model_metrics falls back to defaults for unknown models.\"\"\"\n        metrics = predict_local_model_metrics(\"some-unknown/model-xyz\")\n        assert metrics[\"MMLU_Pro_score\"] == DEFAULT_QUALITY_SCORE\n        assert metrics[\"seconds_per_output_token\"] == DEFAULT_SECONDS_PER_OUTPUT_TOKEN\n\n    def test_vllm_model_has_quality_score(self):\n        \"\"\"Test that a 
vLLM model gets a quality score via fuzzy matching.\"\"\"\n        model = Model(\"openai/meta-llama/Llama-3.1-8B-Instruct\", api_base=\"http://localhost:8000/v1\")\n        score = model.get_overall_score()\n        assert score == 44.25\n\n    def test_vllm_model_has_latency(self):\n        \"\"\"Test that a vLLM model gets latency via fuzzy matching.\"\"\"\n        model = Model(\"openai/meta-llama/Llama-3.1-8B-Instruct\", api_base=\"http://localhost:8000/v1\")\n        latency = model.get_seconds_per_output_token()\n        assert latency == round(1.0 / 200.0, 6)\n\n    def test_vllm_model_unknown_gets_defaults(self):\n        \"\"\"Test that an unrecognized vLLM model gets default quality and latency.\"\"\"\n        model = Model(\"openai/some-custom/MyCustomModel-v1\", api_base=\"http://localhost:8000/v1\")\n        assert model.get_overall_score() == DEFAULT_QUALITY_SCORE\n        assert model.get_seconds_per_output_token() == DEFAULT_SECONDS_PER_OUTPUT_TOKEN\n\n    # --- Fuzzy Matching ---\n\n    def test_fuzzy_match_exact_substring(self):\n        \"\"\"Test that fuzzy_match_score finds exact substring matches.\"\"\"\n        score = fuzzy_match_score(\"meta-llama/Llama-3.3-70B-Instruct-Turbo\", MMLU_PRO_SCORES)\n        assert score == 69.9\n\n    def test_fuzzy_match_normalized(self):\n        \"\"\"Test that fuzzy_match_score handles normalized matching.\"\"\"\n        score = fuzzy_match_score(\"deepseek-ai/DeepSeek-V3\", MMLU_PRO_SCORES)\n        assert score == 73.8\n\n    def test_fuzzy_match_no_match_returns_none(self):\n        \"\"\"Test that fuzzy_match_score returns None for unrecognized models.\"\"\"\n        score = fuzzy_match_score(\"totally-unknown-model\", MMLU_PRO_SCORES)\n        assert score is None\n\n    # --- derive_model_flags ---\n\n    def test_derive_model_flags_llama(self):\n        \"\"\"Test that derive_model_flags correctly detects Llama models.\"\"\"\n        flags = 
derive_model_flags(\"openai/meta-llama/Llama-3.1-8B-Instruct\")\n        assert flags.get(\"is_llama_model\") is True\n\n    def test_derive_model_flags_non_llama(self):\n        \"\"\"Test that derive_model_flags does not set is_llama_model for non-Llama.\"\"\"\n        flags = derive_model_flags(\"openai/Qwen/Qwen2.5-1.5B-Instruct\")\n        assert \"is_llama_model\" not in flags\n\n    def test_derive_model_flags_clip(self):\n        \"\"\"Test that derive_model_flags correctly detects CLIP models.\"\"\"\n        flags = derive_model_flags(\"clip-ViT-B-32\")\n        assert flags.get(\"is_clip_model\") is True\n\n    def test_derive_model_flags_gpt5(self):\n        \"\"\"Test that derive_model_flags correctly detects GPT-5 models.\"\"\"\n        flags = derive_model_flags(\"openai/gpt-5-2025-08-07\")\n        assert flags.get(\"is_gpt_5_model\") is True\n\n    def test_derive_model_flags_o_model(self):\n        \"\"\"Test that derive_model_flags correctly detects O-series models.\"\"\"\n        flags = derive_model_flags(\"openai/o4-mini-2025-04-16\")\n        assert flags.get(\"is_o_model\") is True\n\n    # --- is_vllm_model and is_llama_model for local models ---\n\n    def test_vllm_model_is_vllm(self):\n        \"\"\"Test that is_vllm_model returns True for api_base models.\"\"\"\n        model = Model(\"openai/Qwen/Qwen2.5-1.5B-Instruct\", api_base=\"http://localhost:8000/v1\")\n        assert model.is_vllm_model() is True\n\n    def test_vllm_llama_model_is_llama(self):\n        \"\"\"Test that a local Llama model correctly reports is_llama_model.\"\"\"\n        model = Model(\"openai/meta-llama/Llama-3.1-8B-Instruct\", api_base=\"http://localhost:8000/v1\")\n        assert model.is_llama_model() is True\n\n    def test_vllm_non_llama_is_not_llama(self):\n        \"\"\"Test that a non-Llama local model does not report is_llama_model.\"\"\"\n        model = Model(\"openai/Qwen/Qwen2.5-1.5B-Instruct\", api_base=\"http://localhost:8000/v1\")\n        assert 
model.is_llama_model() is False\n\n    # --- Default capabilities for local models ---\n\n    def test_vllm_model_defaults(self):\n        \"\"\"Test default capabilities for a vLLM model.\"\"\"\n        model = Model(\"openai/Qwen/Qwen2.5-1.5B-Instruct\", api_base=\"http://localhost:8000/v1\")\n        assert model.is_text_model() is True\n        assert model.is_embedding_model() is False\n\n    # --- QueryProcessor vLLM Validation ---\n\n    def test_factory_rejects_multiple_vllm_models(self):\n        \"\"\"Test that QueryProcessorFactory rejects configs with multiple vLLM models.\"\"\"\n        mock_dataset = MagicMock(spec=Dataset)\n        mock_dataset.schema = MagicMock()\n        mock_dataset.get_limit.return_value = None\n\n        model1 = Model(\"openai/model-a\", api_base=\"http://localhost:8000/v1\")\n        model2 = Model(\"openai/model-b\", api_base=\"http://localhost:8001/v1\")\n        config = QueryProcessorConfig(\n            policy=MinCost(),\n            available_models=[model1, model2],\n        )\n\n        with pytest.raises(ValueError, match=\"Only one vLLM model\"):\n            QueryProcessorFactory.create_processor(mock_dataset, config=config)\n\n    # --- Generator vLLM kwargs ---\n\n    @patch(\"palimpzest.query.generators.generators.litellm.completion\")\n    def test_generator_passes_vllm_kwargs(self, mock_completion, sample_record, output_schema, mock_litellm_response):\n        \"\"\"Test that Generator passes api_base and vllm_kwargs to litellm.\"\"\"\n        mock_completion.return_value = mock_litellm_response\n\n        model = Model(\"openai/Qwen/Qwen2.5-1.5B-Instruct\", api_base=\"http://localhost:8000/v1\", max_tokens=128)\n        generator = Generator(\n            model=model,\n            prompt_strategy=PromptStrategy.MAP,\n            reasoning_effort=\"default\",\n        )\n\n        fields = {k: FieldInfo.from_annotation(v) for k, v in output_schema.model_fields.items()}\n        
generator(candidate=sample_record, fields=fields, prompt=\"Test\", parse_answer=lambda x: x, output_schema=output_schema)\n\n        _, kwargs = mock_completion.call_args\n        assert kwargs[\"api_base\"] == \"http://localhost:8000/v1\"\n        assert kwargs[\"max_tokens\"] == 128\n        assert \"api_key\" in kwargs\n"
  },
  {
    "path": "tests/pytest/test_dynamicschema.py",
    "content": "\"\"\"This testing class tests whether we can run a workload by defining a schema dynamically.\"\"\"\nfrom pathlib import Path\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.lib.schemas import TextFile\nfrom palimpzest.policy import MinCost\nfrom palimpzest.query.operators.convert import LLMConvertBonded\nfrom palimpzest.query.operators.filter import LLMFilter\nfrom palimpzest.query.processor.config import QueryProcessorConfig\nfrom palimpzest.schemabuilder.schema_builder import SchemaBuilder\n\ndata_path = Path(\"tests/pytest/data/\")\n\n\ndef test_dynamicschema_jsonld(project_root: Path):\n    asset_path = str(project_root / data_path / \"synapse_schema.jsonld\")\n    clinical_schema = SchemaBuilder.from_file(asset_path, schema_type=TextFile)\n    assert clinical_schema is not None\n\ndef test_dynamicschema_csv(project_root: Path):\n    asset_path = str(project_root / data_path / \"synapse_schema.csv\")\n    clinical_schema = SchemaBuilder.from_file(asset_path, schema_type=TextFile)\n    assert clinical_schema is not None\n\n\ndef test_dynamicschema_json(mocker, enron_workload, enron_convert, enron_filter, project_root: Path):\n    asset_path = str(project_root / data_path / \"email_schema.json\")\n    email_schema = SchemaBuilder.from_file(asset_path, schema_type=TextFile)\n    assert email_schema is not None\n    for field_name in TextFile.model_fields:\n        assert field_name in email_schema.model_fields, f\"Field {field_name} not found in the schema\"\n\n    # mock out calls to generators used by the plans which parameterize this test\n    mocker.patch.object(LLMFilter, \"filter\", side_effect=enron_filter)\n    mocker.patch.object(LLMConvertBonded, \"convert\", side_effect=enron_convert)\n\n    config = QueryProcessorConfig(\n        policy=MinCost(),\n        available_models=[Model.GPT_4o_MINI],\n        num_samples=3,\n        allow_bonded_query=True,\n        allow_rag_reduction=False,\n        allow_mixtures=False,\n  
      allow_critic=False,\n        allow_split_merge=False,\n        execution_strategy=\"sequential\",\n        optimizer_strategy=\"pareto\",\n    )\n    data_record_collection = enron_workload.run(config=config)\n\n    for rec in data_record_collection:\n        print(rec.to_dict())\n\n\ndef test_dynamicschema_yml(mocker, enron_workload, enron_convert, enron_filter, project_root: Path):\n    asset_path = str(project_root / data_path / \"email_schema.yml\")\n    email_schema = SchemaBuilder.from_file(asset_path, schema_type=TextFile)\n    assert email_schema is not None\n    for field_name in TextFile.model_fields:\n        assert field_name in email_schema.model_fields, f\"Field {field_name} not found in the schema\"\n\n    # mock out calls to generators used by the plans which parameterize this test\n    mocker.patch.object(LLMFilter, \"filter\", side_effect=enron_filter)\n    mocker.patch.object(LLMConvertBonded, \"convert\", side_effect=enron_convert)\n\n    config = QueryProcessorConfig(\n        policy=MinCost(),\n        available_models=[Model.GPT_4o_MINI],\n        num_samples=3,\n        allow_bonded_query=True,\n        allow_rag_reduction=False,\n        allow_mixtures=False,\n        allow_critic=False,\n        allow_split_merge=False,\n        execution_strategy=\"sequential\",\n        optimizer_strategy=\"pareto\",\n    )\n    data_record_collection = enron_workload.run(config=config)\n\n    for rec in data_record_collection:\n        print(rec.to_dict())\n"
  },
  {
    "path": "tests/pytest/test_execution.py",
    "content": "import pytest\n\nfrom palimpzest.policy import MaxQuality\nfrom palimpzest.query.operators.convert import LLMConvertBonded\nfrom palimpzest.query.operators.filter import LLMFilter\nfrom palimpzest.query.operators.rag import RAGConvert\nfrom palimpzest.query.processor.config import QueryProcessorConfig\nfrom palimpzest.query.processor.query_processor_factory import QueryProcessorFactory\n\n\n@pytest.mark.parametrize(\n    argnames=(\"execution_strategy\",),\n    argvalues=[\n        pytest.param(\"sequential\", id=\"seq-single-thread\"),\n        pytest.param(\"pipelined\", id=\"pipelined-single-thread\"),\n        pytest.param(\"parallel\", id=\"parallel\"),\n    ]\n)\nclass TestExecution:\n\n    @pytest.mark.parametrize(\n        argnames=(\"dataset\", \"physical_plan\", \"expected_records\", \"side_effect\"),\n        argvalues=[\n            pytest.param(\"enron-eval-tiny\", \"scan-only\", \"enron-all-records\", None, id=\"scan-only\"),\n            pytest.param(\"enron-eval-tiny\", \"non-llm-filter\", \"enron-filtered-records\", None, id=\"non-llm-filter\"),\n            pytest.param(\"enron-eval-tiny\", \"llm-filter\", \"enron-filtered-records\", \"enron-filter\", id=\"llm-filter\"),\n            pytest.param(\n                \"enron-eval-tiny\", \"bonded-llm-convert\", \"enron-all-records\", \"enron-convert\", id=\"bonded-llm-convert\"\n            ),\n            pytest.param(\n                \"enron-eval-tiny\", \n                \"rag-convert\",\n                \"enron-all-records\",\n                \"enron-convert\",\n                id=\"rag-convert\",\n            ),\n            pytest.param(\n                \"real-estate-eval-tiny\",\n                \"image-convert\",\n                \"real-estate-all-records\",\n                \"real-estate-convert\",\n                id=\"image-convert\",\n            ),\n            pytest.param(\n                \"real-estate-eval-tiny\",\n                \"one-to-many-convert\",\n          
      \"real-estate-one-to-many-records\",\n                \"real-estate-one-to-many-convert\",\n                id=\"one-to-many-convert\",\n            ),\n        ],\n        indirect=True,\n    )\n    def test_execute_full_plan(self, mocker, execution_strategy, dataset, physical_plan, expected_records, side_effect):\n        \"\"\"\n        This test executes the given physical plan with the given execution strategy and\n        checks that the output records match the expected records.\n        \"\"\"\n        # create processor\n        config = QueryProcessorConfig(execution_strategy=execution_strategy, policy=MaxQuality())\n        processor = QueryProcessorFactory.create_processor(dataset, config)\n\n        # mock out calls to generators used by the plans which parameterize this test\n        mocker.patch.object(LLMFilter, \"filter\", side_effect=side_effect)\n        mocker.patch.object(LLMConvertBonded, \"convert\", side_effect=side_effect)\n        mocker.patch.object(RAGConvert, \"convert\", side_effect=side_effect)\n\n        # execute the plan\n        output_records, plan_stats = processor.execution_strategy.execute_plan(physical_plan)\n\n        # check that we get the expected set of output records\n        def get_id(record):\n            return record.listing if \"RealEstate\" in dataset.__class__.__name__ else record.filename\n\n        assert len(output_records) == len(expected_records)\n        assert sorted(map(get_id, output_records)) == sorted(map(get_id, expected_records))\n\n        # sanity check plan stats\n        assert plan_stats.total_plan_time > 0.0\n\n        # if the plan used (mocked) calls to an LLM, assert that the plan cost money\n        if side_effect is not None:\n            assert plan_stats.total_plan_cost > 0.0\n        else:\n            assert plan_stats.total_plan_cost == 0.0\n"
  },
  {
    "path": "tests/pytest/test_filter.py",
    "content": "\"\"\"This script contains tests for physical operators for filter.\"\"\"\n\nimport os\n\nimport pytest\nfrom pydantic import BaseModel, Field\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.elements.filters import Filter\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.lib.schemas import AudioFilepath, ImageFilepath, union_schemas\nfrom palimpzest.core.models import GenerationStats\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.operators.critique_and_refine import CritiqueAndRefineFilter\nfrom palimpzest.query.operators.filter import LLMFilter\nfrom palimpzest.query.operators.mixture_of_agents import MixtureOfAgentsFilter\nfrom palimpzest.query.operators.rag import RAGFilter\nfrom palimpzest.query.operators.split import SplitFilter\n\nif not os.environ.get(\"OPENAI_API_KEY\"):\n    from palimpzest.utils.env_helpers import load_env\n\n    load_env()\n\n\nclass TextInputSchema(BaseModel):\n    text: str = Field(description=\"Description of an animal\")\n    age: int = Field(description=\"The age of the animal in years\")\n\nclass ImageInputSchema(BaseModel):\n    image_file: ImageFilepath = Field(description=\"File path to an image of an animal\")\n    height: float = Field(description=\"The estimated height of the animal in cm\")\n\nclass AudioInputSchema(BaseModel):\n    audio_file: AudioFilepath = Field(description=\"File path to an audio recording of an animal\")\n    year: float = Field(description=\"The year the recording was made\")\n\nTextImageInputSchema = union_schemas([TextInputSchema, ImageInputSchema])\nTextAudioInputSchema = union_schemas([TextInputSchema, AudioInputSchema])\nImageAudioInputSchema = union_schemas([ImageInputSchema, AudioInputSchema])\nTextImageAudioInputSchema = union_schemas([TextInputSchema, ImageInputSchema, AudioInputSchema])\n\ndef mock_generator_call(candidate, fields, right_candidate=None, json_output=True, **kwargs):\n    
field_answers = {\"passed_operator\": True}\n    reasoning = \"The input matches that of an elephant.\"\n    generation_stats = GenerationStats(cost_per_record=1.0, time_per_record=1.0, num_input_tokens=10, num_output_tokens=10)\n    messages = []\n    return field_answers, reasoning, generation_stats, messages\n\n\n@pytest.mark.parametrize(\n    \"input_schema\",\n    [TextInputSchema, ImageInputSchema, AudioInputSchema, TextImageInputSchema, TextAudioInputSchema, ImageAudioInputSchema, TextImageAudioInputSchema],\n    ids=[\"text-only\", \"image-only\", \"audio-only\", \"text-image\", \"text-audio\", \"image-audio\", \"text-image-audio\"],\n)\n@pytest.mark.parametrize(\n    \"physical_op_class\",\n    [LLMFilter, RAGFilter, SplitFilter, CritiqueAndRefineFilter, MixtureOfAgentsFilter],\n    ids=[\"llm-filter\", \"rag-filter\", \"split-filter\", \"critique-and-refine-filter\", \"mixture-of-agents-filter\"],\n)\ndef test_filter(mocker, input_schema, physical_op_class, embedding_text_only_model):\n    \"\"\"Test filter operators on simple input\"\"\"\n    # RAGFilter and SplitFilter only support text input currently\n    if physical_op_class in [RAGFilter, SplitFilter] and input_schema != TextInputSchema:\n        pytest.skip(f\"{physical_op_class} only supports text input currently\")\n\n    if os.getenv(\"NO_GEMINI\") and input_schema in [AudioInputSchema, TextAudioInputSchema, ImageAudioInputSchema, TextImageAudioInputSchema]:\n        pytest.skip(\"Skipping multi-modal audio tests on CI which does not have access to gemini models\")\n\n    model = Model.GPT_5_MINI if os.getenv(\"NO_GEMINI\") else Model.GEMINI_2_5_FLASH\n    embedding_model = embedding_text_only_model\n    proposer_models = [Model.GPT_5, Model.GPT_5_NANO] if os.getenv(\"NO_GEMINI\") else [Model.GEMINI_2_5_PRO, Model.GEMINI_2_0_FLASH]\n    critic_model = Model.GPT_5_NANO if os.getenv(\"NO_GEMINI\") else Model.GEMINI_2_0_FLASH\n    refine_model = Model.GPT_5 if os.getenv(\"NO_GEMINI\") else 
Model.GEMINI_2_5_PRO\n\n    # construct the kwargs for the physical operator\n    filter = Filter(filter_condition=\"The animal is an elephant.\")\n    physical_op_kwargs = {\"input_schema\": input_schema, \"output_schema\": input_schema, \"filter\": filter, \"logical_op_id\": \"test-filter\"}\n    if physical_op_class is LLMFilter:\n        physical_op_kwargs[\"model\"] = model\n    elif physical_op_class is RAGFilter:\n        physical_op_kwargs[\"model\"] = model\n        physical_op_kwargs[\"embedding_model\"] = embedding_model\n        physical_op_kwargs[\"num_chunks_per_field\"] = 1\n        physical_op_kwargs[\"chunk_size\"] = 1000\n    elif physical_op_class is SplitFilter:\n        physical_op_kwargs[\"model\"] = model\n        physical_op_kwargs[\"num_chunks\"] = 2\n        physical_op_kwargs[\"min_size_to_chunk\"] = 1000\n    elif physical_op_class is MixtureOfAgentsFilter:\n        physical_op_kwargs[\"proposer_models\"] = proposer_models\n        physical_op_kwargs[\"temperatures\"] = [0.8, 0.8]\n        physical_op_kwargs[\"aggregator_model\"] = model\n    elif physical_op_class is CritiqueAndRefineFilter:\n        physical_op_kwargs[\"model\"] = model\n        physical_op_kwargs[\"critic_model\"] = critic_model\n        physical_op_kwargs[\"refine_model\"] = refine_model\n\n    # create filter operator\n    filter_op = physical_op_class(**physical_op_kwargs)\n\n    # create input record\n    data_item = {}\n    if all(field in input_schema.model_fields for field in TextInputSchema.model_fields):\n        data_item['text'] = \"An elephant is a large gray animal with a trunk and big ears.\"\n        data_item['age'] = 3\n    if all(field in input_schema.model_fields for field in ImageInputSchema.model_fields):\n        data_item['image_file'] = \"tests/pytest/data/elephant.png\"\n        data_item['height'] = 304.5\n    if all(field in input_schema.model_fields for field in AudioInputSchema.model_fields):\n        data_item['audio_file'] = 
\"tests/pytest/data/elephant.wav\"\n        data_item['year'] = 2020\n    input_record = DataRecord(input_schema(**data_item), source_indices=[0])\n\n    # only execute LLM calls if specified\n    if not os.getenv(\"RUN_LLM_TESTS\"):\n        mocker.patch.object(Generator, \"__call__\", side_effect=mock_generator_call)\n\n    # apply filter operator to the input\n    data_record_set = filter_op(input_record)\n\n    # check for single output record with expected fields\n    assert len(data_record_set) == 1\n    output_record = data_record_set[0]\n\n    assert sorted(output_record.schema.model_fields) == sorted(input_schema.model_fields)\n    assert output_record._passed_operator\n"
  },
  {
    "path": "tests/pytest/test_generator.py",
    "content": "import math\nimport os\nimport time\nimport uuid\n\nimport pytest\nfrom pydantic import BaseModel, Field\n\nfrom palimpzest.constants import Model, PromptStrategy\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.lib.schemas import AudioFilepath, ImageFilepath, union_schemas\nfrom palimpzest.query.generators.generators import Generator\n\n\ndef generate_session_id() -> str:\n    \"\"\"\n    Generate a unique 12-character session ID.\n    This ensures each test run has a unique prompt prefix, preventing cache hits from previous runs.\n    \"\"\"\n    return uuid.uuid4().hex[:12].upper()\n\n\n@pytest.fixture\ndef question():\n    class Question(BaseModel):\n        question: str = Field(description=\"A simple question\")\n    dr = DataRecord(data_item=Question(question=\"What color is grass? (one-word answer)\"), source_indices=[0])\n    return dr\n\n@pytest.fixture\ndef output_schema():\n    class Answer(BaseModel):\n        answer: str = Field(description=\"The one-word answer to the question.\")\n    return Answer\n\n@pytest.mark.parametrize(\n    \"model\",\n    [\n        pytest.param(Model.GPT_4o_MINI, marks=pytest.mark.skipif(os.getenv(\"OPENAI_API_KEY\") is None, reason=\"OPENAI_API_KEY not present\")),\n        pytest.param(Model.AZURE_GPT_4o_MINI, marks=pytest.mark.skipif(os.getenv(\"AZURE_API_KEY\") is None and os.getenv(\"AZURE_OPENAI_API_KEY\") is None, reason=\"AZURE_API_KEY/AZURE_OPENAI_API_KEY not present\")),\n        pytest.param(Model.DEEPSEEK_V3, marks=pytest.mark.skipif(os.getenv(\"TOGETHER_API_KEY\") is None, reason=\"TOGETHER_API_KEY not present\")),\n        pytest.param(Model.LLAMA3_1_8B, marks=pytest.mark.skipif(os.getenv(\"TOGETHER_API_KEY\") is None, reason=\"TOGETHER_API_KEY not present\")),\n        pytest.param(Model.CLAUDE_4_5_HAIKU, marks=pytest.mark.skipif(os.getenv(\"ANTHROPIC_API_KEY\") is None, reason=\"ANTHROPIC_API_KEY not present\")),\n    ]\n)\ndef test_generator(model, question, 
output_schema):\n    generator = Generator(model, PromptStrategy.MAP, None)\n    output, _, gen_stats, _ = generator(question, output_schema.model_fields, **{\"output_schema\": output_schema})\n    # Basic checks: generator produced output and tracked some stats\n    assert gen_stats.output_text_tokens > 0, \"Expected positive output tokens\"\n    assert output[\"answer\"][0].lower() == \"green\"\n\n\n@pytest.mark.skipif(os.getenv(\"VLLM_API_BASE\") is None, reason=\"VLLM_API_BASE not set (no vLLM server running)\")\ndef test_vllm_generator(question, output_schema):\n    api_base = os.getenv(\"VLLM_API_BASE\")\n    model_id = os.getenv(\"VLLM_MODEL_ID\", \"openai/Qwen/Qwen2.5-1.5B-Instruct\")\n    model = Model(model_id, api_base=api_base)\n    generator = Generator(model, PromptStrategy.MAP, None)\n    output, _, gen_stats, _ = generator(question, output_schema.model_fields, **{\"output_schema\": output_schema})\n    assert gen_stats.total_input_tokens > 0\n    assert gen_stats.total_output_tokens > 0\n    assert output[\"answer\"] is not None\n\n# =============================================================================\n# GENERATOR STATS VALIDATION TESTS\n# =============================================================================\n# These tests validate that the Generator correctly tracks token usage and costs\n# for different provider/modality combinations.\n\nSTATIC_CONTEXT = \"\"\"\nWILDLIFE CONSERVATION & RESEARCH CENTER: SPECIES IDENTIFICATION MANUAL (v2025.1)\n\nSECTION 1: INTRODUCTION AND MISSION\nThe Wildlife Conservation & Research Center (WCRC) is dedicated to the preservation, study, and rehabilitation of diverse wildlife species.\nAll staff members, researchers, and volunteers must adhere to these protocols for accurate species identification and data collection.\nOur mission combines advanced biological sciences with conservation efforts to protect endangered and threatened populations worldwide.\n\nSECTION 2: MAMMAL IDENTIFICATION 
PROTOCOLS\n\n2.1 ELEPHANTS (Family Elephantidae):\n    - African Savanna Elephant: Larger ears (shaped like Africa), concave back, two fingers on trunk tip. Weight: 5,000-14,000 lbs.\n    - African Forest Elephant: Smaller stature, oval-shaped ears, straighter tusks pointing downward.\n    - Asian Elephant: Smaller ears, convex back, one finger on trunk tip, twin domes on head. Weight: 4,000-11,000 lbs.\n    - Vocalizations: Trumpeting (alarm/excitement), rumbling (long-distance communication), roaring (distress).\n\n2.2 BIG CATS (Family Felidae):\n    - Lion (Panthera leo): Tawny coat, males have distinctive mane. Social, live in prides. Height: 3.5-4 ft at shoulder.\n    - Tiger (Panthera tigris): Orange coat with black stripes, white underbelly. Solitary hunters. Largest cat species.\n    - Leopard (Panthera pardus): Golden-yellow coat with rosette patterns. Excellent climbers, often cache prey in trees.\n    - Cheetah (Acinonyx jubatus): Spotted coat, black \"tear marks\" from eyes to mouth. Fastest land animal (70 mph).\n    - Vocalizations: Roaring (lions, tigers, leopards), chirping/purring (cheetahs cannot roar).\n\n2.3 BEARS (Family Ursidae):\n    - Brown Bear (Ursus arctos): Large shoulder hump, dish-shaped face, long claws. Includes grizzly subspecies.\n    - Black Bear (Ursus americanus): Straight facial profile, no shoulder hump, shorter claws. Most common North American bear.\n    - Polar Bear (Ursus maritimus): White fur, longer neck, smaller ears. Marine mammal adapted to Arctic conditions.\n    - Giant Panda (Ailuropoda melanoleuca): Black and white coloring, feeds almost exclusively on bamboo.\n    - Vocalizations: Roaring, growling, huffing, jaw-popping (threat displays).\n\n2.4 PRIMATES (Order Primates):\n    - Gorilla: Largest primate, silver-back males, knuckle-walking locomotion. Vocalizations include chest-beating, hooting.\n    - Chimpanzee: Highly intelligent, uses tools, complex social structures. 
Vocalizations: pant-hoots, screams.\n    - Orangutan: Red-orange fur, arboreal lifestyle, solitary. Long calls can travel over 1 km.\n    - Gibbon: Smaller apes, brachiation locomotion, distinctive whooping songs for territorial marking.\n\nSECTION 3: BIRD IDENTIFICATION PROTOCOLS\n\n3.1 RAPTORS (Order Accipitriformes/Falconiformes):\n    - Bald Eagle: White head and tail, yellow beak. Wingspan: 6-7.5 ft. Call: high-pitched chattering.\n    - Golden Eagle: Dark brown plumage, golden nape. Powerful hunters of small mammals.\n    - Peregrine Falcon: Blue-gray back, barred underparts. Fastest bird in dive (240+ mph).\n    - Red-tailed Hawk: Brown back, pale underparts, distinctive red tail. Most common North American hawk.\n\n3.2 PARROTS (Order Psittaciformes):\n    - Macaw: Large, colorful, long tail feathers. Powerful curved beaks. Highly social and vocal.\n    - African Grey: Gray plumage, red tail. Exceptional mimicry and cognitive abilities.\n    - Cockatoo: White or pink plumage, distinctive crest. Loud screeching vocalizations.\n\nSECTION 4: REPTILE IDENTIFICATION PROTOCOLS\n\n4.1 CROCODILIANS (Order Crocodilia):\n    - American Alligator: Broad, U-shaped snout, dark coloration. Freshwater habitats.\n    - Nile Crocodile: V-shaped snout, aggressive. Can reach 16-18 ft in length.\n    - Gharial: Extremely narrow snout, fish-eating specialist. Critically endangered.\n\n4.2 LARGE SNAKES (Families Pythonidae/Boidae):\n    - Reticulated Python: Longest snake species (up to 23 ft), complex geometric patterns.\n    - Green Anaconda: Heaviest snake species, olive-green with black spots. 
Semi-aquatic.\n    - King Cobra: Longest venomous snake (up to 18 ft), distinctive hood when threatened.\n\nSECTION 5: DATA COLLECTION AND ANALYSIS\n\n5.1 Visual Identification:\n    - Document body shape, size, coloration, and distinctive markings.\n    - Note behavioral characteristics and habitat context.\n    - Use standardized photography protocols for pattern matching.\n\n5.2 Audio Identification:\n    - Record vocalizations with frequency analysis equipment.\n    - Tag recordings with behavioral context (territorial, mating, alarm, social).\n    - Cross-reference with vocalization databases for species confirmation.\n\n5.3 Biometric Data:\n    - Record body measurements according to species-specific protocols.\n    - Document age indicators (teeth wear, plumage, etc.).\n    - Collect genetic samples when possible for lineage verification.\n\nYou are an AI Research Assistant for the WCRC. Your job is to analyze data inputs (text descriptions, images, and/or audio recordings) and identify the species based on the characteristics described in this manual.\nAnalyze all provided inputs and determine the most likely species identification.\n\"\"\"\n\n# Input Schemas\nclass TextInputSchema(BaseModel):\n    \"\"\"Schema for text-only input.\"\"\"\n    text: str = Field(description=\"Description of an animal\")\n    age: int = Field(description=\"The age of the animal in years\")\n\n\nclass ImageInputSchema(BaseModel):\n    \"\"\"Schema for image-only input.\"\"\"\n    image_file: ImageFilepath = Field(description=\"File path to an image of an animal\")\n    height: float = Field(description=\"The estimated height of the animal in cm\")\n\n\nclass AudioInputSchema(BaseModel):\n    \"\"\"Schema for audio-only input.\"\"\"\n    audio_file: AudioFilepath = Field(description=\"File path to an audio recording of an animal\")\n    year: float = Field(description=\"The year the recording was made\")\n\n\n# Union schemas for multi-modal inputs\nTextImageInputSchema = 
union_schemas([TextInputSchema, ImageInputSchema])\nTextAudioInputSchema = union_schemas([TextInputSchema, AudioInputSchema])\nImageAudioInputSchema = union_schemas([ImageInputSchema, AudioInputSchema])\nTextImageAudioInputSchema = union_schemas([TextInputSchema, ImageInputSchema, AudioInputSchema])\n\n\nclass AnimalOutputSchema(BaseModel):\n    \"\"\"Output schema for animal identification.\"\"\"\n    animal: str = Field(description=\"The animal in the input\")\n\n\n# Expected stats from provider testing (to be filled in after running capture_provider_stats.py)\n# Format: {(provider, modality): {\"first_request\": {...}, \"second_request\": {...}}}\nEXPECTED_STATS = {\n    # Anthropic - claude-sonnet-4-5-20250929\n    # Note: Anthropic doesn't separate image tokens from text tokens in usage stats\n    (\"anthropic\", \"text-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": 64,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 2065,\n            \"output_tokens\": 230\n        },\n        \"second_request\": {\n            \"input_text_tokens\": 64,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 2065,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 338,\n        },\n    },\n    (\"anthropic\", \"image-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": 247,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 2205,\n            \"output_tokens\": 472,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": 247,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 2205,\n            \"cache_creation_tokens\": 0,\n            
\"output_tokens\": 393,\n        },\n    },\n    # OpenAI - gpt-4o-2024-08-06\n    (\"openai\", \"text-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": 1856,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 131,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": 832,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 1024,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 88,\n        },\n    },\n    (\"openai\", \"image-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": 2220,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 85,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": 428,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 1792,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 75,\n        },\n    },\n    # Azure OpenAI - azure/gpt-4o-2024-08-06 (same model as OpenAI, same expected stats)\n    (\"azure\", \"text-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": 1856,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 131,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": 832,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 1024,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 88,\n        },\n    
},\n    (\"azure\", \"image-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": 2220,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 85,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": 428,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 1792,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 75,\n        },\n    },\n    # OpenAI Audio - gpt-4o-audio-preview\n    (\"openai-audio\", \"audio-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": 1974,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 31,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 100,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": 1974,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 31,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 166,\n        },\n    },\n    # Gemini - gemini-2.5-flash\n    (\"gemini\", \"text-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": 1923,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 61,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": 913,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 1010,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 74,\n        },\n    },\n    (\"gemini\", \"image-only\"): {\n        \"first_request\": {\n           
 \"input_text_tokens\": 2045,\n            \"input_image_tokens\": 258,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 91,\n        },\n        # NOTE: it seems that image token caching is fickle for Gemini, thus, we accept either the expected value (32) or 258 for input_image_tokens in the second request\n        \"second_request\": {\n            \"input_text_tokens\": 247,\n            \"input_image_tokens\": [32, 258],\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": 2024,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 104,\n        },\n    },\n    (\"gemini\", \"audio-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": 2040,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 100,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 125,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": 117,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": [6, 100],\n            \"cache_read_tokens\": 2017,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 125,\n        },\n    },\n    (\"gemini\", \"text-image-audio\"): {\n        \"first_request\": {\n            \"input_text_tokens\": 2262,\n            \"input_image_tokens\": 258,\n            \"input_audio_tokens\": 100,\n            \"cache_read_tokens\": 0,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 118,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": 516,\n            \"input_image_tokens\": [59, 258],\n            \"input_audio_tokens\": [23, 100],\n            \"cache_read_tokens\": 2022,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": 181,\n        },\n    },\n    # Vertex AI - 
gemini-2.5-flash\n    (\"vertex_ai\", \"text-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": None,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": None,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": None,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": None,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": None,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": None,\n        },\n    },\n    (\"vertex_ai\", \"image-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": None,\n            \"input_image_tokens\": None,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": None,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": None,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": None,\n            \"input_image_tokens\": None,\n            \"input_audio_tokens\": 0,\n            \"cache_read_tokens\": None,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": None,\n        },\n    },\n    (\"vertex_ai\", \"audio-only\"): {\n        \"first_request\": {\n            \"input_text_tokens\": None,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": None,\n            \"cache_read_tokens\": None,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": None,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": None,\n            \"input_image_tokens\": 0,\n            \"input_audio_tokens\": None,\n            \"cache_read_tokens\": None,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": None,\n        },\n    },\n    (\"vertex_ai\", \"text-image-audio\"): {\n        \"first_request\": {\n            
\"input_text_tokens\": None,\n            \"input_image_tokens\": None,\n            \"input_audio_tokens\": None,\n            \"cache_read_tokens\": None,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": None,\n        },\n        \"second_request\": {\n            \"input_text_tokens\": None,\n            \"input_image_tokens\": None,\n            \"input_audio_tokens\": None,\n            \"cache_read_tokens\": None,\n            \"cache_creation_tokens\": 0,\n            \"output_tokens\": None,\n        },\n    },\n}\n\n\ndef create_input_record(input_schema, modality: str):\n    \"\"\"Create an input DataRecord for the given schema and modality.\"\"\"\n    data_item = {}\n\n    # Add text fields if applicable\n    if \"text\" in modality or input_schema == TextInputSchema and hasattr(input_schema, \"model_fields\"):\n            if \"text\" in input_schema.model_fields:\n                data_item[\"text\"] = \"An elephant is a large gray animal with a trunk and big ears. 
It makes a trumpeting sound.\"\n            if \"age\" in input_schema.model_fields:\n                data_item[\"age\"] = 15\n\n    # Add image fields if applicable\n    if \"image\" in modality or input_schema == ImageInputSchema and hasattr(input_schema, \"model_fields\"):\n            if \"image_file\" in input_schema.model_fields:\n                data_item[\"image_file\"] = \"tests/pytest/data/elephant.png\"\n            if \"height\" in input_schema.model_fields:\n                data_item[\"height\"] = 304.5\n\n    # Add audio fields if applicable\n    if \"audio\" in modality or input_schema == AudioInputSchema and hasattr(input_schema, \"model_fields\"):\n            if \"audio_file\" in input_schema.model_fields:\n                data_item[\"audio_file\"] = \"tests/pytest/data/elephant.wav\"\n            if \"year\" in input_schema.model_fields:\n                data_item[\"year\"] = 2020\n\n    return DataRecord(input_schema(**data_item), source_indices=[0])\n\n\ndef get_model_for_provider(provider: str) -> Model:\n    \"\"\"Get the Model enum for a given provider.\"\"\"\n    if provider == \"anthropic\":\n        return Model.CLAUDE_4_5_SONNET\n    elif provider == \"openai\":\n        return Model.GPT_4o\n    elif provider == \"openai-audio\":\n        return Model.GPT_4o_AUDIO_PREVIEW\n    elif provider == \"azure\":\n        return Model.AZURE_GPT_4o\n    elif provider == \"gemini\":\n        return Model.GOOGLE_GEMINI_2_5_FLASH\n    elif provider == \"vertex_ai\":\n        return Model.GEMINI_2_5_FLASH\n    else:\n        raise ValueError(f\"Unknown provider: {provider}\")\n\n\ndef get_input_schema_for_modality(modality: str):\n    \"\"\"Get the input schema class for a given modality.\"\"\"\n    schema_map = {\n        \"text-only\": TextInputSchema,\n        \"image-only\": ImageInputSchema,\n        \"audio-only\": AudioInputSchema,\n        \"text-image\": TextImageInputSchema,\n        \"text-audio\": TextAudioInputSchema,\n        
\"image-audio\": ImageAudioInputSchema,\n        \"text-image-audio\": TextImageAudioInputSchema,\n    }\n    return schema_map[modality]\n\n\n# =============================================================================\n# PROVIDER CONFIGURATION\n# =============================================================================\nPROVIDER_CONFIG = {\n    \"anthropic\": {\n        \"model\": Model.CLAUDE_4_5_SONNET,\n        \"supported_modalities\": [\"text-only\", \"image-only\"],\n        \"api_key_env\": \"ANTHROPIC_API_KEY\",\n    },\n    \"openai\": {\n        \"model\": Model.GPT_4o,\n        \"supported_modalities\": [\"text-only\", \"image-only\"],\n        \"api_key_env\": \"OPENAI_API_KEY\",\n    },\n    \"openai-audio\": {\n        \"model\": Model.GPT_4o_AUDIO_PREVIEW,\n        \"supported_modalities\": [\"audio-only\"],\n        \"api_key_env\": \"OPENAI_API_KEY\",\n    },\n    \"gemini\": {\n        \"model\": Model.GOOGLE_GEMINI_2_5_FLASH,\n        \"supported_modalities\": [\"text-only\", \"image-only\", \"audio-only\", \"text-image-audio\"],\n        \"api_key_env\": [\"GOOGLE_API_KEY\", \"GEMINI_API_KEY\"],\n    },\n    \"vertex_ai\": {\n        \"model\": Model.GEMINI_2_5_FLASH,\n        \"supported_modalities\": [\"text-only\", \"image-only\", \"audio-only\", \"text-image-audio\"],\n        \"api_key_env\": [\"GOOGLE_APPLICATION_CREDENTIALS\", \"VERTEX_PROJECT\"],\n    },\n    \"azure\": {\n        \"model\": Model.AZURE_GPT_4o,\n        \"supported_modalities\": [\"text-only\", \"image-only\"],\n        \"api_key_env\": [\"AZURE_API_KEY\", \"AZURE_OPENAI_API_KEY\"],\n    },\n}\n\nALL_MODALITIES = [\"text-only\", \"image-only\", \"audio-only\", \"text-image-audio\"]\nALL_PROVIDERS = [\"anthropic\", \"openai\", \"openai-audio\", \"gemini\", \"vertex_ai\", \"azure\"]\n\nCACHE_WAIT_SECONDS = 10\n\n\ndef check_api_key(provider: str) -> bool:\n    \"\"\"Check if the API key for a provider is present.\"\"\"\n    config = PROVIDER_CONFIG[provider]\n    
api_key_env = config[\"api_key_env\"]\n    if isinstance(api_key_env, list):\n        return any(os.getenv(key) is not None for key in api_key_env)\n    return os.getenv(api_key_env) is not None\n\n\ndef is_modality_supported(provider: str, modality: str) -> bool:\n    \"\"\"Check if a modality is supported by a provider.\"\"\"\n    return modality in PROVIDER_CONFIG[provider][\"supported_modalities\"]\n\n\ndef within_tolerance(actual: int, expected: int, tolerance: float = 0.05) -> bool:\n    \"\"\"Check if actual value is within tolerance of expected value.\"\"\"\n    if expected == 0:\n        return actual == 0\n    margin = max(1, int(math.ceil(expected * tolerance)))  # At least 1 token margin\n    return abs(actual - expected) <= margin\n\n\ndef assert_stats_match(gen_stats, expected: dict, request_name: str, provider: str = \"\", tolerance: float = 0.05):\n    \"\"\"Assert that generation stats match expected values within tolerance.\n\n    For providers with implicit caching (OpenAI, Gemini), cache hits are non-deterministic.\n    So for cache_read_tokens we accept anywhere in [0, expected+5%],\n    and input_text_tokens is validated as: total logical input ≈ input_text + cache_read.\n    \"\"\"\n    has_implicit_caching = provider.startswith(\"openai\") or provider.startswith(\"gemini\") or provider == \"azure\"\n\n    if has_implicit_caching and expected.get(\"cache_read_tokens\") is not None and expected[\"cache_read_tokens\"] > 0:\n        # Implicit caching (OpenAI/Gemini): cache hit is non-deterministic, accept 0..expected+tolerance\n        expected_cache = expected[\"cache_read_tokens\"]\n        cache_upper = expected_cache + max(1, int(expected_cache * tolerance))\n        assert 0 <= gen_stats.cache_read_tokens <= cache_upper, \\\n            f\"{request_name} cache_read_tokens out of range: got {gen_stats.cache_read_tokens}, expected 0..{cache_upper}\"\n    else:\n        if expected.get(\"input_text_tokens\") is not None:\n            assert 
within_tolerance(gen_stats.input_text_tokens, expected[\"input_text_tokens\"], tolerance), \\\n                f\"{request_name} input_text_tokens mismatch: got {gen_stats.input_text_tokens}, expected {expected['input_text_tokens']} (±{tolerance*100}%)\"\n\n        if expected.get(\"cache_read_tokens\") is not None:\n            assert within_tolerance(gen_stats.cache_read_tokens, expected[\"cache_read_tokens\"], tolerance), \\\n                f\"{request_name} cache_read_tokens mismatch: got {gen_stats.cache_read_tokens}, expected {expected['cache_read_tokens']} (±{tolerance*100}%)\"\n\n    if expected.get(\"input_image_tokens\") is not None:\n        if isinstance(expected[\"input_image_tokens\"], list):\n            # If expected input_image_tokens is a list, accept any value in the list\n            assert any(within_tolerance(gen_stats.input_image_tokens, expected_input_image_tokens, tolerance) for expected_input_image_tokens in expected[\"input_image_tokens\"]), \\\n                f\"{request_name} input_image_tokens mismatch: got {gen_stats.input_image_tokens}, expected one of {expected['input_image_tokens']}\"\n        else:\n            assert within_tolerance(gen_stats.input_image_tokens, expected[\"input_image_tokens\"], tolerance), \\\n                f\"{request_name} input_image_tokens mismatch: got {gen_stats.input_image_tokens}, expected {expected['input_image_tokens']} (±{tolerance*100}%)\"\n\n    if expected.get(\"input_audio_tokens\") is not None:\n        if isinstance(expected[\"input_audio_tokens\"], list):\n            # If expected input_audio_tokens is a list, accept any value in the list\n            assert any(within_tolerance(gen_stats.input_audio_tokens, expected_input_audio_tokens, tolerance) for expected_input_audio_tokens in expected[\"input_audio_tokens\"]), \\\n                f\"{request_name} input_audio_tokens mismatch: got {gen_stats.input_audio_tokens}, expected one of {expected['input_audio_tokens']}\"\n        else:\n      
      assert within_tolerance(gen_stats.input_audio_tokens, expected[\"input_audio_tokens\"], tolerance), \\\n                f\"{request_name} input_audio_tokens mismatch: got {gen_stats.input_audio_tokens}, expected {expected['input_audio_tokens']} (±{tolerance*100}%)\"\n\n    if expected.get(\"cache_creation_tokens\") is not None:\n        assert within_tolerance(gen_stats.cache_creation_tokens, expected[\"cache_creation_tokens\"], tolerance), \\\n            f\"{request_name} cache_creation_tokens mismatch: got {gen_stats.cache_creation_tokens}, expected {expected['cache_creation_tokens']} (±{tolerance*100}%)\"\n\n    # Verify total input token invariant across all providers:\n    # input_text + input_image + input_audio + cache_read + cache_creation ≈ expected total\n    actual_total = (\n        gen_stats.input_text_tokens + gen_stats.input_image_tokens + gen_stats.input_audio_tokens\n        + gen_stats.cache_read_tokens + gen_stats.cache_creation_tokens\n    )\n    expected_total = 0\n    try:\n        for field in [\"input_text_tokens\", \"input_image_tokens\", \"input_audio_tokens\", \"cache_read_tokens\", \"cache_creation_tokens\"]:\n            if expected.get(field) is not None:\n                if isinstance(expected[field], list):\n                    expected_total += expected[field][0]\n                else:\n                    expected_total += expected[field]\n        if expected_total > 0:\n            assert within_tolerance(actual_total, expected_total, tolerance), \\\n                f\"{request_name} total input tokens mismatch: got {actual_total}, expected {expected_total} (±{tolerance*100}%)\"\n    except AssertionError:\n        # Reset the running total before retrying with the alternate (second) list values;\n        # otherwise the fallback total is added on top of the first attempt's total.\n        expected_total = 0\n        for field in [\"input_text_tokens\", \"input_image_tokens\", \"input_audio_tokens\", \"cache_read_tokens\", \"cache_creation_tokens\"]:\n            if expected.get(field) is not None:\n                if isinstance(expected[field], list):\n                    expected_total += expected[field][1]\n                
else:\n                    expected_total += expected[field]\n        if expected_total > 0:\n            assert within_tolerance(actual_total, expected_total, tolerance), \\\n                f\"{request_name} total input tokens mismatch: got {actual_total}, expected {expected_total} (±{tolerance*100}%)\"\n\n    assert gen_stats.output_text_tokens > 0, f\"{request_name} output_text_tokens should be positive\"\n    assert gen_stats.cost_per_record > 0, f\"{request_name} cost_per_record should be positive\"\n\n\n@pytest.mark.parametrize(\n    \"provider,modality\",\n    [(p, m) for p in ALL_PROVIDERS for m in ALL_MODALITIES],\n    ids=[f\"{p}-{m}\" for p in ALL_PROVIDERS for m in ALL_MODALITIES],\n)\ndef test_generator_stats(provider, modality):\n    \"\"\"Test Generator stats tracking for all provider/modality combinations.\n\n    Makes two requests:\n    1. First request (no cache) - should show cache_creation_tokens for providers that support it\n    2. Second request (with cache) - should show cache_read_tokens after waiting for cache availability\n    \"\"\"\n    # Skip if modality not supported by provider\n    if not is_modality_supported(provider, modality):\n        pytest.skip(f\"Modality {modality} not supported by {provider}\")\n\n    # Skip if API key not present\n    if not check_api_key(provider):\n        config = PROVIDER_CONFIG[provider]\n        pytest.skip(f\"API key not present: {config['api_key_env']}\")\n\n    # Get model and create input\n    model = PROVIDER_CONFIG[provider][\"model\"]\n    input_schema = get_input_schema_for_modality(modality)\n    input_record = create_input_record(input_schema, modality)\n    session_id = generate_session_id()\n\n    # Create generator\n    generator = Generator(model, PromptStrategy.MAP, None, desc=STATIC_CONTEXT)\n\n    # Get expected stats\n    expected = EXPECTED_STATS.get((provider, modality), {})\n\n    # First request (no cache)\n    output1, _, gen_stats1, _ = generator(\n        input_record,\n    
    AnimalOutputSchema.model_fields,\n        output_schema=AnimalOutputSchema,\n        cache_isolation_id=session_id,\n    )\n\n    # Verify first request output\n    assert output1 is not None\n    assert \"animal\" in output1\n\n    # Assert first request stats\n    if \"first_request\" in expected:\n        assert_stats_match(gen_stats1, expected[\"first_request\"], \"first_request\", provider=provider)\n\n    # Wait for cache to be available\n    time.sleep(CACHE_WAIT_SECONDS)\n\n    # Second request (should use cache)\n    output2, _, gen_stats2, _ = generator(\n        input_record,\n        AnimalOutputSchema.model_fields,\n        output_schema=AnimalOutputSchema,\n        cache_isolation_id=session_id,\n    )\n\n    # Verify second request output\n    assert output2 is not None\n    assert \"animal\" in output2\n\n    # Assert second request stats\n    if \"second_request\" in expected:\n        assert_stats_match(gen_stats2, expected[\"second_request\"], \"second_request\", provider=provider)\n"
  },
  {
    "path": "tests/pytest/test_iter_dataset.py",
    "content": "import os\nfrom copy import deepcopy\n\nimport pandas as pd\nimport pytest\n\nfrom palimpzest.core.data.iter_dataset import (\n    HTMLFileDataset,\n    ImageFileDataset,\n    MemoryDataset,\n    TextFileDataset,\n)\nfrom palimpzest.core.lib.schemas import TextFile, WebPage\n\n\n@pytest.fixture\ndef temp_text_file():\n    file_path = \"testdata/tmp_test.txt\"\n    with open(file_path, \"w\") as f:\n        f.write(\"Hello, World!\")\n    yield file_path\n    os.remove(file_path)\n\n@pytest.fixture\ndef temp_text_dir():\n    dir_path = \"testdata/text_dir\"\n    os.makedirs(dir_path, exist_ok=True)\n    with open(dir_path + \"/file1.txt\", \"w\") as f:\n        f.write(\"Content 1\")\n    with open(dir_path + \"/file2.txt\", \"w\") as f:\n        f.write(\"Content 2\")\n    yield dir_path\n    os.remove(\"testdata/text_dir/file1.txt\")\n    os.remove(\"testdata/text_dir/file2.txt\")\n    os.rmdir(dir_path)\n\n@pytest.fixture\ndef list_values():\n    return [1, 2, 3, 4]\n\n@pytest.fixture\ndef df_values():\n    return pd.DataFrame({\"a\": [10, 20, 30, 40], \"b\": [50, 60, 70, 80]})\n\n\ndef test_text_dataset(temp_text_dir):\n    dataset = TextFileDataset(id=\"test\", path=temp_text_dir)\n    assert len(dataset) == 2\n    assert dataset.schema == TextFile\n    \n    record = dataset[0]\n    assert isinstance(record, dict)\n    assert record[\"contents\"] == \"Content 1\"\n    \n    record = dataset[1]\n    assert record[\"contents\"] == \"Content 2\"\n\ndef test_memory_dataset_list(list_values):\n    dataset = MemoryDataset(id=\"test\", vals=list_values)\n    assert len(dataset) == len(list_values)\n    \n    record = dataset[0]\n    assert record[\"value\"] == list_values[0]\n    record = dataset[3]\n    assert record[\"value\"] == list_values[3]\n    copied = deepcopy(dataset)\n    assert copied.vals == dataset.vals\n\ndef test_memory_dataset_df(df_values):\n    dataset = MemoryDataset(id=\"test\", vals=df_values)\n    assert len(dataset) == 
len(df_values)\n    \n    record = dataset[0]\n    assert record[\"a\"] == df_values.iloc[0]['a']\n    assert record[\"b\"] == df_values.iloc[0]['b']\n\n    copied = deepcopy(dataset)\n    assert copied.vals.equals(dataset.vals)\n\n\ndef test_memory_dataset_copy():\n    values = [1, 2, 3]\n    dataset = MemoryDataset(id=\"test\", vals=values)\n    copied = deepcopy(dataset)\n    \n    assert copied.vals == dataset.vals\n\n@pytest.fixture\ndef temp_html_dir(tmp_path):\n    dir_path = tmp_path / \"html_files\"\n    dir_path.mkdir()\n    html_content = \"\"\"\n    <html>\n        <body>\n            <a href=\"http://example.com\">Example Link</a>\n            <p>Some text</p>\n        </body>\n    </html>\n    \"\"\"\n    (dir_path / \"page1.html\").write_text(html_content)\n    return str(dir_path)\n\ndef test_html_dataset(temp_html_dir):\n    dataset = HTMLFileDataset(id=\"test\", path=temp_html_dir)\n    assert len(dataset) == 1\n    assert dataset.schema == WebPage\n    \n    record = dataset[0]\n    assert isinstance(record, dict)\n    assert \"Example Link (http://example.com)\" in record[\"text\"]\n    assert \"<html>\" in record[\"html\"]\n\ndef test_invalid_directory():\n    with pytest.raises(AssertionError):\n        ImageFileDataset(id=\"test\", path=\"/nonexistent/path\")\n"
  },
  {
    "path": "tests/pytest/test_join.py",
    "content": "\"\"\"This script contains tests for physical operators for join.\"\"\"\n\nimport os\n\nimport pytest\nfrom pydantic import BaseModel, Field\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.lib.schemas import AudioFilepath, ImageFilepath, union_schemas\nfrom palimpzest.core.models import GenerationStats\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.operators.join import EmbeddingJoin, NestedLoopsJoin\n\nif not os.environ.get(\"OPENAI_API_KEY\"):\n    from palimpzest.utils.env_helpers import load_env\n\n    load_env()\n\n\nclass TextInputSchema(BaseModel):\n    text: str = Field(description=\"Description of an animal\")\n    age: int = Field(description=\"The age of the animal in years\")\n\nclass ImageInputSchema(BaseModel):\n    image_file: ImageFilepath = Field(description=\"File path to an image of an animal\")\n    height: float = Field(description=\"The estimated height of the animal in cm\")\n\nclass AudioInputSchema(BaseModel):\n    audio_file: AudioFilepath = Field(description=\"File path to an audio recording of an animal\")\n    year: float = Field(description=\"The year the recording was made\")\n\nTextImageInputSchema = union_schemas([TextInputSchema, ImageInputSchema])\nTextAudioInputSchema = union_schemas([TextInputSchema, AudioInputSchema])\nImageAudioInputSchema = union_schemas([ImageInputSchema, AudioInputSchema])\nTextImageAudioInputSchema = union_schemas([TextInputSchema, ImageInputSchema, AudioInputSchema])\n\ndef create_input_record(schema: type[BaseModel]) -> DataRecord:\n    data_item = {}\n    if all(field in schema.model_fields for field in TextInputSchema.model_fields):\n        data_item['text'] = \"An elephant is a large gray animal with a trunk and big ears.\"\n        data_item['age'] = 3\n    if all(field in schema.model_fields for field in ImageInputSchema.model_fields):\n        data_item['image_file'] = 
\"tests/pytest/data/elephant.png\"\n        data_item['height'] = 304.5\n    if all(field in schema.model_fields for field in AudioInputSchema.model_fields):\n        data_item['audio_file'] = \"tests/pytest/data/elephant.wav\"\n        data_item['year'] = 2020\n    input_record = DataRecord(schema(**data_item), source_indices=[0])\n\n    return input_record\n\ndef mock_generator_call(candidate, fields, right_candidate=None, json_output=True, **kwargs):\n    field_answers = {\"passed_operator\": True}\n    reasoning = \"The input matches that of an elephant.\"\n    generation_stats = GenerationStats(cost_per_record=1.0, time_per_record=1.0, num_input_tokens=10, num_output_tokens=10)\n    messages = []\n    return field_answers, reasoning, generation_stats, messages\n\ndef embedding_join_mock_generator_call(candidate, fields, right_candidate=None, json_output=True, **kwargs):\n    field_answers = {\"passed_operator\": candidate['text'] == right_candidate['text']}\n    reasoning = \"The input matches that of an elephant.\"\n    generation_stats = GenerationStats(cost_per_record=1.0, time_per_record=1.0, num_input_tokens=10, num_output_tokens=10)\n    messages = []\n    return field_answers, reasoning, generation_stats, messages\n\n@pytest.mark.parametrize(\n    \"left_input_schema\",\n    [TextInputSchema, ImageInputSchema, AudioInputSchema, TextImageInputSchema, TextAudioInputSchema, ImageAudioInputSchema, TextImageAudioInputSchema],\n    ids=[\"text-only\", \"image-only\", \"audio-only\", \"text-image\", \"text-audio\", \"image-audio\", \"text-image-audio\"],\n)\n@pytest.mark.parametrize(\n    \"right_input_schema\",\n    [TextInputSchema, ImageInputSchema, AudioInputSchema, TextImageInputSchema, TextAudioInputSchema, ImageAudioInputSchema, TextImageAudioInputSchema],\n    ids=[\"text-only\", \"image-only\", \"audio-only\", \"text-image\", \"text-audio\", \"image-audio\", \"text-image-audio\"],\n)\n@pytest.mark.parametrize(\n    \"physical_op_class\",\n    
[NestedLoopsJoin, EmbeddingJoin],\n    ids=[\"nested-loops-join\", \"embedding-join\"],\n)\ndef test_join(mocker, left_input_schema, right_input_schema, physical_op_class, embedding_text_only_model):\n    \"\"\"Test join operators on simple input\"\"\"\n    # EmbeddingJoin does not support audio input currently\n    left_has_audio = any(field in left_input_schema.model_fields for field in AudioInputSchema.model_fields)\n    right_has_audio = any(field in right_input_schema.model_fields for field in AudioInputSchema.model_fields)\n    if physical_op_class in [EmbeddingJoin] and (left_has_audio or right_has_audio):\n        pytest.skip(f\"{physical_op_class} does not support audio input currently\")\n\n    audio_input_schemas = [AudioInputSchema, TextAudioInputSchema, ImageAudioInputSchema, TextImageAudioInputSchema]\n    if os.getenv(\"NO_GEMINI\") and (left_input_schema in audio_input_schemas or right_input_schema in audio_input_schemas):\n        pytest.skip(\"Skipping multi-modal audio tests on CI which does not have access to gemini models\")\n\n    # construct the kwargs for the physical operator\n    input_schema = union_schemas([left_input_schema, right_input_schema])\n    if left_input_schema == right_input_schema and right_input_schema in [TextInputSchema]:\n        embedding_model = embedding_text_only_model\n    else:\n        embedding_model = Model.CLIP_VIT_B_32\n\n    physical_op_kwargs = {\n        \"input_schema\": input_schema,\n        \"output_schema\": input_schema,\n        \"condition\": \"Do the two inputs describe the same type of animal?\",\n        \"logical_op_id\": \"test-join\",\n        \"model\": Model.GPT_5_MINI if os.getenv(\"NO_GEMINI\") else Model.GEMINI_2_5_FLASH,\n    }\n    if physical_op_class == EmbeddingJoin:\n        physical_op_kwargs[\"embedding_model\"] = embedding_model\n        physical_op_kwargs[\"num_samples\"] = 10\n\n    # create join operator\n    join_op = 
physical_op_class(**physical_op_kwargs)\n\n    # create left and right input records\n    left_input_record = create_input_record(left_input_schema)\n    right_input_record = create_input_record(right_input_schema)\n\n    # only execute LLM calls if specified\n    if not os.getenv(\"RUN_LLM_TESTS\"):\n        mocker.patch.object(Generator, \"__call__\", side_effect=mock_generator_call)\n\n    # apply join operator to the inputs\n    data_record_set, num_inputs_processed = join_op([left_input_record], [right_input_record])\n\n    # check for single output record with expected fields\n    assert len(data_record_set) == 1\n    assert num_inputs_processed == 1\n    output_record = data_record_set[0]\n\n    assert sorted(output_record.schema.model_fields) == sorted(input_schema.model_fields)\n    assert output_record._passed_operator\n\ndef test_embedding_join(mocker, embedding_text_only_model):\n    \"\"\"Test EmbeddingJoin operator on simple text input\"\"\"\n    left_candidates = []\n    for left_idx, animal in enumerate([\"elephant\", \"lion\", \"lion\", \"bear\"]):\n        data_item = {\"text\": f\"This text describes a {animal}.\", \"age\": left_idx + 1}\n        left_input_record = DataRecord(TextInputSchema(**data_item), source_indices=[left_idx])\n        left_candidates.append(left_input_record)\n\n    right_candidates = []\n    for right_idx, animal in enumerate([\"elephant\", \"giraffe\", \"lion\", \"zebra\"]):\n        data_item = {\"text\": f\"This text describes a {animal}.\", \"age\": right_idx + 2}\n        right_input_record = DataRecord(TextInputSchema(**data_item), source_indices=[right_idx])\n        right_candidates.append(right_input_record)\n\n    # construct the kwargs for the physical operator\n    input_schema = union_schemas([TextInputSchema, TextInputSchema])\n    physical_op_kwargs = {\n        \"input_schema\": input_schema,\n        \"output_schema\": input_schema,\n        \"condition\": \"Do the two inputs describe the same type of animal?\",\n   
     \"logical_op_id\": \"test-join\",\n        \"model\": Model.GPT_5_MINI,\n        \"embedding_model\": embedding_text_only_model,\n        \"num_samples\": 8,\n    }\n\n    # create join operator\n    join_op = EmbeddingJoin(**physical_op_kwargs)\n\n    # only execute LLM calls if specified\n    if not os.getenv(\"RUN_LLM_TESTS\"):\n        mock_call = mocker.patch.object(Generator, \"__call__\", side_effect=embedding_join_mock_generator_call)\n\n    # apply join operator to the inputs\n    data_record_set, num_inputs_processed = join_op(left_candidates, right_candidates)\n\n    # check that the mock was called at least 8 times (num_samples) and less than or equal to 16 times (all possible pairs)\n    # This depends on the embedding of all the candidates, and so on the embedding model used\n    if not os.getenv(\"RUN_LLM_TESTS\"):\n        assert mock_call.call_count >= 8\n        assert mock_call.call_count <= len(left_candidates) * len(right_candidates)\n\n\n    # sanity checks on output records and stats\n    records = data_record_set.data_records\n    record_op_stats_lst = data_record_set.record_op_stats\n    assert len(record_op_stats_lst) == 16\n    assert num_inputs_processed == 16\n    for output_record in records:\n        assert sorted(output_record.schema.model_fields) == sorted(input_schema.model_fields)\n\n    # check that all output record stats have embedding stats\n    # embedding cost could be 0.0 if the embedding model is mocked\n    assert all(stats.cost_per_record >= 0.0 for stats in record_op_stats_lst)\n    assert sum(record._passed_operator for record in records) == 3\n"
  },
  {
    "path": "tests/pytest/test_map.py",
    "content": "\"\"\"This script contains tests for physical operators for map.\"\"\"\n\nimport os\n\nimport pytest\nfrom pydantic import BaseModel, Field\n\nfrom palimpzest.constants import Model\nfrom palimpzest.core.elements.records import DataRecord\nfrom palimpzest.core.lib.schemas import AudioFilepath, ImageFilepath, union_schemas\nfrom palimpzest.core.models import GenerationStats\nfrom palimpzest.query.generators.generators import Generator\nfrom palimpzest.query.operators.convert import LLMConvertBonded\nfrom palimpzest.query.operators.critique_and_refine import CritiqueAndRefineConvert\nfrom palimpzest.query.operators.mixture_of_agents import MixtureOfAgentsConvert\nfrom palimpzest.query.operators.rag import RAGConvert\nfrom palimpzest.query.operators.split import SplitConvert\n\nif not os.environ.get(\"OPENAI_API_KEY\"):\n    from palimpzest.utils.env_helpers import load_env\n\n    load_env()\n\n\nclass TextInputSchema(BaseModel):\n    text: str = Field(description=\"Description of an animal\")\n    age: int = Field(description=\"The age of the animal in years\")\n\nclass ImageInputSchema(BaseModel):\n    image_file: ImageFilepath = Field(description=\"File path to an image of an animal\")\n    height: float = Field(description=\"The estimated height of the animal in cm\")\n\nclass AudioInputSchema(BaseModel):\n    audio_file: AudioFilepath = Field(description=\"File path to an audio recording of an animal\")\n    year: float = Field(description=\"The year the recording was made\")\n\nTextImageInputSchema = union_schemas([TextInputSchema, ImageInputSchema])\nTextAudioInputSchema = union_schemas([TextInputSchema, AudioInputSchema])\nImageAudioInputSchema = union_schemas([ImageInputSchema, AudioInputSchema])\nTextImageAudioInputSchema = union_schemas([TextInputSchema, ImageInputSchema, AudioInputSchema])\n\nclass OutputSchema(BaseModel):\n    animal: str = Field(description=\"The animal in the input\")\n\ndef mock_generator_call(candidate, fields, 
right_candidate=None, json_output=True, **kwargs):\n    field_answers = {\"animal\": [\"elephant\"]}\n    reasoning = \"The input matches that of an elephant.\"\n    generation_stats = GenerationStats(cost_per_record=1.0, time_per_record=1.0, num_input_tokens=10, num_output_tokens=10)\n    messages = []\n    return field_answers, reasoning, generation_stats, messages\n\n\n@pytest.mark.parametrize(\n    \"input_schema\",\n    [TextInputSchema, ImageInputSchema, AudioInputSchema, TextImageInputSchema, TextAudioInputSchema, ImageAudioInputSchema, TextImageAudioInputSchema],\n    ids=[\"text-only\", \"image-only\", \"audio-only\", \"text-image\", \"text-audio\", \"image-audio\", \"text-image-audio\"],\n)\n@pytest.mark.parametrize(\n    \"physical_op_class\",\n    [LLMConvertBonded, RAGConvert, SplitConvert, CritiqueAndRefineConvert, MixtureOfAgentsConvert],\n    ids=[\"llm-convert-bonded\", \"rag-convert\", \"split-convert\", \"critique-and-refine-convert\", \"mixture-of-agents-convert\"],\n)\ndef test_map(mocker, input_schema, physical_op_class, embedding_text_only_model):\n    \"\"\"Test map operators on simple input\"\"\"\n    # RAGConvert and SplitConvert only support text input currently\n    if physical_op_class in [RAGConvert, SplitConvert] and input_schema != TextInputSchema:\n        pytest.skip(f\"{physical_op_class} only supports text input currently\")\n\n    if os.getenv(\"NO_GEMINI\") and input_schema in [AudioInputSchema, TextAudioInputSchema, ImageAudioInputSchema, TextImageAudioInputSchema]:\n        pytest.skip(\"Skipping audio tests on CI which does not have access to gemini models\")\n\n    model = Model.GPT_5_MINI if os.getenv(\"NO_GEMINI\") else Model.GEMINI_2_5_FLASH\n    proposer_models = [Model.GPT_5, Model.GPT_5_NANO] if os.getenv(\"NO_GEMINI\") else [Model.GEMINI_2_5_PRO, Model.GEMINI_2_0_FLASH]\n    critic_model = Model.GPT_5_NANO if os.getenv(\"NO_GEMINI\") else Model.GEMINI_2_0_FLASH\n    refine_model = Model.GPT_5 if 
os.getenv(\"NO_GEMINI\") else Model.GEMINI_2_5_PRO\n\n    # construct the kwargs for the physical operator\n    physical_op_kwargs = {\"input_schema\": input_schema, \"output_schema\": OutputSchema, \"logical_op_id\": \"test-map\"}\n    if physical_op_class is LLMConvertBonded:\n        physical_op_kwargs[\"model\"] = model\n    elif physical_op_class is RAGConvert:\n        physical_op_kwargs[\"model\"] = model\n        physical_op_kwargs[\"embedding_model\"] = embedding_text_only_model\n        physical_op_kwargs[\"num_chunks_per_field\"] = 1\n        physical_op_kwargs[\"chunk_size\"] = 1000\n    elif physical_op_class is SplitConvert:\n        physical_op_kwargs[\"model\"] = model\n        physical_op_kwargs[\"num_chunks\"] = 2\n        physical_op_kwargs[\"min_size_to_chunk\"] = 1000\n    elif physical_op_class is MixtureOfAgentsConvert:\n        physical_op_kwargs[\"proposer_models\"] = proposer_models\n        physical_op_kwargs[\"temperatures\"] = [0.8, 0.8]\n        physical_op_kwargs[\"aggregator_model\"] = model\n    elif physical_op_class is CritiqueAndRefineConvert:\n        physical_op_kwargs[\"model\"] = model\n        physical_op_kwargs[\"critic_model\"] = critic_model\n        physical_op_kwargs[\"refine_model\"] = refine_model\n\n    # create map operator\n    map_op = physical_op_class(**physical_op_kwargs)\n\n    # create input record\n    data_item = {}\n    if all(field in input_schema.model_fields for field in TextInputSchema.model_fields):\n        data_item['text'] = \"An elephant is a large gray animal with a trunk and big ears.\"\n        data_item['age'] = 3\n    if all(field in input_schema.model_fields for field in ImageInputSchema.model_fields):\n        data_item['image_file'] = \"tests/pytest/data/elephant.png\"\n        data_item['height'] = 304.5\n    if all(field in input_schema.model_fields for field in AudioInputSchema.model_fields):\n        data_item['audio_file'] = \"tests/pytest/data/elephant.wav\"\n        
data_item['year'] = 2020\n    input_record = DataRecord(input_schema(**data_item), source_indices=[0])\n\n    # only execute LLM calls if specified\n    if not os.getenv(\"RUN_LLM_TESTS\"):\n        mocker.patch.object(Generator, \"__call__\", side_effect=mock_generator_call)\n\n    # apply map operator to the input\n    data_record_set = map_op(input_record)\n\n    # check for single output record with expected fields\n    assert len(data_record_set) == 1\n    output_record = data_record_set[0]\n\n    assert sorted(output_record.schema.model_fields) == sorted(union_schemas([input_schema, OutputSchema]).model_fields)\n    assert hasattr(output_record, \"animal\")\n    assert output_record.animal.lower() == \"elephant\"\n"
  },
  {
    "path": "tests/pytest/test_optimizer.py",
    "content": "import os\nimport time\n\nimport pytest\n\nfrom palimpzest.constants import Cardinality, Model\nfrom palimpzest.core.elements.filters import Filter\nfrom palimpzest.core.lib.schemas import TextFile\nfrom palimpzest.core.models import OperatorCostEstimates, PlanCost\nfrom palimpzest.policy import MaxQuality, MinCost, MinTime\nfrom palimpzest.query.operators.convert import LLMConvert, LLMConvertBonded\nfrom palimpzest.query.operators.filter import LLMFilter, NonLLMFilter\nfrom palimpzest.query.operators.logical import ConvertScan, FilteredScan\nfrom palimpzest.query.operators.physical import PhysicalOperator\nfrom palimpzest.query.operators.scan import MarshalAndScanDataOp, ScanPhysicalOp\nfrom palimpzest.query.optimizer.cost_model import SampleBasedCostModel\nfrom palimpzest.query.optimizer.optimizer import Optimizer\nfrom palimpzest.query.optimizer.optimizer_strategy_type import OptimizationStrategyType\nfrom palimpzest.query.optimizer.primitives import Group, LogicalExpression\n\n\nclass TestPrimitives:\n    def test_group_id_equality(self, email_schema):\n        filter1_op = FilteredScan(\n            input_schema=TextFile,\n            output_schema=TextFile,\n            filter=Filter(\"filter1\"),\n            depends_on=[],\n        )\n        LogicalExpression(\n            operator=filter1_op,\n            input_group_ids=[0],\n            input_fields={\"contents\": TextFile.model_fields[\"contents\"]},\n            depends_on_field_names=set([\"contents\"]),\n            generated_fields={},\n            group_id=None,\n        )\n        filter2_op = FilteredScan(\n            input_schema=TextFile,\n            output_schema=TextFile,\n            filter=Filter(\"filter2\"),\n            depends_on=[],\n        )\n        filter2_expr = LogicalExpression(\n            operator=filter2_op,\n            input_group_ids=[1],\n            input_fields={\"contents\": TextFile.model_fields[\"contents\"]},\n            
depends_on_field_names=set([\"contents\"]),\n            generated_fields={},\n            group_id=None,\n        )\n        convert_op = ConvertScan(\n            input_schema=TextFile,\n            output_schema=email_schema,\n            cardinality=Cardinality.ONE_TO_ONE,\n            depends_on=[],\n        )\n        convert_expr = LogicalExpression(\n            operator=convert_op,\n            input_group_ids=[2],\n            input_fields={\"contents\": TextFile.model_fields[\"contents\"]},\n            depends_on_field_names=set([\"contents\"]),\n            generated_fields={\n                \"sender\": email_schema.model_fields[\"sender\"],\n                \"subject\": email_schema.model_fields[\"subject\"],\n            },\n            group_id=None,\n        )\n        g1_properties = {\n            \"filter_strs\": set([filter1_op.filter.get_filter_str(), filter2_op.filter.get_filter_str()]),\n        }\n        g1 = Group(\n            logical_expressions=[convert_expr],\n            fields={\n                \"sender\": email_schema.model_fields[\"sender\"],\n                \"subject\": email_schema.model_fields[\"subject\"],\n                \"contents\": TextFile.model_fields[\"contents\"],\n                \"filename\": TextFile.model_fields[\"filename\"],\n            },\n            properties=g1_properties,\n        )\n        g2_properties = {\n            \"filter_strs\": set([filter2_op.filter.get_filter_str(), filter1_op.filter.get_filter_str()]),\n        }\n        g2 = Group(\n            logical_expressions=[filter2_expr],\n            fields={\n                \"sender\": email_schema.model_fields[\"sender\"],\n                \"subject\": email_schema.model_fields[\"subject\"],\n                \"contents\": TextFile.model_fields[\"contents\"],\n                \"filename\": TextFile.model_fields[\"filename\"],\n            },\n            properties=g2_properties,\n        )\n        assert g1.group_id == 
g2.group_id\n\n\n@pytest.mark.parametrize(\n    argnames=(\"opt_strategy\",),\n    argvalues=[\n        pytest.param(OptimizationStrategyType.GREEDY, id=\"greedy\"),\n        pytest.param(OptimizationStrategyType.PARETO, id=\"pareto\"),\n    ],\n)\nclass TestOptimizer:\n    def test_basic_functionality(self, enron_eval_tiny, opt_strategy):\n        plan = enron_eval_tiny\n        policy = MaxQuality()\n        cost_model = SampleBasedCostModel()\n        optimizer = Optimizer(\n            policy=policy,\n            cost_model=cost_model,\n            verbose=True,\n            available_models=[Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_1_8B],\n            optimizer_strategy=opt_strategy,\n        )\n        physical_plans = optimizer.optimize(plan)\n        physical_plan = physical_plans[0]\n\n        assert len(physical_plan) == 1\n        assert isinstance(physical_plan[0], MarshalAndScanDataOp)\n\n    def test_simple_max_quality_convert(self, enron_eval_tiny, email_schema, opt_strategy):\n        plan = enron_eval_tiny\n        plan = plan.sem_add_columns(email_schema)\n        policy = MaxQuality()\n        cost_model = SampleBasedCostModel()\n        optimizer = Optimizer(\n            policy=policy,\n            cost_model=cost_model,\n            verbose=True,\n            available_models=[Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_1_8B],\n            optimizer_strategy=opt_strategy,\n            allow_rag_reduction=False,\n            allow_mixtures=False,\n            allow_critic=False,\n            allow_split_merge=False,\n        )\n        physical_plans = optimizer.optimize(plan)\n        physical_plan = physical_plans[0]\n\n        assert len(physical_plan) == 2\n        assert isinstance(physical_plan[0], MarshalAndScanDataOp)\n        assert isinstance(physical_plan[1], LLMConvertBonded)\n        assert physical_plan[1].model == Model.GPT_4o\n\n    def test_simple_min_cost_convert(self, enron_eval_tiny, email_schema, opt_strategy):\n     
   plan = enron_eval_tiny\n        plan = plan.sem_add_columns(email_schema)\n        policy = MinCost()\n        cost_model = SampleBasedCostModel()\n        optimizer = Optimizer(\n            policy=policy,\n            cost_model=cost_model,\n            verbose=True,\n            available_models=[Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_1_8B],\n            optimizer_strategy=opt_strategy,\n        )\n        physical_plans = optimizer.optimize(plan)\n        physical_plan = physical_plans[0]\n\n        assert len(physical_plan) == 2\n        assert isinstance(physical_plan[0], MarshalAndScanDataOp)\n        assert isinstance(physical_plan[1], LLMConvertBonded)\n\n    def test_simple_min_time_convert(self, enron_eval_tiny, email_schema, opt_strategy):\n        plan = enron_eval_tiny\n        plan = plan.sem_add_columns(email_schema)\n        policy = MinTime()\n        cost_model = SampleBasedCostModel()\n        optimizer = Optimizer(\n            policy=policy,\n            cost_model=cost_model,\n            verbose=True,\n            available_models=[Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_1_8B],\n            optimizer_strategy=opt_strategy,\n        )\n        physical_plans = optimizer.optimize(plan)\n        physical_plan = physical_plans[0]\n\n        assert len(physical_plan) == 2\n        assert isinstance(physical_plan[0], MarshalAndScanDataOp)\n        assert isinstance(physical_plan[1], LLMConvertBonded)\n\n    def test_simple_vllm_convert(self, enron_eval_tiny, email_schema, opt_strategy):\n        vllm_model = Model(\"hosted_vllm/qwen/Qwen1.5-0.5B-Chat\", api_base=\"http://localhost:8000/v1\")\n        plan = enron_eval_tiny\n        plan = plan.sem_add_columns(email_schema)\n        policy = MinTime()\n        cost_model = SampleBasedCostModel()\n        optimizer = Optimizer(\n            policy=policy,\n            cost_model=cost_model,\n            verbose=True,\n            available_models=[vllm_model],\n            
optimizer_strategy=opt_strategy,\n        )\n        physical_plans = optimizer.optimize(plan)\n        physical_plan = physical_plans[0]\n\n        assert len(physical_plan) == 2\n        assert isinstance(physical_plan[0], MarshalAndScanDataOp)\n        assert isinstance(physical_plan[1], LLMConvertBonded)\n\n    def test_push_down_filter(self, enron_eval_tiny, email_schema, opt_strategy):\n        plan = enron_eval_tiny\n        plan = plan.sem_add_columns(email_schema)\n        plan = plan.sem_filter(\"some text filter\", depends_on=[\"contents\"])\n        policy = MinCost()\n        cost_model = SampleBasedCostModel()\n        optimizer = Optimizer(\n            policy=policy,\n            cost_model=cost_model,\n            verbose=True,\n            available_models=[Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_1_8B],\n            optimizer_strategy=opt_strategy,\n        )\n        physical_plans = optimizer.optimize(plan)\n        physical_plan = physical_plans[0]\n\n        assert len(physical_plan) == 3\n        assert isinstance(physical_plan[0], MarshalAndScanDataOp)\n        assert isinstance(physical_plan[1], LLMFilter)\n        assert isinstance(physical_plan[2], LLMConvertBonded)\n\n    def test_push_down_two_filters(self, enron_eval_tiny, email_schema, opt_strategy):\n        plan = enron_eval_tiny\n        plan = plan.sem_add_columns(email_schema)\n        plan = plan.sem_filter(\"some text filter\", depends_on=[\"contents\"])\n        plan = plan.sem_filter(\"another text filter\", depends_on=[\"contents\"])\n        policy = MinCost()\n        cost_model = SampleBasedCostModel()\n        optimizer = Optimizer(\n            policy=policy,\n            cost_model=cost_model,\n            verbose=True,\n            available_models=[Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_1_8B],\n            optimizer_strategy=opt_strategy,\n        )\n        physical_plans = optimizer.optimize(plan)\n        physical_plan = physical_plans[0]\n\n        
assert len(physical_plan) == 4\n        assert isinstance(physical_plan[0], MarshalAndScanDataOp)\n        assert isinstance(physical_plan[1], LLMFilter)\n        assert isinstance(physical_plan[2], LLMFilter)\n        assert isinstance(physical_plan[3], LLMConvertBonded)\n\n    def test_small_real_estate_logical_reorder(self, small_real_estate_workload, opt_strategy):\n        policy = MinCost()\n        cost_model = SampleBasedCostModel()\n        optimizer = Optimizer(\n            policy=policy,\n            cost_model=cost_model,\n            verbose=True,\n            available_models=[Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_1_8B],\n            allow_rag_reduction=False,\n            allow_mixtures=False,\n            allow_critic=False,\n            allow_split_merge=False,\n            optimizer_strategy=opt_strategy,\n        )\n        physical_plans = optimizer.optimize(small_real_estate_workload)\n        physical_plan = physical_plans[0]\n\n        assert len(physical_plan) == 4\n        assert isinstance(physical_plan[0], MarshalAndScanDataOp)  # RealEstateListingFiles\n        assert isinstance(physical_plan[1], LLMConvert)  # ImageRealEstateListing\n        assert isinstance(physical_plan[2], LLMFilter)  # ImageRealEstateListing(attractive)\n        assert isinstance(physical_plan[3], LLMConvert)  # TextRealEstateListing\n\n    def test_real_estate_logical_reorder(self, real_estate_workload, opt_strategy):\n        policy = MinCost()\n        cost_model = SampleBasedCostModel()\n        optimizer = Optimizer(\n            policy=policy,\n            cost_model=cost_model,\n            verbose=True,\n            available_models=[Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_1_8B],\n            allow_rag_reduction=False,\n            allow_mixtures=False,\n            allow_critic=False,\n            allow_split_merge=False,\n            optimizer_strategy=opt_strategy,\n        )\n        physical_plans = 
optimizer.optimize(real_estate_workload)\n        physical_plan = physical_plans[0]\n\n        assert len(physical_plan) == 6\n        assert isinstance(physical_plan[0], MarshalAndScanDataOp)  # RealEstateListingFiles\n        assert isinstance(physical_plan[1], LLMConvert)  # TextRealEstateListing\n        assert isinstance(physical_plan[2], NonLLMFilter)  # TextRealEstateListing(price/addr)\n        assert isinstance(physical_plan[3], NonLLMFilter)  # TextRealEstateListing(price/addr)\n        assert isinstance(physical_plan[4], LLMConvert)  # ImageRealEstateListing\n        assert isinstance(physical_plan[5], LLMFilter)  # ImageRealEstateListing(attractive)\n\n    def test_seven_filters(self, enron_eval_tiny, email_schema, opt_strategy):\n        start_time = time.time()\n\n        plan = enron_eval_tiny\n        plan = plan.sem_add_columns(email_schema)\n        plan = plan.sem_filter(\"filter1\", depends_on=[\"contents\"])\n        plan = plan.sem_filter(\"filter2\", depends_on=[\"contents\"])\n        plan = plan.sem_filter(\"filter3\", depends_on=[\"contents\"])\n        plan = plan.sem_filter(\"filter4\", depends_on=[\"contents\"])\n        plan = plan.sem_filter(\"filter5\", depends_on=[\"contents\"])\n        plan = plan.sem_filter(\"filter6\", depends_on=[\"contents\"])\n        plan = plan.sem_filter(\"filter7\", depends_on=[\"contents\"])\n        policy = MinCost()\n        cost_model = SampleBasedCostModel()\n        optimizer = Optimizer(\n            policy=policy,\n            cost_model=cost_model,\n            verbose=True,\n            available_models=[Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_1_8B],\n            allow_rag_reduction=False,\n            allow_mixtures=False,\n            allow_critic=False,\n            allow_split_merge=False,\n            optimizer_strategy=opt_strategy,\n        )\n        physical_plans = optimizer.optimize(plan)\n        physical_plan = physical_plans[0]\n\n        assert len(physical_plan) == 9\n     
   assert isinstance(physical_plan[0], MarshalAndScanDataOp)\n        assert isinstance(physical_plan[1], LLMFilter)\n        assert isinstance(physical_plan[2], LLMFilter)\n        assert isinstance(physical_plan[3], LLMFilter)\n        assert isinstance(physical_plan[4], LLMFilter)\n        assert isinstance(physical_plan[5], LLMFilter)\n        assert isinstance(physical_plan[6], LLMFilter)\n        assert isinstance(physical_plan[7], LLMFilter)\n        assert isinstance(physical_plan[8], LLMConvertBonded)\n\n        if not os.getenv(\"CI\"):  # only enforce time constraint when not running in CI\n            assert time.time() - start_time < 6, (\n                \"Optimizer should complete this test within 2 to 6 seconds; if it's failed, something has caused a regression, and you should ping Matthew Russo (mdrusso@mit.edu)\"\n            )\n\n\nclass MockSampleBasedCostModel:\n    \"\"\" \"\"\"\n\n    def __init__(self, operator_to_stats):\n        # construct cost, time, quality, and selectivity matrices for each operator set;\n        self.operator_to_stats = operator_to_stats\n\n        # compute set of costed full op ids from operator_to_stats\n        self.costed_full_op_ids = set(\n            [\n                full_op_id\n                for _, full_op_id_to_stats in self.operator_to_stats.items()\n                for full_op_id, _ in full_op_id_to_stats.items()\n            ]\n        )\n\n    def get_costed_full_op_ids(self):\n        return self.costed_full_op_ids\n\n    def __call__(\n        self, operator: PhysicalOperator, source_op_estimates: OperatorCostEstimates | None = None\n    ) -> PlanCost:\n        # NOTE: some physical operators may not have any sample execution data in this cost model;\n        #       these physical operators are filtered out of the Optimizer, thus we can assume that\n        #       we will have execution data for each operator passed into __call__; nevertheless, we\n        #       still perform a sanity check\n   
     # look up physical and logical op ids associated with this physical operator\n        full_op_id = operator.get_full_op_id()\n        logical_op_id = operator.logical_op_id\n        assert self.operator_to_stats.get(logical_op_id).get(full_op_id) is not None, (\n            f\"No execution data for {str(operator)}\"\n        )\n\n        # look up stats for this operation\n        est_cost_per_record = self.operator_to_stats[logical_op_id][full_op_id][\"cost\"]\n        est_time_per_record = self.operator_to_stats[logical_op_id][full_op_id][\"time\"]\n        est_quality = self.operator_to_stats[logical_op_id][full_op_id][\"quality\"]\n        est_selectivity = self.operator_to_stats[logical_op_id][full_op_id][\"selectivity\"]\n\n        # create source_op_estimates for scan operators if they are not provided\n        if isinstance(operator, ScanPhysicalOp):\n            # get handle to scan operator and pre-compute its size (number of records)\n            datasource_len = len(operator.datasource)\n\n            source_op_estimates = OperatorCostEstimates(\n                cardinality=datasource_len,\n                time_per_record=0.0,\n                cost_per_record=0.0,\n                quality=1.0,\n            )\n\n        # generate new set of OperatorCostEstimates\n        op_estimates = OperatorCostEstimates(\n            cardinality=est_selectivity * source_op_estimates.cardinality,\n            time_per_record=est_time_per_record,\n            cost_per_record=est_cost_per_record,\n            quality=est_quality,\n        )\n\n        # compute estimates for this operator\n        op_time = op_estimates.time_per_record * source_op_estimates.cardinality\n        op_cost = op_estimates.cost_per_record * source_op_estimates.cardinality\n        op_quality = op_estimates.quality\n\n        # construct and return op estimates\n        return PlanCost(cost=op_cost, time=op_time, quality=op_quality, 
op_estimates=op_estimates)\n\n\n@pytest.mark.parametrize(\n    argnames=(\"workload\", \"policy\", \"operator_to_stats\", \"expected_plan\"),\n    argvalues=[\n        pytest.param(\"three-converts\", \"mincost\", \"3c-mincost\", \"3c-mincost\", id=\"3c-mincost\"),\n        pytest.param(\"three-converts\", \"maxquality\", \"3c-maxquality\", \"3c-maxquality\", id=\"3c-maxquality\"),\n        pytest.param(\n            \"three-converts\",\n            \"mincost@quality=0.8\",\n            \"3c-mincost@quality=0.8\",\n            \"3c-mincost@quality=0.8\",\n            id=\"3c-mincostfixedquality\",\n        ),\n        pytest.param(\n            \"three-converts\",\n            \"maxquality@cost=1.0\",\n            \"3c-maxquality@cost=1.0\",\n            \"3c-maxquality@cost=1.0\",\n            id=\"3c-maxqualityfixedcost\",\n        ),\n        pytest.param(\"one-filter-one-convert\", \"mincost\", \"1f-1c-mincost\", \"1f-1c-mincost\", id=\"1f-1c-mincost\"),\n        pytest.param(\"two-converts-two-filters\", \"mincost\", \"2c-2f-mincost\", \"2c-2f-mincost\", id=\"2c-2f-mincost\"),\n        pytest.param(\n            \"two-converts-two-filters\", \"maxquality\", \"2c-2f-maxquality\", \"2c-2f-maxquality\", id=\"2c-2f-maxquality\"\n        ),\n        pytest.param(\n            \"two-converts-two-filters\",\n            \"mincost@quality=0.8\",\n            \"2c-2f-mincost@quality=0.8\",\n            \"2c-2f-mincost@quality=0.8\",\n            id=\"2c-2f-mincostfixedquality\",\n        ),\n        pytest.param(\n            \"two-converts-two-filters\",\n            \"maxquality@cost=1.0\",\n            \"2c-2f-maxquality@cost=1.0\",\n            \"2c-2f-maxquality@cost=1.0\",\n            id=\"2c-2f-maxqualityfixedcost\",\n        ),\n    ],\n    indirect=True,\n)\nclass TestParetoOptimizer:\n    def test_pareto_optimization_strategy(self, workload, policy, operator_to_stats, expected_plan):\n        # initialize cost model with sample execution data\n        
cost_model = MockSampleBasedCostModel(operator_to_stats)\n\n        # run optimizer using the cost model and the given policy\n        optimizer = Optimizer(\n            policy=policy,\n            cost_model=cost_model,\n            verbose=True,\n            available_models=[Model.GPT_4o, Model.GPT_4o_MINI, Model.LLAMA3_3_70B],\n            optimizer_strategy=OptimizationStrategyType.PARETO,\n            allow_rag_reduction=False,\n            allow_mixtures=False,\n            allow_critic=False,\n            allow_split_merge=False,\n        )\n        # run optimizer to get physical plan\n        physical_plans = optimizer.optimize(workload)\n        physical_plan = physical_plans[0]\n\n        # assert that physical plan matches expected plan\n        assert physical_plan.plan_cost.quality == pytest.approx(expected_plan.plan_cost.quality)\n        assert physical_plan.plan_cost.cost == pytest.approx(expected_plan.plan_cost.cost)\n        assert physical_plan.plan_cost.time == pytest.approx(expected_plan.plan_cost.time)\n        assert physical_plan.plan_id == expected_plan.plan_id\n"
  },
  {
    "path": "tests/pytest/test_physical.py",
    "content": "\"\"\"This script contains tests for the PhysicalOperator class.\"\"\"\n\nimport os\n\nfrom pydantic import BaseModel, Field\n\nfrom palimpzest.query.operators.physical import PhysicalOperator\n\nif not os.environ.get(\"OPENAI_API_KEY\"):\n    from palimpzest.utils.env_helpers import load_env\n\n    load_env()\n\n\nclass SimpleSchema(BaseModel):\n    name: str = Field(description=\"The name of the person\")\n    age: int = Field(description=\"The age of the person\")\n\nclass SimpleSchemaTwo(BaseModel):\n    name: str = Field(description=\"The name of the person\")\n    age: int = Field(description=\"The age of the person\")\n    height: int | float = Field(description=\"The height of the person in cm\")\n\ndef test_physical_operator_init():\n    \"\"\"Test basic initialization of PhysicalOperator\"\"\"\n\n    op = PhysicalOperator(\n        output_schema=SimpleSchema,\n        input_schema=SimpleSchema,\n        depends_on=[\"op1\", \"op2\"],\n        logical_op_id=\"logical1\",\n        verbose=True\n    )\n\n    assert op.output_schema == SimpleSchema\n    assert op.input_schema == SimpleSchema\n    assert op.depends_on == [\"op1\", \"op2\"]\n    assert op.logical_op_id == \"logical1\"\n    assert op.verbose is True\n\ndef test_physical_operator_equality():\n    \"\"\"Test equality comparison between PhysicalOperators\"\"\"\n    op1 = PhysicalOperator(logical_op_id=\"abc\", output_schema=SimpleSchema)\n    op2 = PhysicalOperator(logical_op_id=\"abc\", output_schema=SimpleSchema)\n    op3 = PhysicalOperator(logical_op_id=\"def\", output_schema=SimpleSchemaTwo)\n\n    assert op1 == op2\n    assert op1 == op1\n    assert op1 == op1.copy()\n    assert op2 != op3\n\ndef test_physical_operator_str():\n    \"\"\"Test string representation of PhysicalOperator\"\"\"\n\n    op = PhysicalOperator(\n        output_schema=SimpleSchema,\n        input_schema=SimpleSchema\n    )\n\n    str_rep = str(op)\n    assert \"SimpleSchema -> PhysicalOperator -> 
SimpleSchema\" in str_rep\n    assert \"age, name\" in str_rep\n\ndef test_physical_operator_id_generation():\n    \"\"\"Test operator ID generation and hashing\"\"\"\n    op = PhysicalOperator(output_schema=SimpleSchema)\n\n    # Test that op_id is initially None\n    assert op.op_id is None\n\n    # Get op_id and verify it's generated\n    op_id = op.get_op_id()\n    assert op_id is not None\n    assert isinstance(op_id, str)\n\n    # Test that subsequent calls return the same id\n    assert op.get_op_id() == op_id\n\n    # Test that hash is based on op_id\n    assert hash(op) == int(op_id, 16)\n\ndef test_physical_operator_copy():\n    \"\"\"Test copying of PhysicalOperator\"\"\"\n    original = PhysicalOperator(\n        output_schema=SimpleSchema,\n        input_schema=SimpleSchema,\n        depends_on=[\"op1\"],\n        logical_op_id=\"logical1\",\n        verbose=True\n    )\n\n    copied = original.copy()\n\n    assert copied is not original  # Different instances\n    assert copied == original  # But equal in content\n    assert copied.get_op_id() == original.get_op_id()  # Same op_id\n    assert copied.depends_on == original.depends_on\n    assert copied.logical_op_id == original.logical_op_id\n    assert copied.verbose == original.verbose\n"
  },
  {
    "path": "tests/pytest/test_records.py",
    "content": "from typing import Any\n\nimport pandas as pd\nimport pytest\nfrom pydantic import BaseModel, Field\n\nfrom palimpzest.core.elements.records import DataRecord\n\n\n# Example test schema\nclass TestSchema(BaseModel):\n    name: str = Field(description=\"Test name field\")\n    value: Any = Field(description=\"Test value field\")\n\n\nclass TestDataRecord:\n    @pytest.fixture\n    def sample_record(self):\n        \"\"\"Fixture to create a sample DataRecord for testing\"\"\"\n        record = DataRecord(data_item=TestSchema(name=\"test\", value=42), source_indices=[0])\n        return record\n\n    @pytest.fixture\n    def sample_df(self):\n        \"\"\"Fixture to create a sample DataFrame for testing\"\"\"\n        return pd.DataFrame({\n            'name': ['Alice', 'Bob'],\n            'value': [1, 2]\n        })\n\n    def test_create_record(self, sample_record):\n        \"\"\"Test basic record creation and attribute access\"\"\"\n        assert sample_record.name == \"test\"\n        assert sample_record.value == 42\n        assert sample_record._source_indices == [0]\n\n    def test_record_equality(self, sample_record):\n        \"\"\"Test record equality comparison\"\"\"\n        record2 = DataRecord(data_item=TestSchema(name=\"test\", value=42), source_indices=[0])\n        assert sample_record == record2\n\n    def test_to_df(self, sample_df):\n        \"\"\"Test converting records back to DataFrame\"\"\"\n        records = [\n            DataRecord(data_item=TestSchema(name=\"Alice\", value=1), source_indices=[0]),\n            DataRecord(data_item=TestSchema(name=\"Bob\", value=2), source_indices=[1]),\n        ]\n        df_result = DataRecord.to_df(records)\n        assert df_result.equals(sample_df)\n\n    def test_to_df_with_project_cols(self, sample_df):\n        \"\"\"Test converting records to DataFrame with project_cols\"\"\"\n        records = [\n            DataRecord(data_item=TestSchema(name=\"Alice\", value=1), 
source_indices=[0]),\n            DataRecord(data_item=TestSchema(name=\"Bob\", value=2), source_indices=[1]),\n        ]\n        df_result = DataRecord.to_df(records, project_cols=[\"name\"])\n        assert df_result.equals(sample_df[[\"name\"]])\n\n    def test_invalid_attribute(self, sample_record):\n        \"\"\"Test accessing non-existent attribute\"\"\"\n        with pytest.raises(AttributeError):\n            _ = sample_record.nonexistent_field\n\n    def test_to_dict(self, sample_record):\n        \"\"\"Test dictionary representation\"\"\"\n        record_dict = sample_record.to_dict()\n        assert record_dict['name'] == 'test'\n        assert record_dict['value'] == 42\n\n    def test_to_json_str(self, sample_record):\n        \"\"\"Test JSON string representation\"\"\"\n        json_str = sample_record.to_json_str()\n        assert 'test' in json_str\n        assert '42' in json_str\n\n\nif __name__ == '__main__':\n    pytest.main([__file__])"
  },
  {
    "path": "tests/pytest/test_rules.py",
    "content": "import pytest\nfrom pydantic import BaseModel, Field\nfrom pydantic.fields import FieldInfo\n\nfrom palimpzest.core.data.iter_dataset import MemoryDataset\nfrom palimpzest.query.operators.logical import BaseScan\nfrom palimpzest.query.optimizer.primitives import LogicalExpression\nfrom palimpzest.query.optimizer.rules import BasicSubstitutionRule\n\n\n@pytest.fixture\ndef schema():\n    class SimpleSchema(BaseModel):\n        filename: str = Field(description=\"The filename of the file\")\n        text: str = Field(description=\"The text of the file\")\n    return SimpleSchema\n\n@pytest.fixture\ndef base_scan_op(schema):\n    return BaseScan(\n        datasource=MemoryDataset(id=\"test\", vals=[1, 2, 3]),\n        output_schema=schema\n    )\n\ndef test_substitute_methods(base_scan_op):\n    # Create a logical expression with the BaseScan operator\n    logical_expr = LogicalExpression(\n        operator=base_scan_op,\n        input_group_ids=[],\n        input_fields={},\n        generated_fields={\"some_id\": FieldInfo(annotation=str, description=\"id\"),  \"text\": FieldInfo(annotation=str, description=\"text\")},\n        depends_on_field_names=set(),\n        group_id=1\n    )\n\n    # Apply the BasicSubstitutionRule\n    physical_exprs = BasicSubstitutionRule.substitute(logical_expr, verbose=False, api_base=None)\n\n    # Verify the substitution\n    assert len(physical_exprs) == 1\n    physical_expr = list(physical_exprs)[0]\n\n    # Check that the operator was correctly converted to MarshalAndScanDataOp\n    assert physical_expr.operator.__class__.__name__ == \"MarshalAndScanDataOp\"\n\n    # Verify that the important properties were preserved\n    assert physical_expr.operator.datasource == base_scan_op.datasource\n    assert physical_expr.input_group_ids == logical_expr.input_group_ids\n    assert physical_expr.input_fields == logical_expr.input_fields\n    assert physical_expr.generated_fields == logical_expr.generated_fields\n    assert 
physical_expr.depends_on_field_names == logical_expr.depends_on_field_names\n    assert physical_expr.group_id == logical_expr.group_id\n"
  },
  {
    "path": "tests/pytest/test_scan.py",
    "content": "from typing import Any\n\nfrom pydantic import BaseModel, Field\n\nfrom palimpzest.core.data.iter_dataset import MemoryDataset\nfrom palimpzest.query.operators.scan import MarshalAndScanDataOp\n\n\nclass List(BaseModel):\n    value: Any = Field(description=\"List item\")\n\n\ndef test_marshal_and_scan_memory_source():\n    # Create test data\n    test_data = [\"test1\", \"test2\", \"test3\"]\n\n    # Create MemoryDataset with test data\n    memory_source = MemoryDataset(id=\"test\", vals=test_data)\n\n    # Create MarshalAndScanDataOp\n    op = MarshalAndScanDataOp(output_schema=List, datasource=memory_source, logical_op_id=\"test_scan\")\n\n    # Execute the scan operator on the first source record\n    result = op(0)\n\n    assert len(result.data_records) == 1\n    assert result.data_records[0].value == \"test1\"\n\n    # Test stats\n    assert len(result.record_op_stats) == 1\n    stats = result.record_op_stats[0]\n    assert stats.op_name == \"MarshalAndScanDataOp\"\n    assert stats.time_per_record >= 0.0  # Should be non-negative; sometimes the read executes so quickly the assertion fails with > 0.0\n    assert stats.cost_per_record == 0.0\n\n# def test_marshal_and_scan_memory_source_multiple_records():\n#     # Test with numeric data\n#     test_data = [1, 2, 3, 4, 5]\n#     memory_source = MemoryReader(test_data, schema=List)\n\n#     op = MarshalAndScanDataOp(datasource=memory_source)\n\n#     # Test each index\n#     for idx in range(len(memory_source)):\n#         result = op(idx)\n\n#         # Verify results\n#         assert len(result.records) == 1\n#         assert result.records[0].value == test_data[idx]\n#         assert len(result.record_op_stats) == 1\n\n# def test_marshal_and_scan_empty_source():\n#     # Test with empty data\n#     memory_source = MemoryReader([], schema=List)\n\n#     op = MarshalAndScanDataOp(datasource=memory_source)\n\n#     # Should raise IndexError when trying to access empty source\n#     with 
pytest.raises(IndexError):\n#         op(0)\n"
  },
  {
    "path": "tests/pytest/test_schemas.py",
    "content": "from typing import Any\n\nimport pytest\nfrom pydantic import BaseModel, Field\n\nfrom palimpzest.core.lib.schemas import (\n    create_schema_from_df,\n    create_schema_from_fields,\n    get_schema_field_names,\n    project,\n    union_schemas,\n)\n\n\nclass Dog(BaseModel):\n    breed: str = Field(description=\"The breed of the dog\")\n    is_good: bool = Field(description=\"Whether the dog is good\")\n\nclass Cat(BaseModel):\n    breed: str = Field(description=\"The breed of the cat\")\n    is_good: bool = Field(description=\"Whether the cat is good\")\n\ndef test_schema_equality():\n    assert Dog == Dog\n    assert Dog != Cat\n\ndef test_get_schema_field_names():\n    assert get_schema_field_names(Dog) == [\"breed\", \"is_good\"]\n    assert get_schema_field_names(Dog, id=\"dog\") == [\"Dog.dog.breed\", \"Dog.dog.is_good\"]\n\ndef test_project_schema():\n    projected_dog = project(Dog, [\"breed\"])\n    assert projected_dog.__name__ == \"Schema['breed']\"\n    assert get_schema_field_names(projected_dog) == [\"breed\"]\n\n    projected_dog_full = project(Dog, [\"Dog.id.breed\", \"Dog.id.is_good\", \"random_field\"])\n    assert projected_dog_full.__name__ == \"Schema['breed', 'is_good']\"\n    assert get_schema_field_names(projected_dog_full) == [\"breed\", \"is_good\"]\n\ndef test_create_schema_from_fields():\n    fields = [\n        {\"name\": \"age\", \"type\": int, \"description\": \"The age of the pet\"},\n        {\"name\": \"weight\", \"type\": float, \"description\": \"The weight of the pet\"}\n    ]\n    pet_schema = create_schema_from_fields(fields)\n    assert pet_schema.__name__ == \"Schema['age', 'weight']\"\n    assert get_schema_field_names(pet_schema) == [\"age\", \"weight\"]\n    assert pet_schema.model_fields[\"age\"].annotation is int\n    assert pet_schema.model_fields[\"weight\"].annotation is float\n\ndef test_create_schema_from_df():\n    import pandas as pd\n\n    data = {\n        \"name\": [\"Buddy\", \"Mittens\"],\n  
      \"age\": [5, 3],\n        \"weight\": [20.5, 10.0]\n    }\n    df = pd.DataFrame(data)\n    pet_schema = create_schema_from_df(df)\n    assert pet_schema.__name__ == \"Schema['age', 'name', 'weight']\"\n    assert get_schema_field_names(pet_schema) == [\"name\", \"age\", \"weight\"]\n    assert pet_schema.model_fields[\"name\"].annotation in [str, Any]\n    assert pet_schema.model_fields[\"age\"].annotation is int\n    assert pet_schema.model_fields[\"weight\"].annotation is float\n\ndef test_union_schemas():\n    unioned_schema = union_schemas([Dog, Cat])\n    assert unioned_schema.__name__ == \"Schema['breed', 'is_good']\"\n    assert get_schema_field_names(unioned_schema) == [\"breed\", \"is_good\"]\n    assert unioned_schema.model_fields[\"breed\"].annotation is str\n    assert unioned_schema.model_fields[\"is_good\"].annotation is bool\n\n    # Test with conflicting field types\n    class Fish(BaseModel):\n        breed: str = Field(description=\"The breed of the fish\")\n        is_good: int = Field(description=\"Whether the fish is good\")\n\n    with pytest.raises(AssertionError):\n        union_schemas([Dog, Fish])\n"
  },
  {
    "path": "website/.gitignore",
    "content": "# Dependencies\n/node_modules\n\n# Production\n/build\n\n# Generated files\n.docusaurus\n.cache-loader\n\n# Misc\n.DS_Store\n.env.local\n.env.development.local\n.env.test.local\n.env.production.local\n\nnpm-debug.log*\nyarn-debug.log*\nyarn-error.log*\n"
  },
  {
    "path": "website/README.md",
    "content": "# Website\n\nThis website is built using [Docusaurus](https://docusaurus.io/), a modern static website generator.\n\n## Installation\n\n```bash\nyarn\n```\n\n## Local Development\n\n```bash\nyarn start\n```\n\nThis command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.\n\n## Build\n\n```bash\nyarn build\n```\n\nThis command generates static content into the `build` directory and can be served using any static contents hosting service.\n\n## Deployment\n\nUsing SSH:\n\n```bash\nUSE_SSH=true yarn deploy\n```\n\nNot using SSH:\n\n```bash\nGIT_USER=<Your GitHub username> yarn deploy\n```\n\nIf you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.\n"
  },
  {
    "path": "website/blog/2024-06-01-palimpzest/bibtex.js",
    "content": "import React, { useState } from \"react\"\n\n\nconst BibtexInner = ({children, handleCopyClick}) => (\n  <div\n    className=\"bibtex\"\n    style={{\n      position: \"relative\",\n      backgroundColor: \"#f5f5f5\",\n      padding: \"15px\",\n      borderRadius: \"3px\",\n      margin: \"15px 0\"\n    }}\n  >\n    <pre style={{margin: 0, fontSize: \"0.9em\"}}>\n      {children}\n    </pre>\n    <button\n      className=\"bibtex-copy\"\n      onClick={handleCopyClick}\n      style={{\n        position: \"absolute\",\n        top: \"10px\",\n        right: \"10px\",\n        cursor: \"pointer\",\n        backgroundColor: \"#aaa\",\n        color: \"#fff\",\n        border: \"0\",\n        borderRadius: \"3px\",\n        padding: \"5px\",\n        outline: \"none\",\n        fontSize: \"0.8em\",\n        fontFamily: \"sans-serif\",\n      }}\n    >\n      Copy\n    </button>\n  </div>\n)\n\n\nconst Bibtex = ({children, withToggle}) => {\n  const [visible, setVisible] = useState(false);\n\n  const handleToggle = (event) => {\n    setVisible(!visible)\n    event.preventDefault()\n  }\n\n  const handleCopyClick = () => {\n    navigator.clipboard.writeText(children)\n  }\n\n  if (withToggle) {\n    if (!visible) {\n      return (\n        <>\n          <a href=\"#\" onClick={handleToggle}>View BibTeX</a>\n        </>\n      )\n    } else {\n        return (\n          <>\n            <a href=\"#\" onClick={handleToggle}>Hide BibTeX</a>\n            <BibtexInner handleCopyClick={handleCopyClick}>{children}</BibtexInner>\n          </>\n        )\n    }\n  } else {\n    return <BibtexInner handleCopyClick={handleCopyClick}>{children}</BibtexInner>\n  }\n}\n\nexport default Bibtex\n"
  },
  {
    "path": "website/blog/2024-06-01-palimpzest/index.md",
    "content": "---\nslug: palimpzest-paper\ntitle: \"Palimpzest: A Declarative System for Optimizing AI Workloads\"\nauthors: [mdrusso]\ntags: [Palimpzest, Semantic Operators]\n---\n<!-- ---\ntitle: \"A Declarative System for Optimizing AI Workloads\"\nlink: /palimpzest\nsummary: \"A long-standing goal of data management systems has been to build systems which can compute quantitative insights over large corpora of unstructured data in a cost-effective manner. Until recently, it was difficult and expensive to extract facts from company documents, data from scientific papers, or metrics from image and video corpora. Today's models can accomplish these tasks with high accuracy. However, a programmer who wants to answer a substantive AI-powered query must orchestrate large numbers of models, prompts, and data operations. For even a single query, the programmer has to make a vast number of decisions such as the choice of model, the right inference method, the most cost-effective inference hardware, the ideal prompt design, and so on. The optimal set of decisions can change as the query changes and as the rapidly-evolving technical landscape shifts. In this paper we present Palimpzest, a system that enables anyone to process AI-powered analytical queries simply by defining them in a declarative language. The system uses its cost optimization framework -- which explores the search space of AI models, prompting techniques, and related foundation model optimizations -- to implement the query plan with the best trade-offs between runtime, financial cost, and output data quality.  We describe the workload of AI-powered analytics tasks, the optimization methods that Palimpzest uses, and the prototype system itself. We evaluate Palimpzest on tasks in Legal Discovery, Real Estate Search, and Medical Schema Matching. 
We show that even our simple prototype offers a range of appealing plans, including one that is 3.3x faster and 2.9x cheaper than the baseline method, while also offering better data quality. With parallelism enabled, Palimpzest can produce plans with up to a 90.3x speedup at 9.1x lower cost relative to a single-threaded GPT-4 baseline, while obtaining an F1-score within 83.5% of the baseline. These require no additional work by the user.\"\nstatus: current\nimage: imgs/palimpzest.png\n--- -->\n\nimport Bibtex from './bibtex.js';\n\n![pz](imgs/palimpzest.png)\n[![Discord](https://img.shields.io/discord/1245561987480420445?logo=discord)](https://discord.gg/dN85JJ6jaH)\n[![Colab Demo](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Fm8I4yL1az395MsFkQbEIZSmUZs0oGvZ?usp=sharing)\n[![PyPI](https://img.shields.io/pypi/v/palimpzest)](https://pypi.org/project/palimpzest/)\n[![PyPI - Monthly Downloads](https://img.shields.io/pypi/dm/palimpzest?color=teal)](https://pypi.org/project/palimpzest/)\n[![GitHub](https://img.shields.io/badge/GitHub-Code-blue?logo=github)](https://github.com/mitdbg/palimpzest)\n- [Paper](https://arxiv.org/pdf/2405.14696)\n- [Code](https://github.com/mitdbg/palimpzest)\n- [Colab Demo](https://colab.research.google.com/drive/1Fm8I4yL1az395MsFkQbEIZSmUZs0oGvZ?usp=sharing)\n- [Discord](https://discord.gg/dN85JJ6jaH)\n\n{/* truncate */}\n\n## Scaling Modern AI Systems to Large Corpora is Challenging\nAdvances in AI models have driven progress in applications such as question answering, chatbots, autonomous agents, and code synthesis. 
In many cases these systems have evolved far beyond posing a simple question to a chat model: they are complex [AI systems](https://bair.berkeley.edu/blog/2024/02/18/compound-ai-systems/) that combine elements of data processing, such as Retrieval Augmented Generation (RAG); ensembles of different models; and multi-step chain-of-thought reasoning.\n\nIt is easy for the runtime, cost, and complexity of these AI systems to escalate quickly, particularly when applied to large collections of documents. This places a burden on the AI system developer to marshal numerous optimizations in order to optimally manage trade-offs between runtime, cost, and output quality. To better motivate the challenges we face, let us first consider three examples from an emerging class of workloads.\n\n### Semantic Analytics Applications (SAPPs)\nConsider the following AI-powered analytical tasks:\n- **Legal Discovery:** In this use case, prosecutors conducting an investigation wish to identify emails from defendants which are (a) related to corporate fraud (e.g., by mentioning a specific fraudulent investment vehicle) and (b) do not quote from a news article reporting on the business in question. Test (a) may be implemented using a regular expression or UDF, while (b) requires semantic understanding to distinguish between employees sharing news articles versus first-hand sources of information. An efficient implementation would recognize that (a) can likely be implemented using conventional and inexpensive methods, while (b) may require an LLM or newly-trained text model to retain good quality.\n\n![Figure with a positive and negative example from the Legal Discovery workload.](imgs/enron-example.png)\n<center>\n<p>\n<strong>Figure 1a:</strong> A positive and negative example from the Legal Discovery workload. 
See Section 5.2 in <a href=\"https://arxiv.org/pdf/2405.14696\">our paper</a> for more details.\n</p>\n</center>\n\n- **Real Estate Search:** In this use case, a homebuyer wants to use online real estate listing data to find a place that is (a) modern and attractive, and (b) within two miles of work. Test (a) is a semantic search task that possibly involves analyzing images, while (b) is a more traditional distance calculation over extracted geographic data.  Any implementation needs to process a large number of images and listings, limit its use of slow and expensive models, and still obtain high-quality results.\n\n![Figure with a positive and negative example from the Real Estate Search workload.](imgs/real-estate-example.png)\n<center>\n<p>\n<strong>Figure 1b:</strong> A positive and negative example from the Real Estate Search workload. See Section 5.2 in <a href=\"https://arxiv.org/pdf/2405.14696\">our paper</a> for more details.\n</p>\n</center>\n\n- **Medical Schema Matching:** In the medical domain, cancer research is often based on large collections of information about patient cases and sample data. However, research studies' data outputs do not always conform to a unified standard. In this use case, based on the medical data pipeline described by [Li, *et al.*](https://www.cell.com/cancer-cell/fulltext/S1535-6108(23)00219-2), we imagine a researcher who would like to (a) download the datasets associated with a dozen specified cancer research papers, (b) identify the datasets that contain patient experiment data, and (c) integrate those datasets into a single table.  Step (a) requires parsing and understanding research paper texts to obtain dataset URLs, step (b) requires classifying each dataset as either patient-related or not, and step (c) requires a complex data integration task. 
As with the use cases above, the programmer must manage multiple subtasks, each of which offers different optimization opportunities and quality trade-offs.\n\n![Figure illustrating the workflow for the Medical Schema Matching workload.](imgs/biofabric_matching-small.png)\n<center>\n<p>\n<strong>Figure 1c:</strong> Overview of the workflow for the Medical Schema Matching workload. See Section 5.2 in <a href=\"https://arxiv.org/pdf/2405.14696\">our paper</a> for more details.\n</p>\n</center>\n\nThese tasks:\n1. Interleave traditional data processing with AI-like semantic reasoning\n2. Are data-intensive: each source dataset could reasonably range from hundreds to millions of records\n3. Can be decomposed into an execution tree of distinct operations over sets of data objects\n\nTaken together, these criteria outline a broad class of AI programs that are important, complex, and potentially very optimizable; we call them ***semantic analytics applications*** -- or, **SAPPs**. We believe there is a large set of such use cases that mix conventional data analytics with transformations and filters that would not be possible without AI methods. Such workloads frequently require interleaved data acquisition steps, conventional analytical queries, and AI operations. The AI operations process unstructured data, require broad domain knowledge to implement, or have specifications that users may not be able to implement correctly with traditional source code.\n\n## Challenges Scaling AI Systems to Process SAPPs\nNaively scaling AI Systems to process SAPPs with thousands or millions of inputs appears to require spending a huge amount of runtime and money executing high-end AI models. It is easy to show via back-of-the-envelope calculations that some high-quality LLMs process *fewer than 1 KB of text per second*, while others cost *5 USD for processing just 5 MB of data*. 
These numbers are many orders of magnitude worse than any other component of the modern data processing stack. Optimizing the use of AI components is therefore crucial, yet current AI infrastructure is in a state of tremendous technical flux. Harnessing the latest advances in model runtime, cost, and quality is complex, error-prone, and requires engineers to constantly rewrite and retune their systems.\n\nConsider the wide range of technical decisions an AI engineer faces:\n- **Prompt Design:** the engineer must optimize wording and decide on a general prompting strategy (e.g., zero-shot, few-shot, chain-of-thought, or ReAct).\n- **Model Selection:** the engineer must pick the best model *for each subtask in the program*, balancing time, cost, and quality.\n- **Execution Strategy:** the engineer must decide whether each subtask is best implemented by a foundation model query, synthesized code, or a locally-trained student model. Furthermore, they must consider how to combine tasks to improve GPU cache utilization, and how to avoid running over LLM context limits.\n- **System Scaling:** When scaling out to a larger dataset, the engineer faces additional challenges in selecting an efficient execution plan. Even if the system performs well on a small dataset, it may require redesign to ensure reasonable runtime, cost, and performance at a larger scale. This may involve enabling parallelism for each component and integrating these parallelized components seamlessly into the broader system for optimal efficiency. 
\n- **External Integration:** When integrating with external data systems, the engineer must decide how to choose parameters (e.g., the number of chunks to return per RAG query) in a manner that yields the best speed, cost, and quality trade-offs.\n\nThe space of possible decisions is vast, and choosing wisely depends on low-level details of the exact task being performed.\n\n## Declarative Programming to the Rescue (Again)\nOur key insight is that machines, not human engineers, should decide how best to optimize semantic analytics applications. Engineers should be able to write AI programs at a high level of abstraction and rely on the computer to find an optimized implementation that best fits their use case. A similar set of circumstances -- a need for performance improvements for an important workload, during a time of enormous technical change -- led to the development of the relational database query optimizer in the 1970s. Today's underlying technical challenges are very different, but the basic idea of declarative program optimization remains valuable.\n\nConsider the declarative program shown below, which we used in our evaluation for the Legal Discovery workload:\n\n```python{numberLines: true}\nimport palimpzest as pz\n\nclass Email(pz.TextFile):\n  \"\"\"Represents an email, which can subclass a text file\"\"\"\n  sender = pz.StringField(desc=\"The email address of the sender\", required=True)\n  subject = pz.StringField(desc=\"The subject of the email\", required=True)\n\n# define logical plan\nemails = pz.Dataset(source=\"enron-emails\", schema=Email) # invokes a convert operation\nemails = emails.filter(\"The email is not quoting from a news article or an article ...\")\nemails = emails.filter(\"The email refers to a fraudulent scheme (i.e., \\\"Raptor\\\", ...\")\n\n# user specified policy\npolicy = pz.MinimizeCostAtFixedQuality(min_quality=0.8)\n\n# execute plan\nresults = pz.Execute(emails, policy=policy)\n```\n\nIn this program, the user wants to 
identify emails that are not quoting from sources outside of Enron and that reference fraudulent investment vehicles. The program can be decomposed into the following pieces:\n- **Lines 3-6:** the programmer uses Palimpzest to create a custom schema for an `Email`.\n- **Line 9:** the programmer creates a `Dataset` of `Emails`\n  - The `source` string `\"enron-emails\"` uniquely identifies a set of files that have been preregistered with Palimpzest.\n  - The specification of a `schema` instructs Palimpzest to transform the raw input data objects into the `Email` schema.\n  - The transformed results are stored in the `emails` Dataset.\n- **Line 10:** the program filters `emails` for the subset that does not quote from news articles.\n- **Line 11:** the program filters for `emails` which discuss fraudulent investment entities.\n\nThe programmer takes two more steps:\n- **Line 14:** she specifies a `policy` that describes how the system should choose among multiple possible implementations of the steps described so far.\n  - (In this case, the plan with the lowest expected financial cost, subject to a lower bound on quality, is preferred.)\n- **Line 17:** the programmer asks Palimpzest to `Execute()` the program; this entails:\n  - Generating a logical execution plan\n  - Generating multiple optimized physical execution plan candidates\n  - Choosing one according to the specified policy\n  - And finally, executing the code and yielding results\n\nPrograms written with Palimpzest are executed lazily, so no actual data processing occurs until line 17.\n\n## Palimpzest System\nIn our paper we lay out a vision and a prototype for **Palimpzest**, a system that enables engineers to write succinct, declarative code that can be compiled into optimized programs. 
Palimpzest is designed to optimize the broad SAPP workload class, which should encompass large-scale information extraction, data integration, discovery from scientific papers, image understanding tasks, and multimodal analytics. When running an input user program, Palimpzest considers a range of logical and physical optimizations, then yields a set of possible concrete executable programs. Palimpzest estimates the cost, time, and quality of each one, then chooses a program based on runtime user preferences. The system is designed to be extensible, so that new optimizations can be easily added in the future. Just as the RDBMS allowed users to write database queries more quickly and correctly than they could by writing traditional code, Palimpzest will allow engineers to write better AI programs more quickly than they could unaided.\n\n![Palimpzest system diagram.](imgs/intro.png)\n<center>\n<p>\n<strong>Figure 2:</strong> Overview of the Palimpzest system architecture. See Section 3.4 in <a href=\"https://arxiv.org/pdf/2405.14696\">our paper</a> for more details.\n</p>\n</center>\n\nWe summarize the main system steps here:\n1. The user writes a declarative AI program using the Palimpzest (PZ) library; this gets compiled to an initial logical plan.\n2. Palimpzest applies logical optimizations to transform the initial logical plan into a set of logical plans.\n3. Palimpzest applies physical optimizations to create a much larger set of candidate physical plans.\n4. Palimpzest executes **sentinel plans** to gather sample execution data.\n5. The sample execution data is used to perform better cost estimation for the candidate physical plans.\n6. The final plan is selected based on plan estimates and a user-specified policy for choosing among plans.\n7. 
The final plan is executed and results are yielded to the user.\n\nFor more details on each of these steps, please see Subsection 3.4 and Section 4 in <a href=\"https://arxiv.org/pdf/2405.14696\">our paper</a>.\n\n\n## Palimpzest Creates Plans with Diverse Trade-offs\nOur first experimental claim is that the Palimpzest prototype can use the three optimization strategies we implemented to create a set of physical plans that offer appealing trade-offs, regardless of the user policy. Figure 3 shows the **observed** -- not estimated -- runtime, cost, and quality that came from executing three baseline plans and ~20 of the most Pareto-optimal plans suggested by Palimpzest. Plans closer to the bottom-right of each subplot are better.\n\n![Palimpzest results for all plans.](imgs/all-plans.png)\n<center>\n<p>\n<strong>Figure 3:</strong> Results from running the baselines and the ~20 most Pareto-optimal plans for each workload. See Section 5.3 in <a href=\"https://arxiv.org/pdf/2405.14696\">our paper</a> for more details.\n</p>\n</center>\n\nWe found that Palimpzest is able to create useful plans at a number of different points in the trade-off space. On the Legal Discovery workload, Palimpzest was able to suggest physical plans (e.g. `PLAN 1`) that dominated the GPT-3.5 and Mixtral baselines, with an F1-score that is 7.3x and 1.3x better (respectively) at lower runtime and cost. Compared to the GPT-4 baseline, Palimpzest produced a cluster of plans near `PLAN 1` that are ~4.7x faster and ~9.1x cheaper, while still achieving up to 85.7% of the GPT-4 plan's F1-score. 
On the Real Estate Search and Medical Schema Matching workloads, Palimpzest was able to produce plans with runtime speedups and cost savings in the range of 2.4x-4.6x, while also achieving better F1-scores than their GPT-4 baselines.\n\nPalimpzest's plans used a number of distinct optimizations to outperform the baselines:\n- **On Legal Discovery:** `PLAN 1` made use of **model selection** to judiciously replace calls to GPT-4 with faster and less expensive calls to GPT-3.5 and Mixtral-8x7B when doing so did little to diminish result quality. Other plans similar to `PLAN 1` used **synthesized code** to replace LLM calls entirely.\n- **On Real Estate Search:** `PLAN 2` made use of Palimpzest's logical optimizations to perform all text processing and filtering prior to any image processing, which required using a slow and expensive vision model.\n- **On Medical Schema Matching:** `PLAN 3` made use of model selection and **input token reduction** (i.e. compressing the total number of input tokens processed by the LLM).\n\n\n## Palimpzest Optimizer Selects High-Quality Plans\nOur second experimental claim is that Palimpzest can identify plans that have better end-to-end runtime, cost, and quality than a naive plan that uses the same state-of-the-art language model for each operation. 
To evaluate this claim, we ran three policies for each workload:\n- **Policy A** was to maximize quality at cost \\<$20.0, $3.0, and $2.0 for Legal Discovery, Real Estate Search, and Medical Schema Matching, respectively.\n- **Policy B** was to maximize quality at runtime \\<10,000s, 600s, and 1000s (for the same order of workloads)\n- **Policy C** was to minimize cost at an F1-score >0.8, 0.8, and 0.4 (for the same order of workloads).\n\nThese fixed cost, quality, and runtime thresholds were set to be challenging, yet physically attainable based on our results from the previous section.\n\nFigure 4 presents our results across three performance metrics, with each metric displayed in a separate column. The results are organized by workload, with each workload represented in a row of subplots. Within each subplot, results are further divided by the physical plan selected by the optimizer, with each plan occupying a separate row. We compared the plans chosen by Palimpzest to a baseline plan, which employs GPT-4 for all conversion and filtering operations.\n\n![Palimpzest results for re-optimization.](imgs/reopt.png)\n<center>\n<p>\n<strong>Figure 4:</strong> Results from running Palimpzest on each workload for three different policies. See Section 5.4 in <a href=\"https://arxiv.org/pdf/2405.14696\">our paper</a> for more details.\n</p>\n</center>\n\nOverall, we found that Palimpzest identifies plans in the space of physical candidates which:\n1. Offer significant performance improvements over the GPT-4 baseline\n2. Generally satisfy policy constraints (7 out of 9 satisfied), and\n3. Have speedups and cost savings which outweigh the overhead of collecting sample data.\n\nFor example, on the Legal Discovery workload (first row in Figure 4), the plan selected by Palimpzest for **Policy A** achieved a runtime and financial cost that are 80.0% and 89.7% lower than the GPT-4 baseline, respectively, at an F1-score within 81.1% of the baseline. 
The system achieved similar results for **Policy C**, with slightly better quality (within 84.3% of the baseline).\n\nFor the Real Estate Search workload (second row in Figure 4), the plans chosen by Palimpzest achieved (on average) 67.5% lower runtime, 65.7% lower cost, and a 6% better F1-score than the baseline GPT-4 plan. Finally, on the Medical Schema Matching workload (third row in Figure 4), Palimpzest was able to identify plans that provide up to 47.2% and 36.3% lower runtime and cost, respectively, at comparable F1-scores to the GPT-4 baseline.\n\n## Palimpzest Can Provide Speedups of up to 90.3x for Free\nFor our final evaluation, we ran Palimpzest with parallel implementations of its most time-consuming operations to demonstrate the system's ability to achieve large runtime speedups -- with competitive costs and F1-scores -- relative to a single-threaded baseline plan. We allowed each parallelized operator within Palimpzest to use 32 workers. The results of our evaluation are shown in Figure 5.\n\n![Palimpzest results for parallel speedup.](imgs/speedup-small.png)\n<center>\n<p>\n<strong>Figure 5:</strong> Results from running Palimpzest with parallelism enabled. See Section 5.5 in <a href=\"https://arxiv.org/pdf/2405.14696\">our paper</a> for more details.\n</p>\n</center>\n\nWe can see that on Legal Discovery, Palimpzest achieved a **90.3x speedup at 9.1x lower cost while obtaining an F1-score within 83.5% of the GPT-4 baseline**. On Real Estate Search and Medical Schema Matching, the optimized plan dominated the GPT-4 baseline on all metrics. **These plans achieve better F1-scores than their baselines, and do so with speedups of 20.0x and 5.6x as well as cost savings of 2.9x and 1.5x, respectively.**\n\nWe do not use any exotic algorithms, and of course it is straightforward to run model prompts in parallel.  
Palimpzest's abstractions simply allow the system to obtain these speedups with no additional work by the user.\n\n\n## Read the Paper\n\n[Chunwei Liu*, Matthew Russo*, Michael Cafarella, Lei Cao, Peter Baile Chen, Zui Chen, Michael Franklin, Tim Kraska, Samuel Madden, Gerardo Vitagliano. A Declarative System for Optimizing AI Workloads](https://arxiv.org/pdf/2405.14696)\n\n\\* Denotes equal contribution.\n\n<Bibtex>{`@inproceedings{palimpzestCIDR,\n    title={Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing},\n    author={Liu, Chunwei and Russo, Matthew and Cafarella, Michael and Cao, Lei and Chen, Peter Baile and Chen, Zui and Franklin, Michael and Kraska, Tim and Madden, Samuel and Shahout, Rana and Vitagliano, Gerardo},\n    booktitle = {Proceedings of the {{Conference}} on {{Innovative Database Research}} ({{CIDR}})},\n    date = 2025,\n}`}</Bibtex>\n\n## Project Participants\n\nChunwei Liu, Matthew Russo, Michael Cafarella, Peter Baile Chen, Zui Chen, Tim Kraska, Samuel Madden, Gerardo Vitagliano \n"
  },
  {
    "path": "website/blog/authors.yml",
    "content": "mdrusso:\n  name: Matthew Russo\n  title: PhD Student @ MIT\n  url: https://mdr223.github.io/\n  image_url: /img/mdrusso.png\n  page: true\n  socials:\n    x: RussoMatthew\n    linkedin: matthew-russo-7a1931105\n    github: mdr223\n\nmjc:\n  name: Michael Cafarella\n  title: Principal Research Scientist @ MIT\n  url: https://www.csail.mit.edu/person/michael-cafarella\n  image_url: /img/cafarella.jpg\n  # page:\n  #   # customize the url of the author page at /blog/authors/<permalink>\n  #   permalink: '/all-sebastien-lorber-articles'\n  page: true\n  socials:\n    x: MikeCafarella\n    github: mikecafarella\n"
  },
  {
    "path": "website/blog/tags.yml",
    "content": "Semantic Operators:\n  label: Semantic Operators\n  permalink: /semantic-operators\n  description: Blog posts about semantic operators\n\nPalimpzest:\n  label: Palimpzest\n  permalink: /palimpzest\n  description: Blog posts about Palimpzest\n\nAbacus:\n  label: Abacus\n  permalink: /abacus\n  description: Blog posts about Abacus\n"
  },
  {
    "path": "website/docs/api/overview.mdx",
    "content": "---\ntitle: Developer Documentation\n---\nPlease see our [User Guides](user-guide/overview.mdx) for documentation on how to use Palimpzest.\n\nIn the future, this page will link to developer documentation for those interested in contributing to Palimpzest!\n\nIf you wish to contribute to PZ and have questions about specific classes or interfaces, please reach out to us on [Discord](https://discord.gg/dN85JJ6jaH).\n{/*\nThis section contains the full documentation for the major classes and methods in PZ.\n\nOur documentation is separated into the following categories:\n\n- `Dataset`: `pz.Dataset` and all user-facing operator methods.\n- `Data`: all classes involved in reading, representing, and storing data in PZ.\n- `Operators`: the physical operators used internally by PZ.\n- `Optimization`: all classes involved with optimization.\n\nThese pages are still a work in progress, so please reach out to us on [Discord](https://discord.gg/dN85JJ6jaH) if there is missing documentation that you would like to see!\n*/}"
  },
  {
    "path": "website/docs/getting-started/installation.mdx",
    "content": "---\ntitle: Installation\n---\n\nYou can install the latest stable version of Palimpzest using `pip`:\n\n```bash\n$ pip install palimpzest\n```\n\nYou can also install PZ with [uv](https://docs.astral.sh/uv/) for a faster installation:\n```bash\n$ uv pip install palimpzest\n```\n\nAlternatively, if you would like to install Palimpzest from source, you can clone our GitHub repo:\n```bash\n$ git clone git@github.com:mitdbg/palimpzest.git\n$ cd palimpzest\n$ pip install .\n```\n"
  },
  {
    "path": "website/docs/getting-started/next-steps.mdx",
    "content": "---\ntitle: Next Steps\n---\n{/* ## Goal\nThis page should guide the reader towards the first page of our user guide (`docs/user-guide/overview.md`), and remove any sense of \"unknown unknowns\". In other words, the user should have a complete picture of what is left to learn about PZ when they finish reading this page.\n\nIt would probably be okay to duplicate a lot of the links in `docs/user-guide/overview.md` right here, so that the user can see what they cover.\n\n- User Guide 1\n- User Guide 2\n- etc.\n\nNext, the page should link to \"barebones\" documentation, i.e. the \"third-layer\" pages which contain class and function definitions.\n\n- `Context`\n- `Dataset`\n- `Aggregate`\n- `Convert` (or whatever we're calling it)\n- etc.\n\nFinally, the page should link to the remaining cool PZ stuff, including:\n\n- Our \"Cookbook\" / tutorials repo (which I will set up any day now)\n- Our blog post(s)\n- Our talk(s)\n- Our research papers\n- etc.\n*/}\n\nNow that you've learned the basics of PZ, it's time to explore topics we've touched on in more detail.\n\nOur [User Guide](user-guide/overview.mdx) contains deeper dives into the most important aspects of PZ, including:\n\n- [How to read your own data](user-guide/dataset.mdx)\n- [An overview of all operators in PZ](user-guide/operators/overview.mdx)\n- [A primer on optimization](user-guide/optimization.mdx)\n\nFor developers who are looking to contribute to PZ, we would also encourage you to explore our [full documentation](api/overview.mdx) as well.\n\nFinally, feel free to check out other resources on PZ:\n\n- [Join our Community](https://discord.gg/dN85JJ6jaH) (if you haven't already)\n- [PalimpChat](http://3.213.4.62:8888/) (research prototype of a chat interface for PZ)\n- [Research](/research) (a timeline of our research papers)\n"
  },
  {
    "path": "website/docs/getting-started/quickstart.mdx",
    "content": "---\ntitle: Quick Start Tutorial\n---\n{/* ## Goal\nThis page should expand upon the \"teaser\" code shown at the very beginning of `docs/index.md`.\n\nEvery user who reads this page should be left with the following impression:\n\n1. PZ is easy to get started with\n2. PZ is powerful: many (serious) AI programs can be written in PZ\n3. PZ has an optimizer, which can help optimize the user's program\n4. The user can interact with and override the optimizer's decisions\n\nThe ideal way to do this is with the old mantra \"show don't tell.\" A great example application can start very small and simple (1.), then grow to be more complex (2.), then make a call to our optimizer (3.), after which a user inspects the plan and modifies certain aspects of it (4.).\n\n(In this page, I don't expect us to do any more than call `plan = ds.optimize()` for demonstrating (3.). For now, we should only allude to more powerful optimization which we will talk about later in the docs). -->\n*/}\n\n### 💽 Creating a Dataset\nLet's revisit our example from the [Getting Started](intro.mdx) page in more depth, starting with the first two lines:\n```python\nimport palimpzest as pz\n\nemails = pz.TextFileDataset(id=\"enron-emails\", path=\"emails/\")\n```\nIn this example, we provide `pz.TextFileDataset`'s constructor with a unique identifier for the dataset and a path to the local directory containing the dataset files. The directory has a flat structure, with one email per file:\n```bash\nemails\n├── allen-p-inbox-42.txt\n├── allen-p-inbox-45.txt\n...\n└── whalley-g-merchant-investments-3.txt\n```\nGiven this directory, PZ will create a [`pz.IterDataset`](user-guide/dataset.mdx), which iterates over the files in the directory at runtime.\n\n<details>\n  <summary>What if my data isn't simply text files?</summary>\n\n  That's perfectly fine!\n    \n  The `pz.IterDataset` class can be subclassed by the user to read data from more complex sources. 
The user just has to:\n    \n    1. implement `pz.IterDataset`'s `__len__()` method\n    2. implement `pz.IterDataset`'s `__getitem__(idx)` method\n\n    PZ also provides the following built-in dataset classes for reading local files:\n    - `pz.MemoryDataset`\n      - Loads data from (1) a list of dictionaries or (2) a pandas DataFrame provided by the user. Useful for small datasets which can fit in memory.\n    - `pz.PDFFileDataset`\n      - Loads all PDF files in a directory. Yields `filename`, `contents`, and `text_contents` fields, where the latter is the text extracted from the PDF.\n    - `pz.ImageFileDataset`\n      - Loads all image files in a directory. Yields `filename` and `contents` fields, where the latter is the base-64 encoded version of the image.\n    - `pz.AudioFileDataset`\n      - Loads all audio files (.wav) in a directory. Yields `filename` and `contents` fields, where the latter is the base-64 encoded version of the audio file.\n    - `pz.HTMLFileDataset`\n      - Loads all HTML files in a directory. Yields `filename`, `html`, and `text` fields, where the latter is the text parsed from the raw HTML.\n    - `pz.XLSFileDataset`\n      - Loads all Excel files (.xls, .xlsx) in a directory. Yields `filename`, `contents`, `sheet_names`, and `number_sheets`.\n\n  More details can be found in our [user guide for custom Datasets](user-guide/dataset.mdx).\n</details>\n\n\nThe `pz.IterDataset` will emit one dictionary per file to the next operator in the program. 
By default, each dictionary will have two keys: `contents` and `filename` which map to the file's contents and filename, respectively:\n\n```python\nimport palimpzest as pz\n\nemails = pz.TextFileDataset(id=\"enron-emails\", path=\"emails/\")\noutput = emails.run()\n\nprint(output.to_df())\n\n# This produces the following output:\n#                    filename                                           contents\n# 0      giron-d-inbox-13.txt  Message-ID: <14025496.1075840554055.JavaMail.e...\n# 1      cash-m-inbox-143.txt  Message-ID: <26598080.1075855360503.JavaMail.e...\n# 2     cuilla-m-inbox-24.txt  Message-ID: <300661.1075853095557.JavaMail.eva...\n# 3      beck-s-inbox-149.txt  Message-ID: <17744967.1075840358477.JavaMail.e...\n# 4      allen-p-inbox-78.txt  Message-ID: <26175277.1075863149462.JavaMail.e...\n# ..                      ...                                                ...\n# 245    allen-p-inbox-84.txt  Message-ID: <7182251.1075863149647.JavaMail.ev...\n# 246  delainey-d-inbox-9.txt  Message-ID: <15156489.1075859109691.JavaMail.e...\n# 247    allen-p-inbox-45.txt  Message-ID: <31239550.1075858645503.JavaMail.e...\n# 248   blair-l-inbox-248.txt  Message-ID: <457022.1075861908047.JavaMail.eva...\n# 249   forney-j-inbox-53.txt  Message-ID: <12770457.1075859220577.JavaMail.e...\n\n# [250 rows x 2 columns]\n```\n\n<details>\n  <summary>What is `output`?</summary>\n\n  The `output` in the program above has type `pz.DataRecordCollection`.\n    \n  This object contains:\n    1. The data emitted by the PZ program\n    2. The execution stats (i.e. cost, runtime, and quality metrics) for the entire program\n\n  We expose the `pz.DataRecordCollection.to_df()` method to make it easy for users to get the output(s) of their program in a Pandas DataFrame. 
We will also expose other utility methods for processing execution statistics in the near future.\n</details>\n\n\n### 🪄 Computing New Fields\nA key feature of PZ is that it provides users with the ability to compute new fields using semantic operators. To compute new fields, users need to invoke the `sem_map()` method with a list of dictionaries defining the field(s) the system should compute:\n```python\nemails = emails.sem_map([\n    {\"name\": \"subject\", \"type\": str, \"desc\": \"the subject of the email\"},\n    {\"name\": \"sender\", \"type\": str, \"desc\": \"the email address of the sender\"},\n    {\"name\": \"summary\", \"type\": str, \"desc\": \"a brief summary of the email\"},\n])\n```\nIn order to fully define a field, each dictionary must have the following three keys:\n\n1. `name`: the name of the field\n2. `type`: the type of the field (one of `str`, `int`, `float`, `bool`, `list[str]`, ..., `list[bool]`)\n3. `desc`: a short natural language description defining what the field represents\n\nEquivalently, users can define schemas using a `pydantic.BaseModel`:\n```python\nfrom pydantic import BaseModel, Field\n\nclass EmailSchema(BaseModel):\n    subject: str = Field(description=\"the subject of the email\")\n    sender: str = Field(description=\"the email address of the sender\")\n    summary: str = Field(description=\"a brief summary of the email\")\n\n...\nemails = emails.sem_map(EmailSchema)\n```\nPZ will then use LLM(s) to generate the field(s) for each input (i.e. for each email in this example).\n\n:::tip[But what is `sem_map()` actually doing to generate the field(s)?]\n\n    It depends! (and this is where PZ's optimizer comes in handy)\n\n    Depending on the difficulty of the task and your preferred optimization objective (e.g. `max_quality`) PZ will select one implementation from a set of `PhysicalOperators` to generate your field(s).\n\n    PZ can choose from thousands of possible implementations of its `PhysicalOperators`. 
Each operator uses one (or more) LLMs and may use techniques such as RAG, Mixture-of-Agents, and Critique and Refine to produce a final output.\n\n    For a full list of `PhysicalOperators` in PZ, please consult our documentation on [Operators](user-guide/operators/overview.mdx).\n:::\n\n### ✂️ Filtering Inputs\nPZ also provides users with the ability to filter inputs using natural language. In order to apply a semantic filter, users need to invoke the `sem_filter()` method with a natural language description of the criteria they are *selecting for*:\n```python\nemails = emails.sem_filter(\n    'The email refers to one of the following business transactions: \"Raptor\", \"Deathstar\", \"Chewco\", and/or \"Fat Boy\"',\n)\nemails = emails.sem_filter(\n    \"The email contains a first-hand discussion of the business transaction\",\n)\n```\nThese filters will keep all emails which involve a first-hand discussion of one of the specified business transactions.\n\n### ⚒️ Naive Optimization and Execution\nFinally, once we've defined our program in PZ, we can execute it in order to generate our output:\n```python\noutput = emails.run(max_quality=True)\n```\nThe `run()` method triggers PZ's execution of the program that has been defined by applying semantic operators to `emails`. 
The `run()` method also takes a number of keyword arguments which can configure the execution of the program.\n\nIn particular, users can specify one ***optimization objective*** and (optionally) one ***constraint***:\n\n**Optimization objectives:**\n\n- `max_quality=True` (maximize output quality) \n- `min_cost=True` (minimize program cost)\n- `min_time=True` (minimize program runtime)\n\n**Constraints:**\n\n- `quality_threshold=<float>` (threshold in range [0, 1])\n- `cost_budget=<float>` (cost in US Dollars)\n- `time_budget=<float>` (time in seconds)\n\n:::note[More Info on Constraints]\n\n    PZ can only *estimate* the cost, quality, and runtime of each physical operator, therefore constraints are not guaranteed to be met. Furthermore, some constraints may be infeasible (even with perfect estimates).\n\n    In any case, PZ will make a best effort attempt to find the optimal plan for your stated objective and constraint (if present).\n\n    To achieve better estimates -- and thus better optimization outcomes -- please read our [Optimization User Guide](user-guide/optimization.mdx).\n:::\n\n### ✨ Optimizing Execution with a Validator\nIn the example above, we do not use a `pz.Validator` to help optimize the program. Therefore, operator quality is estimated using the MMLU Pro score(s) of the model(s) used by each operator, and the optimizer will simply use these estimates to select the highest quality operator(s).\n\nIn our [Optimization User Guide](user-guide/optimization.mdx) we show you how to provide a `pz.Validator` to improve the optimizer's performance. This includes both:\n1. Using an LLM-as-a-judge to optimize performance\n2. 
Using labeled validation data to optimize performance\n\nIn brief, we can specify the LLM judge for a `pz.Validator` and use it to optimize our program as follows:\n```python\nvalidator = pz.Validator(model=pz.Model.GPT_5)\noutput = emails.optimize_and_run(max_quality=True, validator=validator)\n```\n:::info\nPZ's [sample-based optimizer (Abacus)](https://arxiv.org/pdf/2505.14661) is only activated when `.optimize_and_run()` is used with a `pz.Validator`. Otherwise, PZ will use naive optimization based on MMLU Pro scores.\n:::\n\nIn this setting, PZ will evaluate the performance of different implementations of each semantic operator using GPT-5 while also measuring the cost and latency of each operator. After a fixed sampling budget is exhausted, PZ will select and execute the optimal plan for the program based on the observed performance of each operator.\n\n### 🔎 Examining Program Output\nFinally, once your program finishes executing you can convert its output to a Pandas DataFrame and examine the results:\n```python\nprint(output.to_df(cols=[\"filename\", \"sender\", \"subject\", \"summary\"]))\n```\nThe `cols` keyword argument allows you to select which columns should populate your DataFrame (if it is `None`, then all columns are selected).\n\nAs mentioned above, the `output` is a `pz.DataRecordCollection` object which contains the program output and all of the execution statistics for your program. We can use this to examine the total cost and runtime of our program:\n```python\n# format the runtime in seconds and the cost in US dollars\nprint(f\"Total time: {output.execution_stats.total_execution_time:.2f}s\")\nprint(f\"Total cost: ${output.execution_stats.total_execution_cost:.4f}\")\n```\nThis will produce output like:\n```\nTotal time: 18.70s\nTotal cost: $0.5390\n```\n\n### ➡️ What's Next?\nClick below to proceed to the `Next Steps`.\n"
  },
  {
    "path": "website/docs/intro.mdx",
    "content": "# Optimized Execution for Semantic Operators\n[![Discord](https://img.shields.io/discord/1245561987480420445?logo=discord)](https://discord.gg/dN85JJ6jaH)\n[![Colab Demo](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1zqOxnh_G6eZ8_xax6PvDr-EjMt7hp4R5?usp=sharing)\n[![PyPI](https://img.shields.io/pypi/v/palimpzest)](https://pypi.org/project/palimpzest/)\n[![PyPI - Monthly Downloads](https://img.shields.io/pypi/dm/palimpzest?color=teal)](https://pypi.org/project/palimpzest/)\n[![GitHub](https://img.shields.io/badge/GitHub-Code-blue?logo=github)](https://github.com/mitdbg/palimpzest)\n{/* [![Paper](https://img.shields.io/badge/Paper-arXiv-b31b1b?logo=arxiv)](https://arxiv.org/pdf/2405.14696) */}\n{/* [![Video](https://img.shields.io/badge/YouTube-Talk-red?logo=youtube)](https://youtu.be/T8VQfyBiki0?si=eiph57DSEkDNbEIu) */}\n\nPalimpzest (PZ) enables developers to process unstructured data (i.e. documents, images, audio, etc.) using **semantic operators** -- i.e. AI-powered data transformations. PZ programs are *declarative*, which means they express what computation should be performed without specifying exactly *how* to perform it. 
This allows PZ's optimizer to select the best way to execute each semantic operator in order to minimize cost, minimize latency, or maximize quality, possibly subject to constraints on the other dimensions.\n\n### ✉️ Example: Processing Emails with Semantic Operators\n<details>\n  <summary>Download Example Dataset</summary>\n\n  The following code snippet sets up PZ and downloads a small dataset of emails:\n\n  ```bash\n  # install palimpzest with pip\n  $ pip install palimpzest\n\n  # you can also install palimpzest with uv for a faster install\n  # $ uv pip install palimpzest\n\n  # set OpenAI API key\n  $ export OPENAI_API_KEY=\"<your-api-key>\"\n\n  # download and extract emails\n  $ wget https://palimpzest-workloads.s3.us-east-1.amazonaws.com/emails.zip\n  $ unzip emails.zip\n  ```\n</details>\nThe following PZ program extracts the subject, sender, and summary of emails which contain first-hand discussion of specific business transactions. The program is written in a high-level declarative language, and PZ automatically optimizes the execution of each semantic operator to maximize output quality:\n\n```python\nimport palimpzest as pz\n\n# load the emails into a dataset\nemails = pz.TextFileDataset(id=\"enron-emails\", path=\"emails/\")\n\n# filter for emails matching natural language criteria\nemails = emails.sem_filter(\n    'The email refers to one of the following business transactions: \"Raptor\", \"Deathstar\", \"Chewco\", and/or \"Fat Boy\"',\n)\nemails = emails.sem_filter(\n    \"The email contains a first-hand discussion of the business transaction\",\n)\n\n# extract structured fields for each email\nemails = emails.sem_map([\n    {\"name\": \"subject\", \"type\": str, \"desc\": \"the subject of the email\"},\n    {\"name\": \"sender\", \"type\": str, \"desc\": \"the email address of the sender\"},\n    {\"name\": \"summary\", \"type\": str, \"desc\": \"a brief summary of the email\"},\n])\n\n# execute the program and print the output\noutput = 
emails.run(max_quality=True)\n\nprint(output.to_df(cols=[\"filename\", \"sender\", \"subject\", \"summary\"]))\n```\nThe output from this program is shown below:\n```\n                               filename                                    subject                    sender                                            summary\n0  whalley-g-merchant-investments-3.txt         Enron Principal Investments Update   kevin.garland@enron.com  Kevin Garland provides an update on Enron Prin...\n1              kaminski-v-inbox-291.txt  RE: Pricing of restriction on Enron stock           baker@enron.com  Ron Baker clarifies to Vince Kaminski that the...\n2               delainey-d-sent-683.txt                                 Re: Raptor  david.delainey@enron.com  David Delainey responds about hedging and the ...\n3               kaminski-v-inbox-92.txt                                FW: Raptors      j.kaminski@enron.com  Vince Kaminski forwards a previously sent mess...\n4               delainey-d-sent-295.txt                                   AIG Fund  david.delainey@enron.com  David Delainey advises colleagues to revise DA...\n```\nThere are a few features of this program which are worth highlighting:\n\n1. The program creates a dataset from the directory of emails and defines a series of ***semantic operations*** on that dataset:\n    - `sem_filter()` selects for emails which satisfy each natural language predicate\n    - `sem_map()` specifies a set of fields which PZ must compute\n2. The user does not specify ***how*** each operation is performed -- they simply declare ***what*** they want PZ to compute\n    - This is what makes PZ declarative\n3. Internally, PZ's optimizer determines the best way to execute each semantic operator\n    - In this example, PZ optimizes for output quality because the user sets `max_quality=True`\n4. The `output` is not generated until the call to `emails.run()`\n    - i.e. 
PZ uses [lazy evaluation](https://en.wikipedia.org/wiki/Lazy_evaluation)\n\n### 🛠️ Naive Optimization\nThe program above is optimized using **naive prior beliefs** about the quality of different operator implementations for `sem_filter` and `sem_map`. In brief, PZ will implement each `sem_filter` and `sem_map` operation using the available LLM with the highest score on MMLU-Pro. To minimize cost or latency instead, we could call `emails.run(min_cost=True)` or `emails.run(min_time=True)`, and PZ would use the cheapest or fastest available LLM based on per-token costs or latencies.\n\nIn order to leverage the full power of PZ's optimizer, we need to provide PZ with a `pz.Validator` which can evaluate the quality, cost, and latency of different operator implementations for each semantic operation.\n\n### ✨ Optimizing Execution with a Validator\n**The real power of Palimpzest** comes from its ability to test multiple implementations of each operator and select the best one based on empirical performance. For example, PZ can select which model to use for each semantic operation -- perhaps opting for a cheaper model that performs well on a given semantic filter or a more expensive model for a challenging semantic map. 
However, model selection (also called \"model routing\") is only one part of PZ's space of optimizations, which includes:\n- Model selection\n- Ensemble methods (e.g., [Mixture-of-Agents](https://arxiv.org/pdf/2406.04692))\n- Refinement strategies (e.g., using a model to propose an answer, a second model to critique the answer, and a third model to refine the answer)\n- Context reduction strategies (e.g., using embedding similarity to feed only the top-k most relevant chunks from the input context into the LLM)\n- And more!\n\nIn order to make use of these advanced optimizations, we need to modify the above program to use a `pz.Validator` with `.optimize_and_run()`:\n```python\nimport palimpzest as pz\n\n...\n\n# execute the program and print the output\nvalidator = pz.Validator(model=pz.Model.GPT_5)\noutput = emails.optimize_and_run(max_quality=True, validator=validator)\n\nprint(output.to_df(cols=[\"filename\", \"sender\", \"subject\", \"summary\"]))\n```\nThe call to `.optimize_and_run()` will cause PZ to first perform an optimization loop where it samples different implementations of each semantic operator and evaluates them using the `pz.Validator` (in this case, GPT-5). 
After the optimization loop, PZ will select the best implementation of each operator and then execute the optimized program.\n\nThe output from this optimized program is shown below:\n```\n                                filename                                      subject                        sender                                            summary\n0                delainey-d-sent-295.txt                                     AIG Fund      david.delainey@enron.com  David Delainey advises toning down claims of s...\n1      kaminski-v-all-documents-2352.txt                         RE: Cross-Guarantees           ron.baker@enron.com  Ron Baker sends the latest drafts of three Cro...\n2                delainey-d-sent-318.txt  Re: ENA Comp suggestions for Project Raptor      david.delainey@enron.com  David Delainey tells David Oxley to drop the c...\n3             giron-d-sent-items-200.txt                           FW: Enron Mentions            c..giron@enron.com  Forwarded email compiling numerous news articl...\n4                delainey-d-sent-683.txt                                   Re: Raptor      david.delainey@enron.com  David Delainey discusses hedging and restructu...\n5   whalley-g-merchant-investments-3.txt           Enron Principal Investments Update       kevin.garland@enron.com  Update on Enron Principal Investments covering...\n6    parks-j-deleted-items-913-short.txt                                   FW: E memo  gregory.schockling@enron.com  Gregory Schockling forwards an email chain reg...\n7      kaminski-v-all-documents-2355.txt         Raptor Position Reports for 12/28/00           ron.baker@enron.com  Sends the latest Daily Position Report files f...\n8               kaminski-v-inbox-291.txt    RE: Pricing of restriction on Enron stock               baker@enron.com  Ron Baker clarifies confusion about parts of a...\n9                kaminski-v-inbox-92.txt                                  FW: Raptors          j.kaminski@enron.com  Vince 
Kaminski forwards a previously sent mess...\n10            beck-s-notes-inbox-166.txt              Re: Enron Raptor I P&L Reversal        shona.wilson@enron.com  Shona Wilson informs recipients that the DPR c...\n```\nThe optimized output contains more relevant emails than the naively optimized program, illustrating the benefit of PZ's optimization framework. In our [user guide on optimization](user-guide/optimization.mdx), we also show how to use labeled data (instead of an LLM) to drive optimization which can further improve performance.\n\n### 📈 Declarative Optimization for AI\nThe core philosophy behind PZ is that programmers should simply specify the high-level logic of their AI programs while offloading much of the performance tuning to a powerful optimizer. Of course, users should still be able to fully control their program and override / assist the optimizer (if needed) to get the best possible performance.\n\n### 🚀 More Semantic Operators\nThis email processing example only showcases a small set of the semantic operators implemented in PZ. Other operators include:\n\n- `sem_join()` which performs a semantic join between two datasets\n- `sem_aggregate()` which performs a semantic aggregation over a dataset\n- `sem_flat_map()` which performs semantic map operations producing multiple outputs per input (e.g. extracting all the authors on a single paper)\n- `sem_topk()` which takes a vector database and a search string as input and retrieves the top-k relevant entries from the database\n- `map()`, `flat_map()`, `filter()`, and `join()` which are the relational equivalents of `sem_map()`, `sem_flat_map()`, `sem_filter()`, and `sem_join()`\n- `groupby()`, `count()`, `average()`, `limit()`, and `project()` which mirror their implementations in frameworks like Pandas and Spark.\n\n{/* **[[start of quick editorial note]]**\nThis^ example is significantly improved from before, but it can be simplified and clarified further:\n1. 
If possible, we should show the (abbreviated) contents of the inputs\n2. As discussed offline, the `sem_add_columns()` arguments are very verbose, and it would be nice to support (and show off) syntax like:\n    - `emails.sem_add_columns([\"sender\", \"subject\"], prompt=\"Please compute the subject and sent date of the email\")`.\n**[[end of quick editorial note]]** */}\n\n{/* PZ provides the developer with a high-level interface for composing semantic operators into concise programs. The call to `emails.run()` triggers PZ's optimizer, which automatically selects which LLMs and execution strategies to use for each semantic operation. Users have the ability to fully control the program, and can override and assist the optimizer (if needed) to get the best possible performance. */}\n\n### 🙋🏽 Join our community\nWe strongly encourage you to join our [Discord server](https://discord.gg/dN85JJ6jaH) where we are happy to help you get started with PZ.\n\n### ➡️ What's Next?\nThe rest of our Getting Started section will:\n\n1. Help you install PZ\n2. Explore more of PZ's features in our [Quick Start Tutorial](getting-started/quickstart.mdx)\n3. Give you an overview of our [User Guides](user-guide/overview.mdx) which discuss features of PZ in more depth\n\n\n{/* Palimpzest is a **cost-based optimizer for AI-powered analytical workloads**. It enables users to express complex AI-powered data queries in a **high-level declarative language**, and it **automatically generates optimized execution plans** that minimize cost, maximize quality, or balance both.\n\nIn modern AI applications, executing queries efficiently is a challenge. 
A single query may require:\n\n* Extracting structured data from unstructured sources (e.g., PDFs, emails, research papers)\n* **Choosing between different AI models and inference methods**\n* **Managing trade-offs between execution speed, cost, and accuracy**\n* **Handling large-scale datasets while minimizing computational overhead**\n\nTraditionally, AI engineers must **manually fine tune** prompts, select models, and optimize inference strategies for each task. This process is not only time consuming but also requires constant updates as models evolve and costs fluctuate.\n\nPalimpzest **solves this problem** by applying **cost-based optimization techniques** similar to a database query optimizer to **AI-powered analytical queries**. Users write **declarative queries**, and Palimpzest:\n\n1. **Analyzes the query structure**  \n2. **Explores different execution plans**  \n3. **Estimates cost, runtime, and quality**  \n4. **Selects the optimal plan** based on user-defined constraints  \n\n🚀 **Quick Links**:\n\n- **[📄 Read the Paper](https://arxiv.org/pdf/2405.14696)**\n- **[📝 Read the Blog](https://dsg.csail.mit.edu/projects/palimpzest/)**\n- **[▶️ Watch the MIT Video](https://youtu.be/T8VQfyBiki0?si=eiph57DSEkDNbEIu)** \n\n\n!!! info \"Getting Started I: Install Palimpzest\"\n    === \"PyPi\"\n        You can find a stable version of the PZ package on PyPI [here](https://pypi.org/project/palimpzest/). To install the package, run:\n        ```bash\n        $ pip install palimpzest\n        ```\n    === \"Clone Repo\"\n        Clone the repository and install the package:\n\n        ```bash \n        git clone git@github.com:mitdbg/palimpzest.git\n        cd palimpzest\n        pip install .\n        ```\n\n!!! info \"Getting Started II: Demo PZ modules for various tasks\"\n\n    === \"Quick Start\"\n\n        The easiest way to get started with Palimpzest is to run the `quickstart.ipynb` jupyter notebook. 
We demonstrate the full workflow of working with PZ, including registering a dataset, composing and executing a pipeline, and accessing the results.\n        To run the notebook, you can use the following command:\n            ```bash\n            $ jupyter notebook\n            ```\n        And then access the notebook from the jupyter interface in your browser at `localhost:8888`.\n\n    === \"Even Quicker Start\"\n\n        For eager readers, the code in the notebook can be found in the following condensed snippet. However, we do suggest reading the notebook as it contains more insight into each element of the program.\n        ```python\n        import pandas as pd\n        import palimpzest.datamanager.datamanager as pzdm\n        from palimpzest.core.data.dataset import Dataset\n        from palimpzest.core.lib.fields import Field\n        from palimpzest.core.lib.schemas import Schema, TextFile\n        from palimpzest.policy import MinCost, MaxQuality\n        from palimpzest.query.processor.config import QueryProcessorConfig\n\n        # Dataset registration\n        dataset_path = \"testdata/enron-tiny\"\n        dataset_name = \"enron-tiny\"\n        pzdm.DataDirectory().register_local_directory(dataset_path, dataset_name)\n\n        # Dataset loading\n        dataset = Dataset(dataset_name, schema=TextFile)\n\n        # Schema definition for the fields we wish to compute\n        class Email(Schema):\n            \"\"\"Represents an email, which in practice is usually from a text file\"\"\"\n            sender = Field(desc=\"The email address of the sender\")\n            subject = Field(desc=\"The subject of the email\")\n            date = Field(desc=\"The date the email was sent\")\n\n        # Lazy construction of computation to filter for emails about holidays sent in July\n        dataset = dataset.convert(Email, desc=\"An email from the Enron dataset\")\n        dataset = dataset.filter(\"The email was sent in July\")\n        dataset = 
dataset.filter(\"The email is about holidays\")\n\n        # Executing the computation\n        policy = MinCost()\n        config = QueryProcessorConfig(\n            policy=policy,\n            verbose=True,\n            processing_strategy=\"no_sentinel\",\n            execution_strategy=\"sequential\",\n            optimizer_strategy=\"pareto\",\n        )\n        results, execution_stats = dataset.run(config)\n\n        # Writing output to disk\n        output_df = pd.DataFrame([r.to_dict() for r in results])[[\"date\",\"sender\",\"subject\"]]\n        output_df.to_csv(\"july_holiday_emails.csv\")\n        ```\n\n    === \"Python Demos\"\n\n        Below are simple instructions to run PZ on a test data set of enron emails that is included with the system.\n\n        ### Downloading test data\n        To run the provided demos, you will need to download the test data. Due to the size of the data, we are unable to include it in the repository. You can download the test data by running the following command from a unix terminal (requires `wget` and `tar`):\n        ```\n        chmod +x testdata/download-testdata.sh\n        ./testdata/download-testdata.sh\n        ```\n        For convenience, we have also provided a script to register all test data with Palimpzest:\n        ```\n        chmod +x testdata/register-sources.sh\n        ./testdata/register-sources.sh\n        ```\n\n        ### Running the Demos\n        - Initialize the configuration by running `pz init`.\n\n        - Palimpzest defaults to using OpenAI. You’ll need to export an environment variable `OPENAI_API_KEY`\n\n        - (Skip this step if you ran the `register-sources.sh` script successfully) Add the enron data set with:\n        `pz reg --path testdata/enron-tiny --name enron-tiny`\n\n        - Finally, run the simple test program with:\n            `python demos/simpleDemo.py --task enron --datasetid enron-eval-tiny --verbose` -->\n*/}"
  },
  {
    "path": "website/docs/user-guide/dataset.mdx",
    "content": "---\ntitle: Creating Your Own Dataset\n---\n{/* ## Goal\nThis page should provide the reader with a brief demonstration of how they can implement their own `Dataset`.\n\nKey takeaways for the reader should include:\n\n1. Knowing how to build a basic `Dataset`: (provide a schema, implement `__len__()`, implement `__getitem__()`)\n2. knowing how the data returned by `__getitem__()` can be accessed later in the program\n3. knowing how to provide multi-modal input (images must be `PIL.Image | list[PIL.Image]`)\n4. Knowing when and how the user can take advantage of an off-the-shelf `Dataset` provided by PZ\n\nKeeping in line with \"show don't tell\", this page should have a motivating use case and start with the implementation of a simple `Dataset`. The problem should then be expanded to include image and text inputs, with a demonstration of the changes made to the `Dataset` to support these.\n\nI would strongly suggest that we finish with example(s) demonstrating how to use our off-the-shelf `Datasets`. If the user understands (1.), (2.), and (3.), then (4.) will follow quite naturally. However, the inverse is not necessarily true.\n*/}\n\nAs shown in our [Quick Start Tutorial](getting-started/quickstart.mdx), Palimpzest provides built-in support for local datasets which consist solely of text, PDF, image, or audio files (as well as a few other special file types).\n\nHowever, many datasets contain heterogenous file types and/or more diverse input fields than just `filename` and `contents`. Fortunately, PZ's `pz.IterDataset` class can easily be extended to support virtually any dataset.\n\nIn this guide, we will demonstrate how to create your own custom dataset by subclassing `pz.IterDataset`.\n\n## 🔤 The Basics\n**The goal** of the `pz.IterDataset` class is to provide a simple interface for iterating over a dataset of records. 
Each record is represented as a dictionary, where the keys are the column names and the values are the corresponding data for that record.\n\nIn order to create your own dataset you will need to subclass `pz.IterDataset` and implement two methods:\n- `__len__()` which specifies the size of the dataset\n- `__getitem__(idx)` which returns a dictionary containing the keys and values for the `idx` item in the dataset\n\nAdditionally, you will need to provide an `id` and a `schema` when initializing your dataset.\n\nWe explore each of these aspects of the `pz.IterDataset` class in the following example.\n\n### 👉🏽 Example: Dataset for Zoo Animals\nSuppose you have a dataset of animals in a zoo, where each animal has an age, type (e.g. dog, cat, etc.), and name. Let's assume each of these fields is stored in a separate list as follows:\n```\nages = [5, 3, 8, 2, 4]\nanimals = [\"dog\", \"cat\", \"parrot\", \"rabbit\", \"hamster\"]\nnames = [\"Buddy\", \"Whiskers\", \"Polly\", \"Thumper\", \"Nibbles\"]\n```\nWe can create a custom dataset for these animals by subclassing `pz.IterDataset` as follows:\n```python\nimport palimpzest as pz\n\nanimal_schema = [\n    {\"name\": \"age\", \"type\": int, \"desc\": \"The animal's age in years\"},\n    {\"name\": \"animal\", \"type\": str, \"desc\": \"The type of animal (dog, cat, etc.)\"},\n    {\"name\": \"name\", \"type\": str, \"desc\": \"The name of the animal\"},\n]\n\nclass AnimalDataset(pz.IterDataset):\n    def __init__(self, ages: list[int], animals: list[str], names: list[str]) -> None:\n        super().__init__(id=\"zoo-animals\", schema=animal_schema)\n        self.ages = ages\n        self.animals = animals\n        self.names = names\n\n    def __len__(self) -> int:\n        return len(self.names)\n\n    def __getitem__(self, idx: int) -> dict:\n        return {\"age\": self.ages[idx], \"animal\": self.animals[idx], \"name\": self.names[idx]}\n```\nOnce we have defined our `AnimalDataset`, we can create an instance of it 
and use it in a PZ program as follows:\n```python\n# create dataset instance\nages = [5, 3, 8, 2, 4]\nanimals = [\"dog\", \"cat\", \"parrot\", \"rabbit\", \"hamster\"]\nnames = [\"Buddy\", \"Whiskers\", \"Polly\", \"Thumper\", \"Nibbles\"]\ndataset = AnimalDataset(ages, animals, names)\n\n# use dataset in a PZ program\ndataset = dataset.sem_filter(\"The animal has four legs and is younger than five years old.\")\ndataset = dataset.sem_map(\n    cols=[\n        {\"name\": \"greeting\", \"type\": str, \"desc\": \"A greeting for the animal\"},\n    ],\n)\noutput = dataset.run(max_quality=True)\nprint(output.to_df())\n```\nThis will output a dataframe containing the animals that match the filter criteria, along with a greeting for each animal.\n```\n   age   animal      name                                           greeting\n0    3      cat  Whiskers       Hello Whiskers, you adorable 3-year-old cat!\n1    2   rabbit   Thumper               Hello Thumper the 2-year-old rabbit!\n2    4  hamster   Nibbles  Hello Nibbles the hamster! At 4 years old, you...\n```\n\n### 🔑 Key Points\n1. The `id` parameter in the `pz.IterDataset` constructor (i.e. `\"zoo-animals\"`) is a unique identifier for the dataset. It can be any string, but it should be unique across all datasets used in a PZ program.\n2. The `schema` parameter in the `pz.IterDataset` constructor is a list of dictionaries, where each dictionary describes a column in the dataset. Each dictionary should contain the following keys:\n   - `name`: The name of the column (string)\n   - `type`: The data type of the column (e.g. `str`, `int`, `float`, etc.)\n   - `desc`: A brief description of the column (string)\n3. The `__len__()` method should return the number of records in the dataset.\n4. The `__getitem__(idx)` method should return a dictionary containing the keys and values for the `idx` item in the dataset. 
The keys should match the column names specified in the `schema`, and the values should be of the corresponding data type.\n5. The data returned by `__getitem__()` can be accessed later in the program using the column names specified in the `schema`. For example, we could filter for animals with `age < 5` using the `age` column as follows:\n   ```python\n   dataset = dataset.filter(lambda record: record[\"age\"] < 5)\n   ```\n\n:::tip[Using the Same Dataset Multiple Times in a Single Program]\nIf you need to use the same dataset multiple times in a program, you can create multiple instances of the dataset with different `id`s. For example:\n```python\nclass AnimalDataset(pz.IterDataset):\n    def __init__(self, id: str, ages: list[int], ...) -> None:\n        super().__init__(id=id, schema=animal_schema)\n        self.ages = ages\n        ...\n\ndataset1 = AnimalDataset(\"zoo-animals1\", ages, animals, names)\ndataset2 = AnimalDataset(\"zoo-animals2\", ages, animals, names)\nds = dataset1.sem_join(dataset2, \"both animals have four legs\")\n...\n```\n:::\n\n## ⚒️ Using Built-in Datasets\nPZ provides several built-in datasets for common use cases, including:\n- `pz.MemoryDataset`\n    - Loads data from a list of dictionaries or a pandas DataFrame provided by the user. Useful for small datasets which can fit in memory.\n- `pz.PDFFileDataset`\n    - Loads all PDF files in a directory. Yields `filename`, `contents`, and `text_contents` fields, where `text_contents` is the text extracted from the PDF.\n- `pz.ImageFileDataset`\n    - Loads all image files in a directory. Yields `filename` and `contents` fields, where `contents` is the base-64 encoded version of the image.\n- `pz.AudioFileDataset`\n    - Loads all audio files (.wav) in a directory. Yields `filename` and `contents` fields, where `contents` is the base-64 encoded version of the audio file.\n- `pz.HTMLFileDataset`\n    - Loads all HTML files in a directory. 
Yields `filename`, `html`, and `text` fields, where `text` is the text parsed from the raw HTML.\n- `pz.XLSFileDataset`\n    - Loads all Excel files (.xls, .xlsx) in a directory. Yields `filename`, `contents`, `sheet_names`, and `number_sheets`.\n\nRevisiting our zoo animal example, we could use a `pz.MemoryDataset` to load the animal data as follows:\n```python\nimport palimpzest as pz\n\n# create dataset instance\ndata = [\n    {\"age\": 5, \"animal\": \"dog\", \"name\": \"Buddy\"},\n    {\"age\": 3, \"animal\": \"cat\", \"name\": \"Whiskers\"},\n    {\"age\": 8, \"animal\": \"parrot\", \"name\": \"Polly\"},\n    {\"age\": 2, \"animal\": \"rabbit\", \"name\": \"Thumper\"},\n    {\"age\": 4, \"animal\": \"hamster\", \"name\": \"Nibbles\"},\n]\ndataset = pz.MemoryDataset(id=\"zoo-animals\", vals=data)\n# dataset = pz.MemoryDataset(id=\"zoo-animals\", vals=pd.DataFrame(data))  # alternatively, load from a pandas DataFrame\n\n# use dataset in a PZ program\ndataset = dataset.sem_filter(\"The animal has four legs and is younger than five years old.\")\ndataset = dataset.sem_map(\n    cols=[\n        {\"name\": \"greeting\", \"type\": str, \"desc\": \"A greeting for the animal\"},\n    ],\n)\noutput = dataset.run(max_quality=True)\nprint(output.to_df())\n```\nNote that when using a `pz.MemoryDataset`, we do not need to provide a `schema`, as it is automatically inferred from the data.\n\nFor more information on PZ's built-in datasets, please see the [API documentation](api/overview.mdx) (coming soon).\n\n## 🖼️ Multi-Modal Datasets\nMany datasets contain multiple modalities of data, such as text and images. PZ's `pz.IterDataset` class can easily be extended to support multi-modal datasets by returning dictionaries with fields of type `pz.ImageFilepath` or `pz.AudioFilepath`.\n\nPalimpzest supports semantic operations over any combination of text, image(s), and audio data. 
Furthermore, if the image / audio data is not stored on disk, you can also use the `pz.ImageBase64` and `pz.AudioBase64` types to provide base-64 encoded data directly.\n\n### 👉🏽 Example: Dataset for Real Estate Listings\nSuppose we have a dataset of real estate listings, where each listing contains a text description and multiple images of the home:\n\n<div className=\"scrollable-images\">\n  <img src=\"/img/listing-img1.png\" alt=\"img1\" />\n  <img src=\"/img/listing-img2.png\" alt=\"img2\" />\n  <img src=\"/img/listing-img3.png\" alt=\"img3\" />\n</div>\n```\nDESCRIPTION\n-----------\nAddress: 123 Main St Unit 1A, Cambridge, MA 02139\nHome List Price: $1,234,000\n\nBuilt in 2015, this 1763 sq ft contemporary townhouse is only minutes away from the heart of Central Square...\n```\nAnd suppose that we store each listing in a directory, where the text description is stored in a `.txt` file and the images are stored as `.png` files:\n```\n├── listing1\n│   ├── img1.png\n│   ├── img2.png\n│   ├── img3.png\n│   └── listing-text.txt\n├── listing2\n│   ├── img1.png\n│   ├── img2.png\n│   ├── img3.png\n│   └── listing-text.txt\n└── listing3\n    ├── img1.png\n    ├── img2.png\n    ├── img3.png\n    └── listing-text.txt\n```\nWe can load each listing's description and images in a single data record as follows:\n```python\nimport os\n\nimport palimpzest as pz\n\nreal_estate_listing_cols = [\n    {\"name\": \"listing\", \"type\": str, \"desc\": \"The name of the listing\"},\n    {\"name\": \"text_content\", \"type\": str, \"desc\": \"The content of the listing's text description\"},\n    {\"name\": \"image_filepaths\", \"type\": list[pz.ImageFilepath], \"desc\": \"A list of the filepaths for each image of the listing\"},\n]\n\nclass RealEstateDataset(pz.IterDataset):\n    def __init__(self, listings_dir):\n        super().__init__(id=\"real-estate\", schema=real_estate_listing_cols)\n        self.listings_dir = listings_dir\n        self.listings = 
sorted(os.listdir(self.listings_dir))\n\n    def __len__(self):\n        return len(self.listings)\n\n    def __getitem__(self, idx: int):\n        # get listing: e.g. \"listing1\", \"listing2\", etc.\n        listing = self.listings[idx]\n\n        # get fields\n        image_filepaths, text_content = [], None\n        listing_dir = os.path.join(self.listings_dir, listing)\n        for file in os.listdir(listing_dir):\n            if file.endswith(\".txt\"):\n                with open(os.path.join(listing_dir, file), \"rb\") as f:\n                    text_content = f.read().decode(\"utf-8\")\n            elif file.endswith(\".png\"):\n                image_filepaths.append(os.path.join(listing_dir, file))\n\n        # construct and return dictionary with fields\n        return {\"listing\": listing, \"text_content\": text_content, \"image_filepaths\": image_filepaths}\n```\nWe can now use this `RealEstateDataset` in a PZ program to find listings which are modern and attractive, have lots of natural sunlight, and are within our budget:\n```python\nimport palimpzest as pz\n\ntext_based_cols = [\n    {\"name\": \"address\", \"type\": str, \"desc\": \"The address of the property\"},\n    {\"name\": \"price\", \"type\": int | float, \"desc\": \"The listed price of the property\"},\n]\n\nimage_based_cols = [\n    {\"name\": \"is_modern_and_attractive\", \"type\": bool, \"desc\": \"True if the home interior design is modern and attractive and False otherwise\"},\n    {\"name\": \"has_natural_sunlight\", \"type\": bool, \"desc\": \"True if the home interior has lots of natural sunlight and False otherwise\"},\n]\n\ndef in_price_range(record: dict):\n    try:\n        price = record[\"price\"]\n        if isinstance(price, str):\n            price = price.strip()\n            price = int(price.replace(\"$\", \"\").replace(\",\", \"\"))\n        return 6e5 < price <= 2e6\n    except Exception:\n        return False\n\n# create PZ program\nds = 
RealEstateDataset(listings_dir=\"path/to/listings\")\nds = ds.sem_map(text_based_cols, depends_on=\"text_content\")\nds = ds.sem_map(image_based_cols, depends_on=\"image_filepaths\")\nds = ds.sem_filter(\n    \"The interior is modern and attractive, and has lots of natural sunlight\",\n    depends_on=[\"is_modern_and_attractive\", \"has_natural_sunlight\"],\n)\nds = ds.filter(in_price_range, depends_on=\"price\")\n\n# run the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThis program will extract the address and price from each listing's text description, determine whether the interior design is modern and attractive and whether the home has lots of natural sunlight from the images, filter for listings which meet our criteria, and finally return a dataframe containing the matching listings.\n\n### ➡️ What's Next?\nClick below to proceed to the overview on `Semantic Operators` supported in PZ.\n"
  },
  {
    "path": "website/docs/user-guide/operators/overview.mdx",
    "content": "---\ntitle: Overview\n---\n{/* ## Goal\nThis page should provide the reader with a brief overview of the various operators provided to them.\n\nKey takeaways for the reader should include:\n\n1. Seeing an example of each operator in action.\n2. Understanding the difference between semantic and non-semantic operators\n3. Having a general awareness of the inputs required by each operator (e.g. column specifications for `sem_add_columns()`)\n\nKeeping in line with \"show don't tell\", this page should have a motivating use case (or two) which evolve(s) in a manner that gradually incorporates each operator one-at-a-time. (If it's too difficult to come up with one example that uses all operators, then try to at least fit the most important operators -- `sem_add_columns()`, `filter()`, `retrieve()` -- into a single example. Additional examples can be created to showcase the other operators.)\n*/}\n\nPZ provides a variety of semantic operators that leverage large language models (LLMs) to perform data processing tasks. These operators can be broadly categorized into semantic and non-semantic (i.e. \"relational\") operators.\n\nWe provide a discussion and example(s) for using each of the semantic operators provided by PZ, before finishing with an overview of the relational operators.\n\n- [Semantic Map and Flat Map](user-guide/operators/sem_map.mdx)\n- [Semantic Filter](user-guide/operators/sem_filter.mdx)\n- [Semantic Join](user-guide/operators/sem_join.mdx)\n- [Semantic Aggregate](user-guide/operators/sem_agg.mdx)\n- [Semantic Top-K](user-guide/operators/sem_topk.mdx)\n- [Relational Operators](user-guide/operators/relational.mdx)\n"
  },
  {
    "path": "website/docs/user-guide/operators/relational.mdx",
    "content": "---\ntitle: Relational Operators\n---\nPalimpzest supports a variety of traditional relational operators that can be used in conjunction with semantic operators to perform data processing tasks. These operators include `map()`, `flat_map()`, `filter()`, `join()`, `groupby()`, `project()`, `limit()`, `distinct()`, and aggregations (e.g. `count()`, `average()`, `min()`, and `max()`).\n\nFor each operator, we will use the following dataset of customer orders as a running example:\n```\norders = [\n    {\"order_id\": 1, \"customer_id\": 101, \"status\": \"shipped\", \"items\": [\"shoes\", \"t-shirt\"], \"item_amounts\": [200.0, 50.0]},\n    {\"order_id\": 2, \"customer_id\": 102, \"status\": \"pending\", \"items\": [\"necklace\", \"shoes\", \"phone charger\"], \"item_amounts\": [100.0, 250.0, 25.0]},\n    {\"order_id\": 3, \"customer_id\": 101, \"status\": \"delivered\", \"items\": [\"espresso machine\"], \"item_amounts\": [300.0]},\n    {\"order_id\": 4, \"customer_id\": 103, \"status\": \"shipped\", \"items\": [\"dog food\", \"t-shirt\"], \"item_amounts\": [60.0, 50.0]},\n    {\"order_id\": 5, \"customer_id\": 104, \"status\": \"delivered\", \"items\": [\"book\", \"headphones\"], \"item_amounts\": [20.0, 150.0]},\n]\n```\nWe share an overview of how to implement each relational operator below. \n\n### Relational Map and Flat Map\n:::info[Key Features of `map()` and `flat_map()`]\n1. Both operators take a `pz.Dataset` as input and produce a new dataset with the same number of rows (for `map()`) or potentially more rows (for `flat_map()`).\n2. Both operators require a user-defined function (UDF) that specifies how to transform each row in the input dataset.\n3. The output schema of the new dataset can be specified via the `cols` parameter, which is a list of column specifications (dictionaries with `name`, `type`, and `desc` keys).\n4. 
The user-defined function will be provided with a row from the input dataset as a dictionary of keys and values, and should return a dictionary of keys and values corresponding to the output schema (i.e. `cols`).\n:::\nSuppose we wish to compute the total amount for each order in the `orders` dataset. We can use the `map()` operator to achieve this as follows:\n```python\nimport palimpzest as pz\n\n# create dataset from list of orders\nds = pz.MemoryDataset(id=\"orders\", vals=orders)\n\n# use map to compute total amount for each order\nds = ds.map(\n    udf=lambda row: {\"total_amount\": sum(row[\"item_amounts\"])},\n    cols=[\n        {\"name\": \"total_amount\", \"type\": float, \"desc\": \"The total amount for the order\"},\n    ],\n)\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThis produces a new dataset with the `total_amount` field computed for each order in the input dataset:\n```\n   order_id  customer_id     status                             items          item_amounts  total_amount\n0         2          102    pending  [necklace, shoes, phone charger]  [100.0, 250.0, 25.0]         375.0\n1         5          104  delivered                [book, headphones]         [20.0, 150.0]         170.0\n2         3          101  delivered                [espresso machine]               [300.0]         300.0\n3         4          103    shipped               [dog food, t-shirt]          [60.0, 50.0]         110.0\n4         1          101    shipped                  [shoes, t-shirt]         [200.0, 50.0]         250.0\n```\nAs another example, suppose we wish to create a new dataset with one record for each item in each order. 
We can use the `flat_map()` operator to achieve this as follows:\n```python\nimport palimpzest as pz\n\n# create dataset from list of orders\nds = pz.MemoryDataset(id=\"orders\", vals=orders)\n\n# use flat_map to create a new dataset with one record for each item in each order\nds = ds.flat_map(\n    udf=lambda row: [\n        {\"item\": item, \"amount\": amount}\n        for item, amount in zip(row[\"items\"], row[\"item_amounts\"])\n    ],\n    cols=[\n        {\"name\": \"item\", \"type\": str, \"desc\": \"An item in an order\"},\n        {\"name\": \"amount\", \"type\": float, \"desc\": \"The amount for the item\"},\n    ],\n)\nds = ds.project([\"order_id\", \"customer_id\", \"status\", \"item\", \"amount\"])\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThis produces a new dataset with one record for each item in each order:\n```\n   order_id  customer_id     status              item  amount\n0         2          102    pending     phone charger    25.0\n1         5          104  delivered        headphones   150.0\n2         1          101    shipped           t-shirt    50.0\n3         2          102    pending             shoes   250.0\n4         3          101  delivered  espresso machine   300.0\n5         2          102    pending          necklace   100.0\n6         1          101    shipped             shoes   200.0\n7         4          103    shipped          dog food    60.0\n8         4          103    shipped           t-shirt    50.0\n9         5          104  delivered              book    20.0\n```\nNote that we need to use `project()` in order to drop the list fields `items` and `item_amounts`, which would be duplicated across rows with the same `order_id`. In the future we may provide convenience parameters to `flat_map()` to automatically drop such fields.\n\n### Relational Filter\n:::info[Key Features of `filter()`]\n1. 
The filter operator takes a `pz.Dataset` as input and produces a new dataset with only rows that satisfy the filter predicate.\n2. The operator requires a user-defined function (UDF) that applies the filter predicate to each row in the input dataset.\n3. The user-defined function will be provided with a row from the input dataset as a dictionary of keys and values, and should return `True` or `False` indicating whether or not the row satisfies the predicate.\n:::\nSuppose we wish to filter the `orders` dataset to only include orders that have been shipped. We can use the `filter()` operator to achieve this as follows:\n```python\nimport palimpzest as pz\n\n# create dataset from list of orders\nds = pz.MemoryDataset(id=\"orders\", vals=orders)\n\n# use filter to retain only orders that have been shipped\nds = ds.filter(lambda row: row[\"status\"] == \"shipped\")\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThis produces a new dataset with only the records for orders that have been shipped:\n```\n   order_id  customer_id   status                items   item_amounts\n0         1          101  shipped     [shoes, t-shirt]  [200.0, 50.0]\n1         4          103  shipped  [dog food, t-shirt]   [60.0, 50.0]\n```\n\n### Relational Join\n:::info[Key Features of `join()`]\n1. The join operator takes in a left and a right `pz.Dataset` as input and produces a new dataset with rows that satisfy the join condition.\n2. The type of join can be specified via the `how` parameter, which can take on the values `\"inner\"` (default), `\"left\"`, `\"right\"`, or `\"outer\"`.\n3. If the two datasets have overlapping field names, the duplicate fields from the right dataset will be suffixed with `_right` in the output dataset to avoid naming conflicts.\n4. 
The operator only supports equi-joins on the (list of) field(s) specified in the `on` parameter.\n:::\nSuppose we have a second dataset containing customer information:\n```\ncustomers = [\n    {\"customer_id\": 101, \"name\": \"Alice\", \"email\": \"alice123@gmail.com\"},\n    {\"customer_id\": 102, \"name\": \"Bob\", \"email\": \"bob456@gmail.com\"},\n    {\"customer_id\": 103, \"name\": \"Charlie\", \"email\": \"charlie789@gmail.com\"},\n    {\"customer_id\": 104, \"name\": \"David\", \"email\": \"david10@gmail.com\"},\n    {\"customer_id\": 105, \"name\": \"Eve\", \"email\": \"eve11@gmail.com\"},\n]\n```\nWe can write the following PZ program to join the `orders` and `customers` datasets on the `customer_id` field:\n```python\nimport palimpzest as pz\n\n# create datasets from list of orders and customers\norders_ds = pz.MemoryDataset(id=\"orders\", vals=orders)\ncustomers_ds = pz.MemoryDataset(id=\"customers\", vals=customers)\n\n# use an inner join to combine the datasets on the customer_id field\nds = orders_ds.join(customers_ds, on=\"customer_id\", how=\"inner\")\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThis produces a new dataset with the joined records from the two input datasets:\n```\n   order_id  customer_id     status                             items          item_amounts     name                 email\n0         3          101  delivered                [espresso machine]               [300.0]    Alice    alice123@gmail.com\n1         5          104  delivered                [book, headphones]         [20.0, 150.0]    David     david10@gmail.com\n2         1          101    shipped                  [shoes, t-shirt]         [200.0, 50.0]    Alice    alice123@gmail.com\n3         4          103    shipped               [dog food, t-shirt]          [60.0, 50.0]  Charlie  charlie789@gmail.com\n4         2          102    pending  [necklace, shoes, phone charger]  [100.0, 250.0, 25.0]      Bob      
bob456@gmail.com\n```\nWe could also execute an outer join to retain customers without orders as follows:\n```python\nds = orders_ds.join(customers_ds, on=\"customer_id\", how=\"outer\")\n```\nWhich would produce the following output:\n```\n   order_id  customer_id     status                             items          item_amounts     name                 email\n0       3.0          101  delivered                [espresso machine]               [300.0]    Alice    alice123@gmail.com\n1       1.0          101    shipped                  [shoes, t-shirt]         [200.0, 50.0]    Alice    alice123@gmail.com\n2       4.0          103    shipped               [dog food, t-shirt]          [60.0, 50.0]  Charlie  charlie789@gmail.com\n3       2.0          102    pending  [necklace, shoes, phone charger]  [100.0, 250.0, 25.0]      Bob      bob456@gmail.com\n4       5.0          104  delivered                [book, headphones]         [20.0, 150.0]    David     david10@gmail.com\n5       NaN          105       None                              None                  None      Eve       eve11@gmail.com\n```\n\n### Group By\n:::info[Key Features of `groupby()`]\n1. The operator takes a `pz.Dataset` as input and produces a new dataset with one row per group.\n2. The groupby is specified via a `pz.GroupBySig` object that defines the fields to group by, the aggregation functions to apply, and the fields to aggregate.\n3. 
The aggregation functions can be any of the following: `\"count\"`, `\"sum\"`, `\"average\"`, `\"min\"`, `\"max\"`, `\"list\"`, or `\"set\"`.\n:::\nWe can use the `groupby()` operator to group the `orders` dataset by the `customer_id` field and compute the total amount spent by each customer as follows:\n```python\nimport palimpzest as pz\n\n# create dataset from list of orders\nds = pz.MemoryDataset(id=\"orders\", vals=orders)\n\n# use groupby to compute total amount spent by each customer\ngby = pz.GroupBySig(group_by_fields=[\"customer_id\"], agg_funcs=[\"sum\"], agg_fields=[\"item_amounts\"])\nds = ds.groupby(gby)\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThis produces a new dataset with one record for each customer and the total amount spent by that customer:\n```\n   customer_id  sum(item_amounts)\n0          102              375.0\n1          103              110.0\n2          101              550.0\n3          104              170.0\n```\n\n### Project\nThe `project()` operator allows users to select a subset of fields from a dataset. This is useful for dropping unnecessary fields and reducing the size of the dataset. 
For example, we can use the project operator to drop the `items` and `item_amounts` fields from the `orders` dataset as follows:\n```python\nimport palimpzest as pz\n\n# create dataset from list of orders\nds = pz.MemoryDataset(id=\"orders\", vals=orders)\n\n# use project to drop the items and item_amounts fields\nds = ds.project([\"order_id\", \"customer_id\", \"status\"])\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThis produces a new dataset with only the `order_id`, `customer_id`, and `status` fields:\n```\n   order_id  customer_id     status\n0         2          102    pending\n1         5          104  delivered\n2         1          101    shipped\n3         4          103    shipped\n4         3          101  delivered\n```\n\n### Limit\nThe `limit()` operator allows users to restrict the number of rows in a dataset. For example, we can use the limit operator to retain only the first 2 records from the `orders` dataset as follows:\n```python\nimport palimpzest as pz\n\n# create dataset from list of orders\nds = pz.MemoryDataset(id=\"orders\", vals=orders)\n\n# use limit to retain only the first 2 records\nds = ds.limit(2)\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThis produces a new dataset with only the first 2 records:\n```\n   order_id  customer_id   status                             items          item_amounts\n0         1          101  shipped                  [shoes, t-shirt]         [200.0, 50.0]\n1         2          102  pending  [necklace, shoes, phone charger]  [100.0, 250.0, 25.0]\n```\n\n### Distinct\nThe `distinct()` operator allows users to remove duplicate rows from a dataset, possibly based on a subset of fields. 
For example, we can use the distinct operator to compute the distinct customer IDs in the `orders` dataset as follows:\n```python\nimport palimpzest as pz\n\n# create dataset from list of orders\nds = pz.MemoryDataset(id=\"orders\", vals=orders)\n\n# use distinct to compute the distinct customer IDs\nds = ds.project([\"customer_id\"])\nds = ds.distinct([\"customer_id\"])\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThis produces a new dataset with only the distinct customer IDs:\n```\n   customer_id\n0          104\n1          102\n2          103\n3          101\n```\n\n### Aggregations\nPalimpzest supports the following aggregation functions:\n- `count()`: Counts the number of rows in a dataset.\n- `sum()`: Computes the sum of items in a dataset.\n- `average()`: Computes the average of items in a dataset.\n- `min()`: Computes the minimum value of items in a dataset.\n- `max()`: Computes the maximum value of items in a dataset.\n\n:::note\nAt the moment, each of `sum()`, `average()`, `min()`, and `max()` only supports aggregating over a single field of type `int` or `float`.\n:::\n\nWe provide an example of using each aggregation function below:\n```python\nimport palimpzest as pz\n\n# create dataset from list of orders\nds = pz.MemoryDataset(id=\"orders\", vals=orders)\n\n# use count to compute the number of orders\ncount_ds = ds.count()\n\n# compute the min / max order_id\nmin_ds = ds.project([\"order_id\"]).min()\nmax_ds = ds.project([\"order_id\"]).max()\n\n# compute the sum / average of all item amounts\nitem_ds = ds.project([\"item_amounts\"]).flat_map(\n    udf=lambda row: [{\"item_amount\": amount} for amount in row[\"item_amounts\"]],\n    cols=[{\"name\": \"item_amount\", \"type\": float, \"desc\": \"An item amount\"}],\n)\nsum_ds = item_ds.sum()\navg_ds = item_ds.average()\n\n# execute the programs\ncount_output = count_ds.run(max_quality=True)\n...\n```\n"
  },
  {
    "path": "website/docs/user-guide/operators/sem_agg.mdx",
    "content": "---\ntitle: Semantic Aggregate\n---\nThe `sem_agg()` operator allows users to perform an aggregation specified in natural language on a dataset. PZ implements this operator with a function (typically powered by an LLM) that aggregates over all rows in the input dataset and produces a single aggregate row as output.\n\nSemantic aggregation is useful for summarizing data based on natural language criteria which may be difficult to express with traditional aggregation functions. For example, aggregating a dataset of customer reviews to produce a summary of the overall sentiment and key themes.\n\n:::info[Key Features of `sem_agg()`]\n1. The operator takes a `pz.Dataset` as input and produces a new dataset with a single row containing the aggregate information.\n2. The (subset of) input field(s) which are used to apply the filter predicate can be specified via the `depends_on` parameter. Omitting a field from this list will ensure that it is not templated into the prompt when computing the aggregate.\n:::\n\n### Semantic Aggregate\nTo illustrate the use of `sem_agg()`, consider the following example where we have a dataset of product reviews and want to generate a summary of the primary complaints mentioned in the reviews.\n```python\nimport palimpzest as pz\n\n# create dataset from directory of product reviews\nds = pz.TextFileDataset(id=\"product-reviews\", path=\"path/to/reviews\")\n\n# use sem_agg to generate a summary of the primary complaints in the reviews\nds = ds.sem_agg(\n    col={'name': 'top_complaints', 'type': str, 'desc': 'The top-3 most common complaints mentioned in the reviews'},\n    agg=\"Compute the top-3 most common complaints mentioned in the reviews\",\n)\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThe call to `pz.TextFileDataset()` will create a dataset with one record for each text file in the `path/to/reviews` directory. 
Each record will have a `filename` and `contents` field, where `contents` is the text read from the file. The call to `sem_agg()` will produce a new dataset with a single record containing the `top_complaints` field computed from all of the reviews in the input dataset.\n\nAn example output might look like the following:\n```\n                                      top_complaints\n0  1. Poor battery life\\n2. Slow performance\\n3. ...\n```\n\n:::info[Context Management with `depends_on`]\nIn the above example, the `sem_agg()` operator will feed both the `filename` and `contents` fields into the LLM when computing the aggregate. However, only the `contents` field is really necessary.\n\nThe `depends_on` parameter can be used to specify which field(s) should be presented to the underlying LLM(s) when computing the semantic aggregate.\n\nFor example, if we only wanted to use the `contents` field to compute the aggregate, we could modify the `sem_agg()` call as follows:\n```python\nds = ds.sem_agg(\n    col={'name': 'top_complaints', 'type': str, 'desc': 'The top-3 most common complaints mentioned in the reviews'},\n    agg=\"Compute the top-3 most common complaints mentioned in the reviews\",\n    depends_on=[\"contents\"],\n)\n```\nThis ensures that only the `contents` field is included in the prompt when computing the aggregate, reducing token usage and potentially improving performance.\n:::\n"
  },
  {
    "path": "website/docs/user-guide/operators/sem_filter.mdx",
    "content": "---\ntitle: Semantic Filter\n---\nThe `sem_filter()` operator allows users to filter rows in a dataset based on a natural language predicate. PZ implements this operator with a function (typically powered by an LLM) that applies the predicate to each row in the dataset and retains only those rows for which the predicate evaluates to true.\n\nSemantic filter is good for filtering data based on natural language criteria which may be difficult to express with traditional boolean logic. For example, filtering a dataset of products to only include those that are \"eco-friendly\" or \"suitable for children\".\n\n:::info[Key Features of `sem_filter()`]\n1. The operator takes a `pz.Dataset` as input and produces a new dataset with only the subset of rows that satisfy the predicate.\n2. The (subset of) input field(s) which are used to apply the filter predicate can be specified via the `depends_on` parameter. Omitting a field from this list will not drop the field from the output dataset; it will simply not template the field into the prompt when computing the filter condition.\n:::\n\n### Semantic Filter\nTo illustrate the use of `sem_filter()`, consider the following example where we have a dataset of research papers and we want to filter for papers that are about batteries and from MIT.\n```python\nimport palimpzest as pz\n\n# create dataset from directory of research papers\nds = pz.PDFFileDataset(id=\"research-papers\", path=\"path/to/papers\")\n\n# use sem_filter to retain only papers about batteries from MIT\nds = ds.sem_filter(\"The paper is about batteries and from MIT\")\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThe call to `pz.PDFFileDataset()` will create a dataset with one record for each PDF in the `path/to/papers` directory. Each record will have a `filename`, `contents`, and `text_contents` fields, where `text_contents` is the text extracted from the PDF. 
The call to `sem_filter()` will produce a new dataset with only the records for research papers that are about batteries and from MIT.\n\nAn example output might look like the following:\n```\n      filename                                           contents                                      text_contents\n0  battery.pdf  b'%PDF-1.4\\r%\\xe2\\xe3\\xcf\\xd3\\r\\n735 0 obj\\r<<...  Review ARticle\\nhttps:/ / doi.org/10.1038/s415...\n```\n\n:::info[Context Management with `depends_on`]\nIn the above example, the `sem_filter()` operator will feed all of the `filename`, `contents`, and `text_contents` fields into the LLM when computing whether or not the paper satisfies the filter predicate. However, only the `text_contents` field is really necessary, and including the raw bytes of the `contents` field is redundant (increasing cost) and potentially distracting to the LLM as it clutters the context.\n\nTo address these concerns, the `depends_on` parameter can be used in order to specify which field(s) should be presented to the underlying LLM(s) when computing the predicate for a semantic filter.\n\nFor example, if we only wanted to use the `text_contents` field to apply the predicate, we could modify the `sem_filter()` call as follows:\n```python\nds = ds.sem_filter(\n    \"The paper is about batteries and from MIT\",\n    depends_on=[\"text_contents\"],\n)\n```\nThis ensures that only the `text_contents` field is included in the prompt when computing the predicate, reducing token usage and potentially improving performance.\n:::\n"
  },
  {
    "path": "website/docs/user-guide/operators/sem_join.mdx",
    "content": "---\ntitle: Semantic Join\n---\nThe `sem_join()` operator allows users to perform joins between two datasets based on a specified join condition. PZ implements this operator with a function (typically powered by an LLM) that evaluates the join condition for each pair of rows from the two datasets and outputs the pairs that satisfy the condition.\n\nSemantic join is useful for combining data from different sources based on natural language criteria which may be difficult to express with traditional join conditions. For example, joining a dataset of product descriptions with a dataset of reviews based on whether the review is primarily foces on the product.\n\nSemantic join is also useful for joining multiple modalities of data, such as joining a dataset of animal images with a dataset of animal descriptions based on whether the description describes the animal in the image. (We have even joined images of animals to audio recordings of the animal's sounds!)\n\n:::info[Key Features of `sem_join()`]\n1. The operator takes a left `pz.Dataset` and a right `pz.Dataset` as input and produces a new dataset with the pairs of rows that satisfy the join condition.\n2. The type of join can be specified via the `how` parameter, which can take on the values `\"inner\"` (default), `\"left\"`, `\"right\"`, or `\"outer\"`.\n3. If the two datasets have overlapping field names, the duplicate fields from the right dataset will be suffixed with `_right` in the output dataset to avoid naming conflicts.\n4. The (subset of) input field(s) which are used to apply the filter predicate can be specified via the `depends_on` parameter. 
Omitting a field from this list will not drop the field from the output dataset; it will simply not template the field into the prompt when computing the join condition.\n:::\n\n### Semantic Join\nSuppose we have two datasets: one containing descriptions of animals and another containing images of animals:\n```\nanimal-descs\n├── chamois.txt\n├── dog.txt\n├── elephant.txt\n├── lion.txt\n└── zebra.txt\nanimal-images\n├── chamois.jpg\n├── elephant.jpg\n├── gorilla.jpg\n├── monkey.jpg\n└── zebra.jpg\n```\nWe can write the following PZ program to join these datasets based on whether the description matches the animal in the image.\n\n```python\nimport palimpzest as pz\n\n# create datasets of animal descriptions and animal images\ntext_ds = pz.TextFileDataset(\n    id=\"animal-descriptions\",\n    path=\"animal-descs/\",\n)\nimage_ds = pz.ImageFileDataset(\n    id=\"animal-images\",\n    path=\"animal-images/\",\n)\n\n# use sem_join to join the datasets based on whether the description matches the animal in the image\njoined_ds = text_ds.sem_join(\n    image_ds,\n    \"The description matches the animal in the image\",\n)\n\n# execute the program\noutput = joined_ds.run(max_quality=True)\nprint(output.to_df())\n```\nThe call to `pz.TextFileDataset()` will create a dataset with one record for each text file in the `animal-descs/` directory. Each record will have a `filename` and `contents` field, where `contents` is the text read from the file. Similarly, the call to `pz.ImageFileDataset()` will create a dataset with one record for each image file in the `animal-images/` directory. Each record will have a `filename` and `contents` field, where `contents` is the image data read from the file. 
The call to `sem_join()` will produce a new dataset with the pairs of records from the two input datasets that satisfy the join condition.\n\nAn example output might look like the following:\n```\n       filename                                           contents filename_right                                     contents_right\n0  elephant.txt  Elephants are the largest living land animals....   elephant.jpg  /9j/4QB4RXhpZgAATU0AKgAAAAgABQEaAAUAAAABAAAASg...\n1   chamois.txt  The chamois (/ˈʃæmwɑː/;[2] French: [ʃamwa] ⓘ) ...    chamois.jpg  /9j/4QCKRXhpZgAATU0AKgAAAAgABgEaAAUAAAABAAAAVg...\n2     zebra.txt  Zebras (US: /ˈziːbrəz/, UK: /ˈzɛbrəz, ˈziː-/)[...      zebra.jpg  /9j/2wBDAAQDAwQDAwQEAwQFBAQFBgoHBgYGBg0JCggKDw...\n```\n\n:::info[Context Management with `depends_on`]\nIn the above example, the `sem_join()` operator will feed all of the left and right fields into the LLM when computing whether or not a join tuple satisfies the join predicate.\n\nHowever, if we want to ensure that only the text and image contents are being used to compute the join, the `depends_on` parameter can be used in order to specify which field(s) should be presented to the underlying LLM(s) when computing the join predicate.\n\nTo this end, we could modify the `sem_join()` call as follows:\n```python\njoined_ds = text_ds.sem_join(\n    image_ds,\n    \"The description matches the animal in the image\",\n    depends_on=[\"contents\", \"contents_right\"],\n)\n```\nThis ensures that only the text and image `contents` fields are included in the prompt when computing the predicate. Note that we have to use `contents_right` to refer to the `contents` field from the right dataset, as it is automatically renamed to avoid naming conflicts.\n:::\n\nIt is also possible to perform left, right, and outer semantic joins by specifying the `how` parameter. 
For example, to perform a left semantic join, we could modify the `sem_join()` call as follows:\n```python\njoined_ds = text_ds.sem_join(\n    image_ds,\n    \"The description matches the animal in the image\",\n    how=\"left\",\n)\n```\nThis will produce a dataset with all records from the left dataset and the matching records from the right dataset (if any). Records from the left dataset that do not have a matching record in the right dataset will have `None` values for the fields from the right dataset:\n```\n       filename                                           contents filename_right                                     contents_right\n0   chamois.txt  The chamois (/ˈʃæmwɑː/;[2] French: [ʃamwa] ⓘ) ...    chamois.jpg  /9j/4QCKRXhpZgAATU0AKgAAAAgABgEaAAUAAAABAAAAVg...\n1     zebra.txt  Zebras (US: /ˈziːbrəz/, UK: /ˈzɛbrəz, ˈziː-/)[...      zebra.jpg  /9j/2wBDAAQDAwQDAwQEAwQFBAQFBgoHBgYGBg0JCggKDw...\n2  elephant.txt  Elephants are the largest living land animals....   elephant.jpg  /9j/4QB4RXhpZgAATU0AKgAAAAgABQEaAAUAAAABAAAASg...\n3       dog.txt  The dog (Canis familiaris or Canis lupus famil...           None                                               None\n4      lion.txt  The lion (Panthera leo) is a large cat of the ...           None                                               None\n```\n"
  },
  {
    "path": "website/docs/user-guide/operators/sem_map.mdx",
    "content": "---\ntitle: Semantic Map and Flat Map\n---\nThe primary way to compute new fields in PZ is through the use of the `sem_map()` and `sem_flat_map()` operators. These operators allow users to apply a function (typically powered by an LLM) to each row in a dataset, producing either a single new row (`sem_map()`) or multiple new rows (`sem_flat_map()`) for each input row.\n\nSemantic map is not only good for information extraction (e.g. extracting entities from text / images / audio), but also for transforming data (e.g. translating text, captioning an image, summarizing audio, etc.).\n\n:::info[Key Features of `sem_map()` and `sem_flat_map()`]\n1. Each of these operators takes a `pz.Dataset` as input and produces a new dataset with the computed field(s) as output.\n2. Multiple fields can be computed in a single map operation by specifying a list of fields to compute.\n3. The (subset of) input field(s) which are used to compute the new field(s) can be specified via the `depends_on` parameter. 
Omitting a field from this list will not drop the field from the output dataset; it will simply not template the field into the prompt when computing the new field(s).\n:::\n\n### Semantic Map\nTo illustrate the use of `sem_map()`, consider the following example where we have a dataset of research papers and we want to extract the title and generate a summary for each paper.\n```python\nimport palimpzest as pz\n\n# define the columns we wish to compute\npaper_cols = [\n    {\"name\": \"title\", \"type\": str, \"desc\": \"The title of the paper\"},\n    {\"name\": \"summary\", \"type\": str, \"desc\": \"A brief summary of the paper's main contributions\"},\n]\n\n# create dataset from directory of research papers\nds = pz.PDFFileDataset(id=\"research-papers\", path=\"path/to/papers\")\n\n# use sem_map to extract title and generate summary for each paper\nds = ds.sem_map(paper_cols)\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nThe call to `pz.PDFFileDataset()` will create a dataset with one record for each PDF in the `path/to/papers` directory. Each record will have a `filename`, `contents`, and `text_contents` fields, where `text_contents` is the text extracted from the PDF. The call to `sem_map()` will produce a new dataset with the `title` and `summary` fields computed for each research paper in the input dataset.\n\nAn example output might look like the following:\n```\n         filename                                           contents                                      text_contents                                              title                                            summary\n0     crowddb.pdf  b'%PDF-1.4\\n%\\xe2\\xe3\\xcf\\xd3\\n2 0 obj\\n<</Len...  CrowdDB: Query Processing with the VLDB Crowd\\...      CrowdDB: Query Processing with the VLDB Crowd  CrowdDB is a hybrid database system that exten...\n1   webtables.pdf  b'%PDF-1.3\\n%\\xc4\\xe5\\xf2\\xe5\\xeb\\xa7\\xf3\\xa0\\...  
WebTables: Exploring the Power of Tables on th...  WebTables: Exploring the Power of Tables on th...  The paper introduces WebTables, a system that ...\n2    pandemic.pdf  b'%PDF-1.3\\n1 0 obj\\n<<\\n/Type /Pages\\n/Count ...   \\n \\nSince January 2020 Elsevier has created ...  Novel fractional order SIDARTHE mathematical m...  The paper introduces a novel fractional-order ...\n3  causallogs.pdf  b'%PDF-1.5\\n%\\xbf\\xf7\\xa2\\xfe\\n461 0 obj\\n<< /...  Unpublished working draft.Not for distribution...  From Logs to Causal Analysis: Extracting Data ...  The paper introduces a framework and prototype...\n4     battery.pdf  b'%PDF-1.4\\r%\\xe2\\xe3\\xcf\\xd3\\r\\n735 0 obj\\r<<...  Review ARticle\\nhttps:/ / doi.org/10.1038/s415...  Moving beyond 99.9% Coulombic efficiency for l...  This review assesses the state of lithium meta...\n5      tinydb.pdf  b'%PDF-1.4\\n3 0 obj <<\\n/Length 2702      \\n/F...  TinyDB: An Acquisitional Query Processing\\nSys...  TinyDB: An Acquisitional Query Processing Syst...  The paper introduces TinyDB, a distributed, SQ...\n```\n\n:::info[Context Management with `depends_on`]\nIn the above example, the `sem_map()` operator will feed all of the `filename`, `contents`, and `text_contents` fields into the LLM when computing the `title` and `summary`. 
However, only the `text_contents` field is really necessary, and including the raw bytes of the `contents` field is redundant (increasing cost) and potentially distracting to the LLM as it clutters the context.\n\nTo address these concerns, the `depends_on` parameter can be used in order to specify which field(s) should be presented to the underlying LLM(s) when computing the output fields for a semantic map.\n\nFor example, if we only wanted to use the `text_contents` field to compute the `title` and `summary`, we could modify the `sem_map()` call as follows:\n```python\nds = ds.sem_map(paper_cols, depends_on=[\"text_contents\"])\n```\nThis ensures that only the `text_contents` field is included in the prompt when generating the `title` and `summary`, reducing token usage and potentially improving performance.\n:::\n\n### Semantic Flat Map\nIn some cases, you may want to produce multiple output rows for each input row. This can be accomplished using the `sem_flat_map()` operator.\n\nFor example, suppose that we want to extract the name, institution, and email address of each author from the dataset of research papers. 
We can use `sem_flat_map()` to achieve this as follows:\n```python\nimport palimpzest as pz\n\n# define the columns we wish to compute\nauthor_cols = [\n    {\"name\": \"author_name\", \"type\": str, \"desc\": \"The name of the author\"},\n    {\"name\": \"institution\", \"type\": str, \"desc\": \"The institution the author is affiliated with\"},\n    {\"name\": \"email\", \"type\": str, \"desc\": \"The email address of the author\"},\n]\n\n# create dataset from directory of research papers\nds = pz.PDFFileDataset(id=\"research-papers\", path=\"path/to/papers\")\n\n# use sem_flat_map to extract authors for each paper\nds = ds.sem_flat_map(author_cols, depends_on=[\"text_contents\"])\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df(cols=[\"filename\", \"author_name\", \"institution\", \"email\"]))\n```\nThe call to `sem_flat_map()` will produce a new dataset with one record for each author extracted from each research paper in the input dataset. An example output might look like the following:\n```\n          filename            author_name                                        institution                     email\n0      crowddb.pdf             Amber Feng                                AMPLab, UC Berkeley   amber.feng@berkeley.edu\n1      crowddb.pdf       Michael Franklin                                AMPLab, UC Berkeley  franklin@cs.berkeley.edu\n2      crowddb.pdf        Donald Kossmann                          Systems Group, ETH Zurich       donaldk@inf.ethz.ch\n3      crowddb.pdf             Tim Kraska                                AMPLab, UC Berkeley    kraska@cs.berkeley.edu\n4      crowddb.pdf          Samuel Madden                                         CSAIL, MIT      madden@csail.mit.edu\n5      crowddb.pdf         Sukriti Ramesh                          Systems Group, ETH Zurich    ramess@student.ethz.ch\n6      crowddb.pdf            Andrew Wang                                AMPLab, UC Berkeley     
awang@cs.berkeley.edu\n7      crowddb.pdf            Reynold Xin                                AMPLab, UC Berkeley      rxin@cs.berkeley.edu\n8   causallogs.pdf                   None                                               None                      None\n9       tinydb.pdf       Samuel R. Madden              Massachusetts Institute of Technology                      None\n10      tinydb.pdf    Michael J. Franklin                                        UC Berkeley                      None\n11      tinydb.pdf  Joseph M. Hellerstein                                        UC Berkeley                      None\n12      tinydb.pdf               Wei Hong                           Intel Research, Berkeley                      None\n13     battery.pdf        Y. Shirley Meng                 University of California San Diego           shmeng@ucsd.edu\n14     battery.pdf         Yang Shao-Horn              Massachusetts Institute of Technology          shaohorn@mit.edu\n15     battery.pdf       Betar M. Gallant              Massachusetts Institute of Technology          bgallant@mit.edu\n16    pandemic.pdf              M. Higazy  Department of Mathematics and Statistics, Facu...        m.higazy@tu.edu.sa\n17   webtables.pdf   Michael J. Cafarella                           University of Washington     mjc@cs.washington.edu\n18   webtables.pdf            Alon Halevy                                       Google, Inc.         halevy@google.com\n19   webtables.pdf         Zhe Daisy Wang                                        UC Berkeley    daisyw@cs.berkeley.edu\n20   webtables.pdf              Eugene Wu                                                MIT          eugenewu@mit.edu\n21   webtables.pdf             Yang Zhang                                                MIT          yaaang@gmail.com\n```\nNote that the `causallogs.pdf` paper did not have any author information, so the computed field values are `None` for that record. 
Similarly, the `tinydb.pdf` paper only had author names and institutions, but no email addresses, so the `email` field is `None` for those records.\n"
  },
  {
    "path": "website/docs/user-guide/operators/sem_topk.mdx",
    "content": "---\ntitle: Semantic Top-K\n---\nThe `sem_topk()` operator allows users to retrieve the top-K most relevant entries from a vector database for each row in a dataset. PZ implements this operator with a function (typically powered by an embedding model) that evaluates the relevance of entries in the vector database to each row in the dataset and returns the top-K entries based on embedding similarity.\n\nSemantic Top-K is useful for augmenting a dataset with additional context or information from an external knowledge base. For example, retrieving the top-K most relevant documents from a knowledge base for each query in a dataset of search queries.\n\n:::info[Key Features of `sem_topk()`]\n1. The operator takes a `pz.Dataset` as input and produces a new dataset with the same number of rows, where each row is augmented with the top-K relevant entries from the vector database.\n2. The operator takes an `index` (i.e. a vector database) as input, currently PZ only supports `chromadb.Collection` objects as indices.\n3. The number of entries to retrieve (K) can be specified via the `k` parameter. If left unspecified, PZ's optimizer can search for the best value of K (when using `.optimize_and_run()`)\n4. A custom `search_func` can be provided to customize how the vector database is queried and how results are returned.\n:::\n\n### Semantic Top-K\nTo illustrate the use of `sem_topk()`, consider the following example where we have a dataset of research topics and want to augment each topic with the top-3 most relevant papers from a vector database of research papers. 
Suppose we have already created a ChromaDB collection with embeddings for the following papers:\n```\nresearch-papers\n├── battery.pdf\n├── crowddb.pdf\n├── dctcp.pdf\n├── hstoredb.pdf\n├── ionic-and-electronic-conductivity.pdf\n├── pie-bufferbloat.pdf\n├── semiconductor-mixed-ion.pdf\n├── snowflakedb.pdf\n└── xcp.pdf\n```\nWe can then write the following PZ program to augment a dataset of research topics with the top-3 most relevant papers from the vector database:\n```python\nimport palimpzest as pz\nimport chromadb\nfrom chromadb.utils.embedding_functions.openai_embedding_function import OpenAIEmbeddingFunction\nimport os\n\n# get vector database index (ChromaDB Collection)\nclient = chromadb.PersistentClient(path=\".chromadb-research-papers\")\nopenai_ef = OpenAIEmbeddingFunction(\n    api_key=os.environ[\"OPENAI_API_KEY\"],\n    model_name=\"text-embedding-3-small\",\n)\ncollection = client.get_collection(name=\"research-papers\", embedding_function=openai_ef)\n\n# create dataset of research topics\ntopics = [\n    {\"topic\": \"battery technology\"},\n    {\"topic\": \"database systems\"},\n    {\"topic\": \"network congestion control\"},\n]\nds = pz.MemoryDataset(\n    id=\"research-topics\",\n    vals=topics,\n    schema=[{\"name\": \"topic\", \"type\": str, \"desc\": \"A research topic\"}],\n)\n\n# define the search function logic for returning top-k results\ndef search_func(index: chromadb.Collection, query: list[list[float]], k: int) -> dict[str, list]:\n    # execute query with embeddings\n    results = index.query(query, n_results=k)\n\n    # return the top-k similar paper ids and paper texts\n    return {\"paper_ids\": results[\"ids\"][0], \"paper_texts\": results[\"documents\"][0]}\n\n# retrieve the top-3 most relevant papers for each topic\nds = ds.sem_topk(\n    index=collection,\n    search_func=search_func,\n    search_attr=\"topic\",\n    output_attrs=[\n        {\"name\": \"paper_ids\", \"type\": list[str], \"desc\": \"The IDs of the top-3 most 
relevant papers\"},\n        {\"name\": \"paper_texts\", \"type\": list[str], \"desc\": \"The text contents of the top-3 most relevant papers\"},\n    ],\n    k=3,\n)\n\n# execute the program\noutput = ds.run(max_quality=True)\nprint(output.to_df())\n```\nWe load the precomputed ChromaDB collection using the `chromadb` library. The call to `pz.MemoryDataset()` creates a dataset with one record for each research topic in the `topics` list. The call to `sem_topk()` uses the `topic` field to generate a query embedding which is provided to the `search_func` along with the value of `k`. The `collection` is then queried and the top-k results are returned. Ultimately, the `sem_topk()` operator produces a new dataset with the `paper_ids` and `paper_texts` fields computed for each research topic in the input dataset.\n\nAn example output might look like the following:\n```\n                        topic                                          paper_ids                                        paper_texts\n0            database systems       [hstoredb.pdf, snowflakedb.pdf, crowddb.pdf]  [H-Store: A High-Performance, Distributed Main...\n1          battery technology  [semiconductor-mixed-ion.pdf, battery.pdf, ion...  [PCCP\\nPAPER\\nCite this: Phys. Chem. Chem. Phy...\n2  network congestion control          [dctcp.pdf, xcp.pdf, pie-bufferbloat.pdf]  [Data Center TCP (DCTCP)\\nMohammad Alizadeh‡†,...\n```\n\n:::info[Providing a Custom Search Function]\nIn the above example, we provided a custom `search_func` to define how the vector database is queried and how results are returned. The `search_func` takes three parameters: the `index` (i.e., the ChromaDB collection), a list of query embeddings, and the number of results to return (K). 
The function executes the query on the index and returns a dictionary containing the top-K relevant paper IDs and paper texts.\n\nThe keys of the dictionary returned by `search_func` must match the names of the fields specified in the `output_attrs` parameter of `sem_topk()`. This ensures that the results from the search function are correctly mapped to the output fields in the resulting dataset.\n\nPZ has a default `search_func` that can be used if the user does not provide one; however, this default function will only work if the `output_attrs` contain a single field.\n:::\n\n### Optimizing k\nPalimpzest can automatically optimize the value of `k` (the number of top entries to retrieve) when using the `.optimize_and_run()` method. For example, we can modify the previous example to optimize `k` by omitting the `k` argument as follows:\n```python\nimport palimpzest as pz\n...\n# retrieve the most relevant papers for each topic (k is left unspecified so it can be optimized)\nds = ds.sem_topk(\n    index=collection,\n    search_func=search_func,\n    search_attr=\"topic\",\n    output_attrs=[\n        {\"name\": \"paper_ids\", \"type\": list[str], \"desc\": \"The IDs of the most relevant papers\"},\n        {\"name\": \"paper_texts\", \"type\": list[str], \"desc\": \"The text contents of the most relevant papers\"},\n    ],\n)\n\n# execute the program\nconfig = pz.QueryProcessorConfig(\n    sample_budget=15,\n    policy=pz.MaxQuality(),\n)\noutput = ds.optimize_and_run(config=config, validator=pz.Validator(model=pz.Model.GPT_5))\nprint(output.to_df())\n```\n\n:::tip[Optimizing K with Labels]\nTo effectively optimize `k`, we recommend providing a labeling function in the `pz.Validator` that can evaluate the quality of the results based on your specific criteria. In our experience, using an LLM-as-a-judge in this setting shows mixed results, as the LLM may optimize for recall rather than, e.g., precision.\n:::\n"
  },
  {
    "path": "website/docs/user-guide/optimization.mdx",
    "content": "---\ntitle: Optimization\n---\n{/* ## Goal\nThis page should provide the reader with a brief overview of the PZ optimizer, and -- more importantly -- demonstrate how they can interact with and control the optimization process.\n\nKey takeaways for the reader should include:\n\n1. Having a very basic understanding of what the optimizer does\n    - (Without scaring the user) I want to leave them with the impression that the optimizer is doing \"real work\" which provides value to them\n2. Knowing how to provide validation data to the optimizer\n3. Knowing how to examine the plan output by the optimizer\n4. Having a very basic understanding of how to pass hyperparameters to constrain the optimizer's search\n\nKeeping in line with \"show don't tell\", this page should have a motivating use case for which the user calls `plan = dataset.optimize()` and prints the resulting `plan`. We should then show how to construct e.g. 3 validation examples for the program, feed them into another call to `dataset.optimize()`, and print the new `plan` (ideally showing that it is more optimal).\n\nFinally, for (4.) we can show how to limit the optimizer's access to specific models (using the `avaiable_models` config option) ***for all operators***. In the near future, we may want to do some engineering to support per-operator limits on models. (Of course, we don't want users feeling like they have to tinker with the `Optimizer` too much. However, I think most new users will gravitate towards restricting models and customizing prompts -- and we should support this in the near term.)\n*/}\n\nWhen a user calls the `.run()` method on a PZ dataset, the PZ program is executed without any sample-based optimization. Instead, PZ uses naive prior beliefs about the quality, cost, and latency of each operator to generate a physical execution plan. 
This plan is then executed to produce the final output dataset.\n\nIn order to use Palimpzest's [Abacus optimizer](https://arxiv.org/abs/2505.14661), users can call the `.optimize_and_run()` method on a PZ dataset instead. This method will:\n1. Sample inputs and physical operators for each semantic operator in the PZ program\n2. Run the sampled operators on the sampled inputs and observe the quality, cost, and latency of each operator\n3. Iteratively draw more samples until a sample budget (default 100 operator-input pairs) is exhausted\n4. Compute the optimal physical execution plan given the sample-based estimates for each operator and the user's optimization objective\n5. Execute the optimal plan on the user's dataset\n\nIn this guide, we will illustrate how to use the `.optimize_and_run()` method and how to customize the optimization process. Specifically, we will cover:\n1. [How to use an LLM judge to validate operator quality](#using-an-llm-judge-to-validate-operator-quality)\n2. [How to use labels to validate operator quality](#using-labels-to-validate-operator-quality)\n3. [How to provide a separate training (i.e. validation) dataset for the optimization process](#providing-a-separate-training-dataset)\n4. [How to specify a constrained optimization objective](#specifying-constrained-optimization-objective)\n\n:::info[LLM Judges vs. Labels]\nUsing an LLM judge is a great way to quickly get started with optimization in PZ. However, LLM judges are not always accurate or faithful to the user's intent. For this reason, we recommend using labels to validate operator quality whenever possible (see [below](#using-labels-to-validate-operator-quality)).\n\nAdditionally, users can use both LLM judges and labels together to validate operator quality. In this case, labels are used to score operators wherever they are available, and the LLM judge scores the operators for which no labels are available. 
We also cover this setting in more detail [below](#dealing-with-partial-labels).\n:::\n\n:::tip[How Many Labels Do I Need?]\nIn our experience, as few as 5-10 labels can significantly improve the quality of the optimized plan.\n:::\n\n### Using an LLM Judge to Validate Operator Quality\nThe simplest way to get started with optimization in PZ is to use an LLM judge to evaluate the quality of each operator. This is done by providing a `pz.Validator` object with a specified model to the `.optimize_and_run()` method.\n\nFor example, consider the email filtering program from our [introduction](intro.mdx), where we compute the sender, subject, and a summary for each email that contains a firsthand discussion of specific business transaction(s):\n```python\nimport palimpzest as pz\n\n# load the emails into a dataset\nemails = pz.TextFileDataset(id=\"enron-emails\", path=\"emails/\")\n\n# filter for emails matching natural language criteria\nemails = emails.sem_filter(\n    'The email refers to one of the following business transactions: \"Raptor\", \"Deathstar\", \"Chewco\", and/or \"Fat Boy\"',\n)\nemails = emails.sem_filter(\n    \"The email contains a first-hand discussion of the business transaction\",\n)\n\n# extract structured fields for each email\nemails = emails.sem_map([\n    {\"name\": \"subject\", \"type\": str, \"desc\": \"the subject of the email\"},\n    {\"name\": \"sender\", \"type\": str, \"desc\": \"the email address of the sender\"},\n    {\"name\": \"summary\", \"type\": str, \"desc\": \"a brief summary of the email\"},\n])\n\n# optimize and execute the program and print the output\nvalidator = pz.Validator(model=pz.Model.GPT_5)\noutput = emails.optimize_and_run(max_quality=True, validator=validator)\n\nprint(output.to_df(cols=[\"filename\", \"sender\", \"subject\", \"summary\"]))\n```\nIn this example, instead of simply calling `emails.run(max_quality=True)`, we create a `pz.Validator` object which uses the `GPT-5` 
model to evaluate the quality of each operator during the optimization process. We then pass this validator to the `.optimize_and_run()` method, along with the `max_quality=True` argument to indicate that we want to optimize for quality.\n\nPZ will then sample `k` (default `k=6`) physical operators and `j` (default `j=4`) inputs for each semantic operator in the program (i.e. both semantic filters and the semantic map). For each operator-input pair, PZ will execute the operator on the input and use the LLM judge to evaluate the quality of the output on a [0, 1] scale. This process will continue until the sample budget (default `n=100`) is exhausted, at which point PZ will compute the optimal physical execution plan and execute it on the user's dataset.\n\nIf a `train_dataset` is provided ([see below](#providing-a-separate-training-dataset)), PZ will draw samples from the `train_dataset` instead of the user's dataset. If no `train_dataset` is provided, PZ will draw samples from the user's dataset.\n\n### Using Labels to Validate Operator Quality\nWhenever possible, we recommend using labels to validate operator quality during the optimization process. Users can provide labels to Palimpzest by subclassing the `pz.Validator` class and implementing the appropriate methods which score the quality of different semantic operators. The abstract methods for the `pz.Validator` class are shown below:\n```python\nclass Validator:\n    \"\"\"\n    The Validator is used during optimization to score the output of physical operator(s).\n    \"\"\"\n    def __init__(self, model: Model = Model.o4_MINI):\n        self.model = model\n        ...\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        \"\"\"\n        The map_score_fn takes in the fields being computed, the input record, and the output\n        generated by the semantic map operator. 
It should return a score in the range [0, 1]\n        indicating the quality of the output, or None if the score cannot be computed for this\n        map operation.\n        \"\"\"\n        raise NotImplementedError(\"Validator.map_score_fn not implemented.\")\n\n    def flat_map_score_fn(self, fields: list[str], input_record: dict, output: list[dict]) -> float | None:\n        \"\"\"\n        The flat_map_score_fn takes in the fields being computed, the input record, and the output\n        (a list of records) generated by the semantic flat map operator. It should return a score in\n        the range [0, 1] indicating the quality of the output, or None if the score cannot be computed\n        for this flat map operation.\n        \"\"\"\n        raise NotImplementedError(\"Validator.flat_map_score_fn not implemented.\")\n\n    def filter_score_fn(self, filter_str: str, input_record: dict, output: bool) -> float | None:\n        \"\"\"\n        The filter_score_fn takes in the predicate filter_str being evaluated, the input record,\n        and the output (True/False) decision generated by the semantic filter. 
It should return a score\n        in the range [0, 1] indicating the quality of the output, or None if the score cannot be computed\n        for this filter operation.\n        \"\"\"\n        raise NotImplementedError(\"Validator.filter_score_fn not implemented.\")\n\n    def join_score_fn(self, condition: str, left_input_record: dict, right_input_record: dict, output: bool) -> float | None:\n        \"\"\"\n        The join_score_fn takes in the join condition being evaluated, the left input record,\n        the right input record, and the output (True/False) join decision generated by the semantic join.\n        It should return a score in the range [0, 1] indicating the quality of the output, or None if the\n        score cannot be computed for this join operation.\n        \"\"\"\n        raise NotImplementedError(\"Validator.join_score_fn not implemented.\")\n\n    def topk_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        \"\"\"\n        The topk_score_fn takes in the fields being added by the top-k operation, the input record,\n        and the output generated by the semantic top-k operator. It should return a score in the range\n        [0, 1] indicating the quality of the output, or None if the score cannot be computed for this\n        top-k operation.\n        \"\"\"\n        raise NotImplementedError(\"Validator.topk_score_fn not implemented.\")\n```\nUsers can implement any subset of these methods to provide labels for the corresponding semantic operators. For example, if the user can only provide labels for (some) semantic map(s), they can implement the `map_score_fn` method and leave the other methods unimplemented. 
In this case, PZ will use the user's implementation of `map_score_fn` to evaluate the quality of the semantic map(s) during optimization, and will fall back to using an LLM judge for the other semantic operators.\n\nFurthermore, for plans which involve multiple semantic operators of the same type (e.g. two semantic filters), users can use the `fields`, `filter_str`, or `condition` arguments to differentiate between the different operators.\n\nWe share an example of our `pz.Validator` for the email processing workload below:\n```python\nimport json\n\nimport palimpzest as pz\n\n# labels_file is a JSON file mapping from each email filename to the sender and subject\n# of the email and whether or not it should pass each filter, e.g.:\n# {\n#   \"kaminski-v-all-documents-2355.txt\": {\n#     \"sender\": \"ron.baker@enron.com\",\n#     \"subject\": \"Raptor Position Reports for 12/28/00\",\n#     \"mentions_transaction\": true,\n#     \"firsthand_discussion\": true\n#   },\n#   \"kaminski-v-inbox-291.txt\": {\n#     \"sender\": \"baker@enron.com\",\n#     \"subject\": \"RE: Pricing of restriction on Enron stock\",\n#     \"mentions_transaction\": true,\n#     \"firsthand_discussion\": true\n#   },\n#   ...\n# }\nclass EnronValidator(pz.Validator):\n    def __init__(self, labels_file: str):\n        super().__init__()\n        with open(labels_file) as f:\n            self.filename_to_labels = json.load(f)\n\n    def filter_score_fn(self, filter_str: str, input_record: dict, output: bool) -> float | None:\n        filename = input_record[\"filename\"]\n        # use .get() so that emails without labels yield None instead of raising a KeyError\n        labels = self.filename_to_labels.get(filename)\n        if labels is None:\n            return None\n\n        if \"business transactions\" in filter_str:\n            return float(labels[\"mentions_transaction\"] == output)\n        elif \"first-hand discussion\" in filter_str:\n            return float(labels[\"firsthand_discussion\"] == output)\n        else:\n            return None\n\n    def map_score_fn(self, fields: list[str], input_record: dict, 
output: dict) -> float | None:\n        # NOTE: we score the map based on the sender and subject fields only, as summary is too subjective;\n        #       we could also use an LLM judge within this function to score the summary field if desired\n        filename = input_record[\"filename\"]\n        # use .get() so that emails without labels yield None instead of raising a KeyError\n        labels = self.filename_to_labels.get(filename)\n        if labels is None:\n            return None\n\n        return (float(labels[\"sender\"] == output[\"sender\"]) + float(labels[\"subject\"] == output[\"subject\"])) / 2.0\n```\nThe `EnronValidator` class reads in a JSON file containing the labels for each email, and implements the `filter_score_fn` and `map_score_fn` methods to score the quality of the semantic filters and semantic map, respectively. The `filter_score_fn` method checks whether the output of each filter matches the corresponding label, while the `map_score_fn` method checks whether the `sender` and `subject` fields match the corresponding labels. Both methods return `None` for emails that have no entry in the labels file. We do not score the `summary` field in the `map_score_fn` as it is subjective; however, we could easily use an LLM judge within this function to score the `summary` field if desired.\n\nImplementations for `flat_map_score_fn`, `join_score_fn`, and `topk_score_fn` are not necessary for this workload, so we leave them unimplemented.\n\n#### Dealing with Partial Labels\nIn many cases, users may only have labels for a subset of the operators in their program. PZ handles this case gracefully by using the user labels where applicable and falling back to an LLM judge for operators where no labels are available. 
For example, if the user only had labels for the semantic map in the email processing program, they could implement the `EnronValidator` as follows:\n```python\nimport json\n\nimport palimpzest as pz\n\nclass EnronValidator(pz.Validator):\n    def __init__(self, labels_file: str):\n        super().__init__()\n        with open(labels_file) as f:\n            self.filename_to_labels = json.load(f)\n\n    def map_score_fn(self, fields: list[str], input_record: dict, output: dict) -> float | None:\n        filename = input_record[\"filename\"]\n        # use .get() so that emails without labels yield None instead of raising a KeyError\n        labels = self.filename_to_labels.get(filename)\n        if labels is None:\n            return None\n\n        return (float(labels[\"sender\"] == output[\"sender\"]) + float(labels[\"subject\"] == output[\"subject\"])) / 2.0\n```\nThis would allow PZ to use the user-provided labels to score the semantic map, while using an LLM judge to score the two semantic filters.\n\n### Providing a Separate Training Dataset\nIn some cases, users may want to provide a separate training (i.e. validation) dataset for the optimization process. 
This can be done by providing a `pz.Dataset` object to the `train_dataset` argument of the `.optimize_and_run()` method:\n```python\n# create validator and train_dataset\nvalidator = EnronValidator(labels_file=\"enron-eval-medium-labels.json\")\ntrain_dataset = pz.TextFileDataset(id=\"enron-emails\", path=\"train-emails/\")\n\n# construct plan\nemails = pz.TextFileDataset(id=\"enron-emails\", path=\"test-emails/\")\nemails = emails.sem_filter(\n    'The email refers to one of the following business transactions: \"Raptor\", \"Deathstar\", \"Chewco\", and/or \"Fat Boy\"',\n)\nemails = emails.sem_filter(\n    \"The email contains a first-hand discussion of the business transaction\",\n)\nemails = emails.sem_map([\n    {\"name\": \"subject\", \"type\": str, \"desc\": \"the subject of the email\"},\n    {\"name\": \"sender\", \"type\": str, \"desc\": \"the email address of the sender\"},\n    {\"name\": \"summary\", \"type\": str, \"desc\": \"a brief summary of the email\"},\n])\n\n# optimize and execute plan with training dataset and validator\noutput = emails.optimize_and_run(train_dataset=train_dataset, validator=validator, max_quality=True)\n\n# print output dataframe\nprint(output.to_df())\n```\nIn this case, PZ will draw samples from the `train_dataset` instead of the user's dataset when optimizing the plan.\n\n:::info[Matching Train and Test Dataset IDs]\nWhen providing a `train_dataset`, ensure that the `id` of the `train_dataset` matches the `id` of the corresponding test dataset. This allows PZ to correctly match input records between the training and test datasets during optimization (especially when semantic joins between multiple datasets are involved).\n:::\n\n### Specifying Constrained Optimization Objective\nIn addition to optimizing for maximum quality, minimum cost, or minimum latency, users can also specify constraints for the optimization objective. 
In total, PZ supports the following optimization objectives:\n- `pz.MaxQuality()`\n- `pz.MinCost()`\n- `pz.MinTime()`\n- `pz.MaxQualityAtFixedCost(max_cost: float)`\n- `pz.MaxQualityAtFixedTime(max_time: float)`\n- `pz.MinCostAtFixedQuality(min_quality: float)`\n- `pz.MinTimeAtFixedQuality(min_quality: float)`\n\nThe `max_cost`, `max_time`, and `min_quality` parameters represent the cost (in dollars), time (in seconds), and quality score (between 0 and 1) constraints, respectively, for processing the entire (test) dataset with the final optimized plan. For example, to optimize for maximum quality subject to a cost constraint of $0.50, users can call the `.optimize_and_run()` method as follows:\n```python\nimport palimpzest as pz\n\n# load the emails into a dataset\nemails = pz.TextFileDataset(id=\"enron-emails\", path=\"emails/\")\n\n# filter for emails matching natural language criteria\nemails = emails.sem_filter(\n    'The email refers to one of the following business transactions: \"Raptor\", \"Deathstar\", \"Chewco\", and/or \"Fat Boy\"',\n)\nemails = emails.sem_filter(\n    \"The email contains a first-hand discussion of the business transaction\",\n)\n\n# extract structured fields for each email\nemails = emails.sem_map([\n    {\"name\": \"subject\", \"type\": str, \"desc\": \"the subject of the email\"},\n    {\"name\": \"sender\", \"type\": str, \"desc\": \"the email address of the sender\"},\n    {\"name\": \"summary\", \"type\": str, \"desc\": \"a brief summary of the email\"},\n])\n\n# optimize and execute the program (subject to a cost constraint) and print the output\npolicy = pz.MaxQualityAtFixedCost(max_cost=0.50)\nconfig = pz.QueryProcessorConfig(policy=policy)\nvalidator = pz.Validator(model=pz.Model.GPT_5)\noutput = emails.optimize_and_run(config=config, validator=validator)\n\nprint(output.to_df(cols=[\"filename\", \"sender\", \"subject\", \"summary\"]))\n```\n"
  },
  {
    "path": "website/docs/user-guide/overview.mdx",
    "content": "---\ntitle: Overview\n---\n{/* ## Goal\nThis page should contain a brief overview of what the user guide will teach the reader, with direct links to each section of the user guide.\n\nUser guides should probably cover:\n\n- How to read your own data (i.e. custom and standard `Dataset`)\n- How to use each operator in PZ (i.e. an overview of all operators)\n- How to use the optimizer (halfway between a light introduction and a deep-dive)\n- And more, but let's try to keep the number as small as possible (prevent cognitive overload and reader fatigue)*/}\n\nOur User Guide is designed to help new users get familiar with PZ at a deeper level than our [Quick Start Tutorial](getting-started/quickstart.mdx).\n\nIn particular, it covers:\n\n- [How to create your own dataset](user-guide/dataset.mdx)\n- [An overview of all operators in PZ](user-guide/operators/overview.mdx)\n- [A primer on optimization](user-guide/optimization.mdx)"
  },
  {
    "path": "website/docusaurus.config.ts",
    "content": "import {themes as prismThemes} from 'prism-react-renderer';\nimport type {Config} from '@docusaurus/types';\nimport type * as Preset from '@docusaurus/preset-classic';\n\n// This runs in Node.js - Don't use client-side code here (browser APIs, JSX...)\n\nconst config: Config = {\n  title: 'Palimpzest',\n  tagline: 'Optimized Execution for Semantic Operators',\n  favicon: 'img/pz-small-logo.ico',\n\n  // Future flags, see https://docusaurus.io/docs/api/docusaurus-config#future\n  future: {\n    v4: true, // Improve compatibility with the upcoming Docusaurus v4\n  },\n\n  // Set the production url of your site here\n  url: 'https://palimpzest.org',\n  // Set the /<baseUrl>/ pathname under which your site is served\n  // For GitHub pages deployment, it is often '/<projectName>/'\n  baseUrl: '/',\n\n  // GitHub pages deployment config.\n  // If you aren't using GitHub pages, you don't need these.\n  organizationName: 'mitdbg', // Usually your GitHub org/user name.\n  projectName: 'palimpzest', // Usually your repo name.\n  deploymentBranch: 'gh-pages',\n\n  // github pages remove trailing slash: https://github.com/slorber/trailing-slash-guide\n  trailingSlash: false,\n\n  onBrokenLinks: 'throw',\n  onBrokenMarkdownLinks: 'warn',\n\n  // Even if you don't use internationalization, you can use this field to set\n  // useful metadata like html lang. 
For example, if your site is Chinese, you\n  // may want to replace \"en\" with \"zh-Hans\".\n  i18n: {\n    defaultLocale: 'en',\n    locales: ['en'],\n  },\n\n  presets: [\n    [\n      'classic',\n      {\n        docs: {\n          sidebarPath: './sidebars.ts',\n          // Please change this to your repo.\n          // Remove this to remove the \"edit this page\" links.\n          editUrl:\n            'https://github.com/mitdbg/palimpzest/tree/main/website/',\n          admonitions: {\n            keywords: ['note', 'tip', 'info', 'warning', 'danger'],\n            extendDefaults: true,\n          }\n        },\n        blog: {\n          showReadingTime: true,\n          feedOptions: {\n            type: ['rss', 'atom'],\n            xslt: true,\n          },\n          // Please change this to your repo.\n          // Remove this to remove the \"edit this page\" links.\n          editUrl:\n            'https://github.com/mitdbg/palimpzest',\n          // Useful options to enforce blogging best practices\n          onInlineTags: 'warn',\n          onInlineAuthors: 'warn',\n          onUntruncatedBlogPosts: 'warn',\n        },\n        theme: {\n          customCss: './src/css/custom.css',\n        },\n        gtag: {\n          trackingID: 'G-ZW0TNQXT8Y',\n          anonymizeIP: true,\n        }\n      } satisfies Preset.Options,\n    ],\n  ],\n\n  themeConfig: {\n    // Replace with your project's social card\n    image: 'img/pz-social-card.png',\n    navbar: {\n      title: 'Palimpzest',\n      logo: {\n        alt: 'PZ Small Logo',\n        src: 'img/pz-small-logo.png',\n      },\n      items: [\n        {\n          type: 'docSidebar',\n          sidebarId: 'tutorialSidebar',\n          position: 'left',\n          label: 'Getting Started',\n        },\n        {\n          type: 'docSidebar',\n          sidebarId: 'documentationSidebar',\n          label: 'Documentation',\n          position: 'left',\n        },\n        {to: '/blog', label: 'Blog', 
position: 'left'},\n        {to: '/research', label: 'Research', position: 'left'},\n        {to: '/palimpchat', label: 'PalimpChat', position: 'left'},\n        {\n          href: 'https://github.com/mitdbg/palimpzest',\n          label: 'GitHub',\n          position: 'right',\n        },\n      ],\n    },\n    footer: {\n      style: 'dark',\n      links: [\n        {\n          title: 'Docs',\n          items: [\n            {\n              label: 'Getting Started',\n              to: '/docs/intro',\n            },\n            {\n              label: 'Documentation',\n              to: '/docs/api/overview',\n            },\n          ],\n        },\n        {\n          title: 'Community',\n          items: [\n            {\n              label: 'Discord',\n              href: 'https://discord.gg/dN85JJ6jaH',\n            },\n            {\n              label: 'X',\n              href: 'https://x.com/RussoMatthew',\n            },\n          ],\n        },\n        {\n          title: 'More',\n          items: [\n            {\n              label: 'Blog',\n              to: '/blog',\n            },\n            {\n              label: 'GitHub',\n              href: 'https://github.com/mitdbg/palimpzest',\n            },\n            {\n              label: 'Colab Demo',\n              href: 'https://colab.research.google.com/drive/1Fm8I4yL1az395MsFkQbEIZSmUZs0oGvZ?usp=sharing',\n            }\n          ],\n        },\n      ],\n      copyright: `Copyright © ${new Date().getFullYear()} MIT Data Systems Group. Built with Docusaurus.`,\n    },\n    prism: {\n      theme: prismThemes.github,\n      darkTheme: prismThemes.dracula,\n    },\n  } satisfies Preset.ThemeConfig,\n};\n\nexport default config;\n"
  },
  {
    "path": "website/package.json",
    "content": "{\n  \"name\": \"website\",\n  \"version\": \"0.0.0\",\n  \"private\": true,\n  \"scripts\": {\n    \"docusaurus\": \"docusaurus\",\n    \"start\": \"docusaurus start\",\n    \"build\": \"docusaurus build\",\n    \"swizzle\": \"docusaurus swizzle\",\n    \"deploy\": \"docusaurus deploy\",\n    \"clear\": \"docusaurus clear\",\n    \"serve\": \"docusaurus serve\",\n    \"write-translations\": \"docusaurus write-translations\",\n    \"write-heading-ids\": \"docusaurus write-heading-ids\",\n    \"typecheck\": \"tsc\"\n  },\n  \"dependencies\": {\n    \"@docusaurus/core\": \"3.8.1\",\n    \"@docusaurus/plugin-google-gtag\": \"^3.9.1\",\n    \"@docusaurus/preset-classic\": \"3.8.1\",\n    \"@emotion/react\": \"^11.14.0\",\n    \"@emotion/styled\": \"^11.14.1\",\n    \"@mdx-js/react\": \"^3.0.0\",\n    \"@mui/icons-material\": \"^7.3.4\",\n    \"@mui/material\": \"^7.3.4\",\n    \"clsx\": \"^2.0.0\",\n    \"prism-react-renderer\": \"^2.3.0\",\n    \"react\": \"^19.0.0\",\n    \"react-dom\": \"^19.0.0\",\n    \"strip-ansi\": \"^7.1.2\"\n  },\n  \"devDependencies\": {\n    \"@docusaurus/module-type-aliases\": \"3.8.1\",\n    \"@docusaurus/tsconfig\": \"3.8.1\",\n    \"@docusaurus/types\": \"3.8.1\",\n    \"typescript\": \"~5.6.2\"\n  },\n  \"browserslist\": {\n    \"production\": [\n      \">0.5%\",\n      \"not dead\",\n      \"not op_mini all\"\n    ],\n    \"development\": [\n      \"last 3 chrome version\",\n      \"last 3 firefox version\",\n      \"last 5 safari version\"\n    ]\n  },\n  \"engines\": {\n    \"node\": \">=18.0\"\n  }\n}\n"
  },
  {
    "path": "website/sidebars.ts",
    "content": "import type {SidebarsConfig} from '@docusaurus/plugin-content-docs';\n\n// This runs in Node.js - Don't use client-side code here (browser APIs, JSX...)\n\n/**\n * Creating a sidebar enables you to:\n - create an ordered group of docs\n - render a sidebar for each doc of that group\n - provide next/previous navigation\n\n The sidebars can be generated from the filesystem, or explicitly defined here.\n\n Create as many sidebars as you want.\n */\nconst sidebars: SidebarsConfig = {\n  // // By default, Docusaurus generates a sidebar from the docs folder structure\n  // tutorialSidebar: [{type: 'autogenerated', dirName: '.'}],\n\n  // But you can create a sidebar manually\n  tutorialSidebar: [\n    'intro',\n    {\n      type: 'category',\n      label: 'Getting Started',\n      items: [\n        'getting-started/installation',\n        'getting-started/quickstart',\n        'getting-started/next-steps',\n      ],\n      collapsed: false,\n    },\n    {\n      type: 'category',\n      label: 'User Guide',\n      items: [\n        'user-guide/overview',\n        'user-guide/dataset',\n        {\n          type: 'category',\n          label: 'Semantic Operators',\n          items: [\n            'user-guide/operators/overview',\n            'user-guide/operators/sem_map',\n            'user-guide/operators/sem_filter',\n            'user-guide/operators/sem_join',\n            'user-guide/operators/sem_agg',\n            'user-guide/operators/sem_topk',\n            'user-guide/operators/relational',\n          ],\n          collapsed: false,\n        },\n        'user-guide/optimization',\n      ],\n      collapsed: false,\n    }\n  ],\n  documentationSidebar: [\n    {\n      type: 'category',\n      label: 'Documentation',\n      items: [\n        'api/overview',\n        // {\n        //   type: 'category',\n        //   label: 'Data',\n        //   items: [\n        //     'api/data/dataset',\n        //     'api/data/datarecord',\n        //     
'api/data/datarecordcollection',\n        //     'api/data/context',\n        //     'api/data/iter-dataset',\n        //   ],\n        // },\n        // {\n        //   type: 'category',\n        //   label: 'Operators',\n        //   items: [\n        //     'api/operators/aggregate',\n        //     {\n        //       type: 'category',\n        //       label: 'Filters',\n        //       items: [\n        //         'api/operators/filter/filter',\n        //         'api/operators/filter/llm-filter',\n        //         'api/operators/filter/non-llm-filter',\n        //       ],\n        //     },\n            // {\n            //   type: 'category',\n            //   label: 'Joins',\n            //   items: [\n            //     'api/operators/join/join',\n            //     'api/operators/join/nested-loops-join',\n            //     'api/operators/join/embedding-join',\n            //   ],\n            // },\n            // 'api/operators/limit',\n            // 'api/operators/logical',\n            // {\n            //   type: 'category',\n            //   label: 'Maps',\n            //   items: [\n            //     'api/operators/convert/convert',\n            //     'api/operators/convert/critique-and-refine-convert',\n            //     'api/operators/convert/llm-convert',\n            //     'api/operators/convert/mixture-of-agents-convert',\n            //     'api/operators/convert/non-llm-convert',\n            //     'api/operators/convert/rag-convert',\n            //   ],\n            // },\n            // 'api/operators/physical',\n            // 'api/operators/topk',\n            // 'api/operators/scan',\n        //   ],\n        // },\n      ],\n    }\n  ],\n};\n\nexport default sidebars;\n"
  },
  {
    "path": "website/src/components/HomepageFeatures/index.tsx",
    "content": "import type {ReactNode} from 'react';\nimport clsx from 'clsx';\nimport Heading from '@theme/Heading';\nimport styles from './styles.module.css';\n\ntype FeatureItem = {\n  title: string;\n  Svg: React.ComponentType<React.ComponentProps<'svg'>>;\n  description: ReactNode;\n};\n\nconst FeatureList: FeatureItem[] = [\n  {\n    title: 'Easy to Use',\n    Svg: require('@site/static/img/pz-orange-transparent.svg').default,\n    description: (\n      <>\n        Install Palimpzest with pip and get started in minutes.\n        Follow our quickstart and join the Discord community to\n        get help with your use case.\n      </>\n    ),\n  },\n  {\n    title: 'Multi-Modal Joins, Maps, and Filters',\n    Svg: require('@site/static/img/multimodal.svg').default,\n    description: (\n      <>\n        Join any combination of text, images, audio, and tables.\n        Palimpzest also supports maps and filters over any combination of modalities.\n      </>\n    ),\n  },\n  {\n    title: 'Highly Optimizable',\n    Svg: require('@site/static/img/orange-abacus.svg').default,\n    description: (\n      <>\n        Palimpzest's optimizer can leverage labels and/or an LLM judge to\n        produce the best implementation of your data processing pipeline.\n      </>\n    ),\n  },\n];\n\nfunction Feature({title, Svg, description}: FeatureItem) {\n  return (\n    <div className={clsx('col col--4')}>\n      <div className=\"text--center\">\n        <Svg className={styles.featureSvg} role=\"img\" />\n      </div>\n      <div className=\"text--center padding-horiz--md\">\n        <Heading as=\"h3\">{title}</Heading>\n        <p>{description}</p>\n      </div>\n    </div>\n  );\n}\n\nexport default function HomepageFeatures(): ReactNode {\n  return (\n    <section className={styles.features}>\n      <div className=\"container\">\n        <div className=\"row\">\n          {FeatureList.map((props, idx) => (\n            <Feature key={idx} {...props} />\n          ))}\n        
</div>\n      </div>\n    </section>\n  );\n}\n"
  },
  {
    "path": "website/src/components/HomepageFeatures/styles.module.css",
    "content": ".features {\n  display: flex;\n  align-items: center;\n  padding: 2rem 0;\n  width: 100%;\n}\n\n.featureSvg {\n  height: 200px;\n  width: 200px;\n}\n"
  },
  {
    "path": "website/src/components/ResearchPage/admonitions.tsx",
    "content": "import Admonition from '@theme/Admonition';\n\ninterface AbstractProps {\n    icon?: React.ReactNode;\n    title?: string;\n    children: React.ReactNode;\n}\n\nexport default function Abstract({ children }: AbstractProps) {\n  return (\n    <div>\n      <Admonition type=\"tip\" icon=\"💡\" title=\"Abstract\">\n        <p>{children}</p>\n      </Admonition>\n    </div>\n  );\n}\n"
  },
  {
    "path": "website/src/css/custom.css",
    "content": "/**\n * Any CSS included here will be global. The classic template\n * bundles Infima by default. Infima is a CSS framework designed to\n * work well for content-centric websites.\n */\n\n/* You can override the default Infima variables here. */\n:root {\n  --ifm-color-primary: #1f52a3;\n  --ifm-color-primary-dark: #1c4a93;\n  --ifm-color-primary-darker: #1a468b;\n  --ifm-color-primary-darkest: #163972;\n  --ifm-color-primary-light: #225ab3;\n  --ifm-color-primary-lighter: #245ebb;\n  --ifm-color-primary-lightest: #286bd4;\n  --ifm-code-font-size: 95%;\n  --docusaurus-highlighted-code-line-bg: rgba(0, 0, 0, 0.1);\n}\n\n/* For readability concerns, you should choose a lighter palette in dark mode. */\n[data-theme='dark'] {\n  --ifm-color-primary: #e5872e;\n  --ifm-color-primary-dark: #dc791b;\n  --ifm-color-primary-darker: #d0721a;\n  --ifm-color-primary-darkest: #ab5e15;\n  --ifm-color-primary-light: #e89546;\n  --ifm-color-primary-lighter: #ea9c53;\n  --ifm-color-primary-lightest: #eeb177;\n  --docusaurus-highlighted-code-line-bg: rgba(0, 0, 0, 0.3);\n}\n\n\n.scrollable-images {\n  display: flex;\n  overflow-x: auto;\n  gap: 1rem;\n  padding: 1rem 0;\n}\n\n.scrollable-images img {\n  height: auto;\n  max-height: 300px; /* You can adjust this */\n  flex-shrink: 0;\n}\n"
  },
  {
    "path": "website/src/pages/index.module.css",
    "content": "/**\n * CSS files with the .module.css suffix will be treated as CSS modules\n * and scoped locally.\n */\n\n.heroBanner {\n  padding: 4rem 0;\n  text-align: center;\n  position: relative;\n  overflow: hidden;\n  background-image: url('/img/background.svg');\n  background-size: cover;\n  background-position: center;\n  background-repeat: no-repeat;\n}\n\n.heroBanner .hero__title {\n  color: #2d3748 !important;\n  text-shadow: 1px 1px 2px rgba(255, 255, 255, 0.8) !important;\n}\n\n.heroBanner .hero__subtitle {\n  color: #4a5568 !important;\n  text-shadow: 1px 1px 2px rgba(255, 255, 255, 0.8) !important;\n}\n\n.heroBanner h1 {\n  color: #2d3748 !important;\n  text-shadow: 1px 1px 2px rgba(255, 255, 255, 0.8) !important;\n}\n\n.heroBanner p {\n  color: #4a5568 !important;\n  text-shadow: 1px 1px 2px rgba(255, 255, 255, 0.8) !important;\n}\n\n.heroBanner .container {\n  position: relative;\n  z-index: 1;\n}\n\n@media screen and (max-width: 996px) {\n  .heroBanner {\n    padding: 2rem;\n  }\n}\n\n.buttons {\n  display: flex;\n  align-items: center;\n  justify-content: center;\n  position: relative;\n  z-index: 2;\n}\n"
  },
  {
    "path": "website/src/pages/index.tsx",
    "content": "import type {ReactNode} from 'react';\nimport clsx from 'clsx';\nimport Link from '@docusaurus/Link';\nimport useDocusaurusContext from '@docusaurus/useDocusaurusContext';\nimport Layout from '@theme/Layout';\nimport HomepageFeatures from '@site/src/components/HomepageFeatures';\nimport Heading from '@theme/Heading';\n\nimport styles from './index.module.css';\n\nfunction HomepageHeader() {\n  const {siteConfig} = useDocusaurusContext();\n  return (\n    <header className={clsx('hero hero--primary', styles.heroBanner)}>\n      <div className=\"container\">\n        <Heading as=\"h1\" className=\"hero__title\">\n          {siteConfig.title}\n        </Heading>\n        <p className=\"hero__subtitle\">{siteConfig.tagline}</p>\n        <div className={styles.buttons}>\n          <Link\n            className=\"button button--secondary button--lg\"\n            to=\"/docs/intro\">\n            Getting Started - 5min ⏱️\n          </Link>\n        </div>\n      </div>\n    </header>\n  );\n}\n\nexport default function Home(): ReactNode {\n  const {siteConfig} = useDocusaurusContext();\n  return (\n    <Layout\n      title={`Hello from ${siteConfig.title}`}\n      description=\"Description will go into a meta tag in <head />\">\n      <HomepageHeader />\n      <main>\n        <HomepageFeatures />\n      </main>\n    </Layout>\n  );\n}\n"
  },
  {
    "path": "website/src/pages/palimpchat.mdx",
    "content": "---\ntitle: PalimpChat\n---\n\nTo access our chat demo, please visit the [PalimpChat](http://3.213.4.62:8888/) demo webpage.\n\n![PalimpChatImage](/img/palimpchat.png)\n"
  },
  {
    "path": "website/src/pages/research.mdx",
    "content": "---\ntitle: Research Papers\n---\n\nimport Abstract from '@site/src/components/ResearchPage/admonitions';\n\nResearch Papers\n===============\n\nPalimpzest has been the source of a number of research papers. Here is a timeline of the papers along with their citations.\n\nSummer 2025\n-----------\n**Abacus: A Cost-Based Optimizer for Semantic Operator Systems** \\[[arXiv](https://arxiv.org/abs/2505.14661)\\]\n\n<Abstract>\n    LLMs enable an exciting new class of data processing applications over large collections of unstructured documents. Several new programming frameworks have enabled developers to build these applications by composing them out of semantic operators: a declarative set of AI-powered data transformations with natural language specifications. These include LLM-powered maps, filters, joins, etc. used for document processing tasks such as information extraction, summarization, and more. While systems of semantic operators have achieved strong performance on benchmarks, they can be difficult to optimize. An optimizer for this setting must determine how to physically implement each semantic operator in a way that optimizes the system globally. Existing optimizers are limited in the number of optimizations they can apply, and most (if not all) cannot optimize system quality, cost, or latency subject to constraint(s) on the other dimensions. In this paper we present Abacus, an extensible, cost-based optimizer which searches for the best implementation of a semantic operator system given a (possibly constrained) optimization objective. Abacus estimates operator performance by leveraging a minimal set of validation examples and, if available, prior beliefs about operator performance. We evaluate Abacus on document processing workloads in the biomedical and legal domains (BioDEX; CUAD) and multi-modal question answering (MMQA). 
We demonstrate that systems optimized by Abacus achieve 18.7%-39.2% better quality and up to 23.6x lower cost and 4.2x lower latency than the next best system.\n</Abstract>\n\n```\n@misc{russo2025abacuscostbasedoptimizersemantic,\n      title={Abacus: A Cost-Based Optimizer for Semantic Operator Systems}, \n      author={Matthew Russo and Sivaprasad Sudhir and Gerardo Vitagliano and Chunwei Liu and Tim Kraska and Samuel Madden and Michael Cafarella},\n      year={2025},\n      eprint={2505.14661},\n      archivePrefix={arXiv},\n      primaryClass={cs.DB},\n      url={https://arxiv.org/abs/2505.14661}, \n}\n```\n\nWinter 2025\n-----------\n**PalimpChat: A Chat Interface for Palimpzest** \\[[arXiv](https://arxiv.org/abs/2502.03368)\\]\n\n<Abstract>\n    Thanks to the advances in generative architectures and large language models, data scientists can now code pipelines of machine-learning operations to process large collections of unstructured data. Recent progress has seen the rise of declarative AI frameworks (e.g., Palimpzest, Lotus, and DocETL) to build optimized and increasingly complex pipelines, but these systems often remain accessible only to expert programmers. In this demonstration, we present PalimpChat, a chat-based interface to Palimpzest that bridges this gap by letting users create and run sophisticated AI pipelines through natural language alone. By integrating Archytas, a ReAct-based reasoning agent, and Palimpzest's suite of relational and LLM-based operators, PalimpChat provides a practical illustration of how a chat interface can make declarative AI frameworks truly accessible to non-experts.\n\n    Our demo system is publicly available online. At SIGMOD'25, participants can explore three real-world scenarios--scientific discovery, legal discovery, and real estate search--or apply PalimpChat to their own datasets. 
In this paper, we focus on how PalimpChat, supported by the Palimpzest optimizer, simplifies complex AI workflows such as extracting and analyzing biomedical data.\n</Abstract>\n\n```\n@misc{liu2025palimpchatdeclarativeinteractiveai,\n    title={PalimpChat: Declarative and Interactive AI analytics}, \n    author={Chunwei Liu and Gerardo Vitagliano and Brandon Rose and Matt Prinz and David Andrew Samson and Michael Cafarella},\n    year={2025},\n    eprint={2502.03368},\n    archivePrefix={arXiv},\n    primaryClass={cs.AI},\n    url={https://arxiv.org/abs/2502.03368}, \n}\n```\n\n**SciVar: Enabling Optimized Scientific Discovery in 16 Lines of Palimpzest Code** \\[[arXiv](https://arxiv.org/abs/2411.14569)\\]\n\n<Abstract>\n    The global output of academic publications exceeds 5 million articles per year, making it difficult for humans to keep up with even a tiny fraction of scientific output. We need methods to navigate and interpret the artifacts -- texts, graphs, charts, code, models, and datasets -- that make up the literature. This paper evaluates various methods for extracting mathematical model variables from epidemiological studies, such as \"infection rate (α)\", \"recovery rate (γ)\", and \"mortality rate (μ)\". Variable extraction appears to be a basic task, but plays a pivotal role in recovering models from scientific literature. Once extracted, we can use these variables for automatic mathematical modeling, simulation, and replication of published results.\n    \n    We introduce a benchmark dataset comprising manually-annotated variable descriptions and variable values extracted from scientific papers. Based on this dataset, we present several baseline methods for variable extraction based on Large Language Models (LLMs) and rule-based information extraction systems. Our analysis shows that LLM-based solutions perform the best. 
Despite the incremental benefits of combining rule-based extraction outputs with LLMs, the leap in performance attributed to the transfer-learning and instruction-tuning capabilities of LLMs themselves is far more significant. This investigation demonstrates the potential of LLMs to enhance automatic comprehension of scientific artifacts and for automatic model recovery and simulation.\n</Abstract>\n\n```\n@inproceedings{liu2024variableextractionmodelrecovery,\n    title={Variable Extraction for Model Recovery in Scientific Literature}, \n    author={Chunwei Liu and Enrique Noriega-Atala and Adarsh Pyarelal and Clayton T Morrison and Mike Cafarella},\n    year={2025},\n    booktitle={AI and Scientific Discovery (AISD) Workshop at NAACL},\n    url={https://arxiv.org/abs/2411.14569}, \n}\n```\n\nSpring 2024\n-----------\n**Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing** \\[[CIDR'25](https://www.vldb.org/cidrdb/papers/2025/p12-liu.pdf)\\] \\[[arXiv](https://arxiv.org/abs/2405.14696)\\]\n\n<Abstract>\n    A long-standing goal of data management systems has been to build systems which can compute quantitative insights over large collections of unstructured data in a cost-effective manner. Until recently, it was difficult and expensive to extract facts from company documents, data from scientific papers, or metrics from image and video corpora. Today’s models can accomplish these tasks with high accuracy. However, a programmer who wants to answer a substantive AI-powered query must orchestrate large numbers of models, prompts, and data operations. In this paper, we present PALIMPZEST, a system that enables programmers to pose AI-powered analytical queries over arbitrary collections of unstructured data in a simple declarative language. The system uses a cost optimization framework—which explores the search space of AI models, prompting techniques, and related foundation model optimizations. 
PALIMPZEST implements the query while navigating the trade-offs between runtime, financial cost, and output data quality. We introduce a novel language for AI-powered analytics tasks, the optimization methods that PALIMPZEST uses, and the prototype system itself. We evaluate PALIMPZEST on a real-world workload. Our system produces plans that are up to 3.3 x faster and 2.9 x cheaper than a baseline method when using a singlethread setup, while also achieving superior F1-scores. PALIMPZEST applies its optimizations automatically, requiring no additional work from the user.\n</Abstract>\n\n```\n@inproceedings{palimpzestCIDR,\n    title={Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing},\n    author={Liu, Chunwei and Russo, Matthew and Cafarella, Michael and Cao, Lei and Chen, Peter Baile and Chen, Zui and Franklin, Michael and Kraska, Tim and Madden, Samuel and Shahout, Rana and Vitagliano, Gerardo},\n    booktitle = {Proceedings of the {{Conference}} on {{Innovative Database Research}} ({{CIDR}})},\n    date = 2025,\n}\n```\n"
  },
  {
    "path": "website/static/.nojekyll",
    "content": ""
  },
  {
    "path": "website/tsconfig.json",
    "content": "{\n  // This file is not used in compilation. It is here just for a nice editor experience.\n  \"extends\": \"@docusaurus/tsconfig\",\n  \"compilerOptions\": {\n    \"baseUrl\": \".\"\n  },\n  \"exclude\": [\".docusaurus\", \"build\"]\n}\n"
  }
]