[
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2023 Yanay Rosen, Yusuf Roohani, Jure Leskovec\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# Universal Cell Embeddings\n\nThis repo includes a PyTorch [HuggingFace Accelerator](https://huggingface.co/docs/accelerate/package_reference/accelerator) implementation of the UCE model, to be used to embed individual anndata datasets.\n\n## Installation\n\n```\npip install -r requirements.txt\n```\n\n## Embedding a new dataset\n\nTo generate an embedding for a new single-cell RNA sequencing dataset in the AnnData format, use the `eval_single_anndata.py` script.\n\n```\npython eval_single_anndata.py --adata_path {path_to_anndata} --dir {output_dir} --species {species} --model_loc {model_loc} --batch_size {batch_size}\n```\n\nwhere\n- `adata_path`: a h5ad file. The `.X` slot of the file should be scRNA-seq counts. The `.var_names` slot should correspond to gene names, *not ENSEMBLIDs*.\n- `dir`: the working directory in which intermediate and final output files will be saved to skip repeated processing of the same dataset.\n- `species`: the species of the dataset you are embedding.\n- `model_loc`: the location of the model weights `.torch` file.\n- `batch_size`: the per GPU batch size. For the 33 layer model, on a 80GB GPU, you should use 25. For a 4 layer model on the same GPU, you can use 100.\n\nFor a sample output on the 10k pbmc dataset, run\n```\npython eval_single_anndata.py\n```\nAll necessary model files will be downloaded automatically.\n\n\n**Note**: This script makes use of additional files, which are described in the code documentation. These are downloaded automatically unless already present in the working directory. The script defaults to the pretrained 4-layer model. For running the pretrained 33-layer model from the paper, please download using this [link](https://figshare.com/articles/dataset/Universal_Cell_Embedding_Model_Files/24320806?file=43423236) and set `--nlayers 33`.\n\n## Output\n\nFinal evaluated AnnData: `dir/{dataset_name}.h5ad`. 
This AnnData will be \nidentical to the processed input AnnData, but with UCE embeddings added in the `.obsm[\"X_uce\"]` slot.\n\nPlease see the documentation for information on additional output files. All \noutputs from `eval_single_anndata.py` are stored in the `dir` directory.\n\n## Data\n\nYou can download the processed datasets used in the paper [here](https://drive.google.com/drive/folders/1f63fh0ykgEhCrkd_EVvIootBw7LYDVI7?usp=drive_link).\n\n**Note:** These datasets were embedded using the 33 layer model. Embeddings from the 33 layer model are not compatible with embeddings from the 4 layer model.\n\n## Citing\n\nIf you find our paper and code useful, please consider citing the [preprint](https://www.biorxiv.org/content/10.1101/2023.11.28.568918v1):\n\n```\n@article{rosen2023universal,\n  title={Universal Cell Embeddings: A Foundation Model for Cell Biology},\n  author={Rosen, Yanay and Roohani, Yusuf and Agrawal, Ayush and Samotorcan, Leon and Consortium, Tabula Sapiens and Quake, Stephen R and Leskovec, Jure},\n  journal={bioRxiv},\n  pages={2023--11},\n  year={2023},\n  publisher={Cold Spring Harbor Laboratory}\n}\n```\n\n## Analyses\n\nPlease see the [reproduce repo](https://github.com/yhr91/uce_reproduce/tree/master) for analyses, figures, and datasets from the paper.\n"
  },
  {
    "path": "data_proc/Create New Species Files.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"0e4018ee\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Embedding Novel Species\\n\",\n    \"\\n\",\n    \"This notebook will create the files you need to embed a novel species that wasn't included in the training data.\\n\",\n    \"\\n\",\n    \"To start, you will need to download the ESM2 protein embeddings and the reference proteome for the species.\\n\",\n    \"\\n\",\n    \"You can find precalculated ESM2 protein embeddings for many species [here](https://drive.google.com/drive/folders/1_Dz7HS5N3GoOAG6MdhsXWY1nwLoN13DJ?usp=drive_link)\\n\",\n    \"\\n\",\n    \"For reference proteomes, you can download them from [here](https://useast.ensembl.org/info/about/species.html).\\n\",\n    \"\\n\",\n    \"If there is no protein embedding for the species you are interested in, you can request to have it made via Github or email, or you can create it yourself following instructions [here](https://github.com/snap-stanford/SATURN/tree/main/protein_embeddings).\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"id\": \"ab368d92\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import numpy as np\\n\",\n    \"import pickle as pkl\\n\",\n    \"import pandas as pd\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"id\": \"c9a306f3\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"SPECIES_NAME = \\\"chicken\\\" # short hand name for this species, will be used in arguments and files\\n\",\n    \"\\n\",\n    \"# Path to the species proteome\\n\",\n    \"SPECIES_PROTEIN_FASTA_PATH = \\\"../../../SATURN/protein_embeddings/data/Gallus_gallus.bGalGal1.mat.broiler.GRCg7b.pep.all.fa\\\"\\n\",\n    \"\\n\",\n    \"# Path to the ESM2 Embeddings\\n\",\n    \"SPECIES_PROTEIN_EMBEDDINGS_PATH = 
\\\"../model_files/protein_embeddings/Gallus_gallus.bGalGal1.mat.broiler.GRCg7b.pep.all.gene_symbol_to_embedding_ESM2.pt\\\"\\n\",\n    \"\\n\",\n    \"# primary_assembly name, this needs to be matched to the FASTA file\\n\",\n    \"ASSEMBLY_NAME = \\\"bGalGal1.mat.broiler.GRCg7b\\\"\\n\",\n    \"# NCBI Taxonomy ID, please set this so that if someone else also embeds the same species,\\n\",\n    \"# randomly generated chromosome tokens will be the same\\n\",\n    \"TAXONOMY_ID = 9031\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"e5d37e52\",\n   \"metadata\": {},\n   \"source\": [\n    \"You can view the FASTA format here, please confirm the primary_assembly name is correct.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 3,\n   \"id\": \"2ecf1464\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \">ENSGALP00010000002.1 pep primary_assembly:bGalGal1.mat.broiler.GRCg7b:MT:2824:3798:1 gene:ENSGALG00010000007.1 transcript:ENSGALT00010000007.1 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:ND1 description:NADH dehydrogenase subunit 1 [Source:NCBI gene (formerly Entrezgene);Acc:63549479]\\r\\n\",\n      \"MTLPTLTNLLIMTLSYILPILIAVAFLTLVERKILSYMQARKGPNIVGPFGLLQPVADGV\\r\\n\",\n      \"KLFIKEPIRPSTSSPFLFIITPILALLLALTIWVPLPLPFPLADLNLGLLFLLAMSSLTV\\r\\n\",\n      \"YSLLWSGWASNSKYALIGALRAVAQTISYEVTLAIILLSTIMLSGNYTLSTLAITQEPIY\\r\\n\",\n      \"LIFSAWPLAMMWYISTLAETNRAPFDLTEGESELVSGFNVEYAAGPFAMFFLAEYANIML\\r\\n\",\n      \"MNTLTTVLFLNPSFLNLPPELFPIALATKTLLLSSSFLWIRASYPRFRYDQLMHLLWKNF\\r\\n\",\n      \"LPLTLALCLWHTSMPISYAGLPPI\\r\\n\",\n      \">ENSGALP00010000003.1 pep primary_assembly:bGalGal1.mat.broiler.GRCg7b:MT:4015:5053:1 gene:ENSGALG00010000011.1 transcript:ENSGALT00010000011.1 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:ND2 description:NADH dehydrogenase subunit 2 [Source:NCBI gene 
(formerly Entrezgene);Acc:63549482]\\r\\n\",\n      \"MNPHAKLICTVSLIMGTSITISSNHWILAWTGLEINTLAIIPLISKSHHPRAIEATIKYF\\r\\n\",\n      \"LTQSTASALILFSSMTNAWSTGQWDITQLNHPTSCLMLTMAIAIKLGLVPFHFWFPEVLQ\\r\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"!head {SPECIES_PROTEIN_FASTA_PATH}\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 4,\n   \"id\": \"90540d0b\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"species_to_paths = {\\n\",\n    \"    SPECIES_NAME: SPECIES_PROTEIN_FASTA_PATH,\\n\",\n    \"}\\n\",\n    \"\\n\",\n    \"species_to_ids = {\\n\",\n    \"    SPECIES_NAME: ASSEMBLY_NAME,\\n\",\n    \"}\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 5,\n   \"id\": \"623b99cf\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"all_pos_def = []\\n\",\n    \"\\n\",\n    \"missing_genes = {}\\n\",\n    \"for species in species_to_ids.keys():\\n\",\n    \"    missing_genes[species] = []\\n\",\n    \"    proteome_path = species_to_paths[species]\\n\",\n    \"    species_id = species_to_ids[species]\\n\",\n    \"\\n\",\n    \"    with open(proteome_path) as f:\\n\",\n    \"        proteome_lines = f.readlines()\\n\",\n    \"\\n\",\n    \"    gene_symbol_to_location = {}\\n\",\n    \"    gene_symbol_to_chrom = {}\\n\",\n    \"\\n\",\n    \"    for line in proteome_lines:\\n\",\n    \"        if line.startswith(\\\">\\\"):\\n\",\n    \"            split_line = line.split()\\n\",\n    \"            gene_symbol = [token for token in split_line if token.startswith(\\\"gene_symbol\\\")]\\n\",\n    \"            if len(gene_symbol) > 0:\\n\",\n    \"                gene_symbol = gene_symbol[0].split(\\\":\\\")\\n\",\n    \"                \\n\",\n    \"                if len(gene_symbol) == 2:\\n\",\n    \"                    gene_symbol = gene_symbol[1]\\n\",\n    \"                elif len(gene_symbol) > 2:\\n\",\n    \"                    gene_symbol = \\\":\\\".join(gene_symbol[1:]) # 
fix for annoying zebrafish gene names with colons in them\\n\",\n    \"                else:\\n\",\n    \"                    1/0 # something weird happening, throw an error\\n\",\n    \"                \\n\",\n    \"                \\n\",\n    \"                chrom = None\\n\",\n    \"                \\n\",\n    \"                chrom_arr = [token for token in split_line if token.startswith(\\\"chromosome:\\\")]\\n\",\n    \"                if len(chrom_arr) > 0:\\n\",\n    \"                    chrom = chrom_arr[0].replace(\\\"chromosome:\\\", \\\"\\\")\\n\",\n    \"                else:\\n\",\n    \"                    chrom_arr = [token for token in split_line if token.startswith(\\\"primary_assembly:\\\")]\\n\",\n    \"                    if len(chrom_arr) > 0:\\n\",\n    \"                        chrom = chrom_arr[0].replace(\\\"primary_assembly:\\\", \\\"\\\")\\n\",\n    \"                    else:\\n\",\n    \"                        chrom_arr = [token for token in split_line if token.startswith(\\\"scaffold:\\\")] \\n\",\n    \"                        if len(chrom_arr) > 0:\\n\",\n    \"                            chrom = chrom_arr[0].replace(\\\"scaffold:\\\", \\\"\\\")\\n\",\n    \"                if chrom is not None:\\n\",\n    \"                    gene_symbol_to_location[gene_symbol] = chrom.split(\\\":\\\")[2]\\n\",\n    \"                    gene_symbol_to_chrom[gene_symbol] = chrom.split(\\\":\\\")[1]\\n\",\n    \"                else:\\n\",\n    \"                    missing_genes[species].append(gene_symbol)\\n\",\n    \"                    \\n\",\n    \"\\n\",\n    \"    positional_df = pd.DataFrame()\\n\",\n    \"    positional_df[\\\"gene_symbol\\\"] = [gn.upper() for gn in list(gene_symbol_to_chrom.keys())]\\n\",\n    \"    positional_df[\\\"chromosome\\\"] = list(gene_symbol_to_chrom.values())\\n\",\n    \"    positional_df[\\\"start\\\"] = list(gene_symbol_to_location.values())\\n\",\n    \"    positional_df = 
positional_df.sort_values([\\\"chromosome\\\", \\\"start\\\"])\\n\",\n    \"    #positional_df = positional_df.set_index(\\\"gene_symbol\\\")\\n\",\n    \"    positional_df[\\\"species\\\"] = species\\n\",\n    \"    all_pos_def.append(positional_df)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 6,\n   \"id\": \"b72887b3\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>gene_symbol</th>\\n\",\n       \"      <th>chromosome</th>\\n\",\n       \"      <th>start</th>\\n\",\n       \"      <th>species</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>2327</th>\\n\",\n       \"      <td>GCC1</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>1006145</td>\\n\",\n       \"      <td>chicken</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>2502</th>\\n\",\n       \"      <td>NCAM2</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>100828671</td>\\n\",\n       \"      <td>chicken</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>3084</th>\\n\",\n       \"      <td>ENS-2</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>101147482</td>\\n\",\n       \"      
<td>chicken</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>2331</th>\\n\",\n       \"      <td>DENND6B</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>1012031</td>\\n\",\n       \"      <td>chicken</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>3973</th>\\n\",\n       \"      <td>MRPL39</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>102578362</td>\\n\",\n       \"      <td>chicken</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>...</th>\\n\",\n       \"      <td>...</td>\\n\",\n       \"      <td>...</td>\\n\",\n       \"      <td>...</td>\\n\",\n       \"      <td>...</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>4722</th>\\n\",\n       \"      <td>CA9</td>\\n\",\n       \"      <td>Z</td>\\n\",\n       \"      <td>9779343</td>\\n\",\n       \"      <td>chicken</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>4738</th>\\n\",\n       \"      <td>ARHGEF39</td>\\n\",\n       \"      <td>Z</td>\\n\",\n       \"      <td>9835547</td>\\n\",\n       \"      <td>chicken</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>3885</th>\\n\",\n       \"      <td>MRPL17</td>\\n\",\n       \"      <td>Z</td>\\n\",\n       \"      <td>9850679</td>\\n\",\n       \"      <td>chicken</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>4172</th>\\n\",\n       \"      <td>CCBE1</td>\\n\",\n       \"      <td>Z</td>\\n\",\n       \"      <td>9852827</td>\\n\",\n       \"      <td>chicken</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>3293</th>\\n\",\n       \"      <td>PMAIP1</td>\\n\",\n       \"      <td>Z</td>\\n\",\n       \"      <td>9998272</td>\\n\",\n       \"      <td>chicken</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"<p>13271 
rows × 4 columns</p>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"     gene_symbol chromosome      start  species\\n\",\n       \"2327        GCC1          1    1006145  chicken\\n\",\n       \"2502       NCAM2          1  100828671  chicken\\n\",\n       \"3084       ENS-2          1  101147482  chicken\\n\",\n       \"2331     DENND6B          1    1012031  chicken\\n\",\n       \"3973      MRPL39          1  102578362  chicken\\n\",\n       \"...          ...        ...        ...      ...\\n\",\n       \"4722         CA9          Z    9779343  chicken\\n\",\n       \"4738    ARHGEF39          Z    9835547  chicken\\n\",\n       \"3885      MRPL17          Z    9850679  chicken\\n\",\n       \"4172       CCBE1          Z    9852827  chicken\\n\",\n       \"3293      PMAIP1          Z    9998272  chicken\\n\",\n       \"\\n\",\n       \"[13271 rows x 4 columns]\"\n      ]\n     },\n     \"execution_count\": 6,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"master_pos_def = pd.concat(all_pos_def)\\n\",\n    \"master_pos_def\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 7,\n   \"id\": \"6d9dac28\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"chicken    13271\\n\",\n       \"Name: species, dtype: int64\"\n      ]\n     },\n     \"execution_count\": 7,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"master_pos_def[\\\"species\\\"].value_counts() # double check how many genes are mapped\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 8,\n   \"id\": \"4a3d45c2\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"chicken: 0\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"for k, v in missing_genes.items():\\n\",\n    \"    
print(f\\\"{k}: {len(v)}\\\") # are any genes missing?\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 9,\n   \"id\": \"c59774b1\",\n   \"metadata\": {\n    \"scrolled\": true\n   },\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"*********\\n\",\n      \"chicken\\n\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"1                    1785\\n\",\n       \"2                    1169\\n\",\n       \"3                    1067\\n\",\n       \"4                     953\\n\",\n       \"5                     817\\n\",\n       \"Z                     629\\n\",\n       \"6                     458\\n\",\n       \"8                     450\\n\",\n       \"7                     442\\n\",\n       \"9                     382\\n\",\n       \"10                    366\\n\",\n       \"14                    359\\n\",\n       \"11                    327\\n\",\n       \"15                    326\\n\",\n       \"13                    306\\n\",\n       \"20                    298\\n\",\n       \"12                    293\\n\",\n       \"19                    278\\n\",\n       \"18                    274\\n\",\n       \"17                    260\\n\",\n       \"26                    237\\n\",\n       \"28                    237\\n\",\n       \"27                    235\\n\",\n       \"21                    226\\n\",\n       \"23                    214\\n\",\n       \"25                    176\\n\",\n       \"34                    155\\n\",\n       \"24                    149\\n\",\n       \"22                    142\\n\",\n       \"16                     54\\n\",\n       \"30                     52\\n\",\n       \"38                     49\\n\",\n       \"31                     14\\n\",\n       \"MT                     13\\n\",\n       \"39                     10\\n\",\n       \"JAENSK010000484.1       7\\n\",\n       \"35                      6\\n\",\n       
\"JAENSK010000592.1       6\\n\",\n       \"W                       5\\n\",\n       \"MU179278.1              5\\n\",\n       \"MU179279.1              4\\n\",\n       \"36                      3\\n\",\n       \"JAENSK010000483.1       3\\n\",\n       \"JAENSK010000585.1       3\\n\",\n       \"JAENSK010000593.1       2\\n\",\n       \"MU179258.1              2\\n\",\n       \"MU179272.1              2\\n\",\n       \"MU179273.1              2\\n\",\n       \"JAENSK010000584.1       2\\n\",\n       \"JAENSK010000656.1       1\\n\",\n       \"Name: chromosome, dtype: int64\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"*********\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"# Count genes per chromosome\\n\",\n    \"for species in species_to_ids.keys():\\n\",\n    \"    print(\\\"*********\\\")\\n\",\n    \"    print(species)\\n\",\n    \"    display(master_pos_def[master_pos_def[\\\"species\\\"] == species][\\\"chromosome\\\"].value_counts().head(50))\\n\",\n    \"    print(\\\"*********\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 10,\n   \"id\": \"541baded\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"master_pos_def.to_csv(f\\\"{SPECIES_NAME}_to_chrom_pos.csv\\\", index=False) # Save the DF\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 11,\n   \"id\": \"eabd0e31\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"chicken_to_chrom_pos.csv\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"# The chromosome file path will be:\\n\",\n    \"print(f\\\"{SPECIES_NAME}_to_chrom_pos.csv\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 12,\n   \"id\": \"fe1345b1\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": 
{\n      \"text/plain\": [\n       \"66\"\n      ]\n     },\n     \"execution_count\": 12,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"N_UNIQ_CHROM = len(master_pos_def[master_pos_def[\\\"species\\\"] == species][\\\"chromosome\\\"].unique())\\n\",\n    \"N_UNIQ_CHROM\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"e37e277f\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Generate token file\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 13,\n   \"id\": \"d6904975\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import torch\\n\",\n    \"import pickle\\n\",\n    \"token_dim = 5120\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"a2798848\",\n   \"metadata\": {},\n   \"source\": [\n    \"This will create the token file. Please note the offset value.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 14,\n   \"id\": \"4355dabd\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"CHROM_TOKEN_OFFSET: 13275\\n\",\n      \"Saved PE, offsets file\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"species_to_offsets = {}\\n\",\n    \"\\n\",\n    \"all_pe = torch.load(\\\"../model_files/all_tokens.torch\\\")[0:4] # read in existing token file to make sure \\n\",\n    \"# that special vocab tokens are the same for different seeds\\n\",\n    \"\\n\",\n    \"offset = len(all_pe) # special tokens at the top!\\n\",\n    \"\\n\",\n    \"PE = torch.load(SPECIES_PROTEIN_EMBEDDINGS_PATH)\\n\",\n    \"\\n\",\n    \"pe_stacked = torch.stack(list(PE.values()))\\n\",\n    \"all_pe = torch.vstack((all_pe, pe_stacked))\\n\",\n    \"species_to_offsets[species] = offset\\n\",\n    \"\\n\",\n    \"print(\\\"CHROM_TOKEN_OFFSET:\\\", all_pe.shape[0])\\n\",\n    \"torch.manual_seed(TAXONOMY_ID)\\n\",\n    \"CHROM_TENSORS = 
torch.normal(mean=0, std=1, size=(N_UNIQ_CHROM, token_dim))\\n\",\n    \"# N_UNIQ_CHROM is the number of unique chromosomes/scaffolds found for this species (computed above);\\n\",\n    \"# seeding with TAXONOMY_ID keeps the random chromosome tokens reproducible\\n\",\n    \"all_pe = torch.vstack(\\n\",\n    \"    (all_pe, CHROM_TENSORS))  # Add the chrom tensors to the end\\n\",\n    \"all_pe.requires_grad = False\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"torch.save(all_pe, f\\\"{SPECIES_NAME}_pe_tokens.torch\\\")\\n\",\n    \"\\n\",\n    \"with open(f\\\"{SPECIES_NAME}_offsets.pkl\\\", \\\"wb+\\\") as f:\\n\",\n    \"    pickle.dump(species_to_offsets, f)\\n\",\n    \"print(\\\"Saved PE, offsets file\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 15,\n   \"id\": \"c26fe491\",\n   \"metadata\": {\n    \"scrolled\": true\n   },\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"torch.Size([13341, 5120])\"\n      ]\n     },\n     \"execution_count\": 15,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"all_pe.shape\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 16,\n   \"id\": \"21f937ea\",\n   \"metadata\": {\n    \"scrolled\": true\n   },\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"torch.Size([13341, 5120])\"\n      ]\n     },\n     \"execution_count\": 16,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"all_pe.shape\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 17,\n   \"id\": \"5faadace\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"chicken_offsets.pkl\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"print(f\\\"{SPECIES_NAME}_offsets.pkl\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 18,\n   \"id\": \"6ceac20b\",\n   \"metadata\": {},\n   \"outputs\": [\n    
{\n     \"data\": {\n      \"text/plain\": [\n       \"'../model_files/protein_embeddings/Gallus_gallus.bGalGal1.mat.broiler.GRCg7b.pep.all.gene_symbol_to_embedding_ESM2.pt'\"\n      ]\n     },\n     \"execution_count\": 18,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"SPECIES_PROTEIN_EMBEDDINGS_PATH\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"e4697330\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Example evaluation of new species\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"2b72667d\",\n   \"metadata\": {},\n   \"source\": [\n    \"**Note: when you evaluate a new species, you need to change some arguments and modify some files:**\\n\",\n    \"\\n\",\n    \"You will need to modify the CSV at `model_files/new_species_protein_embeddings.csv` to include the new protein embeddings file you downloaded.\\n\",\n    \"\\n\",\n    \"In the file, add a row for the new species with the format:\\n\",\n    \"`species name,full path to protein embedding file`\\n\",\n    \"\\n\",\n    \"Please also add this line to the dictionary created on line 247 in the file `data_proc/data_utils.py`.\\n\",\n    \"\\n\",\n    \"When you want to embed this new species, you will need to specify these newly created files as arguments.\\n\",\n    \"- `CHROM_TOKEN_OFFSET`: This tells UCE where the rows corresponding to chromosome tokens start.\\n\",\n    \"- `spec_chrom_csv_path`: This is a new CSV, created by this script, which maps genes to chromosomes and genomic positions.\\n\",\n    \"- `token_file`: This is a new token file that will work just for this species. 
The embeddings generated will still be universal though!\\n\",\n    \"- `offset_pkl_path`: This is a pickle file, created by this script, which maps the species name to the row offset at which its protein embeddings start in the token file.\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"```\\n\",\n    \"\\n\",\n    \"accelerate launch eval_single_anndata.py chicken_heart.h5ad --species=chicken --CHROM_TOKEN_OFFSET=13275 --spec_chrom_csv_path=data_proc/chicken_to_chrom_pos.csv --token_file=data_proc/chicken_pe_tokens.torch --offset_pkl_path=data_proc/chicken_offsets.pkl --dir=... --multi_gpu=True\\n\",\n    \"\\n\",\n    \"```\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Python 3 (ipykernel)\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.8.6\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 5\n}\n"
  },
  {
    "path": "data_proc/data_utils.py",
    "content": "import warnings\nwarnings.filterwarnings(\"ignore\")\n\nimport scanpy as sc\nimport torch\n\nfrom torch import nn, Tensor\nimport torch.nn.functional as F\nimport torch.utils.data as data\nimport torch.optim as optim\nimport numpy as np\nimport pickle\nimport os\nimport argparse\nimport logging\nimport time\n\nfrom tqdm.auto import tqdm\nimport pandas as pd\n\nimport math\nimport anndata\nfrom pathlib import Path\n\n\nfrom torch.utils.data import dataset\nfrom torch.utils.data import DataLoader, TensorDataset, dataset\nfrom scipy.stats import binom\nfrom typing import Dict, List, Optional, Tuple\nfrom scanpy import AnnData\n\n\nfrom data_proc.gene_embeddings import load_gene_embeddings_adata\n\ndef data_to_torch_X(X):\n    if isinstance(X, sc.AnnData):\n        X = X.X\n    if not isinstance(X, np.ndarray):\n            X = X.toarray()\n    return torch.from_numpy(X).float()\n\nclass SincleCellDataset(data.Dataset):\n    def __init__(self,\n                expression: torch.tensor, # Subset to hv genes, count data! 
cells x genes\n                protein_embeddings: torch.tensor, # same order as expression, also subset genes x pe\n                labels: None, # optional, tensor of labels\n                covar_vals: None, # tensor of covar values or none\n                ) -> None:\n        super(SincleCellDataset, self).__init__()\n        \n        # Set expression\n        self.expression = expression\n        \n        row_sums = self.expression.sum(1) # UMI counts per cell\n        # Scale each cell to 1000 total counts, then log1p-transform\n        log_norm_count_adj = torch.log1p(self.expression / row_sums.unsqueeze(1) * 1000)\n        \n        # Set log norm and count adjusted expression, scaled by each gene's maximum\n        max_vals, max_idx = torch.max(log_norm_count_adj, dim=0)\n        self.expression_mod = log_norm_count_adj / max_vals\n        \n        # Calculate dropout likelihoods of each gene\n        self.dropout_vec = (self.expression == 0).float().mean(0) # per-gene fraction of cells with zero counts\n        \n        # Set data info\n        self.num_cells = self.expression.shape[0]\n        self.num_genes = self.expression.shape[1]\n        \n        # Set optional label info, including categorical covariate index\n        self.covar_vals = covar_vals\n        self.labels = labels\n        \n        # Set protein embeddings\n        self.protein_embeddings = protein_embeddings\n        \n        self.item_mode = \"expression\"\n        if self.covar_vals is not None:\n            self.item_mode = \"expression+covar\"\n        \n        \n    def __getitem__(self, idx):\n        if self.item_mode == \"expression\":\n            if isinstance(idx, int):\n                if idx < self.num_cells:\n                    return self.expression[idx, :]\n                else:\n                    raise IndexError\n            else:\n                raise NotImplementedError\n        elif self.item_mode == \"expression+covar\":\n            if isinstance(idx, int):\n                if idx < self.num_cells:\n                    return 
self.expression[idx, :], self.covar_vals[idx]\n                else:\n                    raise IndexError\n            else:\n                raise NotImplementedError\n            \n\n    def __len__(self) -> int:\n        return self.num_cells\n\n    def get_dim(self) -> int:\n        return self.num_genes\n\n\ndef anndata_to_sc_dataset(adata:sc.AnnData, \n                                 species:str=\"human\", \n                                 labels:list=[],\n                                 covar_col:str=None,\n                                 hv_genes=None,\n                                 embedding_model=\"ESM2\",\n                                ) -> Tuple[SincleCellDataset, AnnData]:\n    \n    # Subset to just genes we have embeddings for\n    adata, protein_embeddings = load_gene_embeddings_adata(\n        adata=adata,\n        species=[species],\n        embedding_model=embedding_model\n    )\n    \n    if hv_genes is not None:\n        sc.pp.highly_variable_genes(adata, flavor='seurat_v3', n_top_genes=hv_genes)  # Expects Count Data\n    \n        hv_index = adata.var[\"highly_variable\"]\n        adata = adata[:, hv_index] # Subset to hv genes only\n    \n        protein_embeddings = protein_embeddings[species][hv_index]\n    else:\n        protein_embeddings = protein_embeddings[species]\n    expression = data_to_torch_X(adata.X)\n    \n    covar_vals = None\n    if len(labels) > 0:\n        assert covar_col is None or covar_col in labels, \"Covar needs to be in labels\" # make sure you keep track of covar column!\n        labels = adata.obs.loc[:, labels].values\n        \n        if covar_col is not None:\n            # we have a categorical label to use as covariate\n            covar_vals = torch.tensor(pd.Categorical(adata.obs[covar_col]).codes)\n    
return SincleCellDataset(\n        expression=expression,\n        protein_embeddings=protein_embeddings,\n        labels=labels,\n        covar_vals=covar_vals\n    ), adata    \n    \ndef adata_path_to_prot_chrom_starts(adata, dataset_species, spec_pe_genes, gene_to_chrom_pos, offset):\n    \"\"\"\n    Given an AnnData object, return the protein embedding row indices\n    (shifted by `offset`), the species-chromosome codes, and the gene\n    start positions for the genes in `adata.var_names`.\n    \"\"\"\n    pe_row_idxs = torch.tensor([spec_pe_genes.index(k.upper()) + offset for k in adata.var_names]).long()\n    print(len(np.unique(pe_row_idxs)))\n    \n    spec_chrom = gene_to_chrom_pos[gene_to_chrom_pos[\"species\"] == dataset_species].set_index(\"gene_symbol\")\n\n    gene_chrom = spec_chrom.loc[[k.upper() for k in adata.var_names]]\n\n    dataset_chroms = gene_chrom[\"spec_chrom\"].cat.codes # now this is correctly indexed by species and chromosome\n    print(\"Max Code:\", max(dataset_chroms))\n    dataset_pos = gene_chrom[\"start\"].values\n    return pe_row_idxs, dataset_chroms, dataset_pos\n\n\n\ndef process_raw_anndata(row, h5_folder_path, npz_folder_path, scp, skip,\n                        additional_filter, root):\n        path = row.path\n        if not os.path.isfile(root + \"/\" + path):\n            print( \"**********************************\")\n            print(f\"***********{root + '/' + path} File Missing****\")\n            print( \"**********************************\")\n            print(path, root)\n            return None, None, None # same 3-tuple shape as the other return paths\n\n        name = path.replace(\".h5ad\", \"\")\n        proc_path = path.replace(\".h5ad\", \"_proc.h5ad\")\n        if skip:\n            if os.path.isfile(h5_folder_path + proc_path):\n                print(f\"{name} already processed. 
Skipping\")\n                return None, None, None\n\n        print(f\"Processing {name}\")\n\n        species = row.species\n        covar_col = row.covar_col\n\n        ad = sc.read(root + \"/\" + path)\n        labels = []\n        if \"cell_type\" in ad.obs.columns:\n            labels.append(\"cell_type\")\n\n\n        if pd.isna(covar_col): # np.isnan raises TypeError on string column names\n            covar_col = None\n        else:\n            labels.append(covar_col)\n\n        if additional_filter:\n            sc.pp.filter_genes(ad, min_cells=10)\n            sc.pp.filter_cells(ad, min_genes=25)\n\n\n        dataset, adata = anndata_to_sc_dataset(ad, species=species, labels=labels, covar_col=covar_col, hv_genes=None)\n        adata = adata.copy()\n\n        if additional_filter:\n            sc.pp.filter_genes(ad, min_cells=10)\n            sc.pp.filter_cells(ad, min_genes=25)\n        \n        num_cells = adata.X.shape[0]\n        num_genes = adata.X.shape[1]\n\n        adata_path = h5_folder_path + proc_path\n        adata.write(adata_path)\n\n        arr = data_to_torch_X(adata.X).numpy()\n\n        print(arr.max()) # this is a nice check to make sure it's counts\n        filename = npz_folder_path + f\"{name}_counts.npz\"\n        shape = arr.shape\n        print(name, shape)\n        fp = np.memmap(filename, dtype='int64', mode='w+', shape=shape)\n        fp[:] = arr[:]\n        fp.flush()\n        \n        if scp != \"\":\n            import subprocess # local import: subprocess is not imported at module level\n            subprocess.call([\"scp\", filename, f\"{scp}:{filename}\"])\n            subprocess.call([\"scp\", adata_path, f\"{scp}:{adata_path}\"])\n            \n        return adata, num_cells, num_genes\n    \n    \ndef get_species_to_pe(EMBEDDING_DIR):\n    \"\"\"\n    Given an embedding directory, return all embeddings as a dictionary keyed by species.\n    Note: In the current form, this function is written such that the directory needs all of the following species embeddings.\n    \"\"\"\n    EMBEDDING_DIR = Path(EMBEDDING_DIR)\n\n    
embeddings_paths = {\n            'human': EMBEDDING_DIR / 'Homo_sapiens.GRCh38.gene_symbol_to_embedding_ESM2.pt',\n            'mouse': EMBEDDING_DIR / 'Mus_musculus.GRCm39.gene_symbol_to_embedding_ESM2.pt',\n            'frog': EMBEDDING_DIR / 'Xenopus_tropicalis.Xenopus_tropicalis_v9.1.gene_symbol_to_embedding_ESM2.pt',\n            'zebrafish': EMBEDDING_DIR / 'Danio_rerio.GRCz11.gene_symbol_to_embedding_ESM2.pt',\n            \"mouse_lemur\": EMBEDDING_DIR / \"Microcebus_murinus.Mmur_3.0.gene_symbol_to_embedding_ESM2.pt\",\n            \"pig\": EMBEDDING_DIR / 'Sus_scrofa.Sscrofa11.1.gene_symbol_to_embedding_ESM2.pt',\n            \"macaca_fascicularis\": EMBEDDING_DIR / 'Macaca_fascicularis.Macaca_fascicularis_6.0.gene_symbol_to_embedding_ESM2.pt',\n            \"macaca_mulatta\": EMBEDDING_DIR / 'Macaca_mulatta.Mmul_10.gene_symbol_to_embedding_ESM2.pt',\n        }\n    extra_species = pd.read_csv(\"./model_files/new_species_protein_embeddings.csv\").set_index(\"species\").to_dict()[\"path\"]\n    embeddings_paths.update(extra_species) # adds new species\n    \n    \n    \n    species_to_pe = {\n        species:torch.load(pe_dir) for species, pe_dir in embeddings_paths.items()   \n    }\n\n    species_to_pe = {species:{k.upper(): v for k,v in pe.items()} for species, pe in species_to_pe.items()}\n    return species_to_pe\n\n\ndef get_spec_chrom_csv(path=\"/dfs/project/cross-species/yanay/code/all_to_chrom_pos.csv\"):\n    \"\"\"\n    Get the species to chrom csv file\n    \"\"\"\n    gene_to_chrom_pos = pd.read_csv(path)\n    gene_to_chrom_pos[\"spec_chrom\"] = pd.Categorical(gene_to_chrom_pos[\"species\"] + \"_\" +  gene_to_chrom_pos[\"chromosome\"]) # add the spec_chrom list\n    return gene_to_chrom_pos"
  },
  {
    "path": "data_proc/download_proc_czi_cxg.py",
    "content": "import os\nos.environ[\"OMP_NUM_THREADS\"] = \"20\" # export OMP_NUM_THREADS=4\nos.environ[\"OPENBLAS_NUM_THREADS\"] = \"20\" # export OPENBLAS_NUM_THREADS=4 \nos.environ[\"MKL_NUM_THREADS\"] = \"20\" # export MKL_NUM_THREADS=6\nos.environ[\"VECLIB_MAXIMUM_THREADS\"] = \"20\" # export VECLIB_MAXIMUM_THREADS=4\nos.environ[\"NUMEXPR_NUM_THREADS\"] = \"20\"\n\n\nimport warnings\nwarnings.filterwarnings('ignore')\n\nimport cellxgene_census\nfrom tqdm import tqdm\nimport scanpy as sc\n\nfrom collections import defaultdict\nfrom typing import Dict, List, Optional, Tuple\n\nimport torch\nimport torch.utils.data as data\nimport torch\nimport numpy as np\nimport scanpy as sc\nfrom numpy import array\nimport os\nimport pickle as pkl\nimport glob\n\ndef data_to_torch_X(X):\n    if isinstance(X, sc.AnnData):\n        X = X.X\n    if not isinstance(X, np.ndarray):\n            X = X.toarray()\n    return torch.from_numpy(X).float()\n    \nimport sys\nsys.path.append('../')\n\nfrom gene_embeddings import load_gene_embeddings_adata\nimport pandas as pd\nimport numpy as np\nfrom scanpy import AnnData\nfrom multiprocessing import Pool, Process, Manager\n\nimport multiprocessing.pool as mpp\n# https://stackoverflow.com/questions/57354700/starmap-combined-with-tqdm\ndef istarmap(self, func, iterable, chunksize=1):\n    \"\"\"starmap-version of imap\n    \"\"\"\n    if self._state != mpp.RUN:\n        raise ValueError(\"Pool not running\")\n\n    if chunksize < 1:\n        raise ValueError(\n            \"Chunksize must be 1+, not {0:n}\".format(\n                chunksize))\n\n    task_batches = mpp.Pool._get_tasks(func, iterable, chunksize)\n    result = mpp.IMapIterator(self._cache)\n    self._taskqueue.put(\n        (\n            self._guarded_task_generation(result._job,\n                                          mpp.starmapstar,\n                                          task_batches),\n            result._set_length\n        ))\n    return (item for chunk in 
result for item in chunk)\n\n\nmpp.Pool.istarmap = istarmap\n\n\nVERSION = \"2023-04-25\"\nN_TOP_GENES = 12000\n\n\nprint(cellxgene_census.get_census_version_description(VERSION))\n\ncensus = cellxgene_census.open_soma(census_version=VERSION)\ncensus_datasets = census[\"census_info\"][\"datasets\"].read().concat().to_pandas()\n\n# for convenience, indexing on the soma_joinid which links this to other census data.\ncensus_datasets = census_datasets.set_index(\"soma_joinid\")\n\nspecies_to_readable = {\n    \"Homo sapiens\":\"human\",\n    \"Mus musculus\":\"mouse\"    \n}\n\ndef process_row(row, num_genes, num_cells, paths, all_species, covar_cols, dataset_title, h5_root=\"/dfs/project/uce/cxg_data/anndatas/\", npz_root=\"/dfs/project/uce/cxg_data/npzs/\"):\n    dataset_id = row[1].dataset_id\n    #dataset_title = row[1].dataset_title.lower().replace(' ', '_').replace(\",\", \"\").replace(\"/\", \"\")\n    \n    save_path = h5_root + f\"{dataset_title}.h5ad\"\n    no_primary_path = save_path.replace(\".h5ad\", \"_no_primary.h5ad\")\n    proc_path = save_path.replace(\".h5ad\", \"_proc.h5ad\")\n    npz_path = npz_root + f\"{dataset_title}_counts.npz\"\n    # Download the anndata\n    \n    if os.path.exists(no_primary_path):\n        print(\"No Primary, skipping\")\n        return\n    \n    if not os.path.exists(save_path) and not os.path.exists(no_primary_path):\n        cellxgene_census.download_source_h5ad(\n            dataset_id, to_path=save_path\n        )\n    if os.path.exists(proc_path) and os.path.exists(npz_path):\n        print(\"Already Proc\")\n        try:\n            ad = sc.read(proc_path)\n        except:\n            print()\n            print()\n            print(\"Error reading on:\", dataset_title)\n            print()\n            print()\n            return\n        # Get organism\n        if \"organism\" in ad.obs.columns:\n            unique_organisms = list(ad.obs.organism.unique().categories)\n            unique_organism_str = \", 
\".join(unique_organisms)\n        else:\n            unique_organism_str = \"human\"\n        species = species_to_readable.get(unique_organism_str, \"human\")\n        # don't need to do hv if already proc\n        if \"sample\" in ad.obs.columns:\n            covar_cols[dataset_title] = \"sample\"\n        elif \"batch\" in ad.obs.columns:\n            covar_cols[dataset_title] = \"batch\"\n        else:\n            covar_cols[dataset_title] = \"\"\n\n\n        num_genes[dataset_title] = ad.X.shape[1]\n        num_cells[dataset_title] = ad.X.shape[0]\n        paths[dataset_title] = f\"{dataset_title}.h5ad\"\n        all_species[dataset_title] = species\n        \n        return # Skip everything else\n    # Read the raw AD\n    ad = sc.read(save_path)\n    \n    # Change to counts\n    if not sc._utils.check_nonnegative_integers(ad.X):\n        # don't have counts yet, need raw\n        if ad.raw is None:\n            print(\"Skipped, no counts\")\n            return\n        ad.X = ad.raw.X.toarray()\n    if not sc._utils.check_nonnegative_integers(ad.X):\n        print(\"Skipped, no counts\")\n        return\n        \n    # SUBSET TO primary data\n    if len(np.unique(ad.obs[\"is_primary_data\"])) >= 1:\n        primary_data = ad.obs.is_primary_data.value_counts()\n        ad = ad[ad.obs.is_primary_data]\n    if ad.X.shape[0] == 0:\n        print(\"no primary data\")\n        print(primary_data)\n        os.rename(save_path, no_primary_path)\n        return # No primary data\n    print(\"has primary data\")\n    # Switch to gene symbols\n    ad.var[\"feature_id_orig\"] = list(ad.var.index)\n    ad.var_names = list(ad.var.feature_name)\n\n    # Get organism\n    if \"organism\" in ad.obs.columns:\n        unique_organisms = list(ad.obs.organism.unique().categories)\n        unique_organism_str = \", \".join(unique_organisms)\n    else:\n        unique_organism_str = \"human\"\n    species = species_to_readable.get(unique_organism_str, \"human\")\n    # Filter 
to gene symbols with protein embeddings\n    ad, _ = load_gene_embeddings_adata(\n        adata=ad,\n        species=[species],\n        embedding_model=\"ESM2\"\n    )\n    \n    ad = ad.copy()\n    # Simple filtering by counts\n    sc.pp.filter_cells(ad, min_genes=200)\n    sc.pp.filter_genes(ad, min_cells=10)\n    \n    #print(ad)\n    \n    if \"sample\" in ad.obs.columns:\n        try:\n            sc.pp.highly_variable_genes(ad, flavor=\"seurat_v3\", n_top_genes=N_TOP_GENES, subset=True, batch_key=\"sample\")\n        except:\n            try:\n                sc.pp.highly_variable_genes(ad, flavor=\"seurat_v3\", n_top_genes=N_TOP_GENES, subset=True, batch_key=\"sample\", span=1)\n            except:\n                print(f\"can't hv gene subset {dataset_title}\")\n        covar_cols[dataset_title] = \"sample\"\n    elif \"batch\" in ad.obs.columns:\n        try:\n            sc.pp.highly_variable_genes(ad, flavor=\"seurat_v3\", n_top_genes=N_TOP_GENES, subset=True, batch_key=\"batch\")\n        except:\n            try:\n                sc.pp.highly_variable_genes(ad, flavor=\"seurat_v3\", n_top_genes=N_TOP_GENES, subset=True, batch_key=\"batch\", span=1)\n            except:\n                print(f\"can't hv gene subset {dataset_title}\")\n        covar_cols[dataset_title] = \"batch\"\n    else:\n        try:\n            sc.pp.highly_variable_genes(ad, flavor=\"seurat_v3\", n_top_genes=N_TOP_GENES, subset=True)\n        except:\n            try:\n                sc.pp.highly_variable_genes(ad, flavor=\"seurat_v3\", n_top_genes=N_TOP_GENES, subset=True, span=1)\n            except:\n                print(f\"can't hv gene subset {dataset_title}\")\n        covar_cols[dataset_title] = \"\"\n        \n    num_genes[dataset_title] = ad.X.shape[1]\n    num_cells[dataset_title] = ad.X.shape[0]\n    paths[dataset_title] = f\"{dataset_title}.h5ad\"\n    all_species[dataset_title] = species\n    \n    print(\"writing proc\")\n    ad.write(proc_path)\n    \n    arr 
= data_to_torch_X(ad.X).numpy()\n    \n    shape = arr.shape\n    \n    fp = np.memmap(npz_path, dtype='int64', mode='w+', shape=shape)\n    fp[:] = arr[:]\n    fp.flush()\n    \n    return\n    \nif __name__ == '__main__':\n    '''\n    manager = Manager()\n    num_genes = manager.dict()\n    num_cells = manager.dict()\n    paths = manager.dict()\n    all_species = manager.dict()\n    covar_cols = manager.dict()\n    '''\n    num_genes = {}\n    num_cells = {}\n    paths = {}\n    all_species = {}\n    covar_cols = {}\n\n    df = pd.DataFrame()\n    # Shuffle the dataset \n    census_datasets = census_datasets#.iloc[270:]\n    iterrows = list(census_datasets.iterrows())\n    #p = Pool(8)\n    #for row in tqdm(iterrows, total=len(census_datasets)):\n    #    p.apply_async(process_row, args=(row, num_genes, num_cells, paths, all_species, covar_cols))            \n    #p.close()\n    #p.join()\n    '''\n    with Pool(1) as p:\n        nrows = len(iterrows)\n        inputs = zip(iterrows, [num_genes]*nrows, [num_cells]*nrows, [paths]*nrows, [all_species]*nrows, [covar_cols]*nrows)\n        for _ in tqdm(p.istarmap(process_row, inputs),\n                           total=nrows):\n            pass\n    \n    '''\n    \n    if os.path.exists(\"dataset_rows_mouse_fixed.pkl\"):\n        dataset_rows = {}\n        for path in glob.glob(\"dataset_rows_mouse_fixed*.pkl\"):\n            with open(path, \"rb\") as f:\n                dataset_rows_path = pkl.load(f)\n                dataset_rows.update(dataset_rows_path)\n                \n        print(f\"{len(dataset_rows)} already counted\")\n    else:\n        dataset_rows = {}\n    \n    \n    pbar = tqdm(iterrows)\n    all_errors = []\n    total_number_of_cells = 0\n    \n    duplicate_titles = ['Dissection: Body of hippocampus (HiB) - Rostral DG-CA4', 'Retina',\n       'Colon', 'Myeloid cells', 'Ileum', 'Airway']\n    duplicate_titles_2 = ['retina', 'airway', 'myeloid_cells', 'colon', 'ileum', 'immune_cells']\n    \n    
for row in pbar:\n        dataset_title = row[1].dataset_title\n        if dataset_title in duplicate_titles:\n            dataset_title = row[1].collection_name + row[1].dataset_title\n\n        dataset_title = dataset_title.lower().replace(' ', '_').replace(\",\", \"\").replace(\"/\", \"\")\n\n        if dataset_title in duplicate_titles_2:\n            dataset_title = (row[1].collection_name + \"_\" + dataset_title).lower().replace(' ', '_').replace(\",\", \"\").replace(\"/\", \"\")\n        \n        print(f\"{total_number_of_cells} cells done\")\n        if dataset_title in dataset_rows:\n            paths[dataset_title] = dataset_rows[dataset_title][0]\n            all_species[dataset_title] = dataset_rows[dataset_title][1]\n            covar_cols[dataset_title] = dataset_rows[dataset_title][2]\n            num_cells[dataset_title] = dataset_rows[dataset_title][3]\n            num_genes[dataset_title] = dataset_rows[dataset_title][4]\n            #print(\"skipped read of proc\")\n            \n            total_number_of_cells += dataset_rows[dataset_title][3]\n            continue # Skip!\n        else:\n            pbar.set_description(f\"{dataset_title} proc\")\n            try:\n                process_row(row, num_genes, num_cells, paths, all_species, covar_cols, dataset_title=dataset_title)\n            except:\n                print(f\"****{dataset_title} ERROR****\")\n                all_errors.append(dataset_title)\n                \n                \n            pbar.set_description(f\"{dataset_title} done\")\n            \n            if dataset_title in paths:\n                dataset_rows[dataset_title] = [paths[dataset_title], all_species[dataset_title], covar_cols[dataset_title], num_cells[dataset_title], num_genes[dataset_title], dataset_title]\n\n                total_number_of_cells += dataset_rows[dataset_title][3]\n\n                with open(\"dataset_rows_mouse_fixed.pkl\", \"wb\") as f:\n                    pkl.dump(dataset_rows, f)\n   
                 print(\"wrote pkl\")\n            \n    # path,species,covar_col,num_cells,names\n    \n    df[\"path\"] = list(paths.values())\n    df[\"species\"] = list(all_species.values())\n    df[\"covar_col\"] = list(covar_cols.values())\n    df[\"num_cells\"] = list(num_cells.values())\n    df[\"num_genes\"] = list(num_genes.values())\n    df[\"names\"] = list(paths.keys())\n\n    print(df.head(20))\n    print()\n    print(\"Errors:\")\n    print(all_errors)\n    df.to_csv(\"cxg_datasets.csv\", index=False)\n"
  },
  {
    "path": "data_proc/gene_embeddings.py",
    "content": "\"\"\"Helper functions for loading pretrained gene embeddings.\"\"\"\nfrom pathlib import Path\nfrom typing import Dict, Tuple\n\nimport torch\n\nfrom scanpy import AnnData\nimport numpy as np\nimport pandas as pd\n\n\nEMBEDDING_DIR = Path('model_files/protein_embeddings')\nMODEL_TO_SPECIES_TO_GENE_EMBEDDING_PATH = {\n    'ESM2': {\n        'human': EMBEDDING_DIR / 'Homo_sapiens.GRCh38.gene_symbol_to_embedding_ESM2.pt',\n        'mouse': EMBEDDING_DIR / 'Mus_musculus.GRCm39.gene_symbol_to_embedding_ESM2.pt',\n        'frog': EMBEDDING_DIR / 'Xenopus_tropicalis.Xenopus_tropicalis_v9.1.gene_symbol_to_embedding_ESM2.pt',\n        'zebrafish': EMBEDDING_DIR / 'Danio_rerio.GRCz11.gene_symbol_to_embedding_ESM2.pt',\n        \"mouse_lemur\": EMBEDDING_DIR / \"Microcebus_murinus.Mmur_3.0.gene_symbol_to_embedding_ESM2.pt\",\n        \"pig\": EMBEDDING_DIR / 'Sus_scrofa.Sscrofa11.1.gene_symbol_to_embedding_ESM2.pt',\n        \"macaca_fascicularis\": EMBEDDING_DIR / 'Macaca_fascicularis.Macaca_fascicularis_6.0.gene_symbol_to_embedding_ESM2.pt',\n        \"macaca_mulatta\": EMBEDDING_DIR / 'Macaca_mulatta.Mmul_10.gene_symbol_to_embedding_ESM2.pt',\n    }\n}\n\nextra_species = pd.read_csv(\"./model_files/new_species_protein_embeddings.csv\").set_index(\"species\").to_dict()[\"path\"]\nMODEL_TO_SPECIES_TO_GENE_EMBEDDING_PATH[\"ESM2\"].update(extra_species) # adds new species\n\n\ndef load_gene_embeddings_adata(adata: AnnData, species: list, embedding_model: str) -> Tuple[AnnData, Dict[str, torch.FloatTensor]]:\n    \"\"\"Loads gene embeddings for all the species/genes in the provided data.\n\n    :param data: An AnnData object containing gene expression data for cells.\n    :param species: Species corresponding to this adata\n    \n    :param embedding_model: The gene embedding model whose embeddings will be loaded.\n    :return: A tuple containing:\n               - A subset of the data only containing the gene expression for genes with embeddings in all 
species.\n               - A dictionary mapping species name to the corresponding gene embedding matrix (num_genes, embedding_dim).\n    \"\"\"\n    # Get species names\n    species_names = species\n    species_names_set = set(species_names)\n\n    # Get embedding paths for the model\n    species_to_gene_embedding_path = MODEL_TO_SPECIES_TO_GENE_EMBEDDING_PATH[embedding_model]\n    available_species = set(species_to_gene_embedding_path)\n\n    # Ensure embeddings are available for all species\n    if not (species_names_set <= available_species):\n        raise ValueError(f'The following species do not have gene embeddings: {species_names_set - available_species}')\n\n    # Load gene embeddings for desired species (and convert gene symbols to lower case)\n    species_to_gene_symbol_to_embedding = {\n        species: {\n            gene_symbol.lower(): gene_embedding\n            for gene_symbol, gene_embedding in torch.load(species_to_gene_embedding_path[species]).items()\n        }\n        for species in species_names\n    }\n\n    # Determine which genes to include based on gene expression and embedding availability\n    genes_with_embeddings = set.intersection(*[\n        set(gene_symbol_to_embedding)\n        for gene_symbol_to_embedding in species_to_gene_symbol_to_embedding.values()\n    ])\n    genes_to_use = {gene for gene in adata.var_names if gene.lower() in genes_with_embeddings}\n\n    # Subset data to only use genes with embeddings\n    adata = adata[:, adata.var_names.isin(genes_to_use)]\n\n    # Set up dictionary mapping species to gene embedding matrix (num_genes, embedding_dim)\n    species_to_gene_embeddings = {\n        species_name: torch.stack([\n            species_to_gene_symbol_to_embedding[species_name][gene_symbol.lower()]\n            for gene_symbol in adata.var_names\n        ])\n        for species_name in species_names\n    }\n\n    return adata, species_to_gene_embeddings\n"
  },
  {
    "path": "data_proc/generate_reduced_chrom_files.py",
    "content": "import os\nos.environ[\"OMP_NUM_THREADS\"] = \"4\" # export OMP_NUM_THREADS=4\nos.environ[\"OPENBLAS_NUM_THREADS\"] = \"4\" # export OPENBLAS_NUM_THREADS=4 \nos.environ[\"MKL_NUM_THREADS\"] = \"4\" # export MKL_NUM_THREADS=6\nos.environ[\"VECLIB_MAXIMUM_THREADS\"] = \"4\" # export VECLIB_MAXIMUM_THREADS=4\nos.environ[\"NUMEXPR_NUM_THREADS\"] = \"4\"\n\n\nimport warnings\nwarnings.filterwarnings(\"ignore\")\n\nimport scanpy as sc\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torch.optim as optim\nimport numpy as np\nimport pickle\nimport os\nimport argparse\nimport logging\nimport time\n\nfrom tqdm.auto import tqdm\nimport matplotlib.pyplot as plt\nimport pandas as pd\n\n#sc._settings.ScanpyConfig.n_jobs = 6\n\nimport math\nfrom typing import Tuple\n\nimport torch\nfrom torch import nn, Tensor\nimport torch.nn.functional as F\nfrom torch.nn import TransformerEncoder, TransformerEncoderLayer\nfrom torch.utils.data import dataset\n\n\nfrom accelerate import Accelerator\nimport anndata\nfrom data_utils import adata_path_to_prot_chrom_starts, get_spec_chrom_csv\n\n\n\nfrom torch.utils.data import dataset\nfrom torch.utils.data import DataLoader, TensorDataset\nfrom scipy.stats import binom\n\n\n\n\ndef padding_tensor(sequences):\n    \"\"\"\n    :param sequences: list of tensors\n    :return:\n    \"\"\"\n    num = len(sequences)\n    max_len = max([s.size(0) for s in sequences])\n    out_dims = (num, max_len, 1280)\n    \n    \n    out_tensor = sequences[0].data.new(*out_dims).fill_(0)\n    out_dims2 = (num, max_len)\n    \n    mask = sequences[0].data.new(*out_dims2).fill_(float('-inf'))\n    for i, tensor in enumerate(sequences):\n        length = tensor.size(0)\n        out_tensor[i, :length] = tensor\n        mask[i, :length] = 1\n    return out_tensor.permute(1, 0, 2), mask\n\n\nfrom pathlib import Path\n# ESM1b\n'''\nEMBEDDING_DIR = Path('/dfs/project/cross-species/data/proteome/embeddings')\nhuman_pe_dir =  
EMBEDDING_DIR / 'Homo_sapiens.GRCh38.gene_symbol_to_embedding_ESM1b.pt'\nmouse_pe_dir =  EMBEDDING_DIR / 'Mus_musculus.GRCm39.gene_symbol_to_embedding_ESM1b.pt'\nlemur_pe_dir =  Path(\"/dfs/project/cross-species/yanay/data/proteome/embeddings/\") / 'Microcebus_murinus.Mmur_3.0.gene_symbol_to_embedding_ESM1b.pt'\n\n'''\n\n# Upgrade to ESM2\nEMBEDDING_DIR = Path('/dfs/project/cross-species/data/proteome/embeddings')\nEMBEDDING_DIR = Path('/dfs/project/cross-species/yanay/data/proteome/embeddings')\n\nembeddings_paths = {\n        'human': EMBEDDING_DIR / 'Homo_sapiens.GRCh38.gene_symbol_to_embedding_ESM2.pt',\n        'mouse': EMBEDDING_DIR / 'Mus_musculus.GRCm39.gene_symbol_to_embedding_ESM2.pt',\n        'frog': EMBEDDING_DIR / 'Xenopus_tropicalis.Xenopus_tropicalis_v9.1.gene_symbol_to_embedding_ESM2.pt',\n        'zebrafish': EMBEDDING_DIR / 'Danio_rerio.GRCz11.gene_symbol_to_embedding_ESM2.pt',\n        \"mouse_lemur\": EMBEDDING_DIR / \"Microcebus_murinus.Mmur_3.0.gene_symbol_to_embedding_ESM2.pt\",\n        \"pig\": EMBEDDING_DIR / 'Sus_scrofa.Sscrofa11.1.gene_symbol_to_embedding_ESM2.pt',\n        \"macaca_fascicularis\": EMBEDDING_DIR / 'Macaca_fascicularis.Macaca_fascicularis_6.0.gene_symbol_to_embedding_ESM2.pt',\n        \"macaca_mulatta\": EMBEDDING_DIR / 'Macaca_mulatta.Mmul_10.gene_symbol_to_embedding_ESM2.pt',\n    }\n\nspecies_to_pe = {\n    species:torch.load(pe_dir) for species, pe_dir in embeddings_paths.items()   \n}\n\nspecies_to_pe = {species:{k.upper(): v for k,v in pe.items()} for species, pe in species_to_pe.items()}\n\n#species_to_keys = {species:list(pe.keys()) for species, pe in species_to_pe.items()}\n#species_to_keys = {species:dict(zip(keys, np.arange(len(keys)))) for species, keys in species_to_keys.items()}\n\n\n#datasets_df = pd.read_csv(\"/dfs/project/cross-species/yanay/code/UCE/data_proc/full_train_datasets.csv\")\ndatasets_df = pd.read_csv(\"tissue_datasets.csv\")\ndatasets_df = pd.read_csv(\"perturb_datasets.csv\")\ndatasets_df 
= pd.read_csv(\"../new_perturb_datasets.csv\")\n\n\n#pd.concat((#pd.read_csv(\"new_datasets.csv\"),\n                             #pd.read_csv(\"pbmcs_nohvg.csv\"), \n                             #pd.read_csv(\"lung_nohvg.csv\"),\n                             #pd.read_csv(\"new_tabula_datasets.csv\"),\n                             #pd.read_csv(\"updated_datasets.csv\"),\n  #                           #pd.read_csv(\"sanger_heart_atlas_datasets.csv\"),\n #                            pd.read_csv(\"tissue_datasets.csv\")\n #                       ))\n\n\n\n\n#datasets_df = pd.read_csv(\"cell_cycle_datasets.csv\")\n#datasets_df = pd.read_csv(\"spatial_datasets.csv\")\n#datasets_df = pd.read_csv(\"perturb_datasets.csv\")\n#datasets_df = pd.read_csv(\"ccle_datasets.csv\")\n#datasets_df = pd.read_csv(\"pancreas_datasets.csv\")\n\n\n\nsorted_dataset_names = sorted(datasets_df[\"names\"])\nwith open(\"dataset_shapes.pkl\", \"rb\") as f:\n    shapes_dict = pickle.load(f)\n    \n\nshapes_dict.update({\n \"madissoon_novel_lung\":(190728, 8000),   \n 'flores_cerebellum_human': (20232, 8000),\n 'osuch_gut_human': (272310, 8000),\n 'msk_ovarian_human': (929690, 8000),\n 'htan_vmuc_dis_epi_human': (65084, 8000),\n 'htan_vmuc_val_epi_human': (57564, 8000),\n 'htan_vmuc_non_epi_human': (9099, 8000),\n 'hao_pbmc_3p_human': (161764, 8000),\n 'hao_pbmc_5p_human': (49147, 8000),\n 'gao_tumors_human': (36111, 8000),\n 'swabrick_breast_human': (92427, 8000),\n 'wu_cryo_tumors_human': (105662, 8000),\n 'cell_line_het_human': (53513, 8000),\n 'bi_allen_metastasis_human': (27787, 8000),\n 'zheng68k_human': (68579, 8000),\n 'zheng68k_12k_human': (68579, 12000),\n 'mouse_embryo_ct': (153597, 12000),\n \"regev_gtex_heart\": (36574, 8000),\n \"tabula_sapiens_heart\": (11505, 8000),\n \"10k_pbmcs\":(11990, 12000),\n \"epo_ido\":(35834,12000),\n 'tabula_sapiens_kidney': (9641, 8000),\n 'tabula_microcebus_kidney': (14592, 8000),\n 'tabula_muris_kidney': (2781, 8000),\n 'tabula_muris_senis_kidney': 
(19610, 8000),\n  'immune_human': (33506, 8000)\n                   })\n\nfor row in datasets_df.iterrows():\n    ngenes = row[1].num_genes\n    ncells = row[1].num_cells\n    name = row[1].names\n    if not np.isnan(ngenes):\n        shapes_dict[name] = (int(ncells), int(ngenes))\n                   \n#with open(\"dataset_shapes.pkl\", \"wb\") as f:\n#    pickle.dump(shapes_dict, f)\ntoken_dim = 5120\nmmap_dict = {}\n\nroot_dir = \"/lfs/local/0/yanay/uce_h5s/\"\nroot_dir_census = \"/lfs/local/0/yanay/cxg_h5s/\"\n\ndataset_to_paths = {r[1][\"names\"]:root_dir + r[1][\"path\"].replace(\".h5ad\", \"_proc.h5ad\") for r in datasets_df.iterrows()}\nfor row in datasets_df.iterrows():\n    name = row[1].names\n    census = row[1].census\n    \n    if census == \"yes\":\n        dataset_to_paths[name] = dataset_to_paths[name].replace(root_dir, root_dir_census)\n\n\ndatasets_to_species = {r[1][\"names\"]:r[1][\"species\"] for r in datasets_df.iterrows()}\n\n#species_to_pe = {\"mouse\":mouse_pe, \"human\":human_pe, \"mouse_lemur\":lemur_pe}\n\n#dataset_to_protein_embeddings_all = {k:species_to_pe[v] for k, v in datasets_to_species.items()}\n\ndataset_to_protein_embeddings = {}\n\n\n#dataset_to_protein_embeddings_all[\"madissoon_novel_lung\"] = species_to_pe[\"human\"]\ndatasets_to_species[\"madissoon_novel_lung\"] = \"human\"\n#dataset_to_paths[\"madissoon_novel_lung\"] = \"/lfs/local/0/yanay/uce_h5s/madissoon_novel_lung_proc.h5ad\"\n\n\n\n# New Chrom Based Code\ngene_to_chrom_pos = get_spec_chrom_csv()\nspecies_to_chrom_categories = {}\n\nfor species in np.unique(gene_to_chrom_pos[\"species\"]):\n    species_to_chrom_categories[species] = pd.Categorical(gene_to_chrom_pos[\"chromosome\"]).categories\n\n    \ndataset_to_chroms = {}\ndataset_to_starts = {}\n\nsorted_species_names = sorted(species_to_pe.keys())\nprint(sorted_species_names)\n\nif os.path.exists(f\"/dfs/project/uce/all_species_pe_tokens.torch\"):\n    all_pe = 
torch.load(f\"/dfs/project/uce/all_species_pe_tokens.torch\")\n    with open(\"/dfs/project/uce/all_species_offsets.pkl\", \"rb\") as f:\n        species_to_offsets = pickle.load(f)\n    print(\"Loaded PE\", all_pe.shape)\nelse:\n    torch.manual_seed(8)\n    MASK_TENSOR = torch.zeros((1, token_dim)) # this is the padding token\n    CHROM_TENSOR_LEFT = torch.normal(mean=0, std=1, size=(1, token_dim))\n    CHROM_TENSOR_RIGHT = torch.normal(mean=0, std=1, size=(1, token_dim))\n    CLS_TENSOR = torch.normal(mean=0, std=1, size=(1, token_dim))\n    species_to_offsets = {}\n\n    all_pe = [MASK_TENSOR, CHROM_TENSOR_LEFT, CHROM_TENSOR_RIGHT, CLS_TENSOR]\n    offset = len(all_pe) # special tokens at the top!\n    for species in sorted_species_names:\n        pe_stacked = torch.stack(list(species_to_pe[species].values()))\n        all_pe.append(pe_stacked)\n        species_to_offsets[species] = offset\n        offset += pe_stacked.shape[0]\n\n    all_pe = torch.vstack(all_pe)\n    print(all_pe.shape)\n    torch.save(all_pe, f\"/dfs/project/uce/all_species_pe_tokens.torch\")\n    with open(\"/dfs/project/uce/all_species_offsets.pkl\", \"wb+\") as f:\n        pickle.dump(species_to_offsets, f)\n    print(\"Saved PE\")\n\n# Load in already saved!\nif os.path.exists(f\"/lfs/local/0/yanay/reduced_datasets_to_pe_chrom_{token_dim}_new.torch\"):\n    dataset_to_protein_embeddings = torch.load(f\"/lfs/local/0/yanay/reduced_datasets_to_pe_chrom_{token_dim}_new.torch\")\n\n    with open(\"/lfs/local/0/yanay/dataset_to_chroms_new.pkl\", \"rb\") as f:\n        dataset_to_chroms = pickle.load(f)\n    with open(\"/lfs/local/0/yanay/dataset_to_starts_new.pkl\", \"rb\") as f:\n        dataset_to_starts = pickle.load(f)\nelse:\n    dataset_to_protein_embeddings = {}\n    dataset_to_chroms = {}\n    dataset_to_starts = {}\n\n\n# Add the new ones\nprint(\"creating reduced size protein embeddings file\")\n\nredo = True\n\nfor dataset, path in tqdm(list(dataset_to_paths.items())):\n    if 
dataset in dataset_to_protein_embeddings.keys() and not redo:\n        continue # skip since already procced\n    print(dataset)\n    adata = sc.read(path)\n    dataset_species = datasets_to_species[dataset]\n    spec_pe_genes = list(species_to_pe[dataset_species].keys())\n    offset = species_to_offsets[dataset_species]\n    \n    # Get proper idxs\n    pe_row_idxs, dataset_chroms, dataset_pos = adata_path_to_prot_chrom_starts(adata, dataset_species, spec_pe_genes, gene_to_chrom_pos, offset)\n    # Add to dicts\n    dataset_to_chroms[dataset] = dataset_chroms\n    dataset_to_starts[dataset] = dataset_pos\n    dataset_to_protein_embeddings[dataset] = pe_row_idxs\n    \n    del adata\n# save Dicts and idxs\ntorch.save(dataset_to_protein_embeddings, f\"/lfs/local/0/yanay/reduced_datasets_to_pe_chrom_{token_dim}_new.torch\")\n\nwith open(\"/lfs/local/0/yanay/dataset_to_chroms_new.pkl\", \"wb+\") as f:\n    pickle.dump(dataset_to_chroms, f)\nwith open(\"/lfs/local/0/yanay/dataset_to_starts_new.pkl\", \"wb+\") as f:\n    pickle.dump(dataset_to_starts, f)        "
  },
  {
    "path": "data_proc/preproc_many_dataset.py",
    "content": "import os\nos.environ[\"OMP_NUM_THREADS\"] = \"10\" # export OMP_NUM_THREADS=4\nos.environ[\"OPENBLAS_NUM_THREADS\"] = \"10\" # export OPENBLAS_NUM_THREADS=4 \nos.environ[\"MKL_NUM_THREADS\"] = \"10\" # export MKL_NUM_THREADS=6\nos.environ[\"VECLIB_MAXIMUM_THREADS\"] = \"10\" # export VECLIB_MAXIMUM_THREADS=4\nos.environ[\"NUMEXPR_NUM_THREADS\"] = \"10\"\n\n\n\nimport argparse\nfrom collections import defaultdict\nfrom typing import Dict, List, Optional, Tuple\n\nimport torch\nimport torch.utils.data as data\nimport numpy as np\nimport scanpy as sc\nfrom numpy import array\nimport subprocess\nfrom tqdm import tqdm\nimport warnings\nwarnings.filterwarnings(\"ignore\")\n\n\nfrom gene_embeddings import load_gene_embeddings_adata\nimport pandas as pd\nfrom scanpy import AnnData\nfrom data_utils import process_raw_anndata\n\ndef data_to_torch_X(X):\n    if isinstance(X, sc.AnnData):\n        X = X.X\n    if not isinstance(X, np.ndarray):\n        X = X.toarray()\n    return torch.from_numpy(X).float()\n\nclass SincleCellDataset(data.Dataset):\n    def __init__(self,\n                expression: torch.tensor, # Subset to hv genes, count data! 
cells x genes\n                protein_embeddings: torch.tensor, # same order as expression, also subset genes x pe\n                labels: None, # optional, tensor of labels\n                covar_vals: None, # tensor of covar values or none\n                ) -> None:\n        super(SincleCellDataset, self).__init__()\n        \n        # Set expression\n        self.expression = expression\n        \n        row_sums = self.expression.sum(1) # UMI counts per cell\n        log_norm_count_adj = torch.log1p(self.expression / row_sums.unsqueeze(1) * torch.tensor(1000))\n        \n        # Set log norm and count adjusted expression\n        max_vals, max_idx = torch.max(log_norm_count_adj, dim=0)\n        self.expression_mod = log_norm_count_adj / max_vals\n        \n        # Calculate dropout likelihoods of each gene\n        self.dropout_vec = (self.expression == 0).float().mean(0) # per-gene dropout fraction\n        \n        # Set data info\n        self.num_cells = self.expression.shape[0]\n        self.num_genes = self.expression.shape[1]\n        \n        # Set optional label info, including categorical covariate index\n        self.covar_vals = covar_vals\n        self.labels = labels\n        \n        # Set protein embeddings\n        self.protein_embeddings = protein_embeddings\n        \n        self.item_mode = \"expression\"\n        if self.covar_vals is not None:\n            self.item_mode = \"expression+covar\"\n        \n        \n    def __getitem__(self, idx):\n        if self.item_mode == \"expression\":\n            if isinstance(idx, int):\n                if idx < self.num_cells:\n                    return self.expression[idx, :]\n                else:\n                    raise IndexError\n            else:\n                raise NotImplementedError\n        elif self.item_mode == \"expression+covar\":\n            if isinstance(idx, int):\n                if idx < self.num_cells:\n                    return 
self.expression[idx, :], self.covar_vals[idx]\n                else:\n                    raise IndexError\n            else:\n                raise NotImplementedError\n            \n\n    def __len__(self) -> int:\n        return self.num_cells\n\n    def get_dim(self) -> Dict[str, int]:\n        return self.num_genes\n\n\ndef data_to_torch_X(X):\n    if isinstance(X, sc.AnnData):\n        X = X.X\n    if not isinstance(X, np.ndarray):\n            X = X.toarray()\n    return torch.from_numpy(X).float()\n\n\ndef anndata_to_sc_dataset(adata:sc.AnnData, \n                                 species:str=\"human\", \n                                 labels:list=[],\n                                 covar_col:str=None,\n                                 hv_genes:int=12000,\n                                 embedding_model=\"ESM1b\",\n                                ) -> (SincleCellDataset, AnnData):\n    \n    # Subset to just genes we have embeddings for\n    adata, protein_embeddings = load_gene_embeddings_adata(\n        adata=adata,\n        species=[species],\n        embedding_model=embedding_model\n    )\n    \n    if DO_HVG:\n        sc.pp.highly_variable_genes(adata, flavor='seurat_v3', n_top_genes=hv_genes)  # Expects Count Data\n    \n        hv_index = adata.var[\"highly_variable\"]\n        adata = adata[:, hv_index] # Subset to hv genes only\n    \n        protein_embeddings = protein_embeddings[species][hv_index]\n    else:\n        protein_embeddings = protein_embeddings[species]\n    expression = data_to_torch_X(adata.X)\n    \n    covar_vals = None\n    if len(labels) > 0:\n        assert covar_col is None or covar_col in labels, \"Covar needs to be in labels\" # make sure you keep track of covar column!\n        labels = adata.obs.loc[:, labels].values\n        \n        if covar_col is not None:\n            # we have a categorical label to use as covariate\n            covar_vals = torch.tensor(pd.Categorical(adata.obs[covar_col]).codes)\n    return 
SincleCellDataset(\n        expression=expression,\n        protein_embeddings=protein_embeddings,\n        labels=labels,\n        covar_vals=covar_vals\n    ), adata    \n    \ndef proc(args):\n    datasets_df = pd.read_csv(args.datasets_df)\n    datasets_df[\"covar_col\"] = np.nan\n    skip = args.skip\n    additional_filter = args.filter\n    DO_HVG = args.DO_HVG\n    \n    num_genes = {}\n    num_cells = {}\n    \n    ir = list(datasets_df.iterrows())\n    for i, row in tqdm(ir, total=len(datasets_df)):\n        # Folder paths and the SCP target come from the parsed command-line arguments\n        _, ncells, ngenes = process_raw_anndata(row, args.h5_folder_path, args.npz_folder_path, args.scp, skip, additional_filter, root=args.file_root_path)\n        if (ncells is not None) and (ngenes is not None):\n            # Key by the dataset path so the lookups below match datasets_df[\"path\"]\n            num_genes[row[\"path\"]] = ngenes\n            num_cells[row[\"path\"]] = ncells\n        \n    if \"num_cells\" not in datasets_df.columns:\n        datasets_df[\"num_cells\"] = 0\n    if \"num_genes\" not in datasets_df.columns:\n        datasets_df[\"num_genes\"] = 0\n    for k in num_genes.keys():\n        ng = num_genes[k]\n        nc = num_cells[k]\n        datasets_df.loc[datasets_df[\"path\"] == k, \"num_cells\"] = nc\n        datasets_df.loc[datasets_df[\"path\"] == k, \"num_genes\"] = ng\n    # Write with the cells and genes info back to the original path\n    datasets_df.to_csv(args.datasets_df, index=False)\n\n\nif __name__ == \"__main__\":\n    # Parse command-line arguments\n    \n    parser = argparse.ArgumentParser(description='Preprocess h5ad datasets.')\n\n    # Define command-line arguments\n    parser.add_argument('--scp', type=str, default=\"\", help='Name of a SNAP server to SCP the results to. 
It should have the same folders as the script is already saving to.')\n    parser.add_argument('--h5_folder_path', type=str, default=\"/lfs/local/0/yanay/uce_h5s/\", help='Folder to save H5s to.')\n    parser.add_argument('--npz_folder_path', type=str, default=\"/lfs/local/0/yanay/uce_proc/\", help='Folder to save NPZs to.')\n    \n    \n    parser.add_argument('--datasets_df', type=str, default=\"/dfs/project/uce/new_perturb_datasets.csv\", help='Path to datasets csv. Will be overwritten to have the correct num cells and num genes for each dataset.')\n    \n    parser.add_argument('--filter', type=bool, default=True, help='Should you do an additional gene/cell filtering? This can be a good step since even if you have already done it, subsetting to protein embeddings can make some cells sparser.')\n    parser.add_argument('--skip', type=bool, default=True, help='Should you skip datasets that appear to have already been created in the h5 folder?')\n    \n    parser.add_argument('--DO_HVG', type=bool, default=False, help='Should a HVG subset be done.')\n    \n    \n    args = parser.parse_args()\n    proc(args)\n    "
  },
  {
    "path": "eval_data.py",
    "content": "\"\"\"\nDataloaders\n\n\"\"\"\n\nimport warnings\nwarnings.filterwarnings(\"ignore\")\nimport sys\nsys.path.append('../')\nfrom typing import Dict, List, Optional, Tuple, Any\nimport torch\nimport numpy as np\nimport pickle\nimport torch.utils.data as data\n\n\nclass MultiDatasetSentences(data.Dataset):\n    def __init__(self, sorted_dataset_names, shapes_dict, args, \n                 dataset_to_protein_embeddings_path= \"/lfs/local/0/yanay/reduced_datasets_to_pe_chrom_5120_new.torch\",\n                 datasets_to_chroms_path=\"/lfs/local/0/yanay/dataset_to_chroms_new.pkl\",\n                 datasets_to_starts_path=\"/lfs/local/0/yanay/dataset_to_starts_new.pkl\",\n                 npzs_dir=\"/lfs/local/0/yanay/uce_proc/\") -> None:\n        super(MultiDatasetSentences, self).__init__()\n        # self.xs = {}\n        self.num_cells = {}\n        self.num_genes = {}\n        self.shapes_dict = shapes_dict\n        self.args = args\n\n        self.total_num_cells = 0\n        for name in sorted_dataset_names:\n            num_cells, num_genes = self.shapes_dict[name]\n            # self.xs[name] = X\n            self.num_cells[name] = num_cells\n            self.num_genes[name] = num_genes\n\n            self.total_num_cells += num_cells\n\n        self.datasets = sorted_dataset_names\n\n        # TODO: preferably not hard-coded here\n        self.dataset_to_protein_embeddings = torch.load(dataset_to_protein_embeddings_path)\n        with open(datasets_to_chroms_path, \"rb\") as f:\n            self.dataset_to_chroms = pickle.load(f)\n        with open(datasets_to_starts_path, \"rb\") as f:\n            self.dataset_to_starts = pickle.load(f)\n        \n        self.npzs_dir = npzs_dir\n\n    def __getitem__(self, idx):\n        if isinstance(idx, int):\n            for dataset in sorted(self.datasets):\n                if idx < self.num_cells[dataset]:\n                    #cts = np.memmap(f\"/lfs/local/0/yanay/cxg_npzs/\" + 
f\"{dataset}_counts.npz\",\n                    #        dtype='int64', mode='r', shape=self.shapes_dict[dataset])\n                    cts = np.memmap(self.npzs_dir + f\"{dataset}_counts.npz\", dtype='int64', mode='r', shape=self.shapes_dict[dataset])\n                    counts = cts[idx]\n                    counts = torch.tensor(counts).unsqueeze(0)\n                    weights = torch.log1p(counts)\n                    weights = (weights / torch.sum(weights))\n                    batch_sentences, mask, seq_len, cell_sentences = \\\n                        sample_cell_sentences(counts, weights, dataset, self.args,\n                            dataset_to_protein_embeddings= self.dataset_to_protein_embeddings,\n                            dataset_to_chroms=self.dataset_to_chroms,\n                            dataset_to_starts=self.dataset_to_starts)\n                    return batch_sentences, mask, idx, seq_len, cell_sentences\n                else:\n                    idx -= self.num_cells[dataset]\n            raise IndexError\n        else:\n            raise NotImplementedError\n\n    def __len__(self) -> int:\n        return self.total_num_cells\n\n    def get_dim(self) -> Dict[str, int]:\n        return self.num_genes\n\n\nclass MultiDatasetSentenceCollator(object):\n    def __init__(self, args):\n        self.pad_length = args.pad_length\n\n\n    def __call__(self, batch):\n        batch_size = len(batch)\n        batch_sentences = torch.zeros((batch_size, self.pad_length))\n        mask = torch.zeros((batch_size, self.pad_length))\n        cell_sentences = torch.zeros((batch_size, self.pad_length))\n\n        idxs = torch.zeros(batch_size)\n\n        i = 0\n        max_len = 0\n        for bs, msk, idx, seq_len, cs in batch:\n            batch_sentences[i, :] = bs\n            cell_sentences[i, :] = cs\n            max_len = max(max_len, seq_len)\n            mask[i, :] = msk\n            idxs[i] = idx\n\n            i += 1\n\n        return 
batch_sentences[:, :max_len] , mask[:, :max_len], idxs, cell_sentences\n\n\n\ndef sample_cell_sentences(counts, batch_weights, dataset, args,\n                          dataset_to_protein_embeddings,\n                          dataset_to_chroms,\n                          dataset_to_starts):\n\n    dataset_idxs = dataset_to_protein_embeddings[dataset] # get the dataset specific protein embedding idxs\n    cell_sentences = torch.zeros((counts.shape[0], args.pad_length)) # init the cell representation as 0s\n    mask = torch.zeros((counts.shape[0], args.pad_length)) # start of masking the whole sequence\n    chroms = dataset_to_chroms[dataset] # get the dataset specific chroms for each gene\n    starts = dataset_to_starts[dataset] # get the dataset specific genomic start locations for each gene\n\n    longest_seq_len = 0 # we need to keep track of this so we can subset the batch at the end\n\n    for c, cell in enumerate(counts):\n        weights = batch_weights[c].numpy()\n        weights = weights / sum(weights)  # RE NORM after mask\n        \n        # randomly choose the genes that will make up the sample, weighted by expression, with replacement\n        choice_idx = np.random.choice(np.arange(len(weights)),\n                                      size=args.sample_size, p=weights,\n                                      replace=True)\n        choosen_chrom = chroms[choice_idx] # get the sampled genes chromosomes\n        # order the genes by chromosome\n        chrom_sort = np.argsort(choosen_chrom)  \n        choice_idx = choice_idx[chrom_sort]\n\n        # sort the genes by start\n        new_chrom = chroms[choice_idx]\n        choosen_starts = starts[choice_idx]\n\n        ordered_choice_idx = np.full((args.pad_length),\n                                     args.cls_token_idx)  # start with cls\n        # i= 0 first token is CLS\n        i = 1  # continue on to the rest of the sequence with left bracket being assumed.\n        # Shuffle the chroms now, there's 
no natural order to chromosomes\n        uq_chroms = np.unique(new_chrom)\n        np.random.shuffle(uq_chroms) # shuffle\n        \n        # This loop is actually just over one cell\n        for chrom in uq_chroms:\n            # Open Chrom token\n            ordered_choice_idx[i] = int(chrom) + args.CHROM_TOKEN_OFFSET # token of this chromosome # i = 1 next token is a chrom open\n            i += 1\n            # now sort the genes by start order within the chroms\n            loc = np.where(new_chrom == chrom)[0]\n            sort_by_start = np.argsort(\n                choosen_starts[loc])  # start locations for this chromsome\n\n            to_add = choice_idx[loc[sort_by_start]]\n            ordered_choice_idx[i:(i + len(to_add))] = dataset_idxs[to_add]\n            i += len(to_add)\n            ordered_choice_idx[i] = args.chrom_token_right_idx # add the chrom sep again\n            i += 1  # add the closing token again\n\n        longest_seq_len = max(longest_seq_len, i)\n        remainder_len = (args.pad_length - i)\n\n        cell_mask = torch.concat((torch.ones(i),\n                                  # pay attention to all of these tokens, ignore the rest!\n                                  torch.zeros(remainder_len)))\n\n        mask[c, :] = cell_mask\n\n        ordered_choice_idx[i:] = args.pad_token_idx # the remainder of the sequence\n        cell_sentences[c, :] = torch.from_numpy(ordered_choice_idx)\n        \n    cell_sentences_pe = cell_sentences.long() # token indices\n    \n    return cell_sentences_pe, mask, longest_seq_len, cell_sentences"
  },
  {
    "path": "eval_single_anndata.py",
    "content": "\"\"\"\nScript for Evaluating a Single AnnData\n\nParameters:\n----------\n- `adata_path` (str):\n    Full path to the AnnData you want to embed.\n- `dir` (str):\n    Working folder where all files will be saved.\n- `species` (str):\n    Species of the AnnData.\n- `filter` (bool):\n    Additional gene/cell filtering on the AnnData.\n- `skip` (bool):\n    Skip datasets that appear to have already been created.\n- `model_loc` (str):\n    Location of pretrained UCE model's weights in a `.torch` file.\n- `batch_size` (int):\n    Batch size for processing.\n- `CXG` (bool):\n    Use CXG model.\n- `nlayers` (int):\n    Number of transformer layers.\n- `output_dim` (int):\n    Desired output dimension.\n- `d_hid` (int):\n    Hidden dimension for processing.\n- `token_dim` (int):\n    Token dimension.\n- `spec_chrom_csv_path` (str):\n    CSV file mapping genes from each species to their respective chromosomes\n    and genomic start positions.\n- `token_file` (str):\n    `.torch` file containing token/protein embeddings for all tokens.\n- `protein_embeddings_dir` (str):\n    Directory containing protein embedding `.pt` files for all species.\n- `offset_pkl_path` (str):\n    `.pkl` file mapping between species and their gene's locations in the `token_file`.\n- `pad_length` (int):\n    Length to pad the cell sentence to.\n- `pad_token_idx` (int):\n    Index of the padding token in the `token_file`.\n- `chrom_token_left_idx` (int):\n    Left chromosome token index\n- `chrom_token_right_idx` (int):\n    Right chromosome token index\n- `cls_token_idx` (int):\n    CLS token index in the `token_file`.\n- `CHROM_TOKEN_OFFSET` (int):\n    Offset index, tokens after this mark are chromosome identifiers.\n- `sample_size` (int):\n    Number of genes sampled for cell sentence.\n- `multi_gpu` (bool):\n    Run evaluation on multiple GPUs (using accelerator)    \n\nReturns:\n-------\n- `dir/{dataset_name}_proc.h5ad`:\n    The processed AnnData. 
Processing involves subsetting it to genes which\n    have protein embeddings and then refiltering the dataset by minimum counts.\n- `dir/{dataset_name}_chroms.pkl`:\n    File mapping the genes in the dataset to their corresponding chromosome\n    indices.\n- `dir/{dataset_name}_counts.npz`:\n    File containing the counts of the AnnData in an easily accessible format.\n- `dir/{dataset_name}_shapes_dict.pkl`:\n    File containing the shape (ncell x ngene) of the AnnData, used to read the\n    `.npz` file.\n- `dir/{dataset_name}_pe_idx.torch`:\n    File mapping between the genes in the dataset and their index in the tokens file.\n- `dir/{dataset_name}_starts.pkl`:\n    File mapping between the genes in the dataset and their genomic start locations.\n\n\"\"\"\n\n\nimport argparse\nfrom evaluate import AnndataProcessor\nfrom accelerate import Accelerator\n\ndef main(args, accelerator):\n    processor = AnndataProcessor(args, accelerator)\n    processor.preprocess_anndata()\n    processor.generate_idxs()\n    processor.run_evaluation()\n\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser(\n        description='Embed a single anndata using UCE.')\n\n    # Anndata Processing Arguments\n    parser.add_argument('--adata_path', type=str,\n                        default=None,\n                        help='Full path to the anndata you want to embed.')\n    parser.add_argument('--dir', type=str,\n                        default=\"./\",\n                        help='Working folder where all files will be saved.')\n    parser.add_argument('--species', type=str, default=\"human\",\n                        help='Species of the anndata.')\n    parser.add_argument('--filter', type=bool, default=True,\n                        help='Additional gene/cell filtering on the anndata.')\n    parser.add_argument('--skip', type=bool, default=True,\n                        help='Skip datasets that appear to have already been created.')\n\n    # Model Arguments\n    
parser.add_argument('--model_loc', type=str,\n                        default=None,\n                        help='Location of the model.')\n    parser.add_argument('--batch_size', type=int, default=25,\n                        help='Batch size.')\n    parser.add_argument('--pad_length', type=int, default=1536,\n                        help='Length to pad the cell sentence to.')\n    parser.add_argument(\"--pad_token_idx\", type=int, default=0,\n                        help=\"PAD token index\")\n    parser.add_argument(\"--chrom_token_left_idx\", type=int, default=1,\n                        help=\"Chrom token left index\")\n    parser.add_argument(\"--chrom_token_right_idx\", type=int, default=2,\n                        help=\"Chrom token right index\")\n    parser.add_argument(\"--cls_token_idx\", type=int, default=3,\n                        help=\"CLS token index\")\n    parser.add_argument(\"--CHROM_TOKEN_OFFSET\", type=int, default=143574,\n                        help=\"Offset index, tokens after this mark are chromosome identifiers\")\n    parser.add_argument('--sample_size', type=int, default=1024,\n                        help='Number of genes sampled for cell sentence')\n    parser.add_argument('--CXG', type=bool, default=True,\n                        help='Use CXG model.')\n    parser.add_argument('--nlayers', type=int, default=4,\n                        help='Number of transformer layers.')\n    parser.add_argument('--output_dim', type=int, default=1280,\n                        help='Output dimension.')\n    parser.add_argument('--d_hid', type=int, default=5120,\n                        help='Hidden dimension.')\n    parser.add_argument('--token_dim', type=int, default=5120,\n                        help='Token dimension.')\n    parser.add_argument('--multi_gpu', type=bool, default=False,\n                        help='Use multiple GPUs')\n\n    # Misc Arguments\n    parser.add_argument(\"--spec_chrom_csv_path\",\n                        
default=\"./model_files/species_chrom.csv\", type=str,\n                        help=\"CSV Path for species genes to chromosomes and start locations.\")\n    parser.add_argument(\"--token_file\",\n                        default=\"./model_files/all_tokens.torch\", type=str,\n                        help=\"Path for token embeddings.\")\n    parser.add_argument(\"--protein_embeddings_dir\",\n                        default=\"./model_files/protein_embeddings/\", type=str,\n                        help=\"Directory where protein embedding .pt files are stored.\")\n    parser.add_argument(\"--offset_pkl_path\",\n                        default=\"./model_files/species_offsets.pkl\", type=str,\n                        help=\"PKL file which contains offsets for each species.\")\n\n    args = parser.parse_args()\n    accelerator = Accelerator(project_dir=args.dir)\n    main(args, accelerator)\n"
  },
  {
    "path": "evaluate.py",
    "content": "import os\n\n# os.environ[\"NCCL_DEBUG\"] = \"INFO\"\nos.environ[\"OMP_NUM_THREADS\"] = \"12\"  # export OMP_NUM_THREADS=4\nos.environ[\"OPENBLAS_NUM_THREADS\"] = \"12\"  # export OPENBLAS_NUM_THREADS=4\nos.environ[\"MKL_NUM_THREADS\"] = \"12\"  # export MKL_NUM_THREADS=6\nos.environ[\"VECLIB_MAXIMUM_THREADS\"] = \"12\"  # export VECLIB_MAXIMUM_THREADS=4\nos.environ[\"NUMEXPR_NUM_THREADS\"] = \"12\"\n\nimport warnings\n\nwarnings.filterwarnings(\"ignore\")\n\nimport scanpy as sc\nfrom tqdm.auto import tqdm\nfrom torch import nn, Tensor\n\nfrom model import TransformerModel\nfrom eval_data import MultiDatasetSentences, MultiDatasetSentenceCollator\nfrom utils import figshare_download\n\nfrom torch.utils.data import DataLoader\nfrom data_proc.data_utils import adata_path_to_prot_chrom_starts, \\\n    get_spec_chrom_csv, process_raw_anndata, get_species_to_pe\n\nimport os\nimport pickle\nimport pandas as pd\nimport numpy as np\nimport torch\n\n\nclass AnndataProcessor:\n    def __init__(self, args, accelerator):\n        self.args = args\n        self.accelerator = accelerator\n        self.h5_folder_path = self.args.dir\n        self.npz_folder_path = self.args.dir\n        self.scp = \"\"\n\n        # Check if paths exist, if not, create them\n        self.check_paths()\n\n        # Set up the anndata\n        self.adata_name = self.args.adata_path.split(\"/\")[-1]\n        self.adata_root_path = self.args.adata_path.replace(self.adata_name, \"\")\n        self.name = self.adata_name.replace(\".h5ad\", \"\")\n        self.proc_h5_path = self.h5_folder_path + f\"{self.name}_proc.h5ad\"\n        self.adata = None\n\n        # Set up the row\n        row = pd.Series()\n        row.path = self.adata_name\n        row.covar_col = np.nan\n        row.species = self.args.species\n        self.row = row\n\n        # Set paths once to be used throughout the class\n        self.pe_idx_path = self.args.dir + f\"{self.name}_pe_idx.torch\"\n        
self.chroms_path = self.args.dir + f\"{self.name}_chroms.pkl\"\n        self.starts_path = self.args.dir + f\"{self.name}_starts.pkl\"\n        self.shapes_dict_path = self.args.dir + f\"{self.name}_shapes_dict.pkl\"\n\n    def check_paths(self):\n        \"\"\"\n        Check if the paths exist, if not, create them\n        \"\"\"\n        figshare_download(\"https://figshare.com/ndownloader/files/42706558\",\n                                self.args.spec_chrom_csv_path)\n        figshare_download(\"https://figshare.com/ndownloader/files/42706555\",\n                                self.args.offset_pkl_path)\n        if not os.path.exists(self.args.protein_embeddings_dir):\n            figshare_download(\"https://figshare.com/ndownloader/files/42715213\",\n                'model_files/protein_embeddings.tar.gz')\n        figshare_download(\"https://figshare.com/ndownloader/files/42706585\",\n                                self.args.token_file)\n        if self.args.adata_path is None:\n            print(\"Using sample AnnData: 10k pbmcs dataset\")\n            self.args.adata_path = \"./data/10k_pbmcs_proc.h5ad\"\n            figshare_download(\n                \"https://figshare.com/ndownloader/files/42706966\",\n                self.args.adata_path)\n        if self.args.model_loc is None:\n            print(\"Using sample 4 layer model\")\n            self.args.model_loc = \"./model_files/4layer_model.torch\"\n            figshare_download(\n                \"https://figshare.com/ndownloader/files/42706576\",\n                self.args.model_loc)\n\n\n    def preprocess_anndata(self):\n        if self.accelerator.is_main_process:\n            self.adata, num_cells, num_genes = \\\n                process_raw_anndata(self.row,\n                                    self.h5_folder_path,\n                                    self.npz_folder_path,\n                                    self.scp,\n                                    self.args.skip,\n                    
                self.args.filter,\n                                    root=self.adata_root_path)\n            if (num_cells is not None) and (num_genes is not None):\n                self.save_shapes_dict(self.name, num_cells, num_genes,\n                                       self.shapes_dict_path)\n\n            if self.adata is None:\n                self.adata = sc.read(self.proc_h5_path)\n\n    def save_shapes_dict(self, name, num_cells, num_genes, shapes_dict_path):\n        shapes_dict = {name: (num_cells, num_genes)}\n        with open(shapes_dict_path, \"wb+\") as f:\n            pickle.dump(shapes_dict, f)\n            print(\"Wrote Shapes Dict\")\n\n    def generate_idxs(self):\n        if self.accelerator.is_main_process:\n            if os.path.exists(self.pe_idx_path) and \\\n                    os.path.exists(self.chroms_path) and \\\n                    os.path.exists(self.starts_path):\n                print(\"PE Idx, Chrom and Starts files already created\")\n\n            else:\n                species_to_pe = get_species_to_pe(self.args.protein_embeddings_dir)\n                with open(self.args.offset_pkl_path, \"rb\") as f:\n                    species_to_offsets = pickle.load(f)\n\n                gene_to_chrom_pos = get_spec_chrom_csv(\n                    self.args.spec_chrom_csv_path)\n                dataset_species = self.args.species\n                spec_pe_genes = list(species_to_pe[dataset_species].keys())\n                offset = species_to_offsets[dataset_species]\n                pe_row_idxs, dataset_chroms, dataset_pos = adata_path_to_prot_chrom_starts(\n                    self.adata, dataset_species, spec_pe_genes, gene_to_chrom_pos, offset)\n\n                # Save to the temp dict\n                torch.save({self.name: pe_row_idxs}, self.pe_idx_path)\n                with open(self.chroms_path, \"wb+\") as f:\n                    pickle.dump({self.name: dataset_chroms}, f)\n                with open(self.starts_path, 
\"wb+\") as f:\n                    pickle.dump({self.name: dataset_pos}, f)\n\n    def run_evaluation(self):\n        self.accelerator.wait_for_everyone()\n        with open(self.shapes_dict_path, \"rb\") as f:\n            shapes_dict = pickle.load(f)\n        run_eval(self.adata, self.name, self.pe_idx_path, self.chroms_path,\n                 self.starts_path, shapes_dict, self.accelerator, self.args)\n\n\ndef get_ESM2_embeddings(args):\n    # Load in ESM2 embeddings and special tokens\n    all_pe = torch.load(args.token_file)\n    if all_pe.shape[0] == 143574:\n        torch.manual_seed(23)\n        CHROM_TENSORS = torch.normal(mean=0, std=1, size=(1895, args.token_dim))\n        # 1895 is the total number of chromosome choices, it is hardcoded for now\n        all_pe = torch.vstack(\n            (all_pe, CHROM_TENSORS))  # Add the chrom tensors to the end\n        all_pe.requires_grad = False\n\n    return all_pe\n\n\ndef padding_tensor(sequences):\n    \"\"\"\n    :param sequences: list of tensors\n    :return:\n    \"\"\"\n    num = len(sequences)\n    max_len = max([s.size(0) for s in sequences])\n    out_dims = (num, max_len, 1280)\n\n    out_tensor = sequences[0].data.new(*out_dims).fill_(0)\n    out_dims2 = (num, max_len)\n\n    mask = sequences[0].data.new(*out_dims2).fill_(float('-inf'))\n    for i, tensor in enumerate(sequences):\n        length = tensor.size(0)\n        out_tensor[i, :length] = tensor\n        mask[i, :length] = 1\n    return out_tensor.permute(1, 0, 2), mask\n\n\ndef run_eval(adata, name, pe_idx_path, chroms_path, starts_path, shapes_dict,\n             accelerator, args):\n\n    #### Set up the model ####\n    token_dim = args.token_dim\n    emsize = 1280  # embedding dimension\n    d_hid = args.d_hid  # dimension of the feedforward network model in nn.TransformerEncoder\n    nlayers = args.nlayers  # number of nn.TransformerEncoderLayer in nn.TransformerEncoder\n    nhead = 20  # number of heads in nn.MultiheadAttention\n    
dropout = 0.05  # dropout probability\n    model = TransformerModel(token_dim=token_dim, d_model=emsize, nhead=nhead,\n                             d_hid=d_hid,\n                             nlayers=nlayers, dropout=dropout,\n                             output_dim=args.output_dim)\n    if args.model_loc is None:\n        raise ValueError(\"Must provide a model location\")\n    # Initialize as empty: 145469 rows = 143574 token rows + 1895 chromosome tokens\n    empty_pe = torch.zeros(145469, 5120)\n    empty_pe.requires_grad = False\n    model.pe_embedding = nn.Embedding.from_pretrained(empty_pe)\n    model.load_state_dict(torch.load(args.model_loc, map_location=\"cpu\"),\n                          strict=True)\n    # Load in the real token embeddings\n    all_pe = get_ESM2_embeddings(args)\n    # This will make sure that you don't overwrite the tokens in case you're embedding species from the training data\n    # We avoid doing that just in case the random seeds are different across different versions. \n    if all_pe.shape[0] != 145469: \n        all_pe.requires_grad = False\n        model.pe_embedding = nn.Embedding.from_pretrained(all_pe)\n    print(f\"Loaded model:\\n{args.model_loc}\")\n    model = model.eval()\n    model = accelerator.prepare(model)\n    batch_size = args.batch_size\n\n    #### Run the model ####\n    # Dataloaders\n    dataset = MultiDatasetSentences(sorted_dataset_names=[name],\n                                    shapes_dict=shapes_dict,\n                                    args=args, npzs_dir=args.dir,\n                                    dataset_to_protein_embeddings_path=pe_idx_path,\n                                    datasets_to_chroms_path=chroms_path,\n                                    datasets_to_starts_path=starts_path\n                                    )\n    multi_dataset_sentence_collator = MultiDatasetSentenceCollator(args)\n\n    dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=False,\n                            collate_fn=multi_dataset_sentence_collator,\n      
                      num_workers=0)\n    dataloader = accelerator.prepare(dataloader)\n    pbar = tqdm(dataloader, disable=not accelerator.is_local_main_process)\n    dataset_embeds = []\n    with torch.no_grad():\n        for batch in pbar:\n            batch_sentences, mask, idxs = batch[0], batch[1], batch[2]\n            batch_sentences = batch_sentences.permute(1, 0)\n            if args.multi_gpu:\n                batch_sentences = model.module.pe_embedding(batch_sentences.long())\n            else:\n                batch_sentences = model.pe_embedding(batch_sentences.long())\n            batch_sentences = nn.functional.normalize(batch_sentences,\n                                                      dim=2)  # Normalize token outputs now\n            _, embedding = model.forward(batch_sentences, mask=mask)\n            # Fix for duplicates in last batch\n            accelerator.wait_for_everyone()\n            embeddings = accelerator.gather_for_metrics((embedding))\n            if accelerator.is_main_process:\n                dataset_embeds.append(embeddings.detach().cpu().numpy())\n\n    accelerator.wait_for_everyone()\n    if accelerator.is_main_process:\n        dataset_embeds = np.vstack(dataset_embeds)\n        adata.obsm[\"X_uce\"] = dataset_embeds\n        write_path = args.dir + f\"{name}_uce_adata.h5ad\"\n        adata.write(write_path)\n\n        print(\"*****Wrote Anndata to:*****\")\n        print(write_path)\n"
  },
  {
    "path": "examples/Benchmark Embeddings with scIB.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"6b258384-9a56-4ed0-be6f-db1c94711356\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Large Scale Embedding benchmarks\\n\",\n    \"\\n\",\n    \"This notebook includes an example showing how to run large scale embedding benchmarks using scIB [(single-cell integration benchmark)](https://www.nature.com/articles/s41592-021-01336-8)\\n\",\n    \"\\n\",\n    \"We use the GPU accelerated version implemented here: https://github.com/YosefLab/scib-metrics\\n\",\n    \"\\n\",\n    \"Please follow installation instructions in that repo. \\n\",\n    \"\\n\",\n    \"*Note: installing Faiss can be difficult and may take some time*\\n\",\n    \"\\n\",\n    \"*Running the full benchmarking suite on many cells can take many hours, even on GPUs with large amounts of memory, such as A100s, and with many threads*\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"ca4ba3a1-5c85-4c7b-8564-f8c5689e9345\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Load Imports and define Benchmark Function\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"id\": \"b9d9fd58-915b-492d-9880-48c37e3859a8\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import numpy as np\\n\",\n    \"import scanpy as sc\\n\",\n    \"\\n\",\n    \"from scib_metrics.benchmark import Benchmarker\\n\",\n    \"\\n\",\n    \"import faiss\\n\",\n    \"\\n\",\n    \"from scib_metrics.nearest_neighbors import NeighborsResults\\n\",\n    \"\\n\",\n    \"# Faiss GPU accelerate nearest neighbors methods\\n\",\n    \"def faiss_hnsw_nn(X: np.ndarray, k: int):\\n\",\n    \"    \\\"\\\"\\\"Gpu HNSW nearest neighbor search using faiss.\\n\",\n    \"\\n\",\n    \"    See https://github.com/nmslib/hnswlib/blob/master/ALGO_PARAMS.md\\n\",\n    \"    for index param details.\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"    X = np.ascontiguousarray(X, dtype=np.float32)\\n\",\n    \"    
res = faiss.StandardGpuResources()\\n\",\n    \"    M = 32\\n\",\n    \"    index = faiss.IndexHNSWFlat(X.shape[1], M, faiss.METRIC_L2)\\n\",\n    \"    gpu_index = faiss.index_cpu_to_gpu(res, 0, index)\\n\",\n    \"    gpu_index.add(X)\\n\",\n    \"    distances, indices = gpu_index.search(X, k)\\n\",\n    \"    del index\\n\",\n    \"    del gpu_index\\n\",\n    \"    # distances are squared\\n\",\n    \"    return NeighborsResults(indices=indices, distances=np.sqrt(distances))\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def faiss_brute_force_nn(X: np.ndarray, k: int):\\n\",\n    \"    \\\"\\\"\\\"Gpu brute force nearest neighbor search using faiss.\\\"\\\"\\\"\\n\",\n    \"    X = np.ascontiguousarray(X, dtype=np.float32)\\n\",\n    \"    res = faiss.StandardGpuResources()\\n\",\n    \"    index = faiss.IndexFlatL2(X.shape[1])\\n\",\n    \"    gpu_index = faiss.index_cpu_to_gpu(res, 0, index)\\n\",\n    \"    gpu_index.add(X)\\n\",\n    \"    distances, indices = gpu_index.search(X, k)\\n\",\n    \"    del index\\n\",\n    \"    del gpu_index\\n\",\n    \"    # distances are squared\\n\",\n    \"    return NeighborsResults(indices=indices, distances=np.sqrt(distances))\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"id\": \"4c5fb90f-ffa5-4cb9-bf6a-6afce956fc86\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import warnings\\n\",\n    \"warnings.filterwarnings(\\\"ignore\\\")\\n\",\n    \"from scib_metrics.benchmark import Benchmarker, BioConservation, BatchCorrection\\n\",\n    \"import pandas as pd\\n\",\n    \"\\n\",\n    \"## Benchmarking Function, returns dataframe of scores\\n\",\n    \"def benchmark(ad, label_key=\\\"cell_type\\\", batch_key=\\\"sample_id\\\", obsm_keys=[\\\"X_uce\\\", \\\"X_scGPT\\\", \\\"X_geneformer\\\"]):\\n\",\n    \"    print(f\\\"Running using CT key:\\\", label_key)\\n\",\n    \"    biocons = BioConservation()\\n\",\n    \"    batchcons = 
BatchCorrection(pcr_comparison=False)\\n\",\n    \"    \\n\",\n    \"    bm = Benchmarker(\\n\",\n    \"        ad,\\n\",\n    \"        batch_key=batch_key,\\n\",\n    \"        label_key=label_key,\\n\",\n    \"        embedding_obsm_keys=obsm_keys,\\n\",\n    \"        bio_conservation_metrics=biocons,\\n\",\n    \"        batch_correction_metrics=None,\\n\",\n    \"        n_jobs=48,\\n\",\n    \"    )\\n\",\n    \"    bm.prepare(neighbor_computer=faiss_brute_force_nn)\\n\",\n    \"    bm.benchmark()\\n\",\n    \"    df = bm.get_results(min_max_scale=False)\\n\",\n    \"    return df\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"2f3bb257-21d4-41d5-9726-50b5e7af04b2\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Load in anndata\\n\",\n    \"\\n\",\n    \"For this example, we will benchmark cells from the developing mouse brain.\\n\",\n    \"\\n\",\n    \"You can download an anndata object with precalculated UCE, scGPT and Geneformer embeddings from [here](https://drive.google.com/drive/folders/1f63fh0ykgEhCrkd_EVvIootBw7LYDVI7)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 3,\n   \"id\": \"35392e93-6ffd-4df6-9609-f85ea6aad4ae\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"AnnData object with n_obs × n_vars = 597668 × 18285\\n\",\n       \"    obs: 'n_counts', 'n_genes', 'region', 'age', 'experiment', 'species', 'sex', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'cell_type', 'sex_old', 'abca_class', 'abca_subclass', 'abca_supertype', 'abca_cluster', 'abca_region', 'leiden_old', 'region_dissected', 'biosample_id', 'donor_id', 'species__ontology_label', 'disease', 'disease__ontology_label', 'organ', 'organ__ontology_label', 'library_preparation_protocol', 'library_preparation_protocol__ontology_label', 'cell_type_author', 'cell_type__ontology_label', 'supercluster'\\n\",\n       \"    var: 'n_cells', 'mt', 
'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'feature_name'\\n\",\n       \"    uns: '10x_batch_colors', '_scvi_manager_uuid', '_scvi_uuid', 'age_colors', 'ages_ordered_colors', 'dendrogram_leiden', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'region_colors', 'region_dissected_colors', 'regions_ordered_colors', 'replicate_colors', 'sex_colors', 'umap'\\n\",\n       \"    obsm: 'X_geneformer', 'X_pca', 'X_scGPT', 'X_scVI', 'X_uce', 'X_umap', 'latent_gene_encoding'\\n\",\n       \"    layers: 'counts'\\n\",\n       \"    obsp: 'connectivities', 'distances'\"\n      ]\n     },\n     \"execution_count\": 3,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"ad = sc.read(\\\"developing_mouse_brain.h5ad\\\", cache=True)\\n\",\n    \"ad\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 4,\n   \"id\": \"a4cb1a5e-1672-4ba7-b488-036de0e3ff61\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"cell_type_column = \\\"supercluster\\\"\\n\",\n    \"batch_column = \\\"donor_id\\\"\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 5,\n   \"id\": \"134f4e09-8e68-43fb-9d12-d87a1b5318c1\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"33\"\n      ]\n     },\n     \"execution_count\": 5,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"len(ad.obs[cell_type_column].unique()) # Number of unique cell types\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 6,\n   \"id\": \"ac956e69-9a66-4225-adb8-a01a2d6e23bf\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"25\"\n      ]\n     },\n     \"execution_count\": 6,\n     
\"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"len(ad.obs[batch_column].unique()) # Number of unique batches\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"ee280476-4057-4051-b4f1-eb7ee0055e69\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Running the Benchmark\\n\",\n    \"\\n\",\n    \"Running the benchmark on the full dataset can take a very long time. Instead, we can run on medium sized samples of cells.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 7,\n   \"id\": \"0cae96b8-5be1-4ea5-a919-d16d2205d645\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"sample_size = 100_000 # number of cells\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 8,\n   \"id\": \"189ad01d-83c0-40e6-ab13-d16ed7eb0c88\",\n   \"metadata\": {\n    \"scrolled\": true\n   },\n   \"outputs\": [\n    {\n     \"data\": {\n      \"application/vnd.jupyter.widget-view+json\": {\n       \"model_id\": \"0d430c0038f84d33915a3d9b211d9608\",\n       \"version_major\": 2,\n       \"version_minor\": 0\n      },\n      \"text/plain\": [\n       \"  0%|          | 0/10 [00:00<?, ?it/s]\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Running using CT key: supercluster\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"\\n\",\n      \"Computing neighbors:   0%|                                                                                                                                                                                                              | 0/3 [00:00<?, ?it/s]\\u001b[A\\n\",\n      \"Computing neighbors:  33%|██████████████████████████████████████████████████████████████████                                                              
                                                                      | 1/3 [00:02<00:04,  2.44s/it]\\u001b[A\\n\",\n      \"Computing neighbors:  67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                  | 2/3 [00:03<00:01,  1.61s/it]\\u001b[A\\n\",\n      \"Computing neighbors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.50s/it]\\u001b[A\\n\",\n      \"Embeddings:   0%|\\u001b[32m                                                                                                                                                                                                                       \\u001b[0m| 0/3 [00:00<?, ?it/s]\\u001b[0m\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:57<08:33, 57.09s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                             
                                                                                                   \\u001b[0m| 1/10 [00:57<08:33, 57.09s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [01:08<04:01, 30.17s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [01:08<04:01, 30.17s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [01:51<04:11, 35.98s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [01:51<04:11, 35.98s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [01:52<02:12, 22.15s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 
[01:52<02:12, 22.15s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [02:17<01:56, 23.23s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [02:17<01:56, 23.23s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                        \\u001b[0m| 6/10 [02:17<01:01, 15.40s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [02:17<01:01, 15.40s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [03:39<01:51, 37.27s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [03:39<01:51, 37.27s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [03:40<00:50, 25.49s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [03:40<00:50, 25.49s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  33%|\\u001b[32m████████████████████████████████████████████████████████████████████▋                                                                                                                                         \\u001b[0m| 1/3 [03:42<07:25, 222.58s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                          
                  \\u001b[0m| 1/10 [00:17<02:38, 17.58s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:38, 17.58s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:23<01:28, 11.01s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:24<01:28, 11.01s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:37, 13.94s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:37, 13.94s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [00:41<00:50,  8.48s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<00:50,  8.48s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:46<00:36,  7.39s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:46<00:36,  7.39s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:47<00:29,  7.39s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:39<00:50, 16.92s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:39<00:50, 16.92s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:39<00:24, 12.49s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:39<00:24, 12.49s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  67%|\\u001b[32m█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                                    \\u001b[0m| 2/3 [05:23<02:30, 150.75s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                          
                  \\u001b[0m| 1/10 [00:17<02:38, 17.57s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:38, 17.57s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:23<01:27, 10.95s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:23<01:27, 10.95s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:37, 13.95s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:37, 13.95s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<01:23, 13.95s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:42<00:32,  6.49s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:42<00:32,  6.49s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:43<00:25,  6.49s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:36<00:46, 15.63s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:36<00:46, 15.63s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:37<00:23, 11.92s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:37<00:23, 11.92s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings: 100%|\\u001b[32m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████\\u001b[0m| 3/3 [07:00<00:00, 140.20s/it]\\u001b[0m\\u001b[A\\n\",\n      \"\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"AnnData object with n_obs × n_vars = 100000 × 18285\\n\",\n       \"    obs: 'n_counts', 'n_genes', 'region', 'age', 'experiment', 'species', 'sex', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'cell_type', 'sex_old', 'abca_class', 'abca_subclass', 'abca_supertype', 'abca_cluster', 'abca_region', 'leiden_old', 'region_dissected', 'biosample_id', 'donor_id', 'species__ontology_label', 'disease', 'disease__ontology_label', 'organ', 'organ__ontology_label', 'library_preparation_protocol', 'library_preparation_protocol__ontology_label', 'cell_type_author', 'cell_type__ontology_label', 'supercluster'\\n\",\n       \"    var: 'n_cells', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'feature_name'\\n\",\n       \"    uns: '10x_batch_colors', '_scvi_manager_uuid', '_scvi_uuid', 'age_colors', 'ages_ordered_colors', 'dendrogram_leiden', 'hvg', 
'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'region_colors', 'region_dissected_colors', 'regions_ordered_colors', 'replicate_colors', 'sex_colors', 'umap'\\n\",\n       \"    obsm: 'X_geneformer', 'X_pca', 'X_scGPT', 'X_scVI', 'X_uce', 'X_umap', 'latent_gene_encoding'\\n\",\n       \"    varm: 'PCs'\\n\",\n       \"    layers: 'counts'\\n\",\n       \"    obsp: 'connectivities', 'distances'\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Running using CT key: supercluster\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"\\n\",\n      \"Computing neighbors:   0%|                                                                                                                                                                                                              | 0/3 [00:00<?, ?it/s]\\u001b[A\\n\",\n      \"Computing neighbors:  33%|██████████████████████████████████████████████████████████████████                                                                                                                                    | 1/3 [00:01<00:03,  1.97s/it]\\u001b[A\\n\",\n      \"Computing neighbors:  67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                  | 2/3 [00:03<00:01,  1.43s/it]\\u001b[A\\n\",\n      \"Computing neighbors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.35s/it]\\u001b[A\\n\",\n      \"Embeddings:   0%|\\u001b[32m                                                                                        
                                                                                                                               \\u001b[0m| 0/3 [00:00<?, ?it/s]\\u001b[0m\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:39<05:53, 39.31s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:39<05:53, 39.31s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:49<02:58, 22.36s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:49<02:58, 22.36s/it, Bio conservation: 
silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [01:29<03:31, 30.17s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [01:29<03:31, 30.17s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [01:29<01:49, 18.30s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [01:29<01:49, 18.30s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [01:56<01:47, 21.46s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [01:56<01:47, 21.46s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [01:56<01:25, 21.46s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [02:56<01:17, 25.85s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [02:56<01:17, 25.85s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [02:56<00:38, 19.05s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [02:56<00:38, 19.05s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  33%|\\u001b[32m████████████████████████████████████████████████████████████████████▋                                                                                                                                         \\u001b[0m| 1/3 [02:58<05:56, 178.39s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                             
                                                                                            \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:36, 17.40s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:36, 17.40s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:23<01:26, 10.83s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:23<01:26, 10.83s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:40<01:36, 13.77s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:40<01:36, 13.77s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [00:41<00:50,  8.38s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<00:50,  8.38s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:47<00:37,  7.54s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:47<00:37,  7.54s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                    
                                                  \\u001b[0m| 6/10 [00:47<00:30,  7.54s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:47<00:57, 19.17s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:47<00:57, 19.17s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:48<00:28, 14.15s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:48<00:28, 14.15s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  67%|\\u001b[32m█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                                    \\u001b[0m| 2/3 [04:47<02:17, 137.45s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, 
?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:37, 17.50s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:37, 17.50s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:24<01:28, 11.07s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:24<01:28, 11.07s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                     
                                                                    \\u001b[0m| 3/10 [00:41<01:38, 14.04s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:38, 14.04s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [00:41<00:51,  8.54s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<00:51,  8.54s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:43<00:30,  6.03s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:43<00:30,  6.03s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:43<00:24,  6.03s/it, Batch correction: 
kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:40<00:52, 17.48s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:40<00:52, 17.48s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:40<00:25, 12.89s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:40<00:25, 12.89s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings: 100%|\\u001b[32m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████\\u001b[0m| 3/3 [06:28<00:00, 129.48s/it]\\u001b[0m\\u001b[A\\n\",\n      \"\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"AnnData object with n_obs × n_vars = 
100000 × 18285\\n\",\n       \"    obs: 'n_counts', 'n_genes', 'region', 'age', 'experiment', 'species', 'sex', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'cell_type', 'sex_old', 'abca_class', 'abca_subclass', 'abca_supertype', 'abca_cluster', 'abca_region', 'leiden_old', 'region_dissected', 'biosample_id', 'donor_id', 'species__ontology_label', 'disease', 'disease__ontology_label', 'organ', 'organ__ontology_label', 'library_preparation_protocol', 'library_preparation_protocol__ontology_label', 'cell_type_author', 'cell_type__ontology_label', 'supercluster'\\n\",\n       \"    var: 'n_cells', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'feature_name'\\n\",\n       \"    uns: '10x_batch_colors', '_scvi_manager_uuid', '_scvi_uuid', 'age_colors', 'ages_ordered_colors', 'dendrogram_leiden', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'region_colors', 'region_dissected_colors', 'regions_ordered_colors', 'replicate_colors', 'sex_colors', 'umap'\\n\",\n       \"    obsm: 'X_geneformer', 'X_pca', 'X_scGPT', 'X_scVI', 'X_uce', 'X_umap', 'latent_gene_encoding'\\n\",\n       \"    varm: 'PCs'\\n\",\n       \"    layers: 'counts'\\n\",\n       \"    obsp: 'connectivities', 'distances'\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Running using CT key: supercluster\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"\\n\",\n      \"Computing neighbors:   0%|                                                                                                                                                                                                              | 
0/3 [00:00<?, ?it/s]\\u001b[A\\n\",\n      \"Computing neighbors:  33%|██████████████████████████████████████████████████████████████████                                                                                                                                    | 1/3 [00:01<00:03,  1.93s/it]\\u001b[A\\n\",\n      \"Computing neighbors:  67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                  | 2/3 [00:02<00:01,  1.40s/it]\\u001b[A\\n\",\n      \"Computing neighbors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:03<00:00,  1.32s/it]\\u001b[A\\n\",\n      \"Embeddings:   0%|\\u001b[32m                                                                                                                                                                                                                       \\u001b[0m| 0/3 [00:00<?, ?it/s]\\u001b[0m\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                          
  \\u001b[0m| 1/10 [00:39<05:53, 39.24s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:39<05:53, 39.24s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:50<03:03, 22.88s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:50<03:03, 22.88s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [01:30<03:32, 30.42s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [01:30<03:32, 30.42s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [01:30<01:50, 18.45s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [01:30<01:50, 18.45s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [01:56<01:46, 21.22s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [01:56<01:46, 21.22s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [01:56<01:24, 21.22s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [02:58<01:18, 26.20s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [02:58<01:18, 26.20s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [02:58<00:38, 19.30s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [02:58<00:38, 19.30s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  33%|\\u001b[32m████████████████████████████████████████████████████████████████████▋                                                                                                                                         \\u001b[0m| 1/3 [03:00<06:00, 180.00s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                          
                  \\u001b[0m| 1/10 [00:17<02:36, 17.34s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:36, 17.34s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:23<01:26, 10.82s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:23<01:26, 10.82s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:36, 13.83s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:36, 13.83s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<01:22, 13.83s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:45<00:37,  7.40s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:45<00:37,  7.40s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:46<00:29,  7.40s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:45<00:52, 17.45s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:45<00:52, 17.45s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:45<00:26, 13.29s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:45<00:26, 13.29s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  67%|\\u001b[32m█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                                    \\u001b[0m| 2/3 [04:46<02:16, 136.79s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:38, 17.58s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                            
    \\u001b[0m| 1/10 [00:17<02:38, 17.58s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:23<01:27, 10.88s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:23<01:27, 10.88s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:36, 13.84s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:36, 13.84s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<01:23, 13.84s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:42<00:31,  6.35s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:42<00:31,  6.35s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:42<00:25,  6.35s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:42<00:50, 16.92s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:42<00:50, 16.92s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:42<00:25, 12.89s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:42<00:25, 12.89s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings: 
100%|\\u001b[32m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████\\u001b[0m| 3/3 [06:29<00:00, 129.91s/it]\\u001b[0m\\u001b[A\\n\",\n      \"\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"AnnData object with n_obs × n_vars = 100000 × 18285\\n\",\n       \"    obs: 'n_counts', 'n_genes', 'region', 'age', 'experiment', 'species', 'sex', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'cell_type', 'sex_old', 'abca_class', 'abca_subclass', 'abca_supertype', 'abca_cluster', 'abca_region', 'leiden_old', 'region_dissected', 'biosample_id', 'donor_id', 'species__ontology_label', 'disease', 'disease__ontology_label', 'organ', 'organ__ontology_label', 'library_preparation_protocol', 'library_preparation_protocol__ontology_label', 'cell_type_author', 'cell_type__ontology_label', 'supercluster'\\n\",\n       \"    var: 'n_cells', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'feature_name'\\n\",\n       \"    uns: '10x_batch_colors', '_scvi_manager_uuid', '_scvi_uuid', 'age_colors', 'ages_ordered_colors', 'dendrogram_leiden', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'region_colors', 'region_dissected_colors', 'regions_ordered_colors', 'replicate_colors', 'sex_colors', 'umap'\\n\",\n       \"    obsm: 'X_geneformer', 'X_pca', 'X_scGPT', 'X_scVI', 'X_uce', 'X_umap', 'latent_gene_encoding'\\n\",\n       \"    varm: 
'PCs'\\n\",\n       \"    layers: 'counts'\\n\",\n       \"    obsp: 'connectivities', 'distances'\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Running using CT key: supercluster\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"\\n\",\n      \"Computing neighbors:   0%|                                                                                                                                                                                                              | 0/3 [00:00<?, ?it/s]\\u001b[A\\n\",\n      \"Computing neighbors:  33%|██████████████████████████████████████████████████████████████████                                                                                                                                    | 1/3 [00:01<00:03,  1.97s/it]\\u001b[A\\n\",\n      \"Computing neighbors:  67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                  | 2/3 [00:02<00:01,  1.42s/it]\\u001b[A\\n\",\n      \"Computing neighbors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.34s/it]\\u001b[A\\n\",\n      \"Embeddings:   0%|\\u001b[32m                                                                                                                                                                                                                       \\u001b[0m| 0/3 [00:00<?, ?it/s]\\u001b[0m\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                     
                                                                                                                    \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:39<05:51, 39.11s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:39<05:51, 39.11s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:49<02:58, 22.37s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:49<02:58, 22.37s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [01:31<03:38, 31.18s/it, Bio conservation: 
silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [01:31<03:38, 31.18s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [01:31<03:07, 31.18s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [01:58<01:45, 21.06s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [01:58<01:45, 21.06s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [01:58<01:24, 21.06s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [02:55<01:13, 24.48s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [02:55<01:13, 24.48s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [02:55<00:37, 18.63s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [02:55<00:37, 18.63s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  33%|\\u001b[32m████████████████████████████████████████████████████████████████████▋                                                                                                                                         \\u001b[0m| 1/3 [02:57<05:54, 177.02s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                            
                          \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:37, 17.45s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:37, 17.45s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:24<01:28, 11.09s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:24<01:28, 11.09s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:37, 13.92s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:37, 13.92s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<01:23, 13.92s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:45<00:35,  7.16s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:45<00:35,  7.16s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:45<00:28,  7.16s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:39<00:47, 15.99s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:39<00:47, 15.99s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:39<00:24, 12.20s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:39<00:24, 12.20s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  67%|\\u001b[32m█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                                    \\u001b[0m| 2/3 [04:36<02:11, 131.67s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                          
                  \\u001b[0m| 1/10 [00:17<02:39, 17.76s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:39, 17.76s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:24<01:29, 11.22s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:24<01:29, 11.22s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:38, 14.11s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:38, 14.11s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [00:42<00:51,  8.58s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:42<00:51,  8.58s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:43<00:29,  5.98s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:43<00:29,  5.98s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:43<00:23,  5.98s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:37<00:50, 16.78s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:37<00:50, 16.78s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:38<00:24, 12.40s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:38<00:24, 12.40s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings: 100%|\\u001b[32m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████\\u001b[0m| 3/3 [06:15<00:00, 125.25s/it]\\u001b[0m\\u001b[A\\n\",\n      \"\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"AnnData object with n_obs × n_vars = 100000 × 18285\\n\",\n       \"    obs: 'n_counts', 'n_genes', 'region', 'age', 'experiment', 'species', 'sex', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'cell_type', 'sex_old', 'abca_class', 'abca_subclass', 'abca_supertype', 'abca_cluster', 'abca_region', 'leiden_old', 'region_dissected', 'biosample_id', 'donor_id', 'species__ontology_label', 'disease', 'disease__ontology_label', 'organ', 'organ__ontology_label', 'library_preparation_protocol', 'library_preparation_protocol__ontology_label', 'cell_type_author', 'cell_type__ontology_label', 'supercluster'\\n\",\n       \"    var: 'n_cells', 'mt', 'n_cells_by_counts', 'mean_counts', 
'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'feature_name'\\n\",\n       \"    uns: '10x_batch_colors', '_scvi_manager_uuid', '_scvi_uuid', 'age_colors', 'ages_ordered_colors', 'dendrogram_leiden', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'region_colors', 'region_dissected_colors', 'regions_ordered_colors', 'replicate_colors', 'sex_colors', 'umap'\\n\",\n       \"    obsm: 'X_geneformer', 'X_pca', 'X_scGPT', 'X_scVI', 'X_uce', 'X_umap', 'latent_gene_encoding'\\n\",\n       \"    varm: 'PCs'\\n\",\n       \"    layers: 'counts'\\n\",\n       \"    obsp: 'connectivities', 'distances'\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Running using CT key: supercluster\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"\\n\",\n      \"Computing neighbors:   0%|                                                                                                                                                                                                              | 0/3 [00:00<?, ?it/s]\\u001b[A\\n\",\n      \"Computing neighbors:  33%|██████████████████████████████████████████████████████████████████                                                                                                                                    | 1/3 [00:02<00:04,  2.06s/it]\\u001b[A\\n\",\n      \"Computing neighbors:  67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                  | 2/3 [00:03<00:01,  1.54s/it]\\u001b[A\\n\",\n      \"Computing neighbors: 
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.48s/it]\\u001b[A\\n\",\n      \"Embeddings:   0%|\\u001b[32m                                                                                                                                                                                                                       \\u001b[0m| 0/3 [00:00<?, ?it/s]\\u001b[0m\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:40<06:01, 40.19s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:40<06:01, 40.19s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 
2/10 [00:50<03:02, 22.82s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:50<03:02, 22.82s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [01:30<03:33, 30.47s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [01:30<03:33, 30.47s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [01:30<03:02, 30.47s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [01:55<01:41, 20.24s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [01:55<01:41, 20.24s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [01:55<01:20, 20.24s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [02:54<01:13, 24.39s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [02:54<01:13, 24.39s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [02:54<00:37, 18.54s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [02:54<00:37, 18.54s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  33%|\\u001b[32m████████████████████████████████████████████████████████████████████▋                                                                                                                                         \\u001b[0m| 1/3 [02:55<05:51, 175.81s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                             
                                                                                            \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:37, 17.52s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:37, 17.52s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:23<01:27, 10.89s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:23<01:27, 10.89s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:37, 13.93s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:37, 13.93s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<01:23, 13.93s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:46<00:36,  7.38s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:46<00:36,  7.38s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:46<00:29,  7.38s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍  
                                                  \\u001b[0m| 7/10 [01:42<00:50, 16.73s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:42<00:50, 16.73s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:42<00:25, 12.75s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:42<00:25, 12.75s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  67%|\\u001b[32m█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                                    \\u001b[0m| 2/3 [04:39<02:13, 133.22s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   
0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:38, 17.56s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:38, 17.56s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:24<01:31, 11.40s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:24<01:31, 11.40s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:42<01:39, 14.16s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                        
                                                      \\u001b[0m| 3/10 [00:42<01:39, 14.16s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:42<01:24, 14.16s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:43<00:32,  6.53s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:43<00:32,  6.53s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:43<00:26,  6.53s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:39<00:48, 16.18s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:39<00:48, 16.18s/it, Batch correction: 
graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:40<00:24, 12.34s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:40<00:24, 12.34s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings: 100%|\\u001b[32m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████\\u001b[0m| 3/3 [06:19<00:00, 126.60s/it]\\u001b[0m\\u001b[A\\n\",\n      \"\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"AnnData object with n_obs × n_vars = 100000 × 18285\\n\",\n       \"    obs: 'n_counts', 'n_genes', 'region', 'age', 'experiment', 'species', 'sex', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'cell_type', 'sex_old', 'abca_class', 'abca_subclass', 'abca_supertype', 'abca_cluster', 'abca_region', 'leiden_old', 'region_dissected', 'biosample_id', 'donor_id', 'species__ontology_label', 'disease', 'disease__ontology_label', 'organ', 'organ__ontology_label', 'library_preparation_protocol', 'library_preparation_protocol__ontology_label', 'cell_type_author', 'cell_type__ontology_label', 'supercluster'\\n\",\n       \" 
   var: 'n_cells', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'feature_name'\\n\",\n       \"    uns: '10x_batch_colors', '_scvi_manager_uuid', '_scvi_uuid', 'age_colors', 'ages_ordered_colors', 'dendrogram_leiden', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'region_colors', 'region_dissected_colors', 'regions_ordered_colors', 'replicate_colors', 'sex_colors', 'umap'\\n\",\n       \"    obsm: 'X_geneformer', 'X_pca', 'X_scGPT', 'X_scVI', 'X_uce', 'X_umap', 'latent_gene_encoding'\\n\",\n       \"    varm: 'PCs'\\n\",\n       \"    layers: 'counts'\\n\",\n       \"    obsp: 'connectivities', 'distances'\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Running using CT key: supercluster\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"\\n\",\n      \"Computing neighbors:   0%|                                                                                                                                                                                                              | 0/3 [00:00<?, ?it/s]\\u001b[A\\n\",\n      \"Computing neighbors:  33%|██████████████████████████████████████████████████████████████████                                                                                                                                    | 1/3 [00:02<00:03,  2.00s/it]\\u001b[A\\n\",\n      \"Computing neighbors:  67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                  | 2/3 [00:03<00:01,  1.61s/it]\\u001b[A\\n\",\n      \"Computing neighbors: 
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.61s/it]\\u001b[A\\n\",\n      \"Embeddings:   0%|\\u001b[32m                                                                                                                                                                                                                       \\u001b[0m| 0/3 [00:00<?, ?it/s]\\u001b[0m\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:44<06:38, 44.27s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:44<06:38, 44.27s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 
2/10 [00:57<03:29, 26.22s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:57<03:29, 26.22s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [01:37<03:46, 32.29s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [01:37<03:46, 32.29s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [01:37<03:13, 32.29s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [02:02<01:45, 21.12s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [02:02<01:45, 21.12s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [02:02<01:24, 21.12s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [02:58<01:12, 24.13s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [02:58<01:12, 24.13s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [02:58<00:36, 18.37s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [02:58<00:36, 18.37s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  33%|\\u001b[32m████████████████████████████████████████████████████████████████████▋                                                                                                                                         \\u001b[0m| 1/3 [03:01<06:02, 181.04s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                             
                                                                                            \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:37, 17.51s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:37, 17.51s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:25<01:33, 11.64s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:25<01:33, 11.64s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:42<01:40, 14.30s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:42<01:40, 14.30s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:42<01:25, 14.30s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:46<00:36,  7.33s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:46<00:36,  7.33s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:46<00:29,  7.33s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍  
                                                  \\u001b[0m| 7/10 [01:37<00:46, 15.55s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:37<00:46, 15.55s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:38<00:23, 11.86s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:38<00:23, 11.86s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  67%|\\u001b[32m█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                                    \\u001b[0m| 2/3 [04:40<02:13, 133.05s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   
0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:38, 17.56s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:38, 17.56s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:24<01:32, 11.51s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:24<01:32, 11.51s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:42<01:39, 14.22s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                        
                                                      \\u001b[0m| 3/10 [00:42<01:39, 14.22s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [00:42<00:51,  8.65s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:42<00:51,  8.65s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:43<00:30,  6.01s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:43<00:30,  6.01s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:43<00:24,  6.01s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:37<00:49, 16.52s/it, Batch correction: 
kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:37<00:49, 16.52s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:37<00:24, 12.19s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:37<00:24, 12.19s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings: 100%|\\u001b[32m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████\\u001b[0m| 3/3 [06:19<00:00, 126.64s/it]\\u001b[0m\\u001b[A\\n\",\n      \"\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"AnnData object with n_obs × n_vars = 100000 × 18285\\n\",\n       \"    obs: 'n_counts', 'n_genes', 'region', 'age', 'experiment', 'species', 'sex', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'cell_type', 'sex_old', 'abca_class', 'abca_subclass', 'abca_supertype', 'abca_cluster', 'abca_region', 'leiden_old', 
'region_dissected', 'biosample_id', 'donor_id', 'species__ontology_label', 'disease', 'disease__ontology_label', 'organ', 'organ__ontology_label', 'library_preparation_protocol', 'library_preparation_protocol__ontology_label', 'cell_type_author', 'cell_type__ontology_label', 'supercluster'\\n\",\n       \"    var: 'n_cells', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'feature_name'\\n\",\n       \"    uns: '10x_batch_colors', '_scvi_manager_uuid', '_scvi_uuid', 'age_colors', 'ages_ordered_colors', 'dendrogram_leiden', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'region_colors', 'region_dissected_colors', 'regions_ordered_colors', 'replicate_colors', 'sex_colors', 'umap'\\n\",\n       \"    obsm: 'X_geneformer', 'X_pca', 'X_scGPT', 'X_scVI', 'X_uce', 'X_umap', 'latent_gene_encoding'\\n\",\n       \"    varm: 'PCs'\\n\",\n       \"    layers: 'counts'\\n\",\n       \"    obsp: 'connectivities', 'distances'\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Running using CT key: supercluster\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"\\n\",\n      \"Computing neighbors:   0%|                                                                                                                                                                                                              | 0/3 [00:00<?, ?it/s]\\u001b[A\\n\",\n      \"Computing neighbors:  33%|██████████████████████████████████████████████████████████████████                                                                                                                                    | 1/3 [00:02<00:04,  
2.02s/it]\\u001b[A\\n\",\n      \"Computing neighbors:  67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                  | 2/3 [00:03<00:01,  1.46s/it]\\u001b[A\\n\",\n      \"Computing neighbors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.41s/it]\\u001b[A\\n\",\n      \"Embeddings:   0%|\\u001b[32m                                                                                                                                                                                                                       \\u001b[0m| 0/3 [00:00<?, ?it/s]\\u001b[0m\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:39<05:53, 39.27s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                        
        \\u001b[0m| 1/10 [00:39<05:53, 39.27s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:50<03:03, 22.92s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:50<03:03, 22.92s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [01:31<03:37, 31.06s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [01:31<03:37, 31.06s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [01:31<03:06, 31.06s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [01:56<01:43, 20.69s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      
\"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [01:56<01:43, 20.69s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                        \\u001b[0m| 6/10 [01:57<00:59, 14.80s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [01:57<00:59, 14.80s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [02:58<01:25, 28.39s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [02:58<01:25, 28.39s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [02:58<00:40, 20.12s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [02:58<00:40, 20.12s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  33%|\\u001b[32m████████████████████████████████████████████████████████████████████▋                                                                                                                                         \\u001b[0m| 1/3 [03:02<06:05, 182.92s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:37, 17.52s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                            
    \\u001b[0m| 1/10 [00:17<02:37, 17.52s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:25<01:33, 11.64s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:25<01:33, 11.64s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:42<01:39, 14.28s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:42<01:39, 14.28s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [00:42<00:52,  8.69s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:42<00:52,  8.69s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:46<00:34,  6.96s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:46<00:34,  6.96s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:46<00:27,  6.96s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:45<00:54, 18.31s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:45<00:54, 18.31s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:45<00:27, 13.51s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:45<00:27, 13.51s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  67%|\\u001b[32m█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                                    \\u001b[0m| 2/3 [04:51<02:19, 139.15s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:37, 17.52s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                            
    \\u001b[0m| 1/10 [00:17<02:37, 17.52s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:24<01:31, 11.44s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:24<01:31, 11.44s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:42<01:39, 14.20s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:42<01:39, 14.20s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [00:42<00:51,  8.64s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:42<00:51,  8.64s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:44<00:32,  6.47s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:44<00:32,  6.47s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:45<00:25,  6.47s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:41<00:52, 17.51s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:41<00:52, 17.51s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:41<00:25, 12.93s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:41<00:25, 12.93s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings: 100%|\\u001b[32m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████\\u001b[0m| 3/3 [06:34<00:00, 131.63s/it]\\u001b[0m\\u001b[A\\n\",\n      \"\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"AnnData object with n_obs × n_vars = 100000 × 18285\\n\",\n       \"    obs: 'n_counts', 'n_genes', 'region', 'age', 'experiment', 'species', 'sex', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'cell_type', 'sex_old', 'abca_class', 'abca_subclass', 'abca_supertype', 'abca_cluster', 'abca_region', 'leiden_old', 'region_dissected', 'biosample_id', 'donor_id', 'species__ontology_label', 'disease', 'disease__ontology_label', 'organ', 'organ__ontology_label', 'library_preparation_protocol', 'library_preparation_protocol__ontology_label', 'cell_type_author', 'cell_type__ontology_label', 'supercluster'\\n\",\n       \"    var: 'n_cells', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'feature_name'\\n\",\n       \"    uns: '10x_batch_colors', '_scvi_manager_uuid', '_scvi_uuid', 'age_colors', 'ages_ordered_colors', 'dendrogram_leiden', 'hvg', 
'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'region_colors', 'region_dissected_colors', 'regions_ordered_colors', 'replicate_colors', 'sex_colors', 'umap'\\n\",\n       \"    obsm: 'X_geneformer', 'X_pca', 'X_scGPT', 'X_scVI', 'X_uce', 'X_umap', 'latent_gene_encoding'\\n\",\n       \"    varm: 'PCs'\\n\",\n       \"    layers: 'counts'\\n\",\n       \"    obsp: 'connectivities', 'distances'\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Running using CT key: supercluster\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"\\n\",\n      \"Computing neighbors:   0%|                                                                                                                                                                                                              | 0/3 [00:00<?, ?it/s]\\u001b[A\\n\",\n      \"Computing neighbors:  33%|██████████████████████████████████████████████████████████████████                                                                                                                                    | 1/3 [00:02<00:04,  2.04s/it]\\u001b[A\\n\",\n      \"Computing neighbors:  67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                  | 2/3 [00:03<00:01,  1.50s/it]\\u001b[A\\n\",\n      \"Computing neighbors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.46s/it]\\u001b[A\\n\",\n      \"Embeddings:   0%|\\u001b[32m                                                                                        
                                                                                                                               \\u001b[0m| 0/3 [00:00<?, ?it/s]\\u001b[0m\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:39<05:53, 39.29s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:39<05:53, 39.29s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:52<03:13, 24.22s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:52<03:13, 24.22s/it, Bio conservation: 
silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [01:33<03:40, 31.49s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [01:33<03:40, 31.49s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [01:33<01:55, 19.18s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [01:33<01:55, 19.18s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [02:01<01:51, 22.27s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [02:01<01:51, 22.27s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [02:01<01:29, 22.27s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [03:01<01:18, 26.27s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [03:01<01:18, 26.27s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [03:01<00:38, 19.35s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [03:01<00:38, 19.35s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  33%|\\u001b[32m████████████████████████████████████████████████████████████████████▋                                                                                                                                         \\u001b[0m| 1/3 [03:06<06:12, 186.12s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                             
                                                                                            \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:37, 17.50s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:37, 17.50s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:24<01:30, 11.30s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:24<01:30, 11.30s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:38, 14.12s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:38, 14.12s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:42<01:24, 14.12s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:47<00:38,  7.71s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:47<00:38,  7.71s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:47<00:30,  7.71s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍  
                                                  \\u001b[0m| 7/10 [01:51<00:55, 18.59s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:51<00:55, 18.59s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:51<00:28, 14.17s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:51<00:28, 14.17s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  67%|\\u001b[32m█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                                    \\u001b[0m| 2/3 [04:58<02:22, 142.91s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   
0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:36, 17.43s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:36, 17.43s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:24<01:28, 11.07s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:24<01:28, 11.07s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:37, 13.97s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                        
                                                      \\u001b[0m| 3/10 [00:41<01:37, 13.97s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [00:41<00:51,  8.51s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<00:51,  8.51s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:43<00:30,  6.09s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:43<00:30,  6.09s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:43<00:24,  6.09s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:38<00:50, 16.88s/it, Batch correction: 
kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:38<00:50, 16.88s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:38<00:24, 12.47s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:38<00:24, 12.47s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings: 100%|\\u001b[32m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████\\u001b[0m| 3/3 [06:37<00:00, 132.64s/it]\\u001b[0m\\u001b[A\\n\",\n      \"\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"AnnData object with n_obs × n_vars = 100000 × 18285\\n\",\n       \"    obs: 'n_counts', 'n_genes', 'region', 'age', 'experiment', 'species', 'sex', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'cell_type', 'sex_old', 'abca_class', 'abca_subclass', 'abca_supertype', 'abca_cluster', 'abca_region', 'leiden_old', 
'region_dissected', 'biosample_id', 'donor_id', 'species__ontology_label', 'disease', 'disease__ontology_label', 'organ', 'organ__ontology_label', 'library_preparation_protocol', 'library_preparation_protocol__ontology_label', 'cell_type_author', 'cell_type__ontology_label', 'supercluster'\\n\",\n       \"    var: 'n_cells', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'feature_name'\\n\",\n       \"    uns: '10x_batch_colors', '_scvi_manager_uuid', '_scvi_uuid', 'age_colors', 'ages_ordered_colors', 'dendrogram_leiden', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'region_colors', 'region_dissected_colors', 'regions_ordered_colors', 'replicate_colors', 'sex_colors', 'umap'\\n\",\n       \"    obsm: 'X_geneformer', 'X_pca', 'X_scGPT', 'X_scVI', 'X_uce', 'X_umap', 'latent_gene_encoding'\\n\",\n       \"    varm: 'PCs'\\n\",\n       \"    layers: 'counts'\\n\",\n       \"    obsp: 'connectivities', 'distances'\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Running using CT key: supercluster\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"\\n\",\n      \"Computing neighbors:   0%|                                                                                                                                                                                                              | 0/3 [00:00<?, ?it/s]\\u001b[A\\n\",\n      \"Computing neighbors:  33%|██████████████████████████████████████████████████████████████████                                                                                                                                    | 1/3 [00:01<00:03,  
1.99s/it]\\u001b[A\\n\",\n      \"Computing neighbors:  67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                  | 2/3 [00:03<00:01,  1.49s/it]\\u001b[A\\n\",\n      \"Computing neighbors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.41s/it]\\u001b[A\\n\",\n      \"Embeddings:   0%|\\u001b[32m                                                                                                                                                                                                                       \\u001b[0m| 0/3 [00:00<?, ?it/s]\\u001b[0m\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:39<05:53, 39.25s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                        
        \\u001b[0m| 1/10 [00:39<05:53, 39.25s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:52<03:11, 23.90s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:52<03:11, 23.90s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [01:31<03:37, 31.07s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [01:31<03:37, 31.07s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [01:32<03:06, 31.07s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [01:58<01:45, 21.05s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      
\"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [01:58<01:45, 21.05s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [01:58<01:24, 21.05s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [02:57<01:14, 24.87s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [02:57<01:14, 24.87s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [02:58<00:37, 18.91s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [02:58<00:37, 18.91s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  33%|\\u001b[32m████████████████████████████████████████████████████████████████████▋                                         
                                                                                                \\u001b[0m| 1/3 [02:59<05:59, 179.64s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:38, 17.61s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:38, 17.61s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:24<01:28, 11.10s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      
\"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:24<01:28, 11.10s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:38, 14.02s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:38, 14.02s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<01:24, 14.02s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:46<00:37,  7.40s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:46<00:37,  7.40s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████        
                                                              \\u001b[0m| 6/10 [00:46<00:29,  7.40s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:42<00:50, 16.69s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:42<00:50, 16.69s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:42<00:25, 12.73s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:42<00:25, 12.73s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  67%|\\u001b[32m█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                                    \\u001b[0m| 2/3 [04:43<02:14, 134.93s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, 
?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:38, 17.61s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:38, 17.61s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:25<01:32, 11.62s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:25<01:32, 11.62s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                     
                                                                    \\u001b[0m| 3/10 [00:42<01:40, 14.29s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:42<01:40, 14.29s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [00:42<00:52,  8.79s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:42<00:52,  8.79s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:44<00:30,  6.12s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:44<00:30,  6.12s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                        \\u001b[0m| 6/10 [00:44<00:16,  4.12s/it, Batch correction: 
ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:44<00:16,  4.12s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:42<01:05, 21.70s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:42<01:05, 21.70s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:42<00:29, 14.87s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:42<00:29, 14.87s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings: 100%|\\u001b[32m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████\\u001b[0m| 3/3 [06:27<00:00, 129.16s/it]\\u001b[0m\\u001b[A\\n\",\n      \"\\n\",\n      \"                                                                               
                                                                                                                                                                               \\u001b[A\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"AnnData object with n_obs × n_vars = 100000 × 18285\\n\",\n       \"    obs: 'n_counts', 'n_genes', 'region', 'age', 'experiment', 'species', 'sex', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'cell_type', 'sex_old', 'abca_class', 'abca_subclass', 'abca_supertype', 'abca_cluster', 'abca_region', 'leiden_old', 'region_dissected', 'biosample_id', 'donor_id', 'species__ontology_label', 'disease', 'disease__ontology_label', 'organ', 'organ__ontology_label', 'library_preparation_protocol', 'library_preparation_protocol__ontology_label', 'cell_type_author', 'cell_type__ontology_label', 'supercluster'\\n\",\n       \"    var: 'n_cells', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'feature_name'\\n\",\n       \"    uns: '10x_batch_colors', '_scvi_manager_uuid', '_scvi_uuid', 'age_colors', 'ages_ordered_colors', 'dendrogram_leiden', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'region_colors', 'region_dissected_colors', 'regions_ordered_colors', 'replicate_colors', 'sex_colors', 'umap'\\n\",\n       \"    obsm: 'X_geneformer', 'X_pca', 'X_scGPT', 'X_scVI', 'X_uce', 'X_umap', 'latent_gene_encoding'\\n\",\n       \"    varm: 'PCs'\\n\",\n       \"    layers: 'counts'\\n\",\n       \"    obsp: 'connectivities', 'distances'\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Running using CT key: supercluster\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     
\"output_type\": \"stream\",\n     \"text\": [\n      \"\\n\",\n      \"Computing neighbors:   0%|                                                                                                                                                                                                              | 0/3 [00:00<?, ?it/s]\\u001b[A\\n\",\n      \"Computing neighbors:  33%|██████████████████████████████████████████████████████████████████                                                                                                                                    | 1/3 [00:02<00:04,  2.07s/it]\\u001b[A\\n\",\n      \"Computing neighbors:  67%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                  | 2/3 [00:03<00:01,  1.50s/it]\\u001b[A\\n\",\n      \"Computing neighbors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00,  1.43s/it]\\u001b[A\\n\",\n      \"Embeddings:   0%|\\u001b[32m                                                                                                                                                                                                                       \\u001b[0m| 0/3 [00:00<?, ?it/s]\\u001b[0m\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                
      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:39<05:56, 39.56s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:39<05:56, 39.56s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:53<03:15, 24.41s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:53<03:15, 24.41s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [01:32<03:38, 31.23s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [01:32<03:38, 31.23s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [01:32<03:07, 31.23s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [02:00<01:47, 21.49s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [02:00<01:47, 21.49s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [02:00<01:25, 21.49s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [02:55<01:12, 24.22s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [02:55<01:12, 24.22s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [02:56<00:36, 18.42s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [02:56<00:36, 18.42s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  33%|\\u001b[32m████████████████████████████████████████████████████████████████████▋                                                                                                                                         \\u001b[0m| 1/3 [03:00<06:01, 180.63s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                          
                  \\u001b[0m| 1/10 [00:17<02:37, 17.52s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                                \\u001b[0m| 1/10 [00:17<02:37, 17.52s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:23<01:27, 10.91s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:23<01:27, 10.91s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:37, 13.89s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:37, 13.89s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<01:23, 13.89s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:46<00:37,  7.57s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:46<00:37,  7.57s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:46<00:30,  7.57s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:41<00:49, 16.38s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:41<00:49, 16.38s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:41<00:24, 12.49s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:41<00:24, 12.49s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings:  67%|\\u001b[32m█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                                    \\u001b[0m| 2/3 [04:42<02:14, 134.47s/it]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                                                         \\u001b[0m| 0/10 [00:00<?, ?it/s]\\u001b[0m\\u001b[A\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\\n\",\n      \"Metrics:   0%|\\u001b[34m                                                                                                                                                                                      \\u001b[0m| 0/10 [00:00<?, ?it/s, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m█████████████████▍                                                                                                                                                            \\u001b[0m| 1/10 [00:17<02:37, 17.45s/it, Bio conservation: isolated_labels]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  10%|\\u001b[34m████████████████                                                                                                                                            
    \\u001b[0m| 1/10 [00:17<02:37, 17.45s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m████████████████████████████████                                                                                                                                \\u001b[0m| 2/10 [00:24<01:30, 11.29s/it, Bio conservation: nmi_ari_cluster_labels_kmeans]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  20%|\\u001b[34m██████████████████████████████████▌                                                                                                                                          \\u001b[0m| 2/10 [00:24<01:30, 11.29s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m███████████████████████████████████████████████████▉                                                                                                                         \\u001b[0m| 3/10 [00:41<01:38, 14.08s/it, Bio conservation: silhouette_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  30%|\\u001b[34m██████████████████████████████████████████████████████                                                                                                                              \\u001b[0m| 3/10 [00:41<01:38, 14.08s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m████████████████████████████████████████████████████████████████████████                                                                                                            \\u001b[0m| 4/10 [00:41<00:51,  8.56s/it, Bio conservation: clisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  40%|\\u001b[34m█████████████████████████████████████████████████████████████████████▏                                                                                                       \\u001b[0m| 4/10 [00:41<00:51,  8.56s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████▌                                                                                      \\u001b[0m| 5/10 [00:43<00:29,  5.95s/it, Batch correction: silhouette_batch]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  50%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████                                                                                          \\u001b[0m| 5/10 [00:43<00:29,  5.95s/it, Batch correction: ilisi_knn]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  60%|\\u001b[34m█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                                      \\u001b[0m| 6/10 [00:43<00:23,  5.95s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                                                    \\u001b[0m| 7/10 [01:36<00:49, 16.34s/it, Batch correction: kbet_per_label]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  70%|\\u001b[34m███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                   \\u001b[0m| 7/10 [01:36<00:49, 16.34s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                  \\u001b[0m| 8/10 [01:36<00:24, 12.06s/it, Batch correction: graph_connectivity]\\u001b[0m\\u001b[A\\n\",\n      \"Metrics:  
80%|\\u001b[34m████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                   \\u001b[0m| 8/10 [01:36<00:24, 12.06s/it, Batch correction: pcr_comparison]\\u001b[0m\\u001b[A\\n\",\n      \"Embeddings: 100%|\\u001b[32m██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████\\u001b[0m| 3/3 [06:21<00:00, 127.17s/it]\\u001b[0m\\u001b[A\\n\",\n      \"\\n\",\n      \"                                                                                                                                                                                                                                                              \\u001b[A\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"AnnData object with n_obs × n_vars = 100000 × 18285\\n\",\n       \"    obs: 'n_counts', 'n_genes', 'region', 'age', 'experiment', 'species', 'sex', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'cell_type', 'sex_old', 'abca_class', 'abca_subclass', 'abca_supertype', 'abca_cluster', 'abca_region', 'leiden_old', 'region_dissected', 'biosample_id', 'donor_id', 'species__ontology_label', 'disease', 'disease__ontology_label', 'organ', 'organ__ontology_label', 'library_preparation_protocol', 'library_preparation_protocol__ontology_label', 'cell_type_author', 'cell_type__ontology_label', 'supercluster'\\n\",\n       \"    var: 'n_cells', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'feature_name'\\n\",\n       \"    uns: '10x_batch_colors', '_scvi_manager_uuid', '_scvi_uuid', 'age_colors', 'ages_ordered_colors', 'dendrogram_leiden', 'hvg', 
'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'region_colors', 'region_dissected_colors', 'regions_ordered_colors', 'replicate_colors', 'sex_colors', 'umap'\\n\",\n       \"    obsm: 'X_geneformer', 'X_pca', 'X_scGPT', 'X_scVI', 'X_uce', 'X_umap', 'latent_gene_encoding'\\n\",\n       \"    varm: 'PCs'\\n\",\n       \"    layers: 'counts'\\n\",\n       \"    obsp: 'connectivities', 'distances'\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"from tqdm.auto import tqdm\\n\",\n    \"sample_score_dfs = []\\n\",\n    \"\\n\",\n    \"for i in tqdm(range(10)):\\n\",\n    \"    # benchmark one sample\\n\",\n    \"    # sample is drawn with random state i\\n\",\n    \"    subsample_ad = sc.pp.subsample(ad, copy=True, n_obs=sample_size, random_state=i)\\n\",\n    \"    sample_df = benchmark(subsample_ad, label_key=cell_type_column,  batch_key=batch_column)\\n\",\n    \"    # show the results for this sample\\n\",\n    \"    display(subsample_ad)\\n\",\n    \"    # add it to the results for all samples\\n\",\n    \"    sample_score_dfs.append(sample_df)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"131fee3d-108d-4685-8afb-ec2863775b2f\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Final Scores\\n\",\n    \"\\n\",\n    \"We can aggregate the scores from all the samples, taking the mean value (and standard deviation of the score)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 9,\n   \"id\": \"4bd62bbd-f8ac-453f-b172-76cff96e5fc3\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"grouped_mean = pd.concat([df.drop(\\\"Metric Type\\\").reset_index() for df in sample_score_dfs]).groupby(\\\"Embedding\\\").agg(np.mean)\\n\",\n    \"# Note: we drop the \\\"Metric Type\\\" row since it contains strings which we can't take the mean of\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 10,\n   \"id\": 
\"8f0216e0-7d47-4c8c-a094-c27642b8669b\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"grouped_std = pd.concat([df.drop(\\\"Metric Type\\\").reset_index() for df in sample_score_dfs]).groupby(\\\"Embedding\\\").agg(np.std)\\n\",\n    \"# Note: we drop the \\\"Metric Type\\\" row since it contains strings which we can't take the std of\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 11,\n   \"id\": \"8db86f16-ad9b-486d-935b-c69084866f6c\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>Isolated labels</th>\\n\",\n       \"      <th>KMeans NMI</th>\\n\",\n       \"      <th>KMeans ARI</th>\\n\",\n       \"      <th>Silhouette label</th>\\n\",\n       \"      <th>cLISI</th>\\n\",\n       \"      <th>Silhouette batch</th>\\n\",\n       \"      <th>iLISI</th>\\n\",\n       \"      <th>KBET</th>\\n\",\n       \"      <th>Graph connectivity</th>\\n\",\n       \"      <th>PCR comparison</th>\\n\",\n       \"      <th>Batch correction</th>\\n\",\n       \"      <th>Bio conservation</th>\\n\",\n       \"      <th>Total</th>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>Embedding</th>\\n\",\n       \"      <th></th>\\n\",\n       \"      <th></th>\\n\",\n       \"      <th></th>\\n\",\n       \"      
<th></th>\\n\",\n       \"      <th></th>\\n\",\n       \"      <th></th>\\n\",\n       \"      <th></th>\\n\",\n       \"      <th></th>\\n\",\n       \"      <th></th>\\n\",\n       \"      <th></th>\\n\",\n       \"      <th></th>\\n\",\n       \"      <th></th>\\n\",\n       \"      <th></th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>X_geneformer</th>\\n\",\n       \"      <td>0.532857</td>\\n\",\n       \"      <td>0.321145</td>\\n\",\n       \"      <td>0.118405</td>\\n\",\n       \"      <td>0.479277</td>\\n\",\n       \"      <td>0.982301</td>\\n\",\n       \"      <td>0.868312</td>\\n\",\n       \"      <td>0.165735</td>\\n\",\n       \"      <td>0.497117</td>\\n\",\n       \"      <td>0.709678</td>\\n\",\n       \"      <td>0.0</td>\\n\",\n       \"      <td>0.448169</td>\\n\",\n       \"      <td>0.486797</td>\\n\",\n       \"      <td>0.471346</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>X_scGPT</th>\\n\",\n       \"      <td>0.627445</td>\\n\",\n       \"      <td>0.615529</td>\\n\",\n       \"      <td>0.351586</td>\\n\",\n       \"      <td>0.536652</td>\\n\",\n       \"      <td>0.998366</td>\\n\",\n       \"      <td>0.88578</td>\\n\",\n       \"      <td>0.137406</td>\\n\",\n       \"      <td>0.426221</td>\\n\",\n       \"      <td>0.872698</td>\\n\",\n       \"      <td>0.0</td>\\n\",\n       \"      <td>0.464421</td>\\n\",\n       \"      <td>0.625916</td>\\n\",\n       \"      <td>0.561318</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>X_uce</th>\\n\",\n       \"      <td>0.752708</td>\\n\",\n       \"      <td>0.727828</td>\\n\",\n       \"      <td>0.50454</td>\\n\",\n       \"      <td>0.594331</td>\\n\",\n       \"      <td>0.99963</td>\\n\",\n       \"      <td>0.860244</td>\\n\",\n       \"      <td>0.136796</td>\\n\",\n       \"      <td>0.401463</td>\\n\",\n       \"      
<td>0.832073</td>\\n\",\n       \"      <td>0.0</td>\\n\",\n       \"      <td>0.446115</td>\\n\",\n       \"      <td>0.715807</td>\\n\",\n       \"      <td>0.607931</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"             Isolated labels KMeans NMI KMeans ARI Silhouette label     cLISI  \\\\\\n\",\n       \"Embedding                                                                       \\n\",\n       \"X_geneformer        0.532857   0.321145   0.118405         0.479277  0.982301   \\n\",\n       \"X_scGPT             0.627445   0.615529   0.351586         0.536652  0.998366   \\n\",\n       \"X_uce               0.752708   0.727828    0.50454         0.594331   0.99963   \\n\",\n       \"\\n\",\n       \"             Silhouette batch     iLISI      KBET Graph connectivity  \\\\\\n\",\n       \"Embedding                                                              \\n\",\n       \"X_geneformer         0.868312  0.165735  0.497117           0.709678   \\n\",\n       \"X_scGPT               0.88578  0.137406  0.426221           0.872698   \\n\",\n       \"X_uce                0.860244  0.136796  0.401463           0.832073   \\n\",\n       \"\\n\",\n       \"             PCR comparison Batch correction Bio conservation     Total  \\n\",\n       \"Embedding                                                                \\n\",\n       \"X_geneformer            0.0         0.448169         0.486797  0.471346  \\n\",\n       \"X_scGPT                 0.0         0.464421         0.625916  0.561318  \\n\",\n       \"X_uce                   0.0         0.446115         0.715807  0.607931  \"\n      ]\n     },\n     \"execution_count\": 11,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"grouped_mean\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 12,\n   \"id\": 
\"3a71d4f4-1555-4cea-8f8f-332075e00de6\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"Embedding\\n\",\n       \"X_geneformer    0.486797\\n\",\n       \"X_scGPT         0.625916\\n\",\n       \"X_uce           0.715807\\n\",\n       \"Name: Bio conservation, dtype: object\"\n      ]\n     },\n     \"execution_count\": 12,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"grouped_mean[\\\"Bio conservation\\\"]\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Faiss\",\n   \"language\": \"python\",\n   \"name\": \"faiss_1.8\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.11.9\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 5\n}\n"
  },
  {
    "path": "examples/Label Transfer Using Logistic Classifier.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"3f4f1b19-5369-4e4d-9366-b6f07f88b402\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Transferring Labels Using UCE\\n\",\n    \"\\n\",\n    \"This notebook walks through the example from Figures 4d and 4e of transferring labels from mouse kidney norn cells to a human lung disease dataset.\\n\",\n    \"\\n\",\n    \"To transfer labels, we use scikit-learn's `LogisticRegression` classifier with its default hyperparameters.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"id\": \"5ca49083-fd91-473f-b60a-621b07d52de2\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"## Imports\\n\",\n    \"import scanpy as sc\\n\",\n    \"import numpy as np\\n\",\n    \"import random\\n\",\n    \"from sklearn.linear_model import LogisticRegression\\n\",\n    \"sc._settings.settings._vector_friendly=True\\n\",\n    \"import matplotlib\\n\",\n    \"import matplotlib.pyplot as plt\\n\",\n    \"\\n\",\n    \"## Seed\\n\",\n    \"np.random.seed(0)\\n\",\n    \"random.seed(0)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"72536f9e-b010-44a7-b323-c32f05cf7d98\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Load in anndatas\\n\",\n    \"You can download the AnnData files here: https://drive.google.com/drive/folders/1f63fh0ykgEhCrkd_EVvIootBw7LYDVI7\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"id\": \"8e5d7a7e-2a86-4fce-82fe-17afcf83dec5\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"epo_uce = sc.read(\\\"mouse_kidney_norn.h5ad\\\")\\n\",\n    \"kam_20_uce = sc.read(\\\"human_lung_disease.h5ad\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"dcbf764a-ad99-4622-a780-ecd62e471132\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Train Classifier on Mouse Kidney Cells\\n\",\n    \"\\n\",\n    \"We train a classifier to predict coarsened cell types from the UCE 
embeddings\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 3,\n   \"id\": \"9e56e9aa-bf4a-42d9-b8a6-e76833161083\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"epo_map = {\\n\",\n    \"    \\\"Norn\\\":\\\"Norn\\\",\\n\",\n    \"    \\\"Proximal tubule\\\":\\\"Proximal tubule\\\",\\n\",\n    \"    \\\"Collecting duct principal\\\":\\\"Collecting duct\\\",\\n\",\n    \"    \\\"Distal convoluted tubule\\\":\\\"Distal convoluted tubule\\\",\\n\",\n    \"    \\\"Fibroblasts\\\":\\\"Fibroblast\\\",\\n\",\n    \"    \\\"Endothelial\\\":\\\"Endothelial\\\",\\n\",\n    \"    \\\"Collecting duct transient\\\":\\\"Collecting duct\\\",\\n\",\n    \"    \\\"Other\\\":\\\"misc\\\",\\n\",\n    \"    \\\"Pericyte Ren1+\\\":\\\"Pericyte\\\",\\n\",\n    \"    \\\"Podocytes\\\":\\\"Podocyte\\\",\\n\",\n    \"    \\\"Pericyte3\\\":\\\"Pericyte\\\",\\n\",\n    \"    \\\"Pericyte1\\\":\\\"Pericyte\\\",\\n\",\n    \"    \\\"Pericyte2\\\":\\\"Pericyte\\\",\\n\",\n    \"    \\\"Collecting duct intercalated\\\":\\\"Collecting duct\\\",\\n\",\n    \"    \\\"Loop of henle\\\":\\\"Loop of henle\\\",\\n\",\n    \"    \\\"Proximal tubule2\\\":\\\"Proximal tubule\\\",\\n\",\n    \"    \\\"Macrophages\\\":\\\"Macrophage\\\",\\n\",\n    \"    \\\"Neutrophil\\\":\\\"Granulocyte\\\",\\n\",\n    \"    \\\"T lymphocyte\\\":\\\"T cell\\\",\\n\",\n    \"    \\\"Collecting duct\\\":\\\"Collecting duct\\\",\\n\",\n    \"    \\\"Monocytes\\\":\\\"Monocyte\\\",\\n\",\n    \"    \\n\",\n    \"} # coarse cell type map\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 4,\n   \"id\": \"39b160cd-4462-437f-b89e-7363e20e8ebe\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"epo_uce_no_misc = epo_uce[epo_uce.obs.group != \\\"Other\\\"] # remove misc cells\\n\",\n    \"X = epo_uce_no_misc.obsm[\\\"X_uce\\\"] # input is UCE embeddings\\n\",\n    \"y = [epo_map[ct] for ct in epo_uce_no_misc.obs[\\\"group\\\"].values] # output 
is mapped cell types\\n\",\n    \"clf = LogisticRegression(random_state=0).fit(X, y) # fit classifier\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"ae2fd8b6-e681-4f02-9f9c-71d71d95f925\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Predict norn-like cells using classifier\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 5,\n   \"id\": \"722bcc74-aaca-43f9-b5dd-2427584d7683\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"kam_20_uce.obs[\\\"pred\\\"] = clf.predict(kam_20_uce.obsm[\\\"X_uce\\\"]) # predict cell types for lung disease dataset\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 6,\n   \"id\": \"ae31d2c0-5987-46c1-9be0-d874ae6577b5\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"pred\\n\",\n       \"Proximal tubule    119834\\n\",\n       \"T cell              93556\\n\",\n       \"Granulocyte         52485\\n\",\n       \"Collecting duct     15727\\n\",\n       \"Macrophage          11800\\n\",\n       \"Endothelial          7233\\n\",\n       \"Norn                 6005\\n\",\n       \"Podocyte             4270\\n\",\n       \"Pericyte             1316\\n\",\n       \"Fibroblast            623\\n\",\n       \"Monocyte               56\\n\",\n       \"Loop of henle          23\\n\",\n       \"Name: count, dtype: int64\"\n      ]\n     },\n     \"execution_count\": 6,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"kam_20_uce.obs[\\\"pred\\\"].value_counts()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"23a839f8-88ce-4446-b4a5-961a38a264b5\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Check Differential Expression\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 7,\n   \"id\": \"159cc172-1be5-4ad2-a4b5-2ebd2e52302e\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"# Preprocess 
Count Values\\n\",\n    \"sc.pp.highly_variable_genes(kam_20_uce, n_top_genes=8000, flavor=\\\"seurat_v3\\\", subset=True)\\n\",\n    \"sc.pp.normalize_per_cell(kam_20_uce)\\n\",\n    \"sc.pp.log1p(kam_20_uce)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 8,\n   \"id\": \"57b9fb8a-2699-43f4-b74f-2a646e8610c8\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"# Subset to predicted Norn-like cells\\n\",\n    \"kam20_norn_ad = kam_20_uce[kam_20_uce.obs.pred == \\\"Norn\\\"].copy()\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 9,\n   \"id\": \"591d1aa7-c4ad-4ad9-9bef-2ed91ed19b57\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"all_de_dfs = {}\\n\",\n    \"ngenes = 4\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 10,\n   \"id\": \"bb5bc067-c588-4a34-a8f3-138824531828\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"sc.tl.rank_genes_groups(kam20_norn_ad, groupby=\\\"Disease_Identity\\\", use_raw=False, reference=\\\"Control\\\") # DE, diseases vs control\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 11,\n   \"id\": \"db176232-0dee-4c1a-bda2-38a19a7afcdd\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"de_df = sc.get.rank_genes_groups_df(kam20_norn_ad, group=\\\"COPD\\\") # get COPD vs control results\\n\",\n    \"all_de_dfs[\\\"copd_vs_control\\\"] = de_df[~de_df.index.isin(de_df.iloc[10:-10].index)] # top 10 and bottom 10 genes\\n\",\n    \"copd_control_genes = list(de_df.head(ngenes)[\\\"names\\\"].values)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 12,\n   \"id\": \"c4d25f7a-4056-4e03-bcd1-9ab9cb2705e6\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"de_df = sc.get.rank_genes_groups_df(kam20_norn_ad, group=\\\"IPF\\\") # get IPF vs control results\\n\",\n    \"all_de_dfs[\\\"ipf_vs_control\\\"] = 
de_df[~de_df.index.isin(de_df.iloc[10:-10].index)] # top 10 and bottom 10 genes\\n\",\n    \"ipf_control_genes = list(de_df.head(ngenes)[\\\"names\\\"].values)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 13,\n   \"id\": \"4e72311d-44ee-4ca5-8abc-24e9079b574b\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"sc.tl.rank_genes_groups(kam20_norn_ad, groupby=\\\"Disease_Identity\\\", use_raw=False, reference=\\\"IPF\\\") # DE, all vs IPF\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 14,\n   \"id\": \"69be495e-3b0f-41f3-95d8-47a526f00bbd\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"de_df = sc.get.rank_genes_groups_df(kam20_norn_ad, group=\\\"COPD\\\") # COPD vs IPF\\n\",\n    \"all_de_dfs[\\\"copd_vs_ipf\\\"] = de_df[~de_df.index.isin(de_df.iloc[10:-10].index)] # top 10 and bottom 10 genes\\n\",\n    \"copd_ipf_genes = list(de_df.head(ngenes)[\\\"names\\\"].values)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 15,\n   \"id\": \"db9e2358-7431-49df-a980-ff4869d824d4\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"sc.tl.rank_genes_groups(kam20_norn_ad, groupby=\\\"Disease_Identity\\\", use_raw=False, reference=\\\"COPD\\\") # DE, all vs COPD\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 16,\n   \"id\": \"753b9891-f940-46b7-9fa2-8972b6f67136\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"de_df = sc.get.rank_genes_groups_df(kam20_norn_ad, group=\\\"IPF\\\") # IPF vs COPD\\n\",\n    \"all_de_dfs[\\\"ipf_vs_copd\\\"] = de_df[~de_df.index.isin(de_df.iloc[10:-10].index)] # top 10 and bottom 10 genes\\n\",\n    \"ipf_copd_genes = list(de_df.head(ngenes)[\\\"names\\\"].values)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 17,\n   \"id\": \"647407f4-c9ef-405e-9742-342e31664497\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      
\"text/plain\": [\n       \"['POSTN',\\n\",\n       \" 'COL1A1',\\n\",\n       \" 'COL3A1',\\n\",\n       \" 'SPARC',\\n\",\n       \" 'LUM',\\n\",\n       \" 'MFAP4',\\n\",\n       \" 'PTGDS',\\n\",\n       \" 'PTPRG',\\n\",\n       \" 'GPX3',\\n\",\n       \" 'NAMPT',\\n\",\n       \" 'RPL41',\\n\",\n       \" 'CRISPLD2',\\n\",\n       \" 'SERPINH1',\\n\",\n       \" 'COL1A2']\"\n      ]\n     },\n     \"execution_count\": 17,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"gene_list = ipf_control_genes + copd_control_genes + copd_ipf_genes + ipf_copd_genes\\n\",\n    \"\\n\",\n    \"# De-duplicate the combined gene list, keeping first-seen order\\n\",\n    \"reduced_gene_list = []\\n\",\n    \"for g in gene_list:\\n\",\n    \"    if g in reduced_gene_list:\\n\",\n    \"        continue\\n\",\n    \"    else:\\n\",\n    \"        reduced_gene_list.append(g)\\n\",\n    \"reduced_gene_list\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"c2abdd88-2f7b-4ac4-961a-ea417f85bb04\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Plot Results\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 18,\n   \"id\": \"f11581f2-2895-419e-85da-0f53c07756ec\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"image/png\": 
\"iVBORw0KGgoAAAANSUhEUgAAAdsAAAHICAYAAAAGOEABAAAAP3RFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMS5wb3N0MSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8kixA/AAAACXBIWXMAAA9hAAAPYQGoP6dpAACfUUlEQVR4nOzdd1hT1xsH8G8GG0LYAQRxIe69cG9xrzqqKO7VWu22tY46sEtrW6tWrQtx1rpxVREUt+JEUZEhyCZAGJn39wc/o5GZkJugvJ/nuc8jufecvBfDfXPPPYPDMAwDQgghhLCGa+wACCGEkPcdJVtCCCGEZZRsCSGEEJZRsiWEEEJYRsmWEEIIYRklW0IIIYRllGwJIYQQllGyJYQQQlhGyZYQQghhGSVbQgghhGWUbAkhhBCWUbIlhBBCWEbJlhBCCGEZJVtCCCGEZZRsCSGEEJZRsiWEEEJYRsmWEEIIYRklW0IIIYRllGwJIYQQllGyJYQQQlhGyZYQQghhGSVbQgghhGWUbAkhhBCWUbIlhBBCWEbJlhBCCGEZJVtCCCGEZZRsCSGEEJZRsiWEEEJYRsmWEEIIYRklW0IIIYRllGwJIYQQllGyJYQQQlhGyZYQQghhGSVbQgghhGWUbAkhhBCWUbIlhBBCWEbJlhBCCGEZJVtCCCGEZZRsCSGEEJZRsiWEEEJYRsmWEEIIYRklW0IIIYRllGwJIYQQllGyJYQQQlhGyZYQQghhGSVbQgghhGWUbAkhhBCWUbIlhBBCWEbJlhBCCGEZJVtCCCGEZZRsCSGEEJZRsiWEEEJYRsmWEEIIYRklW0IIIYRllGwJIYQQllGyJYQQQlhGyZYQQghhGSVbQgghhGWUbAkhhBCWUbIlhBBCWEbJlhBCCGEZJVtCCCGEZZRsCSGEEJZRsiWEEEJYRsmWEEIIYRklW0IIIYRllGwJIYQQllGyJYQQQlhGyZYQQghhGSVbQgghhGWUbAkhhBCWUbIlhBBCWMY3dgDvIpVKhaSkJNjY2IDD4Rg7HPIOYhgGubm5cHNzA5dL33kJed9RstVBUlISPDw8jB0GeQ8kJCSgRo0axg6DEMIySrY6sLGxAVB0oRQIBEaOhryLcnJy4OHhof4sEULeb5RsdfCq6VggEFCyJZVCjyEIqR7oYREhhBDCMkq2hBBCCMso2RJCCCEso2RLCCGEsIySLSGEEMIySraEEEIIyyjZEkIIISyjZEsIIYSwjJItIYQQwjJKtoQQQgjLKNkSQgghLKNkSwghhLCMki0hhBDCMlr1h1Qrcc+f486F0+DLCsFwubBwrYmu/QaAx+MZOzRCyHuMki2pFlQqFf7Z9AdqyzMxwMMZgAkAIDf3KQ79tAhNBo2Fd6PGxg2SEPLeomZkUi0c2fYX+tgq0NzDWeN1GwtzDG3gjucn9yH5ZZKRoiOEvO+Mlmy9vLxgaWkJa2truLq64pNPPoFCoQAA/Pnnn/Dx8YGFhQW8vLzw/fffQ6lUqsveu3cPPXv2hJ2dHezs7ODr64vr169j5cqVsLa2hrW1NczMzGBiYqL+eebMmQgNDQWHw8GCBQs0YjE3N0dsbKwhT79SFAqF+ndFypeVlQVb8QtYW5iXekzPOi64evKoAaMihFQnRm1GPn36NDp16oQnT56gS5cu8PHxgVgsxrp16xAcHAxfX188ePAA48aNQ1JSEjZs2AAAGDx4MObPn49Tp05BoVDg4sWLMDMzwzfffINvvvkGALBq1So8evQI27ZtU79faGgobGxssGHDBnz++edwcHAwxmlXSuTNm3jx9BE4HA5EXnXQqm07Y4dU5V0+dQy9a4nKPIbD4YCTkWigiAgh1U2VaEauV68eOnfujGvXrmHZsmX4888/0aVLF/D5fDRr1gxBQUHYtGkToqOjkZaWhtjYWEybNg18Ph/m5ubo1asXmjZtWqH3cnZ2xuDBg/HLL79UOD6pVIqcnByNzVheJsRiYO/uGN
CrG1KTXhgtjrIoFAosW7rU2GG8JpeCyy3/o86TS8EwjAECIoRUN1Ui2T5+/Bjh4eFo1aoV5HI5BgwYoLG/efPm8PT0xPnz5+Ho6Ig6depg3LhxOHr0KDIyMrR+v4ULF2L9+vXIzMys0PGBgYGwtbVVbx4eHlq/p75Y2QrxKPopop/GwNzaxmhxlIXP5+Pb774zdhivcSvWgKPi8cHhcFgOhhBSHRk12fr5+UEoFMLPzw8BAQEQCARwdHQscRiGi4sL0tPTweFwcO7cOTg7O+Ojjz6Cs7MzBgwYgOTk5Aq/b7169TBw4MAK390uWLAA2dnZ6i0hIaHC76VvXbr3BGwdobASonuvPkaLozwVuZM0lMYdu+F2Qkq5x6mELgaIhhBSHRn1ihgSEgKxWIyYmBgEBgbCyckJ6enpGp2hXklJSYGjoyMAwNPTExs2bEBcXByioqKQkpKCefPmafXeCxcuxJ9//lmhu1szMzMIBAKNzZh8GjREQxqmUmGeXl54qrSASqUq9ZjbL9Lh07GHAaMihFQnVef2A0CHDh1gYmKC48ePa7weGRmJuLg4dOvWrVgZb29vBAQE4P79+1q9V/369dG/f3+sXr26MiGTd8TgaXOx53EacvILNF5nGAYRz5Mh92mH+vQFhhDCkio1qYVQKMQ333yD2bNnQygUwtfXFw8fPsT48eMxefJk1K9fH1lZWfjtt98wceJE1KxZE0lJSdizZw/atm2r9fstXLgQvr6+NIymGrC0tMSHXyxC2OkQ5MY8BE9WCHC5UAic0HrkNLjXMN5zeELI+69KJVugKAEKhUJMnToV8fHxcHFxwaRJk7Bw4UIAgKmpKZ49e4YuXbogMzMTAoEAfn5++Pnnn7V+rwYNGqBfv37Ys2ePvk+DVEE8Hg/d/QYCGGjsUAgh1QyHobEOWsvJyYGtrS2ys7ON/vyWvJvoM0RI9VKlntkSQggh7yNKtoQQQgjLKNkSQgghLKNkSwghhLCMki0hhBDCsio39IcQtty5fgMR27ZD8ugxFPl54JqYwsTBHh69emDQlCkwNy99CT5CCKkMGvqjAxq28W65+t85hK39DfxrN+BUKCu2X84weOHlAZd+fTFp+fcwMTFhPSb6DBFSvdCdLdGrlJRk3AoLBVQK8C2s0Kl3P1hYWBgtnlPBu3H/uyVwzcgq9RgTDge14l5AvmEzfoyOxvxdO2FpaWnAKAkh7zu6s9WBMe9KGIZBamoqGIaBs7NzlVldJyb6Ee6eOQonWRbaeTqDw+FAKlfgQlw6CoSu6D12EmxsDLsk4KUTIbgyey5cxNkVLqNiGCT264Uvg4NKXH1KX+jOlpDqhZKtDoxxoVQqlQjZGwRF8nPUMC36L0uUccATeaHfqPHg843XSPHwzi2knPsXXWo6lrhfpVLhQHQq/GZ/BaFQaJCYGIbByq49UPPuA63LFjIMHH9ZhaFTp7AQWRFKtoRUL1XjtoiUSaVSIfjXVehmloNB3q5o4eWGFl5uGOjtiu7mEgSvCSxxWUJDKCwsRPSJfaUmWqBobdsP6rvg9NZ1Bovr7MF/4aBDogUAcw4Hz44e03NEhJDqjPVku2nTJjRp0gRWVlbw9PTExIkTERsbCwA4cOAAmjdvDktLS7i5uWHevHnIz89Xlw0ICMDy5cuL1alQKDBy5Eh4eHiAw+Go63tbnz594OLiUmxVnwMHDqB9+/YwNzdHQECAvk6VNacP7sPQmgJYmJkW22duaoIRdexw6oBxFlMIPfYv+tZ2Kvc4DocDH1MpYp5EGyAq4MHBf2FdifLcq9dx98ZNvcVDCKneWE22y5cvx6JFi/DDDz8gIyMDUVFR6NixI86dO4fg4GBMmzYNS5cuhVgsRnh4OG7fvo0RI0agIi3bnTt3xr59+2BmZlbi/pcvX+L8+fOQyWQ4ffq0xj57e3t8/vnnmD17tl7Ok23SF09haV480b5ibmoCedIzA0b0muzFU5iaVKwJu5GrA+6Hn2U5oq
ImZPGtyErV4VQoQ+Rb6yoTQoiuWHvQJxaLsXLlSgQHB6N///7q16dPnw6VSgVPT08sXboUQ4YMAQDUqVMHe/fuRa1atXD27Fn07t279KD5fHzyySdlvv/u3bvRrl07NGvWDEFBQRox9OjRAwDw9OlTZGZmlnsuUqkUUqlU/XNOTk65ZfRFpVLBVJYHoOy7R0tFIaRSaalfPtjCkxUAqPgzR768kL1g/i8/Px+8vLxK16OQVL4OQggBWLyzvXz5MmQyGQYOLL52aHR0NBITE9WJ9hWRSIT27dvj3LlzlX7/oKAgjB49GmPGjMHhw4chkUh0riswMBC2trbqzcPDcAuNczgcMOCUe5ySYVjtPVsqTvmxvVWAlTDexOPxwGgdV3EcHnVpIIToB2tXk4yMDDg6OpbYSzY9PR1AUXJ9m4uLi3q/rqKionDnzh2MHDkSnTp1gr29PQ4ePKhzfQsWLEB2drZ6S0hIqFR82uBwOJBZ2JZ7nNTC1ig9khVmFX8yyjAMlOaVeZJaMebm5mAq2cOXYRjwbaiXMCFEP1hLtg4ODkhPTy/WOenVPgBITk4uti8lJQWOjqX3bK2InTt3okuXLnB1dQWHw8GoUaMQFBSkc31mZmYQCAQamyHZezdBsji31P1pOXmwrdvIgBG9JqjbGDn5FWsavhqfinZ+g1mOqIhDh7aVKp8ksEbXD8foKRpCSHXHWrLt0KEDTExMcLyETib169eHm5sbDh8+rPF6cnIyrly5gu7du+v8vgzDIDg4GNeuXYNIJIJIJMKWLVtw7tw5vHz5Uud6jalL3/64LrdBYmbxZ8XJ4lxcKjBHN79BRoisKLZT8eU/w5bJFXhp7gxnZxcDRAW08x+PzAp23CqJWZdO8KxVS48REUKqM9baHYVCIb799lvMnj0bZmZm6N69O5RKJfbsKRqismrVKsydOxc1a9aEn58fEhISMHnyZHTq1Emjc5RCoUBh4es7Jz6fDz6fD6lUqu61LJVKUVhYCHNzc4SHhyM1NRV3796FtfXrJsu+ffti9+7d+PTTT6FUKiGXy6FQKKBUKlFYWKiut6oaOmkGLl/4D7ce3IZJQdGMSHJzAZwatMCIHr2MFhePx0OXCTNxOGgDBtdzAqeEZ6X5UhkOvSjAmHkLDBZXm86dcbZNK9hHXNW6bDaPh6YfjGQhKkJIdcX6DFJ//fUXfv/9dzx79gwODg7o0aMHvv/+e9SsWRN79+7FypUrER0dDVtbW4waNQqBgYGwsrICUDTOdvv27Rr1TZkyBZs3b4aXlxfi4uI09jEMg+nTp0Mmk2Hbtm0a+zZs2IC//voLt27dwrZt2zBp0iSN/YsXL8aSJUsqdE40+09xYrEY4Yf3QZUciya2JrCxMENyTh6eK8xg5umN3sNGGXxqyeh79/Hvh/5wT0iscJlCANIpEzDnl5/ZCwz0GSKkuqHpGnVAF8rSKZVKPIp6CEm2GE6ubqhdu45R44mMiMDJj+ahRkxsucdKeFzIPhyNj9auYf2LAX2GCKleKNnqgC6U75a4p09xYs1aZJwLhUdSMnhvNXVnmvBR2KYVfEYMw+Apkw0SE32GCKleKNnqgC6U7yaJRILDf25AdlQUFHn54JmawMTBHi0/GInWnToZNBb6DBFSvVCy1QFdKEll0WeIkOqFpsghhBBCWEbJlhBCCGFZ1R1YSoieyWQyHN++HRl370Gelw+eiQlM7e3hO/5DeDcyzgxchJDqgZItee+lJCXh2K+/I/XsWdSIiYPNW72RT2wPwomO7dF41Ej0+uADI0VJCHmfUbIlesMwDK6GnkNmzCNAqQT4Jqjbvgu8GxrvrvH+jRs4PnsuvJ48K1pMvoQZrlwLCoGzoYi5cBEbb9zE9FWBJc6ERQghuqLeyDqgnqTFhR09iKzbl9DWioGzjYX69UfpOXjGE6J2z0Fo3LpyiwNo61lUFP750B81Yyu+SlM+GMinT8H0HwJZjIw+Q4RUN3Rn+465d/s2Xk
TfBwC41W2AZq1aGzki4NjWjWia8QgdRZbF9vk4CuADFW6d2oXrkly06dbTIDExDIP9n36OWlokWgCwBAfZW3fgv3Zt0HP4cJaiI4RUN9Qb+R2RmpKCPb+uhPmDC+jjwKCPAwPrxxex99eVSH6ZZLS4Lp06jkbpj+AuKJ5o39TSyRp5oYeQmBBvkLiunDsHu+u3dSprK1fgwb4Deo6IEFKdsZ5sN23ahCZNmsDKygqenp6YOHEiYmNjAQAHDhxA8+bNYWlpCTc3N8ybNw/5+fnqsgEBAVi+fHmxOmNiYtCmTRvY2dnB3t4eQ4cOLXH5vPr166Nly5bFXl+/fj1atmwJExOTCi8+YExKpRL/7dyADxrXQG2Rvfp1L2d7jGxcA2HBmyGXy40SW9qtS/C0LTvRvtLJ1QY3ThxiN6D/u7VrN4RKpc7llRcvI+bxYz1GRAipzlhNtsuXL8eiRYvwww8/ICMjA1FRUejYsSPOnTuH4OBgTJs2DUuXLi1aMSY8HLdv38aIESNQ3mNkJycn7Nu3D5mZmUhOToaPjw/mzp2rccy1a9eQlJSE+/fvIyoqSmOfq6srlixZghEjRuj9nNlw4dQJ+HmLSt3vV98VF04WXzeYbVF378KHI6nw8RwOB7ykp1AoFCxGVbTkovjS5UrV4ZaXj/CgYD1FRAip7lh7ZisWi7Fy5UoEBwejf//+6tenT58OlUoFT09PLF26FEOGDAEA1KlTB3v37kWtWrVw9uxZjTVt32ZjYwMbGxv1z1wuF8+ePdM4JigoCEOGDEFWVhZ27tyJlStXqvcNHToUAHDixIkKnYtUKoVUKlX/nJNT/mLp+lSQnAAbT+tS91uam0Ea/8KAERWJe3gHve1tyj/wDZ5cGV6+fAkPDw+WogIyMzNhlpVV6Xrk4srXQQghAIt3tpcvX4ZMJsPAgQOL7YuOjkZiYqI60b4iEonQvn17nDt3rkLvIRQKYWFhgZ9//hmffvqp+nWFQoG9e/di9OjRGDNmDIKDg8u9Wy5LYGAgbG1t1RubiaIkHKb85lCOSmWASDQxOjTTmvG4KCgoYCGa16RSKbjyyt89q2TGaZonhLx/WEu2GRkZcHR0BJ9f/OY5PT0dQFFyfZuLi4t6f3nEYjGysrLwww8/wNvbW/366dOnIZPJ0LdvXwwdOhQpKSkIDw/X8UyABQsWIDs7W70lJGjXw7Wy5CYW5X5ZkJuYGyia10xtbFGoZVJLkarg7OzMUkRFhEIhZJYVe45cFr6VlR6iIYQQFpOtg4MD0tPTS3w+5+DgAABITk4uti8lJQWOjo4Vfh+BQIAJEyZgyJAhUP3/7i4oKAhDhw6FqakpbGxs0L9/fwQFBel4JoCZmRkEAoHGZkite/TDtZjEUvfffJ6EZp17GTCiIp369seFlPzyD3xDlkAEoVDITkD/Z2trC46Pd/kHliEfDEStWugpIkJIdcdasu3QoQNMTExw/Hjxjjv169eHm5sbDh8+rPF6cnIyrly5gu7du2v1XgqFAsnJyZBIJEVrlh4+jH/++QcikQgikQinT5/GgQMHNJ67vkvca9RAoVsDPE4qfsf/NCUDOc514VW7tsHjMjMzg8y1ToWb6DPzC2HflP2JLTgcDmr07Q1lJR4dpPjUQ98xY/QYFSGkOmMt2QqFQnz77beYPXs2Tp48CalUivz8fPz999/Ytm0bVq1ahcWLF+Pw4cOQyWR49uwZRo8ejU6dOml0jlIoFCgsLFRvCoUCoaGhuHXrFpRKJbKysvDZZ5+hVatWEAgEOHjwIOzs7PD48WNERkYiMjISjx49Ap/PVyf+V3UqlUqNf1dlPQYNg9THF8eei3HyYTxOPozDsediSOq0Ra8hI40WV9cxE3A4qfy7W5lCiVOF1ujUx88AUQFDZs9CvKuLTmUZhoGody/weDw9R0UIqa5Yn67xr7/+wu+//45nz57BwcEBPXr0wPfff4+aNWti7969WLlyJaKjo2Fra4tRo0YhMD
AQVv9/VhYQEIDt27dr1DdlyhQMGTIEX375JRISEmBpaYmuXbvi559/Rs2aNdGnTx/4+voWGz/79ddfIzo6GgcPHsSSJUuwdOlSjf1bt25FQEBAhc6JptrTlJyUiPMbfkFfBy5sLcyK7Y/NkuAKxx6j5n8DExMTg8W1+8efkP/jalgrtes8FtPQBzOOHIT9/x93sIE+Q4RULzQ3sg7oQlmcUqlE+MljyH54C/zsVHCUSqj4plA4uqOWbw80a9POKHFt/OIr8LbugE0FE25MnVoYuf1v1GvUkNW46DNESPVCyVYHdKEsH8MwVWblnD0//4LnW3fAM/El+KXElGbKR177thi3dg1qeHmxHhN9hgipXijZ6oAulO+e/Px8HN34F+JDTkIe/RRmBQVQ8E0gtxPCsUtHtJ/gj+bt2xssHvoMEVK9ULLVAV0o320FBQUQi8UwNzeHra0tuFzDr8dBnyFCqhdaYo9UOxYWFrCwsCj/QEII0RNaYo8QQghhGd3Zkmrl3vXreHbxIlT5+eCamsDE0RE9xoylO11CCKso2ZL3nkwmw5kdO5Bx+iTc7txFA9XrCUxkDIP9m/6CSecuaDMxAHUbsjvkhxBSPVEHKR0Yq3OLSqVCxH+nIUl9CYZhYO3sio69+hqlg09JUlNScOXYPzDJE4NRKMDwTMATeaDbkJEwNzf8QgkAkJyQgCNzZqPdg/swL+f39MjcHLzZczBw1mzW46IOUoRUL5RsdWDoC6VcLkfIjs1g4qLQUQDY/X+WJnG+FBdzGHBq+qCv/1SYmRWfvckQcnNzcXLTb3DLT0U7V6HG+FqpXIGw5FwoPBtiQMB0g469TUtOxpHx49ApLrbCZVK5XGTOnoMhn8xjLS6Aki0h1Q0lWx0Y8kJZUFCAA4HfYqQDYMovea5euVKJA2nAkC+XwsZGu8XcKys7W4wTq7/HSA8bcLmlJ9KcAilCCq0w9tNvDZJwGYbBX6M+QNfbt7R+vwRTM1j/+BPaDxjAUnSUbAmpbqpG+yMp1aE1KzDaiVNqogUAEx4Po525OPLrCgNGVuTEn7/gA8+yEy0ACCzM4GeRh5Cgvw0S15UzZ9AwMlKnxO4hk+LJnt0sREUIqa4o2VZh92/fRGtVBngVeCbL5XLgy8vB7SuXDRBZkah7d9GMJ6lwQhOYm0EZc7/ENY71LeafA3CG7o029jdv4OnDh3qMiBBSnRk92YaFhaF9+/awtbVVrwr0/PlzLFmyBCYmJrC2toZQKESPHj3w8I2LX0REBDgcDn777TeN+kJDQ8HlcmFtbQ0bGxs0adIER44c0Tjm33//Rdu2bWFlZQU3NzcMHz4cd+/eNcj5auNZ+BnUtrOu8PGetlaIv3KOxYg0RYefQX1HW63KdHW2QOixQ+wE9H/p6ekwreSXjrpyOW7u3KmniAgh1Z1Rk212drZ6ubysrCzExcXh448/Vq8jOnHiREgkErx8+RLu7u6YNGmSumxQUBDs7OwQFBRUrN7atWtDIpEgOzsbH330EcaOHQuxWAwA2LlzJwICAvDJJ58gNTUVsbGxGDNmDEJCQgxyztrgpSVqXyZd+zK64mWlaF3GyswUspdxLETz2pM7d+CZm1v5ipKTK18HIYTAyMk2OjoaZmZmGD58uPpudNiwYfD09NQ4zsLCAmPHjsWDBw8AFPXO3bdvH3777TfcunUL0dHRJdbP5XLh7++P/Px8REdHQ6VS4euvv8bSpUsxbtw4WFlZwdTUFKNGjcJXX31VapxSqRQ5OTkamyFwlHLtyyi0L6MzHeKrVLkKyk5Lg5UehkMpC/L1EA0hhBg52Xp7e0Mmk2Hq1Kk4c+ZMqUksLy8PwcHBaN68OQAgJCQEPB4PY8eORdeuXUu8uwWK1ljdunUr+Hw+atasicePHyMpKQlDhw7VKs7AwEDY2tqqNw8PD63K64ynw5wjupTREYen40LwuparIGs7OxSotFswviRcM+OMDSaEvH+MmmxtbW0RFhYGqV
QKf39/ODk5Yfz48cj9fxPgzp07IRQKUbt2bYjFYmzbtg1AURPyiBEjwOPxMGbMGOzatUuj3ufPn0MoFMLCwgLz58/Htm3b4OLigoyMDACASCTSKs4FCxYgOztbvSUkJFT+5CtAYaddnLqW0ZVC4KB1GalcAZ6DKwvRvFa3aVMkWFhWviInp8rXQQghqAIdpBo3boydO3ciOTkZERERiIiIwIoVRUNY/P39IRaLkZKSgmPHjqFu3brIycnB0aNHMXr0aADAiBEjkJCQgIiICHWdtWrVglgshlgsxujRoxEWFgYAcHAoSg7JWj6LMzMzg0Ag0NgMwb1dFyTmVLwpM0VSAJdWnViMSFPN9l0Rm6nds9ELyRJ0GzKCpYiKiFxdkd+uXaXqiOPx0HjMGD1FRAip7oyebN/UqlUrDB8+HPfv3y/1mAMHDqCwsBCjRo2CSCRCw4YNoVKpSmxKtrS0xB9//IEDBw7g9u3bqF+/Ptzc3HD48GE2T0NvWvl2xkWpBSoy7wjDMDifZ4p23XoYILIizdu0x/VC0wofXyhXQOZW1yAzXXkMHoKsSszXkty0ORq3bq3HiAgh1ZlRk+2jR4+wZs0aJCUlASjqMHX06FG0bdu21DJBQUGYN28e7ty5g8jISERGRmLr1q3Yt28f5PLiHW9sbW0xbdo0BAYGgsvlYtWqVVi8eDF2796NvLw8yOVyHDx4ED/++CNr56krDocDv7nfYP9LKVSq0hMHwzA4mCxD34+/Nuh0iADQbcrHOBqfXe5xUrkCB9MZDJg00wBRAZ0HD8adBrotKpDC48FjBLt334SQ6sWoydbGxgYRERFo1aoVrKys0KtXLwwYMABff/11ice/ePEC4eHhmDt3LkQikXobO3YszM3NSx2+8/HHH+Po0aOIjo6Gv78/tm7dijVr1sDZ2Rk1a9ZEUFAQ/Pz82DxVndnZ22Pg1ytxjCPC6aRcSOWvJ4SQKZQ4k5iDI4wz+n25HI5OzgaPT+TqBt9ZX+FAigKP0oonXYZhEPEiE8dlthj71VLw+YbpwMXlctFv9RpcF2n3fDgbQOKYsej+/8cUhBCiDzQ3sg6MNa9tQUEBwo/9C2VuNhhGBa61LboMGg5LSz10BtKDxw/u41HYaZhKsqBUyMExMYXS3hUdBo6Ak7PhvwgAwLP79xE6by58Y2PBLeeu/wWfj/Rx/hj1LfvzN9PcyIRUL5RsdUAXyndLdnY2zm5Yj5zz59Hg6RPYvZFIGYbBPWtryDr4ot7ID9C2Z0+DxESfIUKqF0q2OqAL5buJYRiEHT6MjDuRUOblg2tqAo7QDp39/eHk4mLQWOgzREj1QslWB3ShJJVFnyFCqpcqNfSHEEIIeR8Zbm4/UmkZGRk4sW8/pLkSQKWCmcAG/T4YCSea6ahCCgsLsXfjJqRFP4U0Lw98UxOY2wrRfewHaNqypbHDI4S8x6gZWQeGbgKMOB+Ks9uC8OzMeVi+TAUHRR18GDDId3FCrT7d0HPiOHTq0cPg42zfpFQqEX74X+TFPQcjkwHm5nBv1RYtOnU2WkwAEB0VhcN/bsSTU//B9Gkc+G/9jiQW5nDo0h6thg/BiEkT1atOsYmakQmpXijZ6sBQF0qZTIalk6cjaf8RWMjKXnC90IQH5xEDsGTrZpibG3YC/Yy0NIRt2QjVtUtol/USAt7rpxPxSuCRZz1YdOyGPpOmwsSE3UUI3nZs9x4c+fxbWCanl3usjGFgObgPvg/ewfpwKkq2hFQvlGx1YIgLpVwuxxfDRqHg+FnwULG7VRUYmPXrhp8O/wNT04pPo1gZT+5E4v7SBeiRnVzmXbVUpcIJr8YYumYdbG21W3BeV4d37sKJuV/CIkdS4TIMwwC9OuPHowdZnVaSki0h1Qt1kKqiVs78SKtECwBccFB4MhTLphlmSsS46Gg8XvQFeuaklNt8bcblYmjcAxz+ZAakUinrsUVeu4YTXy3SKtECRVNkMmfDsWrmRyxFRgipjijZVk
HRjx7h+YEjWiXaV3jgIP6f43h49y4LkWmK+GEpuuRlVPh4DoeDgfGPcfynQBajKnJ8w2ZYpJTfdFwSLoeD+CMhiI2J0XNUhJDqipJtFXR4/V+wzMnTubxVXgGObtyix4iKe3DjOnxiorQux+dywL1+CTKZjIWoimRkZCDm9LlK1WGdlYN//tigp4gIIdXdO5lsvby8cPHiRfXPoaGhqFu3bpnHBQQEgMPh4Pz58xrH9Ph/D97Y2FhWY64oqVSKqJAzla7n8cmzyM+v+Fq4Wtd/cC/q6Nhpt31OGs7vCdZvQG/Y9/ufsE5KrVQdHA4H0afOsvqlgBBSfbyTyVZX9erVQ3Dw64t8YmIinj9/brDORBVx+8YNqJ48r3Q9nJh4XAkP10NEpdT/+IHOZS15XEjv3NRjNJpSoh7pZQiUPOoJnkRH6yEiQkh1V62S7bBhw3D8+HF1B53g4GCMGTOm3AuzVCpFTk6OxsaWlBeJMNPhWe3bTAGkJ72sfEClyde9mRsAmEqWL4tUop+6zcBBcnyCXuoihFRv1SrZ2tjYoFOnTjhx4gQAYNeuXRg/fny55QIDA2Fra6vePDw8WIvR1MwUKj3UowJgYsbiHTu3ch8dDosTR3D1VLcKDMwsDDtmmRDyfqpWyRYAxo0bh127duHBg6Jm0EaNGpVbZsGCBcjOzlZvCQns3e3UqF0bhbzK/7cUcjhwr1VLDxGVjBEIK1VeZcPeWFszGxu91CM14cPdy0svdRFCqrf3Itny+XzI5fJir8vl8mIzFvn5+eHy5ctYt24dxo0bV6H6zczMIBAINDa2NG7SBII2zStdj2WrJmjdrl3lAyoFp3lr6DofykuFCm69++k5otfqd+sMuR7mahG2awUvSraEED14L5Kth4cHUlJSNCZLKCgoQGpqKjw9PTWONTU1xYABA/DXX3/hww8/NHSo5eJwOGgysB9U0D1ZMGDQZEBfcCvZ1FsW3wlTcIOrWxPrvRp10aY7e4u0Dw+YAGUj70rVoWQYNBnYz6hzTRNC3h/vbLKVyWQoLCxEYWEhXFxc0LRpU3z33XfIz89Hfn4+vv32W3To0AGurq7Fyi5duhQXLlyAu7u7ESIv39iPZqPQo3jcFVXg6ozRH8/RY0TFubi6Iq1leyhU2n0pSFExsOs7kNUkZmJigvp9e+p85w0ABV41MGbOLD1GRQipzt7ZZNuzZ09YWFiot48++ghPnjyBl5cXvLy8EBcXh927d5dY1tXVFR07djRwxBVna2uLnp9/DKmZ9pP2S01N0OXTOXBwcGAhMk3Dvl+FwzXqQ1nBpJapZHCr60D09A9gNzAA47/8DIoW5T+PL0mhCR8dZ02BlZWVnqMihFRXtBCBDgw1ifwfi5bg1o9/wFxasYkVpGYmaDx/FuYHLmctprcVFBRg/+dz0fLhTXiW0gmYYRjcAR/p/YZj2OdfG6xpNuruXawdOxHmjyo+7aKUz0O9udPw6U8/sBgZLURASHVDyVYHhrxQBv+5Hud+/wvcR0/BL2X8rQIMlN610HX2NEz45GNW4ynNrfBwPD/yDyzuXEPN/BxY8bjIVqgQ4+gKtPFF63EB8GCxd3RpYp48wZopMyGPuAHzMj7pDMNA4uyAdnNnYPo3X7MeFyVbQqoXSrY6MPSFUiaTYf+Wv3Hr32NIuXEbqhwJOAAYGyuIWjVHy2ED8cHUKawuCVdRubm5iHv2DHlZWbB1cUHtunWNPkMXwzA4efAgLu05gJfnwmGdlaO+u5YC4DRrAB+/Phj98Wy4iEQGiYmSLSHVCyVbHRjzQpmTk4OsrCwwDAN7e3u6UGsp7vlzXA29gMJcCUzMTOFUwx3d+/UDj8VJNkpCyZaQ6oVv7ACIdtge5/u+q1mrFmoaoTmbEFK9vbO9kQkhhJB3BSVbQgghhGXUjPyOkMvlOL9/N6SR18DNlwAAVJbWMG3aGt1HfWj0TkjviqTEF4gMPQuOQgoOjw9rFzf49mR3ti1CCKEOUj
owZOcWhmEQsvF3KCP+QydFNiz5mh15ChRKXOQJwLTrjoFz5hl1esHCwkKEHjsEZXY6oFKC4fJhW7MeOvboZfRkFnnlEuIizsFFkozWLrbq31NmfiEixCpwa3qj15gAWFhYGCQe6iBFSPVCyVYHhrpQMgyDPYu/Ro8nVyE0Kbu3bI5chTO1WmLM8p8MntgYhsGxoL9hkhaHLrWcYfbG4g8ZOXm4nJwDx6bt0bEXe4sPlOXs/mC4Pr0GH4fSZ4RSqlQ4+FKG3h9/DQdHJ9ZjomRLSPVCbWdV2OE1P6DXk2vlJloAEJhw0Tf2Jg79vNIAkb3GMAx2//ELOptJ0NvbXSPRAoCDwAoDvV3hFH8H/x3+x6CxAUD48cOoGVN2ogUAHpeLkW5mOPX7KhQWFhooOkJIdUHJtorKysqCzbXzEJhU/L/Ims+D3c0LSE9LYzEyTacP7kFfJz5sLMteAaiOsxCOKY/w4E6kYQJD0XNu8ZWzqG1XsTmOORwOhotM8N/+YJYjI4RUN0ZNtl5eXrCyskJeXp76tfz8fNjY2KjXEfXy8oKlpSWsra1hbW0N0Rsz/AQHB4PD4eDIkSMa9W7btg18Ph/W1tYQCARo164dLl68WOz9/fz8wOdXzT5i4UFb4WtSfI3e8nQwVeJS8Db9B1QChmFQEBcNoXXFnnM2dnPE02thLEf1WujRf9HNRbtnsCY8HhSxUZVaMYgQQt5m9Dtbd3d3HDp0SP3z4cOHiy2Ld/r0aUgkEkgkEiQnJ6tfDwoKgp2dHYKCgorV261bN0gkEmRmZqJHjx4YOXKkxgX00KFDyM3N1f8J6YnqwU1wdejsxOFwwNy7yUJExV0JO4/2ImutylhL0g32ey98dh+WptqvnNTKQoGbly+xEBEhpLoyerIdO3Ysdu3apf45KCgI48aNK7dcamoqzpw5g3Xr1uHo0aPIyckp8Tg+nw9/f3+kpKQgIyMDQFGv2YULF2LVqlUVilEqlSInJ0djYxsvJ0v3srli/QVSBnHSCzgJtEu23vZWeBb9mKWINPEK8so/qASuAkukxT/XczSEkOrM6Mm2R48euHfvHtLS0pCWloa7d++iV69e5Zbbs2cPmjZtirFjx8Ld3R0HDhwo8TiZTIbt27fD3d0djo6OAIBVq1ZhzJgxqFGjRoViDAwMhK2trXrz8PCo+AnqSqXSuShHpTBIMygH2r+HKZ8PmUzKQjTFMSql7oUr8fsnhJC3GT3Z8ng8jBw5Env37sXevXsxYsSIYpPC+/n5QSgUQigU4tNPPwVQdAc8evRoAMDo0aOLNSVfuHABQqEQ7u7uuHr1Kv79918AQGxsLPbt24fPP/+8wjEuWLAA2dnZ6i0hIaEyp1whKnPdx3uqzC0NM97W1BwKpXYJLVEsgWsNT5YCeoupbqsgyRRK8Cxp4XhCiP5Uid5B48aNw9y5c8EwDH777Tco37qAh4SEoFOnTuqfo6OjcePGDezbtw8AMGbMGKxatQovXrxQ36127doVZ8+eLfZe8+fPx7Jly2BuXnbv2TeZmZkZfPk6pnYD4OlVncqq6jTQczQl69R3AC5s/AE961eshQAAElRmaGmIlgEASicPMKokrb94XEiWoOukASxFRQipjox+ZwsArVu3RmZmJrKystCmTZtyj391F9u+fXuIRCL07t0bKpUKwcHlD9kIDQ3FnDlzIBKJ0KZNGyiVSohEIjx48KDS56FP3oNG4Gmh9s2gsYUK1Oo3hIWIirO2tobExrHCTdaSgkJYenqzHNVrbQcOx7WXYq3LFbrUMthMUoSQ6qFKJFsAOHjwIA4ePFihY3ft2oWff/4ZkZGR6m3FihUl9kp+2+PHj9VlTpw4AR6Ph8jISNSvX7+yp6BXDVu0wl1RXa3L3XSqjWbtfVmIqGSdh47FscdJ5R6nUCpxJC4HPQYa5osAAIhc3ZAo9IBci6buyykSNOs7iMWoCCHVUZVJtg0bNkTDhg3LPS4iIgKpqamYNm0aRC
KReps1axaePn2Ku3fvllne2dlZXcbJqWhaPpFIVCXH2/ZcsAzHuMIKHx/CEaD710vZC6gETs7OaD1qCg48eAFJQckzL73IzMaBmGx88NEXBl+kfcjsz7A/jYFMUX7CvZUmgWmnQahV13B334SQ6oHmRtaBIee1ffkiAee+/xo9cxJhZ1pyohLLFfjP2g1dvw2E+/8nAzE0hUKBsFMnkPs8Cvz8bPAByDkcKAVO8GjeDi3bdTBKXEDRTFKHN/wK+7RYdHazBZer+Qw3MTsPNwtMULvvMDRrZ5hWAZobmZDqhZKtDgx9oVSpVLh0/Agyws7A9vlDuKik4ICDFK4pxF4+sOvUC50HDzP6yjqvMAwDpVJZ5VoLsrKycPHQfnDTEgCFDBwuDwpLG7i16oRWvp0MumISJVtCqhdKtjow5oVSLBYjNSUZDMPA2UUEOzs7g74/0Q9KtoRUL1Xr1oOU69V4Y0IIIe+OqtHuSAghhLzHKNkSQgghLKNkSwghhLCMntmSaoVhGMTHxyMjMRHmNtbwrFUb1tbarVxECCHaomT7DklKiMfVoJ3gZGUWvWBnhzYfjod7TS+jxvUukEgkOL91MwovXUCN54/hAAaFKhUu2NijsFU71B0xyqAzbxFCqhca+qMDQw/beBETg0u//AD7yOtoJi9UjwdlGAb3+GbIaNYa7T/9Ep716rEey7vo/uUIRC1fiO7ZqeCXMpY2VsXBgy59MXbFDwYZr0xDfwipXijZ6sCQF8qn9+/j7hdz0TkzpczjIuyc0WDVang3b8FqPGWRy+W4EHIMhS9jwVUpoeSbQtSoJVp36GjQCSPeFHXzBhIWzEPbguxyj81XqvCfb2+M/2kN6/FSsiWkeqEOUlVYYWEhLi/4rNxECwC+Wam48e2XyM/PN0BkxT1/+gSHVy9FW2kc/NzM0beGFfqLTCB6cgm7fllulLgYhsHtVUsrlGgBwJLHRafLZ3FuT/mrRxFCiDaMlmy9vLxgaWkJa2truLq64pNPPoG1tbV643A4sLKyUv8cHx8PADh8+DA6dOgAKysruLi4oHPnzti9e7e63m7dusHc3Bw2NjYQCoXo1KkTtm3bpvHeYWFhaN++PWxtbeHg4IAePXrg+fPnhjz9CvlvxzZ0S46v8PFdUxPx39YtLEZUsry8PNw5uB3DGrrDylxz3V83ewFG17PH4b/WGjyuiBPH0C5Ju/9XOy4HWWdCWIqIEFJdGfXO9vTp05BIJAgLC8O+ffvw008/QSKRQCKRwMzMDA8ePFD/7OnpiZ07d2LixImYPXs2kpOT8fLlS/z000/FFonfvHkzcnNzERcXh3nz5uGbb77B119/DQDIzs7GkCFD8OWXXyIrKwtxcXH4+OOPDb4aTUVILpyDuRbPD025HOSHn2cxopJdOPYv+tUTlbqfy+WiqaUSjx8ads3glyFH4cDVvjm4ZvQ9RN2+xUJEhJDqqkr0Rq5Xrx46d+5c5gLuKpUKX331Fb7//nv4+/urX2/fvj3at29fYhlbW1uMHDkSpqamGDlyJD777DPExsbCzMwMw4cPB1C0APqwYcPKjE8qlUIqlap/zsnJ0eb0dJKfnw/LmCdal7ONeYqsrCyDzpmsTE2AiZewzGN8XB1w4tpF1G/YyDBBAeDGxehUrh5HhYjLl9CgRUs9R0QIqa6qxDPbx48fIzw8HM2bNy/zmJcvX2LIEO0XH+/fvz9UKhWuX78Ob29vyGQyTJ06FWfOnKlQ4gwMDIStra168/Dw0DoGbeXk5MBaLtO6nLVChtzcXBYiKh1XJa/QcTyVguVI3vLGFyRtqaQlr81LCCG6MGqy9fPzg1AohJ+fHwICAjB58uRSj83IyABQtND7K23btoVQKISFhQXi4uJKLcvn8+Ho6IisrCzY2toiLCwMUqkU/v7+cHJywvjx48tMUAsWLEB2drZ6S0hI0OFstSMQCCAxMdW6nIRvChsbGxYiKh3Dq1icKr7251MpZuY6F+WaW+
gxEEJIdWfUZBsSEgKxWIyYmBgEBgaWOb7R3t4eAJCcnKx+7dq1axCLxWAYBmWNYFIoFEhPT1c3rTZu3Bg7d+5EcnIyIiIiEBERgRUrVpRa3szMDAKBQGNjm6WlJQpqaz9uNrt2PYMvu8dz9oBcoSzzmEcvM+DdpqOBIiqiqllLp3JPGC5qdzBsrISQ91uVaEauCB8fH4hEIhw5ckTrsiEhIeByuWjTpk2xfa1atcLw4cNx//59fYSpV1bdeqFQparw8TIVA8su3VmMqGRdBw5DyJPkUvcrlSrczefBu0FDA0YFuPoNRrpS+2HkcfWbwMeI45UJIe+fdybZcrlcBAYGYtGiRdi1axdyc3PVz2GVypLvqnJycnDw4EHMnDkTn3zyCZycnPDo0SOsWbMGSUlJAIDo6GgcPXoUbdu2NeTpVEivCRMR6lqzwseHOrujZ0DpTfFssbS0RIsRATj4MBGSAs1nnS8ys7H/WSaGTP/E4HH59h+Aa+61tSqTqQLsevmxFBEhpLp6Z5ItAAQEBGDLli34/fff4eLiAldXV8ybNw87duyAp6en+ripU6fC2toaHh4e+OWXX/D999/jxx9/BADY2NggIiICrVq1gpWVFXr16oUBAwaohwZVJWZmZvAN/AVh9i7lHnvJzhltVv4ES0tLA0RWnFeduhj22RLcsKyFkKRCnErMx4kUBdLrd8aHny40SlwcDgctFyzGNQvbCh2fr1QhwrcXeoz5kOXICCHVDU3XqANDT7WX+Pw5Lv6yCna3b6C5vEBjbuQ7fHNkNW+N9vO/oLmRS3H/ymU8XL4Q3bNSYFLKuNvnKg4edfPD6GVl9x3QF5qukZDqhZKtDox1oUxOTMSVXTuAzAwADCB0QLvx/nCtwf5QpHddXl4eQrf/jfzw83CPeQx7KCFVMUiwtYe0VQd4jxiNJm3bGSweSraEVC+UbHVAF8p3F8MwePHiBTJeJsHc2gYeNWvCysrK4HHQZ4iQ6qVKzCBFiKFwOBx4eHgYZGISQgh55Z3qIEUIIYS8iyjZEkIIISyjZuR3yLPHj3HjyFGoJBIwKgY8G2u0HDgA9RoadrKIsuTk5CDpRQLys7MhcHKGh4cHzMzMyi9oINF37yL68iWo8vPANTGFqYMjuo0oWqyCEELYQh2kdGDIzi0qlQpn9u3Hs38Pg3/5KjzyCzT2v7Awh6x9W9QeOgh9xowxylKBDMPg2rmzSDx1DNYPb8NdKoEFj4tcFYN4G0comrVF0w8+RJ0GDQweG1A0Xee53cEQnzsNz6i7qMd5PStXgUqFq3bOQNuOaD1hEmrWrWuQmKiDFCHVCyVbHRjqQpmTk4O/Jk1BzdCLsOaUvS5rHsPgeecOmL7tb9gKhazF9LaEZ89wccVCtE15DpFJ6U8lHiiApw1aY/iyH2FurvsCAdp6EROD/76Yh24JT2BdzheRO1wT5I2ZiEEfz2M9Lkq2hFQv9My2isrLy8P6D0ajYQUSLQBYcThoFH4ZG0aOMsh6uwDw7MF93F3wMQZnxpWZaAGgER8YGH0d++ZMRn5+vkHiS3j6FFc+no6BSTHlJloAaKaSo+6uzTi4aqUBoiOEVCeUbKuozTNmofGNSPVsURXB4XDQ5PY9bJk+k8XIimSkpeHO8m/QTSaucBkeh4Nh6c9x4Kt5Za7SpA+FhYUI/XIeumeWvkBCSZw5QN3De3E+aCdLkRFCqiNKtlXQ3evXITh3AVwtEu0rHA4HDqHhuHnxIguRvRa+dSP65KdpXY7L4aBTzB1cPXuGhaheO7djG3okxuhU1h1KpP67j/UvBISQ6oOSbRV0bdsOiGRyncs7KZS4tXOXHiPSpFAowNy+otVd95tcTHhIOn1Uz1Fpyr1wDuaVmOO4xYsYXDp+TI8REUKqM6MmWy8vL1haWsLa2hqurq745JNPYG1trd44HA6srKzUP8fHx6Nbt24wNzeHtbU1nJ2d4e/vr35GWdK+3NxcjfcMDw9Hr169YGNjAw
cHB7Ru3Rp//PFHlbmLyc3NhSQ0rNL1FFwIR1ZWlh4iKu7Cwf3okKP9Xe2bHB7dQXJSop4i0nQnIgLez6IqVYcDl4OXx7VfO5kQQkpi9Dvb06dPQyKRICwsDPv27cNPP/0EiUQCiUQCMzMzPHjwQP3zq2X0Nm/eDIlEgjt37iAyMhIrVqxQ1/dq371793D37l0EBgaq950/fx79+vXDwIEDERcXh4yMDGzduhXXrl2DTCYz+LmX5Gb4RdR4mVLpejzTMnDtv//0EFFxBY8ewMakckOMWnHluH32tJ4i0hR/9TI8OZX/8sSLf66HaAghpApNalGvXj107twZDx48qHAZV1dX+Pn5lVjGxcUFffv2RWRkpPq1BQsWYMaMGZg3b576tSZNmmDHjh1lvo9UKoVUKlX/zGZv3+yUFFjo2Dz7JjMOB3mZ7NzZMtKC8g8qB4fDgaqg8vWUqCBPL9WoDNRrmhDy/jP6ne0rjx8/Rnh4OJo3b17hMomJiQgJCSmxzKt9derUAVA0lObatWsYMmSI1rEFBgbC1tZWvbE5ib2JuTlU5R9WLhXDgG9iooeaiuPw9PMdjcNn6buenurlGGGCEELI+8noydbPzw9CoRB+fn4ICAjA5MmTyy0zY8YMCIVCdOjQAb6+vvjmm2809gkEAtSoUQNCoRBLly4FAGRlZYFhGIhEIvWxI0eOhFAohKWlJcLCSn9OumDBAmRnZ6u3hISESpxx2US1a0Fc+Rtb5ICBs1fNyldUAsbKutJ15CmUsHB00kM0JbAWQKWHZ/CMjY0egiGEkCqQbENCQiAWixETE4PAwEBwK9CDdOPGjRCLxYiPj8fGjRthaWmpsS8nJweXLl1CTEwM0tKKOvLY2dmBw+EgOfn1uMsDBw5ALBbD09MTKlXp95NmZmYQCAQaG1tatG2L9CaNKl3Py4Y+aNu1qx4iKq5ev4GIllcumV2yckTnAYP0FJEm31FjcMO08mvUcpu11EM0hBBSBZItW3x9fTFz5kx89dVXAAArKyu0adMGR45U7R6mHA4HLn16QlmJOzMVw8CpV48KfXHRhU+zFnhe00fn8gzDgNOyPfgsNSM7ODoir1W7StVxn2uCVuMD9BMQIaTae2+TLQDMmTMHZ8+eVXegWrlyJTZs2IDff/8dmZmZYBgGDx8+hFgsNm6gb+k/axaeOTvqXD7GwQ5+s2fpMaLiBJ17Ikuh29PlGyo+2oydqOeINNUaNhIJlXj4ndKkpcEWJSCEvP/e62Rrb2+PiRMnYtWqVQCAnj174sSJE/j333/h6ekJR0dHTJgwAYsXL4avr6+Ro31NKBTC86NZyDTV/s4vi8+D66wZcHRi6Xno//UcMw7nG3aArIzm95IkKgHZyElw//8wLra06tIVD3r0R55S+4wbIXRC2/lfsBAVIaS6olV/dGCoFVt2LV8B5s+/4FDB2aQyTPhgpk+G/9IlrMX0Jrlcjt1ff4rej69BwC+/5+4zFQ8v+o9G/5kfGSC6ouUJd376CbpcPgcBt2K9zi7bOsJr0Uo0ZvnLF636Q0j1QslWB4a8UB7btBnRm7ag1rPnMOeU3BBRyDCIrV0TdaZMwuCZM1iN520MwyDkrz9RcPE/tMlMhEMJk108lQPPPL3hPngk2vkNNHh8R1b/BOXpE2gnTi11Cscn4CG2QVO0++Ib1DLAuruUbAmpXijZ6sDQF0q5XI6TO4MQf+w4pJF3YZGfDzBAgaUlTJs1gcfA/ug/cQJMWBpXWxEMw+Di8SNIv3AW/BwxGJkMHHNzKJzdUH/ISDRs2cposQFFE5Oc27kduaFnwXsRD3OZFDIuDwprG3BbtEGTsePh3bSpweKhZEtI9ULJVgfGvFAWFBSoxwzb2dlpDHsiFcMwDPLz82FmZsZaj+jyULIlpHp5rztIvY8sLCzg5uYGd3d3SrQ6erXAhbESLSHh4eFo1qyZQd+TYRj4+/tDKBRi2LBheqkzNjZW4++oW7duCAoK0kvd7x
tKtkSvVCoVxGIxkpKSIJFIqsxqSoSU5c0VyKytrTVmmtNX/RffWGO6c+fOuHPnjl7fozzh4eG4dOkSkpOT8e+//xr0vUkVWoiAvNvu37qFi1u3IS00HPyMTPBkMsgtzMHxqAFRrx4YMHsWXPR8ASNEn06fPo1OnTqVul+hULzTrSHx8fGoXbs2zM3NjR1KtUR3tu+Ql4kvcPTP1Tjx8/c4/tP3OLpuNRLj44waU0JMDH4cNhJn+w+GYOce1ElIRM38AtRQKFErNw9eDx/DbO2f2OrbBb/Nmq2xehIhVVloaCjq1q2LxYsXw9HREYsXL8azZ8/QpUsXCIVCuLm5aczLDgB79+5F48aNYWNjgyZNmuDx48eYOnUq4uPj0adPH1hbW2PXrl3qul958OABOnfuDKFQiFatWuHSpUvqfV5eXvjll1/QoEEDCIVCfPRR6UPnsrKyMHbsWDg6OqJOnTrYuHEjAGDXrl2YOnUqQkNDYW1tjd9++61Y2by8PMyePRtubm6ws7ODv7+/et+BAwfQqFEj2NvbY/DgwUhNTS3393flyhW0aNECAoEA7u7uWLNmTbll3mfv7te0aiTq5nU8PrwbzgmP0c+KA87/l+BjGAY3b5zF9Rr1UW/waDRq096gcT28dRuHp89AzaexZR7H4XDgmSmGMng/fopPwCd7gmFDk/yTd0BsbCx4PB5evnwJhUKBpKQkLFu2DB07dsTz58/Rs2dPtG3bFkOHDsWlS5cwZ84cHD58GB06dEB0dDQEAgE2b96Ms2fPIigoSH3nHBoaqn4PmUyGQYMGYd68eTh37hwOHjyIQYMG4dmzZ7CzswMAHDp0COHh4SgsLETLli0xYsQIdO/evVi8rxJxfHw8nj59ip49e8LHxwfjxo2DXC5HUFAQzp49W+K5zps3Dy9fvsTdu3dha2uLq1evAgCuXbuGefPmISQkBD4+Pvj2228xe/ZsHDhwoMzf3bx58/D5559j3LhxyMrKQmxsrLa//vcKJdsq7urJY2AObsJAMxVgrdkQweFw0NqaB4if4tZfK3E5dQo6DNB+CUFdJMbH49DMOfAqJ9G+icfhwOviFfwRMBlf7N1tlCa5u7duIe7+TfCUcoDLBayE6NZ/MHU2I/Dz8wPv/8sqTp48GYMHD4aZmRm++eYb8Pl8mJiYoE6dOuplO+vVq4dx48bh4sWLGDp0KLZt24YZM2agY8eOAAAfn4rNH3716lWoVCrMnTsXADB69Gj8+uuvOHnyJMaOHQugKHE5OhZN4dqtWzfcuXOnWLJVKpXYv38/Hj9+DEtLSzRt2hRTp05FcHAwupazKIlKpcLOnTtx79499fu8+mLw999/Y/bs2WjSpAkA4LvvvoO9vT0UCkWZdZqYmODp06fIzMyEvb29+otDdUXNyFXY/asR4PzzF1qblT/lYEtzBqaHt+DOxdKXCtSnQytXwSv6qdbluBwORGdDcXjz3yxEVbrMzEzs/u1HWD67iv61hOhb1wl9azughwODsG1rEXryuEHjIVXPqxXIxGIxVq9eDQAQiUQaXwoTExMxbNgwiEQi2Nra4tdff0VGRgYA4MWLF6hVq5bW75uUlFRsjeyaNWsiKSlJ/bOLi4v635aWlpBIJMXqSU9Ph1wuh+cbU6G+XU9p0tLSIJVKS4w/Pj4eK1asgFAohFAohIeHB/h8vsYKaiXZvHkzHjx4gLp166JTp064fPlyuXG8zyjZVmFPDwahpXnFe/M2M2Pw/NAuFiMqkp2djezzoTqXN+Nw8Py44ZJbfn4+Tu/YgA+ae6GWSHOBBxM+H72b1IVrThzCTp8wWEzk3fDqkc0rCxcuhJ2dHaKjo5GdnY158+ape9x7eHiU2lT6dj1vcnNzK7ZGdnx8PNzc3LSK1dHRESYmJoiPj9e6HicnJ5iZmZUYv7u7O5YtW6b+IiIWi1FQUIAaNWqUWWf9+vWxb98+pKamYsyYMeq79OqqSifbLVu2oHnz5rCysoKrqy
v69OmDU6dOAdDsqu/q6opPPvkECoUCOTk5cHd3x8mTJ9X1PHr0CHZ2dnj6tOhObPr06XB1dYVAIECTJk1w9OhRo5xfWaIf3EOdtOdal/PJjMeDWzdYiOi1o3+uh2dyWqXqMLl6A5FXruoporKdO3oQw5rVLvOCV9fVCeLoO+U2jZHqLTc3FzY2NrC2tsb9+/c1xpROnDgRGzduxOXLl8EwDB4/foyXL18CAJydnUtNxO3aFS0H+ccff0ChUGD//v2IiopCv379tIqNx+Nh5MiRWLhwIfLz83H//n1s2bIFY8aMKbcsl8vFhAkT8OmnnyIjIwNyuVzdSWvSpEn4448/1EOVMjMzcfjw4XLr3LVrFzIyMsDn82FjY6Nuoq+uqmyyXbZsGb777jssX74caWlpSEhIwJdffqmRRE+fPg2JRIKwsDDs27cPmzZtgkAgwG+//YZZs2YhPz8fDMNgxowZ+PLLL9W9/z799FPExsYiJycHf//9N8aPH69uCqoqHp88ggZW2j/TrGvFx7Mzx1iI6LW0K1fBLSNxVYSTTI7bh9lfW5hhGCjSk8CvwEIJXeq5I+zMyXKPI9XXokWLcP78eQgEAsydOxcjRoxQ7+vYsSPWrl2LyZMnQyAQ4IMPPkBOTg4A4KuvvsLXX38NoVCI4OBgjTpNTU1x5MgR7N69Gw4ODggMDMSRI0d0esb5KmF7eHhg8ODBWLJkSYkdqUqyevVquLm5oVGjRnBxccFff/0FoGht8J9//hkTJkyAQCBAy5YtNXpLl+bEiROoX78+bGxs8Ntvv2HHjh1an8/7pEpO15iVlQU3Nzfs3bsXgwcPLvEYLy8vjd59o0aNgrOzM/744w8AwODBg1G/fn3Ur18fv/32G27evFni3ME3btxA586dcfXqVTQtZW5cqVSqMWQlJycHHh4erE61d3zpF+ib/kinsqfs6mHA96v1HNFrP3XrCffIe5WuJ3/cGExdV3wIgj7l5OTgzp718G1Qu0LHn36RB7+xAazGBNB0jYRUN1WyN/KVK1egUCgwYMCACh3/+PFjhIeHY9myZerX1q1bh6ZNm4LD4eDEiRPFEu3s2bOxdetWFBYWon///uqediUJDAzE0qVLdTsZXWm5TqwGphJlK1S9fupXKiu2dGBlKBQK8HkVvwvX17kRQsibqmQzckZGBhwdHTXa+EUiEYRCocbsJ35+fhAKhfDz80NAQAAmT56s3ufm5oYaNWrA0dERbdu2LfYef/75JyQSCc6ePYs+ffqU+TxvwYIFRZ2C/r+93ZmBDYy57kNRKlO2IvhWVnqpx8TKWi/1lMXW1hbpBcoKHatUqgBTC5YjIoRUR1Uy2drb2yM9PR1K5euLZHJyMh49eqTRnPuqq35MTAwCAwPBfWOt0rVr10IgEEAgEGDDhg0lvg+Px0PPnj1x9uxZnDhRek9UMzMzdV2vNrbZt/RFaqH2d36ZUgUETYt/udAnQaOGla4jj2Hg3q6NHqIpG4/Hg1LgWP6BAMIexaJT34q1phBCiDaqZLLt0KED+Hx+mQmwLHFxcVi+fDn++usvbNy4Ed99912ZY8IUCoW6p3JV0aFPP1wzrViSeFMEzw6dBpT8nFtfuk+bgpdWlbt7TmvcAL1HjtRTRGVr3b0vQqPKntZSkl8IiaUjPT8lhLCiSiZbOzs7fPXVV5g1axZOnDiBgoICKJVK9fRh5Zk9ezZmz56NRo0aoVWrVvD398e8efMAFI0RDQ4OhkQiUXezP3/+PLp06cLiGWmPw+FA2NUPSdKKNYECQIpUAZtOfTTu8NlQp359cDvqPjWkimHg2rsn63G+UsPDEy5te+DU3adQlfBMNikjCyeeZ2Hw+EkGiYe8uxo1avTeTc7w8ccfY9cu7cfnb9myBZ9//jkLEb2fqmRv5Ff++usvrFu3Dk+ePIFQKETDhg3x2Wefwc/Pr1hv5Ff27t2LhQsX4t69e+rnuxKJBA0aNMCmTZvg6+uLIUOG4Pbt22
AYBnXr1sW3336L4cOHVzguQ/YkPbT2R7R+cB4is7L7sqVKlbjs3REjPv+W1XheuREairDJ0yHKFGtdNtanHmadOAo7e3v9B1aGzMxMXDp1DIw4FXylHAyHA4WFDVx9mqFNh45lPrfXN+qNTKqCly9folOnToiOjgaPx8ONGzcwZswYFBYWYsuWLejbty+Aojmix44di4sXL6r70shkMtSrVw83btyAk5OTMU/jnVClk21VZegL5ZmdWyANP4ku3DxYmmiOFy1QKBGmtAS/Q2/0nTSD9VjeFLJjJ6K/XQyn3OJTx5UmvqYHhu3YivrNSh5mVV1QsiWlMeRSfj/++COSk5PV01P269cPn332Gdzc3DB69Gjcv38fADB8+HDMnz8fnTt31ig/a9YseHt7Y/78+QaJ911WJZuRiabe/lPQd10wrnUfhxOC2jht4oRTfEeECGrjcucx6P1HsMETLQD4TfBHk7U/45mbCMpyvrPlMyo8bdYII3fvrPaJlry73lwEPiAgAHPnzkXPnj1hY2ODPn36IDMzs8RyCoUCH330ERwcHODj44MffvhBPclObGws+Hw+NmzYAHd3dwQEBKCwsBBz5syBSCSCp6cnvv/+e/UjkCVLlmDq1Knqut9cru9VXevXr4eLiws8PT01Zrl628mTJzUSaFxcHLp27YpGjRohPz8fQNHkQZaWlsUSLQB06dJFY6IhUroqOc6WFGdiYoJeo8cBo8cZOxQNPYYPR5vevXFk/QbEnzwN28i7sFUVJV6GYZBiaQGmQzt4Dx2MSeM+rPZTtpH3y759+3D69Gl4e3tjwIABWLt2bYlj8tevX49Lly4hKioKSqUS/fv319ivVCoRGRmJZ8+egWEYLFu2DA8ePEBUVBRyc3PRq1cveHp6IiAgoNyYlEolrl27hri4ONy6dQv9+vVD27Zt4e3tXezYe/fuoV69euqfGzRogP/++w8eHh5wdHSEXC7HwoULcejQoRLfy8fHB3fv3i03JkLJluiBjY0Nxn35BZgvPsfl8+eR9OQJlFIp+JaWGNChA+o3amTsEAlhxQcffKCeeW7EiBE4ffp0icf9888/mD9/PpydnQEUrTsbGBiocczixYvV/Uz27NmDzZs3w87ODnZ2dvjss8+we/fuCiXbN+vy9fXF4MGDceDAgWIL3QOAWCyGtfXr8e4//vgjZsyYgYKCAqxbtw5r167FBx98gMTERPj7+8Pc3Bzr1q2Dl5cXgKK//ezs7ArFVN1RsiV6w+Fw4NujB9Cjh7FDIcQgKrL0HVA0T8Cbq+S8vWIOl8uFq6ur+uekpCSdlsp75c0l+zw8PNQLIrzN1tZWI+a6deviv//+U8e8b98+XLp0CR07dsT+/fuRkJCAL774Avv37wdQtDCDra1theOqzuiZLSGEsEwkEiExMVH984sXLzT2v90T3s3NrdSl8qysrFBQUKDel5KSUuz93pzlLiEhQSORv6lJkyZ48uRJifu+/PJLLF++HCYmJkhPT0fNmjXRpk0bdacpoGiq3LKmuiWvUbIlhBCWDR8+HL/++itSU1ORnJyMdevWlXn86NGjsWzZMmRlZSEhIQGrV69WL5XXrFkzhIaGIjk5GampqVi7dm2x8suWLUNhYSGuXLmCI0eOaKxO9KZ+/fqpO3y9KSIiAnl5eejTpw8AwMLCAg8fPsT58+fVTcgAEBYWph4eRMpGzcjviJSXL3Fp6yao7twGR5ILMAwYaxtwm7aE76QpELm7GztExDx6hDv7gsFPT4FKJgPX3ByMZ210njRVp+XCCHlfzJo1C48ePYKPjw+cnJzg7++P3bt3l3r8d999h08//RQ+Pj4wMTHB1KlTMXHiRABA7969MXDgQPj4+MDd3R2TJ0/G+vXr1WV5PB5at24NT09PmJmZ4ffff0f9+vVLfB9/f3907twZP/74o7rzokqlwpdffqnRi/mnn35C7969YWFhgX379gEA5HI5jh8/juvXr1f691Md0D
hbHRhyjGRubi6OfbcAjreuoIWsoFhzE8MwuM23QHrLtui/LNAoz0/uX47A4x2b4fb4LhpxNWdoUjEMrphYIbdpa3T99Cs4i0puzjKEuOcxiDx5BNzUOHDlMoDLg8LKFoKGrdCl/yCD9pSmcbbV28aNG/HPP/+U2qFKV7Gxsahbty4UCkWFy3z00Ufw9fXFhx9+qNV7bdmyBVFRUfj555+1DbNaomSrA0NdKDPT03Fs1hT4JTwtd7F2hmEQ4lYbfus3w/GNThtsu3z0CGR//oQW8rxyjz0jdEWrFavhVcq3bLbIZDL8+/uPqJubiObOxf+/cgulCM1UoNagD9Gsna9BYqJkW73k5ubi6tWr6N69O54/f44BAwZg7ty5mDNnjl7fR5dkSwyDntlWUQqFAofnzUH/CiRaoKiDhV9SDI7NmwO5nP11YgHgXsQlyCuYaAGgt/glri/8DJnp6SxH9ppCocDewO8wxFRcYqIFABtzMwxys4Ls1C7cDA81WGyk+njVNGtra4suXbpgwIABmD59urHDIgZEybaKCt2/B92j72o1Xy+Hw0HPZw9xbtdOFiN77dHfG9C8gon2lT6ZibiwsezOIfp04u8NGCFUwKQCTcQtHa2RHLIHubm5BoiMVCe2tra4desWJBIJkpKSsHr1apiYmOj9fby8vOiutooyeLL18vJCzZo1Ne6+Zs6ciSVLlqh/joiIAIfDwW+//aZRNjQ0FBwOR91R4JUdO3aAw+Go6wgNDQWXy4W1tTVsbGzQpEkTHD58GLt27YK1tTWsra1hbm4OHo+n/tnPz4+1c9ZF1plTsNHhGaIlj4vs82dZiEjTw5s3UOfZQ63LcTgcMNcuGuTuWyaTgZ8QBTOTivcD7O1mg7B/97IYFSGkOjLKnW1ubi62bt1a6v6goCDY2dmVOKenq6srzpw5ozHObNeuXRpTjgFA7dq1IZFIkJ2djVmzZmHMmDHw8/ODRCKBRCLBtm3b0LlzZ/XPISEh+jvBSnr68AHcH97RubzX43uIunVLjxEV9+jgftTl6fa4v0NOKs7vLb0npr5cOHIQXRzMtCrD43Ihj3kI6spAyLuDYRhERkZix44d2LBhAzZs2ICdO3ciMjKyyvwtGyXZzp8/HytXrizx7kYul2Pfvn347bffcOvWLURHR2vst7S0RPfu3XH06FEARbOcPHr0CN26dSvxvbhcrnpi75iYGJ3ilUqlyMnJ0djYFH39GrxR8XVs31aHw+D5bXaTLT+t5BlpKsKKx4MsIVZ/wZRCnpoIC1Ptm+ocZbms/x8TQipPoVBg69at+PLLL7Fz507ExMQgJSUFKSkpePbsGXbu3ImvvvoKW7duNXrzulGSbffu3eHp6Ylt27YV2xcSEgIej4exY8eia9euJd7djhs3Tr3Y8Z49ezBq1KhSFyJXKpXYunUrrKys1CtjaCswsGhIzavtzanQ2MAUFlZ6bVWVtFBP0ZRWv7RS5ZlKlq8QhW5N1dZ8lDrtHiHV1YsXL/DHH3/gyy+/xNy5c/HRRx/h008/xU8//YRbLLekleTBgwf45ptv8OzZM1hbW0MgEGhcNzkcDgQCAaysrBATE4Nvv/0WDx48MHicrxitg9TixYtLvLsNCgrCiBEjwOPxMGbMGHVSfVOfPn1w8+ZNZGZmIigoCOPGFV8J5/nz5xAKhXB2dsb27dvxzz//QCgU6hTrggULkJ2drd7enAqNDRxzi0o3fXAtzPUUTSn1m2nXPFusvDm78QEAo+OaoNlyhobjEKN69OgRgoODsXPnThw7dsxgIwxKkpeXhxUrVuCnn35Ceno6rKys4ODgACcnJ9ja2iI/Px+7du3CggULSp36Ud9u3LiBv//+G1ZWVhXqaMbn82FpaYktW7YYbRIOo80g1bNnT7i7u2P79u3q13JycnD06FH1+ogjRozAnDlzEBERAV/f1+Mf+Xw+hg4diuXLl6OwsBDNmzcvVn+tWrXw9OlTvcRqZmYGs0omF2
007doN9/76DU1VMp3KR3H4aOBbfO1JfVLUqAnE3C//wBKIFUpYeTfQc0TFWXrWRc7jRAgsTLUqJ7ayh42NDUtREVK606dPIyIiAtnZ2eqbA5lMhv/++w+1a9fGhAkTDDpxTXZ2Nr7//ntYW1vD3t6+1ONefTndsGEDJkyYgGbNmrEWU0ZGBnbt2qXTzZOtrS2Cg4NRu3ZtODg46D+4Mhh16M/bd7cHDhxAYWEhRo0aBZFIhIYNG0KlUpXYlPzhhx/i119/LfGu9l1Xw8sLaU1a6Fw+qVFz1GJ54ogWY8bjvkq3GZeuOXmgy9Dheo6ouC5+gxCerd1zGqlcAdM6jVmKiFQlly5dwtq1a/HLL7/gzz//ZL3FqjybN2/GmTNnwOFwNBKJqakphEIhMjIy8P3332u1+k9lqFQqBAYGwsbGptTHdG8TCATYsWNHqasM6cOGDRsq9YXD1tYWGzZs0GNEFWPUZNu7d2+IRCL1wsRBQUGYN28e7ty5g8jISERGRmLr1q3Yt29fsWYUX19fnDlzBjNnzjRC5Oxz6TcImUpV+Qe+JVupgmOf/uUfWEm1vOsjsX5TrcspGQa8th0r/MdbGTweDyb1WiC7sOItBKdSCtB92CgWoyLGlpGRga+//hqHDh2CWCyGRCJBamoqfv75Z6xdu9YovVf/+ecfREdHa6wt+7ZXzyBXr15tkM4+p06dAofD0br/iEAgwIEDB1iJ6dmzZ0hLS6tUnxYOh4O0tDQ8e/ZMj5GVz+iTWixevBiZmZl48eIFwsPDMXfuXIhEIvU2duxYmJublzg0p2fPnu/tBPddhgxBePP2UGrxh69iGIQ2bo1uI0ayGNlrLT+ajwhLoVZljrvWQd85n7ATUAn6jZ+Ek0o75FYg4Yal5KHBmOnqBbxJ5WVnZ2PTpk1YvXo1tm3bhvz8fKPGo1KpsGrVKpibm2skNg6HA3t7e6SkpGDTpk0Gj+ny5cuwsrKq0PGmpqY4cuQIy1EB169f1/lvISYmhpXnzCEhIXq55tvZ2Rl8uCfNjawDQ81rW1BQgN0zp8Lv0W2YlnMnKFcxCKnXBKP/+huWlpasxfS2exGX8GLVInTMzyrzOBXD4JioNvr+8gecDLwYAcMwOLp5HSzi7qOLiw34PM3fZZw4D5EKCzQdGYC6DRsZJKbqMDdydnY2lixZAoFAAC6XC6VSiby8PKxcudKgfSDedOLECYSFhcHCwqLUYzIzMxEYGGiwv6OTJ08iNDS0zJjeplKpsHTpUtZiio+Px+rVq3VObDKZDC1atMDIkfr94r9kyZJKj9R40+LFi/VWV3mMfmdLSmdhYQH/zdsQMeADXBA4oFBVvFlZqlLhgrU9LvoNx7i/dxg00QJAE9+OaPjzevzXpjsieJbF7sTzlEqctXHCuR5DMXzTToMnWqDormXwtI/Q4eufcdbWByESU5zKYnAqm4tjciFye/tj+KKfDJZoq4v9+/erEy1Q1KxvYWGBgwcPGi2m+/fvl5vUBAKBehy/ITx58kSrRAsAaWlpUCp1H4tfnlu3blXqS6CpqSnS0tL0GFGR7OxsvdUlFov1VldF0Hq2VZyJiQlGLFoKqVSKczu2oeDGVXAkkqLnStbWMGvdFn7+AVr/sepTTW9v1Az8BWKxGOE7tgJpyWCkUnAsLGDm3RCDR48FX8dhOPpkbW2NAROmGDuMaiMnJ6fYs3kTExNkZmYaKSKgsLD88ed8Ph95edrN+V0Zujx/5XA4yM/PZ63XfGFhYaWXnGSjGVmfXzDY/LJSEuNfAUmFmJmZwW/aDGDaDGOHUiqhUIhBc+cbOwxSRdjb20MsFmtctGUyGVwMuATk28zNzTWmei2JXC436PAaXRYkYBimws94dWFpaQmlUlmphGtqqt2Qu4rQ55d2Q98AUDMyIYQVo0ePRl5envoORyqVQqlUYujQoUaLqXXr1uV20srLy8OgQYMMFBHQqFEjre+kRSIRqz3627ZtW6km28LCQri66v
+Rka4TE7FdV0VQsiWEsMLCwgKBgYFo3LgxHB0d0apVKyxfvtyojxS6d+8OLpcLVQn9H4CiToktW7Zk5a6sNN26ddOq009hYSFat27NYkRFybwyLRAymQwDBw7UY0RFHB0d9TI0i2EYODo66iGiiqNm5HfIjbAwRJ0+A2VePhhGBb6VNbx79UC77t2NHRoAICM9Hf/9/TcUGZlQSqXgW1jAqnYt9Js40aAXr7Lk5eXh/vVrECcnw9zKGjW8vVGH5QlAqjNTU1OMGTPG2GGocTgcfPfdd/j555+RlZWlvrtRKpXIzs5Gs2bN4O/vb/CYunXrhv/++6/cZ7AMw4BhGPTr14/1uNq3b49Tp05p3emSYRjUrVu30s98SzJkyBD8+OOPZc5mVRFZWVmYOnWqnqKqGBr6owNDDtuQSqU4snETXpw8CZsbkXB8qzNFBp+H7JbNUaNfXwyeaZwxotfPh+JOcDAUYRdRL1Os8S1dxjB4UtMDgu7d0GXaFNTy9jZ4fAAQdfsWHu4LhumNK2iUmwEBnwepikEiw0F8vUaw7NITPf0nGmxISnUY+lPVPX78GKGhoVAoFLCyssKIESOMOk3n3r17ce3atVI/D0qlEvn5+Vi4cKFBmkAZhsGiRYvA5XK1arLOycnBwoULWZsD4YcffkB+fr7OzegqlQpWVlb48ssv9RxZ2SjZ6sBQF8qUpCRsDZiCOtdvwaScZiY5wyCmVXNM2LoZriyvSvQKwzDYsWgJTP7eBpG8/B6VT+3tUHfZYvT44AMDRFdEKpVi7xfz0ODmJdTjlv5RL1SpEGbrjDqffoNWvXqzHhclW1KSq1ev4ty5c0hOToaNjQ14PB7y8vLA5XJRv359jBs3zqBfqAsLC7FkyRKYmJhUqPk/JycHM2fOLLa+uD7l5eXhu+++07kTW3Z2NpYtW8ZqB7OSULLVgSEulBlpadg0fCR8HjzWqtxjn3qYdHA/nEUiVuJ60+bPv4DzjmCUPsFccYlWFnBduQy9xo5lLa5XpFIpdk2biMEx98Gv4DOx+yYW4H6yAL5DhrEaGyVbUpa0tDTcvHkT+fn58PT0RKtWrfQ6mYM25HI5/vjjD8TFxUEgEBRrHmYYBmKxGA4ODpg6dSorHaPe9ujRI2zevFnrloicnBxMnToVDRqwvxDK2yjZ6oDtCyXDMPhx6Ah4h13S6Q/sUYc2+OrYEVb/OI9s2AD50hWw02H+5lh7Idru3IbGbdqwENlrOz6ehf63wyucaF+5bW4D11W/wacVe51QKNmSd41EIsH+/fsRExOj7lluZmYGkUiEoUOHwtPT06DxxMbGYtOmTVCpVOXe7UulUnA4HEydOhW1atUyUISaKNnqgO0L5aXTZxD94UTYldJjsjzZHMBr22Z0HaT/3oBA0ZeB33v0QuMHj3SuI3bEEExe/6ceo9L04MZ1SOdNQW2Obh/vcy07Y/ha9uKjZEtI5TEMg4MHD+L27dsQi8Wws7NTN3crFAqIxWLY2tqiRYsWGD58uNFaBwDqjVwl3d2zF646JloAsGWAB/v2s5Zsz/97CB4PogDo/sEtDA2DWCxmraPH4wN70EPHRAsAtneuITkxESJ3dz1GRQjRJw6HgxEjRmDEiBFISUnB5cuXIRaLwTAMhEIhfH19jTqJypuqxDhbLy8vWFpawtraGm5ubpg7d656Kq1u3bqpV+hwdnaGv78/cnJy1PtKWuv2TX5+fsUe7C9ZsgSNGjUCl8vFtm3bWDknXaWlpKDwQnil65GHX0ISS+tzPv33EGwrkWgBwDs9EyfXs7OmZG5uLkxvXK5UHS0VhYjYvkVPERFC2Obi4oKhQ4ciICAAkyZNwrBhw6pMogWqSLIFgNOnT0MikSA8PBz//PMPtmx5faHbvHkzJBKJep3bFStWVKjOQ4cOITc3t9jrdevWxerVq9GpUye9xa8v10NDUSNTXOl6PLJzcf3c+coHVAJZTEyl6+ByOCh8Wvl6SnLn0iU0yRNXqg
4OhwNu/HP9BFTNSaVSLFmyBFKp1NihFEOx6YZi016VSbav1KlTBx07dkRkZGSxfa6urvDz88ODBw/KraewsBALFy7EqlWriu0bP348+vbtW+HB2lKpFDk5ORobWwrE2eUO86kILocDWQlfNPRBmaefNUkV+RK91PO2nPRU2PAq/9FmyplDl1SMVCrF0qVLq9zFD6DYdEWxaa/KJdvo6GiEh4ejTp06xfYlJiYiJCQEzZs3L7eeVatWYcyYMahRo0alYwoMDIStra1682BxHCvP1EQv05EV1cXOrE1cPU23xzNhZwIJE3NzKPTwO+SYUJcGQoh+VJlk6+fnB2tra9SvXx++vr6YM2eOet+MGTMgFArRoUMH+Pr64ptvvimzrtjYWOzbtw+ff/65XmJbsGABsrOz1VsCS89CAUBUuzay9dBhTsIwsK/BTucenp46NfFs2Zmtp4Z3fSTq3r/sNRvDrfxCCHm/VZlkGxISgtzcXBw6dAg3btyARPK6iXHjxo0Qi8WIj4/Hxo0by23+nT9/PpYtW6a3mVbMzMwgEAg0Nra07dwZ6Y0bVrqe5Ab10blvXz1EVJx1+7aVvvvO4nBQZ0B/PUWkqUHTZnhaq3LzHWcrVRB2qRpzThNC3n1VJtkCRZ1ShgwZgl69emH58uU61xMaGoo5c+ZAJBKhTZs2UCqVEIlEFXrWa2xcLhcuvXtBVYlkxjAMnHr3YGUicADoNXMGntpW7gtHUtNG8GXpywAAWHTqAblK99/hdVFNdBkyXI8RVV9mZmZYvHixwead1gbFphuKTXtVKtm+8vnnn2Pz5s1IT08v91i5XI7CwkL1plQq8fjxY0RGRiIyMhInTpwAj8dDZGQk6v9/dZdXZVQqlca/q4qBH83GcycHncvH2AvRf/YsPUakya1GDaBjB53LyxkGLn17szrAvEfAZIRZ67YyiFSlgkmnbqyuF1qdmJmZYcmSJVXu4gdQbLqi2LRXJa8mDRo0QNeuXbF27dpyj508eTIsLCzUW2BgIJydnSESiSASieDk5ASgaH3GV+Ntp02bBgsLC5w5cwbTp0+HhYUFwsLCWD0nbdjZ2aHWJx8hy9RE67JiPh81586BE8vjy/wWLUSUl/bTszEMg/sd22HYJ5+wENVrlpaWcP/kSzzia/cHp2IYhNRvgQFzP2UpMkJIdUTTNerAUFPt7VoRiMI//oSDVF6h4zNNTWAyexr8Fy1iLaY33b96FWEffQKfuIp1GFMyDO60aYlpwUGwNcASYQAQtm8PsHENmsrKH64kVakQUr8FRv62AdbW2iyvoD2arpGQ6oWSrQ4MeaE8vnUrHv61Be6PomHJKbkhooBR4YV3XTSYOhkDpxl2QeTY6Cc4+s03sL16AzWkshKPYRgG0Xa24PbsgYlrfjH4mrv3Iy7h0a5tcLp3E01VsmLN19lKFa45e8CkUzcMnPd5hZYSqyxKtoRUL5RsdWDoC6VSqcSp3Xvw9NBh5EXegXleAThgUGBpAavmzVBn8CD0HfehQZJEaR7ejsTVbduQffESTDOzYCqVocDCHHB3h12P7ug7awbrTdvlSXj+HNd3bgXvRRyY/HxwTEygEthC2Lk7ugwbwVqHspJQsiWkmmGI1rKzsxkATHZ2tsHfWy6XM2lpaUxqaiojk8kM/v7lUalUTG5uLpOcnMwUFBQYO5wqy5ifITYVFhYykyZNYjw8PBgbGxumXbt2TEREhHp/YGAg4+joyNjZ2TFffPEFo1KpjBJnREQEw+FwmGXLllWp2H744QemRo0ajLW1NdO8eXMmJyenSsR2+/ZtxtfXl7GxsWFq1arFbNq0iWEYhlEqlcwnn3zC2NraMs7Ozszq1atZjePPP/9kWrRowfD5fGbx4sUa+7Zu3cq4u7szNjY2TEBAACOVStX7nj59yvj6+jIWFhZMixYtmMjISFbjLAklWx28rxdKYjjv62dIIpEwS5cuZeLi4hilUsns3r2bcXBwYHJzc5njx4
8zNWrUYJ4+fcq8fPmSady4MbN582aDx6hUKpl27doxbdu2VSfbqhDbH3/8wXTv3p2Ji4tjVCoVc+fOHaawsLBKxNa4cWNm6dKljFKpZG7evMlYW1szDx8+ZNatW8c0a9aMSUlJYaKjoxk3Nzfm7NmzrMXx77//MocPH2ZGjx6tkWzv3r3LCIVC5tq1a4xYLGZ69uzJLFy4UL2/TZs2zKJFi5iCggLmzz//ZGrVqsXI5XLW4iwJJVsdvK8XSmI41ekz5Orqyty4cYMZM2aMxp3k1q1bmS5duhg8nvXr1zNz585lJk6cqI7H2LEpFArG1dWVefr0abF9xo6NYRjG2tqaiY6OVv/cpk0b5uDBg0z79u2ZnTt3ql9fvHgxM2HCBNbjmTFjhkay/frrr5kpU6aofz5//jzj6enJMAzDPHr0iLGysmIKCwvV+2vWrMmcO3eO9TjfRJO/viPy8vJwbP1GiK9fh0oiAcMw4FlbQ9C6FQbOmgkbG3amPqwouVyOUzt2IPnUGSjT0sHIZOBamINf0xP1R46Ab79+Rl24GSjqqHUrPAzJkdeBwgJwTEwAgR06fTCWnpuy5MmTJ8jMzETdunXx8OFDjB07Vr2vSZMmBp9oJiMjA7/++iuuXLmCefPmqV83dmwvXrxAfn4+Dhw4gNWrV0MoFOLzzz/HtGnTjB4bAHz88ccICgrCd999h1u3biE+Ph7t27fHw4cP0bRpU43Yjh07ZtDYgKL/v549e2rEER8fD4lEgocPH8Lb21tj3O2r32H37oabJY6SbRWXmZ6Of5Z+j/zQMNR9mQLhWwlLee4Ctu4IgnnXzhj63UI4i0QGjU+pVGLvsuUQh5yCT2wcGr7dY/p+FDJPnMKGJo1Q138cevv7GzQ+AJDJZDizYwtktyLQNOsFGpu97gilVDG4+N8h5NVvjgZDPoBP85YGj+99VVBQgPHjx2PBggWwtbWFRCLR+FIjEAg0pmU1hG+//Rbz5s2D8K2hZ8aOLTExEdnZ2YiOjkZsbCyePHmCnj17wsfHx+ixAUVz10+YMEG9vOmWLVvg6upaJWIDSv7/e/X62/te7Td0nJRsq7DY6GgcnDYDDR88KrorLOHOkMfhoH5KGpi9/2DX3XsYvHED6jRsYJD4pFIpNgRMRtNzF+DF4QClDE2yZwD7uw+Q+s0iBD+PxYeLvjNIfACQlZmB4ws/w5DceJjyuICZZo9jHpeDrqZy4Pl13PnpFkKHBqDbB2NLqY1UlFwuxwcffIC6deti0f/HfVtbW2ssT5mTk8P6eOY33b59G9evX8e6deuK7TN2bBYWFgCARYsWwcLCAk2bNsWYMWNw4sQJo8eWmZmJAQMG4O+//8awYcPw4MED9OvXD02aNDF6bK+UFMer19/e92q/oeOskjNIESAjLQ0Hp05Ho4ePK9T8yuFw0PDRExyZNgOpycmsx6dSqfDX9Jloce4CzCrYPOwsV0CwcTP+Wb2G5eiKFBQU4MSCuRiZl1CUaMvRzEQJp4ObEXZwnwGie3+pVCr4+/uDw+Fg+/bt6s9vw4YNce/ePfVx9+/fR6NGjQwW14ULF/D48WO4u7tDJBJh7969+OGHHzBp0iSjx+bt7Q1TU1ONv/Wq8nt79uwZrKysMHLkSPB4PDRt2hS+vr64cOGC0WN7paQ4PD09YW1tjYYNG+LJkyca69saI05KtlXUge8WoeHDx1qXa/D4Cf75ZiELEWk6sW0bfE6eAV/L57D2ShVy/tyIhNhYdgJ7w4mfl2NYQbJWz4rrmnEg27cJyUlJLEb2fpsxYwZevnyJ/fv3a4z9Hj9+PDZu3IiYmBikpKRg9erVmDBhgsHimj59Op4+faqeN33w4MGYM2cO1qxZY/TYXiWzFStWQCqVIioqCnv37kX//v2NHpu3tzfy8/Nx+PBhMAyDhw8fIjw8HE2aNMH48ePx888/Iy0tDU+fPsWmTZtYjU2hUKjnwH/z3x9++CH++ecf3Lx5E9nZ2VixYo
U6jvr166NBgwZYtWoVpFIp/vrrL3A4HHTu3Jm1OEtCybYKysrKgiw0XKcORRwOB8rwi0hPS2MhsteSjp2AlY4dnrxzJbjw1yY9R6QpNzcXNlG3wONqH2MnMxWu7N7OQlTvv7i4OGzevBnXrl2Do6OjuhkvPDwcAwYMwKxZs9C2bVv4+Pigb9++mDx5ssFis7S0VM+ZLhKJYGFhAWtrawiFQqPHBgDr1q1Deno6HB0d0b9/fyxbtgydO3c2emy2trbYt28fFi9eDIFAAD8/P3z66afo1asXZs2aha5du6JevXrw9fXFp59+qtFRSd+WL18OCwsLbN68GStWrICFhQV27tyJJk2aYPXq1Rg8eDBq1KgBNzc3LFz4+qYjODgYp0+fhlAoxPr163Hw4EGDTwKk8wxSmzZtwm+//YaYmBg4ODige/fuWLp0KQICAnDlyhXw+XxYWlqib9+++PPPP9W9ZQMCAlC3bl0sXLgQDMNg+fLl2Lx5MzIyMuDg4IBhw4bh119/BQB4eXkhNTUVXC4XNjY2GDVqFH755Rfw+Xx4eXkhKCgInTp10ohryZIlWLFihXpKwJo1a2LEiBH46quv1Ovg/vjjj9i2bRsSEhLg6uqKBQsWYNKkSRU+d7Zn/9m1YiXs1/wOro7JjGEYpH88C+MXszNH8oObN/Fg6AeoIa/YnM0luVPLE1PDQmFiov1iCxVxdP1v6HH5EPg6rtxzAgIM3rKftVmlaAYpQqoXna5Ey5cvx6JFi/DDDz8gIyMDUVFR6NixI86dOwcA2Lx5MyQSCe7du4e7d+8iMDCwxHq2b9+O/fv348KFC5BIJAgPD0erVq00jjl9+jQkEgnCwsKwb98+bNpU/h3RxIkTkZubi7S0NGzatAknT55E7969oVQqARTd/QUHB0MsFuPAgQP4+uuvcenSJV1+FazIvnpN50QLFJ1f9tVreoxI091DhyuVaAGg7rNYhB0/rqeIimMe39U50QJAJ1kmLp06qceICCHVmdZXI7FYjJUrV2L9+vXo378/zM3NYWVlhenTpxdr2nBxcUHfvn0RGRlZYl3Xr19H//794eXlBQDw9PSEfylDQ+rVq4fOnTtrNb7M3NwcHTp0wKFDh3Dnzh31+K8vvvgCzZs3Vz/s79mzJ65cuVJqPVKpFDk5ORobm5Q5uXqog8UYc/MqXYUVl4vclFQ9BFMyXkHlYhSY8iFJT9FTNISQ6k7rZHv58mXIZDIMHDiw3GMTExMREhKCOnXqlLi/Xbt22LJlC9asWYPIyMgyF3B//PgxwsPD0bx5c21DhqurK1q3bl3i3atcLseVK1fK7JkWGBgIW1tb9ebh4aF1DNpgVEo91FH677KyVHqID2A7Rj3UTWt0EEL0ROtkm5GRAUdHxzIfLs+YMQMCgQA1atSAUCjE0qVLSzxuwoQJ+Pnnn3H48GF06NABrq6u+PvvvzWO8fPzg1AohJ+fHwICAnTuGCASiZCVlVXs9c8++wxeXl7o27dvqWUXLFiA7Oxs9ZaQULH1W3XF18P4L54Ve2PIuNZWla5DzjAwZ3NNW3PLShWXKlUwEwj1EwshpNrTOtk6ODggPT0dCoWi1GM2btyInJwcXLp0CTExMUgro2fsxIkTERoaiqysLCxatEg9PdkrISEhEIvFiImJQWBgILg6Pod7+fIl7OzsNF4LDAzEuXPncODAgTJ7/pqZmUEgEGhsbLJs1hQ69ltTs27eTE/RFOfVvRvSdejl+6YokTO6DB6kn4BKoKxVv1K/w3DGEh39BugxIkJIdaZ15urQoQNMTExwvAKdW3x9fTFz5kx89dVX5R5rbm6OOXPmwM7OTiPZ6kNycjJu3ryJjh07ql9bt24dNm/ejNOnT8Pe3l6v71dZvWZMR6yN7nem8VYW6DKVvaEB7Xv1QmLTxpWqw6JbF1hZVf4OuTQdPpyIa1LdvxDIG7Uy+CL3hJD3l9bJVigU4ttvv8Xs2bNx8uRJSKVS5Ofn4++//y
7WBAwAc+bMwdmzZ0vs2LR9+3acPHkSeXl5UCqVCAoKQk5ODlq0aFGhWGQyGQoLC9Xb28/ppFIprl69imHDhqFJkybq58w7duzAypUrcfr0abi5uWn7K2BdDS8voJOvzuUVHdqhdv36+gvoLRwOB469e0Gh451jgqkJ2gRM1HNUmlxErkitpdsXgrsFKjQeMlrPERFCqjOd2mQXLlyIxYsX44svvoCdnR3q16+PCxculDiY2d7eHhMnTsSqVauK7bOxscHSpUvh7u4Oe3t7rFmzBvv27Su1Q9XbevbsCQsLC/UWHBwMoCiJ29jYwN7eHlOmTEHPnj1x9uxZ9ZjJxYsXIy0tDc2aNVMPul+5cqUuvwrW9Pr6Kzytof0XgWfurujxdfktCZU1+KM5iGzTUuum2gJGhdyRw9Cwgl+oKqPzx1/iJMdWqzJpMhVSug1B3YYNWYqKEFId6TypRXVmqAkJbl8Iw+VP5qNO4ssKHf/M1QVt1vyMNizO4PKm9JQU7Bo/Aa3uPqjQuOA8hsGTwQMwc+N6nZ+9a+v5oyhE/vgd/JRZ5c7IFS9n8KBNPwyd9wXrcdGkFoRUL5RsdWDIC+WT+/dxaukyWF65BvdCaYnHJJmZIa9da/Rc+A0aGOCO8U25ubnYPf8zmF0IQ90cSYkJTcEwiHJ1gWD4EIz57juDr2ubnpqKi5vXweThTXTm5MPsrUUJogtVeCKqA+ce/dFp8DCDxETJlpDqhZKtDoxxoXx09x4itm1D7vWbUOXmFi0eb2MDm9Yt0X7iRDRs0dwgcZQmKSEB5zduQnZYGJjUNJjKFZCZm4FX0xPOfXqj3/Rp6ukyjaWwsBDn9wRBFfsEkBYAfBMorGxQ328ofJo1N2gslGwJqV4o2eqALpRlUyqVKCgogJWVlcHvYt8V9BkipHqhxeOJ3vF4PKMsIE0IIVUVLbFHCCGEsIySLSGEEMIySraEEEIIyyjZkmqJ+gUSQgyJOkiRauP5s2e4H3EBnNx08FQKMBwO5KZWsHavha59/MpcyYoQQiqDri5Er+RyOS6GhaIwPx9OLiK0atO2Sgz/OX3oABxykzCgjgcAzdWfJPm52PfHj+g1dgqcXVyMEyAh5L1GzchEb86GHMfp/UFoU8MO/ZrXgyunAEd2bUXkrRtGjSv05DF4Ixst6niUuN/a0gJj2jZA6N5tKCgoMHB0hJDqQOtkGxYWhvbt28PW1hYODg7o0aMHnj9/jiVLlsDExEQ9sb+1tTV8fYtWrgkNDQWXy4W1tTVsbGzQpEkTHDlyRF1nbGwsOByOuly9evWwadMm9b43m/e6desGGxsbZGRkqF9btWoVAgICSjz+zXJBQUEAgJSUFAwaNAjOzs5V4q5LW1lZWcjKyjJ2GBr+O3kCPo5W6N+5Haz+P1OUm8gZg7u2hyw5Dg/v3zNKXHK5HPnPo+DpXP4yikOb18H544cNEBUhpLrRKtlmZ2djyJAh+PLLL5GVlYW4uDh8/PHH6tV0Jk6cCIlEot4iIiLUZWvXrg2JRILs7Gx89NFHGDt2LMRisXo/j8eDRCJBbm4u1qxZg5kzZ5a4LB8AmJiY4JdfftHhdItwuVz0798fO3bs0LkOY2AYBvt27cSj6xF4fOMy9u7cXmxZQWOQy+WQZaXCXeRc4v42TRrg6b1Iwwb1fxdOhaBL/ZLvaN/G5/MgT0ukzlOEEL3TKtlGR0fDzMwMw4cPV9+pDhs2DJ6enhV/Qy4X/v7+yM/PR3R0dLH9HA4HAwcOhIODA6Kiokqs4+OPP8b69es17m614eTkhFmzZqF58+YVOl4qlSInJ0djM4ZzZ07Dr3MHdGjTCu1bt8TA7p1x9lSIUWJ5U/iF8+jSqkmZxwjNeUb5vcmzUmFpblbh490teUhJSWExIkLeD35+fti7d6+xw3hnaJVsvb29IZPJMHXqVJw5c0ani6dSqcTWrVvB5/NRs2bNYvsZhsGRI0eQlZWFJk1KvoDXq1cPgwYNwurVq7
V+f10EBgbC1tZWvXl4VOxOSd/k0kLY2LyeBtHKyhIqhcIosbxJWpAPSwuLMo+xt7FGdna2gSJ6jWGUWh1vY26CvLw8lqIhVYmXlxesrKw0/r/z8/NhY2MDLy8v4wX2jggJCcHo0aONHcY7Q6tka2tri7CwMEilUvj7+8PJyQnjx49Hbm4uAGDnzp0QCoXqbcaMGeqyz58/h1AohIWFBebPn49t27bB5Y2en0qlEkKhEA4ODli4cCG2bduG+vXrlxrLwoUL8eeffyIzM7PYvld1vbldvHhRm1PVsGDBAmRnZ6u3hIQEneuqDAcnZ8QlvFD//CIxCUJ7B6PE8iZ7R2ekppfdyvAyQwxHR0cDRfQaw9Wuw326pAD29uU/3yXvB3d3dxw6dEj98+HDh+Hq6mq8gFikVGr3xZPol9YdpBo3boydO3ciOTkZERERiIiIwIoVKwAA/v7+EIvF6m3jxo3qcrVq1VK/Pnr0aISFhWnUy+PxIBaLkZmZibt372LcuHFlxuHt7Y0BAwaUeHf7qq43t06dOml7qmpmZmYQCAQamzG0ad8BT16m4cR/YQg5F4YHcUlo31H389KXtu074MqD4o8E3pTHcGBRzt0vGxy86iIlq+J31OkqU9jZ2ZV/IHkvjB07Frt27VL/HBQUVOzaEx8fjwEDBsDBwQENGjTAyZMn1fv+/vtveHt7w8bGBk2bNkVoaKh6X7du3bB48WK0bt0aAoEAo0ePhlRa8prUSqUSixcvRs2aNeHi4oLPPvsMCoUCUqkUjRo1Uvcvyc3NhZeXF06cOAGg6O78xx9/hLe3NxwcHPD555+r+3EsWbIEY8eOxYgRI2BtbY1z586VeS4rVqyAq6srBAIBmjRpgocPH5b5+pudTlUqFRYvXgwPDw+4urpi7ty56nPdtm0bevTogVmzZkEgEKBhw4a4deuW9v9Z77hKDf1p1aoVhg8fjvv371e4jKWlJf744w8cOHAAt2/frszbY+HChVi3bl2Jd7fvq159/dB/5Cj4jRiFvgMGGjscAEXP2Wt4N8Gth49L3H/swhW069rLwFEVadexC67EpVXo2GxJHmxq1GY5IlKV9OjRA/fu3UNaWhrS0tJw9+5d9Or1+rOqUqkwaNAg9O3bFykpKfj777/h7++vfq4vEonw33//QSwW4+OPP8aYMWM0Euq+ffvwzz//ID4+Hvfv30dwcHCJcaxevRrh4eG4ceMGHj9+jFu3bmHDhg0wMzPD9u3b8dlnnyExMRGffvopevXqhf79+6vL7t69G2FhYbh37x5CQkKwdetW9b5///0XM2bMQE5ODjp27FjquTx69AgbNmzA7du3kZ2djf3798Pe3r7U19+2ZcsWHDhwAJcvX8b9+/dx8+ZNBAYGqveHh4ejS5cuyMrKwvDhwzF//nzd/9PeUVol20ePHmHNmjVISkoCUNRh6ujRo2jbtq1Wb2pra4tp06Zp/GfowsfHB35+ftiyZYvWZQsLC9V/FG/+m+imVdu2MBXVwtGLN3H59l1Ex8Ti/NVbOHr5Nlr39DNa0xyHw0Gd9t1x49mLMo+TyeU4EZ2C7v0GGCgyUhXweDyMHDkSe/fuxd69ezFixAj16AoAuHbtGgoKCjB37lzw+Xx06NABXbt2RUhIUcfE/v37w8PDAzweD9OmTQOHw8GTJ0/U5adOnYqaNWtCKBRiwIABuHPnTolxbNmyBcuXL4eTkxOEQiE+++wzHDhwAADQunVrzJw5E3369MHp06eLteZ98sknEIlEcHNzw/z58zU6LXXt2hV9+vQBl8vF3bt3Sz0XPp8PqVSKqKgoKJVK+Pj4QCQSlfr62/bs2YPPP/8cNWrUgIODAxYtWoTdu3er9/v4+GDs2LHg8Xj48MMPS/09vM+0SrY2NjaIiIhAq1atYGVlhV69emHAgAH4+uuvAQDbt2/XGGdbVieDjz/+GEePHi2xR7I2vvvuO40hRBVlYWGhjs/CwqLM58OkYpo0a47BY/
3RoHNfmHj4oH3/4Rg8apzRn4E1bdkapj5tceTWY0jyi09aERnzAkefpGHUjLnv5LhrUjnjxo1DcHAwdu3aVWIT8qv+Jq+2kydP4uXLlwCAQ4cOoWXLlup9qampGqMk3uyXYmlpCYlEUmIM8fHx8PPzU9czbtw4pKamqvdPnjwZUVFRmDRpUrHHWG922PTw8FDHBgA1atSo0LnUrVsXv/zyC7755hu4uLhg6tSpyMnJKfX1tyUlJWmMSqlZs6b6pkyb38P7TKveI+7u7ti/f3+J+5YsWYIlS5aUuK9bt254+vRpsbrenK1HUUqvWi8vL419bz4TAYAGDRpoPPh/+/jSytFYSvbY2dlVueeezdu0Q+MWrXDhdAjy4xPAUykALhdyvgWadxmIll61jB0iMZLWrVurH0W1adMGV65cUe9zd3dHgwYNcPfu3WLlpFIpxo4di4MHD6JPnz7g8XhwdXXV6dri7u6OvXv3omXLliXunzlzJj788EOsW7cOkydP1khsb3bYTEhI0Phy++aXx7LOBSjqc+Pv74/09HSMGTMGq1evxpIlS0p9/U1ubm6Ij49X/xwfHw83NzetfgfvO5obmVQbfD4fPfsPMnYYpAo6ePBgia+3a9cOKpUK69evx5QpUwAAV69eVTcNy2QyODsXTeaydu1apKVVrH/A2yZPnoyFCxdiy5YtEIlEiIuLQ1xcHLp27YoNGzYgLS0Nx44dQ2BgIKZMmYLTp0+rE+nvv/8OPz8/qFQq/Prrr5g3b57W51JQUICXL1/C19cXlpaWMDMzA4/Hw+PHj0t8/W2jR4/GL7/8gj59+sDCwgLLli3DmDFjdPpdvK9obmRCSLXXsGFDNGzYsNjrfD4fx48fx6lTp+Du7g43NzesWLECKpUKAoEAP/30E/r27QuRSISMjAzUrVtXp/f/4osv0KFDB3Ts2BG2trYYNGgQEhIS8Pz5c3z77bfYvn07TExM8M033yAzMxMbNmxQlx01ahQ6d+6Mxo0bo3fv3pg0aVKJ71HWuUilUnzxxRdwcHCAp6cnbG1tMX/+/FJff9uUKVMwbNgwtG3bFg0bNkSzZs2wYMECnX4X7ysOQ+2pWsvJyYGtrS2ys7ONNgyIvNvoM0T0wcvLC0FBQZUa2kgMg+5sCSGEEJZRsiWEEEJYRh2kSLUReeMGjm/aisyYWEjz8sA3MYGFnRA+Pbpg9PRpMDc3N3aIhGglNjbW2CGQCqJntjqg520luxUehviTR8AXZ0All4NjZg6law20HOMPz9p1jBbX4V27EbYzGOlhV2BRUHzyEiUYKOt5oX6/Xpj07ddwfmNMIFvoM0RI9ULJVgfGuFAmJyXh6p7t4Dy+C56kaFC50soGKu8maDt6AtyMtBIRAFw6dBAvj+5Hk+TnqGFS/MnEHQUHL+s2RuOJM1C/RcnjCNnAMAxWf/0N7v+2CWaFsvKPBwOmRSN8un0TfEpZcUpfKNlWD40aNcLmzZvRoUMHY4dCjIySrQ4MeaFUqVQ4+OP3cL0bgdbmTLEZjhiGwc1CILFhewxfsKTEMXBsOv7nb6gZshe1eOV/jK7xrSCY9QVa9exjgMiA1V9/i/s//gFTLT/iysbe+O7oAXiyuMwaJVtCqhfqIFWFMQyD4G8/Q6+ocLSxQIlTCXI4HLS24KDPkwgEL5inXvHDEM7u2Iq6IXsqlGgBoK0iD7nrfkTUjessRwYc27MX99Zu1DrRAgDvfjR+mjqbZhkj74TSZt8jVQsl2yrsxMbf4Zd4F5b88u9WLfg8DEx5iOPr1hggMiAvLw/Sf4NQQ8sb6TaqfDzcso6doN5wIWhPhZqOS5MbdhkXz53TY0SkOvLy8lKvpR0QEIC5c+eiZ8+esLGxQZ8+fUpdsUyhUOCjjz6Cg4MDfHx88MMPP6gnzIiNjQWfz8eGDRvg7u6OgIAAFBYWYs6cORCJRPD09MT333+vsdTe1KlT1XWHhoYWq2v9+vVwcXGBp6enetk8ol+UbK
sopVIJxfUw2JhUPJtZ8Xng3LoEuVzOYmRFzu/Yik7KfJ3K1o5/jEd3Kre8Ylke3LmDlNCIStVhLlfiv+27yj+QEC3s27cPa9asQVpaGpRKJdauXVvicevXr8elS5cQFRWF8+fPY8+ePRr7lUolIiMj8ezZM2zatAnLli3DgwcPEBUVhYsXLyIoKEi9Bm55lEolrl27hri4OOzZswezZ8+u9AIxpDjWk+2mTZvQpEkTWFlZwdPTExMnTlR3Vz9w4ACaN28OS0tLuLm5Yd68ecjPf30BDwgIwPLly4vVqVAoMHLkSHh4eIDD4ZTa/b1Pnz5wcXEp1szy2WefoU6dOuoFn48dO6a389WXsMP/oKM8S+tynVTZCP1nb/kHVgLDMJBeCwefq9sKOd4mHEQd3KfnqF47umkrLPOKr+6jrZjToRoruBBSWR988AGaNm0Kc3NzjBgxotSl5v755x/Mnz8fzs7OcHV1xUcffVTsmMWLF8Pc3BwWFhbYs2cPFi9eDDs7O3h6euKzzz7TWOKuPK/q8vX1xeDBg9XL+xH9YTXZLl++HIsWLcIPP/yAjIwMREVFoWPHjjh37hyCg4Mxbdo0LF26FGKxGOHh4bh9+zZGjBhRoWdlnTt3xr59+2BmZlbi/pcvX+L8+fOQyWQ4ffq0xj4bGxuEhIQgOzsba9euxfjx4/H8+XO9nLO+5D2IhECLu9pXLPk8yB6VvKqHvmRnZ0OYHF/+gWXgJrD3+xbHxumlHl5KGu7evKmXuggBKr7UXHJyssbyeG/+GwC4XK7G6j7lLXFXnrKW6SP6wdqkFmKxGCtXrkRwcDD69++vfn369OlQqVTw9PTE0qVLMWTIEABAnTp1sHfvXtSqVQtnz55F7969Sw+az8cnn3xS5vvv3r0b7dq1Q7NmzRAUFKQRw5vLQ3Xv3h0NGzbErVu3UKtWycusSaVSjcXlS1rPUd840krcmRVW/q6uLNnZ2RAqZEApX3QqpDLnV45CPa2VaQogLYkuOsTwRCIREhMT1T+/ePFCY//bnSVfLXFXp07RePY3l7izsrLSWM40JSWl2PslJCSo1/dOSEgocVEGUjms3dlevnwZMpkMAwcOLLYvOjoaiYmJ6kT7ikgkQvv27XFODx1TgoKCMHr0aIwZMwaHDx8u9RtkVlYW7t+/X+aHKzAwELa2turNwxBjWvkmupc1qUTZCjA3N4eUV7nvaZzKnF85eCameqlHCcDKxkYvdRGijeHDh+PXX39FamoqkpOTsW5d2Z0KR48ejWXLliErKwsJCQlYvXq1eom7Zs2aITQ0FMnJyUhNTS3xOfGyZctQWFiIK1eu4MiRIxgxYgQr51WdsZZsMzIy4OjoCD6/+EU5PT0dQFFyfZuLi4t6v66ioqJw584djBw5Ep06dYK9vX2J61WqVCpMmjQJI0aMQIMGDUqtb8H/2rv3oCbPfA/g3wQDIiEgeIkIiBYK1aLFdi1Iq9Xd1iqtbm3BGyt4qVY93drZM85IrXFXT21n2uWMdlprrZfpamu3dhetR3erqFCl212veMUbitypFpBLQvI+54+EQEhQBJI3lO9nJjPvPd83ZPLjvT3P8uWorKy0vpp31uwsJv8+7Xr0RAgBo38fJyRqEhAQgArvjhUhk8Z5ncv3CuicbetVPTAoon1dphF1xKJFi/Dkk08iKioKY8eOxdSpU1u9ZAYAb7/9NiIjIxEVFYW4uDhMnz4dKSkpAIBnn30WL7zwAqKiojBu3Di7Qurh4YEnnngCoaGhSExMxPr16xEZGenU/euOnFZsAwMDUVFR4fAZsMDAQADm6xItlZaWok+fjhWLzz//HGPGjMGAAQOgUCiQlJTk8Hb2xYsXo7Ky0qZvSEe8vLyg0WhsXs42KikZ/9Y/+A1Ip/TA46/MdEKiJiqVCqbhv2r3+jVGE9RxYzoxka3oCb9BAzr+jKx/7OMY5uSWpOiXLT8/39r93datW7FixQrrvNTUVBw4cMDheiqVCh
999BFu376NS5cuoW/fvhg4cCAA8+NELX9Xvb298fHHH6O0tBS3bt3CqlWroFSaf94VCgU++eQT/Pzzzzh37hz+8Ic/4MqVKzbrL1q0CGVlZSgoKLAWaepcTiu2cXFxUKlU2Lt3r928yMhIBAUFISMjw2Z6SUkJfvjhB4wbN67d7yuEwI4dO/Djjz9Cq9VCq9Xis88+Q2Zmps1F/2XLluH48ePYvXv3Pf9jlIs2KAjlYcMeeL3CkEcQPCis8wO1EPVSIq608wmjYz598Ezi9M4N1MzU2clQPNqx/8xNEBjx4kSHDYkQOVt1dTUOHDgAk8mEK1eu4M9//rPdZTfqWpxWbP39/fHWW29h8eLF2L9/P/R6PWpra7F582Zs3boV7777LnQ6HTIyMmAwGHD16lVMmzYNTz31lM3NUUajEfX19dZX4390er0e9fX1dsPZ2dkoKyvD6dOncerUKZw6dQoXL17EsGHDrLfCr1mzBt9++y32798PXze+JvfEvP/CIeHT5uWzpF4YOXeJExM1eeSxkTj7UPQDn+q+YxTo8fSzUDnxunKPHj0wdNKzEB04ujUMDsHMJYs6MRVR20mShGXLlsHPzw9jxoxBQkICFixYIHcs6gCnt428ceNGrF+/HlevXkVgYCDGjx+PP/3pTxg0aBB27tyJd955B3l5efDz80NSUhLWrl0LHx9zgUlNTcW2bdtstjdv3jxs2rQJYWFhuHHD9hEPIQQWLFgAg8GArVu32szbsGEDNm7ciBMnTkChUMDT09PmB/+TTz7BrFmz2rRPrmzXNu/0SeT972o8q6hu9ShLCIFMqBG2JA2PPN7+07sPqrq6Gn9fMgcvVRa26Qiw2ijhwKPxmLX2facfMVZVVWHZ85OBnBMPvK7euyeeS1+DGQud9+PGtpGJuhd2RNAOrv6hLC8txbHtm6E4+288ZfwZapX5prOaBhOylRpIwx5H3Ky56G+51d+VqqqqsHv5mxiVfx4DVa0X0FyjAgVxv8Eraatcdmr2xrVr+J/EWVCcOAsF2vaeeu+eeDJtKV5bkebUbCy2RN0Li207yPVDaTQa8f3e3aitKAOEgHeffngqYbJTT8m21YmsI8jf93eoz53AIP1d9PJQoqrBhKv+/SHFPImYpFkYFB7h8lxlpaV4b/4iVBzMdtiXbSMTBIzhYXjuv1936hFtIxZbou6FxbYd+EPZuqqqKty8dg01d+7Ar38/DAmPgKdn5zz32hEn/vUv/N9nW3F5fyakgkJ4QQETAIOnCn3jRyFmyiQkLXgV3t7eLsnD7xBR98Ji2w78oey67t69iwtnz6L0ViF8NL4IHTIED4W7/lnarvwdqq+vh8HQ/h6ViH5JPD090bNnz/su57TmGonckVqtxq9iY+WO0WXV19djcFgYShw0+UfUHWm1Wly/fv2+BZfFlojazGAwoKS0FAWXcqFRqwEBQEiAEACEg3HJ8niYACTJMq3Zq+U6QliGLeMmE4CmecJkaponWYalxuFm61vmiXst17itxmySybyMkCzjlmFTszzWbTeuIzUbbrF+830ySS3et8UwBIT1PR1tu9mwNa8w759JQEiNn0+z6VLj/jfNM+9u43KWYUmyrg+pKYcQApJJWP5kApIQlkXM05v+jAImqdm4JMy7jubTLes2jkuWebD8+YVojAPJ8tCeJMzbkCxfEckyDFi23bguzNs0WbZjXtacUwAwiaZ1BQCjMA9LjfMsw0ZLBkmYhxszmJcTNssJy3AtJGwvKYHBYGCxJaLOp/H1hcbXt1lxbL1wOi62917Hptg2K3T3LrbNCqLpXsXWwbbtimWz4ZbF1mSyXae1Yis12ydHBbbFsLhXgW0+bJ3WWGwlx8XW1KLYKpsVW2VTYRcmRYtiq2gqtoqmYtlYKCUhzAWxebFtvpxCQCgs09Gi2DaOwzIPlkL6AMW2scCamg0LWIotmhVby7CxcV3Lexktw5LdMBwOS5ZtNw57NF9OtP3JCnYeT0
RE5GQstkRERE7GYktERORkLLZEREROxmJLRETkZCy2RERETsZiS0RE5GR8zrYdGlu4rKqqkjkJdVWN352u2lpqVXW15SFLtP7MLBu1aPZebNTil9ioheEB+sxmsW2H6upqAEBISIjMSairq66uhp+fn9wx2szT0xNarRYhkdFyRyFyC1qttk2drbAjgnaQJAlFRUXw9fV1Wd+szVVVVSEkJAQFBQVu2Yi9u+cD5M8ohEB1dTWCgoKgVHatqzldsSMCuf/eHcHsrvcgudkRgRMplUoEBwfLHQMajcatv8Dung+QN2NXOqJtrmfPnm36cXFHXeE72Rpmd73OzN21/qUmIiLqglhsiYiInIzFtgvy8vKCTqeDl5eX3FEccvd8QNfISJ2nK/+9md31nJGbN0gRERE5GY9siYiInIzFloiIyMlYbImIiJyMxZaIiMjJWGyJiIicjMWWiH4RysvLkZCQAB8fH0RGRuLgwYMOl0tNTYWXlxfUajXUajWGDRvm4qT2Pv74Y4wcORIqlQqrVq1qdTlJkrB06VL4+/ujf//+SE9Pd13IVrQ1+6pVq6BSqayfu1qtdl1IB/R6PebOnYvQ0FBoNBrExsYiJyfH4bJ1dXVITk6Gr68vQkND8cUXXzzw+7G5xi4iNzcXR48exZ07d9C7d2/Ex8cjOlr+xuCLioratFxQUJCTk1B3t2TJEmi1WpSXl+PAgQNISkrC5cuXERAQYLfs22+/jRUrVsiQ0rEBAwZg1apV2LFjxz2X27BhAw4fPoy8vDxUVlbimWeewfDhw/HrX//aRUnttTU7AKSkpGDTpk0uSHV/RqMRYWFh+P777xEcHIyvvvoKL774IvLz8+3+EdDpdKioqEBhYSHOnz+PiRMnYuTIkYiMjGz7Gwpya0ajUUyfPl0olUoRFhYmYmNjxaBBg4SHh4eYNm2aMBqNsuZTKBRCqVQKhULR6kupVMqaMSMjQ9b3J+errq4WKpVKFBQUWKeNHTtWbN682W7ZlJQUsXr1alfGa7OFCxcKnU7X6vzY2Fjx+eefW8d1Op2YPXu2C5Ld3/2y63Q6MW/ePNcFaocBAwaI//znP3bTtVqtyM7Oto6npKSIlStXPtC2eRrZza1btw4nT57EiRMncP36deTk5CA/Px/Hjx/HmTNnsG7dOlnzSZIEk8kESZJafZlMJlkzJicn24zzKPuX5/Lly1Cr1TYdhERHR+PcuXMOl09PT0dgYCBGjx6NI0eOuCpmh50/fx7Dhw+3jt9rH93R119/jcDAQMTExOCbb76RO46Ny5cv4/bt2wgPD7eZfufOHZSUlHT4c2exdXNffPEFPvzwQ4wYMcJm+ogRI7Bu3bo2nbpxtdLSUpw8eRJlZWVyRwFg30F7XV2dTEnIWe7evWvXO4tGo8Hdu3ftln3jjTdw5coVFBcXY8mSJZg8eTJu3Ljhqqgd0nI/W9tHd5SUlISLFy+itLQU7777LlJTU/Hjjz/KHQtA0zXZ5cuX2/XG1fj5+vr6Wqe153NnsXVzeXl5GD16tMN5o0ePxqVLl1ycqHVFRUUYP348QkJCMGnSJAQHB2PcuHEoLCyUNVfLPofl6IOYnEutVqOqqspmWlVVlcObcGJiYtC7d294enpi1qxZiIuLwz//+U9XRe2QlvvZ2j66o6FDh0Kr1aJHjx6YMGECZs6ciYyMDLljoaGhAYmJiQgPD8fKlSvt5jd+vtXV1dZp7fnceYOUmxNCoFevXg7n9erVy60Kx2uvvYbo6Gjs2bMHPj4+qKmpwYoVK7Bw4UJ8++23suWqr6/HggULrOO1tbU24wCwceNGV8eiThQREYG7d++isLAQAwcOBACcPXsWs2fPvu+6SqXS7uyHuxo6dChyc3OtpzTPnj3rFndTt4c7fO6SJOF3v/sdFAoFtm3b5vD3tHfv3tBqtcjNzUV8fDyA9n3uLLZuTq/X45133nE4TwgBg8Hg4kStO3r0KHbt2gWVSgUA8PHxwXvvvY
cBAwbImuutt96yGV++fLlMSchZ1Go1pkyZAp1Oh/Xr1+PgwYM4c+YMpkyZYrfsrl278Pzzz8PLywu7du1CdnY2PvzwQxlSNzEajTAajTCZTDAajaivr4dKpYKHh4fNcsnJyXj//ffx3HPPobKyEp9++im2bdsmU2qztmbfvXs3xo4dC19fXxw+fBjbt2/Hvn37ZEpttnDhQhQXF+Mf//gHevRovRwmJydjzZo1+Oqrr3DhwgVkZGS0+phQqzpy5xY5X0pKikhNTb3ny1088sgj4tixYzbTcnJyRGRkpEyJqDspKysTEydOFN7e3iIiIkJ89913Qggh/vKXv4ihQ4dal4uPjxcajUZoNBoxatQoceDAAbkiW+l0OgHA5rVlyxaRlZUlfHx8rMuZTCbxxhtvCD8/P9G3b1/xwQcfyJjarK3Zp02bJvz9/YVarRbR0dHiyy+/lDG1EPn5+QKA6Nmzp/Dx8bG+srKy7L4ztbW1YubMmcLHx0cEBweL7du3P/D7sYs96jTffPMN5s6di5deegmhoaG4ceMGMjIysGnTJrz88suy5bp48SLmzJmDc+fOISYmBlu2bMGQIUNky0NE3Q9vkHJzxcXFmDFjBoYPH46UlBRUVFTIHalVU6ZMwQ8//ICHHnoI5eXlCA8PR05OjqyFFjA3dhAREYGdO3ciJCQES5culTUPEXU/PLJ1c7/97W9RU1ODqVOnYteuXejXr59bPu5jMpmgVqtRWVkJT09PuePYCAgIQHFxMby8vFBTU4Pw8HAUFxfLHYuIuhHeIOXmjh49iqtXr0Kj0WDatGk2D1a7Ew8PDzz66KMoLCzE4MGD5Y5jw2g0wsvLC4D5pi29Xi9zIiLqblhs3Zxer7c+xB4QEIDa2lqZE7Vu4sSJmDBhAubNm4fg4GCb2+hnzpwpWy4++kNEcmOxdXMGg8Hm0Z/6+nq7R4HS0tJcHcuh7OxsDBw4EPv377eZrlAoZC22fPSHiOTGa7Zubs6cOfecr1AosHnzZhel6ZqOHTt232Vaa6WLiKgzsNhSp9FoNHZN5gHm09+3b9+WIZHZ4MGDoVAoWm2tRqFQ4Nq1ay5ORUTdCU8jdxEHDx5EZmYmKioq0KdPH4wfP17WPiwdcVTMamtrZW9S8vr167K+PxERi62bq6urw8svv4zDhw8jNjYW/fv3x+XLl5Geno4xY8bgb3/7G7y9vWXNGBERAYVCgbq6Ojz88MM288rLyx02mUdE1J3wNLKbe/PNN3Hy5Ens3LkT/fv3t04vKSnB9OnTERMTg/T0dBkTAkeOHIEQApMmTbJp61ShUKBfv36IioqSMR0RkfxYbN1ccHAwsrOzHT67evXqVTz99NMoKiqSIZk9SZKgVLJRMiKillhs3ZyPjw8qKysd9kjR0NAAf39/1NTUyJDMXkNDA3bs2IHTp0/bdazM51iJqDvjNVs39/DDD2P//v144YUX7Obt27cPERERMqRybPbs2cjNzUVCQoK1T1EiIuKRrdv7+uuvsWDBAqxevRqTJ0+GVqtFSUkJMjIyoNPpsGHDBiQmJsodEwDg5+eHgoICa4tXRERkxiNbN/fKK6+goaEBy5Ytw+9//3vr9KCgIKxbt85tCi1gPgqvqqpisSUiaoHF1s3dvHkTer0eN2/eRF5eHioqKhAYGIjIyEhs27YNBQUFCAkJkTsmACAhIQETJkzA/Pnzbe6cBuRtG5mISG48jezm5syZg/j4eMyfP99u3pYtW5CVlYUtW7bIkMzeuHHjHE5XKBTIzMx0cRoiIvfBYuvmwsLCcOHCBYcNV9TV1SEqKgo3btyQIRkREbUVTyO7uYqKCnh4eDic5+HhgZ9++snFie7t2rVr+Otf/4qioiIEBQUhMTERQ4YMkTsWEZGs2AKBmxsyZAgOHTrkcN6hQ4fcqpDt3r0bjz32GE6fPo1evXrhzJkziImJQUZGhtzRiIhkxSNbN/f6669j7ty52LBhAxISEqBUKiFJEvbu3Y
vFixdDp9PJHdEqLS0Ne/bswdixY63TsrKysGjRIraPTETdGq/ZdgFr1qzB2rVr0dDQgD59+qCiogIqlQppaWl2HaPLKSAgAKWlpVCpVNZpDQ0N6NevH+7cuSNjMiIiebHYdhGVlZXIycnBTz/9hMDAQMTFxcHPz0/uWDYSEhIQHR2NP/7xj/Dy8oLBYMDKlStx+vRpmw4KiIi6GxZb6jS3bt3CjBkzcPz4cfTt2xfl5eUYOXIkvvzySwQHB8sdj4hINrxmSx128+ZNZGZmIjU1FdnZ2bh165b1buTvvvvOYafyRETdCe9Gpg7T6XQwGo3W8eDgYIwaNcp6NLty5Uq5ohERuQWeRqYOY8MbRET3xiNb6rCu1vAGEZGrsdhSh3WlhjeIiOTAYksd1tjwxp49eyBJEgBAkiTs2bMH8+fPt+kakIioO+LdyNRhr776KkpLSzF9+nSHDW846rGIiKg74Q1S1Gm6QsMbRERyYLElIiJyMl6zJSIicjIWWyIiIidjsSUiInIyFlsiIiInY7ElIiJyMhZbIiIiJ2OxJSIicrL/B9DkJB2crvVZAAAAAElFTkSuQmCC\",\n      \"text/plain\": [\n       \"<Figure size 500x500 with 5 Axes>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"fig, ax = plt.subplots(1,1, figsize=(5, 5))\\n\",\n    \"sc.pl.dotplot(kam20_norn_ad, groupby=\\\"Disease_Identity\\\", var_names=reduced_gene_list, show=True, swap_axes=True, ax=ax)\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Python 3 (ipykernel)\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.11.7\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 5\n}\n"
  },
  {
    "path": "model.py",
    "content": "\"\"\"\nModel class\n\n\"\"\"\n\nimport warnings\nwarnings.filterwarnings(\"ignore\")\nimport math\nfrom torch import nn, Tensor\nfrom torch.nn import TransformerEncoder, TransformerEncoderLayer\n\nimport sys\nsys.path.append('../')\nfrom typing import Any\nimport torch\n\n\ndef full_block(in_features, out_features, p_drop=0.1):\n    return nn.Sequential(\n        nn.Linear(in_features, out_features, bias=True),\n        nn.LayerNorm(out_features),\n        nn.GELU(),\n        nn.Dropout(p=p_drop),\n    )\n\n\nclass PositionalEncoding(nn.Module):\n\n    def __init__(self, d_model: int, dropout: float = 0.1, max_len: int = 1536):\n        super().__init__()\n        self.dropout = nn.Dropout(p=dropout)\n\n        position = torch.arange(max_len).unsqueeze(1)\n        div_term = torch.exp \\\n            (torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))\n        pe = torch.zeros(max_len, 1, d_model)\n        pe[:, 0, 0::2] = torch.sin(position * div_term)\n        pe[:, 0, 1::2] = torch.cos(position * div_term)\n        self.register_buffer('pe', pe)\n\n    def forward(self, x: Tensor) -> Tensor:\n        \"\"\"\n        Args:\n            x: Tensor, shape [seq_len, batch_size, embedding_dim]\n        \"\"\"\n        x = x + self.pe[:x.size(0)]\n        return self.dropout(x)\n\n\nclass TransformerModel(nn.Module):\n\n    def __init__(self, token_dim: int, d_model: int, nhead: int, d_hid: int,\n                 nlayers: int, output_dim:int, dropout: float = 0.05):\n        super().__init__()\n        self.model_type = 'Transformer'\n        self.pos_encoder = PositionalEncoding(d_model, dropout)\n        self.d_model = d_model\n\n        self.encoder = nn.Sequential(nn.Linear(token_dim, d_model),\n                                     nn.GELU(),\n                                     nn.LayerNorm(d_model))\n\n\n\n        encoder_layers = TransformerEncoderLayer(d_model, nhead, d_hid, dropout)\n        self.transformer_encoder = 
TransformerEncoder(encoder_layers, nlayers)\n\n\n        self.d_model = d_model\n        self.dropout = dropout\n\n\n        self.decoder = nn.Sequential(full_block(d_model, 1024, self.dropout),\n                                     full_block(1024, output_dim, self.dropout),\n                                     full_block(output_dim, output_dim, self.dropout),\n                                     nn.Linear(output_dim, output_dim)\n                                     )\n\n        self.binary_decoder = nn.Sequential(\n            full_block(output_dim + 1280, 2048, self.dropout),\n            full_block(2048, 512, self.dropout),\n            full_block(512, 128, self.dropout),\n            nn.Linear(128, 1)\n        )\n\n        self.gene_embedding_layer = nn.Sequential(nn.Linear(token_dim, d_model),\n                                                  nn.GELU(),\n                                                  nn.LayerNorm(d_model))\n\n        self.pe_embedding = None\n\n    def forward(self, src: Tensor, mask: Tensor):\n        \"\"\"\n        Args:\n            src: Tensor, shape [seq_len, batch_size, token_dim]\n            mask: Tensor, shape [batch_size, seq_len]; (1 - mask) is passed\n                to the encoder as src_key_padding_mask\n        Returns:\n            gene_output: Tensor, shape [seq_len, batch_size, output_dim]\n            embedding: Tensor, shape [batch_size, output_dim], the\n                L2-normalized output at the CLS position (index 0)\n        \"\"\"\n        src = self.encoder(src) * math.sqrt(self.d_model)\n        src = self.pos_encoder(src)\n        output = self.transformer_encoder(src, src_key_padding_mask=(1 - mask))\n        gene_output = self.decoder(output) # seq_len x batch_size x output_dim\n        # embedding = torch.mul(gene_output, mask.t().unsqueeze(2)).sum(0) # average over non zero genes\n        # In the new format, the CLS token, which is at index 0, is the output.\n        embedding = gene_output[0, :, :] # select only the CLS token.\n        embedding = nn.functional.normalize(embedding, dim=1) # L2-normalize each cell embedding.\n        return gene_output, embedding\n\n\n    def predict(self, cell_embedding, gene_embeddings):\n        gene_embeddings = self.gene_embedding_layer(gene_embeddings)\n        dec = 
self.binary_decoder \\\n            (torch.hstack((cell_embedding, gene_embeddings)))\n        return dec\n\n"
  },
  {
    "path": "model_files/new_species_protein_embeddings.csv",
    "content": "species,path\n"
  },
  {
    "path": "requirements.txt",
    "content": "numpy==1.26.4\nscipy==1.14.1\npandas==2.2.2\ntqdm==4.66.5\ntorch==2.1.1\nscanpy==1.10.2\naccelerate==0.24.0\nrequests==2.25.1\nurllib3==1.26.6\n"
  },
  {
    "path": "utils.py",
    "content": "\"\"\"\nUtils\n\n\"\"\"\n\nimport warnings\nwarnings.filterwarnings(\"ignore\")\nimport pandas as pd\nimport numpy as np\nimport os\nimport requests\nfrom tqdm import tqdm\nimport tarfile\n\n\ndef get_shapes_dict(dataset_path):\n    shapes_dict = {}\n    datasets_df = pd.read_csv(dataset_path)\n    sorted_dataset_names = sorted(datasets_df[\"names\"])\n\n    for name in sorted_dataset_names:\n        shapes_dict[name] = (int(datasets_df.set_index(\"names\").loc[name][\"num_cells\"]), 8000)\n\n    shapes_dict[\"dev_immune_mouse\"] = (443697, 4786)\n    shapes_dict[\"dev_immune_human\"] = (34009, 5566)\n    shapes_dict[\"intestinal_tract_human\"] =  (69668, 5192)\n    shapes_dict[\"gtex_human\"] =  (18511, 7109)\n    shapes_dict[\"gut_endoderm_mouse\"] =  (113043, 6806)\n    shapes_dict[\"luca\"] =  (249591, 7196)\n    shapes_dict.update({\n     \"madissoon_novel_lung\":(190728, 8000),\n     'flores_cerebellum_human': (20232, 8000),\n     'osuch_gut_human': (272310, 8000),\n     'msk_ovarian_human': (929690, 8000),\n     'htan_vmuc_dis_epi_human': (65084, 8000),\n     'htan_vmuc_val_epi_human': (57564, 8000),\n     'htan_vmuc_non_epi_human': (9099, 8000),\n     'hao_pbmc_3p_human': (161764, 8000),\n     'hao_pbmc_5p_human': (49147, 8000),\n     'gao_tumors_human': (36111, 8000),\n     'swabrick_breast_human': (92427, 8000),\n     'wu_cryo_tumors_human': (105662, 8000),\n     'cell_line_het_human': (53513, 8000),\n     'bi_allen_metastasis_human': (27787, 8000),\n     'zheng68k_human': (68579, 8000),\n     'zheng68k_12k_human': (68579, 12000),\n     'mouse_embryo_ct': (153597, 12000),\n     \"regev_gtex_heart\": (36574, 8000),\n     \"tabula_sapiens_heart\": (11505, 8000),\n     \"10k_pbmcs\":(11990, 12000),\n     \"epo_ido\":(35834,12000),\n     'tabula_sapiens_kidney': (9641, 8000),\n     'tabula_microcebus_kidney': (14592, 8000),\n     'tabula_muris_kidney': (2781, 8000),\n     'tabula_muris_senis_kidney': (19610, 8000),\n      'immune_human': 
(33506, 8000)\n                       })\n\n    shapes_dict[\"zyl_sanes_glaucoma_pig\"] = (5901, 6819)\n    shapes_dict[\"parkinsons_macaF\"] = (1062, 5103)\n\n    for row in datasets_df.iterrows():\n        ngenes = row[1].num_genes\n        ncells = row[1].num_cells\n        name = row[1].names\n        if not np.isnan(ngenes):\n            shapes_dict[name] = (int(ncells), int(ngenes))\n\n    return shapes_dict\n\n\ndef figshare_download(url, save_path):\n    \"\"\"\n    Figshare download helper with progress bar\n\n    Args:\n        url (str): the url of the dataset\n        save_path (str): the path to save the dataset\n    \"\"\"\n\n    if os.path.exists(save_path):\n        return\n    else:\n        # Check if directory exists\n        if not os.path.exists(os.path.dirname(save_path)):\n            os.makedirs(os.path.dirname(save_path))\n        print(\"Downloading \" + save_path + \" from \" + url + \" ...\" + \"\\n\")\n        response = requests.get(url, stream=True)\n        total_size_in_bytes = int(response.headers.get('content-length', 0))\n        block_size = 1024\n        progress_bar = tqdm(total=total_size_in_bytes, unit='iB',\n                            unit_scale=True)\n        with open(save_path, 'wb') as file:\n            for data in response.iter_content(block_size):\n                progress_bar.update(len(data))\n                file.write(data)\n        progress_bar.close()\n\n    # If the downloaded filename ends in tar.gz then extract it\n    if save_path.endswith(\".tar.gz\"):\n        with tarfile.open(save_path) as tar:\n            tar.extractall(path=os.path.dirname(save_path))\n            print(\"Done!\")\n"
  }
]