Repository: KxSystems/kdbai-samples Branch: main Commit: ad492f122018 Files: 71 Total size: 143.4 MB Directory structure: gitextract_79vuf6ne/ ├── .gitignore ├── HuggingFace_search/ │ └── huggingface_inference.ipynb ├── KDB.AI_course/ │ ├── README.md │ ├── course_specific_content/ │ │ ├── making_queries.ipynb │ │ ├── managing_tables.ipynb │ │ └── rag_example.ipynb │ └── notebook_references.md ├── LICENSE ├── LlamaIndex_advanced_RAG/ │ └── KDBAI_Advanced_RAG_Demo.ipynb ├── LlamaIndex_samples/ │ ├── Hybrid_Search_LlamaIndex_KDBAI.ipynb │ ├── Multimodal_RAG_LLamaIndex_CLIP_KDBAI.ipynb │ └── Sub_Question_Query_Engine_LlamaIndex_KDBAI.ipynb ├── LlamaParse_pdf_RAG/ │ └── llamaParse_demo.ipynb ├── README.md ├── TSS_non_transformed/ │ ├── Non_Transformed_TSS_Technical_Analysis.ipynb │ ├── Temporal_Similarity_Search_KDB+.ipynb │ ├── Temporal_Similarity_Search_Non-Transformed_Demo.ipynb │ ├── createHDB.q │ └── data/ │ └── marketTrades.parquet ├── TSS_transformed/ │ ├── Temporal_Similarity_Search_Transformed_Demo.ipynb │ ├── Transformed_TSS_pattern_matching.ipynb │ └── data/ │ └── marketTrades.parquet ├── document_search/ │ └── document_search.ipynb ├── fuzzy_filtering_on_metadata/ │ └── fuzzy_filtering_demo.ipynb ├── hybrid_search/ │ ├── data/ │ │ └── inflation.txt │ └── hybrid_search_inflation.ipynb ├── image_search/ │ └── image_search.ipynb ├── metadata_filtering/ │ ├── data/ │ │ └── filtered_embedded_movies.pkl │ └── metadata_filtering_demo.ipynb ├── multi_index_multimodal_search/ │ ├── data/ │ │ ├── bat1.txt │ │ ├── bat2.txt │ │ ├── bear1.txt │ │ ├── bear2.txt │ │ ├── caterpillar1.txt │ │ ├── caterpillar2.txt │ │ ├── deer1.txt │ │ ├── deer2.txt │ │ ├── fox1.txt │ │ ├── fox2.txt │ │ ├── hedgehog1.txt │ │ └── hedgehog2.txt │ └── multi_index_multimodal_search.ipynb ├── multimodal_RAG_VoyageAI/ │ ├── Multimodal_RAG_VoyageAI.ipynb │ └── data/ │ └── text/ │ ├── bat.txt │ ├── bear.txt │ ├── caterpillar.txt │ ├── deer.txt │ ├── fox.txt │ └── hedgehog.txt ├── 
multimodal_RAG_unified_text/ │ ├── data/ │ │ └── text/ │ │ ├── bat.txt │ │ ├── bear.txt │ │ ├── caterpillar.txt │ │ ├── deer.txt │ │ ├── fox.txt │ │ └── hedgehog.txt │ └── multi_modal_demo.ipynb ├── music_recommendation/ │ ├── data/ │ │ └── song_data.csv │ └── music_recommendation.ipynb ├── pattern_matching/ │ └── pattern_matching.ipynb ├── qFlat_index_pdf_search/ │ └── pdf_qFlat_Search.ipynb ├── qHnsw_index_pdf_search/ │ └── pdf_qHNSW_Search.ipynb ├── quickstarts/ │ └── python_quickstart.ipynb ├── requirements.txt ├── retrieval_augmented_generation/ │ ├── data/ │ │ └── state_of_the_union.txt │ ├── retrieval_augmented_generation.ipynb │ └── retrieval_augmented_generation_evaluation.ipynb ├── sentiment_analysis/ │ ├── data/ │ │ └── disneyland_reviews.csv │ └── sentiment_analysis.ipynb ├── unstructured_io_RAG/ │ └── Table_RAG_Unstructured_KDBAI_LangChain_RAG.ipynb └── video_RAG/ ├── video_RAG_TwelveLabs.ipynb └── video_RAG_VoyageAI.ipynb ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ *.ipynb_checkpoints .venv/ .DS_Store ================================================ FILE: HuggingFace_search/huggingface_inference.ipynb ================================================ { "cells": [ { "cell_type": "markdown", "id": "bb2094b8-13a5-4f7c-bd21-d2c709dab914", "metadata": { "id": "bb2094b8-13a5-4f7c-bd21-d2c709dab914" }, "source": [ "# Using Hugging Face Inference with KDB.AI to Create a AI Tool Search Engine\n", "\n", "##### Note: This example requires a KDB.AI endpoint and API key. Sign up for a free [KDB.AI account](https://kdb.ai/get-started).\n", "\n", "How to get started with using the Huggingface Inference API with KDB.AI.\n", "\n", "You will learn how to:\n", "\n", "1. Connect to KDB.AI\n", "2. Create a KDB.AI Database & Table\n", "3. Load Data\n", "4. 
Use the Sentence Transformers library to embed every description in the dataset\n", "5. Insert the data into our KDB.AI table\n", "6. Perform Similarity Search using the Huggingface Inference API\n", "7. Delete the KDB.AI Database & Table to Conserve Resources" ] }, { "cell_type": "markdown", "id": "nZHRcTHI9bZG", "metadata": { "id": "nZHRcTHI9bZG" }, "source": [ "# Why Use Hugging Face for Embeddings?\n", "\n", "When building production applications that utilize embeddings, it's often advantageous to use open-source embedding models for several reasons:\n", "\n", "1. **Control**: Open-source models give developers more control over the embeddings process, reducing dependence on third-party embedding providers.\n", "\n", "2. **Local Embedding**: With open-source models, you can create embeddings locally, which is particularly useful for embedding your dataset.\n", "\n", "A common approach is to use a Python framework like sentence-transformers, developed by Hugging Face, which offers state-of-the-art sentence, text, and image embeddings. Here's a typical workflow:\n", "\n", "1. **Embed your dataset locally**: Use a library like Sentence Transformers to embed your dataset, which might consist of AI tools and associated metadata.\n", "\n", "2. **Embed queries at inference time**: When a user submits a query, use an external service like Hugging Face's Inference API to embed the query. This eliminates the need to deploy your own model, allowing you to leverage a fully optimized external service.\n", "\n", "By following this approach, you can build a system that searches through hundreds of AI tools without the need to deploy any infrastructure (and scale to millions!). 
Additionally, since you embed the dataset locally, you can use Hugging Face's free plan without requiring a credit card or worrying about hitting rate limits, at least until you are ready for production.\n", "\n", "In this tutorial, we will walk through the process of embedding a dataset of AI tools using Sentence Transformers, and then using Hugging Face's Inference API to embed queries at inference time, enabling efficient and scalable search capabilities.\n", "\n", "You will need a Hugging Face api token for this sample. Please create a Hugging Face account by going to [Hugging Face – The AI community building the future](https://huggingface.co/) and create a token by going to https://huggingface.co/settings/tokens\n", "\n", "You can then enter this token below or set it to HF_TOKEN in your environment." ] }, { "cell_type": "markdown", "id": "260d0f4b-ef09-4bd2-a197-a9351be24684", "metadata": { "id": "260d0f4b-ef09-4bd2-a197-a9351be24684" }, "source": [ "# 0. Setup" ] }, { "cell_type": "markdown", "id": "d1468bd3", "metadata": { "id": "d1468bd3" }, "source": [ "### Install dependencies\n", "\n", "In order to successfully run this sample, note the following steps depending on where you are running this notebook:\n", "\n", "-***Run Locally / Private Environment:*** The [Setup](https://github.com/KxSystems/kdbai-samples/blob/main/README.md#setup) steps in the repository's `README.md` will guide you on prerequisites and how to run this with Jupyter.\n", "\n", "\n", "-***Colab / Hosted Environment:*** Open this notebook in Colab and run through the cells." 
] }, { "cell_type": "code", "execution_count": null, "id": "9f4996e9", "metadata": {}, "outputs": [], "source": [ "!pip install kdbai_client" ] }, { "cell_type": "code", "execution_count": null, "id": "491cd6d6", "metadata": { "id": "491cd6d6" }, "outputs": [], "source": [ "!pip install sentence-transformers" ] }, { "cell_type": "markdown", "id": "cc6d17b7", "metadata": { "id": "cc6d17b7" }, "source": [ "### Import Packages" ] }, { "cell_type": "code", "execution_count": 26, "id": "805d97da", "metadata": { "id": "805d97da" }, "outputs": [], "source": [ "# vector DB\n", "import os\n", "from getpass import getpass\n", "import kdbai_client as kdbai\n", "import time" ] }, { "cell_type": "code", "execution_count": 27, "id": "a55ae34e-472b-4aa7-9add-1fcb2ee24a41", "metadata": { "id": "a55ae34e-472b-4aa7-9add-1fcb2ee24a41" }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "id": "8c660c7d", "metadata": { "id": "8c660c7d" }, "source": [ "# 1. 
Connect to KDB.AI" ] }, { "cell_type": "markdown", "id": "d3a3aa22", "metadata": { "id": "d3a3aa22" }, "source": [ "To use KDB.AI Server, you will need to download and run your own container.\n", "To do this, first sign up for free [here](https://trykdb.kx.com/kdbaiserver/signup/).\n", "\n", "You will receive an email with the license file and bearer token needed to download your instance.\n", "Follow the instructions in the signup email to get your session up and running.\n", "\n", "Once the [setup steps](https://code.kx.com/kdbai/gettingStarted/kdb-ai-server-setup.html) are complete, you can connect to your KDB.AI Server session by passing your local endpoint to `kdbai.Session`.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "2e85c1ff", "metadata": { "id": "2e85c1ff" }, "outputs": [], "source": [ "#Set up KDB.AI server endpoint \n", "KDBAI_ENDPOINT = (\n", " os.environ[\"KDBAI_ENDPOINT\"]\n", " if \"KDBAI_ENDPOINT\" in os.environ\n", " else \"http://localhost:8082\"\n", ")\n", "\n", "#connect to KDB.AI Server, default mode is qipc\n", "session = kdbai.Session(endpoint=KDBAI_ENDPOINT)\n" ] }, { "cell_type": "code", "execution_count": 29, "id": "Dpi_auWw68cy", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Dpi_auWw68cy", "outputId": "fb43c068-7893-426b-b5bf-559e31a401e2" }, "outputs": [], "source": [ "HF_TOKEN = (\n", " os.environ[\"HF_TOKEN\"]\n", " if \"HF_TOKEN\" in os.environ\n", " else getpass(\"Hugging Face token: \")\n", ")" ] }, { "cell_type": "markdown", "id": "8788a6b1", "metadata": { "id": "8788a6b1" }, "source": [ "### Verify Defined Databases\n", "\n", "We can check our connection using the `session.databases()` function.\n", "This returns a list of all the databases defined in our vector database so far.\n", "It should include a \"default\" database along with any other databases you have already created." 
] }, { "cell_type": "code", "execution_count": 32, "id": "7877f51c", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "7877f51c", "outputId": "0e6fca8a-e50b-4b01-a080-b082bf23d889" }, "outputs": [ { "data": { "text/plain": [ "[KDBAI database \"default\"]" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "session.databases()" ] }, { "cell_type": "markdown", "id": "i5NYByShWqeK", "metadata": { "id": "i5NYByShWqeK" }, "source": [ "### Create a Database Called \"myDatabase\"" ] }, { "cell_type": "code", "execution_count": 33, "id": "97e5f4a9", "metadata": { "id": "97e5f4a9" }, "outputs": [], "source": [ "# ensure no database called \"myDatabase\" exists\n", "try:\n", " session.database(\"myDatabase\").drop()\n", "except kdbai.KDBAIException:\n", " pass" ] }, { "cell_type": "code", "execution_count": 34, "id": "Gbvw4SzqWprx", "metadata": { "id": "Gbvw4SzqWprx" }, "outputs": [], "source": [ "# Create the database\n", "db = session.create_database(\"myDatabase\")" ] }, { "cell_type": "markdown", "id": "e33f03c3", "metadata": { "id": "e33f03c3" }, "source": [ "# 2. Create a KDB.AI Table\n", "\n", "To create a table we can use `create_table`. In this notebook it is called with the table name, the schema, and the index definitions.\n", "\n", "The schema must meet the following criteria:\n", "- It must contain a list of columns.\n", "- All columns must have either a `type` or a `qtype`.\n", "- It must include one column of vector embeddings; in this notebook that column has type `float64s`." 
] }, { "cell_type": "markdown", "id": "9da55253", "metadata": { "id": "9da55253" }, "source": [ "### Define Schema\n", "The schema contains all metadata columns and a 'description_embedding' column, which will be used for similarity search.\n" ] }, { "cell_type": "code", "execution_count": 35, "id": "e5e8b782", "metadata": { "id": "e5e8b782" }, "outputs": [], "source": [ "schema = [\n", " {\"name\": \"id\", \"type\": \"str\"},\n", " {\"name\": \"name\", \"type\": \"str\"},\n", " {\"name\": \"description\", \"type\": \"str\"},\n", " {\"name\": \"summary\", \"type\": \"str\"},\n", " {\"name\": \"title\", \"type\": \"str\"},\n", " {\"name\": \"visitors\", \"type\": \"int64\"},\n", " {\"name\": \"description_embedding\", \"type\": \"float64s\"},\n", " ]" ] }, { "cell_type": "markdown", "id": "i9ePLlo3adwt", "metadata": { "id": "i9ePLlo3adwt" }, "source": [ "### Define the indexes\n", "We define the dimensionality, similarity metric and index type in the index definition. For this example we chose:\n", "\n", "- type = hnsw : HNSW enhances efficiency while maintaining accuracy. You can use other indexes here, such as qHNSW, IVFPQ, qFlat or Flat; as with metrics, the one you choose depends on your data and your overall performance requirements.\n", "- name = hnsw_index : this is a custom name you give your index.\n", "\n", "#### params:\n", "- dims = 384 : In the next section, we generate embeddings that are 384-dimensional to match this. The number of dimensions should mirror the output dimensions of your embedding model.\n", "- metric = L2 : We chose L2/Euclidean distance. Our dummy dataset is low-dimensional, which suits Euclidean distance. You can use other metrics here, such as IP/Inner Product and CS/Cosine Similarity; the one you choose depends on the specific context and nature of your data.\n", "\n", "Note: it is possible to define multiple indexes within a table." 
] }, { "cell_type": "code", "execution_count": 36, "id": "1-2uL1JMXP37", "metadata": { "id": "1-2uL1JMXP37" }, "outputs": [], "source": [ "# Define the index\n", "indexes = [\n", " {\n", " 'type': 'hnsw',\n", " 'name': 'hnsw_index',\n", " 'column': 'description_embedding',\n", " 'params': {'dims': 384, 'metric': \"L2\"},\n", " },\n", "]\n" ] }, { "cell_type": "markdown", "id": "09a5caa0", "metadata": { "id": "09a5caa0" }, "source": [ "### Create Table" ] }, { "cell_type": "code", "execution_count": 37, "id": "34067680", "metadata": { "id": "34067680" }, "outputs": [], "source": [ "table = db.create_table(table=\"ai_tools\", schema=schema, indexes=indexes)" ] }, { "cell_type": "markdown", "id": "20afbea1", "metadata": { "id": "20afbea1" }, "source": [ "# 3. Load Data\n", "\n", "We fetch data from a github gist containing companies, descriptions, and some metadata. We will then add these to pandas dataframe with column names/types matching the target table." ] }, { "cell_type": "code", "execution_count": 38, "id": "37581e86", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 293 }, "id": "37581e86", "outputId": "aebcdcdb-303e-4eda-8c58-36610243e3ac" }, "outputs": [ { "data": { "text/html": [ "
| | description | id | name | summary | title | visitors |
|---|---|---|---|---|---|---|
| 0 | Generate 3D textures for your game in seconds ... | rec_cfn1112cibvc11jnn2qg | TextureLab | TextureLab is a website that provides 3D textu... | Instant And Unique 3D Textures For Your Next G... | 23913 |
| 1 | Luma Labs enables users to explore 3D modeling... | rec_cfn1112cibvc11jnn2r0 | lumalabs | Luma Labs is a website that offers an early ex... | Imagine 3D V1.2 (Alpha) | 456963 |
| 2 | Make motion capture from video easier and more... | rec_cfn1112cibvc11jnn2rg | plask | Plask is an AI-powered mocap animation tool th... | Ai-Powered Mocap Animation Tool. | 90960 |
| 3 | Get hundreds of interior design ideas for your... | rec_cfn1112cibvc11jnn2s0 | AI Room Planner | AI Room Planner is an online platform that uti... | Interior Design By Ai | 211540 |
| 4 | A platform powered by AI to help you create be... | rec_cfn1112cibvc11jnn2sg | AI TWO | AI TWO is a website that provides a platform f... | Aitwo.Co - The Ai-Powered All-In-One Design Pl... | 7201 |
| | id | name | description | summary | title | visitors | description_embedding |
|---|---|---|---|---|---|---|---|
| 0 | rec_cfn1112cibvc11jnn2qg | TextureLab | Generate 3D textures for your game in seconds ... | TextureLab is a website that provides 3D textu... | Instant And Unique 3D Textures For Your Next G... | 23913 | [-0.06802839040756226, 0.017697788774967194, 0... |
| 1 | rec_cfn1112cibvc11jnn2r0 | lumalabs | Luma Labs enables users to explore 3D modeling... | Luma Labs is a website that offers an early ex... | Imagine 3D V1.2 (Alpha) | 456963 | [0.0028436651919037104, 0.003491099225357175, ... |
| 2 | rec_cfn1112cibvc11jnn2rg | plask | Make motion capture from video easier and more... | Plask is an AI-powered mocap animation tool th... | Ai-Powered Mocap Animation Tool. | 90960 | [-0.08536490797996521, -0.05372241884469986, 0... |
| 3 | rec_cfn1112cibvc11jnn2s0 | AI Room Planner | Get hundreds of interior design ideas for your... | AI Room Planner is an online platform that uti... | Interior Design By Ai | 211540 | [0.020655963569879532, 0.028269633650779724, 0... |
| 4 | rec_cfn1112cibvc11jnn2sg | AI TWO | A platform powered by AI to help you create be... | AI TWO is a website that provides a platform f... | Aitwo.Co - The Ai-Powered All-In-One Design Pl... | 7201 | [-0.02213478274643421, -0.03189412131905556, 0... |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 846 | rec_cod2au57l1i4603r3hvg | Scott Krager | Thumbnails.com uses AI to generate dozens of u... | Unlock the power of eye-catching thumbnails wi... | Thumbnails.com | 0 | [-0.07755479961633682, -0.05978638306260109, -... |
| 847 | rec_codntepuqmnhe7ku1ing | Nen Fard | StockTune: AI-powered, public-domain music for... | \\nStockTune is a revolutionary platform offeri... | StockTune | 0 | [-0.03542690351605415, -0.057283081114292145, ... |
| 848 | rec_codr709uqmnhe7ku1te0 | Nen Fard | StockCake: Free, AI-generated stock photos in... | StockCake is a revolutionary stock photo site ... | StockCake | 0 | [0.005515319295227528, -0.025487307459115982, ... |
| 849 | rec_coidgc9uqmnhe7l0eug0 | Jason West | FastBots enables anyone to quickly create a po... | FastBots is a no-code AI chatbot builder for b... | FastBots | 0 | [-0.0814945250749588, -0.006074093747884035, -... |
| 850 | rec_coj3l5aa8o7fb0ajha0g | Dubformer | AI-driven translation and dubbing services | Dubformer is an end-to-end innovative service ... | AI dubbing and video translation solution | 0 | [-0.09128136932849884, -0.05604198947548866, 0... |
851 rows × 7 columns

| | __nn_distance | id | name | description | summary | title | visitors | description_embedding |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.25221 | rec_cfn1112cibvc11jnn2qg | TextureLab | Generate 3D textures for your game in seconds ... | TextureLab is a website that provides 3D textu... | Instant And Unique 3D Textures For Your Next G... | 23913 | [-0.06802839040756226, 0.017697788774967194, 0... |
| 1 | 0.26723 | rec_cfn11a2cibvc11jnndbg | Ponzu.gg | Create realistic 3D images with AI-generated t... | Ponzu is a website that helps 3D artists and d... | Ponzu. | 6526 | [-0.06463481485843658, -0.014672131277620792, ... |
| 2 | 0.34271 | rec_cfn119acibvc11jnncf0 | Masterpiece Studio | Create 3D models with Generative AI and deploy... | Masterpiece Studio is a company that has devel... | Masterpiece Studio. | 38954 | [-0.04131263867020607, -0.0035701903980225325,... |
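
The `__nn_distance` column in the result above is the distance between the query vector and each returned row's `description_embedding` under the index metric (L2 in this notebook), with the closest rows first. The following is a minimal brute-force sketch of that ranking on toy 3-dimensional vectors. The toy data, the helper name `l2_search`, and the choice of unsquared Euclidean distance are assumptions made for illustration; this section does not state whether KDB.AI reports squared or unsquared distance, and an HNSW index returns approximate rather than exhaustive neighbours.

```python
import math

def l2_search(vectors, query, n=3):
    """Return the n (distance, key) pairs closest to `query` by Euclidean distance.

    Hypothetical helper for illustration only; it scans every vector,
    whereas an HNSW index searches a graph and may skip some candidates.
    """
    scored = [(math.dist(vec, query), key) for key, vec in vectors.items()]
    return sorted(scored)[:n]

# Toy stand-ins for the 384-dimensional description embeddings
toy_vectors = {
    "a": [0.0, 0.0, 0.0],
    "b": [1.0, 0.0, 0.0],
    "c": [0.0, 2.0, 0.0],
    "d": [3.0, 3.0, 3.0],
}

results = l2_search(toy_vectors, [0.9, 0.1, 0.0], n=2)
for distance, key in results:
    print(f"{key}: {distance:.4f}")
```

The smallest distance comes first, which matches the ascending `__nn_distance` values in the table.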
| | id | name | age | city | description | embeddings |
|---|---|---|---|---|---|---|
| 0 | 0 | Alice | 58 | New York | A passionate environmentalist with 5 years of ... | [-0.006158471, 0.063678846, 0.09181005, -0.023... |
| 1 | 1 | Bob | 25 | London | A software engineer with 7 years of experience... | [-0.035581246, 0.07986437, 0.04891828, -0.0604... |
| 2 | 2 | Charlie | 19 | New York | A guitarist with over 10 years of experience p... | [0.050266247, 0.05255312, 0.048840936, -0.0032... |
| 3 | 3 | Monica | 35 | Paris | A data scientist in Tokyo with 4 years of expe... | [-0.008097345, 0.030305384, 0.012246384, -0.04... |
| 4 | 4 | Eve | 33 | Berlin | An avid reader and travel blogger with 3 years... | [0.029772803, 0.07571457, 0.042140756, 0.06809... |
| 5 | 5 | Frank | 32 | New York | A graphic designer based in Berlin with 8 year... | [0.013257692, 0.045190323, 0.0074770325, -0.00... |
| 6 | 6 | Grace | 26 | San Francisco | A high school teacher with 15 years of experie... | [-0.011028861, 0.051242497, 0.063257486, -0.05... |
| 7 | 7 | Hannah | 24 | Amsterdam | A professional photographer with 6 years of ex... | [0.04469839, 0.07050187, 0.046390466, -0.03404... |
| 8 | 8 | Ivy | 52 | Rome | A fitness trainer with 5 years of experience w... | [0.0002550126, 0.024398372, 0.09861772, 0.0062... |
| 9 | 9 | Jack | 23 | Toronto | A chef with 12 years of experience who runs a ... | [-0.008186043, 0.051337104, 0.02683556, -0.030... |
| 10 | 10 | Kara | 55 | Chicago | A journalist with 9 years of experience writin... | [-0.017909497, 0.08548332, 0.0022086229, -0.04... |
| 11 | 11 | Leo | 45 | Barcelona | A musician with 20 years of experience who pla... | [0.008686635, 0.03110498, 0.05405915, -0.07571... |
| 12 | 12 | Mia | 20 | Madrid | A software developer with 6 years of experienc... | [-0.04372146, 0.06704399, 0.022140108, -0.1017... |
| 13 | 13 | Nate | 19 | New York | An artist with 10 years of experience who pain... | [0.01933304, 0.023277232, 0.044062667, 0.01242... |
| 14 | 14 | Olivia | 23 | Moscow | A historian with 7 years of experience who lov... | [-0.0051849326, 0.16519417, 0.06066864, 0.0311... |
| 15 | 15 | Paul | 31 | Dubai | A marketing manager with 8 years of experience... | [0.010789718, 0.017695278, 0.018274685, -0.033... |
| 16 | 16 | Quinn | 32 | Singapore | A nurse with 12 years of experience in emergen... | [-0.041632365, 0.034463193, 0.06313535, 0.0160... |
| 17 | 17 | Rita | 50 | New York | A financial analyst with 5 years of experience... | [0.015000028, 0.024906091, 0.0010010687, 0.011... |
| 18 | 18 | Sam | 56 | Istanbul | A project manager with 10 years of experience ... | [-0.020330371, 0.079401195, 0.02162953, -0.080... |
| 19 | 19 | Tina | 19 | Munich | A UX designer with 6 years of experience in cr... | [-0.030572662, 0.04520395, 0.04553928, -0.0925... |
| 20 | 20 | Uma | 53 | Vienna | A sales executive with 8 years of experience i... | [-0.014194918, 0.032352123, -0.0070426096, -0.... |
| 21 | 21 | Victor | 30 | Dublin | A content writer with 5 years of experience in... | [-0.018195461, 0.032041155, 0.059233848, -0.03... |
| 22 | 22 | Wendy | 59 | Zurich | A civil engineer with 10 years of experience i... | [-0.00980266, 0.04713828, 0.05187823, -0.03932... |
| 23 | 23 | Xander | 52 | Stockholm | A teacher with 15 years of experience in prima... | [-0.013646452, 0.028070105, 0.05104053, -0.064... |
| 24 | 24 | Yara | 44 | Lisbon | A business analyst with 7 years of experience ... | [-0.044623584, 0.054378174, 0.0015794634, -0.0... |
| 25 | 25 | Zane | 32 | Prague | A psychologist with 6 years of experience in c... | [0.016778275, 0.09543604, 0.048281595, -0.0022... |
| 26 | 26 | Alice | 46 | Budapest | A software architect with 9 years of experienc... | [-0.06051296, 0.031862404, -0.031203829, -0.07... |
| 27 | 27 | Cody | 55 | Berlin | A research scientist with 8 years of experienc... | [-0.01787689, 0.07915241, -0.004790489, -0.031... |
| 28 | 28 | Diana | 35 | Copenhagen | An operations manager with 12 years of experie... | [0.011406942, 0.02994747, 0.06136875, -0.02639... |
| 29 | 29 | Ethan | 18 | Seoul | A public relations specialist with 7 years of ... | [-0.001325855, 0.089781284, 0.05144235, -0.036... |
| | id | name | age | city | description | embeddings |
|---|---|---|---|---|---|---|
| 0 | 3 | Monica | 35 | Paris | A data scientist in Tokyo with 4 years of expe... | [-0.008097345, 0.030305384, 0.012246384, -0.04... |
| 1 | 8 | Ivy | 52 | Rome | A fitness trainer with 5 years of experience w... | [0.0002550126, 0.024398372, 0.09861772, 0.0062... |
| | city | avgAge | countCity |
|---|---|---|---|
| 0 | Seoul | 18.0 | 1 |
| 1 | Munich | 19.0 | 1 |
| 2 | Madrid | 20.0 | 1 |
| 3 | Moscow | 23.0 | 1 |
| 4 | Toronto | 23.0 | 1 |
| 5 | Amsterdam | 24.0 | 1 |
| 6 | London | 25.0 | 1 |
| 7 | San Francisco | 26.0 | 1 |
| 8 | Dublin | 30.0 | 1 |
| 9 | Dubai | 31.0 | 1 |
| 10 | Prague | 32.0 | 1 |
| 11 | Singapore | 32.0 | 1 |
| 12 | Copenhagen | 35.0 | 1 |
| 13 | Paris | 35.0 | 1 |
| 14 | New York | 35.6 | 5 |
| 15 | Berlin | 44.0 | 2 |
| 16 | Lisbon | 44.0 | 1 |
| 17 | Barcelona | 45.0 | 1 |
| 18 | Budapest | 46.0 | 1 |
| 19 | Rome | 52.0 | 1 |
| 20 | Stockholm | 52.0 | 1 |
| 21 | Vienna | 53.0 | 1 |
| 22 | Chicago | 55.0 | 1 |
| 23 | Istanbul | 56.0 | 1 |
| 24 | Zurich | 59.0 | 1 |
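
The per-city aggregate above can be cross-checked locally against the 30-row table shown earlier. The sketch below is not the KDB.AI query that produced the output (that query is not shown in this section); it is an equivalent plain-Python computation, with the `(city, age)` pairs copied from the earlier table and the variable names chosen for illustration.

```python
from collections import defaultdict

# (city, age) pairs copied from the 30-row table, in id order 0..29
rows = [
    ("New York", 58), ("London", 25), ("New York", 19), ("Paris", 35),
    ("Berlin", 33), ("New York", 32), ("San Francisco", 26), ("Amsterdam", 24),
    ("Rome", 52), ("Toronto", 23), ("Chicago", 55), ("Barcelona", 45),
    ("Madrid", 20), ("New York", 19), ("Moscow", 23), ("Dubai", 31),
    ("Singapore", 32), ("New York", 50), ("Istanbul", 56), ("Munich", 19),
    ("Vienna", 53), ("Dublin", 30), ("Zurich", 59), ("Stockholm", 52),
    ("Lisbon", 44), ("Prague", 32), ("Budapest", 46), ("Berlin", 55),
    ("Copenhagen", 35), ("Seoul", 18),
]

# Group ages by city
ages_by_city = defaultdict(list)
for city, age in rows:
    ages_by_city[city].append(age)

# One (city, avgAge, countCity) tuple per city, ordered by average age
# and then by city name to break ties
summary = sorted(
    ((city, sum(ages) / len(ages), len(ages)) for city, ages in ages_by_city.items()),
    key=lambda row: (row[1], row[0]),
)

for city, avg_age, count in summary:
    print(f"{city:<15} {avg_age:>5.1f} {count}")
```

New York is the only city with five rows (ages 58, 19, 32, 19 and 50), which gives the 35.6 average in the table.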
| | id | vectors |
|---|---|---|
| 0 | h | [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8] |
| 1 | e | [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9] |
| 2 | l | [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0] |
| 3 | l | [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 0.1] |
| 4 | o | [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 0.1, 0.2] |
| | id | vectors |
|---|---|---|
| 0 | h | [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8] |
| 1 | e | [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9] |
| 2 | l | [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0] |
| 3 | l | [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 0.1] |
| 4 | o | [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 0.1, 0.2] |
| | __nn_distance | text | embedding |
|---|---|---|---|
| 0 | 0.823007 | b'In late 2015 I spent 3 months writing essays, and when I went back to working on Bel I could barely understand the code. Not so much because it was badly written as because the problem is so convoluted. When you\\'re working on an interpreter written in itself, it\\'s hard to keep track of what\\'s happening at what level, and errors can be practically encrypted by the time you get them.\\n\\nSo I said no more essays till Bel was done. But I told few people about Bel while I was working on it. So for years it must have seemed that I was doing nothing, when in fact I was working harder than I\\'d ever worked on anything. Occasionally after wrestling for hours with some gruesome bug I\\'d check Twitter or HN and see someone asking \"Does Paul Graham still code?\"\\n\\nWorking on Bel was hard but satisfying. I worked on it so intensively that at any given time I had a decent chunk of the code in my head and could write more there. I remember taking the boys to the coast on a sunny day in 2015 and figuring out how to deal with some problem involving continuations while I watched them play in the tide pools. It felt like I was doing life right. I remember that because I was slightly dismayed at how novel it felt. The good news is that I had more moments like this over the next few years.\\n\\nIn the summer of 2016 we moved to England. We wanted our kids to see what it was like living in another country, and since I was a British citizen by birth, that seemed the obvious choice. We only meant to stay for a year, but we liked it so much that we still live there. So most of Bel was written in England.\\n\\nIn the fall of 2019, Bel was finally finished. Like McCarthy\\'s original Lisp, it\\'s a spec rather than an implementation, although like McCarthy\\'s Lisp it\\'s a spec expressed as code.\\n\\nNow that I could write essays again, I wrote a bunch about topics I\\'d had stacked up. I kept writing essays through 2020, but I also started to think about other things I could work on. How should I choose what to do?' | [-0.05267877, 0.005840427, -0.01187801, -0.028083289, 0.029767925, -0.01268333, -0.009753024, -0.011209541, 0.030792488, -0.07470311, 0.0005716741, 0.034681723, -0.0025648128, -0.007870674, -0.037071493, -0.0026503617, -0.030294443, -0.046712548, -0.026220752, -0.010382689, -0.047210008, 0.0039388337, -0.009324926, 0.04539282, 0.04298206, 0.051068194, 0.029527958, -0.012021941, -0.051774003, -0.20419116, -0.019487105, 0.03856181, 0.054865412, -0.024023462, 0.005628216, 0.059498444, -0.023029648, -0.011461271, 0.0007990732, 0.01532533, 0.013435846, 0.009714834, 0.010104686, -0.014338494, 0.004052569, 0.020879505, 0.0112869395, -0.048422333, 0.025670612, 0.033183247, -0.071020156, -0.032056253, -0.0013147242, 0.045764726, -0.023884403, 0.013609344, 0.021824384, 0.0791942, 0.0021155155, -0.0058458406, 0.022163069, -0.0010415328, -0.1377265, 0.05194325, -0.035091735, 0.020503322, -0.03358411, -0.039575316, -0.018544003, 0.07090187, -0.030203853, 0.0024145627, -0.050365325, 0.1062729, 0.04504893, 0.020158818, -0.0055481945, 0.0020900085, 0.014658697, -0.01600323, 0.018643875, -0.020128626, 0.001960821, 0.014573526, -0.018745624, -0.011082115, -0.026627902, 0.035287272, 0.033186108, 0.004842385, 0.04288919, -0.051519115, 0.021143924, 0.03511711, -0.032461487, -0.053802498, -2.9269107e-05, 0.022274038, -0.019326271, 0.5066904, ...] |
| 1 | 0.851789 | b"He wanted to start a startup to make nuclear reactors. But I kept at it, and in October 2013 he finally agreed. We decided he'd take over starting with the winter 2014 batch. For the rest of 2013 I left running YC more and more to Sam, partly so he could learn the job, and partly because I was focused on my mother, whose cancer had returned.\\n\\nShe died on January 15, 2014. We knew this was coming, but it was still hard when it did.\\n\\nI kept working on YC till March, to help get that batch of startups through Demo Day, then I checked out pretty completely. (I still talk to alumni and to new startups working on things I'm interested in, but that only takes a few hours a week.)\\n\\nWhat should I do next? Rtm's advice hadn't included anything about that. I wanted to do something completely different, so I decided I'd paint. I wanted to see how good I could get if I really focused on it. So the day after I stopped working on YC, I started painting. I was rusty and it took a while to get back into shape, but it was at least completely engaging. [18]\\n\\nI spent most of the rest of 2014 painting. I'd never been able to work so uninterruptedly before, and I got to be better than I had been. Not good enough, but better. Then in November, right in the middle of a painting, I ran out of steam. Up till that point I'd always been curious to see how the painting I was working on would turn out, but suddenly finishing this one seemed like a chore. So I stopped working on it and cleaned my brushes and haven't painted since. So far anyway.\\n\\nI realize that sounds rather wimpy. But attention is a zero sum game. If you can choose what to work on, and you choose a project that's not the best one (or at least a good one) for you, then it's getting in the way of another project that is. And at 50 there was some opportunity cost to screwing around." | [-0.04173409, -0.020306244, 0.026670614, -0.028619805, 0.013841975, -0.004587492, -0.03740281, -0.0023207841, -0.005583664, -0.02458708, 0.032301717, -0.003981511, -0.0022139344, 0.040776156, 0.008303966, 0.065411426, -0.05266241, -0.0147317415, -0.013039435, -0.02108635, -0.08220996, -0.023095597, 0.009018569, -0.06593445, 0.053503707, 0.02561, -0.011278506, -0.029375598, -0.02894449, -0.17977206, 0.015862752, 0.037204675, 0.028550476, -0.008014831, 0.050124772, 0.053289328, -0.037882008, -0.004310019, -0.040979013, 0.031382367, -0.019382592, 0.041386265, -0.06535482, -0.03808074, 0.013384267, 0.010357172, 0.0032444543, -0.052392986, 0.042238504, 0.020043798, -0.028322041, -0.055793695, -0.011091505, 0.020135079, -0.003494716, 0.01618655, 0.08450317, 0.040414557, 0.032989975, 0.011764182, -0.013049825, -0.029259514, -0.102057606, 0.016020596, 0.016062474, 0.010199196, -0.009390674, -0.043287795, 0.034758028, 0.13968067, 0.025622727, 0.016510569, -0.02354023, 0.073845506, 0.009602881, -0.049839057, 0.022470307, 0.043024465, 0.0017405926, -0.028580481, 0.0027170023, 0.010050958, -0.013109462, 0.014532717, -0.04200619, 0.01677191, -0.07769759, 0.0073121856, 0.0189732, 0.08225239, 0.052873313, 0.020460907, 0.017190987, -0.025781311, -0.057865854, -0.015826138, 0.04352462, 0.040577717, -0.045354914, 0.47870147, ...] |

| | document_id | text | embeddings | title | publication_date |
|---|---|---|---|---|---|
| 0 | b'272d7d24-c232-41b6-823e-27aa6203c100' | b'PUBLIC LAW 106\xc2\xb1102\xc3\x90NOV. 12, 19... | [0.034452137, 0.03166917, -0.011892043, 0.0184... | GRAMM–LEACH–BLILEY ACT, 1999 | 1999-11-12 |
| 1 | b'89e3f2ee-f5a6-4e40-bb81-0632f08341f0' | b\"113 STAT. 1338 PUBLIC LAW 106\xc2\xb1102\xc3... | [0.02164333, 1.0030156e-05, 0.0028665832, 0.02... | GRAMM–LEACH–BLILEY ACT, 1999 | 1999-11-12 |
| 2 | b'56fbe82a-5458-4a4a-a5ed-026d9399151d' | b'113 STAT. 1339 PUBLIC LAW 106\xc2\xb1102\xc3... | [0.01380091, 0.026945233, 0.02838467, 0.043132... | GRAMM–LEACH–BLILEY ACT, 1999 | 1999-11-12 |
| 3 | b'b6bf9e48-51b6-45d9-9259-b6346f93831f' | b'113 STAT. 1340 PUBLIC LAW 106\xc2\xb1102\xc3... | [0.0070182937, 0.014063503, 0.026525516, 0.040... | GRAMM–LEACH–BLILEY ACT, 1999 | 1999-11-12 |
| 4 | b'f398b133-b4f5-4a34-94d1-9a97fdb658e5' | b\"113 STAT. 1341 PUBLIC LAW 106\xc2\xb1102\xc3... | [0.025041763, 0.01968024, 0.030940715, 0.02899... | GRAMM–LEACH–BLILEY ACT, 1999 | 1999-11-12 |
| ... | ... | ... | ... | ... | ... |
| 989 | b'8e84d1d5-d87d-4351-b7eb-5d569fdb8d9c' | b'124 STAT. 2219 PUBLIC LAW 111\xe2\x80\x93203... | [0.024505286, 0.015549232, 0.0536601, 0.028532... | DODD-FRANK WALL STREET REFORM AND CONSUMER PRO... | 2010-07-21 |
| 990 | b'0c47f590-050c-4374-bf8c-2a4502dc980f' | b'124 STAT. 2220 PUBLIC LAW 111\xe2\x80\x93203... | [0.014071382, -0.0044553108, 0.03662071, 0.035... | DODD-FRANK WALL STREET REFORM AND CONSUMER PRO... | 2010-07-21 |
| 991 | b'63a2235f-d368-43b8-a1a9-a5a11d497245' | b'124 STAT. 2221 PUBLIC LAW 111\xe2\x80\x93203... | [0.0005448305, 0.013075933, 0.044821188, 0.031... | DODD-FRANK WALL STREET REFORM AND CONSUMER PRO... | 2010-07-21 |
| 992 | b'bac4d75e-4867-4d89-a71e-09a6762bf3c4' | b'124 STAT. 2222 PUBLIC LAW 111\xe2\x80\x93203... | [0.032077603, 0.016817383, 0.04507993, 0.03376... | DODD-FRANK WALL STREET REFORM AND CONSUMER PRO... | 2010-07-21 |
| 993 | b'e262e4da-f6e1-4b9d-9232-77fc3f0c81a7' | b'124 STAT. 2223 PUBLIC LAW 111\xe2\x80\x93203... | [0.0387719, -0.025150038, 0.030345473, 0.04303... | DODD-FRANK WALL STREET REFORM AND CONSUMER PRO... | 2010-07-21 |

994 rows × 5 columns

| | document_id | text | embedding | sparseVectors | title | file_path |
|---|---|---|---|---|---|---|
| 0 | b'c26bcfc2-951e-40bd-959a-ae2b8edd2467' | b'At last year\\'s Jackson Hole symposium, I de... | [-0.035284244, 0.0753799, -0.022666411, -0.017... | {101: 1, 2012: 4, 2197: 1, 2095: 3, 1005: 3, 1... | inflation.txt | /content/data/inflation.txt |
| 1 | b'e4d97506-7118-49ae-87bf-47c41abe670c' | b\"On a 12-month basis, core PCE inflation peak... | [-0.04378559, 0.046354603, -0.030167095, 0.013... | {101: 1, 2006: 7, 1037: 5, 2260: 2, 1011: 6, 3... | inflation.txt | /content/data/inflation.txt |
| 2 | b'0014da23-8348-48af-ab56-c64ec48c47cc' | b'In the highly interest-sensitive housing sec... | [-0.07940253, 0.008506958, -0.035946056, -0.00... | {101: 1, 1999: 7, 1996: 21, 3811: 1, 3037: 2, ... | inflation.txt | /content/data/inflation.txt |
| 3 | b'1c00e107-b816-40d2-8445-a0ae707c2564' | b\"Getting inflation sustainably back down to 2... | [-0.046816133, 0.052543037, -0.038334284, -0.0... | {101: 1, 2893: 1, 14200: 2, 15770: 1, 8231: 1,... | inflation.txt | /content/data/inflation.txt |
| 4 | b'73ac6d92-a93f-4b5f-a8c6-8bba94068e3f' | b'While nominal wage growth must ultimately sl... | [-0.033225708, 0.037619803, -0.030979052, -0.0... | {101: 1, 2096: 1, 15087: 2, 11897: 4, 3930: 4,... | inflation.txt | /content/data/inflation.txt |
| 5 | b'0decb50d-966f-448f-a2a4-88dacc50a375' | b'Doing too little could allow above-target in... | [-0.042863447, 0.02854309, -0.030805789, -0.03... | {101: 1, 2725: 2, 2205: 2, 2210: 1, 2071: 2, 3... | inflation.txt | /content/data/inflation.txt |

| | score | text |
|---|---|---|
| 0 | 0.375000 | At last year's Jackson Hole symposium, I deliv... |
| 1 | 0.333333 | In my remaining comments, I will focus on core... |
| 2 | 0.291667 | Core goods prices fell the past two months, bu... |
| 3 | 0.250000 | Total hours worked has been flat over the past... |
| 4 | 0.200000 | Over time, restrictive monetary policy will he... |

| | score | text |
|---|---|---|
| 0 | 0.466667 | In my remaining comments, I will focus on core... |
| 1 | 0.325000 | Core goods prices fell the past two months, bu... |
| 2 | 0.275000 | At last year's Jackson Hole symposium, I deliv... |
| 3 | 0.200000 | Over time, restrictive monetary policy will he... |
| 4 | 0.183333 | Total hours worked has been flat over the past... |

| | score | text |
|---|---|---|
| 0 | 0.475000 | At last year's Jackson Hole symposium, I deliv... |
| 1 | 0.316667 | Total hours worked has been flat over the past... |
| 2 | 0.258333 | Core goods prices fell the past two months, bu... |
| 3 | 0.200000 | Over time, restrictive monetary policy will he... |
| 4 | 0.200000 | In my remaining comments, I will focus on core... |

| | document_id | text | embeddings | filename | file_path |
|---|---|---|---|---|---|
| 0 | b'97d9fa84-5ffe-4d1f-b83f-a397a480166c' | b'Niccol\xc3\xb2 di Bernardo dei Machiavelli (... | [-0.024216307, -0.013386093, 0.001253736, -0.0... | Machiavelli.txt | data/Machiavelli.txt |
| 1 | b'9f8d942e-d37b-4776-a982-c02ee524e871' | b\"Machiavelli's political realism has continue... | [-0.026074765, -0.008071378, -0.010988744, -0.... | Machiavelli.txt | data/Machiavelli.txt |
| 2 | b'f86edd6d-0c9f-43e3-a844-bc5d13048280' | b\"Shortly thereafter, he was also made the sec... | [-0.01654868, -0.003647899, 0.0055484013, -0.0... | Machiavelli.txt | data/Machiavelli.txt |
| 3 | b'd5ed7c93-f6d6-481a-86d3-eb4ef16c9d89' | b'The Florentine city-state and the republic w... | [-0.035484154, -0.0016756041, 0.00013820048, -... | Machiavelli.txt | data/Machiavelli.txt |
| 4 | b'09f6f434-8a1b-445c-9238-66aa75356fa3' | b'In 1789 George Nassau Clavering, and Pietro ... | [-0.02049841, -0.0031482982, 0.0036144697, -0.... | Machiavelli.txt | data/Machiavelli.txt |
| ... | ... | ... | ... | ... | ... |
| 124 | b'fd4746c1-3849-4b82-a5a5-26c8465bc0b3' | b'With the capability to render 3D actors and ... | [-0.016764004, -0.016048055, -0.0012655741, -0... | Video Game.txt | data/Video Game.txt |
| 125 | b'c29097ed-a8a2-424d-893b-9efd19361789' | b'A 2018 systematic review found evidence that... | [-0.024174018, -0.0026724807, 0.02956007, -0.0... | Video Game.txt | data/Video Game.txt |
| 126 | b'283ace6b-d8f8-47a0-9173-8c893c8e9a90' | b'Parents and children\\'s advocates regularly ... | [-0.019061007, -0.01754194, 0.009266318, -0.03... | Video Game.txt | data/Video Game.txt |
| 127 | b'72a289bb-6886-4825-8c4b-74d40e81b958' | b'A further issue in the industry is related t... | [-0.010214739, -0.03079727, 0.015305471, -0.03... | Video Game.txt | data/Video Game.txt |
| 128 | b'6ee7b25d-f958-4024-87c8-fcaf4d8b2d32' | b'== See also ==\\n\\nLists of video games\\nList... | [-0.014099253, -0.005158676, -0.0129085295, -0... | Video Game.txt | data/Video Game.txt |

129 rows × 5 columns

| | document_id | text | embeddings | filename | file_path |
|---|---|---|---|---|---|
| 0 | b'7fc6e810-c343-43d6-b301-af2914b429ca' | b'' | [-0.40234375, -0.06915283, -0.18640137, -0.089... | b'1.jpg' | b'data/1.jpg' |
| 1 | b'5bf41855-d981-4446-833f-75f5b1038f31' | b'' | [-0.10321045, -0.27416992, -0.064208984, -0.13... | b'2.jpg' | b'data/2.jpg' |
| 2 | b'987f467b-0fee-4127-8058-9d4955242bfd' | b'' | [0.2142334, -0.014846802, -0.21496582, -0.1887... | b'3.jpg' | b'data/3.jpg' |
| 3 | b'2bf05e51-d245-4dae-ab68-d59ce7962fc8' | b'' | [0.17700195, -0.05218506, -0.05999756, -0.4033... | b'4.jpg' | b'data/4.jpg' |
| 4 | b'85406f6c-38c9-46ee-9760-2ea449eb2bca' | b'' | [-0.17883301, 0.16442871, -0.34716797, -0.0219... | b'5.jpg' | b'data/5.jpg' |
| 5 | b'9affd42b-2ac2-4dcd-bf72-7b21f1fa8d86' | b'' | [-0.66259766, 0.3112793, 0.032714844, 0.219238... | b'6.jpg' | b'data/6.jpg' |
| 6 | b'bccdaede-f99c-4094-b1a9-470b0798b0cf' | b'' | [0.09362793, -0.20043945, 0.027572632, -0.0228... | b'7.jpg' | b'data/7.jpg' |
| 7 | b'ba58047a-ba93-4645-859f-83ef7e54912a' | b'' | [-0.026107788, -0.06652832, -0.007106781, -0.3... | b'8.jpg' | b'data/8.jpg' |
| 8 | b'4a8a5f65-8f70-4d1c-9f79-75a409c39bd4' | b'' | [-0.35302734, -0.06524658, -0.18603516, -0.509... | b'9.jpg' | b'data/9.jpg' |
| 9 | b'8ac68011-9285-46ab-9a1d-0abfd2e24a46' | b'' | [0.11975098, 0.19494629, -0.15234375, -0.21606... | b'10.jpg' | b'data/10.jpg' |
| 10 | b'4d5f59d0-41af-4acb-a5fb-1fba2bcbac3b' | b'' | [-0.22644043, -0.17614746, 0.06756592, -0.5668... | b'11.jpg' | b'data/11.jpg' |
| 11 | b'4e340214-2673-4464-8c21-c5fd706c5a94' | b'' | [-0.4272461, -0.009635925, -0.22509766, -0.047... | b'12.jpg' | b'data/12.jpg' |
| 12 | b'1e3a7605-8e19-47e4-a894-2ea6b78e6dbb' | b'' | [-0.090148926, -0.030685425, -0.296875, -0.246... | b'13.jpg' | b'data/13.jpg' |
| 13 | b'8e74924d-7b1e-4af5-953d-08270885229e' | b'' | [-0.25830078, -0.08703613, -0.2130127, -0.5092... | b'14.jpg' | b'data/14.jpg' |
| 14 | b'aa40b753-4633-4a7c-9b17-241b34ae2012' | b'' | [-0.35766602, -0.1550293, -0.3503418, -0.33764... | b'15.jpg' | b'data/15.jpg' |
| 15 | b'3215a096-4740-4c25-8814-44a0def22337' | b'' | [-0.39746094, 0.0010585785, 0.18469238, -0.244... | b'16.jpg' | b'data/16.jpg' |
| 16 | b'0b605bbb-b481-4d52-b6d9-3dbf5b80be22' | b'' | [-0.43652344, 0.3840332, -0.24523926, -0.02165... | b'17.jpg' | b'data/17.jpg' |
| 17 | b'6f94849b-e2e1-4a13-8879-0acce1d12439' | b'' | [-0.8803711, 0.013214111, -0.21557617, -0.25, ... | b'18.jpg' | b'data/18.jpg' |
| 18 | b'b2888ad4-bc74-4f01-9a7a-5f04e7787f96' | b'' | [-0.15161133, 0.19128418, -0.43139648, -0.4448... | b'19.jpg' | b'data/19.jpg' |
| 19 | b'2815cb0b-3a95-4aac-a30b-11abe0d3b065' | b'' | [-0.20056152, 0.12310791, 0.20739746, -0.21630... | b'20.jpg' | b'data/20.jpg' |
| 20 | b'8f0e5464-5aa6-4063-be41-02e55ff89fa5' | b'' | [-0.28833008, 0.06768799, -0.57177734, 0.16613... | b'21.jpg' | b'data/21.jpg' |
| 21 | b'64d517f9-54f3-45eb-8987-e986534eaee9' | b'' | [-0.076171875, -0.021621704, 0.28271484, -0.51... | b'22.jpg' | b'data/22.jpg' |
| 22 | b'82230793-6bed-4ac9-b37e-6b3801ffd10e' | b'' | [-0.27490234, -0.026290894, -0.07720947, -0.37... | b'23.jpg' | b'data/23.jpg' |
| 23 | b'0fa87b25-b4bb-4b63-b5db-dd68393baab8' | b'' | [-0.5776367, 0.091796875, 0.024261475, 0.10638... | b'24.jpg' | b'data/24.jpg' |
| 24 | b'7f05cd1a-6549-4678-8f21-b0b5dd2f1860' | b'' | [-0.023208618, -0.07775879, 0.22302246, -0.003... | b'25.jpg' | b'data/25.jpg' |
| 25 | b'3127662f-41ca-4b10-b158-68de606bad7e' | b'' | [-0.10839844, 0.38085938, -0.5332031, -0.08142... | b'26.jpg' | b'data/26.jpg' |
| 26 | b'6fde0c80-8600-4e6c-bd4b-572f45c51988' | b'' | [0.0021247864, -0.17321777, -0.13647461, -0.12... | b'27.jpg' | b'data/27.jpg' |
| 27 | b'37055b09-151c-4c9f-a313-cdfb84f6f6af' | b'' | [0.3269043, 0.42211914, -0.14086914, 0.0228881... | b'29.jpg' | b'data/29.jpg' |
| 28 | b'eb7b306e-ff44-401f-91a2-137b1a14d4ce' | b'' | [-0.12109375, 0.18664551, 0.03665161, -0.22521... | b'30.jpg' | b'data/30.jpg' |

| | document_id | text | embeddings |
|---|---|---|---|
| 0 | 60ab97dd-699e-4d4a-88ad-8af45252f889 | LLM In-Context Recall is Prompt Dependent\\n\\nD... | [-0.011131451, 0.027312648, 0.04182894, 0.0080... |
| 1 | eaee75ac-8d87-4aef-9229-2e23d5b4e4a4 | Table 1. LLMs evaluated with needle-in-a-hayst... | [-0.0037533694, 0.008636131, 0.04999082, -0.04... |
| 2 | fde41f83-0476-4003-9ad5-bcb794029a2a | Arxiv, April 2024, Preprint\\n\\n Machlab & Batt... | [0.0044781016, 0.002235319, 0.044452623, -0.04... |
| 3 | 170d4b26-1284-4bd6-9c92-eca6a795ad4e | Question\\n\\n- What did PistachioAI receive bef... | [-0.021702243, 0.0040557785, 0.028268011, -0.0... |
| 4 | f13d3190-1361-49f2-a919-ace65770124d | LLM In-Context Recall is Prompt Dependent\\n\\nA... | [-0.011283955, -0.0078016864, 0.02792849, 0.03... |
| ... | ... | ... | ... |
| 56 | 8344d6bc-b03f-4df2-baf8-70dc5a063b59 | The table appears to be related to a test cond... | [-0.046381496, -0.030222965, 0.082688674, -0.0... |
| 57 | b22eccd8-e134-4b48-af62-a54a062f9283 | The table appears to represent a series of mea... | [-0.014041103, -0.028811967, 0.073300235, -0.0... |
| 58 | 5e581faf-751f-47d6-b1c0-49b19d72f1e9 | The table appears to represent a series of num... | [-0.038328543, -0.039194115, 0.06691942, -0.02... |
| 59 | 27bc255e-bbc5-4e58-91bd-93332e5bebcf | The table appears to contain numerical data wi... | [-0.020661851, -0.009247683, 0.07606771, -0.02... |
| 60 | d4830a76-4ae2-4d49-854b-1abc4fe64cfd | The table appears to contain numerical data wi... | [-0.025765242, -0.019740395, 0.06763376, -0.01... |

61 rows × 3 columns

| | timestamp | close |
|---|---|---|
| 0 | 2024-01-01 09:30:00 | 0.473890 |
| 1 | 2024-01-01 09:30:01 | 0.474245 |
| 2 | 2024-01-01 09:30:02 | 0.473890 |
| 3 | 2024-01-01 09:30:03 | 0.473535 |
| 4 | 2024-01-01 09:30:04 | 0.473179 |
| ... | ... | ... |
| 9999995 | 2024-04-26 03:16:35 | 0.784014 |
| 9999996 | 2024-04-26 03:16:36 | 0.784369 |
| 9999997 | 2024-04-26 03:16:37 | 0.784014 |
| 9999998 | 2024-04-26 03:16:38 | 0.784369 |
| 9999999 | 2024-04-26 03:16:39 | 0.784014 |

10000000 rows × 2 columns

| | date | sym | time | price | size |
|---|---|---|---|---|---|
| 0 | 2024-08-19 | AAPL | 2024-08-19 09:30:00.001 | 218.00 | 46 |
| 1 | 2024-08-19 | AAPL | 2024-08-19 09:30:01.029 | 217.95 | 93 |
| 2 | 2024-08-19 | AAPL | 2024-08-19 09:30:01.061 | 217.90 | 80 |
| 3 | 2024-08-19 | AAPL | 2024-08-19 09:30:01.154 | 217.92 | 86 |
| 4 | 2024-08-19 | AAPL | 2024-08-19 09:30:01.265 | 217.83 | 67 |
| ... | ... | ... | ... | ... | ... |
| 9999995 | 2024-08-30 | TSLA | 2024-08-30 15:59:59.093 | 233.51 | 38 |
| 9999996 | 2024-08-30 | TSLA | 2024-08-30 15:59:59.249 | 233.56 | 24 |
| 9999997 | 2024-08-30 | TSLA | 2024-08-30 15:59:59.770 | 233.43 | 68 |
| 9999998 | 2024-08-30 | TSLA | 2024-08-30 15:59:59.824 | 233.47 | 50 |
| 9999999 | 2024-08-30 | TSLA | 2024-08-30 15:59:59.993 | 233.52 | 68 |

10000000 rows × 5 columns

| | index | time | sym | qty | price |
|---|---|---|---|---|---|
| 0 | 0 | 2024-02-19 00:00:23.408442735 | AAA | 8000 | 25.198061 |
| 1 | 1 | 2024-02-19 00:00:50.002746284 | AAA | 2000 | 25.589870 |
| 2 | 2 | 2024-02-19 00:01:13.951318860 | AAA | 4000 | 25.435139 |
| 3 | 3 | 2024-02-19 00:01:21.386703997 | AAA | 1000 | 25.378082 |
| 4 | 4 | 2024-02-19 00:01:48.257409185 | AAA | 8000 | 25.830731 |
| ... | ... | ... | ... | ... | ... |
| 49995 | 49995 | 2024-03-04 23:56:43.126595914 | BBB | 3000 | 17.611573 |
| 49996 | 49996 | 2024-03-04 23:56:49.295240789 | BBB | 3000 | 17.652760 |
| 49997 | 49997 | 2024-03-04 23:57:06.397743076 | BBB | 1000 | 17.215983 |
| 49998 | 49998 | 2024-03-04 23:59:19.743730723 | BBB | 10000 | 17.096576 |
| 49999 | 49999 | 2024-03-04 23:59:33.235208541 | BBB | 5000 | 17.468638 |

50000 rows × 5 columns

| | index | time | sym | qty | price |
|---|---|---|---|---|---|
| 0 | 0 | 2024-02-19 00:00:23.408442735 | AAA | 8000 | 25.198061 |
| 1 | 1 | 2024-02-19 00:00:50.002746284 | AAA | 2000 | 25.589870 |
| 2 | 2 | 2024-02-19 00:01:13.951318860 | AAA | 4000 | 25.435139 |
| 3 | 3 | 2024-02-19 00:01:21.386703997 | AAA | 1000 | 25.378082 |
| 4 | 4 | 2024-02-19 00:01:48.257409185 | AAA | 8000 | 25.830731 |
| ... | ... | ... | ... | ... | ... |
| 49995 | 49995 | 2024-03-04 23:56:43.126595914 | BBB | 3000 | 17.611573 |
| 49996 | 49996 | 2024-03-04 23:56:49.295240789 | BBB | 3000 | 17.652760 |
| 49997 | 49997 | 2024-03-04 23:57:06.397743076 | BBB | 1000 | 17.215983 |
| 49998 | 49998 | 2024-03-04 23:59:19.743730723 | BBB | 10000 | 17.096576 |
| 49999 | 49999 | 2024-03-04 23:59:33.235208541 | BBB | 5000 | 17.468638 |

50000 rows × 5 columns

| | index | time | sym | price |
|---|---|---|---|---|
| 0 | 0 | 2024-02-19 00:00:23.408442735 | AAA | [25.1980605549179, 25.58986971201375, 25.43513... |
| 1 | 1 | 2024-02-19 00:00:50.002746284 | AAA | [25.58986971201375, 25.43513912591152, 25.3780... |
| 2 | 2 | 2024-02-19 00:01:13.951318860 | AAA | [25.43513912591152, 25.37808242137544, 25.8307... |
| 3 | 3 | 2024-02-19 00:01:21.386703997 | AAA | [25.37808242137544, 25.830730823799968, 25.607... |
| 4 | 4 | 2024-02-19 00:01:48.257409185 | AAA | [25.830730823799968, 25.607446282170713, 26.07... |
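
In the table above, each `price` cell is a list whose leading values shift by one trade per row, which is consistent with overlapping sliding windows taken over the raw price series. The following is a minimal sketch of that idea in plain Python, not the notebook's own code: the helper name `sliding_windows` is hypothetical, and the window length of 3 is chosen only for illustration because the actual window size is not visible in this output.

```python
def sliding_windows(values, size):
    """Return every contiguous window of `size` items, advancing one item at a time."""
    if size <= 0:
        raise ValueError("size must be positive")
    # The last valid start index is len(values) - size, so shorter inputs yield no windows.
    return [values[i:i + size] for i in range(len(values) - size + 1)]


# First five prices from the trades table (rounded as displayed there).
prices = [25.198061, 25.589870, 25.435139, 25.378082, 25.830731]

# Window length 3 is an assumption made for this example only.
windows = sliding_windows(prices, size=3)

for start, window in enumerate(windows):
    print(start, window)
```

Each window starts one trade later than the previous one, so consecutive windows share all but one value, matching the overlap seen between rows 0 and 1 of the table.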

| | index | time | sym | qty | price | nnIdx | nnDist |
|---|---|---|---|---|---|---|---|
| 0 | 100 | 2024-02-19 01:22:33.092023730 | AAA | 8000 | 28.180955 | 100 | 0.000000 |
| 1 | 99 | 2024-02-19 01:20:12.802501022 | AAA | 2000 | 27.996566 | 99 | 4.527070 |
| 2 | 101 | 2024-02-19 01:22:40.782989859 | AAA | 2000 | 27.793489 | 101 | 4.532988 |
| 3 | 98 | 2024-02-19 01:19:35.593288242 | AAA | 9000 | 28.493217 | 98 | 6.594388 |
| 4 | 102 | 2024-02-19 01:24:08.510938882 | AAA | 7000 | 27.933083 | 102 | 6.612245 |
| 5 | 97 | 2024-02-19 01:19:32.584855556 | AAA | 6000 | 28.734376 | 97 | 7.984468 |
| 6 | 103 | 2024-02-19 01:24:39.233463853 | AAA | 9000 | 28.403252 | 103 | 8.011267 |
| 7 | 96 | 2024-02-19 01:19:27.989527434 | AAA | 1000 | 28.989396 | 96 | 9.142472 |
| 8 | 104 | 2024-02-19 01:26:20.481255054 | AAA | 9000 | 28.880188 | 104 | 9.172721 |
| 9 | 95 | 2024-02-19 01:19:09.146237969 | AAA | 7000 | 29.201223 | 95 | 10.124061 |

| | index | time | sym | qty | price | nnIdx | nnDist |
|---|---|---|---|---|---|---|---|
| 0 | 13075 | 2024-02-26 20:34:19.050180763 | AAA | 4000 | 59.715426 | 13075 | 59.125275 |
| 1 | 13074 | 2024-02-26 20:33:45.411256402 | AAA | 2000 | 59.351530 | 13074 | 59.114667 |
| 2 | 13076 | 2024-02-26 20:34:52.536420375 | AAA | 5000 | 59.826193 | 13076 | 59.111672 |
| 3 | 13073 | 2024-02-26 20:33:17.896016389 | AAA | 5000 | 59.138623 | 13073 | 59.107935 |
| 4 | 13077 | 2024-02-26 20:35:33.751646429 | AAA | 10000 | 60.131198 | 13077 | 59.097348 |
| 5 | 13072 | 2024-02-26 20:32:59.920857399 | AAA | 4000 | 58.847480 | 13072 | 59.091311 |
| 6 | 13078 | 2024-02-26 20:39:03.432464450 | AAA | 9000 | 59.749958 | 13078 | 59.074251 |
| 7 | 13071 | 2024-02-26 20:32:00.419369638 | AAA | 4000 | 59.093172 | 13071 | 59.071500 |
| 8 | 13070 | 2024-02-26 20:30:26.397550106 | AAA | 9000 | 58.770047 | 13070 | 59.057446 |
| 9 | 13079 | 2024-02-26 20:39:26.932638734 | AAA | 1000 | 60.111276 | 13079 | 59.056613 |

| | index | time | sym | qty | price | nnIdx | nnDist |
|---|---|---|---|---|---|---|---|
| 0 | 100 | 2024-02-19 01:22:33.092023730 | AAA | 8000 | 28.180955 | 100 | 0.000004 |
| 1 | 99 | 2024-02-19 01:20:12.802501022 | AAA | 2000 | 27.996566 | 99 | 2.060435 |
| 2 | 101 | 2024-02-19 01:22:40.782989859 | AAA | 2000 | 27.793489 | 101 | 2.140956 |
| 3 | 14907 | 2024-02-27 23:18:12.271115630 | AAA | 9000 | 58.904305 | 14907 | 2.369705 |
| 4 | 27249 | 2024-02-20 07:46:35.792305469 | BBB | 6000 | 15.817259 | 27249 | 2.375721 |
| 5 | 14906 | 2024-02-27 23:17:01.033421158 | AAA | 9000 | 59.331964 | 14906 | 2.385418 |
| 6 | 42517 | 2024-02-29 10:50:59.415277093 | BBB | 6000 | 10.757307 | 42517 | 2.388048 |
| 7 | 27250 | 2024-02-20 07:48:04.330449253 | BBB | 9000 | 15.494382 | 27250 | 2.431020 |
| 8 | 14905 | 2024-02-27 23:14:51.579989343 | AAA | 4000 | 59.699730 | 14905 | 2.434130 |
| 9 | 19547 | 2024-03-01 16:56:43.881322592 | AAA | 8000 | 77.814046 | 19547 | 2.459691 |

| | index | time | sym | qty | price | nnIdx | nnDist |
|---|---|---|---|---|---|---|---|
| 0 | 100 | 2024-02-19 01:22:33.092023730 | AAA | 8000 | 28.180955 | 100 | 0.000006 |
| 1 | 101 | 2024-02-19 01:22:40.782989859 | AAA | 2000 | 27.793489 | 101 | 2.973744 |
| 2 | 99 | 2024-02-19 01:20:12.802501022 | AAA | 2000 | 27.996566 | 99 | 2.989074 |
| 3 | 102 | 2024-02-19 01:24:08.510938882 | AAA | 7000 | 27.933083 | 102 | 4.320273 |
| 4 | 98 | 2024-02-19 01:19:35.593288242 | AAA | 9000 | 28.493217 | 98 | 4.364905 |
| 5 | 103 | 2024-02-19 01:24:39.233463853 | AAA | 9000 | 28.403252 | 103 | 5.199802 |
| 6 | 97 | 2024-02-19 01:19:32.584855556 | AAA | 6000 | 28.734376 | 97 | 5.293356 |
| 7 | 104 | 2024-02-19 01:26:20.481255054 | AAA | 9000 | 28.880188 | 104 | 5.912937 |
| 8 | 96 | 2024-02-19 01:19:27.989527434 | AAA | 1000 | 28.989396 | 96 | 6.068448 |
| 9 | 105 | 2024-02-19 01:27:28.738277703 | AAA | 6000 | 29.004960 | 105 | 6.461498 |

| | index | sym | time | price |
|---|---|---|---|---|
| 0 | 0 | AAA | 2024-02-19 00:00:23.408442735 | [25.1980605549179, 25.58986971201375, 25.43513... |
| 1 | 1 | AAA | 2024-02-19 00:00:50.002746284 | [25.58986971201375, 25.43513912591152, 25.3780... |
| 2 | 2 | AAA | 2024-02-19 00:01:13.951318860 | [25.43513912591152, 25.37808242137544, 25.8307... |
| 3 | 3 | AAA | 2024-02-19 00:01:21.386703997 | [25.37808242137544, 25.830730823799968, 25.607... |
| 4 | 4 | AAA | 2024-02-19 00:01:48.257409185 | [25.830730823799968, 25.607446282170713, 26.07... |

| | __nn_distance | index | sym | time | price |
|---|---|---|---|---|---|
| 0 | 0.000000 | 100 | AAA | 2024-02-19 01:22:33.092023730 | [28.18095506168902, 27.793488922994584, 27.933... |
| 1 | 83.772339 | 99 | AAA | 2024-02-19 01:20:12.802501022 | [27.996565851615742, 28.18095506168902, 27.793... |
| 2 | 83.809570 | 101 | AAA | 2024-02-19 01:22:40.782989859 | [27.793488922994584, 27.9330833053682, 28.4032... |
| 3 | 177.901764 | 98 | AAA | 2024-02-19 01:19:35.593288242 | [28.49321669829078, 27.996565851615742, 28.180... |
| 4 | 178.144348 | 102 | AAA | 2024-02-19 01:24:08.510938882 | [27.9330833053682, 28.403252121293917, 28.8801... |
| 5 | 260.937866 | 97 | AAA | 2024-02-19 01:19:32.584855556 | [28.73437575995922, 28.49321669829078, 27.9965... |
| 6 | 261.279205 | 103 | AAA | 2024-02-19 01:24:39.233463853 | [28.403252121293917, 28.880188381765038, 29.00... |
| 7 | 342.180145 | 96 | AAA | 2024-02-19 01:19:27.989527434 | [28.989396206568927, 28.73437575995922, 28.493... |
| 8 | 342.364197 | 104 | AAA | 2024-02-19 01:26:20.481255054 | [28.880188381765038, 29.00496045150794, 29.219... |
| 9 | 418.405457 | 105 | AAA | 2024-02-19 01:27:28.738277703 | [29.00496045150794, 29.21952441590838, 29.4126... |
|  | __nn_distance | index | sym | time | price |
|---|---|---|---|---|---|
| 0 | 7.105427e-14 | 100 | AAA | 2024-02-19 01:22:33.092023730 | [28.18095506168902, 27.793488922994584, 27.933... |
| 1 | 2.851702e-04 | 101 | AAA | 2024-02-19 01:22:40.782989859 | [27.793488922994584, 27.9330833053682, 28.4032... |
| 2 | 2.914894e-04 | 99 | AAA | 2024-02-19 01:20:12.802501022 | [27.996565851615742, 28.18095506168902, 27.793... |
| 3 | 1.118651e-03 | 98 | AAA | 2024-02-19 01:19:35.593288242 | [28.49321669829078, 27.996565851615742, 28.180... |
| 4 | 1.178602e-03 | 102 | AAA | 2024-02-19 01:24:08.510938882 | [27.9330833053682, 28.403252121293917, 28.8801... |
| 5 | 2.448621e-03 | 97 | AAA | 2024-02-19 01:19:32.584855556 | [28.73437575995922, 28.49321669829078, 27.9965... |
| 6 | 2.667162e-03 | 103 | AAA | 2024-02-19 01:24:39.233463853 | [28.403252121293917, 28.880188381765038, 29.00... |
| 7 | 4.248296e-03 | 96 | AAA | 2024-02-19 01:19:27.989527434 | [28.989396206568927, 28.73437575995922, 28.493... |
| 8 | 4.676504e-03 | 104 | AAA | 2024-02-19 01:26:20.481255054 | [28.880188381765038, 29.00496045150794, 29.219... |
| 9 | 6.496350e-03 | 95 | AAA | 2024-02-19 01:19:09.146237969 | [29.20122275315225, 28.989396206568927, 28.734... |
|  | vectors | sentences |
|---|---|---|
| 0 | [-0.059804268181324005, -0.09221810102462769, 0.058069996535778046, 0.06884294003248215, -0.0030452304054051638, 0.007304240483790636, -0.028959110379219055, 0.0595787838101387, 0.017943743616342545, 0.043501030653715134, 0.005012953653931618, -0.07875007390975952, -0.02570340223610401, -0.04147... | Draft version August 14, 2023\nTypeset using L ATEX default style in AASTeX631\nThe Galactic Interstellar Object Population: A Framework for Prediction and Inference\nMatthew J. Hopkins\n ,1Chris Lintott\n ,1Michele T. Bannister\n ,2J. |
| 1 | [-0.08154530823230743, -0.11342314630746841, 0.08425560593605042, 0.08849422633647919, 0.024723634123802185, -0.08773960173130035, -0.06128991022706032, 0.0106121264398098, 0.06387822329998016, 0.021902449429035187, -0.033225465565919876, -0.09321605414152145, -0.04425090178847313, -0.0568475760... | Ted Mackereth\n ,3, 4, 5, ∗and\nJohn C. Forbes\n2\n1Department of Physics, University of Oxford, Denys Wilkinson Building, Keble Road, Oxford, OX1 3RH, UK\n2School of Physical and Chemical Sciences—Te Kura Mat¯ u, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand\n3Just ... |
| 2 | [-0.07799109816551208, -0.08398666232824326, 0.02723710611462593, 0.09858055412769318, 0.010515090078115463, -0.01670389622449875, -0.055797528475522995, 0.027984360232949257, 0.03926650434732437, 0.03248238563537598, 0.03837994113564491, -0.12464361637830734, -0.012645130045711994, -0.018288377... | We define a novel framework: firstly to predict\nthe properties of this Galactic ISO population by combining models of processes across planetary\nand galactic scales, and secondly to make inferences about the processes modelled, by comparing the\npredicted population to what is observed. |
| 3 | [-0.05338575318455696, -0.037147365510463715, 0.07784582674503326, 0.05165949463844299, -0.07674478739500046, -0.007766904775053263, 0.008464669808745384, 0.04289115220308304, 0.04356798157095909, 0.06012833118438721, 0.04407043382525444, -0.10177762806415558, 0.030359704047441483, -0.0340371690... | We predict the spatial and compositional distribution of the\nGalaxy’s population of ISOs by modelling the Galactic stellar population with data from the APOGEE\nsurvey and combining this with a protoplanetary disk chemistry model. |
| 4 | [-0.06326982378959656, -0.01720457151532173, 0.047737207263708115, 0.08325360715389252, 0.03296627104282379, -0.0660020187497139, 0.0015915816184133291, 0.003246008651331067, 0.03456269949674606, -0.020524047315120697, 0.041225481778383255, -0.15274153649806976, -0.011771060526371002, 0.07314326... | Selecting ISO water mass\nfraction as an example observable quantity, we evaluate its distribution both at the position of the Sun\nand averaged over the Galactic disk; our prediction for the Solar neighbourhood is compatible with the\ninferred water mass fraction of 2I/Borisov. |
|  | sentences | vectors |
|---|---|---|
|  | vectors | sentences |
|---|---|---|
| 0 | [-0.059804268181324005, -0.09221810102462769, 0.058069996535778046, 0.06884294003248215, -0.0030452304054051638, 0.007304240483790636, -0.028959110379219055, 0.0595787838101387, 0.017943743616342545, 0.043501030653715134, 0.005012953653931618, -0.07875007390975952, -0.02570340223610401, -0.04147... | Draft version August 14, 2023\nTypeset using L ATEX default style in AASTeX631\nThe Galactic Interstellar Object Population: A Framework for Prediction and Inference\nMatthew J. Hopkins\n ,1Chris Lintott\n ,1Michele T. Bannister\n ,2J. |
| 1 | [-0.08154530823230743, -0.11342314630746841, 0.08425560593605042, 0.08849422633647919, 0.024723634123802185, -0.08773960173130035, -0.06128991022706032, 0.0106121264398098, 0.06387822329998016, 0.021902449429035187, -0.033225465565919876, -0.09321605414152145, -0.04425090178847313, -0.0568475760... | Ted Mackereth\n ,3, 4, 5, ∗and\nJohn C. Forbes\n2\n1Department of Physics, University of Oxford, Denys Wilkinson Building, Keble Road, Oxford, OX1 3RH, UK\n2School of Physical and Chemical Sciences—Te Kura Mat¯ u, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand\n3Just ... |
| 2 | [-0.07799109816551208, -0.08398666232824326, 0.02723710611462593, 0.09858055412769318, 0.010515090078115463, -0.01670389622449875, -0.055797528475522995, 0.027984360232949257, 0.03926650434732437, 0.03248238563537598, 0.03837994113564491, -0.12464361637830734, -0.012645130045711994, -0.018288377... | We define a novel framework: firstly to predict\nthe properties of this Galactic ISO population by combining models of processes across planetary\nand galactic scales, and secondly to make inferences about the processes modelled, by comparing the\npredicted population to what is observed. |
| 3 | [-0.05338575318455696, -0.037147365510463715, 0.07784582674503326, 0.05165949463844299, -0.07674478739500046, -0.007766904775053263, 0.008464669808745384, 0.04289115220308304, 0.04356798157095909, 0.06012833118438721, 0.04407043382525444, -0.10177762806415558, 0.030359704047441483, -0.0340371690... | We predict the spatial and compositional distribution of the\nGalaxy’s population of ISOs by modelling the Galactic stellar population with data from the APOGEE\nsurvey and combining this with a protoplanetary disk chemistry model. |
| 4 | [-0.06326982378959656, -0.01720457151532173, 0.047737207263708115, 0.08325360715389252, 0.03296627104282379, -0.0660020187497139, 0.0015915816184133291, 0.003246008651331067, 0.03456269949674606, -0.020524047315120697, 0.041225481778383255, -0.15274153649806976, -0.011771060526371002, 0.07314326... | Selecting ISO water mass\nfraction as an example observable quantity, we evaluate its distribution both at the position of the Sun\nand averaged over the Galactic disk; our prediction for the Solar neighbourhood is compatible with the\ninferred water mass fraction of 2I/Borisov. |
|  | __nn_distance | vectors | sentences |
|---|---|---|---|
| 0 | 0.678681 | [-0.08154530823230743, -0.11342314630746841, 0.08425560593605042, 0.08849422633647919, 0.024723634123802185, -0.08773960173130035, -0.06128991022706032, 0.0106121264398098, 0.06387822329998016, 0.021902449429035187, -0.033225465565919876, -0.09321605414152145, -0.04425090178847313, -0.0568475760... | Ted Mackereth\n ,3, 4, 5, ∗and\nJohn C. Forbes\n2\n1Department of Physics, University of Oxford, Denys Wilkinson Building, Keble Road, Oxford, OX1 3RH, UK\n2School of Physical and Chemical Sciences—Te Kura Mat¯ u, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand\n3Just ... |
| 1 | 0.665591 | [-0.0868825614452362, -0.023635495454072952, 0.06744848191738129, 0.08018778264522552, -0.05974714457988739, -0.03222731873393059, -0.008770040236413479, 0.04886192828416824, -0.05181043595075607, 0.017594708129763603, 0.024309009313583374, -0.09147438406944275, 0.038199249655008316, -0.06181899... | In this work, we develop\nthis method and apply it to the stellar population of the Milky Way, estimated with data from the APOGEE survey, to\npredict a broader set of properties of our own Galaxy’s population of interstellar objects. |
| 2 | 0.629392 | [-0.07292614132165909, 0.021450141444802284, 0.01660231687128544, 0.05125197023153305, -0.017564907670021057, -0.08743879944086075, -0.020641718059778214, 0.037198327481746674, -0.024873124435544014, 0.039150986820459366, 0.043217096477746964, -0.1320355087518692, -0.033658064901828766, -0.06451... | Keywords: Interstellar objects (52), Small Solar System bodies(1469), Galaxy Evolution (594)\n1.INTRODUCTION\n1I/‘Oumuamua (Meech et al. |
|  | __nn_distance | vectors | sentences |
|---|---|---|---|
| 0 | 0.546382 | [-0.07665147632360458, -0.06582025438547134, 0.034305740147829056, 0.026705816388130188, 0.07752171903848648, -0.05098922178149223, 0.007996230386197567, 0.023463979363441467, 0.09635236114263535, 0.05890350416302681, -0.009348639287054539, -0.04947573319077492, 0.04072212800383568, -0.086648009... | The pop-\nulation’s dominant dynamical formation mechanisms would preferentially harvest more distant, ice-rich planetesimals\nfrom the disks of the source systems. |
| 1 | 0.533592 | [-0.011731946840882301, -0.06267537921667099, 0.08392807841300964, -0.036336500197649, -0.0021244140807539225, -0.05082397535443306, -0.00048589325160719454, -0.02759421430528164, 0.13681785762310028, 0.06662701815366745, -0.02651246450841427, -0.019386690109968185, 0.0160849429666996, -0.083458... | A protoplanetary disk has to first order the same composition as the star it forms around,\nsince they both form from the same molecular cloud core. |
| 2 | 0.522974 | [-0.0878974199295044, -0.053996648639440536, 0.08170221745967865, 0.05057608336210251, -0.001267113140784204, -0.051809072494506836, -0.03642294555902481, 0.014396563172340393, 0.10822276026010513, 0.043686095625162125, -0.07996439933776855, -0.07062527537345886, 0.05373870208859444, -0.06110462... | While in reality, stars will each produce a distribution of ISOs that\nformed at different positions in their protoplanetary disk and thus have a range of compositions, this simplification\nof only modelling planetesimals which form exterior to the water ice line is justified by the proportional... |
|  | ReleaseYear | Title | Origin | Director | Cast | Genre | Plot | embeddings |
|---|---|---|---|---|---|---|---|---|
| 0 | 1975 | The Candy Tangerine Man | American | Matt Cimber | John Daniels Eli Haines Tom Hankason | action | A successful Los Angeles-based businessperson ... | [-0.06835174, -0.013138616, -0.12417501, 0.002... |
| 1 | 1975 | Capone | American | Steve Carver | Ben Gazzara Susan Blakely John Cassavetes Sylv... | crime drama | The story is of the rise and fall of the Chica... | [-0.01411798, 0.040705115, -0.0014280609, 0.00... |
| 2 | 1975 | Cleopatra Jones and the Casino of Gold | American | Charles Bail | Tamara Dobson Stella Stevens | action | The story begins with two government agents Ma... | [-0.0925895, 0.01188509, -0.08999529, -0.01541... |
| 3 | 1975 | Conduct Unbecoming | American | Michael Anderson | Stacy Keach Richard Attenborough Christopher P... | drama | Around 1880 two young British officers arrive ... | [-0.07435084, -0.06386179, 0.017042944, 0.0288... |
| 4 | 1975 | Cooley High | American | Michael Schultz | Lawrence Hilton-Jacobs Glynn Turman Garrett Mo... | comedy | Set in 1964 Chicago Preach an aspiring playwri... | [-0.041632336, 0.037923656, -0.072276264, -0.0... |
|  | ReleaseYear | Title | Origin | Director | Cast | Genre | Plot | embeddings |
|---|---|---|---|---|---|---|---|---|
| 0 | 1975 | The Candy Tangerine Man | American | Matt Cimber | John Daniels Eli Haines Tom Hankason | action | A successful Los Angeles-based businessperson ... | [-0.06835173815488815, -0.01313861645758152, -... |
| 1 | 1975 | Capone | American | Steve Carver | Ben Gazzara Susan Blakely John Cassavetes Sylv... | crime drama | The story is of the rise and fall of the Chica... | [-0.014117980375885963, 0.0407051146030426, -0... |
| 2 | 1975 | Cleopatra Jones and the Casino of Gold | American | Charles Bail | Tamara Dobson Stella Stevens | action | The story begins with two government agents Ma... | [-0.09258949756622314, 0.011885089799761772, -... |
| 3 | 1975 | Conduct Unbecoming | American | Michael Anderson | Stacy Keach Richard Attenborough Christopher P... | drama | Around 1880 two young British officers arrive ... | [-0.07435084134340286, -0.06386178731918335, 0... |
| 4 | 1975 | Cooley High | American | Michael Schultz | Lawrence Hilton-Jacobs Glynn Turman Garrett Mo... | comedy | Set in 1964 Chicago Preach an aspiring playwri... | [-0.041632335633039474, 0.0379236564040184, -0... |
|  | ID | chunk | dense | sparse |
|---|---|---|---|---|
| 0 | 0 | At last year's Jackson Hole symposium, I deliv... | [-0.022856269031763077, -0.02936530113220215, ... | {2012: 2, 2197: 1, 2095: 3, 1005: 2, 1055: 2, ... |
| 1 | 1 | are confident that inflation is moving sustain... | [0.011283153668045998, -0.030178586021065712, ... | {2024: 1, 9657: 1, 2008: 1, 14200: 1, 2003: 1,... |
| 2 | 2 | Today I will review our progress so far and di... | [-0.03170400112867355, 0.01769343577325344, 0.... | {2651: 1, 1045: 2, 2097: 2, 3319: 1, 2256: 2, ... |
| 3 | 3 | The Decline in Inflation So Far | [0.003466668538749218, 0.007666163146495819, -... | {1996: 1, 6689: 1, 1999: 1, 14200: 1, 2061: 1,... |
| 4 | 4 | The ongoing episode of high inflation initiall... | [-0.02072943188250065, -0.055148035287857056, ... | {1996: 7, 7552: 1, 2792: 1, 1997: 4, 2152: 1, ... |
|  | ID | chunk |
|---|---|---|
| 0 | 9 | coming quarters. Twelve-month core inflation is still elevated, and there is substantial further ground to cover to get back to price stability. |
| 1 | 35 | That assessment is further complicated by uncertainty about the duration of the lags with which monetary tightening affects economic activity and especially inflation. Since the symposium a year ago, the Committee has raised the policy rate by 300 basis points, including 100 basis points over the past seven months. And we have substantially reduced the size of our securities holdings. The wide range of estimates of these lags suggests that there may be significant further drag in the pipeline. |
| 2 | 29 | Total hours worked has been flat over the past six months, and the average workweek has declined to the lower end of its pre-pandemic range, reflecting a gradual normalization in labor market conditions (figure 5). |
| 3 | 23 | Restrictive monetary policy has tightened financial conditions, supporting the expectation of below-trend growth.5 Since last year's symposium, the two-year real yield is up about 250 basis points, and longer-term real yields are higher as well—by nearly 150 basis points.6 Beyond changes in interest rates, bank lending standards have tightened, and loan growth has slowed sharply.7 Such a tightening of broad financial conditions typically contributes to a slowing in the growth of economic |
| 4 | 24 | activity, and there is evidence of that in this cycle as well. For example, growth in industrial production has slowed, and the amount spent on residential investment has declined in each of the past five quarters (figure 4). |
|  | ID | chunk |
|---|---|---|
| 0 | 14 | Similar dynamics are playing out for core goods inflation overall. As they do, the effects of monetary restraint should show through more fully over time. Core goods prices fell the past two months, but on a 12-month basis, core goods inflation remains well above its pre-pandemic level. Sustained progress is needed, and restrictive monetary policy is called for to achieve that progress. |
| 1 | 8 | On a 12-month basis, core PCE inflation peaked at 5.4 percent in February 2022 and declined gradually to 4.3 percent in July (figure 1, panel B). The lower monthly readings for core inflation in June and July were welcome, but two months of good data are only the beginning of what it will take to build confidence that inflation is moving down sustainably toward our goal. We can't yet know the extent to which these lower readings will continue or where underlying inflation will settle over |
| 2 | 6 | On a 12-month basis, U.S. total, or "headline," PCE (personal consumption expenditures) inflation peaked at 7 percent in June 2022 and declined to 3.3 percent as of July, following a trajectory roughly in line with global trends (figure 1, panel A).1 The effects of Russia's war against Ukraine have been a primary driver of the changes in headline inflation around the world since early 2022. Headline inflation is what households and businesses experience most directly, so this decline is very |
| 3 | 9 | coming quarters. Twelve-month core inflation is still elevated, and there is substantial further ground to cover to get back to price stability. |
| 4 | 23 | Restrictive monetary policy has tightened financial conditions, supporting the expectation of below-trend growth.5 Since last year's symposium, the two-year real yield is up about 250 basis points, and longer-term real yields are higher as well—by nearly 150 basis points.6 Beyond changes in interest rates, bank lending standards have tightened, and loan growth has slowed sharply.7 Such a tightening of broad financial conditions typically contributes to a slowing in the growth of economic |
|  | ID | chunk |
|---|---|---|
| 0 | 9 | coming quarters. Twelve-month core inflation is still elevated, and there is substantial further ground to cover to get back to price stability. |
| 1 | 14 | Similar dynamics are playing out for core goods inflation overall. As they do, the effects of monetary restraint should show through more fully over time. Core goods prices fell the past two months, but on a 12-month basis, core goods inflation remains well above its pre-pandemic level. Sustained progress is needed, and restrictive monetary policy is called for to achieve that progress. |
| 2 | 23 | Restrictive monetary policy has tightened financial conditions, supporting the expectation of below-trend growth.5 Since last year's symposium, the two-year real yield is up about 250 basis points, and longer-term real yields are higher as well—by nearly 150 basis points.6 Beyond changes in interest rates, bank lending standards have tightened, and loan growth has slowed sharply.7 Such a tightening of broad financial conditions typically contributes to a slowing in the growth of economic |
| 3 | 8 | On a 12-month basis, core PCE inflation peaked at 5.4 percent in February 2022 and declined gradually to 4.3 percent in July (figure 1, panel B). The lower monthly readings for core inflation in June and July were welcome, but two months of good data are only the beginning of what it will take to build confidence that inflation is moving down sustainably toward our goal. We can't yet know the extent to which these lower readings will continue or where underlying inflation will settle over |
| 4 | 35 | That assessment is further complicated by uncertainty about the duration of the lags with which monetary tightening affects economic activity and especially inflation. Since the symposium a year ago, the Committee has raised the policy rate by 300 basis points, including 100 basis points over the past seven months. And we have substantially reduced the size of our securities holdings. The wide range of estimates of these lags suggests that there may be significant further drag in the pipeline. |
|  | ID | chunk |
|---|---|---|
| 0 | 14 | Similar dynamics are playing out for core goods inflation overall. As they do, the effects of monetary restraint should show through more fully over time. Core goods prices fell the past two months, but on a 12-month basis, core goods inflation remains well above its pre-pandemic level. Sustained progress is needed, and restrictive monetary policy is called for to achieve that progress. |
| 1 | 8 | On a 12-month basis, core PCE inflation peaked at 5.4 percent in February 2022 and declined gradually to 4.3 percent in July (figure 1, panel B). The lower monthly readings for core inflation in June and July were welcome, but two months of good data are only the beginning of what it will take to build confidence that inflation is moving down sustainably toward our goal. We can't yet know the extent to which these lower readings will continue or where underlying inflation will settle over |
| 2 | 9 | coming quarters. Twelve-month core inflation is still elevated, and there is substantial further ground to cover to get back to price stability. |
| 3 | 6 | On a 12-month basis, U.S. total, or "headline," PCE (personal consumption expenditures) inflation peaked at 7 percent in June 2022 and declined to 3.3 percent as of July, following a trajectory roughly in line with global trends (figure 1, panel A).1 The effects of Russia's war against Ukraine have been a primary driver of the changes in headline inflation around the world since early 2022. Headline inflation is what households and businesses experience most directly, so this decline is very |
| 4 | 23 | Restrictive monetary policy has tightened financial conditions, supporting the expectation of below-trend growth.5 Since last year's symposium, the two-year real yield is up about 250 basis points, and longer-term real yields are higher as well—by nearly 150 basis points.6 Beyond changes in interest rates, bank lending standards have tightened, and loan growth has slowed sharply.7 Such a tightening of broad financial conditions typically contributes to a slowing in the growth of economic |
|  | ID | chunk |
|---|---|---|
| 0 | 9 | coming quarters. Twelve-month core inflation is still elevated, and there is substantial further ground to cover to get back to price stability. |
| 1 | 35 | That assessment is further complicated by uncertainty about the duration of the lags with which monetary tightening affects economic activity and especially inflation. Since the symposium a year ago, the Committee has raised the policy rate by 300 basis points, including 100 basis points over the past seven months. And we have substantially reduced the size of our securities holdings. The wide range of estimates of these lags suggests that there may be significant further drag in the pipeline. |
| 2 | 29 | Total hours worked has been flat over the past six months, and the average workweek has declined to the lower end of its pre-pandemic range, reflecting a gradual normalization in labor market conditions (figure 5). |
| 3 | 23 | Restrictive monetary policy has tightened financial conditions, supporting the expectation of below-trend growth.5 Since last year's symposium, the two-year real yield is up about 250 basis points, and longer-term real yields are higher as well—by nearly 150 basis points.6 Beyond changes in interest rates, bank lending standards have tightened, and loan growth has slowed sharply.7 Such a tightening of broad financial conditions typically contributes to a slowing in the growth of economic |
| 4 | 24 | activity, and there is evidence of that in this cycle as well. For example, growth in industrial production has slowed, and the amount spent on residential investment has declined in each of the past five quarters (figure 4). |
|  | ID | chunk |
|---|---|---|
| 0 | 9 | coming quarters. Twelve-month core inflation is still elevated, and there is substantial further ground to cover to get back to price stability. |
| 1 | 35 | That assessment is further complicated by uncertainty about the duration of the lags with which monetary tightening affects economic activity and especially inflation. Since the symposium a year ago, the Committee has raised the policy rate by 300 basis points, including 100 basis points over the past seven months. And we have substantially reduced the size of our securities holdings. The wide range of estimates of these lags suggests that there may be significant further drag in the pipeline. |
| 2 | 29 | Total hours worked has been flat over the past six months, and the average workweek has declined to the lower end of its pre-pandemic range, reflecting a gradual normalization in labor market conditions (figure 5). |
| 3 | 23 | Restrictive monetary policy has tightened financial conditions, supporting the expectation of below-trend growth.5 Since last year's symposium, the two-year real yield is up about 250 basis points, and longer-term real yields are higher as well—by nearly 150 basis points.6 Beyond changes in interest rates, bank lending standards have tightened, and loan growth has slowed sharply.7 Such a tightening of broad financial conditions typically contributes to a slowing in the growth of economic |
| 4 | 24 | activity, and there is evidence of that in this cycle as well. For example, growth in industrial production has slowed, and the amount spent on residential investment has declined in each of the past five quarters (figure 4). |
|  | source | class | embedding |
|---|---|---|---|
| 0 | data/glioma_tumor/glioma_tumor_0.png | glioma_tumor | [0.0, 1.3172649145126343, 0.20154666900634766,... |
| 1 | data/glioma_tumor/glioma_tumor_1.png | glioma_tumor | [0.10450763255357742, 0.559810221195221, 0.870... |
| 2 | data/glioma_tumor/glioma_tumor_10.png | glioma_tumor | [0.055571720004081726, 1.653620958328247, 1.16... |
| 3 | data/glioma_tumor/glioma_tumor_11.png | glioma_tumor | [0.7401718497276306, 0.6310665607452393, 0.324... |
| 4 | data/glioma_tumor/glioma_tumor_12.png | glioma_tumor | [0.21819375455379486, 0.19898559153079987, 0.0... |
|  | u0 | u1 |
|---|---|---|
| 0 | 11.201051 | 8.133024 |
| 1 | 16.847599 | 13.015171 |
| 2 | 11.072457 | 8.326119 |
| 3 | 11.330195 | 5.185107 |
| 4 | 11.302164 | 8.262789 |
|  | source | class | embedding |
|---|---|---|---|
| 0 | data/glioma_tumor/glioma_tumor_0.png | glioma_tumor | [0.0, 1.3172649145126343, 0.20154666900634766,... |
| 1 | data/glioma_tumor/glioma_tumor_1.png | glioma_tumor | [0.10450763255357742, 0.559810221195221, 0.870... |
| 2 | data/glioma_tumor/glioma_tumor_10.png | glioma_tumor | [0.055571720004081726, 1.653620958328247, 1.16... |
| 3 | data/glioma_tumor/glioma_tumor_11.png | glioma_tumor | [0.7401718497276306, 0.6310665607452393, 0.324... |
| 4 | data/glioma_tumor/glioma_tumor_12.png | glioma_tumor | [0.21819375455379486, 0.19898559153079987, 0.0... |
| ... | ... | ... | ... |
| 389 | data/pituitary_tumor/pituitary_tumor_71.png | pituitary_tumor | [0.1535000056028366, 1.7610539197921753, 1.467... |
| 390 | data/pituitary_tumor/pituitary_tumor_72.png | pituitary_tumor | [0.1535000056028366, 1.7610539197921753, 1.467... |
| 391 | data/pituitary_tumor/pituitary_tumor_73.png | pituitary_tumor | [0.0, 1.1767886877059937, 1.2333405017852783, ... |
| 392 | data/pituitary_tumor/pituitary_tumor_8.png | pituitary_tumor | [0.367981880903244, 0.07278978824615479, 0.017... |
| 393 | data/pituitary_tumor/pituitary_tumor_9.png | pituitary_tumor | [0.4310145080089569, 0.5233449935913086, 0.163... |
394 rows × 3 columns
|  | source | class | embedding |
|---|---|---|---|
| 0 | data/glioma_tumor/glioma_tumor_0.png | glioma_tumor | [0.0, 1.3172649145126343, 0.20154666900634766,... |
| 1 | data/glioma_tumor/glioma_tumor_1.png | glioma_tumor | [0.10450763255357742, 0.559810221195221, 0.870... |
| 2 | data/glioma_tumor/glioma_tumor_10.png | glioma_tumor | [0.055571720004081726, 1.653620958328247, 1.16... |
| 3 | data/glioma_tumor/glioma_tumor_11.png | glioma_tumor | [0.7401718497276306, 0.6310665607452393, 0.324... |
| 4 | data/glioma_tumor/glioma_tumor_12.png | glioma_tumor | [0.21819375455379486, 0.19898559153079987, 0.0... |
| ... | ... | ... | ... |
| 95 | data/glioma_tumor/glioma_tumor_95.png | glioma_tumor | [0.6331982016563416, 0.5156578421592712, 0.037... |
| 96 | data/glioma_tumor/glioma_tumor_96.png | glioma_tumor | [1.1890076398849487, 0.10027773678302765, 0.25... |
| 97 | data/glioma_tumor/glioma_tumor_97.png | glioma_tumor | [0.528931736946106, 0.37649887800216675, 0.011... |
| 98 | data/glioma_tumor/glioma_tumor_98.png | glioma_tumor | [1.0254501104354858, 0.8127456903457642, 0.083... |
| 99 | data/glioma_tumor/glioma_tumor_99.png | glioma_tumor | [0.055571720004081726, 1.653620958328247, 1.16... |
100 rows × 3 columns

| | __nn_distance | source | class | embedding |
|---|---|---|---|---|
| 0 | 0.000000 | data/glioma_tumor/glioma_tumor_45.png | glioma_tumor | [0.6581199169158936, 0.7380955219268799, 0.374... |
| 1 | 655.002991 | data/glioma_tumor/glioma_tumor_71.png | glioma_tumor | [0.5400809645652771, 0.6335983872413635, 0.327... |
| 2 | 695.868774 | data/glioma_tumor/glioma_tumor_61.png | glioma_tumor | [1.1261733770370483, 1.0511531829833984, 0.607... |
| 3 | 709.470703 | data/glioma_tumor/glioma_tumor_95.png | glioma_tumor | [0.6331982016563416, 0.5156578421592712, 0.037... |
| 4 | 714.166199 | data/glioma_tumor/glioma_tumor_68.png | glioma_tumor | [0.5686506032943726, 1.2953143119812012, 0.092... |
| 5 | 716.417480 | data/glioma_tumor/glioma_tumor_41.png | glioma_tumor | [0.14557020366191864, 0.7579463124275208, 1.33... |
| 6 | 734.459229 | data/glioma_tumor/glioma_tumor_46.png | glioma_tumor | [1.7890287637710571, 0.47240081429481506, 0.27... |
| 7 | 751.183411 | data/glioma_tumor/glioma_tumor_12.png | glioma_tumor | [0.21819375455379486, 0.19898559153079987, 0.0... |
| 8 | 761.760010 | data/glioma_tumor/glioma_tumor_15.png | glioma_tumor | [0.7461263537406921, 1.31121826171875, 0.50695... |

| | ReleaseYear | Title | Origin | Director | Cast | Genre | Plot | embeddings |
|---|---|---|---|---|---|---|---|---|
| 0 | 1975 | The Candy Tangerine Man | American | Matt Cimber | John Daniels Eli Haines Tom Hankason | action | A successful Los Angeles-based businessperson ... | [-0.06835174, -0.013138616, -0.12417501, 0.002... |
| 1 | 1975 | Capone | American | Steve Carver | Ben Gazzara Susan Blakely John Cassavetes Sylv... | crime drama | The story is of the rise and fall of the Chica... | [-0.01411798, 0.040705115, -0.0014280609, 0.00... |
| 2 | 1975 | Cleopatra Jones and the Casino of Gold | American | Charles Bail | Tamara Dobson Stella Stevens | action | The story begins with two government agents Ma... | [-0.0925895, 0.01188509, -0.08999529, -0.01541... |
| 3 | 1975 | Conduct Unbecoming | American | Michael Anderson | Stacy Keach Richard Attenborough Christopher P... | drama | Around 1880 two young British officers arrive ... | [-0.07435084, -0.06386179, 0.017042944, 0.0288... |
| 4 | 1975 | Cooley High | American | Michael Schultz | Lawrence Hilton-Jacobs Glynn Turman Garrett Mo... | comedy | Set in 1964 Chicago Preach an aspiring playwri... | [-0.041632336, 0.037923656, -0.072276264, -0.0... |

| | ReleaseYear | Title | Origin | Director | Cast | Genre | Plot | embeddings |
|---|---|---|---|---|---|---|---|---|
| 0 | 1975 | b'The Candy Tangerine Man' | American | b'Matt Cimber' | b'John Daniels Eli Haines Tom Hankason' | action | b'A successful Los Angeles-based businessperso... | [-0.06835173815488815, -0.01313861645758152, -... |
| 1 | 1975 | b'Capone' | American | b'Steve Carver' | b'Ben Gazzara Susan Blakely John Cassavetes Sy... | crime drama | b'The story is of the rise and fall of the Chi... | [-0.014117980375885963, 0.0407051146030426, -0... |
| 2 | 1975 | b'Cleopatra Jones and the Casino of Gold' | American | b'Charles Bail' | b'Tamara Dobson Stella Stevens' | action | b'The story begins with two government agents ... | [-0.09258949756622314, 0.011885089799761772, -... |
| 3 | 1975 | b'Conduct Unbecoming' | American | b'Michael Anderson' | b'Stacy Keach Richard Attenborough Christopher... | drama | b'Around 1880 two young British officers arriv... | [-0.07435084134340286, -0.06386178731918335, 0... |
| 4 | 1975 | b'Cooley High' | American | b'Michael Schultz' | b'Lawrence Hilton-Jacobs Glynn Turman Garrett ... | comedy | b'Set in 1964 Chicago Preach an aspiring playw... | [-0.041632335633039474, 0.0379236564040184, -0... |

| | path | media_type | embeddings |
|---|---|---|---|

| | path | media_type | embeddings |
|---|---|---|---|
| 0 | /content/data/images/deer1.jpg | image | [0.036132812, -0.0051574707, 0.05053711, 0.042... |
| 1 | /content/data/images/hedgehog1.jpg | image | [-0.004119873, 0.00491333, 0.009094238, 0.0177... |
| 2 | /content/data/images/fox2.jpg | image | [0.004272461, 0.005432129, 0.038085938, 0.0217... |
| 3 | /content/data/images/deer2.jpg | image | [0.03564453, -0.0017547607, 0.057617188, 0.041... |
| 4 | /content/data/images/bear2.jpg | image | [0.033691406, 0.00051116943, 0.040527344, 0.00... |
| 5 | /content/data/images/caterpillar1.jpg | image | [0.039794922, -0.049072266, 0.0065307617, 0.03... |
| 6 | /content/data/images/caterpillar2.jpg | image | [0.053466797, 0.001914978, 0.044677734, 0.0069... |
| 7 | /content/data/images/hedgehog2.jpg | image | [0.032470703, 0.014465332, 0.003692627, 0.0412... |
| 8 | /content/data/images/bear1.jpg | image | [0.036865234, 0.013793945, 0.048095703, 0.0216... |
| 9 | /content/data/images/fox1.jpg | image | [-0.009765625, 0.01928711, 0.030761719, 0.0274... |
| 10 | /content/data/images/bat1.jpg | image | [0.005859375, -0.033203125, -0.0072631836, 0.0... |
| 11 | /content/data/images/bat2.jpg | image | [0.03564453, -0.026367188, -0.016723633, 0.021... |
| 12 | /content/data/text/caterpillar.txt | text | [0.002532959, 0.011779785, 0.001083374, -0.027... |
| 13 | /content/data/text/fox.txt | text | [-0.016601562, 0.029418945, 0.040283203, 0.042... |
| 14 | /content/data/text/bat.txt | text | [-0.044433594, -0.011474609, 0.02319336, 0.026... |
| 15 | /content/data/text/deer.txt | text | [0.005706787, 0.007659912, 0.041015625, 0.0610... |
| 16 | /content/data/text/hedgehog.txt | text | [0.010070801, -0.006072998, 0.017333984, 0.048... |
| 17 | /content/data/text/bear.txt | text | [0.025634766, 0.02319336, 0.052978516, 0.02856... |

| | path | media_type | text | embeddings |
|---|---|---|---|---|
| 0 | ./data/text/bat.txt | text | Bats are the only mammals capable of sustained... | [0.043051157146692276, 0.05017940700054169, 0.... |
| 1 | ./data/text/bear.txt | text | Bears are large mammals with a stocky body, po... | [0.052125297486782074, 0.0072755659930408, 0.0... |
| 2 | ./data/text/deer.txt | text | Deer are hoofed mammals known for their gracef... | [0.07531881332397461, 0.03122134692966938, 0.0... |
| 3 | ./data/text/caterpillar.txt | text | Caterpillars are the larval stage of butterfli... | [0.05927688628435135, -0.017723916098475456, 0... |
| 4 | ./data/text/hedgehog.txt | text | Hedgehogs are small, nocturnal mammals known f... | [0.0144752012565732, 0.0004009466210845858, 0.... |

| | path | media_type | text | embeddings |
|---|---|---|---|---|
| 0 | ./data/text/bat.txt | text | Bats are the only mammals capable of sustained... | [0.043051157146692276, 0.05017940700054169, 0.... |
| 1 | ./data/text/bear.txt | text | Bears are large mammals with a stocky body, po... | [0.052125297486782074, 0.0072755659930408, 0.0... |
| 2 | ./data/text/deer.txt | text | Deer are hoofed mammals known for their gracef... | [0.07531881332397461, 0.03122134692966938, 0.0... |
| 3 | ./data/text/caterpillar.txt | text | Caterpillars are the larval stage of butterfli... | [0.05927688628435135, -0.017723916098475456, 0... |
| 4 | ./data/text/hedgehog.txt | text | Hedgehogs are small, nocturnal mammals known f... | [0.0144752012565732, 0.0004009466210845858, 0.... |
| 5 | ./data/text/fox.txt | text | Foxes are small to medium-sized, omnivorous ma... | [0.017004722729325294, 0.017473476007580757, 0... |
| 6 | ./data/images/fox2.jpg | image | The image features a red fox standing in front... | [-0.0029516557697206736, -0.010554404929280281... |
| 7 | ./data/images/bat2.jpg | image | The image shows a large bat hanging upside dow... | [0.059701841324567795, -0.008926010690629482, ... |
| 8 | ./data/images/bat1.jpg | image | The image depicts a bat in flight at night. Th... | [0.015258435159921646, 0.015619066543877125, 0... |
| 9 | ./data/images/deer2.jpg | image | The image depicts a mature male deer, known as... | [0.0621773786842823, -0.013964351266622543, -0... |
| 10 | ./data/images/hedgehog1.jpg | image | The image features a close-up of a small hedge... | [0.03395913168787956, -0.045510806143283844, -... |
| 11 | ./data/images/hedgehog2.jpg | image | The image shows a hedgehog surrounded by vibra... | [0.03718235343694687, -0.05030420050024986, -0... |
| 12 | ./data/images/bear2.jpg | image | The image is a close-up shot of a bear, specif... | [0.009648554027080536, -0.023458998650312424, ... |
| 13 | ./data/images/bear1.jpg | image | The image is of a large bear walking toward th... | [0.03627762198448181, -0.0058790878392755985, ... |
| 14 | ./data/images/caterpillar2.jpg | image | The image displays a caterpillar. This caterpi... | [0.05489984154701233, -0.015030968934297562, 0... |
| 15 | ./data/images/deer1.jpg | image | The image shows a male deer, commonly referred... | [0.06323786079883575, -0.020763417705893517, -... |
| 16 | ./data/images/fox1.jpg | image | The image features a red fox resting on a patc... | [0.008236058056354523, -0.015511339530348778, ... |
| 17 | ./data/images/caterpillar1.jpg | image | The image showcases a bright green caterpillar... | [0.07789730280637741, -0.03089781105518341, 0.... |

| | id | name | artists | acousticness | danceability | duration_ms | energy | explicit | instrumentalness | key | liveness | loudness | mode | popularity | release_date | speechiness | tempo | valence | year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4BJqT0PrAfrxzMOxytFOIz | Piano Concerto No. 3 in D Minor, Op. 30: III. Finale. Alla breve | ['Sergei Rachmaninoff', 'James Levine', 'Berliner Philharmoniker'] | 0.982 | 0.279 | 831667 | 0.211 | 0 | 0.878000 | 10 | 0.665 | -20.096 | 1 | 4 | 1921 | 0.0366 | 80.954 | 0.0594 | 1921 |
| 1 | 7xPhfUan2yNtyFG0cUWkt8 | Clancy Lowered the Boom | ['Dennis Day'] | 0.732 | 0.819 | 180533 | 0.341 | 0 | 0.000000 | 7 | 0.160 | -12.441 | 1 | 5 | 1921 | 0.4150 | 60.936 | 0.9630 | 1921 |
| 2 | 1o6I8BglA6ylDMrIELygv1 | Gati Bali | ['KHP Kridhamardawa Karaton Ngayogyakarta Hadiningrat'] | 0.961 | 0.328 | 500062 | 0.166 | 0 | 0.913000 | 3 | 0.101 | -14.850 | 1 | 5 | 1921 | 0.0339 | 110.339 | 0.0394 | 1921 |
| 3 | 3ftBPsC5vPBKxYSee08FDH | Danny Boy | ['Frank Parker'] | 0.967 | 0.275 | 210000 | 0.309 | 0 | 0.000028 | 5 | 0.381 | -9.316 | 1 | 3 | 1921 | 0.0354 | 100.109 | 0.1650 | 1921 |
| 4 | 4d6HGyGT8e121BsdKmw9v6 | When Irish Eyes Are Smiling | ['Phil Regan'] | 0.957 | 0.418 | 166693 | 0.193 | 0 | 0.000002 | 3 | 0.229 | -10.096 | 1 | 2 | 1921 | 0.0380 | 101.665 | 0.2530 | 1921 |

| | song_description | song_name | song_artists | song_acousticness | song_danceability | song_duration_ms | song_energy | song_explicit | song_instrumentalness | song_key | song_liveness | song_loudness | song_mode | song_popularity | song_speechiness | song_tempo | song_valence | song_year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Piano Concerto No. 3 in D Minor, Op. 30: III. Finale. Alla breve - Sergei Rachmaninoff, James Levine, Berliner Philharmoniker | Piano Concerto No. 3 in D Minor, Op. 30: III. Finale. Alla breve | Sergei Rachmaninoff, James Levine, Berliner Philharmoniker | 0.982 | 0.279 | 831667 | 0.211 | 0 | 0.878000 | 10 | 0.665 | -20.096 | 1 | 4 | 0.0366 | 80.954 | 0.0594 | 1921 |
| 1 | Clancy Lowered the Boom - Dennis Day | Clancy Lowered the Boom | Dennis Day | 0.732 | 0.819 | 180533 | 0.341 | 0 | 0.000000 | 7 | 0.160 | -12.441 | 1 | 5 | 0.4150 | 60.936 | 0.9630 | 1921 |
| 2 | Gati Bali - KHP Kridhamardawa Karaton Ngayogyakarta Hadiningrat | Gati Bali | KHP Kridhamardawa Karaton Ngayogyakarta Hadiningrat | 0.961 | 0.328 | 500062 | 0.166 | 0 | 0.913000 | 3 | 0.101 | -14.850 | 1 | 5 | 0.0339 | 110.339 | 0.0394 | 1921 |
| 3 | Danny Boy - Frank Parker | Danny Boy | Frank Parker | 0.967 | 0.275 | 210000 | 0.309 | 0 | 0.000028 | 5 | 0.381 | -9.316 | 1 | 3 | 0.0354 | 100.109 | 0.1650 | 1921 |
| 4 | When Irish Eyes Are Smiling - Phil Regan | When Irish Eyes Are Smiling | Phil Regan | 0.957 | 0.418 | 166693 | 0.193 | 0 | 0.000002 | 3 | 0.229 | -10.096 | 1 | 2 | 0.0380 | 101.665 | 0.2530 | 1921 |

| | song_name | song_artists | song_year | song_embeddings |
|---|---|---|---|---|
| 0 | Piano Concerto No. 3 in D Minor, Op. 30: III. Finale. Alla breve | Sergei Rachmaninoff, James Levine, Berliner Philharmoniker | 1921 | [-1.3666918277740479, 1.3530807495117188, 1.202406644821167, -0.2246660739183426, 0.4187828600406647, -0.46274974942207336, -1.2627567052841187, -0.4888093173503876, -0.6343100666999817, 1.3277226686477661, 0.7242574691772461, 0.22040177881717682, 0.43592825531959534, -0.4605342745780945, -1.2532882690429688, 1.2703070294949106, -1.461259048884883, 4.752569009266444, -1.007676175100162, -0.3092011481361043, 2.262496351074803, 1.3649563314116429, 2.6110012104955738, -1.5078176079821606, 0.6453499264358126, -1.2499471942272533, -0.38364744367670833, -1.1655450558051375, -1.7786347004763523, -2.142666230649] |
| 1 | Clancy Lowered the Boom | Dennis Day | 1921 | [-0.4902784526348114, 0.8817081451416016, 1.11865234375, 0.08565742522478104, 0.6903175711631775, 0.5949982404708862, -0.7584375143051147, -0.0548345223069191, -0.2068416178226471, 0.09101372212171555, 0.9204968214035034, 0.6219479441642761, 0.5529213547706604, -0.3401077389717102, -0.10770940780639648, 0.6055353575765545, 1.600008527700499, -0.3958314054754836, -0.5219481676317469, -0.3092011481361043, -0.5349944602375606, 0.5117210944216953, -0.26640400859153673, -0.1646257835500665, 0.6453499264358126, -1.2044048830284118, 1.872793548131166, -1.8164526201212032, 1.6541851866630062, -2.142666230649] |
| 2 | Gati Bali | KHP Kridhamardawa Karaton Ngayogyakarta Hadiningrat | 1921 | [-0.2844405770301819, 0.631794273853302, -0.17285339534282684, 0.5646762251853943, 0.9662110805511475, 0.2495516836643219, 0.417460560798645, -0.22620971500873566, -0.6103554964065552, 0.26909953355789185, 0.5071016550064087, -0.1065995916724205, 0.22436323761940002, 0.33269351720809937, -1.0331687927246094, 1.2144662090537686, -1.183477361379913, 2.1306274127125904, -1.1758127930699978, -0.3092011481361043, 2.374013638541697, -0.6259258882315683, -0.6025761034947833, -0.5873232499193564, 0.6453499264358126, -1.2044048830284118, -0.3997478418740689, -0.21005905433508043, -1.854615663007104, -2.142666230649] |
| 3 | Danny Boy | Frank Parker | 1921 | [-0.7487789988517761, 1.3591301441192627, 1.1782379150390625, -0.2647874355316162, 0.49281495809555054, 0.7352277636528015, -1.1125710010528564, 0.05070333927869797, -0.19258345663547516, 0.5183664560317993, 1.150445580482483, 0.5057250261306763, 0.8323601484298706, -0.2318739891052246, 0.04909057915210724, 1.2304207291798093, -1.4839351050077376, -0.16284109162119192, -0.6415119848547415, -0.3092011481361043, -0.5349062022700511, -0.05710239690493648, 0.9928168892663868, 0.3837053008849818, 0.6453499264358126, -1.295489505426095, -0.39080317620886856, -0.5426988976237884, -1.3774552183139837, -2.142666230649] |
| 4 | When Irish Eyes Are Smiling | Phil Regan | 1921 | [-0.3061518669128418, 0.8175497651100159, 0.9532014727592468, -0.05882035940885544, 1.183526635169983, 0.30472809076309204, -0.9971756935119629, 0.1633039265871048, -0.2687903046607971, 0.04581066593527794, 1.3318020105361938, 0.1084437295794487, 0.9008270502090454, -0.5091925263404846, -0.18293435871601105, 1.203829862303075, -0.6732660986156829, -0.5052618172494476, -1.0749308222880962, -0.3092011481361043, -0.5349891074077622, -0.6259258882315683, 0.12674640748175162, 0.24684186220999385, 0.6453499264358126, -1.3410318166249366, -0.3752990890558546, -0.4921038246856425, -1.0431389831786766, -2.142666230649] |

| | song_name | song_artists | song_year | song_embeddings |
|---|---|---|---|---|
| 0 | Piano Concerto No. 3 in D Minor, Op. 30: III. Finale. Alla breve | b'Sergei Rachmaninoff, James Levine, Berliner Philharmoniker' | 1921 | [-1.3666918277740479, 1.3530807495117188, 1.202406644821167, -0.2246660739183426, 0.4187828600406647, -0.46274974942207336, -1.2627567052841187, -0.4888093173503876, -0.6343100666999817, 1.3277226686477661, 0.7242574691772461, 0.22040177881717682, 0.43592825531959534, -0.4605342745780945, -1.2532882690429688, 1.2703070294949106, -1.461259048884883, 4.752569009266444, -1.007676175100162, -0.3092011481361043, 2.262496351074803, 1.3649563314116429, 2.6110012104955738, -1.5078176079821606, 0.6453499264358126, -1.2499471942272533, -0.38364744367670833, -1.1655450558051375, -1.7786347004763523, -2.142666230649] |
| 1 | Clancy Lowered the Boom | b'Dennis Day' | 1921 | [-0.4902784526348114, 0.8817081451416016, 1.11865234375, 0.08565742522478104, 0.6903175711631775, 0.5949982404708862, -0.7584375143051147, -0.0548345223069191, -0.2068416178226471, 0.09101372212171555, 0.9204968214035034, 0.6219479441642761, 0.5529213547706604, -0.3401077389717102, -0.10770940780639648, 0.6055353575765545, 1.600008527700499, -0.3958314054754836, -0.5219481676317469, -0.3092011481361043, -0.5349944602375606, 0.5117210944216953, -0.26640400859153673, -0.1646257835500665, 0.6453499264358126, -1.2044048830284118, 1.872793548131166, -1.8164526201212032, 1.6541851866630062, -2.142666230649] |
| 2 | Gati Bali | b'KHP Kridhamardawa Karaton Ngayogyakarta Hadiningrat' | 1921 | [-0.2844405770301819, 0.631794273853302, -0.17285339534282684, 0.5646762251853943, 0.9662110805511475, 0.2495516836643219, 0.417460560798645, -0.22620971500873566, -0.6103554964065552, 0.26909953355789185, 0.5071016550064087, -0.1065995916724205, 0.22436323761940002, 0.33269351720809937, -1.0331687927246094, 1.2144662090537686, -1.183477361379913, 2.1306274127125904, -1.1758127930699978, -0.3092011481361043, 2.374013638541697, -0.6259258882315683, -0.6025761034947833, -0.5873232499193564, 0.6453499264358126, -1.2044048830284118, -0.3997478418740689, -0.21005905433508043, -1.854615663007104, -2.142666230649] |
| 3 | Danny Boy | b'Frank Parker' | 1921 | [-0.7487789988517761, 1.3591301441192627, 1.1782379150390625, -0.2647874355316162, 0.49281495809555054, 0.7352277636528015, -1.1125710010528564, 0.05070333927869797, -0.19258345663547516, 0.5183664560317993, 1.150445580482483, 0.5057250261306763, 0.8323601484298706, -0.2318739891052246, 0.04909057915210724, 1.2304207291798093, -1.4839351050077376, -0.16284109162119192, -0.6415119848547415, -0.3092011481361043, -0.5349062022700511, -0.05710239690493648, 0.9928168892663868, 0.3837053008849818, 0.6453499264358126, -1.295489505426095, -0.39080317620886856, -0.5426988976237884, -1.3774552183139837, -2.142666230649] |
| 4 | When Irish Eyes Are Smiling | b'Phil Regan' | 1921 | [-0.3061518669128418, 0.8175497651100159, 0.9532014727592468, -0.05882035940885544, 1.183526635169983, 0.30472809076309204, -0.9971756935119629, 0.1633039265871048, -0.2687903046607971, 0.04581066593527794, 1.3318020105361938, 0.1084437295794487, 0.9008270502090454, -0.5091925263404846, -0.18293435871601105, 1.203829862303075, -0.6732660986156829, -0.5052618172494476, -1.0749308222880962, -0.3092011481361043, -0.5349891074077622, -0.6259258882315683, 0.12674640748175162, 0.24684186220999385, 0.6453499264358126, -1.3410318166249366, -0.3752990890558546, -0.4921038246856425, -1.0431389831786766, -2.142666230649] |

| | song_name | song_artists | song_year | song_embeddings |
|---|---|---|---|---|
| 0 | Flashback | b'Calvin Harris' | 2009 | [-0.6639690399169922, 1.2091901302337646, 0.6080996990203857, -0.12048277258872986, 0.9051563739776611, 0.2514127492904663, -0.6373221278190613, -0.29814955592155457, 0.22219587862491608, 0.4700770080089569, 0.8792182207107544, 0.5950366258621216, 0.8773530721664429, -0.03305034339427948, 0.20640605688095093, -1.3351724705550174, -1.5406252453148743, -0.011030115749796763, 1.7460279903169285, -0.3092011481361043, -0.5313940449564866, 1.0805445857483271, -0.7039975151774578, 1.0250333372402145, -1.549546934208044, 0.7539144985217788, -0.307319630000332, 0.36420802995835755, -1.2520866301382436, 1.247808968405853] |
| 1 | You Used To Hold Me | b'Calvin Harris' | 2009 | [-0.18039822578430176, 1.2320046424865723, 1.0078709125518799, -0.3332071900367737, 1.0227277278900146, 0.3912988007068634, -1.1201367378234863, 0.06693186610937119, 0.01194898784160614, 0.4125801920890808, 1.7104941606521606, 0.0907072052359581, 1.1263823509216309, 0.005882999859750271, -0.15810611844062805, -1.279491195315136, 0.46053670752705134, 0.004253770518922344, 1.8244917453695186, -0.3092011481361043, -0.5349904137531296, 1.6493680770749588, 0.6224578016611152, 0.958707209266951, -1.549546934208044, 0.5717452537264123, -0.35562082459241384, 0.3955535635781341, -1.2824790151505443, 1.247808968405853] |
| 2 | I'm Not Alone - Radio Edit | b'Calvin Harris' | 2009 | [-0.4578489363193512, 1.1988781690597534, 0.9186619520187378, -0.3090927302837372, 0.8355614542961121, 0.3251281678676605, -0.956695556640625, 0.2001478672027588, -0.004034769721329212, 0.5499118566513062, 1.427262544631958, 0.3321062922477722, 0.9700223803520203, -0.3015604019165039, -0.10042643547058105, -1.3273547556932577, 0.3074733286977821, -0.1491385696844665, 0.7932538218211915, -0.3092011481361043, -0.14309084999676247, 0.5117210944216953, 0.4914076629700189, 0.8665875870818629, 0.6453499264358126, 0.3895760089310457, -0.411077751716656, 0.46191895166730823, -0.36310936852844955, 1.247808968405853] |
| 3 | We Found Love | b'Rihanna, Calvin Harris' | 2011 | [-0.32985055446624756, 1.3834985494613647, 0.926184356212616, -0.28489983081817627, 0.8302570581436157, 0.22301919758319855, -1.0492554903030396, -0.0061478391289711, 0.029022544622421265, 0.4409855902194977, 1.2905532121658325, 0.4727650284767151, 1.1560120582580566, -0.15044355392456055, -0.18808501958847046, -1.2744389306085564, 1.1181423350898372, -0.12151213480453658, 1.0660087798611477, -0.3092011481361043, -0.5305974929031516, -1.1947493795582, -0.5626912786757541, 1.2313812909348116, 0.6453499264358126, 1.9835569008905032, -0.37351015592281456, 0.36375280436636925, 0.2751307167298641, 1.3248652229298272] |
| 4 | Dance Wiv Me - Radio Edit | b'Dizzee Rascal, Calvin Harris, Chrome' | 2011 | [-0.5854893922805786, 1.0818036794662476, 0.7288601994514465, -0.25424012541770935, 0.6128330826759338, 0.41814619302749634, -0.6447377800941467, 0.027640312910079956, -0.1611126810312271, 0.39173322916030884, 0.8761189579963684, 0.3748489320278168, 0.6108167767524719, 0.0187530517578125, -0.023305920884013176, -1.2143435714671371, 1.9344803555126058, -0.20954668716662583, 0.9912813940967762, -0.3092011481361043, -0.5349944602375606, 1.6493680770749588, -0.30059100129356187, 1.2671763441267319, 0.6453499264358126, 1.5281337889020867, -0.3329610049072396, -0.1561798539118754, 1.00454795702508, 1.3248652229298272] |
| 5 | Feel So Close - Radio Edit | b'Calvin Harris' | 2012 | [-0.5671982169151306, 1.21516752243042, 0.8834764957427979, -0.34776613116264343, 0.8313254117965698, 0.4305516481399536, -0.8739187717437744, 0.12808793783187866, -0.13710293173789978, 0.49484601616859436, 1.3459782600402832, 0.40094777941703796, 1.118976354598999, -0.10406777262687683, 0.11825662106275558, -1.3383314655399736, 0.9650789562605678, -0.19120286091549893, 1.6563551273996828, -0.3092011481361043, -0.5125954164977816, 0.5117210944216953, -0.015699395443352974, 1.5196718418873827, 0.6453499264358126, 2.2112684568847114, -0.4170408621601229, 0.36215951479440944, 1.4870270690953529, 1.363393350191814] |
| 6 | Sweet Nothing (feat. Florence Welch) | b'Calvin Harris, Florence Welch' | 2012 | [-0.5403595566749573, 1.3501120805740356, 0.8737241625785828, -0.30949994921684265, 0.5602115988731384, 0.044263433665037155, -0.8665704727172852, -0.07287808507680893, 0.004971159156411886, 0.4264087975025177, 1.185262680053711, 0.6865300536155701, 0.7812601327896118, -0.16709819436073303, -0.27374014258384705, -0.8170760203287275, 0.20543107614493594, -0.14259962817167257, 1.6750369738407758, -0.3092011481361043, -0.5346376049176665, 0.7961328400850112, -0.8549900662780685, 1.3266593001662457, -1.549546934208044, 1.8469299672939783, 0.048081752430295104, 0.36206196645326905, 0.20674785045218758, 1.363393350191814] |
| 7 | I Need Your Love (feat. Ellie Goulding) | b'Calvin Harris, Ellie Goulding' | 2012 | [-0.3471231460571289, 1.4163035154342651, 0.9777242541313171, -0.3454452455043793, 0.5281813144683838, 0.0064859092235565186, -0.8150930404663086, -0.003013826906681061, -0.05287719517946243, 0.6403617858886719, 1.3331170082092285, 0.4529487192630768, 0.9392791390419006, -0.1485135555267334, -0.12306036055088043, -0.2506905558542882, 0.8970507878920038, 0.030931386799656055, 1.4508548165476611, -0.3092011481361043, -0.5349944602375606, 0.7961328400850112, 0.17232906441778495, 1.1294355757166477, 0.6453499264358126, 1.8013876560951365, -0.31387905148814554, 0.2663020115671261, 0.1991497541991124, 1.363393350191814] |
| 8 | Let's Go (feat. Ne-Yo) | b'Calvin Harris, Ne-Yo' | 2012 | [-0.4517887234687805, 1.4095631837844849, 1.0143579244613647, -0.42695263028144836, 0.3167036771774292, 0.22329269349575043, -0.8331954479217529, 0.06706950813531876, -0.11937189847230911, 0.6374547481536865, 1.3507567644119263, 0.6769067049026489, 1.002878189086914, -0.12010153383016586, -0.21650470793247223, -1.3202549942371695, 0.9820859983527089, 0.01743444222608983, 1.4994276172945025, -0.3092011481361043, -0.5104287949127105, -0.34151414256825235, 0.4971054950870231, 1.5038799066556534, -1.549546934208044, 1.5736761001009283, -0.24709221452131636, 0.36472828777777266, 1.3198689515276991, 1.363393350191814] |
| 9 | Thinking About You (feat. Ayah Marar) | b'Calvin Harris, Ayah Marar' | 2012 | [-0.4033048748970032, 1.201682209968567, 0.7232522368431091, -0.21271994709968567, 0.4565010666847229, 0.08975932747125626, -0.7079424262046814, 0.05442984402179718, -0.04738606885075569, 0.5769458413124084, 1.0412596464157104, 0.4582187235355377, 0.8240703344345093, -0.08101780712604523, -0.14304494857788086, -1.3339492906786878, 1.067121208813414, 0.1370883744063931, 1.4695366629887539, -0.3092011481361043, -0.5336817424536646, -1.479161125221516, -0.6322048305032051, 1.3664900701396077, -1.549546934208044, 1.61921841129977, -0.36575811234630756, 0.3637202882526556, 0.8373898394574263, 1.363393350191814] |
| 10 | Spectrum (Say My Name) - Calvin Harris Remix | b'Florence + The Machine, Calvin Harris' | 2012 | [-0.5985769033432007, 1.3303905725479126, 1.0559614896774292, -0.25539952516555786, 0.8112835884094238, 0.17980347573757172, -0.812416672706604, -0.08991655707359314, 0.104682557284832, 0.2630487084388733, 1.328857660293579, 0.6589300632476807, 0.8192211389541626, -0.11460264772176743, -0.08975117653608322, -1.334933152753127, 0.23377614629850432, -0.09808422222604993, 1.7385552517404912, -0.3092011481361043, -0.5218672823986005, 1.6493680770749588, -0.6276465648096017, 1.3428021672920136, -1.549546934208044, 0.9360837433171454, -0.31447536253249225, 0.30216728499307355, 0.2295421392114131, 1.363393350191814] |
| 11 | \n", "Bounce (feat. Kelis) - Radio Edit | \n", "b'Calvin Harris, Kelis' | \n", "2012 | \n", "[-0.6689270734786987, 1.3969231843948364, 0.8880228400230408, -0.326450377702713, 0.34622687101364136, 0.18351785838603973, -0.7323996424674988, -0.0037096429150551558, -0.1328837126493454, 0.6257920861244202, 1.0089510679244995, 0.6650357842445374, 0.8406809568405151, 0.010336706414818764, 0.018450651317834854, -1.2521026024320998, 1.3732479664719526, -0.06648065605115584, 1.802073529640207, -0.3092011481361043, 1.0358061889389716, -0.9103376338948841, 2.6053033783785695, 1.6454809259001602, -1.549546934208044, 1.4370491665044034, -0.36396917921326755, 0.3622895792492634, 0.8791793688493397, 1.363393350191814] | \n", "
| 12 | \n", "We'll Be Coming Back (feat. Example) | \n", "b'Calvin Harris, Example' | \n", "2012 | \n", "[-0.29318350553512573, 1.2344664335250854, 0.91950523853302, -0.2705417275428772, 0.5557699203491211, 0.24373485147953033, -0.9255906343460083, 0.11558005213737488, -0.09228789061307907, 0.5509446263313293, 1.300772786140442, 0.46837666630744934, 1.019039273262024, -0.2337048500776291, -0.30271509289741516, -1.3374326942395398, 0.3358183988513505, 0.029769084015640684, 1.7609734674698028, -0.3092011481361043, -0.5349944602375606, 0.5117210944216953, 2.2292464586562937, 1.252612670524137, 0.6453499264358126, 1.5281337889020867, -0.08131774419293655, 0.36241964370411694, 0.16495832106027414, 1.363393350191814] | \n", "
| 13 | \n", "Drinking from the Bottle (feat. Tinie Tempah) | \n", "b'Calvin Harris, Tinie Tempah' | \n", "2012 | \n", "[-0.49500662088394165, 1.250335454940796, 0.836037278175354, -0.21671639382839203, 0.4340763986110687, 0.14523451030254364, -0.6550229787826538, -0.0313369482755661, -0.14998087286949158, 0.44496625661849976, 0.9944913387298584, 0.5686814188957214, 0.6103565096855164, -0.23264175653457642, -0.1452845335006714, -1.2162049321485084, 0.7269803669705941, 0.07710722529387208, 1.5143730944473768, -0.3092011481361043, -0.5347956408450482, 1.0805445857483271, -0.878920961169486, 1.2857757345107685, -1.549546934208044, 1.4370491665044034, -0.2953934091133982, 0.3662240290085925, 0.009197347872233573, 1.363393350191814] | \n", "
| 14 | \n", "Summer | \n", "b'Calvin Harris' | \n", "2014 | \n", "[-0.8264520764350891, 1.3505786657333374, 1.0463076829910278, -0.22568410634994507, 0.913348913192749, 0.2567248046398163, -0.8605219125747681, -0.35306626558303833, 0.03924936056137085, 0.37715211510658264, 1.1773360967636108, 0.5755943655967712, 1.1420295238494873, -0.08487856388092041, 0.3010081648826599, -1.284809368690483, 0.3358183988513505, -0.06374489575680674, 1.4022820158008196, -0.3092011481361043, -0.4782799540401117, -0.34151414256825235, -0.37466281881461627, 1.394389155715663, -1.549546934208044, 2.2112684568847114, -0.3955736645636421, 0.362549708158971, 0.8183945988247383, 1.440449604715788] | \n", "
| 15 | \n", "Outside (feat. Ellie Goulding) | \n", "b'Calvin Harris, Ellie Goulding' | \n", "2014 | \n", "[-0.45815980434417725, 1.4566391706466675, 0.8504025340080261, -0.373926043510437, 0.44716230034828186, 0.05849791318178177, -0.6436365842819214, -0.1507851779460907, -0.06338223814964294, 0.6237001419067383, 1.1598680019378662, 0.6893650889396667, 0.8899211883544922, -0.07148391753435135, -0.03473608195781708, -0.7745306333259528, 0.6192691003870343, -0.02631400201851587, 1.2789818292896065, -0.3092011481361043, -0.5349944602375606, -0.9103376338948841, 0.6566447943631403, 1.2948999637557677, -1.549546934208044, 2.2112684568847114, -0.366950734435001, 0.36534609393832856, -0.41629604229997574, 1.440449604715788] | \n", "
| 16 | \n", "Blame (feat. John Newman) | \n", "b'Calvin Harris, John Newman' | \n", "2014 | \n", "[-0.6763609647750854, 1.6109453439712524, 0.9073587656021118, -0.1523026078939438, 0.6198676228523254, 0.2695436179637909, -1.1315089464187622, -0.25966623425483704, 0.04239802807569504, 0.6183485388755798, 1.0361096858978271, 0.6836062073707581, 0.8744892477989197, -0.3744211196899414, 0.004306115675717592, -1.2646003098641647, -0.6959421547385375, -0.13943689950768517, 1.406018385089038, -0.3092011481361043, -0.51670562509299, -1.479161125221516, 0.7762992688202281, 1.3027959313716324, -1.549546934208044, 2.0290992120893447, -0.12007796207547143, 0.36498841668748067, -0.6822294111576065, 1.440449604715788] | \n", "
| 17 | \n", "Under Control (feat. Hurts) | \n", "b'Calvin Harris, Alesso, Hurts' | \n", "2014 | \n", "[-0.5020278692245483, 1.4283548593521118, 0.8164223432540894, -0.28505995869636536, 0.4534760117530823, 0.2287231981754303, -0.774612307548523, 0.002535061212256551, 0.015209964476525784, 0.5660685896873474, 1.197741150856018, 0.6164646148681641, 0.8045300841331482, -0.029482128098607063, -0.11208241432905197, -0.9978939150905204, 0.04102966925423999, -0.36620454471558167, 1.6227278038057156, -0.3092011481361043, -0.532082265930568, 0.7961328400850112, -0.5228064538567248, 1.4208845137155646, 0.6453499264358126, 1.8924722784928198, -0.10039969761203067, 0.30223231722050037, -0.06678361465851812, 1.440449604715788] | \n", "
| 18 | \n", "Pray to God (feat. HAIM) | \n", "b'Calvin Harris, HAIM' | \n", "2014 | \n", "[-0.4818477928638458, 1.3358467817306519, 0.7798342108726501, -0.2659180760383606, 0.4285810887813568, 0.08754778653383255, -0.7763833999633789, -0.006817118264734745, -0.03107377141714096, 0.46428442001342773, 1.156795620918274, 0.6081282496452332, 0.8255685567855835, -0.1547342836856842, -0.21263934671878815, -1.3010297974852907, 0.39250853915848727, 0.01290383341492788, 1.7497643596051469, -0.3092011481361043, -0.5349065208908724, 0.5117210944216953, -0.5569934465587499, 1.5359901749601697, -1.549546934208044, 1.4370491665044034, -0.3550245135480672, 0.10508711977578558, -0.3479131760222992, 1.440449604715788] | \n", "
| 19 | \n", "Open Wide (feat. Big Sean) | \n", "b'Calvin Harris, Big Sean' | \n", "2014 | \n", "[-0.49002212285995483, 1.6617223024368286, 1.0235131978988647, -0.33206960558891296, 0.4414531886577606, 0.11732295900583267, -0.8655310869216919, -0.186820387840271, -0.027344567701220512, 0.6834815144538879, 1.3333178758621216, 0.829616129398346, 0.8085821866989136, -0.11922396719455719, 0.046914491802453995, -1.1553118470007873, 1.2655366998883928, -0.3403729583524646, 1.7273461438758355, 3.2341406428407566, -0.5349944602375606, -1.1947493795582, 1.8133047141149887, 1.5440616085230539, 0.6453499264358126, 1.254879921709037, -0.38543637680974835, 0.36342764322923443, 0.6018488556120964, 1.440449604715788] | \n", "
| 20 | \n", "How Deep Is Your Love | \n", "b'Calvin Harris, Disciples' | \n", "2015 | \n", "[-0.41091299057006836, 1.258643388748169, 0.8983922004699707, -0.262657105922699, 0.8501120805740356, 0.3192669153213501, -0.9955586194992065, -0.04502531513571739, -0.04089539870619774, 0.2862780690193176, 1.3955638408660889, 0.2914259135723114, 0.8563987612724304, -0.27214497327804565, -0.13632315397262573, -1.2374776256498958, 1.140818391212692, -0.1419670824388751, 1.4471184472594425, -0.3092011481361043, -0.5295460441927495, 1.6493680770749588, 1.032701714085416, 1.251033477000964, -1.549546934208044, 2.0746415232881863, -0.16599391249016654, 0.16920889601873085, -0.7278179886760573, 1.478977731977775] | \n", "
| 21 | \n", "This Is What You Came For (feat. Rihanna) | \n", "b'Calvin Harris, Rihanna' | \n", "2016 | \n", "[-0.3814137876033783, 1.4554016590118408, 1.0522133111953735, -0.3466358184814453, 0.49159470200538635, 0.13975763320922852, -1.058195948600769, 0.060457807034254074, -0.1191626563668251, 0.4785536825656891, 1.4339147806167603, 0.35597559809684753, 1.0968818664550781, -0.28769201040267944, -0.22834518551826477, -0.8117578469533806, 0.5285648758956155, -0.06669414023597499, 1.6713006045525571, -0.3092011481361043, -0.13990464178342266, 1.0805445857483271, -0.33477799399558694, 1.5293224689734395, -1.549546934208044, 2.16572614568587, -0.4045183302288425, 0.23294047889711456, -0.23774078035270924, 1.517505859239762] | \n", "
| 22 | \n", "The Weekend - Funk Wav Remix | \n", "b'SZA, Calvin Harris, Funk Wav' | \n", "2017 | \n", "[-0.7186599373817444, 1.1158796548843384, 0.731167197227478, -0.19770754873752594, 0.5679613947868347, 0.3211177885532379, -0.6246710419654846, -0.09280967712402344, -0.07223807275295258, 0.3498360812664032, 0.9513226747512817, 0.6551465392112732, 0.7420637011528015, -0.03915492817759514, -0.06295780837535858, -0.024668187402047063, 1.3505719103490978, -0.46483423810202873, 0.3448895072349624, -0.3092011481361043, -0.5349944602375606, 1.6493680770749588, -0.46013030056967885, 1.1527725466702032, 0.6453499264358126, 2.0290992120893447, -0.2530553249647832, -0.48364963512014286, 0.5296669412078823, 1.556033986501749] | \n", "
| 23 | \n", "Slide (feat. Frank Ocean & Migos) | \n", "b'Calvin Harris, Frank Ocean, Migos' | \n", "2017 | \n", "[-0.8330506086349487, 1.4721840620040894, 0.9542821049690247, -0.4979126751422882, 0.4859296679496765, 0.20288336277008057, -0.7739049792289734, -0.3222369849681854, 0.03495460003614426, 0.4573882222175598, 1.1311147212982178, 0.6848472952842712, 1.0187188386917114, -0.017646288499236107, -0.1254880428314209, -0.016690927339026784, 1.1294803631512644, 0.0017235875877324247, 1.1743634892194865, 3.2341406428407566, -0.5349906049256224, -1.1947493795582, 0.2691922104068561, 1.4394839040996015, -1.549546934208044, 1.8924722784928198, -0.2769077667386508, -0.41403263565962234, -0.06298456653198053, 1.556033986501749] | \n", "
| 24 | \n", "Feels (feat. Pharrell Williams, Katy Perry & Big Sean) | \n", "b'Calvin Harris, Pharrell Williams, Katy Perry, Big Sean, Funk Wav' | \n", "2017 | \n", "[-0.4316557049751282, 1.6159189939498901, 1.0156694650650024, -0.3224739134311676, 0.3746965825557709, 0.37119245529174805, -0.8607751131057739, -0.24534280598163605, -0.11486667394638062, 0.7656596302986145, 1.0190465450286865, 0.7592442631721497, 0.878508985042572, -0.08537297695875168, 0.03988835588097572, -1.1702027324517583, 2.019515565973311, -0.05678689269603446, 0.9875450248085577, 3.2341406428407566, -0.5349944602375606, 1.6493680770749588, -0.6407515786787114, 1.473524297821329, -1.549546934208044, 1.9835569008905032, -0.26140367958563687, -0.513141750258252, 1.3084718071480863, 1.556033986501749] | \n", "
| 25 | \n", "Rollin (feat. Future & Khalid) | \n", "b'Calvin Harris, Future, Khalid' | \n", "2017 | \n", "[-0.45464855432510376, 1.5903383493423462, 0.9394875764846802, -0.3995354473590851, 0.38158369064331055, 0.07916951924562454, -0.6712697744369507, -0.25031328201293945, -0.08396566659212112, 0.6828315854072571, 1.174385666847229, 0.8878921270370483, 0.8390617370605469, -0.02976704202592373, -0.09085194766521454, -0.1443270883473511, 1.2315226157041106, 0.3326557013440539, 1.0510633027082734, 3.2341406428407566, -0.5347819401497308, 0.2273093487583794, 0.09825724689673071, 1.2312058249877926, -1.549546934208044, 1.5736761001009283, -0.09562920925725711, -0.8063070314988182, 0.5524612299671078, 1.556033986501749] | \n", "
| 26 | \n", "Slide (feat. Frank Ocean & Migos) | \n", "b'Calvin Harris, Frank Ocean, Migos, Funk Wav' | \n", "2017 | \n", "[-0.8306244015693665, 1.3914999961853027, 0.8867557644844055, -0.4634178578853607, 0.46206897497177124, 0.2473207265138626, -0.7462942004203796, -0.27696043252944946, 0.0020592056680470705, 0.4404282569885254, 1.0507673025131226, 0.6723338961601257, 0.9572036862373352, -0.017701195552945137, -0.15432587265968323, -0.016690927339026784, 1.1294803631512644, 0.0017235875877324247, 1.1743634892194865, 3.2341406428407566, -0.5349906049256224, -1.1947493795582, 0.2691922104068561, 1.4394839040996015, -1.549546934208044, 1.61921841129977, -0.2769077667386508, -0.41403263565962234, -0.06298456653198053, 1.556033986501749] | \n", "
| 27 | \n", "Faking It (feat. Kehlani & Lil Yachty) | \n", "b'Calvin Harris, Kehlani, Lil Yachty, Funk Wav' | \n", "2017 | \n", "[-0.3710099756717682, 1.4853330850601196, 0.8649624586105347, -0.42896997928619385, 0.30232155323028564, 0.11700998991727829, -0.5737792253494263, -0.23985503613948822, -0.13998503983020782, 0.7621097564697266, 1.1860604286193848, 0.8189935088157654, 0.8819252252578735, -0.015730898827314377, -0.18540021777153015, -0.4926674444325697, 1.3562409243798115, 0.07509889259224008, 0.40467141584645966, 3.2341406428407566, -0.5349944602375606, 1.0805445857483271, -0.7706621509464066, 1.2212042660076972, -1.549546934208044, 1.3915068553055618, 0.08386041509109653, 0.10518466811692596, 0.5334659893344199, 1.556033986501749] | \n", "
| 28 | \n", "Rollin (feat. Future & Khalid) | \n", "b'Calvin Harris, Future, Khalid, Funk Wav' | \n", "2017 | \n", "[-0.5150197744369507, 1.480997085571289, 0.8669241070747375, -0.37760406732559204, 0.3724628686904907, 0.14916333556175232, -0.6576970219612122, -0.2119932770729065, -0.10069605708122253, 0.6264132857322693, 1.0778988599777222, 0.8401476740837097, 0.8006543517112732, -0.027807924896478653, -0.12866665422916412, -0.15496343509804483, 1.225853601673397, 0.33254500584081437, 1.0435905641318364, 3.2341406428407566, -0.534894731920483, 0.2273093487583794, -0.14105170201744485, 1.2255909146831776, -1.549546934208044, 1.3915068553055618, -0.0896660988137902, -0.8079003210707776, 0.4916764599425064, 1.556033986501749] | \n", "
| 29 | \n", "One Kiss (with Dua Lipa) | \n", "b'Calvin Harris, Dua Lipa' | \n", "2018 | \n", "[-0.3802143335342407, 1.540840983390808, 0.8170551657676697, -0.15833815932273865, 0.4922180473804474, 0.14575688540935516, -0.7095619440078735, 0.06333868205547333, 0.08945013582706451, 0.5746606588363647, 1.1774417161941528, 0.595933198928833, 0.7621734142303467, 0.0023274512495845556, -0.030562711879611015, -1.2425298903564754, 1.4412761348405165, -0.12451672703532461, 1.424700231530131, -0.3092011481361043, -0.5349246822776885, 1.0805445857483271, -0.7142536129880652, 1.449836394973735, -1.549546934208044, 2.3023530792823945, 0.054044862873762006, 0.23394847842223204, 0.24473833171756343, 1.5945621137637358] | \n", "
| 30 | \n", "Promises (with Sam Smith) | \n", "b'Calvin Harris, Sam Smith, Jessie Reyez' | \n", "2018 | \n", "[-0.4551751911640167, 1.2134348154067993, 0.7615858316421509, -0.199832022190094, 0.5633880496025085, 0.6762082576751709, -1.0863019227981567, -0.2817540466785431, -0.037644486874341965, 0.6317135095596313, 1.1003193855285645, 0.8381323218345642, 0.7380902171134949, -0.047631580382585526, -0.15066808462142944, -1.3092729662170783, 1.3845859945333798, -0.13667741874835615, 1.073481518437585, -0.3092011481361043, -0.5349788159552331, 1.6493680770749588, 0.6737382907141528, 0.9671295747238734, 0.6453499264358126, 2.256810768083553, -0.366950734435001, 0.2039035893509937, -0.15796076969542017, 1.5945621137637358] | \n", "
| 31 | \n", "Giant (with Rag'n'Bone Man) | \n", "b'Calvin Harris\\', \"Rag\\'n\\'Bone Man\"' | \n", "2019 | \n", "[-0.3992515802383423, 1.2729873657226562, 0.7182956337928772, -0.26333242654800415, 0.3592974543571472, 0.4867834448814392, -0.6996715664863586, 0.08461927622556686, -0.16724444925785065, 0.6392613649368286, 0.9774713516235352, 0.48773056268692017, 0.5017319321632385, -0.6514177918434143, -0.211467906832695, -1.2983707107976172, 1.5319803593319354, -0.011156624896356259, 1.5181094637355954, -0.3092011481361043, -0.5333917975062507, -1.1947493795582, -0.7159629626231665, 1.2619123657161553, -1.549546934208044, 2.16572614568587, -0.38662899889844177, 0.16959908938329238, 0.2979250054890896, 1.6330902410257229] | \n", "
| 32 | \n", "Over Now (with The Weeknd) | \n", "b'Calvin Harris, The Weeknd' | \n", "2020 | \n", "[-0.5559015274047852, 1.355072021484375, 1.1667373180389404, -0.22373133897781372, 0.5634499788284302, 0.28147223591804504, -0.9545113444328308, -0.043781451880931854, 0.010776842944324017, 0.37074705958366394, 1.331946611404419, 0.7073851227760315, 0.8230187296867371, -0.3093962073326111, -0.0394388772547245, -1.1390914182059793, 0.4265226233427693, -0.15655516840151695, 1.5069003558709397, 3.2341406428407566, -0.530788665395952, -0.34151414256825235, 0.2293073855878268, 1.29665462322596, -1.549546934208044, 2.256810768083553, -0.33117207177419955, 1.9914119085210957, 0.5752555187263333, 1.67161836828771] | \n", "
| | song_name | song_artists | song_year | song_embeddings |
|---|---|---|---|---|
| 0 | We Found Love | b'Rihanna, Calvin Harris' | 2011 | [-0.32985055446624756, 1.3834985494613647, 0.926184356212616, -0.28489983081817627, 0.8302570581436157, 0.22301919758319855, -1.0492554903030396, -0.0061478391289711, 0.029022544622421265, 0.4409855902194977, 1.2905532121658325, 0.4727650284767151, 1.1560120582580566, -0.15044355392456055, -0.18808501958847046, -1.2744389306085564, 1.1181423350898372, -0.12151213480453658, 1.0660087798611477, -0.3092011481361043, -0.5305974929031516, -1.1947493795582, -0.5626912786757541, 1.2313812909348116, 0.6453499264358126, 1.9835569008905032, -0.37351015592281456, 0.36375280436636925, 0.2751307167298641, 1.3248652229298272] |
| | __nn_distance | song_name | song_artists | song_year | song_embeddings |
|---|---|---|---|---|---|
| 0 | 0.514847 | We Found Love | b'Rihanna, Calvin Harris' | 2011 | [-0.32985055446624756, 1.3834985494613647, 0.926184356212616, -0.28489983081817627, 0.8302570581436157, 0.22301919758319855, -1.0492554903030396, -0.0061478391289711, 0.029022544622421265, 0.4409855902194977, 1.2905532121658325, 0.4727650284767151, 1.1560120582580566, -0.15044355392456055, -0.18808501958847046, -1.2744389306085564, 1.1181423350898372, -0.12151213480453658, 1.0660087798611477, -0.3092011481361043, -0.5305974929031516, -1.1947493795582, -0.5626912786757541, 1.2313812909348116, 0.6453499264358126, 1.9835569008905032, -0.37351015592281456, 0.36375280436636925, 0.2751307167298641, 1.3248652229298272] |
| 1 | 1.471889 | Bad At Love | b'Halsey' | 2017 | [-0.33408525586128235, 1.2016308307647705, 1.164484977722168, 0.007420691661536694, 0.6395940780639648, 0.3916250765323639, -1.0832836627960205, 0.2853870987892151, 0.03573070093989372, 0.391010582447052, 1.6295465230941772, 0.3820643723011017, 0.6865345239639282, -0.18368127942085266, -0.13861115276813507, -1.1803072618649173, 0.7836705072777309, -0.3899329165171471, 1.0099632405378691, -0.3092011481361043, -0.5349944602375606, -1.479161125221516, -0.6692407392637322, 1.3973720768149898, 0.6453499264358126, 1.9380145896916616, -0.4253892167809766, 0.051533080489714826, 0.32071929424831513, 1.556033986501749] |
| 2 | 1.572724 | Sweet but Psycho | b'Ava Max' | 2020 | [-0.34748899936676025, 1.084930658340454, 0.8265242576599121, -0.12685643136501312, 0.7913858294487, 0.2851320207118988, -1.0733402967453003, 0.2622759938240051, -0.07236860692501068, 0.6118565201759338, 0.9629176259040833, 0.5539129376411438, 0.650129497051239, -0.33480215072631836, -0.23625598847866058, -1.15903456836353, 1.0387761386598457, -0.3412506155567211, 0.841826622568033, -0.3092011481361043, -0.5349944602375606, -1.1947493795582, -0.23221701588951166, 1.1903222593323153, 0.6453499264358126, 2.393437701680078, -0.31984216193161247, 0.5268536307530907, 0.3511116792606158, 1.67161836828771] |
| 3 | 1.677343 | Stay Gold | b'BTS' | 2020 | [-0.39910316467285156, 0.949821412563324, 0.8702532052993774, -0.15616460144519806, 0.7773756980895996, 0.3858698010444641, -0.999448299407959, 0.16055533289909363, -0.40568050742149353, 0.41508427262306213, 1.3218148946762085, 0.34586480259895325, 0.6678043603897095, -0.3104383945465088, 0.05108780786395073, -1.1037255652599227, 1.0557831807519866, 0.1001951445409801, 0.5728080338162957, -0.3092011481361043, -0.5349944602375606, -1.1947493795582, -0.7296377597039766, 1.0681979602069414, 0.6453499264358126, 2.2112684568847114, -0.2786966998716909, 0.39727691760494743, 0.20674785045218758, 1.67161836828771] |
| 4 | 1.864022 | Baby | b'Madison Beer' | 2020 | [-0.264282763004303, 1.3024324178695679, 0.9755517840385437, -0.4190329313278198, 0.732369065284729, 0.6265831589698792, -1.0190303325653076, 0.062001943588256836, -0.19904416799545288, 0.2775894105434418, 1.141913652420044, 0.5574018955230713, 0.4761144518852234, 0.11363387852907181, -0.10809878259897232, -1.1284550714552857, 0.9537409281991405, -0.18075794950268054, 0.8455629918562516, -0.3092011481361043, -0.5349944602375606, -1.479161125221516, -0.3290801618785828, 1.1104852534385723, 0.6453499264358126, 2.16572614568587, -0.2834671882264644, 0.038201473867195856, -0.28712840599769784, 1.67161836828771] |
| | Unnamed: 0 | timestamp | sensor_00 | sensor_01 | sensor_02 | sensor_03 | sensor_04 | sensor_05 | sensor_06 | sensor_07 | ... | sensor_43 | sensor_44 | sensor_45 | sensor_46 | sensor_47 | sensor_48 | sensor_49 | sensor_50 | sensor_51 | machine_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2018-04-01 00:00:00 | 2.465394 | 47.09201 | 53.2118 | 46.310760 | 634.3750 | 76.45975 | 13.41146 | 16.13136 | ... | 41.92708 | 39.641200 | 65.68287 | 50.92593 | 38.194440 | 157.9861 | 67.70834 | 243.0556 | 201.3889 | NORMAL |
| 1 | 1 | 2018-04-01 00:01:00 | 2.465394 | 47.09201 | 53.2118 | 46.310760 | 634.3750 | 76.45975 | 13.41146 | 16.13136 | ... | 41.92708 | 39.641200 | 65.68287 | 50.92593 | 38.194440 | 157.9861 | 67.70834 | 243.0556 | 201.3889 | NORMAL |
| 2 | 2 | 2018-04-01 00:02:00 | 2.444734 | 47.35243 | 53.2118 | 46.397570 | 638.8889 | 73.54598 | 13.32465 | 16.03733 | ... | 41.66666 | 39.351852 | 65.39352 | 51.21528 | 38.194443 | 155.9606 | 67.12963 | 241.3194 | 203.7037 | NORMAL |
| 3 | 3 | 2018-04-01 00:03:00 | 2.460474 | 47.09201 | 53.1684 | 46.397568 | 628.1250 | 76.98898 | 13.31742 | 16.24711 | ... | 40.88541 | 39.062500 | 64.81481 | 51.21528 | 38.194440 | 155.9606 | 66.84028 | 240.4514 | 203.1250 | NORMAL |
| 4 | 4 | 2018-04-01 00:04:00 | 2.445718 | 47.13541 | 53.2118 | 46.397568 | 636.4583 | 76.58897 | 13.35359 | 16.21094 | ... | 41.40625 | 38.773150 | 65.10416 | 51.79398 | 38.773150 | 158.2755 | 66.55093 | 242.1875 | 201.3889 | NORMAL |
5 rows × 55 columns
| | timestamp | sensor_00 | sensor_01 | sensor_02 | sensor_03 | sensor_04 | sensor_05 | sensor_06 | sensor_07 | sensor_08 | ... | sensor_42 | sensor_43 | sensor_44 | sensor_45 | sensor_46 | sensor_47 | sensor_48 | sensor_49 | sensor_51 | machine_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2018-04-01 00:00:00 | 2.465394 | 47.09201 | 53.2118 | 46.310760 | 634.3750 | 76.45975 | 13.41146 | 16.13136 | 15.56713 | ... | 31.770832 | 41.92708 | 39.641200 | 65.68287 | 50.92593 | 38.194440 | 157.9861 | 67.70834 | 201.3889 | NORMAL |
| 1 | 2018-04-01 00:01:00 | 2.465394 | 47.09201 | 53.2118 | 46.310760 | 634.3750 | 76.45975 | 13.41146 | 16.13136 | 15.56713 | ... | 31.770832 | 41.92708 | 39.641200 | 65.68287 | 50.92593 | 38.194440 | 157.9861 | 67.70834 | 201.3889 | NORMAL |
| 2 | 2018-04-01 00:02:00 | 2.444734 | 47.35243 | 53.2118 | 46.397570 | 638.8889 | 73.54598 | 13.32465 | 16.03733 | 15.61777 | ... | 31.770830 | 41.66666 | 39.351852 | 65.39352 | 51.21528 | 38.194443 | 155.9606 | 67.12963 | 203.7037 | NORMAL |
| 3 | 2018-04-01 00:03:00 | 2.460474 | 47.09201 | 53.1684 | 46.397568 | 628.1250 | 76.98898 | 13.31742 | 16.24711 | 15.69734 | ... | 31.510420 | 40.88541 | 39.062500 | 64.81481 | 51.21528 | 38.194440 | 155.9606 | 66.84028 | 203.1250 | NORMAL |
| 4 | 2018-04-01 00:04:00 | 2.445718 | 47.13541 | 53.2118 | 46.397568 | 636.4583 | 76.58897 | 13.35359 | 16.21094 | 15.69734 | ... | 31.510420 | 41.40625 | 38.773150 | 65.10416 | 51.79398 | 38.773150 | 158.2755 | 66.55093 | 201.3889 | NORMAL |
5 rows × 52 columns
| | start_time | end_time | vectors |
|---|---|---|---|
| 0 | 2018-04-01 00:00:00 | 2018-04-01 00:23:00 | [2.465394, 2.465394, 2.444734, 2.460474, 2.445... |
| 1 | 2018-04-01 00:10:00 | 2018-04-01 00:33:00 | [2.46441, 2.444734, 2.460474, 2.448669, 2.4535... |
| 2 | 2018-04-01 00:20:00 | 2018-04-01 00:43:00 | [2.445718, 2.460474, 2.448669, 2.453588, 2.453... |
| 3 | 2018-04-01 00:30:00 | 2018-04-01 00:53:00 | [2.463426, 2.448669, 2.453588, 2.455556, 2.449... |
| 4 | 2018-04-01 00:40:00 | 2018-04-01 01:03:00 | [2.449653, 2.453588, 2.453588, 2.448669, 2.460... |
| | start_time | end_time | vectors |
|---|---|---|---|
| 0 | 2018-04-01 00:00:00 | 2018-04-01 00:23:00 | [1.0, 1.0, 0.0, 0.7618586640851966, 0.04762826... |
| 1 | 2018-04-01 00:10:00 | 2018-04-01 00:33:00 | [1.0, 0.0, 0.7999593413295418, 0.1999898353323... |
| 2 | 2018-04-01 00:20:00 | 2018-04-01 00:43:00 | [0.05001016466760889, 0.7999593413295418, 0.19... |
| 3 | 2018-04-01 00:30:00 | 2018-04-01 00:53:00 | [0.8571566871581443, 0.14284331284187718, 0.38... |
| 4 | 2018-04-01 00:40:00 | 2018-04-01 01:03:00 | [0.19047388547365338, 0.3809477709472852, 0.38... |
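
The two window tables above are consistent with a two-step preparation: cutting the `sensor_00` series into overlapping 24-sample windows that start every 10 minutes, then rescaling each window to the range [0, 1] (for the first window, `(2.460474 - 2.444734) / (2.465394 - 2.444734)` gives roughly 0.7619, matching the displayed value). The sketch below illustrates that idea on synthetic data. The helper names `make_windows` and `min_max_scale`, the sine-wave input, and the window and step sizes are assumptions inferred from the displayed output, not code taken from the notebook.

```python
import numpy as np
import pandas as pd


def make_windows(df, value_col, time_col, window_size=24, step=10):
    """Cut overlapping fixed-length windows from a time-ordered column.

    window_size and step are assumed values read off the displayed timestamps.
    """
    rows = []
    for start in range(0, len(df) - window_size + 1, step):
        end = start + window_size
        rows.append(
            {
                "start_time": df[time_col].iloc[start],
                "end_time": df[time_col].iloc[end - 1],
                "vectors": df[value_col].iloc[start:end].tolist(),
            }
        )
    return pd.DataFrame(rows)


def min_max_scale(window):
    """Rescale one window so its minimum maps to 0 and its maximum to 1."""
    arr = np.asarray(window, dtype=float)
    lo, hi = arr.min(), arr.max()
    if hi == lo:
        # A constant window has no range; return zeros instead of dividing by 0.
        return np.zeros_like(arr).tolist()
    return ((arr - lo) / (hi - lo)).tolist()


# Synthetic one-minute series standing in for the sensor readings.
demo = pd.DataFrame(
    {
        "timestamp": pd.date_range("2018-04-01", periods=60, freq="min"),
        "sensor_00": np.sin(np.linspace(0.0, 6.0, 60)) + 2.45,
    }
)

windows = make_windows(demo, "sensor_00", "timestamp")
windows["vectors"] = windows["vectors"].apply(min_max_scale)
```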
| | __nn_distance | start_time | end_time | vectors |
|---|---|---|---|---|
| 0 | 0.000000 | 2018-04-01 13:00:00 | 2018-04-01 13:23:00 | [0.1818139814258758, 0.1818139814258758, 0.409... |
| 1 | 0.177821 | 2018-04-12 16:00:00 | 2018-04-12 16:23:00 | [0.2499788225328221, 0.2499788225328221, 0.416... |
| 2 | 0.295486 | 2018-04-02 06:20:00 | 2018-04-02 06:43:00 | [0.6818370835836001, 0.1818139814258758, 0.500... |
| 3 | 0.314223 | 2018-04-05 08:30:00 | 2018-04-05 08:53:00 | [0.4999682358172739, 0.0, 0.43751985261418974,... |
| 4 | 0.321309 | 2018-06-01 11:33:00 | 2018-06-01 11:56:00 | [0.3332956855658574, 0.44443189518861076, 0.44... |
| | document_id | text | embeddings |
|---|---|---|---|
| | document_id | text | embeddings |
|---|---|---|---|
| 0 | b'9eb9cc3b-286e-4597-8c8a-f7ca682f2432' | b'Draft version August 14, 2023\\nTypeset using L ATEX default style in AASTeX631\\nThe Galactic Interstellar Object Population: A Framework for Prediction and Inference\\nMatthew J. Hopkins\\n ,1Chris Lintott\\n ,1Michele T. Bannister\\n ,2J. Ted Mackereth\\n ,3, 4, 5, \\xe2\\x88\\x97and\\nJohn C. Forbes\\... | [-0.004561606794595718, 0.04634055867791176, 0.013912900350987911, 0.0031797082629054785, 0.0200442373752594, -0.020312566310167313, 0.012255963869392872, -0.014811805449426174, -0.011712595820426941, 0.0018984334310516715, -0.020097902044653893, 0.0038371162954717875, -0.00469912588596344, -0.0... |
| 1 | b'f576d93a-564d-4ac5-ba33-7e1f13631898' | b'2 Hopkins et al.\\nInitially it was expected that interstellar objects would display cometary characteristics (e.g. Jewitt 2003). The pop-\\nulation\\xe2\\x80\\x99s dominant dynamical formation mechanisms would preferentially harvest more distant, ice-rich planetesimals\\nfrom the disks of the sourc... | [0.0033084347378462553, 0.010728420689702034, -0.004158694297075272, 0.005771661177277565, 0.011960876174271107, -0.02330215647816658, 0.027854831889271736, -0.012802716344594955, 0.015557220205664635, 0.024352774024009705, -0.008337592706084251, 0.009091882035136223, -0.011509649455547333, -0.0... |
| 2 | b'2744c059-b537-4ce9-b602-920da3e53bca' | b'The Galactic ISO Population 3\\nprocesses modelled, and demonstrate this method by constraining the metallicity dependence of the ISO production\\nrate.\\n2.APOGEE AND STELLAR DENSITY MODELLING\\nTo predict the distribution of ISOs in the Milky Way, we first obtain the distribution of all stars th... | [0.034129347652196884, -0.011306616477668285, 0.06444961577653885, 0.006231018342077732, 0.011325662024319172, -0.01886763796210289, 0.007827657274901867, -0.014207865111529827, -0.02854269929230213, -0.0022870346438139677, -0.023108413442969322, 0.01556643657386303, -0.03113287314772606, 0.0062... |
| 3 | b'a554ad80-98d2-4a54-8e0c-ccde627dd2bd' | b'4 Hopkins et al.\\nfollows that the probability of finding a point (i.e. an observed star) with observables in the infinitesimal volume \\xce\\xb4O\\nis given by \\xce\\xbb(O)\\xce\\xb4O, and the total number of points (i.e. stars observed) is a Poisson random variable with mean and\\nvariance \\xce\\x9b... | [0.020134035497903824, 0.023560522124171257, 0.0745055228471756, -0.01058784406632185, 0.01670754887163639, -0.002545879688113928, 0.01768067106604576, -0.0037965471856296062, -0.011999555863440037, 0.00578048313036561, -0.009292631410062313, 0.0019273987272754312, -0.05098612233996391, 0.008847... |
| 4 | b'3ff2673c-f60f-4b73-bccd-05d81d38d9a4' | b'The Galactic ISO Population 5\\nThis particular form for the density profile has the advantage that the Poisson point process likelihood takes the\\ntractable form\\nlnL(logA , aR, az, \\xcf\\x840, \\xcf\\x89) = const + N\\x10\\nlogA\\xe2\\x88\\x92aR\\xe2\\x9f\\xa8R\\xe2\\x88\\x92R0\\xe2\\x9f\\xa9 \\xe2\\x88\\x92az\\x... | [0.027388010174036026, 0.04201922193169594, 0.04892353340983391, 0.009756388142704964, 0.02934333309531212, -0.014199691824615002, 0.02049718052148819, 0.00019879821047652513, -0.0037724252324551344, -0.0010425581131130457, -0.01610107533633709, 0.012965815141797066, -0.03557339683175087, 0.0098... |

| | document_id | text | embeddings |
|---|---|---|---|
| 0 | b'cb35e257-647f-4953-a447-919fd99f8957' | b'Draft version August 14, 2023\\nTypeset using L ATEX default style in AASTeX631\\nThe Galactic Interstellar Object Population: A Framework for Prediction and Inference\\nMatthew J. Hopkins\\n ,1Chris Lintott\\n ,1Michele T. Bannister\\n ,2J. Ted Mackereth\\n ,3, 4, 5, \\xe2\\x88\\x97and\\nJohn C. Forbes\\... | [-0.001896298, 0.04461563, 0.016039172, 0.004387382, 0.020142527, -0.02016926, 0.014515451, -0.014675843, -0.010986833, 0.001757626, -0.019247007, 0.0028486238, -0.006916893, -0.027132934, 0.027560644, 0.03560696, -0.026772052, -0.0022705453, -0.025194867, 0.018832661, 0.0025261696, -0.014903064... |
| 1 | b'f8b82c1f-d988-4f4f-acb5-0fe77d12baa2' | b'2 Hopkins et al.\\nInitially it was expected that interstellar objects would display cometary characteristics (e.g. Jewitt 2003). The pop-\\nulation\\xe2\\x80\\x99s dominant dynamical formation mechanisms would preferentially harvest more distant, ice-rich planetesimals\\nfrom the disks of the sourc... | [0.004427607, 0.00960885, -0.0025132392, 0.006977855, 0.011768823, -0.024210533, 0.029257199, -0.01263012, 0.016055124, 0.024721928, -0.0077987793, 0.008895588, -0.012596475, -0.040830884, -0.01087388, 0.019419566, -0.03523245, -0.0022962324, -0.031814177, -0.00019114243, 0.015880171, -0.0127579... |
| 2 | b'35284c91-4732-4831-b78d-c2284ec603b5' | b'The Galactic ISO Population 3\\nprocesses modelled, and demonstrate this method by constraining the metallicity dependence of the ISO production\\nrate.\\n2.APOGEE AND STELLAR DENSITY MODELLING\\nTo predict the distribution of ISOs in the Milky Way, we first obtain the distribution of all stars th... | [0.039092664, -0.01292948, 0.067334704, 0.008587963, 0.011623856, -0.018899858, 0.009500632, -0.013347787, -0.028723726, -0.0010053621, -0.021295615, 0.014830874, -0.03392087, 0.0021153009, 0.018215355, 0.02143505, -0.011446392, -0.007104876, -0.03916872, 0.032222293, 0.009564012, 0.010742877, 0... |
| 3 | b'6854d718-c425-4db7-a4dc-2737db1538aa' | b'4 Hopkins et al.\\nfollows that the probability of finding a point (i.e. an observed star) with observables in the infinitesimal volume \\xce\\xb4O\\nis given by \\xce\\xbb(O)\\xce\\xb4O, and the total number of points (i.e. stars observed) is a Poisson random variable with mean and\\nvariance \\xce\\x9b... | [0.023402294, 0.0226368, 0.07654956, -0.010600748, 0.017483374, -0.0025510825, 0.019055374, -0.004674991, -0.0119472, 0.0058847475, -0.0091654435, 0.0012020674, -0.053557355, 0.005966765, 0.03362713, 0.014708452, 0.008085548, -0.014038643, -0.050686747, 0.036224347, -0.010135982, 0.005768556, 0.... |
| 4 | b'0033757f-3651-4bdd-a52a-4e20aaa2da24' | b'The Galactic ISO Population 5\\nThis particular form for the density profile has the advantage that the Poisson point process likelihood takes the\\ntractable form\\nlnL(logA , aR, az, \\xcf\\x840, \\xcf\\x89) = const + N\\x10\\nlogA\\xe2\\x88\\x92aR\\xe2\\x9f\\xa8R\\xe2\\x88\\x92R0\\xe2\\x9f\\xa9 \\xe2\\x88\\x92az\\x... | [0.0287323, 0.04074224, 0.051109564, 0.010259612, 0.029594, -0.013921836, 0.021421317, 0.0003107252, -0.0040055574, -0.00013253682, -0.015672164, 0.012669679, -0.037403155, 0.008536213, 0.019495957, 0.032125242, -0.033660144, -0.012521574, -0.04103845, -0.014177653, 0.03632603, 0.0049749697, 0.0... |

| | company_name | company_description | vectors |
|---|---|---|---|
| 0 | Apple | A technology company known for its iPhones, Ma... | [-0.034876905, 0.032589626, -0.002934602, -0.0... |
| 1 | | A search engine giant that also specializes in... | [-0.017866805, -0.057211027, -0.028582964, 0.0... |
| 2 | Brave | A privacy-focused search engine and browser. | [-0.017717587, -0.020544883, -0.024149919, -0.... |
| 3 | Perplexity | An answer engine that searches the internet an... | [-0.043428417, 0.0026718834, -0.014964736, 0.0... |
| 4 | Amazon | An e-commerce leader that offers a wide range ... | [-0.04374583, -0.05412757, 0.03689078, -0.0363... |

| | company_name | company_description | vectors |
|---|---|---|---|
| 0 | Apple | A technology company known for its iPhones, Ma... | [-0.034876905381679535, 0.0325896255671978, -0... |
| 1 | | A search engine giant that also specializes in... | [-0.01786680519580841, -0.057211026549339294, ... |
| 2 | Brave | A privacy-focused search engine and browser. | [-0.01771758683025837, -0.020544882863759995, ... |
| 3 | Perplexity | An answer engine that searches the internet an... | [-0.043428417295217514, 0.0026718834415078163,... |
| 4 | Amazon | An e-commerce leader that offers a wide range ... | [-0.04374583065509796, -0.05412757024168968, 0... |
| 5 | Microsoft | A technology company known for its software pr... | [-0.03227386251091957, 0.018396837636828423, 0... |
| 6 | | A social media platform that connects people w... | [0.001751603209413588, -0.07014184445142746, 0... |
| 7 | Tesla | An electric vehicle manufacturer known for its... | [-0.0014115246012806892, 0.07673310488462448, ... |
| 8 | Rivian | An electric vehicle company focusing on advent... | [-0.004529878031462431, 0.05161484703421593, 0... |
| 9 | Lucid Motors | A company specializing in high-performance ele... | [0.003517858451232314, 0.07283230125904083, 0.... |
| 10 | Netflix | A streaming service that offers a wide variety... | [-0.06678618490695953, -0.04583403840661049, 0... |
| 11 | Hulu | A streaming platform providing a wide range of... | [-0.05717369541525841, -0.06787450611591339, 0... |
| 12 | Disney+ | A streaming service offering movies, TV shows,... | [-0.08020051568746567, -0.06656964868307114, 0... |
| 13 | Uber | A ride-sharing company that also offers food d... | [-0.0012249506544321775, -0.06182726100087166,... |
| 14 | Lyft | A ride-sharing platform connecting passengers ... | [-0.01856295019388199, -0.02381841652095318, 0... |
| 15 | Didi | A Chinese ride-sharing company offering variou... | [-0.025807108730077744, -0.03256842494010925, ... |
| 16 | Airbnb | A platform that allows people to rent out thei... | [-0.0020053349435329437, -0.0806913897395134, ... |
| 17 | Vrbo | A vacation rental online marketplace where hom... | [-0.023459283635020256, -0.017960241064429283,... |
| 18 | Booking.com | An online travel agency offering lodging reser... | [-0.011170933023095131, -0.01784929819405079, ... |
| 19 | Spotify | A music streaming service offering a wide rang... | [-0.04392420873045921, -0.08940696716308594, 0... |
| 20 | Apple Music | A music and video streaming service developed ... | [-0.03607852756977081, -0.07780104875564575, -... |
| 21 | YouTube Music | A music streaming service developed by YouTube | [-0.044059764593839645, -0.06850195676088333, ... |
| 22 | | A social media platform for sharing short mess... | [-0.07155980169773102, 0.019955234602093697, -... |
| 23 | | A photo and video sharing social networking se... | [-0.03469916060566902, -0.016102692112326622, ... |
| 24 | Snapchat | A multimedia messaging app known for its disap... | [-0.0604371502995491, 0.03670385479927063, -0.... |
| 25 | | A professional networking platform for job see... | [-0.08664378523826599, 0.012671931646764278, -... |
| 26 | Slack | A collaboration platform for team communicatio... | [-0.04067962244153023, 0.02870851941406727, -0... |
| 27 | Microsoft Teams | A collaboration platform for team communicatio... | [-0.04067962244153023, 0.02870851941406727, -0... |
| 28 | Zoom | A video conferencing platform used for virtual... | [-0.047964468598365784, -0.001985185779631138,... |

| | company_name | company_description | vectors |
|---|---|---|---|
| 0 | Apple | A technology company known for its iPhones, Ma... | [-0.034876905381679535, 0.0325896255671978, -0... |
| 1 | Amazon | An e-commerce leader that offers a wide range ... | [-0.04374583065509796, -0.05412757024168968, 0... |
| 2 | Airbnb | A platform that allows people to rent out thei... | [-0.0020053349435329437, -0.0806913897395134, ... |
| 3 | Apple Music | A music and video streaming service developed ... | [-0.03607852756977081, -0.07780104875564575, -... |

| | __nn_distance | company_name | company_description | vectors |
|---|---|---|---|---|
| 0 | 0.730767 | Zoom | A video conferencing platform used for virtual... | [-0.047964468598365784, -0.001985185779631138,... |

| | __nn_distance | company_name | company_description | vectors |
|---|---|---|---|---|
| 0 | 0.730767 | Zoom | A video conferencing platform used for virtual... | [-0.047964468598365784, -0.001985185779631138,... |
| 1 | 0.714641 | Booking.com | An online travel agency offering lodging reser... | [-0.011170933023095131, -0.01784929819405079, ... |
| 2 | 0.714121 | Microsoft Teams | A collaboration platform for team communicatio... | [-0.04067962244153023, 0.02870851941406727, -0... |

| | __nn_distance | company_name | company_description | vectors |
|---|---|---|---|---|
| 0 | 0.730767 | Zoom | A video conferencing platform used for virtual... | [-0.047964468598365784, -0.001985185779631138,... |
| 1 | 0.714121 | Microsoft Teams | A collaboration platform for team communicatio... | [-0.04067962244153023, 0.02870851941406727, -0... |
| 2 | 0.714121 | Slack | A collaboration platform for team communicatio... | [-0.04067962244153023, 0.02870851941406727, -0... |
as_retriever, we can create a retriever from a vectorstore and use it to retrieve relevant documents for a query. This allows us to perform question answering over the documents indexed by the vectorstore `vecdb_kdbai`."
]
},
{
"cell_type": "code",
"execution_count": 72,
"id": "2d1ba3cf",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"what are the nations strengths?\n",
"-----\n",
"The strengths of the United States, as highlighted in the context, include:\n",
"\n",
"1. **Resilience and Adaptability**: The nation has a history of turning crises into opportunities.\n",
"2. **Possibilities**: The country is defined by the concept of possibilities, indicating a forward-looking and optimistic outlook.\n",
"3. **Strength of the People**: The American people are described as strong, contributing to the overall strength of the nation.\n",
"4. **Diplomacy and Resolve**: American diplomacy and resolve are emphasized as important factors in international relations.\n",
"5. **Military Preparedness**: The U.S. has mobilized ground forces, air squadrons, and ship deployments to protect NATO allies.\n",
"6. **Economic Sanctions and Coalition Building**: The U.S. has built a coalition of nations to impose economic sanctions on Russia and support Ukraine.\n",
"7. **Historical Achievements**: The nation has a legacy of fighting for freedom, expanding liberty, and defeating totalitarianism and terror.\n",
"8. **Infrastructure Investment**: The passage of the Bipartisan Infrastructure Law is seen as a significant investment in rebuilding and improving the nation's infrastructure.\n",
"9. **Unity and Collective Action**: The emphasis on unity and collective action as one people, one America, is a key strength.\n",
"\n",
"These strengths contribute to the nation's ability to meet and overcome current and future challenges.\n"
]
}
],
"source": [
"print(query)\n",
"print(\"-----\")\n",
"print(qabot.invoke(dict(query=query))[\"result\"])"
]
},
{
"cell_type": "markdown",
"id": "e20a3d7a",
"metadata": {},
"source": [
"Trying another query:"
]
},
{
"cell_type": "code",
"execution_count": 73,
"id": "9ed67c2b",
"metadata": {},
"outputs": [],
"source": [
"def query_qabot(qabot, query: str):\n",
"    # Print the query that was passed in, then return the bot's answer to it\n",
"    print(query)\n",
"    print(\"---\")\n",
"    return qabot.invoke(dict(query=query))[\"result\"]"
]
},
{
"cell_type": "code",
"execution_count": 74,
"id": "a1b517b1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"what are the things this country needs to protect?\n",
"---\n"
]
},
{
"data": {
"text/plain": [
"\"This country needs to protect several key areas:\\n\\n1. **American Jobs and Businesses**: By ensuring taxpayer dollars support American jobs and businesses through initiatives like Buy American policies.\\n2. **Safety and Security**: By investing in crime prevention, community policing, and measures to reduce gun violence, such as universal background checks and banning assault weapons and high-capacity magazines.\\n3. **Immigration and Border Security**: By providing pathways to citizenship for certain groups, revising laws to meet labor needs, and securing borders with new technology and joint patrols.\\n4. **Voting Rights**: By protecting the fundamental right to vote and ensuring that votes are counted, combating laws that suppress or subvert elections.\\n5. **Liberty and Justice**: By advancing immigration reform, protecting women's rights, and holding law enforcement accountable.\\n6. **National and International Security**: By maintaining strong American diplomacy and resolve, particularly in response to international conflicts like Russia's attack on Ukraine.\\n\\nThese areas are crucial for maintaining the country's integrity, safety, and prosperity.\""
]
},
"execution_count": 74,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"new_query = \"what are the things this country needs to protect?\"\n",
"query_qabot(qabot, new_query)"
]
},
{
"cell_type": "markdown",
"id": "9d7574cc",
"metadata": {},
"source": [
"Retrieval Augmented Generation is a valuable technique that combines the capabilities of language models such as GPT-3 with the power of information retrieval.\n",
"By enriching the input with contextually specific data, RAG enables language models to produce responses that are both more precise and better suited to the context.\n",
"Particularly in enterprise scenarios where extensive fine-tuning may not be feasible, RAG presents an efficient and economically viable approach to deliver personalized and informed interactions with users."
]
},
{
"cell_type": "markdown",
"id": "65f0568a",
"metadata": {},
"source": [
"## 6. Delete the KDB.AI Table\n",
"\n",
"Once finished with the table, it is best practice to drop it."
]
},
{
"cell_type": "code",
"execution_count": 75,
"id": "bf0e3026",
"metadata": {},
"outputs": [],
"source": [
"table.drop()"
]
},
{
"cell_type": "markdown",
"id": "2e8a102d",
"metadata": {},
"source": [
"## Take Our Survey\n",
"\n",
"We hope you found this sample helpful! Your feedback is important to us, and we would appreciate it if you could take a moment to fill out our brief survey. Your input helps us improve our content.\n",
"\n",
"[**Take the Survey**](https://delighted.com/t/dgCLUkdx)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
================================================
FILE: retrieval_augmented_generation/retrieval_augmented_generation_evaluation.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"id": "48eeba82",
"metadata": {},
"source": [
"# Retrieval Augmented Generation Evaluation with LangChain and KDB.AI\n",
"\n",
"##### Note: This example requires a KDB.AI Server. Sign up for a free [KDB.AI account](https://kdb.ai/get-started).\n",
"\n",
"This notebook is a guide to using LangChain tooling to evaluate a basic Retrieval Augmented Generation (RAG) system.\n",
"\n",
"The evaluation uses [LangChain's String Evaluators](https://python.langchain.com/docs/guides/evaluation/string/) to assess both conciseness and correctness. KDB.AI serves as the primary knowledge base, enabling the retrieval of semantically relevant content for the evaluation.\n",
"\n",
"### Aim\n",
"\n",
"In this tutorial, we build upon the retrieval augmented generation pipeline seen in our [retrieval_augmented_generation.ipynb](retrieval_augmented_generation.ipynb) notebook.\n",
"If you have not seen it, please read that notebook first, as it covers the RAG setup steps in greater detail than we do here.\n",
"\n",
"This notebook focuses on evaluating your retrieval augmented generation pipeline, using KDB.AI as the vector store.\n",
"We will cover the following topics:\n",
"\n",
"1. Load Text Data\n",
"1. Define OpenAI Text Embedding Model\n",
"1. Store Embeddings In KDB.AI\n",
"1. Perform Retrieval Augmented Generation\n",
"1. Evaluate Retrieval Augmented Generation\n",
"1. Delete the KDB.AI Table\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "e88331c4",
"metadata": {},
"source": [
"## 0. Setup"
]
},
{
"cell_type": "markdown",
"id": "80d6c97e",
"metadata": {},
"source": [
"### Install dependencies \n",
"\n",
"In order to successfully run this sample, note the following steps depending on where you are running this notebook:\n",
"\n",
"- ***Run Locally / Private Environment:*** The [Setup](https://github.com/KxSystems/kdbai-samples/blob/main/README.md#setup) steps in the repository's `README.md` will guide you on prerequisites and how to run this with Jupyter.\n",
"\n",
"\n",
"- ***Colab / Hosted Environment:*** Open this notebook in Colab and run through the cells.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c93b2276",
"metadata": {},
"outputs": [],
"source": [
"!pip install kdbai_client langchain langchain_openai #langchain-community\n",
"\n",
"import os\n",
"!git clone -b KDBAI_v1.4 https://github.com/KxSystems/langchain.git\n",
"os.chdir('langchain/libs/community')\n",
"!pip install ."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c95778f5",
"metadata": {},
"outputs": [],
"source": [
"### !!! Only run this cell if you need to download the data into your environment, for example in Colab\n",
"### This downloads State of the Union Speech data\n",
"import os\n",
"\n",
"if not os.path.exists(\"./data/state_of_the_union.txt\"):\n",
"    !mkdir -p ./data\n",
"    !wget -P ./data https://raw.githubusercontent.com/KxSystems/kdbai-samples/main/retrieval_augmented_generation/data/state_of_the_union.txt"
]
},
{
"cell_type": "markdown",
"id": "679126f7",
"metadata": {},
"source": [
"### Import Packages\n",
"\n",
"Load the various libraries that will be needed in this tutorial, including all the langchain libraries we will use."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "894980f2",
"metadata": {},
"outputs": [],
"source": [
"# vector DB\n",
"from getpass import getpass\n",
"import kdbai_client as kdbai\n",
"import time"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "b9549fe3",
"metadata": {},
"outputs": [],
"source": [
"# langchain packages\n",
"from langchain.chains import RetrievalQA\n",
"from langchain_openai import ChatOpenAI\n",
"from langchain.document_loaders import TextLoader\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_openai import OpenAIEmbeddings\n",
"from langchain_community.vectorstores import KDBAI"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "5ab423cd",
"metadata": {},
"outputs": [],
"source": [
"# evaluation packages\n",
"from langchain.evaluation import load_evaluator"
]
},
{
"cell_type": "markdown",
"id": "bc263a6e",
"metadata": {},
"source": [
"### Set API Keys\n",
"\n",
"To follow this example you will need to request an [OpenAI API Key](https://platform.openai.com/apps). \n",
"\n",
"You can create this for free by registering using the links provided.\n",
"Once you have the credentials you can add them below."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "ed70fbe3",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = (\n",
" os.environ[\"OPENAI_API_KEY\"]\n",
" if \"OPENAI_API_KEY\" in os.environ\n",
" else getpass(\"OpenAI API Key: \")\n",
")"
]
},
{
"cell_type": "markdown",
"id": "f56faa93",
"metadata": {},
"source": [
"### Define Helper Functions"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "b03039cb",
"metadata": {},
"outputs": [],
"source": [
"def print_dict(d: dict) -> None:\n",
"    for k, v in d.items():\n",
"        print(f\"\\n{k.capitalize()}\\n---\\n{v}\".replace('\\n\\n', '\\n'))"
]
},
{
"cell_type": "markdown",
"id": "164f0b99",
"metadata": {},
"source": [
"## 1. Load Text Data"
]
},
{
"cell_type": "markdown",
"id": "f04aa63a",
"metadata": {},
"source": [
"### Read In Text Document\n",
"\n",
"The document we will use for this example is a State of the Union message from the President of the United States to the United States Congress.\n",
"\n",
"In the below code snippet, we read the text file in."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "69dfbffd",
"metadata": {},
"outputs": [],
"source": [
"# Load the documents we want to prompt an LLM about\n",
"doc = TextLoader(\"data/state_of_the_union.txt\").load()"
]
},
{
"cell_type": "markdown",
"id": "ed001b92",
"metadata": {},
"source": [
"### Split The Document Into Chunks\n",
"\n",
"We then split this document into chunks."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "84bfd8a4",
"metadata": {},
"outputs": [],
"source": [
"# Chunk the documents into 500 character chunks using langchain's text splitter \"RecursiveCharacterTextSplitter\"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "e9c70879",
"metadata": {},
"outputs": [],
"source": [
"# split_documents produces a list of all the chunks created; keep only the text of each chunk\n",
"pages = [p.page_content for p in text_splitter.split_documents(doc)]"
]
},
{
"cell_type": "markdown",
"id": "fd1cf6a4",
"metadata": {},
"source": [
"## 2. Define OpenAI Text Embedding Model\n",
" \n",
"We will use OpenAIEmbeddings to embed our document into a format suitable for the vector database. We select `text-embedding-3-small`, whose 1536-dimensional output matches the `dims` of the index defined in the next step."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "ffa379e4",
"metadata": {},
"outputs": [],
"source": [
"embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")"
]
},
{
"cell_type": "markdown",
"id": "7287e75d",
"metadata": {},
"source": [
"## 3. Store Embeddings In KDB.AI"
]
},
{
"cell_type": "markdown",
"id": "8a9f5b2f",
"metadata": {},
"source": [
"With the embeddings created, we need to store them in a vector database to enable efficient searching.\n",
"\n",
"### Define KDB.AI Session\n",
"To use KDB.AI Server, you will need to download and run your own container.\n",
"To do this, you will first need to sign up for free [here](https://trykdb.kx.com/kdbaiserver/signup/).\n",
"\n",
"You will receive an email with the required license file and bearer token needed to download your instance.\n",
"Follow instructions in the signup email to get your session up and running.\n",
"\n",
"Once the [setup steps](https://code.kx.com/kdbai/gettingStarted/kdb-ai-server-setup.html) are complete you can then connect to your KDB.AI Server session using `kdbai.Session` and passing your local endpoint.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e62f00a8",
"metadata": {},
"outputs": [],
"source": [
"#Set up KDB.AI server endpoint \n",
"KDBAI_ENDPOINT = (\n",
" os.environ[\"KDBAI_ENDPOINT\"]\n",
" if \"KDBAI_ENDPOINT\" in os.environ\n",
" else \"http://localhost:8082\"\n",
")\n",
"\n",
"#connect to KDB.AI Server, default mode is qipc\n",
"session = kdbai.Session(endpoint=KDBAI_ENDPOINT)\n"
]
},
{
"cell_type": "markdown",
"id": "d4d72b5b",
"metadata": {},
"source": [
"### Define Vector DB Table Schema"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "299902d3",
"metadata": {},
"outputs": [],
"source": [
"rag_eval_schema = [\n",
" {\"name\": \"id\", \"type\": \"str\"},\n",
" {\"name\": \"text\", \"type\": \"bytes\"},\n",
" {\"name\": \"embeddings\", \"type\": \"float32s\"}\n",
"]\n",
"indexes = [{\"name\": \"flat_index\", \"type\": \"flat\", \"column\": \"embeddings\", \"params\": {\"dims\": 1536, \"metric\": \"L2\"}}]"
]
},
{
"cell_type": "markdown",
"id": "640fceb2",
"metadata": {},
"source": [
"### Create Vector DB Table\n",
"\n",
"Use the KDB.AI `create_table` function to create a table that matches the defined schema in the vector database."
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "0bbc9942",
"metadata": {},
"outputs": [],
"source": [
"database = session.database(\"default\")\n",
"# First ensure the table does not already exist\n",
"try:\n",
" database.table(\"rag_eval\").drop()\n",
"except kdbai.KDBAIException:\n",
" pass"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "37840395",
"metadata": {},
"outputs": [],
"source": [
"table = database.create_table(\"rag_eval\", schema=rag_eval_schema, indexes=indexes)"
]
},
{
"cell_type": "markdown",
"id": "934da954",
"metadata": {},
"source": [
"### Add Embedded Data to KDB.AI Table\n",
"\n",
"We can now store our data in KDB.AI by wrapping the table in LangChain's `KDBAI` vector store and calling `add_texts`, which returns the IDs of the inserted chunks:\n",
"\n",
"- `table` the KDB.AI table created above\n",
"- `embeddings` the embeddings model we have chosen\n",
"- `texts` the chunked document"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "7680e758",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['3a39deab-3ca1-457b-a725-192878e2ef3e',\n",
" 'b9e62100-9bf3-4d66-9066-e546505346a8',\n",
" 'baf3fbeb-1997-4eec-be6a-faec7431380e',\n",
" 'de7c3272-38e3-48e0-bbc6-1240cb639430',\n",
" 'ee1984d6-3be6-4e16-bb54-8a0d4ba80e36',\n",
" 'ac042a18-cb3e-4369-bd7a-114fd77b938a',\n",
" 'e4842ca7-d965-44d2-a20e-a23c8710c469',\n",
" 'd574a79a-2cdc-41e5-bc5b-56de4796a2da',\n",
" '67f4e777-37e4-4cb6-a2b5-614218593a2a',\n",
" 'e3656ced-e86b-400d-8f5b-d706ece9dd70',\n",
" '47e20998-cadc-4d5e-8514-0221821921f6',\n",
" 'cf8c9905-dd1b-48e0-8cfb-b3e6cfe41649',\n",
" '9f58e0e9-bd90-4ae4-9961-798b312467c4',\n",
" '2da5b147-45dd-4325-977b-26b9e9c825f0',\n",
" '6744caee-09d6-4aee-a2d0-017c568cbbfd',\n",
" '434d9d87-0e35-4aa7-8ffc-db4bfbc90f16',\n",
" '4c2fc45d-c631-4856-b8ab-61af03d3c41a',\n",
" '536d2df1-40f9-4dab-be85-28b4cb6f79af',\n",
" '750b3adc-17d2-4d6f-836e-1a6382ad2ff2',\n",
" '0b27e61e-638f-442a-9b92-752eb99cf67d',\n",
" 'fff6c3a1-94ee-4d36-90a3-42f5405f52ff',\n",
" 'a3779604-81bf-4267-aae7-e38c97577ed9',\n",
" '6539774a-07c3-4529-a3e3-f71ee58d4667',\n",
" '0db73c13-5b88-4a48-a8d8-7a94777c3470',\n",
" '8b25f891-30d1-4e2b-969b-3e39359edbc8',\n",
" '02780b29-f975-4b93-b861-c79bac8b8c1b',\n",
" '83d5862e-735b-4b9f-814b-b84032a1ca16',\n",
" 'c5180fcb-123d-478d-ab3c-a0268768ffee',\n",
" 'df4a243a-3098-49d5-aa74-5d1aa8fe0f39',\n",
" 'd48d9b20-397b-49e3-bb9b-bc11e8bd60d1',\n",
" '48b63bab-3a8b-4591-b622-53e35960efef',\n",
" '1926e282-f3e1-463e-92a1-a0377e6368c1',\n",
" '7e998714-bc74-4c15-9b04-3da4ddb5db9e',\n",
" '72fd0de7-45d7-41ec-9c14-871125832889',\n",
" '63002836-4c17-40ea-a00a-0717b482ea81',\n",
" 'c3051102-79f3-4fca-acec-f2cda194991c',\n",
" '40427fc7-fa8c-44c9-a9c0-9894172cc60d',\n",
" 'bd7d5d09-8da6-44c7-b479-bd913e82ce15',\n",
" '8d8e9abd-21e3-4a5a-a5ef-bea3f06f5a84',\n",
" '0662a00a-eb47-4a3f-9c97-8642033efed2',\n",
" 'd918c68c-dec9-4782-a563-6225cafede56',\n",
" '1898189a-d5ae-4fef-acfd-e23023d2eca2',\n",
" 'e1138c36-5253-4730-8176-b605252b7d99',\n",
" 'dbf19503-d6fc-4547-81f5-58c27041b668',\n",
" 'ed6b3314-22da-44ac-b4e5-16afee761b6a',\n",
" '5193a98e-3265-416d-babb-cc2f6945ed1a',\n",
" '8e66f06e-d0e2-4988-8e63-6dd09c9c4322',\n",
" 'f0a267a5-90b0-4785-8ff3-f686de966916',\n",
" 'cd856315-e980-483e-a8f3-b921477ba8fb',\n",
" 'b856a809-d4c9-4c9f-a0e4-929314a00d10',\n",
" '9f603984-a7d0-4cd1-9aea-3c94b8ab9e51',\n",
" 'a351f9cf-f06c-4269-98c6-79c9d657b48e',\n",
" '1ab0f48d-11dc-44f0-befc-7dee347e781a',\n",
" '4460e410-ee24-4dad-a9bd-ed0fc260be08',\n",
" '6d52313a-8372-42b2-950e-46072d27677e',\n",
" 'e491c483-9f36-4cdf-87a1-3b67effd2961',\n",
" '0fa360bb-0511-46c7-ab6e-0c7f29451174',\n",
" 'ecb0b0c7-e5b5-42a3-821b-73a60c664396',\n",
" '29181cdd-ed88-47e4-9ce1-5f55a967168b',\n",
" '4f046129-3cf9-4717-90ca-50cf83b9995b',\n",
" 'f8781a59-22b5-4b5c-887b-bb425eb33db1',\n",
" '7378bb22-dcc0-454d-a0b5-dda287961fe7',\n",
" '6d800662-6278-4100-8656-9943cce01b98',\n",
" '8ccf782f-05b9-4688-9971-58e2a61392f1',\n",
" '0aed16bd-74ac-437e-b6c4-c5a02723ff1f',\n",
" '4b028b58-7c89-4678-9421-187b87a0cd3b',\n",
" '3b2d0909-7ad7-42ed-86df-aeefbf41a553',\n",
" 'e98f3e47-cf31-4efd-9ab9-2b44125129cb',\n",
" 'c1827cb0-513e-4d31-bca1-94bd0ec69238',\n",
" '4071e390-052f-4ae7-8c49-150c9840243c',\n",
" 'bb797699-18fa-4df1-889a-e1ef41c312fe',\n",
" '37f14b6c-26d4-4f45-b69e-747998ce4ce9',\n",
" '2345f851-e017-4604-b54c-cca25f58598f',\n",
" '1320c41a-cab0-478e-84e8-ee28e880e38f',\n",
" '11108ede-09bb-47c3-8739-0be3ee1ace5a',\n",
" 'bac8a8f7-7d49-4faa-b4cc-a1267392eea4',\n",
" '58f3355d-1abe-484b-bcbc-30d405c0734d',\n",
" '1a516946-6ed9-46eb-82c4-5d38c32423d9',\n",
" '699a3389-3fe9-493c-9e64-979b72dafcb7',\n",
" 'a6a244eb-26fe-4538-8fb7-4d9a94df7ddb',\n",
" '67fa79e6-3feb-447c-a173-847070ca9e81',\n",
" 'bb9b7307-6c5f-4c49-ac16-8d306dc7d368',\n",
" '4b413721-89b4-4681-8e0e-b1406d4ced37',\n",
" 'fc2b623a-eb81-4272-880b-b3864dd2ec8c',\n",
" '443ff2b1-7710-42cc-b805-6fd776468503',\n",
" '7cbec07a-5336-4025-8a08-eaaf889ee75e',\n",
" '131a74eb-534f-4b60-87f8-d1f2e782f4d7',\n",
" '42552610-46ca-4a3b-800f-ad575a94d5db',\n",
" '64568872-a133-4777-a63c-1324b99fc845']"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# use KDBAI as vector store\n",
"vecdb_kdbai = KDBAI(table, embeddings)\n",
"vecdb_kdbai.add_texts(texts=pages)"
]
},
{
"cell_type": "markdown",
"id": "ece8d806",
"metadata": {},
"source": [
"Now we have the vector embeddings stored in KDB.AI we are ready to query."
]
},
{
"cell_type": "markdown",
"id": "38569892",
"metadata": {},
"source": [
"## 4. Perform Retrieval Augmented Generation"
]
},
{
"cell_type": "markdown",
"id": "d34a0636",
"metadata": {},
"source": [
"We will perform [question answering (QA) in LangChain](https://python.langchain.com/docs/use_cases/question_answering/#go-deeper-4) using `RetrievalQA`.\n",
"\n",
"`RetrievalQA` retrieves the chunks of text most relevant to a question and passes only that subset to the LLM to generate the answer.\n",
"We will use KDB.AI as the retriever for `RetrievalQA`.\n",
"\n",
"### Define QA Bot\n",
"\n",
"The code below defines a question-answering bot that combines OpenAI's GPT-4o-mini, which generates the responses, with a retriever that searches the KDB.AI vector database for the `K` most relevant chunks."
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "9011f654",
"metadata": {},
"outputs": [],
"source": [
"K = 10"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "3ca7342e",
"metadata": {},
"outputs": [],
"source": [
"qabot = RetrievalQA.from_chain_type(\n",
" chain_type=\"stuff\",\n",
" llm=ChatOpenAI(model=\"gpt-4o-mini\", temperature=0.0),\n",
" retriever=vecdb_kdbai.as_retriever(search_kwargs=dict(k=K, index=\"flat_index\")),\n",
" return_source_documents=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "cb8ee6a4",
"metadata": {},
"source": [
"`as_retriever` is a method that converts a vector store into a retriever, an interface that returns documents given an unstructured query. Here it wraps `vecdb_kdbai` so that `RetrievalQA` can fetch the `K` most relevant documents for each query from the `flat_index` index and answer questions over them."
]
},
{
"cell_type": "markdown",
"id": "057670d5",
"metadata": {},
"source": [
"### Query The QA Bot"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "de98d6be",
"metadata": {},
"outputs": [],
"source": [
"def query_qabot(qabot, query: str) -> str:\n",
" query_res = qabot.invoke(dict(query=query))[\"result\"]\n",
" print(f\"{query}\\n---\\n{query_res}\")\n",
" return query_res"
]
},
{
"cell_type": "markdown",
"id": "df3ca2ca",
"metadata": {},
"source": [
"##### Query 1"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "85ca7b27",
"metadata": {},
"outputs": [],
"source": [
"query1 = \"What improvements could be made in infrastructure?\""
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "fd88bf8e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"What improvements could be made in infrastructure?\n",
"---\n",
"Improvements that could be made in infrastructure include:\n",
"\n",
"1. Rebuilding and modernizing roads, highways, and bridges to ensure safety and efficiency.\n",
"2. Expanding and upgrading public transportation systems to provide better access and reduce congestion.\n",
"3. Developing a national network of electric vehicle charging stations to support the transition to electric vehicles.\n",
"4. Replacing lead pipes to ensure clean drinking water for all Americans.\n",
"5. Providing affordable high-speed internet access to urban, suburban, rural, and tribal communities.\n",
"6. Upgrading airports, ports, and waterways to enhance transportation and trade capabilities.\n",
"7. Implementing sustainable practices to withstand the effects of climate change and promote environmental justice. \n",
"\n",
"These improvements aim to enhance the overall infrastructure and support economic growth and competitiveness.\n"
]
}
],
"source": [
"res1 = query_qabot(qabot, query1)"
]
},
{
"cell_type": "markdown",
"id": "a0f329b3",
"metadata": {},
"source": [
"##### Query 2"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "d0eece5c",
"metadata": {},
"outputs": [],
"source": [
"query2 = \"How many jobs were created in the country due to the electric vehicle manufacturing industry?\""
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "41997198",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"How many jobs were created in the country due to the electric vehicle manufacturing industry?\n",
"---\n",
"Ford is creating 11,000 jobs and GM is creating 4,000 jobs in the electric vehicle manufacturing industry, which totals 15,000 jobs.\n"
]
}
],
"source": [
"res2 = query_qabot(qabot, query2)"
]
},
{
"cell_type": "markdown",
"id": "36d38581",
"metadata": {},
"source": [
"## 5. Evaluate Retrieval Augmented Generation"
]
},
{
"cell_type": "markdown",
"id": "3933ae8c",
"metadata": {},
"source": [
"Here we will carry out two evaluation techniques against the results of our retrieval augmented generation pipeline.\n",
"We will measure the *Conciseness* and the *Correctness* of the answers.\n",
"\n",
"### Evaluate Conciseness\n",
"\n",
"We will evaluate the conciseness of the answers the QA bot returns using LangChain's `load_evaluator` function with the `criteria` set to `\"conciseness\"`.\n",
"\n",
"In this example, we use GPT-4o as the LLM that performs the evaluation."
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "9c14a699",
"metadata": {},
"outputs": [],
"source": [
"evaluation_llm = ChatOpenAI(model=\"gpt-4o\")"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "d41ed25a",
"metadata": {},
"outputs": [],
"source": [
"concise_evaluator = load_evaluator(\n",
" \"criteria\", criteria=\"conciseness\", llm=evaluation_llm\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "0a63b03e",
"metadata": {},
"outputs": [],
"source": [
"concise_eval_res = concise_evaluator.evaluate_strings(prediction=res1, input=query1)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "a7866960-9256-4df2-8087-49dcf43c3124",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Reasoning\n",
"---\n",
"To determine if the submission meets the criterion of conciseness, we need to assess whether it is brief and to the point. Here is a step-by-step reasoning process:\n",
"1. **Identify Key Points**: The submission lists seven specific improvements that could be made in infrastructure:\n",
" - Rebuilding and modernizing roads, highways, and bridges.\n",
" - Expanding and upgrading public transportation systems.\n",
" - Developing a national network of electric vehicle charging stations.\n",
" - Replacing lead pipes.\n",
" - Providing affordable high-speed internet access.\n",
" - Upgrading airports, ports, and waterways.\n",
" - Implementing sustainable practices.\n",
"2. **Examine Each Point for Brevity**:\n",
" - Each point is presented in a single sentence.\n",
" - The points are specific and avoid unnecessary elaboration.\n",
"3. **Overall Length and Focus**:\n",
" - The list format helps in making the submission concise.\n",
" - The concluding sentence summarizes the purpose of the improvements succinctly: \"These improvements aim to enhance the overall infrastructure and support economic growth and competitiveness.\"\n",
"4. **Relevance**:\n",
" - Each item on the list is directly relevant to the question about improvements in infrastructure.\n",
" - There is no extraneous information or digression from the main topic.\n",
"5. **Conclusion**:\n",
" - The submission effectively communicates the necessary information in a clear and concise manner without unnecessary verbosity.\n",
"Based on this detailed analysis, the submission meets the criterion of conciseness.\n",
"Y\n",
"\n",
"Value\n",
"---\n",
"Y\n",
"\n",
"Score\n",
"---\n",
"1\n"
]
}
],
"source": [
"print_dict(concise_eval_res)"
]
},
{
"cell_type": "markdown",
"id": "ed8e4e74-876c-4dae-9e0d-42f20698ea43",
"metadata": {},
"source": [
"### Evaluate Correctness\n",
"\n",
"We can use the same `load_evaluator` function to evaluate correctness by switching the evaluator type to `\"labeled_criteria\"` and setting `criteria` to `\"correctness\"`.\n",
"\n",
"A labeled evaluator takes a reference answer and checks the prediction against it.\n",
"Let's pass one reference that matches the information returned and one that contradicts it.\n",
"\n",
"For this evaluation, we will use the result of the second query we ran through our RAG pipeline."
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "3d5742f8",
"metadata": {},
"outputs": [],
"source": [
"correct_evaluator = load_evaluator(\n",
" \"labeled_criteria\",\n",
" criteria=\"correctness\",\n",
" llm=evaluation_llm,\n",
" requires_reference=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "8b0fe16e",
"metadata": {},
"source": [
"##### Matching Reference"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "86e41652",
"metadata": {},
"outputs": [],
"source": [
"matching_ref = \"15000 jobs were created due to manufacturing of electric vehicles.\""
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "a2bd3f02",
"metadata": {},
"outputs": [],
"source": [
"correct_eval_res1 = correct_evaluator.evaluate_strings(\n",
" prediction=res2, input=query2, reference=matching_ref\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "708d3386-f28d-4a6a-bb7c-10a15e8574af",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Reasoning\n",
"---\n",
"Step-by-step reasoning:\n",
"1. **Correctness**: \n",
" - The submission states that Ford is creating 11,000 jobs and GM is creating 4,000 jobs, which totals 15,000 jobs.\n",
" - The reference states that 15,000 jobs were created due to the manufacturing of electric vehicles.\n",
"2. **Accuracy**:\n",
" - The total number of jobs mentioned in the submission (15,000) matches the reference number (15,000). \n",
" - The specific companies mentioned (Ford and GM) and their respective job creation numbers (11,000 and 4,000) add up correctly to 15,000 jobs.\n",
"3. **Factuality**:\n",
" - There is no conflicting information between the submission and the reference.\n",
" - The details provided about the companies (Ford and GM) are not disputed by the reference, and since the total number aligns, it can be considered factual.\n",
"Conclusion:\n",
"- Since the submission correctly totals 15,000 jobs, which matches the reference, and there are no inaccuracies or factual errors, the submission meets the criteria.\n",
"Y\n",
"\n",
"Value\n",
"---\n",
"Y\n",
"\n",
"Score\n",
"---\n",
"1\n"
]
}
],
"source": [
"print_dict(correct_eval_res1)"
]
},
{
"cell_type": "markdown",
"id": "4f1a317f",
"metadata": {},
"source": [
"##### Contradictory Reference"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "a2b8c14e",
"metadata": {},
"outputs": [],
"source": [
"contradictory_ref = \"12000 jobs were created due to manufacturing of electric vehicles.\""
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "ea0cb2fc",
"metadata": {},
"outputs": [],
"source": [
"correct_eval_res2 = correct_evaluator.evaluate_strings(\n",
" prediction=res2, input=query2, reference=contradictory_ref\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "2172b83f-61ca-4963-8225-0378580a67a8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Reasoning\n",
"---\n",
"First, I will assess the submission based on the criterion of correctness, which includes accuracy and factuality. Here is the step-by-step reasoning:\n",
"1. **Correctness**:\n",
" - The submission states that Ford is creating 11,000 jobs and GM is creating 4,000 jobs in the electric vehicle manufacturing industry, totaling 15,000 jobs.\n",
" - The reference data indicates that 12,000 jobs were created due to the manufacturing of electric vehicles.\n",
" - There is a discrepancy between the submission and the reference data. The submission claims a total of 15,000 jobs, whereas the reference data states 12,000 jobs.\n",
" - Since the submission's total (15,000 jobs) does not match the reference data (12,000 jobs), it is not factually correct.\n",
"Given this analysis, the submission does not meet the criterion of correctness as it provides an inaccurate total number of jobs created.\n",
"Therefore, the answer is:\n",
"N\n",
"\n",
"Value\n",
"---\n",
"N\n",
"\n",
"Score\n",
"---\n",
"0\n"
]
}
],
"source": [
"print_dict(correct_eval_res2)"
]
},
{
"cell_type": "markdown",
"id": "7195efbb",
"metadata": {},
"source": [
"## 6. Delete the KDB.AI Table\n",
"\n",
"Once finished with the table, it is best practice to drop it."
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "1d83ed49",
"metadata": {},
"outputs": [],
"source": [
"table.drop()"
]
},
{
"cell_type": "markdown",
"id": "f7ed75e9",
"metadata": {},
"source": [
"## Take Our Survey\n",
"\n",
"We hope you found this sample helpful! Your feedback is important to us, and we would appreciate it if you could take a moment to fill out our brief survey. Your input helps us improve our content.\n",
"\n",
"[**Take the Survey**](https://delighted.com/t/dgCLUkdx)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
================================================
FILE: sentiment_analysis/data/disneyland_reviews.csv
================================================
[File too large to display: 30.6 MB]
================================================
FILE: sentiment_analysis/sentiment_analysis.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"id": "48b6c907",
"metadata": {
"id": "48b6c907"
},
"source": [
"# Sentiment Analysis on Disneyland Resort Reviews\n",
"\n",
"##### Note: This example requires a KDB.AI server. Sign up for a free [KDB.AI account](https://kdb.ai/get-started).\n",
"\n",
"In this example, we will extract sentiment from Disneyland Resort reviews to gain a deeper understanding of customer experiences.\n",
"\n",
"We will use Natural Language Processing (NLP) and sentiment analysis techniques to assess the sentiment expressed in these reviews, then combine the results with a vector database.\n",
"\n",
"Using KDB.AI, we can store not only the reviews themselves but also the sentiment labels as metadata. We can then search for any topic, keyword, or sentiment and retrieve the relevant customer reviews with a single query, whether you're interested in finding the happiest moments in the park, uncovering areas for improvement, or simply exploring the range of experiences Disneyland Resort has to offer.\n",
"\n",
"### Aim\n",
"\n",
"In the sections that follow, we'll walk you through the entire process:\n",
"\n",
"1. Load Review Data\n",
"1. Perform Sentiment Analysis On The Reviews\n",
"1. Create Review Vector Embeddings\n",
"1. Store Embeddings in KDB.AI\n",
"1. Get The Sentiment Of Similar Reviews To A Target Query\n",
"1. Delete the KDB.AI Database & Table\n",
"\n",
"By the end of this tutorial, you'll not only have a deeper understanding of sentiment analysis but also the tools and knowledge to harness the insights hidden within vast datasets of customer reviews. Let's embark on this journey to uncover the magic and meaning behind Disneyland Resort reviews with sentiment analysis and KDB.AI.\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "6a10030c",
"metadata": {
"id": "6a10030c"
},
"source": [
"## 0. Setup"
]
},
{
"cell_type": "markdown",
"id": "91255bcd",
"metadata": {
"id": "91255bcd"
},
"source": [
"### Install dependencies\n",
"\n",
"To run this sample successfully, follow the steps for the environment where you are running this notebook:\n",
"\n",
"- ***Run Locally / Private Environment:*** The [Setup](https://github.com/KxSystems/kdbai-samples/blob/main/README.md#setup) steps in the repository's `README.md` will guide you on prerequisites and how to run this with Jupyter.\n",
"\n",
"- ***Colab / Hosted Environment:*** Open this notebook in Colab and run through the cells."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f936c0ce",
"metadata": {},
"outputs": [],
"source": [
"!pip install kdbai_client"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dea527a1",
"metadata": {
"id": "dea527a1"
},
"outputs": [],
"source": [
"!pip install sentence_transformers matplotlib"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "acb9eaf5",
"metadata": {
"id": "acb9eaf5"
},
"outputs": [],
"source": [
"### !!! Only run this cell if you need to download the data into your environment, for example in Colab\n",
"### This downloads customer review data\n",
"!mkdir ./data\n",
"!wget -P ./data https://raw.githubusercontent.com/KxSystems/kdbai-samples/main/sentiment_analysis/data/disneyland_reviews.csv"
]
},
{
"cell_type": "markdown",
"id": "0bd6cfb9-6b68-472f-be47-5b7d03e4564d",
"metadata": {
"id": "0bd6cfb9-6b68-472f-be47-5b7d03e4564d"
},
"source": [
"### Set Environment Variables"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "08f05e63-9ef8-49ca-b198-aab6d0017579",
"metadata": {
"id": "08f05e63-9ef8-49ca-b198-aab6d0017579"
},
"outputs": [],
"source": [
"import os"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "5be006c3-2bd1-4944-abef-78664e288fcd",
"metadata": {
"id": "5be006c3-2bd1-4944-abef-78664e288fcd"
},
"outputs": [],
"source": [
"# ignore tensorflow warnings\n",
"os.environ[\"TF_CPP_MIN_LOG_LEVEL\"] = \"3\""
]
},
{
"cell_type": "markdown",
"id": "36f53e98",
"metadata": {
"id": "36f53e98"
},
"source": [
"### Import Packages"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "4c8be1e0",
"metadata": {
"id": "4c8be1e0"
},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "51e1c7ef",
"metadata": {
"id": "51e1c7ef"
},
"outputs": [],
"source": [
"# sentiment model and tokenisation\n",
"from transformers import pipeline\n",
"from transformers import AutoTokenizer\n",
"from transformers import AutoModelForSequenceClassification"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "972ed652",
"metadata": {
"id": "972ed652"
},
"outputs": [],
"source": [
"# progress bars\n",
"from tqdm.auto import tqdm"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "5c28352b",
"metadata": {
"id": "5c28352b"
},
"outputs": [],
"source": [
"# plotting\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "b1d68085",
"metadata": {
"id": "b1d68085"
},
"outputs": [],
"source": [
"# embedding\n",
"from sentence_transformers import SentenceTransformer"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "070348ba",
"metadata": {
"id": "070348ba"
},
"outputs": [],
"source": [
"# vector DB\n",
"import kdbai_client as kdbai\n",
"from getpass import getpass\n",
"import time"
]
},
{
"cell_type": "markdown",
"id": "b20421bf",
"metadata": {
"id": "b20421bf"
},
"source": [
"### Configure Console"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "a7ec66ab",
"metadata": {
"id": "a7ec66ab"
},
"outputs": [],
"source": [
"pd.set_option(\"display.max_colwidth\", 200)"
]
},
{
"cell_type": "markdown",
"id": "c8b843f5",
"metadata": {
"id": "c8b843f5"
},
"source": [
"### Define Helper Functions"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "db32355f",
"metadata": {
"id": "db32355f"
},
"outputs": [],
"source": [
"def show_df(df: pd.DataFrame) -> pd.DataFrame:\n",
" print(df.shape)\n",
" return df.head()"
]
},
{
"cell_type": "markdown",
"id": "13f3838f",
"metadata": {
"id": "13f3838f"
},
"source": [
"## 1. Load Review Data"
]
},
{
"cell_type": "markdown",
"id": "fa866120",
"metadata": {
"id": "fa866120"
},
"source": [
"### Dataset Overview\n",
"\n",
"The dataset used for this example is the [Disneyland Reviews](https://www.kaggle.com/datasets/arushchillar/disneyland-reviews) dataset available on Kaggle. It includes around 42,000 reviews of three Disneyland branches (Paris, California and Hong Kong) posted by visitors on Tripadvisor."
]
},
{
"cell_type": "markdown",
"id": "8989a4a6",
"metadata": {
"id": "8989a4a6"
},
"source": [
"### Read In Review Data From CSV"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "cf7619be",
"metadata": {
"id": "cf7619be"
},
"outputs": [],
"source": [
"raw_reviews_df = pd.read_csv(\"data/disneyland_reviews.csv\", encoding=\"ISO-8859-1\")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "d3a14cc9",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 379
},
"id": "d3a14cc9",
"outputId": "07404714-f1da-4d3c-cc78-c3fc55ff1814"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(42656, 6)\n"
]
},
{
"data": {
"text/html": [
"| \n", " | Review_ID | \n", "Rating | \n", "Year_Month | \n", "Reviewer_Location | \n", "Review_Text | \n", "Branch | \n", "
|---|---|---|---|---|---|---|
| 0 | \n", "670772142 | \n", "4 | \n", "2019-4 | \n", "Australia | \n", "If you've ever been to Disneyland anywhere you'll find Disneyland Hong Kong very similar in the layout when you walk into main street! It has a very familiar feel. One of the rides its a Small Wo... | \n", "Disneyland_HongKong | \n", "
| 1 | \n", "670682799 | \n", "4 | \n", "2019-5 | \n", "Philippines | \n", "Its been a while since d last time we visit HK Disneyland .. Yet, this time we only stay in Tomorrowland .. AKA Marvel land!Now they have Iron Man Experience n d Newly open Ant Man n d Wasp!!Ironm... | \n", "Disneyland_HongKong | \n", "
| 2 | \n", "670623270 | \n", "4 | \n", "2019-4 | \n", "United Arab Emirates | \n", "Thanks God it wasn t too hot or too humid when I was visiting the park otherwise it would be a big issue (there is not a lot of shade).I have arrived around 10:30am and left at 6pm. Unfortunat... | \n", "Disneyland_HongKong | \n", "
| 3 | \n", "670607911 | \n", "4 | \n", "2019-4 | \n", "Australia | \n", "HK Disneyland is a great compact park. Unfortunately there is quite a bit of maintenance work going on at present so a number of areas are closed off (including the famous castle) If you go midwee... | \n", "Disneyland_HongKong | \n", "
| 4 | \n", "670607296 | \n", "4 | \n", "2019-4 | \n", "United Kingdom | \n", "the location is not in the city, took around 1 hour from Kowlon, my kids like disneyland so much, everything is fine. but its really crowded and hot in Hong Kong | \n", "Disneyland_HongKong | \n", "
| \n", " | Review_ID | \n", "Label | \n", "Score | \n", "
|---|---|---|---|
| 0 | \n", "670772142 | \n", "positive | \n", "0.984786 | \n", "
| 1 | \n", "670682799 | \n", "positive | \n", "0.858987 | \n", "
| 2 | \n", "670623270 | \n", "positive | \n", "0.898818 | \n", "
| 3 | \n", "670607911 | \n", "positive | \n", "0.863198 | \n", "
| 4 | \n", "670607296 | \n", "positive | \n", "0.56407 | \n", "
| \n", " | Review_ID | \n", "Label | \n", "Score | \n", "Rating | \n", "Year_Month | \n", "Reviewer_Location | \n", "Review_Text | \n", "Branch | \n", "
|---|---|---|---|---|---|---|---|---|
| 0 | \n", "670772142 | \n", "positive | \n", "0.984786 | \n", "4 | \n", "2019-4 | \n", "Australia | \n", "If you've ever been to Disneyland anywhere you'll find Disneyland Hong Kong very similar in the layout when you walk into main street! It has a very familiar feel. One of the rides its a Small Wo... | \n", "Disneyland_HongKong | \n", "
| 1 | \n", "670682799 | \n", "positive | \n", "0.858987 | \n", "4 | \n", "2019-5 | \n", "Philippines | \n", "Its been a while since d last time we visit HK Disneyland .. Yet, this time we only stay in Tomorrowland .. AKA Marvel land!Now they have Iron Man Experience n d Newly open Ant Man n d Wasp!!Ironm... | \n", "Disneyland_HongKong | \n", "
| 2 | \n", "670623270 | \n", "positive | \n", "0.898818 | \n", "4 | \n", "2019-4 | \n", "United Arab Emirates | \n", "Thanks God it wasn t too hot or too humid when I was visiting the park otherwise it would be a big issue (there is not a lot of shade).I have arrived around 10:30am and left at 6pm. Unfortunat... | \n", "Disneyland_HongKong | \n", "
| 3 | \n", "670607911 | \n", "positive | \n", "0.863198 | \n", "4 | \n", "2019-4 | \n", "Australia | \n", "HK Disneyland is a great compact park. Unfortunately there is quite a bit of maintenance work going on at present so a number of areas are closed off (including the famous castle) If you go midwee... | \n", "Disneyland_HongKong | \n", "
| 4 | \n", "670607296 | \n", "positive | \n", "0.56407 | \n", "4 | \n", "2019-4 | \n", "United Kingdom | \n", "the location is not in the city, took around 1 hour from Kowlon, my kids like disneyland so much, everything is fine. but its really crowded and hot in Hong Kong | \n", "Disneyland_HongKong | \n", "
| Label | \n", "negative | \n", "neutral | \n", "positive | \n", "
|---|---|---|---|
| Branch | \n", "\n", " | \n", " | \n", " |
| Disneyland_California | \n", "1.500000 | \n", "NaN | \n", "4.822222 | \n", "
| Disneyland_HongKong | \n", "3.166667 | \n", "3.833333 | \n", "4.351351 | \n", "
| Disneyland_Paris | \n", "2.416667 | \n", "2.500000 | \n", "4.628571 | \n", "
| \n", " | Branch | \n", "Label | \n", "Score | \n", "Rating | \n", "Review_Text | \n", "embeddings | \n", "
|---|---|---|---|---|---|---|
| 0 | \n", "Disneyland_HongKong | \n", "positive | \n", "0.984786 | \n", "4 | \n", "If you've ever been to Disneyland anywhere you'll find Disneyland Hong Kong very similar in the layout when you walk into main street! It has a very familiar feel. One of the rides its a Small Wo... | \n", "[0.12519372, -0.047441825, 0.0871841, 0.010215119, -0.04802456, 0.003773756, 0.0026167803, -0.05580014, -0.01470384, -0.020438066, 0.003935949, -0.02602119, -0.0065161493, -0.01756644, 0.06614383,... | \n", "
| 1 | \n", "Disneyland_HongKong | \n", "positive | \n", "0.858987 | \n", "4 | \n", "Its been a while since d last time we visit HK Disneyland .. Yet, this time we only stay in Tomorrowland .. AKA Marvel land!Now they have Iron Man Experience n d Newly open Ant Man n d Wasp!!Ironm... | \n", "[0.047256097, -0.022195395, 0.12704651, -0.01767993, 0.012125689, 0.023794448, 0.0635631, -0.058635417, -0.07302157, -0.020539569, -0.019598728, -0.07271388, -0.04178504, -0.0031552485, 0.07268948... | \n", "
| 2 | \n", "Disneyland_HongKong | \n", "positive | \n", "0.898818 | \n", "4 | \n", "Thanks God it wasn t too hot or too humid when I was visiting the park otherwise it would be a big issue (there is not a lot of shade).I have arrived around 10:30am and left at 6pm. Unfortunat... | \n", "[0.10299566, 0.018655304, 0.12572218, 0.08505048, 0.0050070393, -0.034780897, -0.017856728, -0.027097572, -0.06612961, -0.04508932, -0.05486106, -0.0064851535, -0.026050558, 0.017010149, 0.0760149... | \n", "
| 3 | \n", "Disneyland_HongKong | \n", "positive | \n", "0.863198 | \n", "4 | \n", "HK Disneyland is a great compact park. Unfortunately there is quite a bit of maintenance work going on at present so a number of areas are closed off (including the famous castle) If you go midwee... | \n", "[0.14166501, -0.01674562, 0.0866899, 0.00852721, -0.037891164, 0.052252676, 0.010435957, -0.04532679, -0.07340584, 0.0044011963, 0.012121033, -0.022934418, -0.039868694, -0.03126051, 0.12090388, -... | \n", "
| 4 | \n", "Disneyland_HongKong | \n", "positive | \n", "0.56407 | \n", "4 | \n", "the location is not in the city, took around 1 hour from Kowlon, my kids like disneyland so much, everything is fine. but its really crowded and hot in Hong Kong | \n", "[0.11346179, 0.0044915066, 0.065638125, 0.055530254, 0.027315829, -0.006237473, -0.07888978, -0.0714263, -0.046673268, 0.011053106, 0.08186102, -0.10437099, -0.03234412, 0.0073166923, 0.082220174,... | \n", "
| \n", " | Branch | \n", "Label | \n", "Score | \n", "Rating | \n", "Review_Text | \n", "embeddings | \n", "
|---|---|---|---|---|---|---|
| 0 | \n", "Disneyland_HongKong | \n", "positive | \n", "0.984786 | \n", "4 | \n", "If you've ever been to Disneyland anywhere you'll find Disneyland Hong Kong very similar in the layout when you walk into main street! It has a very familiar feel. One of the rides its a Small Wo... | \n", "[0.12519371509552002, -0.04744182527065277, 0.08718410134315491, 0.01021511945873499, -0.04802456125617027, 0.00377375609241426, 0.0026167803443968296, -0.05580013990402222, -0.014703840017318726,... | \n", "
| 1 | \n", "Disneyland_HongKong | \n", "positive | \n", "0.858987 | \n", "4 | \n", "Its been a while since d last time we visit HK Disneyland .. Yet, this time we only stay in Tomorrowland .. AKA Marvel land!Now they have Iron Man Experience n d Newly open Ant Man n d Wasp!!Ironm... | \n", "[0.047256097197532654, -0.022195395082235336, 0.12704651057720184, -0.017679929733276367, 0.01212568860501051, 0.023794448003172874, 0.06356310099363327, -0.058635417371988297, -0.0730215683579444... | \n", "
| 2 | \n", "Disneyland_HongKong | \n", "positive | \n", "0.898818 | \n", "4 | \n", "Thanks God it wasn t too hot or too humid when I was visiting the park otherwise it would be a big issue (there is not a lot of shade).I have arrived around 10:30am and left at 6pm. Unfortunat... | \n", "[0.10299565643072128, 0.018655303865671158, 0.12572218477725983, 0.08505047857761383, 0.005007039289921522, -0.034780897200107574, -0.01785672828555107, -0.02709757164120674, -0.06612960994243622,... | \n", "
| 3 | \n", "Disneyland_HongKong | \n", "positive | \n", "0.863198 | \n", "4 | \n", "HK Disneyland is a great compact park. Unfortunately there is quite a bit of maintenance work going on at present so a number of areas are closed off (including the famous castle) If you go midwee... | \n", "[0.1416650116443634, -0.016745619475841522, 0.08668989688158035, 0.008527209982275963, -0.03789116442203522, 0.05225267633795738, 0.010435957461595535, -0.04532679170370102, -0.07340583950281143, ... | \n", "
| 4 | \n", "Disneyland_HongKong | \n", "positive | \n", "0.564070 | \n", "4 | \n", "the location is not in the city, took around 1 hour from Kowlon, my kids like disneyland so much, everything is fine. but its really crowded and hot in Hong Kong | \n", "[0.11346179246902466, 0.004491506610065699, 0.06563812494277954, 0.055530253797769547, 0.027315828949213028, -0.006237472873181105, -0.07888977974653244, -0.07142630219459534, -0.04667326807975769... | \n", "

| | __nn_distance | Branch | Label | Score | Rating | Review_Text |
|---|---|---|---|---|---|---|
| 0 | 0.600825 | Disneyland_Paris | negative | 0.914860 | 2 | Visited the Park today 20 4 and can conclude that they simply let too many people in. Already at opening, queues at the attractions where 1 hour plus. At midday the so called fast track was ... |
| 1 | 0.568034 | Disneyland_California | positive | 0.763060 | 5 | I wish they had better food restaurant choices, but the attractions make up for this deficit so it's all good. |
| 2 | 0.492384 | Disneyland_California | positive | 0.962652 | 5 | We found this park to provide family friendly fun with a variety of shows and rides. We pre paid our tickets to save time which was a benefit and were glad we paid extra for the park hopper ticket... |
| 3 | 0.483248 | Disneyland_Paris | negative | 0.450606 | 3 | Don't get my wrong, my family have been to Disneyland Paris twice in as many years, my kids love it and the atmosphere is unique. I think its going a bit far to call it magical though. I wont comm... |
| 4 | 0.481315 | Disneyland_Paris | positive | 0.925374 | 5 | Not the same as Disney in the states, none of the usual foods but none the less still fun. We had a great time. |

| | Branch | Sentiments |
|---|---|---|
| 0 | Disneyland_California | {'negative': 2, 'neutral': 0, 'positive': 7} |
| 1 | Disneyland_HongKong | {'negative': 1, 'neutral': 0, 'positive': 6} |
| 2 | Disneyland_Paris | {'negative': 4, 'neutral': 0, 'positive': 5} |
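
A per-branch tally like the one in the table above can be built by counting labels within each branch. The following is only a minimal sketch, not the notebook's own code: it assumes each search result is a plain dict with `Branch` and `Label` keys (names taken from the result tables above), and both the `sentiment_counts` helper and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict

# Label set taken from the keys shown in the Sentiments column.
LABELS = ("negative", "neutral", "positive")


def sentiment_counts(rows):
    """Count how many rows carry each sentiment label, grouped by branch."""
    per_branch = defaultdict(Counter)
    for row in rows:
        per_branch[row["Branch"]][row["Label"]] += 1
    # Emit every label, including those with a zero count, in branch order.
    return [
        {
            "Branch": branch,
            "Sentiments": {label: per_branch[branch][label] for label in LABELS},
        }
        for branch in sorted(per_branch)
    ]


# Hypothetical sample rows, not the notebook's actual search results.
rows = [
    {"Branch": "Disneyland_Paris", "Label": "negative"},
    {"Branch": "Disneyland_Paris", "Label": "positive"},
    {"Branch": "Disneyland_California", "Label": "positive"},
]

summary = sentiment_counts(rows)
for entry in summary:
    print(entry)
```

Listing every label explicitly keeps zero counts such as `'neutral': 0` visible, which matches the shape of the dictionaries in the table.
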

| | id | text | metadata | embedding |
|---|---|---|---|---|
| 0 | 7673dd5dd3348ca922edfeb765c4f8ec | b'FACEBOOK\n\nNEWS RELEASE\n\nMeta Reports Sec... | {'filetype': 'application/pdf', 'languages': [... | [0.057783662762384565, 0.015500455026484948, 0... |
| 1 | 0042c2ce77a154ed737cfb0d9b20b598 | b'Three Months Ended June 30, In millions, exc... | {'last_modified': '2024-07-31T21:06:06', 'file... | [0.011829043858470036, 0.013133554749833454, 0... |
| 2 | 1006ba147b4696dcfa364d82a7cc3ff9 | b'Second Quarter 2024 Operational and Other Fi... | {'filetype': 'application/pdf', 'languages': [... | [0.07881357843105635, 0.010791519166744949, 0.... |
| 3 | 3ccaeebfca3cd0b37f89d9865ed86620 | b"CFO Outlook Commentary\n\nWe expect third qu... | {'filetype': 'application/pdf', 'languages': [... | [0.06501258679384729, 0.02305678941583444, -0.... |
| 4 | f83ab884e1b7f6cd22bb6fec166375de | b'About Meta\n\nMeta builds technologies that ... | {'filetype': 'application/pdf', 'languages': [... | [0.044354494483358486, -0.008683296817018685, ... |

| | id | text | metadata | embedding |
|---|---|---|---|---|
| 0 | 7673dd5dd3348ca922edfeb765c4f8ec | b'FACEBOOK\n\nNEWS RELEASE\n\nMeta Reports Sec... | {'filetype': 'application/pdf', 'languages': [... | [0.05788249539290755, 0.015626749069267247, 0.... |
| 1 | 0042c2ce77a154ed737cfb0d9b20b598 | b"### Comprehensive Description of the Table\n... | {'last_modified': '2024-07-31T21:06:06', 'file... | [-0.0004421418779838655, 0.011881716893403, 0.... |
| 2 | 1006ba147b4696dcfa364d82a7cc3ff9 | b'Second Quarter 2024 Operational and Other Fi... | {'filetype': 'application/pdf', 'languages': [... | [0.07885341798234707, 0.010770062515672503, 0.... |
| 3 | 3ccaeebfca3cd0b37f89d9865ed86620 | b"CFO Outlook Commentary\n\nWe expect third qu... | {'filetype': 'application/pdf', 'languages': [... | [0.06499940353648115, 0.02307817596004495, -0.... |
| 4 | f83ab884e1b7f6cd22bb6fec166375de | b'About Meta\n\nMeta builds technologies that ... | {'filetype': 'application/pdf', 'languages': [... | [0.044356841931628053, -0.008697156860572564, ... |

| | id | text | metadata | embedding |
|---|---|---|---|---|
| 0 | 7673dd5dd3348ca922edfeb765c4f8ec | b'FACEBOOK\n\nNEWS RELEASE\n\nMeta Reports Sec... | {'filetype': 'application/pdf', 'languages': [... | [0.057882495, 0.015626749, 0.0125014, 0.016935... |
| 1 | 0042c2ce77a154ed737cfb0d9b20b598 | b"### Comprehensive Description of the Table\n... | {'last_modified': '2024-07-31T21:06:06', 'file... | [-0.00044214187, 0.011881717, 0.03507208, 0.01... |
| 2 | 1006ba147b4696dcfa364d82a7cc3ff9 | b'Second Quarter 2024 Operational and Other Fi... | {'filetype': 'application/pdf', 'languages': [... | [0.07885342, 0.010770063, 0.032093342, 0.07417... |
| 3 | 3ccaeebfca3cd0b37f89d9865ed86620 | b"CFO Outlook Commentary\n\nWe expect third qu... | {'filetype': 'application/pdf', 'languages': [... | [0.0649994, 0.023078175, -0.0025019816, 6.0829... |
| 4 | f83ab884e1b7f6cd22bb6fec166375de | b'About Meta\n\nMeta builds technologies that ... | {'filetype': 'application/pdf', 'languages': [... | [0.04435684, -0.008697157, -0.0011843009, 0.05... |
| 5 | 6523c2a592eaef3c9c62b13ff33414e6 | b'Ryan Moore\n\npress@meta.com / about.fb.com/... | {'filetype': 'application/pdf', 'languages': [... | [0.068496555, 0.012527236, 0.0041264608, 0.055... |
| 6 | b288eb094301be949cb9c4d070135af5 | b'For a discussion of limitations in the measu... | {'filetype': 'application/pdf', 'languages': [... | [0.06137703, 0.026085237, 0.07956282, 0.036115... |
| 7 | 69e526bdf29fb6ab25756fefa40b5439 | b"Non-GAAP Financial Measures\n\nTo supplement... | {'filetype': 'application/pdf', 'languages': [... | [-0.00022395901, 0.037475675, 0.05577307, 0.02... |
| 8 | 75d393ef2f359efa9a296e832490f823 | b'For more information on our non-GAAP nancial... | {'filetype': 'application/pdf', 'languages': [... | [0.01794313, 0.016595712, 0.021199394, 2.51983... |
| 9 | adf7621ac4c8f6206d85b1bba43e591f | b"The table presented shows the condensed cons... | {'last_modified': '2024-07-31T21:06:06', 'file... | [0.007979893, -0.012464022, 0.051368486, 0.019... |
| 10 | a918dcd3136389f1f3321920cd5bfc00 | b'$ 31,999 $ 75,527 $ 60,645\n\n12,054\n\n18,7... | {'filetype': 'application/pdf', 'languages': [... | [0.045977995, -0.037664726, 0.07803254, 0.0160... |
| 11 | 6ac0846a53cd0f99bb678c15f19d468e | b"The table provided is an excerpt from Meta P... | {'last_modified': '2024-07-31T21:06:06', 'file... | [0.027606387, -0.012659598, 0.04341087, 0.0401... |
| 12 | 46348476e1035ccd87ab69a2d1292d40 | b'$ 3.03 $ 10.17 $ 5.24\n\n$ 2.98 $ 9.86 $ 5.1... | {'filetype': 'application/pdf', 'languages': [... | [0.015753612, 0.004310955, 0.041805614, 0.0384... |
| 13 | b0cf8bf065eab52e08c0428bbaa94527 | b"### Comprehensive Description\n\nThe table p... | {'last_modified': '2024-07-31T21:06:06', 'file... | [0.0070284354, 0.034549743, 0.043274693, 0.004... |
| 14 | f7f4229f80e00f509426a6d8f45343c2 | b'Common stock and additional paid-in capital\... | {'filetype': 'application/pdf', 'languages': [... | [0.031031432, 0.0055321315, 0.05864805, 0.0336... |
| 15 | a7a41c0fdf37f0f30e8d8a2bf5bfafff | b"### Comprehensive Description of the Table\n... | {'last_modified': '2024-07-31T21:06:06', 'file... | [0.016377663, 0.03243708, 0.03787182, 0.015178... |
| 16 | b10e1bd32614ae548f2e08c4d131f460 | b"Accumulated other comprehensive loss\n\nReta... | {'filetype': 'application/pdf', 'languages': [... | [0.02234056, 0.01776979, 0.0062756445, 0.04135... |
| 17 | 109f8e55d17287867e2c5007111f08c9 | b"### Description of the Table\n\nThe table pr... | {'last_modified': '2024-07-31T21:06:06', 'file... | [-0.012494377, 0.044495758, 0.034548387, 0.005... |
| 18 | d8928bcb2a3bb51176df4be51ba74732 | b'Three Months Ended June 30, Six Months Ended... | {'filetype': 'application/pdf', 'languages': [... | [0.037447758, 0.031958383, 0.029735964, 0.0262... |
| 19 | cf915cc8ae9c9cc15b396ea448e94228 | b'Here\'s a detailed description of the table ... | {'last_modified': '2024-07-31T21:06:06', 'file... | [0.0018331239, 0.026139371, 0.046606544, 0.014... |
| 20 | 341be5c61346e8a761bcb3bc8442cfe9 | b'$ 28,785\n\n165\n\n854\n\n$ 29,804\n\nMETA P... | {'filetype': 'application/pdf', 'languages': [... | [0.027821643, -0.0057690465, 0.0058191307, 0.0... |
| 21 | d41baecdce6b423b19c82ed8d8312c7d | b'The table provided is a snippet of Meta Plat... | {'last_modified': '2024-07-31T21:06:06', 'file... | [-0.005334932, 0.04107228, 0.04553666, 0.00887... |
| 22 | 0c91552759dba13dcd139b9b0c9dab62 | b'$ 1,507\n\n$ 182\n\n$ 3,845\n\n$ 217\n\nSegm... | {'filetype': 'application/pdf', 'languages': [... | [-0.014556726, 0.023491282, 0.035078283, 0.032... |
| 23 | b07080a36773cb7ea01192e93db04f36 | b'### Comprehensive Description of the Table\n... | {'last_modified': '2024-07-31T21:06:06', 'file... | [-0.010667113, 0.034761183, 0.052321482, -0.00... |
| 24 | 2ba38b7e64d61e5e37565e0bb9e46f27 | b'Reconciliation of GAAP to Non-GAAP Results\n... | {'filetype': 'application/pdf', 'languages': [... | [0.008114564, 0.035163112, 0.06455587, 0.00853... |
| 25 | d08c13817b3961107632d16dbb25b4fe | b"### Detailed Description of the Table\n\nThe... | {'last_modified': '2024-07-31T21:06:06', 'file... | [-0.013586279, 0.03099968, 0.03918972, 0.00822... |
| 26 | 5022c9c2aa1e8d5fce278b73f445a561 | b'GAAP revenue\n\n$ 60,645\n\n$ 59,599\n\nchan... | {'filetype': 'application/pdf', 'languages': [... | [0.030714538, 0.0024296725, 0.0496302, 0.00056... |

| | segment_id | start_offset_sec | end_offset_sec | embeddings |
|---|---|---|---|---|
| 0 | 0 | 0.0 | 6.0 | [0.025205607, 0.0032751139, -0.014859959, 0.02... |
| 1 | 6 | 6.0 | 12.0 | [0.02405941, 0.018603785, -0.02103669, 0.04805... |
| 2 | 12 | 12.0 | 18.0 | [0.023615949, 0.013139918, -0.022509707, 0.047... |
| 3 | 18 | 18.0 | 24.0 | [0.0229081, 0.015123129, -0.028940378, 0.04754... |
| 4 | 24 | 24.0 | 30.0 | [0.0034697105, 0.0023037025, -0.023470787, 0.0... |

| | section | start_time | end_time | images | text |
|---|---|---|---|---|---|
| 0 | 0 | 0.0 | 48.3 | [/content/video_data/frames/frame0000.png, /content/video_data/frames/frame0001.png, /content/video_data/frames/frame0002.png, /content/video_data/frames/frame0003.png, /content/video_data/frames/frame0004.png, /content/video_data/frames/frame0005.png, /content/video_data/frames/frame0006.png, /content/video_data/frames/frame0007.png, /content/video_data/frames/frame0008.png, /content/video_data/frames/frame0009.png] | The basic function underlying a normal distribution, a.k.a. a Gaussian, is e to the negative x-squared. But you might wonder, why this function? Of all the expressions we could dream up that give you some symmetric smooth graph with mass concentrated towards the middle, why is it that the theory of probability seems to have a special place in its heart for this particular expression? For the last many videos I've been hinting at an answer to this question, and here we'll finally arrive at something like a satisfying answer. As a quick refresher on where we are, a couple videos ago we talked about the Central Limit Theorem, which describes how as you add multiple copies of a random variable, for example, rolling a weighted die many different times or letting a ball bounce off of a peg repeatedly, then the distribution describing that sum tends to look approximately like a normal distribution. |
| 1 | 1 | 48.3 | 80.2 | [/content/video_data/frames/frame0010.png, /content/video_data/frames/frame0011.png, /content/video_data/frames/frame0012.png, /content/video_data/frames/frame0013.png, /content/video_data/frames/frame0014.png, /content/video_data/frames/frame0015.png] | What the Central Limit Theorem says is as you make that sum bigger and bigger, under appropriate conditions, that approximation to a normal becomes better and better. But I never explained why this theorem is actually true, we only talked about what it's claiming. In the last video, we started talking about the math involved in adding two random variables. If you have two random variables, each following some distribution, then to find the distribution describing the sum of those variables, you compute something known as a convolution between the two original functions. |
| 2 | 2 | 80.2 | 112.8 | [/content/video_data/frames/frame0016.png, /content/video_data/frames/frame0017.png, /content/video_data/frames/frame0018.png, /content/video_data/frames/frame0019.png, /content/video_data/frames/frame0020.png, /content/video_data/frames/frame0021.png, /content/video_data/frames/frame0022.png] | And we spent a lot of time building up two distinct ways to visualize what this convolution operation really is. Today, our basic job is to work through a particular example, which is to ask, what happens when you add two normally distributed random variables? Which, as you know by now, is the same as asking, what do you get if you compute a convolution between two Gaussian functions? I'd like to share an especially pleasing visual way that you can think about this calculation, which hopefully offers some sense of what makes the e to the negative x squared function special in the first place. |
| 3 | 3 | 112.8 | 153.9 | [/content/video_data/frames/frame0023.png, /content/video_data/frames/frame0024.png, /content/video_data/frames/frame0025.png, /content/video_data/frames/frame0026.png, /content/video_data/frames/frame0027.png, /content/video_data/frames/frame0028.png, /content/video_data/frames/frame0029.png, /content/video_data/frames/frame0030.png] | After we walk through it, we'll talk about how this calculation is one of the steps involved in proving the central limit theorem. It's the step that answers the question of why a Gaussian, and not something else, is the central limit. But first, let's dive in. The full formula for a Gaussian is more complicated than just e to the negative x squared. The exponent is typically written as negative 1 half times x divided by sigma squared, where sigma describes the spread of the distribution. Specifically, the standard deviation. All of this needs to be multiplied by a fraction on the front, which is there to make sure that the area under the curve is 1, making it a valid probability distribution. |
| 4 | 4 | 153.9 | 190.5 | [/content/video_data/frames/frame0031.png, /content/video_data/frames/frame0032.png, /content/video_data/frames/frame0033.png, /content/video_data/frames/frame0034.png, /content/video_data/frames/frame0035.png, /content/video_data/frames/frame0036.png, /content/video_data/frames/frame0037.png] | And if you want to consider distributions that aren't necessarily centered at 0, you would also throw another parameter, mu, into the exponent like this. Although, for everything we'll be doing here, we just consider centered distributions. Now, if you look at our central goal for today, which is to compute a convolution between two Gaussian functions, the direct way to do this would be to take the definition of a convolution, this integral expression we built up last video, and then to plug in, for each one of the functions involved, the formula for a Gaussian. It's kind of a lot of symbols when you throw it all together, but more than anything, working this out is an exercise in completing the square. |
| 5 | 5 | 190.5 | 224.7 | [/content/video_data/frames/frame0038.png, /content/video_data/frames/frame0039.png, /content/video_data/frames/frame0040.png, /content/video_data/frames/frame0041.png, /content/video_data/frames/frame0042.png, /content/video_data/frames/frame0043.png, /content/video_data/frames/frame0044.png] | And there's nothing wrong with that. That will get you the answer that you want. But of course, you know me, I'm a sucker for visual intuition, and in this case, there's another way to think about it that I haven't seen written about before that offers a very nice connection to other aspects of this distribution, like the presence of pi and certain ways to derive where it comes from. And the way I'd like to do this is by first peeling away all of the constants associated with the actual distribution, and just showing the computation for the simplified form, e to the negative x squared. The essence of what we want to compute is what the convolution between two copies of this function looks like. |
| 6 | 6 | 224.7 | 257.9 | [/content/video_data/frames/frame0045.png, /content/video_data/frames/frame0046.png, /content/video_data/frames/frame0047.png, /content/video_data/frames/frame0048.png, /content/video_data/frames/frame0049.png, /content/video_data/frames/frame0050.png, /content/video_data/frames/frame0051.png] | If you'll remember, in the last video, we had two different ways to visualize convolutions, and the one we'll be using here is the second one, involving diagonal slices. And as a quick reminder of the way that worked, if you have two different distributions that are described by two different functions, f and g, then every possible pair of values that you might get when you sample from these two distributions can be thought of as individual points on the xy-plane. And the probability density of landing on one such point, assuming independence, looks like f of x times g of y. |
| 7 | 7 | 257.9 | 291.3 | [/content/video_data/frames/frame0052.png, /content/video_data/frames/frame0053.png, /content/video_data/frames/frame0054.png, /content/video_data/frames/frame0055.png, /content/video_data/frames/frame0056.png, /content/video_data/frames/frame0057.png] | So what we do is we look at a graph of that expression as a two-variable function of x and y, which is a way of showing the distribution of all possible outcomes when we sample from the two different variables. To interpret the convolution of f and g, evaluated on some input s, which is a way of saying how likely are you to get a pair of samples that adds up to this sum, s, what you do is you look at a slice of this graph over the line x plus y equals s, and you consider the area under that slice. |
| 8 | 8 | 291.3 | 327.3 | [/content/video_data/frames/frame0058.png, /content/video_data/frames/frame0059.png, /content/video_data/frames/frame0060.png, /content/video_data/frames/frame0061.png, /content/video_data/frames/frame0062.png, /content/video_data/frames/frame0063.png, /content/video_data/frames/frame0064.png] | This area is almost, but not quite, the value of the convolution at s. For a mildly technical reason, you need to divide by the square root of 2. Still, this area is the key feature to focus on. You can think of it as a way to combine together all the probability densities for all of the outcomes corresponding to a given sum. In the specific case where these two functions look like e to the negative x squared and e to the negative y squared, the resulting 3D graph has a really nice property that you can exploit. It's rotationally symmetric. |
| 9 | 9 | 327.3 | 365.8 | [/content/video_data/frames/frame0065.png, /content/video_data/frames/frame0066.png, /content/video_data/frames/frame0067.png, /content/video_data/frames/frame0068.png, /content/video_data/frames/frame0069.png, /content/video_data/frames/frame0070.png, /content/video_data/frames/frame0071.png, /content/video_data/frames/frame0072.png] | You can see this by combining the terms and noticing that it's entirely a function of x squared plus y squared, and this term describes the square of the distance between any point on the xy-plane and the origin. So in other words, the expression is purely a function of the distance from the origin. And by the way, this would not be true for any other distribution. It's a property that uniquely characterizes bell curves. So for most other pairs of functions, these diagonal slices will be some complicated shape that's hard to think about, and honestly, calculating the area would just amount to computing the original integral that defines a convolution in the first place. |

| | images | text | emb |
|---|---|---|---|
| 0 | [/content/video_data/frames/frame0000.png, /co... | The basic function underlying a normal distrib... | [0.0108642578125, -0.037109375, 0.023193359375... |
| 1 | [/content/video_data/frames/frame0010.png, /co... | What the Central Limit Theorem says is as you ... | [0.0004634857177734375, -0.025390625, 0.013732... |
| 2 | [/content/video_data/frames/frame0016.png, /co... | And we spent a lot of time building up two dis... | [0.029541015625, -0.0120849609375, 0.014404296... |
| 3 | [/content/video_data/frames/frame0023.png, /co... | After we walk through it, we'll talk about how... | [0.029296875, -0.0289306640625, 0.02490234375,... |
| 4 | [/content/video_data/frames/frame0031.png, /co... | And if you want to consider distributions that... | [0.01708984375, -0.014404296875, 0.00564575195... |

| | id | text_bytes | image_paths | emb |
|---|---|---|---|---|
| 0 | 0 | b"The basic function underlying a normal distr... | ["/content/video_data/frames/frame0000.png", "... | [0.0108642578125, -0.037109375, 0.023193359375... |
| 1 | 1 | b"What the Central Limit Theorem says is as yo... | ["/content/video_data/frames/frame0010.png", "... | [0.0004634857177734375, -0.025390625, 0.013732... |
| 2 | 2 | b"And we spent a lot of time building up two d... | ["/content/video_data/frames/frame0016.png", "... | [0.029541015625, -0.0120849609375, 0.014404296... |
| 3 | 3 | b"After we walk through it, we'll talk about h... | ["/content/video_data/frames/frame0023.png", "... | [0.029296875, -0.0289306640625, 0.02490234375,... |
| 4 | 4 | b"And if you want to consider distributions th... | ["/content/video_data/frames/frame0031.png", "... | [0.01708984375, -0.014404296875, 0.00564575195... |

| | id | text_bytes | image_paths | embeddings |
|---|---|---|---|---|
| 0 | 0 | b"The basic function underlying a normal distr... | ["/content/video_data/frames/frame0000.png", "... | [0.010864258, -0.037109375, 0.02319336, -0.011... |
| 1 | 1 | b"What the Central Limit Theorem says is as yo... | ["/content/video_data/frames/frame0010.png", "... | [0.00046348572, -0.025390625, 0.01373291, 0.00... |
| 2 | 2 | b"And we spent a lot of time building up two d... | ["/content/video_data/frames/frame0016.png", "... | [0.029541016, -0.012084961, 0.014404297, -0.01... |
| 3 | 3 | b"After we walk through it, we'll talk about h... | ["/content/video_data/frames/frame0023.png", "... | [0.029296875, -0.028930664, 0.024902344, -0.03... |
| 4 | 4 | b"And if you want to consider distributions th... | ["/content/video_data/frames/frame0031.png", "... | [0.017089844, -0.014404297, 0.005645752, -0.02... |