Full Code of gkamradt/langchain-tutorials for AI

main 697c4de4f6c6 cached

360 files

55.9 MB

1.6M tokens

147 symbols

1 requests

Download .txt

Showing preview only (6,235K chars total). Download the full file or copy to clipboard to get everything.

Repository: gkamradt/langchain-tutorials
Branch: main
Commit: 697c4de4f6c6
Files: 360
Total size: 55.9 MB

Directory structure:
gitextract_j4ji0gy0/

├── .gitignore
├── LangChain Cookbook Part 1 - Fundamentals.ipynb
├── LangChain Cookbook Part 2 - Use Cases.ipynb
├── README.md
├── SUMMARY.md
├── agents/
│   ├── Agents + ZapierToolkit.ipynb
│   └── Agents.ipynb
├── bots/
│   └── Twitter_Reply_Bot/
│       └── Twitter Reply Bot Notebook.ipynb
├── chains/
│   └── Chain Types.ipynb
├── chatapi/
│   └── ChatAPI + LangChain Basics.ipynb
├── data/
│   ├── LinkedInIndustries.csv
│   ├── LinkedInSubIndustries.csv
│   ├── PaulGrahamEssayMedium/
│   │   ├── fr.txt
│   │   ├── guidetoinvestors.txt
│   │   ├── mit.txt
│   │   ├── notnot.txt
│   │   ├── popular.txt
│   │   ├── re.txt
│   │   ├── road.txt
│   │   ├── start.txt
│   │   ├── startupfunding.txt
│   │   ├── startupideas.txt
│   │   ├── wealth.txt
│   │   └── worked.txt
│   ├── PaulGrahamEssaySmall/
│   │   ├── cred.txt
│   │   ├── disc.txt
│   │   ├── fix.txt
│   │   ├── fp.txt
│   │   ├── getideas.txt
│   │   ├── lwba.txt
│   │   ├── nft.txt
│   │   ├── noob.txt
│   │   ├── nov.txt
│   │   ├── pow.txt
│   │   ├── prop62.txt
│   │   ├── rootsoflisp.txt
│   │   ├── rss.txt
│   │   ├── todo.txt
│   │   └── twitter.txt
│   ├── PaulGrahamEssays/
│   │   ├── 13sentences.txt
│   │   ├── 5founders.txt
│   │   ├── 6631327.txt
│   │   ├── 95.txt
│   │   ├── ace.txt
│   │   ├── addiction.txt
│   │   ├── airbnb.txt
│   │   ├── airbnbs.txt
│   │   ├── alien.txt
│   │   ├── altair.txt
│   │   ├── ambitious.txt
│   │   ├── america.txt
│   │   ├── angelinvesting.txt
│   │   ├── aord.txt
│   │   ├── apple.txt
│   │   ├── artistsship.txt
│   │   ├── avg.txt
│   │   ├── badeconomy.txt
│   │   ├── before.txt
│   │   ├── better.txt
│   │   ├── bias.txt
│   │   ├── boss.txt
│   │   ├── bronze.txt
│   │   ├── bubble.txt
│   │   ├── charisma.txt
│   │   ├── cities.txt
│   │   ├── college.txt
│   │   ├── colleges.txt
│   │   ├── conformism.txt
│   │   ├── control.txt
│   │   ├── convergence.txt
│   │   ├── convince.txt
│   │   ├── copy.txt
│   │   ├── corpdev.txt
│   │   ├── cred.txt
│   │   ├── credentials.txt
│   │   ├── desres.txt
│   │   ├── determination.txt
│   │   ├── die.txt
│   │   ├── diff.txt
│   │   ├── disagree.txt
│   │   ├── disc.txt
│   │   ├── discover.txt
│   │   ├── distraction.txt
│   │   ├── divergence.txt
│   │   ├── donate.txt
│   │   ├── ds.txt
│   │   ├── early.txt
│   │   ├── earnest.txt
│   │   ├── ecw.txt
│   │   ├── equity.txt
│   │   ├── essay.txt
│   │   ├── ffb.txt
│   │   ├── fh.txt
│   │   ├── fix.txt
│   │   ├── fn.txt
│   │   ├── founders.txt
│   │   ├── foundersatwork.txt
│   │   ├── foundervisa.txt
│   │   ├── fp.txt
│   │   ├── fr.txt
│   │   ├── fundraising.txt
│   │   ├── future.txt
│   │   ├── gap.txt
│   │   ├── gba.txt
│   │   ├── genius.txt
│   │   ├── getideas.txt
│   │   ├── gh.txt
│   │   ├── good.txt
│   │   ├── goodart.txt
│   │   ├── goodtaste.txt
│   │   ├── googles.txt
│   │   ├── growth.txt
│   │   ├── guidetoinvestors.txt
│   │   ├── hackernews.txt
│   │   ├── head.txt
│   │   ├── herd.txt
│   │   ├── heresy.txt
│   │   ├── heroes.txt
│   │   ├── highres.txt
│   │   ├── hiresfund.txt
│   │   ├── hiring.txt
│   │   ├── hp.txt
│   │   ├── hs.txt
│   │   ├── hubs.txt
│   │   ├── hundred.txt
│   │   ├── hw.txt
│   │   ├── hwh.txt
│   │   ├── icad.txt
│   │   ├── ideas.txt
│   │   ├── identity.txt
│   │   ├── iflisp.txt
│   │   ├── ineq.txt
│   │   ├── inequality.txt
│   │   ├── investors.txt
│   │   ├── invtrend.txt
│   │   ├── island.txt
│   │   ├── javacover.txt
│   │   ├── jessica.txt
│   │   ├── judgement.txt
│   │   ├── kate.txt
│   │   ├── kids.txt
│   │   ├── know.txt
│   │   ├── ladder.txt
│   │   ├── langdes.txt
│   │   ├── laundry.txt
│   │   ├── lesson.txt
│   │   ├── lies.txt
│   │   ├── love.txt
│   │   ├── lwba.txt
│   │   ├── mac.txt
│   │   ├── makersschedule.txt
│   │   ├── marginal.txt
│   │   ├── maybe.txt
│   │   ├── mean.txt
│   │   ├── microsoft.txt
│   │   ├── mit.txt
│   │   ├── mod.txt
│   │   ├── name.txt
│   │   ├── nerds.txt
│   │   ├── newideas.txt
│   │   ├── newthings.txt
│   │   ├── nft.txt
│   │   ├── noob.txt
│   │   ├── noop.txt
│   │   ├── notnot.txt
│   │   ├── nov.txt
│   │   ├── nthings.txt
│   │   ├── opensource.txt
│   │   ├── organic.txt
│   │   ├── orth.txt
│   │   ├── own.txt
│   │   ├── patentpledge.txt
│   │   ├── pgh.txt
│   │   ├── philosophy.txt
│   │   ├── pinch.txt
│   │   ├── polls.txt
│   │   ├── popular.txt
│   │   ├── pow.txt
│   │   ├── power.txt
│   │   ├── prcmc.txt
│   │   ├── procrastination.txt
│   │   ├── progbot.txt
│   │   ├── prop62.txt
│   │   ├── property.txt
│   │   ├── publishing.txt
│   │   ├── pypar.txt
│   │   ├── ramenprofitable.txt
│   │   ├── randomness.txt
│   │   ├── re.txt
│   │   ├── read.txt
│   │   ├── real.txt
│   │   ├── really.txt
│   │   ├── relres.txt
│   │   ├── revolution.txt
│   │   ├── richnow.txt
│   │   ├── road.txt
│   │   ├── ronco.txt
│   │   ├── rootsoflisp.txt
│   │   ├── rss.txt
│   │   ├── safe.txt
│   │   ├── say.txt
│   │   ├── schlep.txt
│   │   ├── seesv.txt
│   │   ├── segway.txt
│   │   ├── selfindulgence.txt
│   │   ├── sfp.txt
│   │   ├── siliconvalley.txt
│   │   ├── simply.txt
│   │   ├── smart.txt
│   │   ├── softwarepatents.txt
│   │   ├── spam.txt
│   │   ├── speak.txt
│   │   ├── start.txt
│   │   ├── startupfunding.txt
│   │   ├── startuphubs.txt
│   │   ├── startupideas.txt
│   │   ├── startuplessons.txt
│   │   ├── startupmistakes.txt
│   │   ├── stuff.txt
│   │   ├── submarine.txt
│   │   ├── sun.txt
│   │   ├── superangels.txt
│   │   ├── swan.txt
│   │   ├── tablets.txt
│   │   ├── talk.txt
│   │   ├── taste.txt
│   │   ├── think.txt
│   │   ├── todo.txt
│   │   ├── top.txt
│   │   ├── trolls.txt
│   │   ├── twitter.txt
│   │   ├── unions.txt
│   │   ├── usa.txt
│   │   ├── useful.txt
│   │   ├── users.txt
│   │   ├── vb.txt
│   │   ├── vcsqueeze.txt
│   │   ├── venturecapital.txt
│   │   ├── vw.txt
│   │   ├── want.txt
│   │   ├── wealth.txt
│   │   ├── web20.txt
│   │   ├── webstartups.txt
│   │   ├── weird.txt
│   │   ├── whyyc.txt
│   │   ├── wisdom.txt
│   │   ├── word.txt
│   │   ├── words.txt
│   │   ├── work.txt
│   │   ├── worked.txt
│   │   ├── writing44.txt
│   │   ├── wtax.txt
│   │   ├── yahoo.txt
│   │   ├── ycombinator.txt
│   │   └── ycstart.txt
│   ├── PaulGrahamEssaysLarge/
│   │   ├── addiction.txt
│   │   ├── aord.txt
│   │   ├── apple.txt
│   │   ├── avg.txt
│   │   ├── before.txt
│   │   ├── bias.txt
│   │   ├── boss.txt
│   │   ├── copy.txt
│   │   ├── corpdev.txt
│   │   ├── desres.txt
│   │   ├── diff.txt
│   │   ├── ecw.txt
│   │   ├── founders.txt
│   │   ├── foundervisa.txt
│   │   ├── gap.txt
│   │   ├── gba.txt
│   │   ├── gh.txt
│   │   ├── goodtaste.txt
│   │   ├── hubs.txt
│   │   ├── iflisp.txt
│   │   ├── island.txt
│   │   ├── know.txt
│   │   ├── langdes.txt
│   │   ├── laundry.txt
│   │   ├── love.txt
│   │   ├── mod.txt
│   │   ├── newideas.txt
│   │   ├── nft.txt
│   │   ├── philosophy.txt
│   │   ├── popular.txt
│   │   ├── pow.txt
│   │   ├── rootsoflisp.txt
│   │   ├── rss.txt
│   │   ├── siliconvalley.txt
│   │   ├── startuplessons.txt
│   │   ├── submarine.txt
│   │   ├── sun.txt
│   │   ├── superangels.txt
│   │   ├── todo.txt
│   │   ├── unions.txt
│   │   ├── useful.txt
│   │   ├── vb.txt
│   │   ├── vcsqueeze.txt
│   │   ├── vw.txt
│   │   ├── want.txt
│   │   ├── web20.txt
│   │   ├── weird.txt
│   │   ├── wisdom.txt
│   │   └── worked.txt
│   ├── San_Francisco_Trees.csv
│   ├── Transcripts/
│   │   ├── MFMPod/
│   │   │   ├── mfm_pod_alex.txt
│   │   │   ├── mfm_pod_rob.txt
│   │   │   └── mfm_pod_steph.txt
│   │   └── acme_co_v2.txt
│   ├── matching_tone_samples.json
│   ├── muir_lake_tahoe_in_winter.txt
│   ├── state_of_the_union.txt
│   └── thefuzz/
│       ├── .editorconfig
│       ├── .github/
│       │   └── workflows/
│       │       └── ci.yml
│       ├── .gitignore
│       ├── .travis.yml
│       ├── CHANGES.rst
│       ├── LICENSE.txt
│       ├── MANIFEST.in
│       ├── README.rst
│       ├── benchmarks.py
│       ├── data/
│       │   └── titledata.csv
│       ├── release
│       ├── setup.py
│       ├── test_thefuzz.py
│       ├── test_thefuzz_hypothesis.py
│       ├── test_thefuzz_pytest.py
│       ├── thefuzz/
│       │   ├── StringMatcher.py
│       │   ├── StringMatcher.pyi
│       │   ├── __init__.py
│       │   ├── fuzz.py
│       │   ├── fuzz.pyi
│       │   ├── process.py
│       │   ├── process.pyi
│       │   ├── string_processing.py
│       │   ├── string_processing.pyi
│       │   ├── utils.py
│       │   └── utils.pyi
│       └── tox.ini
├── data_generation/
│   ├── 5 Levels Of Summarization - Novice To Expert.ipynb
│   ├── Advanced Retrieval With LangChain.ipynb
│   ├── Ask A Book Questions.ipynb
│   ├── Clean and Standardize Data.ipynb
│   ├── Custom Files Question & Answer.ipynb
│   ├── Expert Structured Output (Using Function Calling).ipynb
│   ├── Expert Structured Output (Using Kor).ipynb
│   ├── Exploring ChatGPT Function Calling.ipynb
│   ├── Instructing LLMs To Match Tone.ipynb
│   ├── Personalized Email Generation.ipynb
│   ├── Retrieval_With_MMR.ipynb
│   ├── Topic Modeling With Language Models.ipynb
│   ├── Using LLMs To Summarize Personal Research.ipynb
│   └── Working With Call or Video Transcripts.ipynb
├── getting_started/
│   └── Quickstart Guide.ipynb
├── loaders/
│   ├── Google Drive Loader.ipynb
│   └── YouTube Loader.ipynb
├── requirements.txt
└── tutorials/
    ├── Google Drive Loader.ipynb
    ├── Twitter_Reply_Bot/
    │   └── Twitter Reply Bot Notebook.ipynb
    └── YouTube Loader.ipynb

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
#   For a library or package, you might want to ignore these files since the code is
#   intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
#   However, in case of collaboration, if having platform-specific dependencies or dependencies
#   having no cross-platform support, pipenv may install dependencies that don't work, or not
#   install all needed dependencies.
#Pipfile.lock

# poetry
#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
#   This is especially recommended for binary packages to ensure reproducibility, and is more
#   commonly ignored for libraries.
#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
#   in version control.
#   https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
#  and can be added to the global gitignore or merged into this file.  For a more nuclear
#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
**/.DS_STORE

================================================
FILE: LangChain Cookbook Part 1 - Fundamentals.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "359697d5",
   "metadata": {},
   "source": [
    "# LangChain Cookbook 👨‍🍳👩‍🍳"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11d788b0",
   "metadata": {},
   "source": [
    "*This cookbook is based off the [LangChain Conceptual Documentation](https://docs.langchain.com/docs/)*\n",
    "\n",
    "**Goal:** Provide an introductory understanding of the components and use cases of LangChain via [ELI5](https://www.dictionary.com/e/slang/eli5/#:~:text=ELI5%20is%20short%20for%20%E2%80%9CExplain,a%20complicated%20question%20or%20problem.) examples and code snippets. For use cases check out [part 2](https://github.com/gkamradt/langchain-tutorials/blob/main/LangChain%20Cookbook%20Part%202%20-%20Use%20Cases.ipynb). See [video tutorial](https://www.youtube.com/watch?v=2xxziIWmaSA) of this notebook.\n",
    "\n",
    "\n",
    "**Links:**\n",
    "* [LC Conceptual Documentation](https://docs.langchain.com/docs/)\n",
    "* [LC Python Documentation](https://python.langchain.com/en/latest/)\n",
    "* [LC Javascript/Typescript Documentation](https://js.langchain.com/docs/)\n",
    "* [LC Discord](https://discord.gg/6adMQxSpJS)\n",
    "* [www.langchain.com](https://langchain.com/)\n",
    "* [LC Twitter](https://twitter.com/LangChainAI)\n",
    "\n",
    "\n",
    "### **What is LangChain?**\n",
    "> LangChain is a framework for developing applications powered by language models.\n",
    "\n",
    "**~~TL~~DR**: LangChain makes the complicated parts of working & building with AI models easier. It helps do this in two ways:\n",
    "\n",
    "1. **Integration** - Bring external data, such as your files, other applications, and api data, to your LLMs\n",
    "2. **Agency** - Allow your LLMs to interact with it's environment via decision making. Use LLMs to help decide which action to take next\n",
    "\n",
    "### **Why LangChain?**\n",
    "1. **Components** - LangChain makes it easy to swap out abstractions and components necessary to work with language models.\n",
    "\n",
    "2. **Customized Chains** - LangChain provides out of the box support for using and customizing 'chains' - a series of actions strung together.\n",
    "\n",
    "3. **Speed 🚢** - This team ships insanely fast. You'll be up to date with the latest LLM features.\n",
    "\n",
    "4. **Community 👥** - Wonderful discord and community support, meet ups, hackathons, etc.\n",
    "\n",
    "Though LLMs can be straightforward (text-in, text-out) you'll quickly run into friction points that LangChain helps with once you develop more complicated applications.\n",
    "\n",
    "*Note: This cookbook will not cover all aspects of LangChain. It's contents have been curated to get you to building & impact as quick as possible. For more, please check out [LangChain Conceptual Documentation](https://docs.langchain.com/docs/)*\n",
    "\n",
    "*Update Oct '23: This notebook has been expanded from it's original form*\n",
    "\n",
    "You'll need an OpenAI api key to follow this tutorial. You can have it as an environement variable, in an .env file where this jupyter notebook lives, or insert it below where 'YourAPIKey' is. Have if you have questions on this, put these instructions into [ChatGPT](https://chat.openai.com/)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "e9815081",
   "metadata": {
    "hide_input": false
   },
   "outputs": [],
   "source": [
    "from dotenv import load_dotenv\n",
    "import os\n",
    "\n",
    "load_dotenv()\n",
    "\n",
    "openai_api_key=os.getenv('OPENAI_API_KEY', 'YourAPIKey')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "05bb564d",
   "metadata": {},
   "source": [
    "# LangChain Components\n",
    "\n",
    "## Schema - Nuts and Bolts of working with Large Language Models (LLMs)\n",
    "\n",
    "### **Text**\n",
    "The natural language way to interact with LLMs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "8e0dc06c",
   "metadata": {
    "hide_input": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'What day comes after Friday?'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# You'll be working with simple strings (that'll soon grow in complexity!)\n",
    "my_text = \"What day comes after Friday?\"\n",
    "my_text"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2f39eb39",
   "metadata": {},
   "source": [
    "### **Chat Messages**\n",
    "Like text, but specified with a message type (System, Human, AI)\n",
    "\n",
    "* **System** - Helpful background context that tell the AI what to do\n",
    "* **Human** - Messages that are intented to represent the user\n",
    "* **AI** - Messages that show what the AI responded with\n",
    "\n",
    "For more, see OpenAI's [documentation](https://platform.openai.com/docs/guides/chat/introduction)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "99b0935b",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.chat_models import ChatOpenAI\n",
    "from langchain.schema import HumanMessage, SystemMessage, AIMessage\n",
    "\n",
    "# This it the language model we'll use. We'll talk about what we're doing below in the next section\n",
    "chat = ChatOpenAI(temperature=.7, openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a2d2f7af",
   "metadata": {},
   "source": [
    "Now let's create a few messages that simulate a chat experience with a bot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "878d6a36",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AIMessage(content='You could try a caprese salad with fresh tomatoes, mozzarella, and basil.')"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chat(\n",
    "    [\n",
    "        SystemMessage(content=\"You are a nice AI bot that helps a user figure out what to eat in one short sentence\"),\n",
    "        HumanMessage(content=\"I like tomatoes, what should I eat?\")\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0a425aaa",
   "metadata": {},
   "source": [
    "You can also pass more chat history w/ responses from the AI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "8fd3fe88",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AIMessage(content='You should also explore the charming streets of the Old Town and indulge in delicious French cuisine.')"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chat(\n",
    "    [\n",
    "        SystemMessage(content=\"You are a nice AI bot that helps a user figure out where to travel in one short sentence\"),\n",
    "        HumanMessage(content=\"I like the beaches where should I go?\"),\n",
    "        AIMessage(content=\"You should go to Nice, France\"),\n",
    "        HumanMessage(content=\"What else should I do when I'm there?\")\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ff5ee37a",
   "metadata": {},
   "source": [
    "You can also exclude the system message if you want"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "238a49f6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AIMessage(content='Friday')"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chat(\n",
    "    [\n",
    "        HumanMessage(content=\"What day comes after Thursday?\")\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "66bf9634",
   "metadata": {},
   "source": [
    "### **Documents**\n",
    "An object that holds a piece of text and metadata (more information about that text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "3bbf58b2",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.schema import Document"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "6ad9bef6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Document(page_content=\"This is my document. It is full of text that I've gathered from other places\", metadata={'my_document_id': 234234, 'my_document_source': 'The LangChain Papers', 'my_document_create_time': 1680013019})"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Document(page_content=\"This is my document. It is full of text that I've gathered from other places\",\n",
    "         metadata={\n",
    "             'my_document_id' : 234234,\n",
    "             'my_document_source' : \"The LangChain Papers\",\n",
    "             'my_document_create_time' : 1680013019\n",
    "         })"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3bd19754",
   "metadata": {},
   "source": [
    "But you don't have to include metadata if you don't want to"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "0798d3ca",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Document(page_content=\"This is my document. It is full of text that I've gathered from other places\")"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Document(page_content=\"This is my document. It is full of text that I've gathered from other places\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2e462b5d",
   "metadata": {},
   "source": [
    "## Models - The interface to the AI brains"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b27fe982",
   "metadata": {},
   "source": [
    "###  **Language Model**\n",
    "A model that does text in ➡️ text out!\n",
    "\n",
    "*Check out how I changed the model I was using from the default one to ada-001 (a very cheap, low performing model). See more models [here](https://platform.openai.com/docs/models)*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "74b1a72a",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import OpenAI\n",
    "\n",
    "llm = OpenAI(model_name=\"text-ada-001\", openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "6399c295",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'\\n\\nSaturday'"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "llm(\"What day comes after Friday?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3ef89bfa",
   "metadata": {},
   "source": [
    "### **Chat Model**\n",
    "A model that takes a series of messages and returns a message output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "bf091777",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.chat_models import ChatOpenAI\n",
    "from langchain.schema import HumanMessage, SystemMessage, AIMessage\n",
    "\n",
    "chat = ChatOpenAI(temperature=1, openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "f4260711",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AIMessage(content='Why did the math book go to New York? Because it had too many problems and needed a change of scenery!')"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chat(\n",
    "    [\n",
    "        SystemMessage(content=\"You are an unhelpful AI bot that makes a joke at whatever the user says\"),\n",
    "        HumanMessage(content=\"I would like to go to New York, how should I do this?\")\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "05c028f9",
   "metadata": {},
   "source": [
    "### Function Calling Models\n",
    "\n",
    "[Function calling models](https://openai.com/blog/function-calling-and-other-api-updates) are similar to Chat Models but with a little extra flavor. They are fine tuned to give structured data outputs.\n",
    "\n",
    "This comes in handy when you're making an API call to an external service or doing extraction."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "1020ff45",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AIMessage(content='', additional_kwargs={'function_call': {'name': 'get_current_weather', 'arguments': '{\\n  \"location\": \"Boston, MA\"\\n}'}})"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chat = ChatOpenAI(model='gpt-3.5-turbo-0613', temperature=1, openai_api_key=openai_api_key)\n",
    "\n",
    "output = chat(messages=\n",
    "     [\n",
    "         SystemMessage(content=\"You are an helpful AI bot\"),\n",
    "         HumanMessage(content=\"What’s the weather like in Boston right now?\")\n",
    "     ],\n",
    "     functions=[{\n",
    "         \"name\": \"get_current_weather\",\n",
    "         \"description\": \"Get the current weather in a given location\",\n",
    "         \"parameters\": {\n",
    "             \"type\": \"object\",\n",
    "             \"properties\": {\n",
    "                 \"location\": {\n",
    "                     \"type\": \"string\",\n",
    "                     \"description\": \"The city and state, e.g. San Francisco, CA\"\n",
    "                 },\n",
    "                 \"unit\": {\n",
    "                     \"type\": \"string\",\n",
    "                     \"enum\": [\"celsius\", \"fahrenheit\"]\n",
    "                 }\n",
    "             },\n",
    "             \"required\": [\"location\"]\n",
    "         }\n",
    "     }\n",
    "     ]\n",
    ")\n",
    "output"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0f399a1d",
   "metadata": {},
   "source": [
    "See the extra `additional_kwargs` that is passed back to us? We can take that and pass it to an external API to get data. It saves the hassle of doing output parsing."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c2b70f23",
   "metadata": {},
   "source": [
    "### **Text Embedding Model**\n",
    "Change your text into a vector (a series of numbers that hold the semantic 'meaning' of your text). Mainly used when comparing two pieces of text together.\n",
    "\n",
    "*BTW: Semantic means 'relating to meaning in language or logic.'*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "1655de82",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.embeddings import OpenAIEmbeddings\n",
    "\n",
    "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "a2c85e7e",
   "metadata": {},
   "outputs": [],
   "source": [
    "text = \"Hi! It's time for the beach\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "ddc5a368",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Here's a sample: [-0.00019600906371495047, -0.0031846734422911363, -0.0007734206914647714, -0.019472001962491232, -0.015092319017854244]...\n",
      "Your embedding is length 1536\n"
     ]
    }
   ],
   "source": [
    "text_embedding = embeddings.embed_query(text)\n",
    "print (f\"Here's a sample: {text_embedding[:5]}...\")\n",
    "print (f\"Your embedding is length {len(text_embedding)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c38fe99f",
   "metadata": {},
   "source": [
    "## Prompts - Text generally used as instructions to your model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8b9318ed",
   "metadata": {},
   "source": [
    "### **Prompt**\n",
    "What you'll pass to the underlying model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "2d270239",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "The statement is incorrect. Tomorrow is Tuesday, not Wednesday.\n"
     ]
    }
   ],
   "source": [
    "from langchain.llms import OpenAI\n",
    "\n",
    "llm = OpenAI(model_name=\"text-davinci-003\", openai_api_key=openai_api_key)\n",
    "\n",
    "# I like to use three double quotation marks for my prompts because it's easier to read\n",
    "prompt = \"\"\"\n",
    "Today is Monday, tomorrow is Wednesday.\n",
    "\n",
    "What is wrong with that statement?\n",
    "\"\"\"\n",
    "\n",
    "print(llm(prompt))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "74988254",
   "metadata": {},
   "source": [
    "### **Prompt Template**\n",
    "An object that helps create prompts based on a combination of user input, other non-static information and a fixed template string.\n",
    "\n",
    "Think of it as an [f-string](https://realpython.com/python-f-strings/) in python but for prompts\n",
    "\n",
    "*Advanced: Check out LangSmithHub(https://smith.langchain.com/hub) for many more communit prompt templates*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "abcc212d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Final Prompt: \n",
      "I really want to travel to Rome. What should I do there?\n",
      "\n",
      "Respond in one short sentence\n",
      "\n",
      "-----------\n",
      "LLM Output: Visit the Colosseum, the Vatican, and the Trevi Fountain.\n"
     ]
    }
   ],
   "source": [
    "from langchain.llms import OpenAI\n",
    "from langchain import PromptTemplate\n",
    "\n",
    "llm = OpenAI(model_name=\"text-davinci-003\", openai_api_key=openai_api_key)\n",
    "\n",
    "# Notice \"location\" below, that is a placeholder for another value later\n",
    "template = \"\"\"\n",
    "I really want to travel to {location}. What should I do there?\n",
    "\n",
    "Respond in one short sentence\n",
    "\"\"\"\n",
    "\n",
    "prompt = PromptTemplate(\n",
    "    input_variables=[\"location\"],\n",
    "    template=template,\n",
    ")\n",
    "\n",
    "final_prompt = prompt.format(location='Rome')\n",
    "\n",
    "print (f\"Final Prompt: {final_prompt}\")\n",
    "print (\"-----------\")\n",
    "print (f\"LLM Output: {llm(final_prompt)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ed40bac2",
   "metadata": {},
   "source": [
    "### **Example Selectors**\n",
    "An easy way to select from a series of examples that allow you to dynamic place in-context information into your prompt. Often used when your task is nuanced or you have a large list of examples.\n",
    "\n",
    "Check out different types of example selectors [here](https://python.langchain.com/docs/modules/model_io/prompts/example_selectors/)\n",
    "\n",
    "If you want an overview on why examples are important (prompt engineering), check out [this video](https://www.youtube.com/watch?v=dOxUroR57xs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "aaf36cd9",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/gregorykamradt/opt/anaconda3/lib/python3.9/site-packages/deeplake/util/check_latest_version.py:32: UserWarning: A newer version of deeplake (3.7.2) is available. It's recommended that you update to the latest version using `pip install -U deeplake`.\n",
      "  warnings.warn(\n"
     ]
    }
   ],
   "source": [
    "from langchain.prompts.example_selector import SemanticSimilarityExampleSelector\n",
    "from langchain.vectorstores import Chroma\n",
    "from langchain.embeddings import OpenAIEmbeddings\n",
    "from langchain.prompts import FewShotPromptTemplate, PromptTemplate\n",
    "from langchain.llms import OpenAI\n",
    "\n",
    "llm = OpenAI(model_name=\"text-davinci-003\", openai_api_key=openai_api_key)\n",
    "\n",
    "example_prompt = PromptTemplate(\n",
    "    input_variables=[\"input\", \"output\"],\n",
    "    template=\"Example Input: {input}\\nExample Output: {output}\",\n",
    ")\n",
    "\n",
    "# Examples of locations that nouns are found\n",
    "examples = [\n",
    "    {\"input\": \"pirate\", \"output\": \"ship\"},\n",
    "    {\"input\": \"pilot\", \"output\": \"plane\"},\n",
    "    {\"input\": \"driver\", \"output\": \"car\"},\n",
    "    {\"input\": \"tree\", \"output\": \"ground\"},\n",
    "    {\"input\": \"bird\", \"output\": \"nest\"},\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "12b4798b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# SemanticSimilarityExampleSelector will select examples that are similar to your input by semantic meaning\n",
    "\n",
    "example_selector = SemanticSimilarityExampleSelector.from_examples(\n",
    "    # This is the list of examples available to select from.\n",
    "    examples, \n",
    "    \n",
    "    # This is the embedding class used to produce embeddings which are used to measure semantic similarity.\n",
    "    OpenAIEmbeddings(openai_api_key=openai_api_key), \n",
    "    \n",
    "    # This is the VectorStore class that is used to store the embeddings and do a similarity search over.\n",
    "    Chroma, \n",
    "    \n",
    "    # This is the number of examples to produce.\n",
    "    k=2\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "2cf30107",
   "metadata": {},
   "outputs": [],
   "source": [
    "similar_prompt = FewShotPromptTemplate(\n",
    "    # The object that will help select examples\n",
    "    example_selector=example_selector,\n",
    "    \n",
    "    # Your prompt\n",
    "    example_prompt=example_prompt,\n",
    "    \n",
    "    # Customizations that will be added to the top and bottom of your prompt\n",
    "    prefix=\"Give the location an item is usually found in\",\n",
    "    suffix=\"Input: {noun}\\nOutput:\",\n",
    "    \n",
    "    # What inputs your prompt will receive\n",
    "    input_variables=[\"noun\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "369442bb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Give the location an item is usually found in\n",
      "\n",
      "Example Input: tree\n",
      "Example Output: ground\n",
      "\n",
      "Example Input: bird\n",
      "Example Output: nest\n",
      "\n",
      "Input: plant\n",
      "Output:\n"
     ]
    }
   ],
   "source": [
    "# Select a noun!\n",
    "my_noun = \"plant\"\n",
    "# my_noun = \"student\"\n",
    "\n",
    "print(similar_prompt.format(noun=my_noun))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "9bb910f2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "' pot'"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "llm(similar_prompt.format(noun=my_noun))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8474c91d",
   "metadata": {},
   "source": [
    "### **Output Parsers Method 1: Prompt Instructions & String Parsing**\n",
    "A helpful way to format the output of a model. Usually used for structured output. LangChain has a bunch more output parsers listed on their [documentation](https://python.langchain.com/docs/modules/model_io/output_parsers).\n",
    "\n",
    "Two big concepts:\n",
    "\n",
    "**1. Format Instructions** - A autogenerated prompt that tells the LLM how to format it's response based off your desired result\n",
    "\n",
    "**2. Parser** - A method which will extract your model's text output into a desired structure (usually json)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "58353756",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.output_parsers import StructuredOutputParser, ResponseSchema\n",
    "from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate\n",
    "from langchain.llms import OpenAI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "ee36f881",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = OpenAI(model_name=\"text-davinci-003\", openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "fa59be3f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# How you would like your response structured. This is basically a fancy prompt template\n",
    "response_schemas = [\n",
    "    ResponseSchema(name=\"bad_string\", description=\"This a poorly formatted user input string\"),\n",
    "    ResponseSchema(name=\"good_string\", description=\"This is your response, a reformatted response\")\n",
    "]\n",
    "\n",
    "# How you would like to parse your output\n",
    "output_parser = StructuredOutputParser.from_response_schemas(response_schemas)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "d1079f0a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The output should be a markdown code snippet formatted in the following schema, including the leading and trailing \"```json\" and \"```\":\n",
      "\n",
      "```json\n",
      "{\n",
      "\t\"bad_string\": string  // This a poorly formatted user input string\n",
      "\t\"good_string\": string  // This is your response, a reformatted response\n",
      "}\n",
      "```\n"
     ]
    }
   ],
   "source": [
    "# See the prompt template you created for formatting\n",
    "format_instructions = output_parser.get_format_instructions()\n",
    "print (format_instructions)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "9aaae5be",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "You will be given a poorly formatted string from a user.\n",
      "Reformat it and make sure all the words are spelled correctly\n",
      "\n",
      "The output should be a markdown code snippet formatted in the following schema, including the leading and trailing \"```json\" and \"```\":\n",
      "\n",
      "```json\n",
      "{\n",
      "\t\"bad_string\": string  // This a poorly formatted user input string\n",
      "\t\"good_string\": string  // This is your response, a reformatted response\n",
      "}\n",
      "```\n",
      "\n",
      "% USER INPUT:\n",
      "welcom to califonya!\n",
      "\n",
      "YOUR RESPONSE:\n",
      "\n"
     ]
    }
   ],
   "source": [
    "template = \"\"\"\n",
    "You will be given a poorly formatted string from a user.\n",
    "Reformat it and make sure all the words are spelled correctly\n",
    "\n",
    "{format_instructions}\n",
    "\n",
    "% USER INPUT:\n",
    "{user_input}\n",
    "\n",
    "YOUR RESPONSE:\n",
    "\"\"\"\n",
    "\n",
    "prompt = PromptTemplate(\n",
    "    input_variables=[\"user_input\"],\n",
    "    partial_variables={\"format_instructions\": format_instructions},\n",
    "    template=template\n",
    ")\n",
    "\n",
    "promptValue = prompt.format(user_input=\"welcom to califonya!\")\n",
    "\n",
    "print(promptValue)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "b116bb23",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'```json\\n{\\n\\t\"bad_string\": \"welcom to califonya!\", \\n\\t\"good_string\": \"Welcome to California!\"\\n}\\n```'"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "llm_output = llm(promptValue)\n",
    "llm_output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "985aa814",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'bad_string': 'welcom to califonya!', 'good_string': 'Welcome to California!'}"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "output_parser.parse(llm_output)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "07045ae3",
   "metadata": {},
   "source": [
    "### **Output Parsers Method 2: OpenAI Fuctions**\n",
    "When OpenAI released function calling, the game changed. This is recommended method when starting out.\n",
    "\n",
    "They trained models specifically for outputing structured data. It became super easy to specify a Pydantic schema and get a structured output.\n",
    "\n",
    "There are many ways to define your schema, I prefer using Pydantic Models because of how organized they are. Feel free to reference OpenAI's [documention](https://platform.openai.com/docs/guides/gpt/function-calling) for other methods.\n",
    "\n",
    "In order to use this method you'll need to use a model that supports [function calling](https://openai.com/blog/function-calling-and-other-api-updates#:~:text=Developers%20can%20now%20describe%20functions%20to%20gpt%2D4%2D0613%20and%20gpt%2D3.5%2Dturbo%2D0613%2C). I'll use `gpt4-0613`\n",
    "\n",
    "**Example 1: Simple**\n",
    "\n",
    "Let's get started by defining a simple model for us to extract from."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "3593699b",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.pydantic_v1 import BaseModel, Field\n",
    "from typing import Optional\n",
    "\n",
    "class Person(BaseModel):\n",
    "    \"\"\"Identifying information about a person.\"\"\"\n",
    "\n",
    "    name: str = Field(..., description=\"The person's name\")\n",
    "    age: int = Field(..., description=\"The person's age\")\n",
    "    fav_food: Optional[str] = Field(None, description=\"The person's favorite food\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17033d15",
   "metadata": {},
   "source": [
    "Then let's create a chain (more on this later) that will do the extracting for us"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "60b7be09",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Person(name='Sally, Joey, Caroline', age=13, fav_food='spinach')"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain.chains.openai_functions import create_structured_output_chain\n",
    "\n",
    "llm = ChatOpenAI(model='gpt-4-0613', openai_api_key=openai_api_key)\n",
    "\n",
    "chain = create_structured_output_chain(Person, llm, prompt)\n",
    "chain.run(\n",
    "    \"Sally is 13, Joey just turned 12 and loves spinach. Caroline is 10 years older than Sally.\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37370210",
   "metadata": {},
   "source": [
    "Notice how we only have data on one person from that list? That is because we didn't specify we wanted multiple. Let's change our schema to specify that we want a list of people if possible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "df4ad5e6",
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Sequence\n",
    "\n",
    "class People(BaseModel):\n",
    "    \"\"\"Identifying information about all people in a text.\"\"\"\n",
    "\n",
    "    people: Sequence[Person] = Field(..., description=\"The people in the text\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa2bc127",
   "metadata": {},
   "source": [
    "Now we'll call for People rather than Person"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "5ba430d5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "People(people=[Person(name='Sally', age=13, fav_food=None), Person(name='Joey', age=12, fav_food='spinach'), Person(name='Caroline', age=23, fav_food=None)])"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain = create_structured_output_chain(People, llm, prompt)\n",
    "chain.run(\n",
    "    \"Sally is 13, Joey just turned 12 and loves spinach. Caroline is 10 years older than Sally.\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12db9b8b",
   "metadata": {},
   "source": [
    "Let's do some more parsing with it\n",
    "\n",
    "**Example 2: Enum**\n",
    "\n",
    "Now let's parse when a product from a list is mentioned"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "6616a735",
   "metadata": {},
   "outputs": [],
   "source": [
    "import enum\n",
    "\n",
    "llm = ChatOpenAI(model='gpt-4-0613', openai_api_key=openai_api_key)\n",
    "\n",
    "class Product(str, enum.Enum):\n",
    "    CRM = \"CRM\"\n",
    "    VIDEO_EDITING = \"VIDEO_EDITING\"\n",
    "    HARDWARE = \"HARDWARE\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "a5250ff5",
   "metadata": {},
   "outputs": [],
   "source": [
    "class Products(BaseModel):\n",
    "    \"\"\"Identifying products that were mentioned in a text\"\"\"\n",
    "\n",
    "    products: Sequence[Product] = Field(..., description=\"The products mentioned in a text\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "dd7e0bbf",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Products(products=[<Product.CRM: 'CRM'>, <Product.HARDWARE: 'HARDWARE'>, <Product.VIDEO_EDITING: 'VIDEO_EDITING'>])"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain = create_structured_output_chain(Products, llm, prompt)\n",
    "chain.run(\n",
    "    \"The CRM in this demo is great. Love the hardware. The microphone is also cool. Love the video editing\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7b43cec2",
   "metadata": {},
   "source": [
    "## Indexes - Structuring documents to LLMs can work with them"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3f904e9",
   "metadata": {},
   "source": [
    "### **Document Loaders**\n",
    "Easy ways to import data from other sources. Shared functionality with [OpenAI Plugins](https://openai.com/blog/chatgpt-plugins) [specifically retrieval plugins](https://github.com/openai/chatgpt-retrieval-plugin)\n",
    "\n",
    "See a [big list](https://python.langchain.com/en/latest/modules/indexes/document_loaders.html) of document loaders here. A bunch more on [Llama Index](https://llamahub.ai/) as well."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7d4719d4",
   "metadata": {},
   "source": [
    "**HackerNews**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "ba88e05b",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.document_loaders import HNLoader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "ee693520",
   "metadata": {},
   "outputs": [],
   "source": [
    "loader = HNLoader(\"https://news.ycombinator.com/item?id=34422627\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "88d89ad7",
   "metadata": {},
   "outputs": [],
   "source": [
    "data = loader.load()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "e814f930",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Found 76 comments\n",
      "Here's a sample:\n",
      "\n",
      "Ozzie_osman 8 months ago  \n",
      "             | next [–] \n",
      "\n",
      "LangChain is awesome. For people not sure what it's doing, large language models (LLMs) are very Ozzie_osman 8 months ago  \n",
      "             | parent | next [–] \n",
      "\n",
      "Also, another library to check out is GPT Index (https://github.com/jerryjliu/gpt_index)\n"
     ]
    }
   ],
   "source": [
    "print (f\"Found {len(data)} comments\")\n",
    "print (f\"Here's a sample:\\n\\n{''.join([x.page_content[:150] for x in data[:2]])}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c564583f",
   "metadata": {},
   "source": [
    "**Books from Gutenberg Project**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "72964fb8",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.document_loaders import GutenbergLoader\n",
    "\n",
    "loader = GutenbergLoader(\"https://www.gutenberg.org/cache/epub/2148/pg2148.txt\")\n",
    "\n",
    "data = loader.load()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "id": "47140a26",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "      At Paris, just after dark one gusty evening in the autumn of 18-,\r\n",
      "\n",
      "\n",
      "      I was enjoying the twofold luxury of meditation \n"
     ]
    }
   ],
   "source": [
    "print(data[0].page_content[1855:1984])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f1386b0",
   "metadata": {},
   "source": [
    "**URLs and webpages**\n",
    "\n",
    "Let's try it out with [Paul Graham's website](http://www.paulgraham.com/)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "id": "46a54e7d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'New: \\n\\nHow to Do Great Work |\\nRead |\\nWill |\\nTruth\\n\\n\\n\\n\\n\\nWant to start a startup? Get funded by Y Combinator.\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n© mmxxiii pg'"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain.document_loaders import UnstructuredURLLoader\n",
    "\n",
    "urls = [\n",
    "    \"http://www.paulgraham.com/\",\n",
    "]\n",
    "\n",
    "loader = UnstructuredURLLoader(urls=urls)\n",
    "\n",
    "data = loader.load()\n",
    "\n",
    "data[0].page_content"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0e9601db",
   "metadata": {},
   "source": [
    "### **Text Splitters**\n",
    "Often times your document is too long (like a book) for your LLM. You need to split it up into chunks. Text splitters help with this.\n",
    "\n",
    "There are many ways you could split your text into chunks, experiment with [different ones](https://python.langchain.com/en/latest/modules/indexes/text_splitters.html) to see which is best for you."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "id": "95713e57",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.text_splitter import RecursiveCharacterTextSplitter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "id": "a54455f5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You have 1 document\n"
     ]
    }
   ],
   "source": [
    "# This is a long document we can split up.\n",
    "with open('data/PaulGrahamEssays/worked.txt') as f:\n",
    "    pg_work = f.read()\n",
    "    \n",
    "print (f\"You have {len([pg_work])} document\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "id": "d19acb18",
   "metadata": {},
   "outputs": [],
   "source": [
    "text_splitter = RecursiveCharacterTextSplitter(\n",
    "    # Set a really small chunk size, just to show.\n",
    "    chunk_size = 150,\n",
    "    chunk_overlap  = 20,\n",
    ")\n",
    "\n",
    "texts = text_splitter.create_documents([pg_work])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "id": "e3090f05",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You have 610 documents\n"
     ]
    }
   ],
   "source": [
    "print (f\"You have {len(texts)} documents\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "id": "87a0f45a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Preview:\n",
      "February 2021Before college the two main things I worked on, outside of school,\n",
      "were writing and programming. I didn't write essays. I wrote what \n",
      "\n",
      "beginning writers were supposed to write then, and probably still\n",
      "are: short stories. My stories were awful. They had hardly any plot,\n"
     ]
    }
   ],
   "source": [
    "print (\"Preview:\")\n",
    "print (texts[0].page_content, \"\\n\")\n",
    "print (texts[1].page_content)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ad9e670d",
   "metadata": {},
   "source": [
    "There are a ton of different ways to do text splitting and it really depends on your retrieval strategy and application design. Check out more splitters [here](https://python.langchain.com/docs/modules/data_connection/document_transformers/)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1f85defb",
   "metadata": {},
   "source": [
    "### **Retrievers**\n",
    "Easy way to combine documents with language models.\n",
    "\n",
    "There are many different types of retrievers, the most widely supported is the VectoreStoreRetriever"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "id": "8cccbd82",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.document_loaders import TextLoader\n",
    "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
    "from langchain.vectorstores import FAISS\n",
    "from langchain.embeddings import OpenAIEmbeddings\n",
    "\n",
    "loader = TextLoader('data/PaulGrahamEssays/worked.txt')\n",
    "documents = loader.load()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "id": "1dab1c20",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get your splitter ready\n",
    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)\n",
    "\n",
    "# Split your docs into texts\n",
    "texts = text_splitter.split_documents(documents)\n",
    "\n",
    "# Get embedding engine ready\n",
    "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
    "\n",
    "# Embedd your texts\n",
    "db = FAISS.from_documents(texts, embeddings)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "id": "e62372be",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Init your retriever. Asking for just 1 document back\n",
    "retriever = db.as_retriever()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "id": "e0534bbd",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "VectorStoreRetriever(tags=['FAISS'], vectorstore=<langchain.vectorstores.faiss.FAISS object at 0x7f8389169070>)"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "retriever"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "3846a3b5",
   "metadata": {},
   "outputs": [],
   "source": [
    "docs = retriever.get_relevant_documents(\"what types of things did the author want to build?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "db383cc8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "standards; what was the point? No one else wanted one either, so\n",
      "off they went. That was what happened to systems work.I wanted not just to build things, but to build things that would\n",
      "last.In this di\n",
      "\n",
      "much of it in grad school.Computer Science is an uneasy alliance between two halves, theory\n",
      "and systems. The theory people prove things, and the systems people\n",
      "build things. I wanted to build things. \n"
     ]
    }
   ],
   "source": [
    "print(\"\\n\\n\".join([x.page_content[:200] for x in docs[:2]]))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "24193139",
   "metadata": {},
   "source": [
    "### **VectorStores**\n",
    "Databases to store vectors. Most popular ones are [Pinecone](https://www.pinecone.io/) & [Weaviate](https://weaviate.io/). More examples on OpenAIs [retriever documentation](https://github.com/openai/chatgpt-retrieval-plugin#choosing-a-vector-database). [Chroma](https://www.trychroma.com/) & [FAISS](https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/) are easy to work with locally.\n",
    "\n",
    "Conceptually, think of them as tables w/ a column for embeddings (vectors) and a column for metadata.\n",
    "\n",
    "Example\n",
    "\n",
    "| Embedding      | Metadata |\n",
    "| ----------- | ----------- |\n",
    "| [-0.00015641732898075134, -0.003165106289088726, ...]      | {'date' : '1/2/23}       |\n",
    "| [-0.00035465431654651654, 1.4654131651654516546, ...]   | {'date' : '1/3/23}        |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "id": "3c5533ad",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.document_loaders import TextLoader\n",
    "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
    "from langchain.vectorstores import FAISS\n",
    "from langchain.embeddings import OpenAIEmbeddings\n",
    "\n",
    "loader = TextLoader('data/PaulGrahamEssays/worked.txt')\n",
    "documents = loader.load()\n",
    "\n",
    "# Get your splitter ready\n",
    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)\n",
    "\n",
    "# Split your docs into texts\n",
    "texts = text_splitter.split_documents(documents)\n",
    "\n",
    "# Get embedding engine ready\n",
    "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "id": "661fdf19",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You have 78 documents\n"
     ]
    }
   ],
   "source": [
    "print (f\"You have {len(texts)} documents\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "id": "e99ac0ea",
   "metadata": {},
   "outputs": [],
   "source": [
    "embedding_list = embeddings.embed_documents([text.page_content for text in texts])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "id": "89e7758c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You have 78 embeddings\n",
      "Here's a sample of one: [-0.001058628615053026, -0.01118234211553424, -0.012874804746266883]...\n"
     ]
    }
   ],
   "source": [
    "print (f\"You have {len(embedding_list)} embeddings\")\n",
    "print (f\"Here's a sample of one: {embedding_list[0][:3]}...\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8ac358c5",
   "metadata": {},
   "source": [
    "Your vectorstore store your embeddings (☝️) and make them easily searchable"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f9b9b79b",
   "metadata": {},
   "source": [
    "## Memory\n",
    "Helping LLMs remember information.\n",
    "\n",
    "Memory is a bit of a loose term. It could be as simple as remembering information you've chatted about in the past or more complicated information retrieval.\n",
    "\n",
    "We'll keep it towards the Chat Message use case. This would be used for chat bots.\n",
    "\n",
    "There are many types of memory, explore [the documentation](https://python.langchain.com/en/latest/modules/memory/how_to_guides.html) to see which one fits your use case."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f43b49da",
   "metadata": {},
   "source": [
    "### Chat Message History"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "id": "893a18c1",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.memory import ChatMessageHistory\n",
    "from langchain.chat_models import ChatOpenAI\n",
    "\n",
    "chat = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)\n",
    "\n",
    "history = ChatMessageHistory()\n",
    "\n",
    "history.add_ai_message(\"hi!\")\n",
    "\n",
    "history.add_user_message(\"what is the capital of france?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "id": "a2949fda",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[AIMessage(content='hi!'),\n",
       " HumanMessage(content='what is the capital of france?')]"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "history.messages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "id": "9b74d5cf",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AIMessage(content='The capital of France is Paris.')"
      ]
     },
     "execution_count": 63,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ai_response = chat(history.messages)\n",
    "ai_response"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "id": "529e168f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[AIMessage(content='hi!'),\n",
       " HumanMessage(content='what is the capital of france?'),\n",
       " AIMessage(content='The capital of France is Paris.')]"
      ]
     },
     "execution_count": 64,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "history.add_ai_message(ai_response.content)\n",
    "history.messages"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f29fc79c",
   "metadata": {},
   "source": [
    "## Chains ⛓️⛓️⛓️\n",
    "Combining different LLM calls and action automatically\n",
    "\n",
    "Ex: Summary #1, Summary #2, Summary #3 > Final Summary\n",
    "\n",
    "Check out [this video](https://www.youtube.com/watch?v=f9_BWhCI4Zo&t=2s) explaining different summarization chain types\n",
    "\n",
    "There are [many applications of chains](https://python.langchain.com/en/latest/modules/chains/how_to_guides.html) search to see which are best for your use case.\n",
    "\n",
    "We'll cover two of them:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c34ba415",
   "metadata": {},
   "source": [
    "### 1. Simple Sequential Chains\n",
    "\n",
    "Easy chains where you can use the output of an LLM as an input into another. Good for breaking up tasks (and keeping your LLM focused)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "id": "79fc0950",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import OpenAI\n",
    "from langchain.chains import LLMChain\n",
    "from langchain.prompts import PromptTemplate\n",
    "from langchain.chains import SimpleSequentialChain\n",
    "\n",
    "llm = OpenAI(temperature=1, openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "id": "43d4494a",
   "metadata": {},
   "outputs": [],
   "source": [
    "template = \"\"\"Your job is to come up with a classic dish from the area that the users suggests.\n",
    "% USER LOCATION\n",
    "{user_location}\n",
    "\n",
    "YOUR RESPONSE:\n",
    "\"\"\"\n",
    "prompt_template = PromptTemplate(input_variables=[\"user_location\"], template=template)\n",
    "\n",
    "# Holds my 'location' chain\n",
    "location_chain = LLMChain(llm=llm, prompt=prompt_template)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "id": "b6c8e00f",
   "metadata": {},
   "outputs": [],
   "source": [
    "template = \"\"\"Given a meal, give a short and simple recipe on how to make that dish at home.\n",
    "% MEAL\n",
    "{user_meal}\n",
    "\n",
    "YOUR RESPONSE:\n",
    "\"\"\"\n",
    "prompt_template = PromptTemplate(input_variables=[\"user_meal\"], template=template)\n",
    "\n",
    "# Holds my 'meal' chain\n",
    "meal_chain = LLMChain(llm=llm, prompt=prompt_template)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "id": "7e0b83f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "overall_chain = SimpleSequentialChain(chains=[location_chain, meal_chain], verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "id": "7d19c64d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new SimpleSequentialChain chain...\u001b[0m\n",
      "\u001b[36;1m\u001b[1;3m\n",
      "A classic dish from Rome is Spaghetti alla Carbonara, featuring egg, Parmesan cheese, black pepper, and pancetta or guanciale.\u001b[0m\n",
      "\u001b[33;1m\u001b[1;3m\n",
      "Ingredients:\n",
      "- 8oz spaghetti \n",
      "- 4 tablespoons olive oil\n",
      "- 4oz diced pancetta or guanciale\n",
      "- 2 cloves garlic, minced\n",
      "- 2 eggs, lightly beaten\n",
      "- 2 tablespoons parsley, chopped \n",
      "- ½ cup grated Parmesan \n",
      "- Salt and black pepper to taste\n",
      "\n",
      "Instructions:\n",
      "1. Bring a pot of salted water to a boil and add the spaghetti. Cook according to package directions. \n",
      "2. Meanwhile, add the olive oil to a large skillet over medium-high heat. Add the diced pancetta and garlic, and cook until pancetta is browned and garlic is fragrant.\n",
      "3. In a medium bowl, whisk together the eggs, parsley, Parmesan, and salt and pepper.\n",
      "4. Drain the cooked spaghetti and add it to the skillet with the pancetta and garlic. Remove from heat and pour the egg mixture over the spaghetti, stirring to combine. \n",
      "5. Serve the spaghetti alla carbonara with additional Parmesan cheese and black pepper.\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "review = overall_chain.run(\"Rome\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f6191bf5",
   "metadata": {},
   "source": [
    "### 2. Summarization Chain\n",
    "\n",
    "Easily run through long numerous documents and get a summary. Check out [this video](https://www.youtube.com/watch?v=f9_BWhCI4Zo) for other chain types besides map-reduce"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "id": "6f218c3e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new MapReduceDocumentsChain chain...\u001b[0m\n",
      "\n",
      "\n",
      "\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
      "Prompt after formatting:\n",
      "\u001b[32;1m\u001b[1;3mWrite a concise summary of the following:\n",
      "\n",
      "\n",
      "\"January 2017Because biographies of famous scientists tend to \n",
      "edit out their mistakes, we underestimate the \n",
      "degree of risk they were willing to take.\n",
      "And because anything a famous scientist did that\n",
      "wasn't a mistake has probably now become the\n",
      "conventional wisdom, those choices don't\n",
      "seem risky either.Biographies of Newton, for example, understandably focus\n",
      "more on physics than alchemy or theology.\n",
      "The impression we get is that his unerring judgment\n",
      "led him straight to truths no one else had noticed.\n",
      "How to explain all the time he spent on alchemy\n",
      "and theology?  Well, smart people are often kind of\n",
      "crazy.But maybe there is a simpler explanation. Maybe\"\n",
      "\n",
      "\n",
      "CONCISE SUMMARY:\u001b[0m\n",
      "Prompt after formatting:\n",
      "\u001b[32;1m\u001b[1;3mWrite a concise summary of the following:\n",
      "\n",
      "\n",
      "\"the smartness and the craziness were not as separate\n",
      "as we think. Physics seems to us a promising thing\n",
      "to work on, and alchemy and theology obvious wastes\n",
      "of time. But that's because we know how things\n",
      "turned out. In Newton's day the three problems \n",
      "seemed roughly equally promising. No one knew yet\n",
      "what the payoff would be for inventing what we\n",
      "now call physics; if they had, more people would \n",
      "have been working on it. And alchemy and theology\n",
      "were still then in the category Marc Andreessen would \n",
      "describe as \"huge, if true.\"Newton made three bets. One of them worked. But \n",
      "they were all risky.\"\n",
      "\n",
      "\n",
      "CONCISE SUMMARY:\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n",
      "\n",
      "\n",
      "\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
      "Prompt after formatting:\n",
      "\u001b[32;1m\u001b[1;3mWrite a concise summary of the following:\n",
      "\n",
      "\n",
      "\" Biographies of famous scientists often edit out their mistakes, giving readers the wrong impression that they never faced any risks to achieve successful results. An example of this is Newton, whose smartness is assumed to have led straight him to truths without any detours into alchemy or theology - despite the fact that he spent a lot of time on both fields. Maybe the simpler explanation is that he was willing to take risks, even if it means potentially making mistakes.\n",
      "\n",
      " In the 17th century, Newton took a risk and made three bets, one of which turned out to be a successful invention of what we now call physics. The other two bets were on less popular subjects of the time such as alchemy and theology. People did not know then what the payoff would be, but the bets still seemed relatively promising.\"\n",
      "\n",
      "\n",
      "CONCISE SUMMARY:\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\" Biographies tend to omit famous scientists' mistakes from their stories, but Newton was willing to take risks and explore multiple fields to make his discoveries. He placed three risky bets, one of which resulted in the creation of physics as we know it today.\""
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain.chains.summarize import load_summarize_chain\n",
    "from langchain.document_loaders import TextLoader\n",
    "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
    "\n",
    "loader = TextLoader('data/PaulGrahamEssays/disc.txt')\n",
    "documents = loader.load()\n",
    "\n",
    "# Get your splitter ready\n",
    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=50)\n",
    "\n",
    "# Split your docs into texts\n",
    "texts = text_splitter.split_documents(documents)\n",
    "\n",
    "# There is a lot of complexity hidden in this one line. I encourage you to check out the video above for more detail\n",
    "chain = load_summarize_chain(llm, chain_type=\"map_reduce\", verbose=True)\n",
    "chain.run(texts)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84f6193c",
   "metadata": {},
   "source": [
    "## Agents 🤖🤖\n",
    "\n",
    "Official LangChain Documentation describes agents perfectly (emphasis mine):\n",
    "> Some applications will require not just a predetermined chain of calls to LLMs/other tools, but potentially an **unknown chain** that depends on the user's input. In these types of chains, there is a “agent” which has access to a suite of tools. Depending on the user input, the agent can then **decide which, if any, of these tools to call**.\n",
    "\n",
    "\n",
    "Basically you use the LLM not just for text output, but also for decision making. The coolness and power of this functionality can't be overstated enough.\n",
    "\n",
    "Sam Altman emphasizes that the LLMs are good '[reasoning engine](https://www.youtube.com/watch?v=L_Guz73e6fw&t=867s)'. Agent take advantage of this."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3ce05d51",
   "metadata": {},
   "source": [
    "### Agents\n",
    "\n",
    "The language model that drives decision making.\n",
    "\n",
    "More specifically, an agent takes in an input and returns a response corresponding to an action to take along with an action input. You can see different types of agents (which are better for different use cases) [here](https://python.langchain.com/en/latest/modules/agents/agents/agent_types.html)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f696b65c",
   "metadata": {},
   "source": [
    "### Tools\n",
    "\n",
    "A 'capability' of an agent. This is an abstraction on top of a function that makes it easy for LLMs (and agents) to interact with it. Ex: Google search.\n",
    "\n",
    "This area shares commonalities with [OpenAI plugins](https://platform.openai.com/docs/plugins/introduction)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a11f8231",
   "metadata": {},
   "source": [
    "### Toolkit\n",
    "\n",
    "Groups of tools that your agent can select from\n",
    "\n",
    "Let's bring them all together:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "id": "67d5d82d",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.agents import load_tools\n",
    "from langchain.agents import initialize_agent\n",
    "from langchain.llms import OpenAI\n",
    "import json\n",
    "\n",
    "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "id": "0ddcdbb9",
   "metadata": {},
   "outputs": [],
   "source": [
    "serpapi_api_key=os.getenv(\"SERP_API_KEY\", \"YourAPIKey\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "id": "44fad67f",
   "metadata": {},
   "outputs": [],
   "source": [
    "toolkit = load_tools([\"serpapi\"], llm=llm, serpapi_api_key=serpapi_api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "id": "f544a74b",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent = initialize_agent(toolkit, llm, agent=\"zero-shot-react-description\", verbose=True, return_intermediate_steps=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "id": "c4882754",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
      "\u001b[32;1m\u001b[1;3m I should try to find out what band Natalie Bergman is a part of.\n",
      "Action: Search\n",
      "Action Input: \"Natalie Bergman band\"\u001b[0m\n",
      "Observation: \u001b[36;1m\u001b[1;3m['Natalie Bergman is an American singer-songwriter. She is one half of the duo Wild Belle, along with her brother Elliot Bergman. Her debut solo album, Mercy, was released on Third Man Records on May 7, 2021. She is based in Los Angeles.', 'Natalie Bergman type: American singer-songwriter.', 'Natalie Bergman main_tab_text: Overview.', 'Natalie Bergman kgmid: /m/0qgx4kh.', 'Natalie Bergman genre: Folk.', 'Natalie Bergman parents: Susan Bergman, Judson Bergman.', 'Natalie Bergman born: 1988 or 1989 (age 34–35).', 'Natalie Bergman is an American singer-songwriter. She is one half of the duo Wild Belle, along with her brother Elliot Bergman. Her debut solo album, Mercy, ...']\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I should search for the first album of Wild Belle\n",
      "Action: Search\n",
      "Action Input: \"Wild Belle first album\"\u001b[0m\n",
      "Observation: \u001b[36;1m\u001b[1;3mIsles\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
      "Final Answer: Isles is the first album of the band that Natalie Bergman is a part of.\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "response = agent({\"input\":\"what was the first album of the\" \n",
    "                    \"band that Natalie Bergman is a part of?\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3f9c30d2",
   "metadata": {},
   "source": [
    "![Wild Belle](data/WildBelle1.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14f4b368",
   "metadata": {},
   "source": [
    "🎵Enjoy🎵\n",
    "https://open.spotify.com/track/1eREJIBdqeCcqNCB1pbz7w?si=c014293b63c7478c"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}


================================================
FILE: LangChain Cookbook Part 2 - Use Cases.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "359697d5",
   "metadata": {},
   "source": [
    "# LangChain Cookbook Part 2: Use Cases👨‍🍳👩‍🍳"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11d788b0",
   "metadata": {},
   "source": [
    "*This cookbook is based on the [LangChain Conceptual Documentation](https://docs.langchain.com/docs/)*\n",
    "\n",
    "**Goals:**\n",
    "\n",
    "1. Inspire you to build\n",
    "2. Provide an introductory understanding of the main use cases of LangChain via [ELI5](https://www.dictionary.com/e/slang/eli5/#:~:text=ELI5%20is%20short%20for%20%E2%80%9CExplain,a%20complicated%20question%20or%20problem.) examples and code snippets. For an introduction to the *fundamentals* of LangChain check out [Cookbook Part 1: Fundamentals](https://github.com/gkamradt/langchain-tutorials/blob/main/LangChain%20Cookbook%20Part%201%20-%20Fundamentals.ipynb).\n",
    "\n",
    "**LangChain Links:**\n",
    "* [LC Conceptual Documentation](https://docs.langchain.com/docs/)\n",
    "* [LC Python Documentation](https://python.langchain.com/en/latest/)\n",
    "* [LC Javascript/Typescript Documentation](https://js.langchain.com/docs/)\n",
    "* [LC Discord](https://discord.gg/6adMQxSpJS)\n",
    "* [www.langchain.com](https://langchain.com/)\n",
    "* [LC Twitter](https://twitter.com/LangChainAI)\n",
    "\n",
    "\n",
    "### **What is LangChain?**\n",
    "> LangChain is a framework for developing applications powered by language models.\n",
    "*[Source](https://blog.langchain.dev/announcing-our-10m-seed-round-led-by-benchmark/#:~:text=LangChain%20is%20a%20framework%20for%20developing%20applications%20powered%20by%20language%20models)*\n",
    "\n",
    "**TLDR**: LangChain makes the complicated parts of working & building with AI models easier. It helps do this in two ways:\n",
    "\n",
    "1. **Integration** - Bring external data, such as your files, other applications, and api data, to your LLMs\n",
    "2. **Agency** - Allow your LLMs to interact with its environment via decision making. Use LLMs to help decide which action to take next\n",
    "\n",
    "### **Why LangChain?**\n",
    "1. **Components** - LangChain makes it easy to swap out abstractions and components necessary to work with language models.\n",
    "\n",
    "2. **Customized Chains** - LangChain provides out of the box support for using and customizing 'chains' - a series of actions strung together.\n",
    "\n",
    "3. **Speed 🚢** - This team ships insanely fast. You'll be up to date with the latest LLM features.\n",
    "\n",
    "4. **Community 👥** - Wonderful [discord](https://discord.gg/6adMQxSpJS) and community support, meet ups, hackathons, etc.\n",
    "\n",
    "Though LLMs can be straightforward (text-in, text-out) you'll quickly run into friction points that LangChain helps with once you develop more complicated applications.\n",
    "\n",
    "### **Main Use Cases**\n",
    "\n",
    "* **Summarization** - Express the most important facts about a body of text or chat interaction\n",
    "* **Question and Answering Over Documents** - Use information held within documents to answer questions or query\n",
    "* **Extraction** - Pull structured data from a body of text or an user query\n",
    "* **Evaluation** - Understand the quality of output from your application\n",
    "* **Querying Tabular Data** - Pull data from databases or other tabular source\n",
    "* **Code Understanding** - Reason about and digest code\n",
    "* **Interacting with APIs** - Query APIs and interact with the outside world\n",
    "* **Chatbots** - A framework to have a back and forth interaction with a user combined with memory in a chat interface\n",
    "* **Agents** - Use LLMs to make decisions about what to do next. Enable these decisions with tools.\n",
    "\n",
    "Want to see live examples of these use cases? Head over to the [LangChain Project Gallery](https://github.com/gkamradt/langchain-tutorials)\n",
    "\n",
    "#### **Authors Note:**\n",
    "\n",
    "* This cookbook will not cover all aspects of LangChain. It's contents have been curated to get you to building & impact as quick as possible. For more, please check out [LangChain Technical Documentation](https://python.langchain.com/en/latest/index.html)\n",
    "* This notebook assumes is that you've seen part 1 of this series [Fundamentals](https://github.com/gkamradt/langchain-tutorials/blob/main/LangChain%20Cookbook%20Part%201%20-%20Fundamentals.ipynb). This notebook is focused on what to do and how to apply those fundamentals.\n",
    "* You'll notice I repeat import statements throughout the notebook. My intention is to lean on the side of clarity and help you see the full code block in one spot. No need to go back and forth to see when we imported a package.\n",
    "* We use the default models throughout the notebook, at the time of writing they were davinci-003 and gpt-3.5-turbo. You would no doubt get better results with GPT4\n",
    "\n",
    "Let's get started"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8e323fb6",
   "metadata": {},
   "source": [
    "Throughout this tutorial we will use OpenAI's various [models](https://platform.openai.com/docs/models/overview). LangChain makes it easy to [subsistute LLMs](https://langchain.com/integrations.html#:~:text=integrations%20LangChain%20provides.-,LLMs,-LLM%20Provider) so you can BYO-LLM if you want"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "e9815081",
   "metadata": {
    "hide_input": false
   },
   "outputs": [],
   "source": [
    "from dotenv import load_dotenv\n",
    "import os\n",
    "\n",
    "load_dotenv()\n",
    "\n",
    "openai_api_key = os.getenv('OPENAI_API_KEY', 'YourAPIKeyIfNotSet')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "dcd3587c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>.container { width:90% !important; }</style>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Run this cell if you want to make your display wider\n",
    "from IPython.display import display, HTML\n",
    "display(HTML(\"<style>.container { width:90% !important; }</style>\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "05bb564d",
   "metadata": {},
   "source": [
    "# LangChain Use Cases"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1bbdb1dc",
   "metadata": {},
   "source": [
    "## Summarization\n",
    "\n",
    "One of the most common use cases for LangChain and LLMs is summarization. You can summarize any piece of text, but use cases span from summarizing calls, articles, books, academic papers, legal documents, user history, a table, or financial documents. It's super helpful to have a tool which can summarize information quickly.\n",
    "\n",
    "* **Deep Dive** - (Coming Soon)\n",
    "* **Examples** - [Summarizing B2B Sales Calls](https://www.youtube.com/watch?v=DIw4rbpI9ic)\n",
    "* **Use Cases** - Summarize Articles, Transcripts, Chat History, Slack/Discord, Customer Interactions, Medical Papers, Legal Documents, Podcasts, Tweet Threads, Code Bases, Product Reviews, Financial Documents\n",
    "\n",
    "### Summaries Of Short Text\n",
    "\n",
    "For summaries of short texts, the method is straightforward, in fact you don't need to do anything fancy other than simple prompting with instructions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "0c292592",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import OpenAI\n",
    "from langchain import PromptTemplate\n",
    "\n",
    "# Note, the default model is already 'text-davinci-003' but I call it out here explicitly so you know where to change it later if you want\n",
    "llm = OpenAI(temperature=0, model_name='text-davinci-003', openai_api_key=openai_api_key)\n",
    "\n",
    "# Create our template\n",
    "template = \"\"\"\n",
    "%INSTRUCTIONS:\n",
    "Please summarize the following piece of text.\n",
    "Respond in a manner that a 5 year old would understand.\n",
    "\n",
    "%TEXT:\n",
    "{text}\n",
    "\"\"\"\n",
    "\n",
    "# Create a LangChain prompt template that we can insert values to later\n",
    "prompt = PromptTemplate(\n",
    "    input_variables=[\"text\"],\n",
    "    template=template,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f539cb53",
   "metadata": {},
   "source": [
    "Let's let's find a confusing text online. *[Source](https://www.smithsonianmag.com/smart-news/long-before-trees-overtook-the-land-earth-was-covered-by-giant-mushrooms-13709647/)*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "0df2cde6",
   "metadata": {},
   "outputs": [],
   "source": [
    "confusing_text = \"\"\"\n",
    "For the next 130 years, debate raged.\n",
    "Some scientists called Prototaxites a lichen, others a fungus, and still others clung to the notion that it was some kind of tree.\n",
    "“The problem is that when you look up close at the anatomy, it’s evocative of a lot of different things, but it’s diagnostic of nothing,” says Boyce, an associate professor in geophysical sciences and the Committee on Evolutionary Biology.\n",
    "“And it’s so damn big that when whenever someone says it’s something, everyone else’s hackles get up: ‘How could you have a lichen 20 feet tall?’”\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "03d31842",
   "metadata": {},
   "source": [
    "Let's take a look at what prompt will be sent to the LLM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "406eb8a3",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "------- Prompt Begin -------\n",
      "\n",
      "%INSTRUCTIONS:\n",
      "Please summarize the following piece of text.\n",
      "Respond in a manner that a 5 year old would understand.\n",
      "\n",
      "%TEXT:\n",
      "\n",
      "For the next 130 years, debate raged.\n",
      "Some scientists called Prototaxites a lichen, others a fungus, and still others clung to the notion that it was some kind of tree.\n",
      "“The problem is that when you look up close at the anatomy, it’s evocative of a lot of different things, but it’s diagnostic of nothing,” says Boyce, an associate professor in geophysical sciences and the Committee on Evolutionary Biology.\n",
      "“And it’s so damn big that when whenever someone says it’s something, everyone else’s hackles get up: ‘How could you have a lichen 20 feet tall?’”\n",
      "\n",
      "\n",
      "------- Prompt End -------\n"
     ]
    }
   ],
   "source": [
    "print (\"------- Prompt Begin -------\")\n",
    "\n",
    "final_prompt = prompt.format(text=confusing_text)\n",
    "print(final_prompt)\n",
    "\n",
    "print (\"------- Prompt End -------\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a95e53d9",
   "metadata": {},
   "source": [
    "Finally let's pass it through the LLM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "bc7e4b42",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "For 130 years, people argued about what Prototaxites was. Some thought it was a lichen, some thought it was a fungus, and some thought it was a tree. But no one could agree. It was so big that it was hard to figure out what it was.\n"
     ]
    }
   ],
   "source": [
    "output = llm(final_prompt)\n",
    "print (output)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "751c6359",
   "metadata": {},
   "source": [
    "This method works fine, but for longer text, it can become a pain to manage and you'll run into token limits. Luckily LangChain has out of the box support for different methods to summarize via their [load_summarize_chain](https://python.langchain.com/en/latest/use_cases/summarization.html).\n",
    "\n",
    "### Summaries Of Longer Text\n",
    "\n",
    "*Note: This method will also work for short text too*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "3441484b",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import OpenAI\n",
    "from langchain.chains.summarize import load_summarize_chain\n",
    "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
    "\n",
    "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e95b575c",
   "metadata": {},
   "source": [
    "Let's load up a longer document"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "6c33f9bb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "April 2008(This essay is derived from a talk at the 2008 Startup School.)About a month after we started Y Combinator we came up with the\n",
      "phrase that became our motto: Make something people want.  We've\n",
      "learned a lot since then, but if I were choosing now that's still\n",
      "the one I'd pick.\n"
     ]
    }
   ],
   "source": [
    "with open('data/PaulGrahamEssays/good.txt', 'r') as file:\n",
    "    text = file.read()\n",
    "\n",
    "# Printing the first 285 characters as a preview\n",
    "print (text[:285])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b489d2a2",
   "metadata": {},
   "source": [
    "Then let's check how many tokens are in this document. [get_num_tokens](https://python.langchain.com/en/latest/reference/modules/llms.html#langchain.llms.OpenAI.get_num_tokens) is a nice method for this."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "5e0e8181",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "There are 3970 tokens in your file\n"
     ]
    }
   ],
   "source": [
    "num_tokens = llm.get_num_tokens(text)\n",
    "\n",
    "print (f\"There are {num_tokens} tokens in your file\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5bf8eda6",
   "metadata": {},
   "source": [
    "While you could likely stuff this text in your prompt, let's act like it's too big and needs another method.\n",
    "\n",
    "First we'll need to split it up. This process is called 'chunking' or 'splitting' your text into smaller pieces. I like the [RecursiveCharacterTextSplitter](https://python.langchain.com/en/latest/modules/indexes/text_splitters/examples/recursive_text_splitter.html) because it's easy to control but there are a [bunch](https://python.langchain.com/en/latest/modules/indexes/text_splitters.html) you can try"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "25dd80dc",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You now have 4 docs intead of 1 piece of text\n"
     ]
    }
   ],
   "source": [
    "text_splitter = RecursiveCharacterTextSplitter(separators=[\"\\n\\n\", \"\\n\"], chunk_size=5000, chunk_overlap=350)\n",
    "docs = text_splitter.create_documents([text])\n",
    "\n",
    "print (f\"You now have {len(docs)} docs intead of 1 piece of text\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3e7547a3",
   "metadata": {},
   "source": [
    "Next we need to load up a chain which will make successive calls to the LLM for us. Want to see the prompt being used in the chain below? Check out the [LangChain documentation](https://github.com/hwchase17/langchain/blob/master/langchain/chains/summarize/map_reduce_prompt.py)\n",
    "\n",
    "For information on the difference between chain types, check out this video on [token limit workarounds](https://youtu.be/f9_BWhCI4Zo)\n",
    "\n",
    "*Note: You could also get fancy and make the first 4 calls of the map_reduce run in parallel too*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "28ddd9c0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get your chain ready to use\n",
    "chain = load_summarize_chain(llm=llm, chain_type='map_reduce') # verbose=True optional to see what is getting sent to the LLM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "be0b2d04",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " This essay looks at the idea of benevolence in startups, and how it can help them succeed. It explains how benevolence can improve morale, make people want to help, and help startups be decisive. It also looks at how markets have evolved to value potential dividends and potential earnings, and how users dislike their new operating system. The author argues that starting a company with benevolent aims is currently undervalued, and that Y Combinator's motto of \"Make something people want\" is a powerful concept.\n"
     ]
    }
   ],
   "source": [
    "# Use it. This will run through the 4 documents, summarize the chunks, then get a summary of the summary.\n",
    "output = chain.run(docs)\n",
    "print (output)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a2d664fc",
   "metadata": {},
   "source": [
    "## Question & Answering Using Documents As Context"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ad87c72b",
   "metadata": {},
   "source": [
    "*[LangChain Question & Answer Docs](https://python.langchain.com/en/latest/use_cases/question_answering.html)*\n",
    "\n",
    "In order to use LLMs for question and answer we must:\n",
    "\n",
    "1. Pass the LLM relevant context it needs to answer a question\n",
    "2. Pass it our question that we want answered\n",
    "\n",
    "Simplified, this process looks like this \"llm(your context + your question) = your answer\"\n",
    "\n",
    "* **Deep Dive** - [Question A Book](https://youtu.be/h0DHDp1FbmQ), [Ask Questions To Your Custom Files](https://youtu.be/EnT-ZTrcPrg), [Chat Your Data JS (1000 pages of Financial Reports)](https://www.youtube.com/watch?v=Ix9WIZpArm0&t=1051s), [LangChain Q&A webinar](https://www.crowdcast.io/c/rh66hcwivly0)\n",
    "* **Examples** - [ChatPDF](https://www.chatpdf.com/)\n",
    "* **Use Cases** - Chat your documents, ask questions to academic papers, create study guides, reference medical information"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "685e15f3",
   "metadata": {},
   "source": [
    "### Simple Q&A Example\n",
    "\n",
    "Here let's review the convention of `llm(your context + your question) = your answer`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "9ebd8451",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import OpenAI\n",
    "\n",
    "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "b4795187",
   "metadata": {},
   "outputs": [],
   "source": [
    "context = \"\"\"\n",
    "Rachel is 30 years old\n",
    "Bob is 45 years old\n",
    "Kevin is 65 years old\n",
    "\"\"\"\n",
    "\n",
    "question = \"Who is under 40 years old?\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2184b11b",
   "metadata": {},
   "source": [
    "Then combine them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "0c53650d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Rachel is under 40 years old.\n"
     ]
    }
   ],
   "source": [
    "output = llm(context + question)\n",
    "\n",
    "# I strip the text to remove the leading and trailing whitespace\n",
    "print (output.strip())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "385180ca",
   "metadata": {},
   "source": [
    "As we ramp up our sophistication, we'll take advantage of this convention more.\n",
    "\n",
    "The hard part comes in when you need to be selective about *which* data you put in your context. This field of study is called \"[document retrieval](https://python.langchain.com/en/latest/modules/indexes/retrievers.html)\" and tightly coupled with AI Memory."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53ed4080",
   "metadata": {},
   "source": [
    "### Using Embeddings\n",
    "\n",
    "I informally call what were about to go through as \"The VectorStore Dance\". It's the process of splitting your text, embedding the chunks, putting the embeddings in a DB, and then querying them. For a full video on this check out [How To Question A Book](https://www.youtube.com/watch?v=h0DHDp1FbmQ)\n",
    "\n",
    "The goal is to select relevant chunks of our long text, but which chunks do we pull? The most popular method is to pull *similar* texts based off comparing vector embeddings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "a7a02ccc",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain import OpenAI\n",
    "\n",
    "# The vectorstore we'll be using\n",
    "from langchain.vectorstores import FAISS\n",
    "\n",
    "# The LangChain component we'll use to get the documents\n",
    "from langchain.chains import RetrievalQA\n",
    "\n",
    "# The easy document loader for text\n",
    "from langchain.document_loaders import TextLoader\n",
    "\n",
    "# The embedding engine that will convert our text to vectors\n",
    "from langchain.embeddings.openai import OpenAIEmbeddings\n",
    "\n",
    "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "40afcfec",
   "metadata": {},
   "source": [
    "Let's load up a longer document"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "5772bc26",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You have 1 document\n",
      "You have 74663 characters in that document\n"
     ]
    }
   ],
   "source": [
    "loader = TextLoader('data/PaulGrahamEssays/worked.txt')\n",
    "doc = loader.load()\n",
    "print (f\"You have {len(doc)} document\")\n",
    "print (f\"You have {len(doc[0].page_content)} characters in that document\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb87424c",
   "metadata": {},
   "source": [
    "Now let's split our long doc into smaller pieces"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "b4a6e452",
   "metadata": {},
   "outputs": [],
   "source": [
    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=3000, chunk_overlap=400)\n",
    "docs = text_splitter.split_documents(doc)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "723e8aec",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Now you have 29 documents that have an average of 2,930 characters (smaller pieces)\n"
     ]
    }
   ],
   "source": [
    "# Get the total number of characters so we can see the average later\n",
    "num_total_characters = sum([len(x.page_content) for x in docs])\n",
    "\n",
    "print (f\"Now you have {len(docs)} documents that have an average of {num_total_characters / len(docs):,.0f} characters (smaller pieces)\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "9b591198",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get your embeddings engine ready\n",
    "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
    "\n",
    "# Embed your documents and combine with the raw text in a pseudo db. Note: This will make an API call to OpenAI\n",
    "docsearch = FAISS.from_documents(docs, embeddings)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a1b13348",
   "metadata": {},
   "source": [
    "Create your retrieval engine"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "47cd969d",
   "metadata": {},
   "outputs": [],
   "source": [
    "qa = RetrievalQA.from_chain_type(llm=llm, chain_type=\"stuff\", retriever=docsearch.as_retriever())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6aa2963c",
   "metadata": {},
   "source": [
    "Now it's time to ask a question. The retriever will go get the similar documents and combine with your question for the LLM to reason through.\n",
    "\n",
    "Note: It may not seem like much, but the magic here is that we didn't have to pass in our full original document."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "6a062c85",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "' The author describes painting as good work.'"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query = \"What does the author describe as good work?\"\n",
    "qa.run(query)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be503d53",
   "metadata": {},
   "source": [
    "If you wanted to do more you would hook this up to a cloud vector database, use a tool like metal and start managing your documents, with external data sources"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3d04dc9",
   "metadata": {},
   "source": [
    "## Extraction\n",
    "*[LangChain Extraction Docs](https://python.langchain.com/en/latest/use_cases/extraction.html)*\n",
    "\n",
    "Extraction is the process of parsing data from a piece of text. This is commonly used with output parsing in order to *structure* our data.\n",
    "\n",
    "* **Deep Dive** - [Use LLMs to Extract Data From Text (Expert Level Text Extraction](https://youtu.be/xZzvwR9jdPA), [Structured Output From OpenAI (Clean Dirty Data)](https://youtu.be/KwAXfey-xQk)\n",
    "* **Examples** - [OpeningAttributes](https://twitter.com/GregKamradt/status/1646500373837008897)\n",
    "* **Use Cases:** Extract a structured row from a sentence to insert into a database, extract multiple rows from a long document to insert into a database, extracting parameters from a user query to make an API call\n",
    "\n",
    "A popular library for extraction is [Kor](https://eyurtsev.github.io/kor/). We won't cover it today but I highly suggest checking it out for advanced extraction."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "904d43c0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# To help construct our Chat Messages\n",
    "from langchain.schema import HumanMessage\n",
    "from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate\n",
    "\n",
    "# We will be using a chat model, defaults to gpt-3.5-turbo\n",
    "from langchain.chat_models import ChatOpenAI\n",
    "\n",
    "# To parse outputs and get structured data back\n",
    "from langchain.output_parsers import StructuredOutputParser, ResponseSchema\n",
    "\n",
    "chat_model = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo', openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6923ca8b",
   "metadata": {},
   "source": [
    "### Vanilla Extraction\n",
    "\n",
    "Let's start off with an easy example. Here I simply supply a prompt with instructions with the type of output I want."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "ab1cce97",
   "metadata": {},
   "outputs": [],
   "source": [
    "instructions = \"\"\"\n",
    "You will be given a sentence with fruit names, extract those fruit names and assign an emoji to them\n",
    "Return the fruit name and emojis in a python dictionary\n",
    "\"\"\"\n",
    "\n",
    "fruit_names = \"\"\"\n",
    "Apple, Pear, this is an kiwi\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "38f16ea4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'Apple': '🍎', 'Pear': '🍐', 'kiwi': '🥝'}\n",
      "<class 'str'>\n"
     ]
    }
   ],
   "source": [
    "# Make your prompt which combines the instructions w/ the fruit names\n",
    "prompt = (instructions + fruit_names)\n",
    "\n",
    "# Call the LLM\n",
    "output = chat_model([HumanMessage(content=prompt)])\n",
    "\n",
    "print (output.content)\n",
    "print (type(output.content))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "39d6cff3",
   "metadata": {},
   "source": [
    "Let's turn this into a proper python dictionary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "314286b4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'Apple': '🍎', 'Pear': '🍐', 'kiwi': '🥝'}\n",
      "<class 'dict'>\n"
     ]
    }
   ],
   "source": [
    "output_dict = eval(output.content)\n",
    "\n",
    "print (output_dict)\n",
    "print (type(output_dict))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3909eb29",
   "metadata": {},
   "source": [
    "While this worked this time, it's not a long term reliable method for more advanced use cases"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b6a0a90d",
   "metadata": {},
   "source": [
    "### Using LangChain's Response Schema\n",
    "\n",
    "LangChain's response schema will does two things for us: \n",
    "\n",
    "1. Autogenerate the a prompt with bonafide format instructions. This is great because I don't need to worry about the prompt engineering side, I'll leave that up to LangChain!\n",
    "\n",
    "2. Read the output from the LLM and turn it into a proper python object for me\n",
    "\n",
    "Here I define the schema I want. I'm going to pull out the song and artist that a user wants to play from a pseudo chat message."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "dc2ba0be",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The schema I want out\n",
    "response_schemas = [\n",
    "    ResponseSchema(name=\"artist\", description=\"The name of the musical artist\"),\n",
    "    ResponseSchema(name=\"song\", description=\"The name of the song that the artist plays\")\n",
    "]\n",
    "\n",
    "# The parser that will look for the LLM output in my schema and return it back to me\n",
    "output_parser = StructuredOutputParser.from_response_schemas(response_schemas)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "f9e3c6cf",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The output should be a markdown code snippet formatted in the following schema, including the leading and trailing \"\\`\\`\\`json\" and \"\\`\\`\\`\":\n",
      "\n",
      "```json\n",
      "{\n",
      "\t\"artist\": string  // The name of the musical artist\n",
      "\t\"song\": string  // The name of the song that the artist plays\n",
      "}\n",
      "```\n"
     ]
    }
   ],
   "source": [
    "# The format instructions that LangChain makes. Let's look at them\n",
    "format_instructions = output_parser.get_format_instructions()\n",
    "print(format_instructions)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "d702900c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The prompt template that brings it all together\n",
    "# Note: This is a different prompt template than before because we are using a Chat Model\n",
    "\n",
    "prompt = ChatPromptTemplate(\n",
    "    messages=[\n",
    "        HumanMessagePromptTemplate.from_template(\"Given a command from the user, extract the artist and song names \\n \\\n",
    "                                                    {format_instructions}\\n{user_prompt}\")  \n",
    "    ],\n",
    "    input_variables=[\"user_prompt\"],\n",
    "    partial_variables={\"format_instructions\": format_instructions}\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "bb6adde9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Given a command from the user, extract the artist and song names \n",
      "                                                     The output should be a markdown code snippet formatted in the following schema, including the leading and trailing \"\\`\\`\\`json\" and \"\\`\\`\\`\":\n",
      "\n",
      "```json\n",
      "{\n",
      "\t\"artist\": string  // The name of the musical artist\n",
      "\t\"song\": string  // The name of the song that the artist plays\n",
      "}\n",
      "```\n",
      "I really like So Young by Portugal. The Man\n"
     ]
    }
   ],
   "source": [
    "fruit_query = prompt.format_prompt(user_prompt=\"I really like So Young by Portugal. The Man\")\n",
    "print (fruit_query.messages[0].content)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "b8664302",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'artist': 'Portugal. The Man', 'song': 'So Young'}\n",
      "<class 'dict'>\n"
     ]
    }
   ],
   "source": [
    "fruit_output = chat_model(fruit_query.to_messages())\n",
    "output = output_parser.parse(fruit_output.content)\n",
    "\n",
    "print (output)\n",
    "print (type(output))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b68b8eeb",
   "metadata": {},
   "source": [
    "Awesome, now we have a dictionary that we can use later down the line\n",
    "\n",
    "<span style=\"background:#fff5d6\">Warning:</span> The parser looks for an output from the LLM in a specific format. Your model may not output the same format every time. Make sure to handle errors with this one. GPT4 and future iterations will be more reliable.\n",
    "\n",
    "For more advanced parsing check out [Kor](https://eyurtsev.github.io/kor/)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f2fb4ba6",
   "metadata": {},
   "source": [
    "## Evaluation\n",
    "\n",
    "*[LangChain Evaluation Docs](https://python.langchain.com/en/latest/use_cases/evaluation.html)*\n",
    "\n",
    "Evaluation is the process of doing quality checks on the output of your applications. Normal, deterministic, code has tests we can run, but judging the output of LLMs is more difficult because of the unpredictableness and variability of natural language. LangChain provides tools that aid us in this journey.\n",
    "\n",
    "* **Deep Dive** - Coming Soon\n",
    "* **Examples** - [Lance Martin's Advanced](https://twitter.com/RLanceMartin) [Auto-Evaluator](https://github.com/rlancemartin/auto-evaluator)\n",
    "* **Use Cases:** Run quality checks on your summarization or Question & Answer pipelines, check the output of you summarization pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "9fbaa6e1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Embeddings, store, and retrieval\n",
    "from langchain.embeddings.openai import OpenAIEmbeddings\n",
    "from langchain.vectorstores import FAISS\n",
    "from langchain.chains import RetrievalQA\n",
    "\n",
    "# Model and doc loader\n",
    "from langchain import OpenAI\n",
    "from langchain.document_loaders import TextLoader\n",
    "\n",
    "# Eval!\n",
    "from langchain.evaluation.qa import QAEvalChain\n",
    "\n",
    "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "9f35fa12",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You have 1 document\n",
      "You have 74663 characters in that document\n"
     ]
    }
   ],
   "source": [
    "# Our long essay from before\n",
    "loader = TextLoader('data/PaulGrahamEssays/worked.txt')\n",
    "doc = loader.load()\n",
    "\n",
    "print (f\"You have {len(doc)} document\")\n",
    "print (f\"You have {len(doc[0].page_content)} characters in that document\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7acca7da",
   "metadata": {},
   "source": [
    "First let's do the Vectorestore dance so we can do question and answers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "1955faef",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Now you have 29 documents that have an average of 2,930 characters (smaller pieces)\n"
     ]
    }
   ],
   "source": [
    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=3000, chunk_overlap=400)\n",
    "docs = text_splitter.split_documents(doc)\n",
    "\n",
    "# Get the total number of characters so we can see the average later\n",
    "num_total_characters = sum([len(x.page_content) for x in docs])\n",
    "\n",
    "print (f\"Now you have {len(docs)} documents that have an average of {num_total_characters / len(docs):,.0f} characters (smaller pieces)\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "890b85ca",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Embeddings and docstore\n",
    "embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
    "docsearch = FAISS.from_documents(docs, embeddings)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7a0d6e25",
   "metadata": {},
   "source": [
    "Make your retrieval chain. Notice how I have an `input_key` parameter now. This tells the chain which key from a dictionary I supply has my prompt/query in it. I specify `question` to match the question in the dict below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "ddb3f3b5",
   "metadata": {},
   "outputs": [],
   "source": [
    "chain = RetrievalQA.from_chain_type(llm=llm, chain_type=\"stuff\", retriever=docsearch.as_retriever(), input_key=\"question\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b37dd0cd",
   "metadata": {},
   "source": [
    "Now I'll pass a list of questions and ground truth answers to the LLM that I know are correct (I validated them as a human)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "d93d08bf",
   "metadata": {},
   "outputs": [],
   "source": [
    "question_answers = [\n",
    "    {'question' : \"Which company sold the microcomputer kit that his friend built himself?\", 'answer' : 'Healthkit'},\n",
    "    {'question' : \"What was the small city he talked about in the city that is the financial capital of USA?\", 'answer' : 'Yorkville, NY'}\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "98c4b591",
   "metadata": {},
   "source": [
    "I'll use `chain.apply` to run both my questions one by one separately.\n",
    "\n",
    "One of the cool parts is that I'll get my list of question and answers dictionaries back, but there'll be another key in the dictionary `result` which will be the output from the LLM.\n",
    "\n",
    "Note: I specifically made my 2nd question ambigious and tough to answer in one pass so the LLM would get it incorrect"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "a4a4e041",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'question': 'Which company sold the microcomputer kit that his friend built himself?',\n",
       "  'answer': 'Healthkit',\n",
       "  'result': ' The microcomputer kit was sold by Heathkit.'},\n",
       " {'question': 'What was the small city he talked about in the city that is the financial capital of USA?',\n",
       "  'answer': 'Yorkville, NY',\n",
       "  'result': ' The small city he talked about is New York City, which is the financial capital of the United States.'}]"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "predictions = chain.apply(question_answers)\n",
    "predictions"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ed1226c9",
   "metadata": {},
   "source": [
    "We then have the LLM compare my ground truth answer (the `answer` key) with the result from the LLM (`result` key).\n",
    "\n",
    "Or simply, we are asking the LLM to grade itself. What a wild world we live in."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "ae119b18",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Start your eval chain\n",
    "eval_chain = QAEvalChain.from_llm(llm)\n",
    "\n",
    "# Have it grade itself. The code below helps the eval_chain know where the different parts are\n",
    "graded_outputs = eval_chain.evaluate(question_answers,\n",
    "                                     predictions,\n",
    "                                     question_key=\"question\",\n",
    "                                     prediction_key=\"result\",\n",
    "                                     answer_key='answer')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "c2882750",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'text': ' CORRECT'}, {'text': ' INCORRECT'}]"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "graded_outputs"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b30268b",
   "metadata": {},
   "source": [
    "This is correct! Notice how the answer in question #1 was \"Healthkit\" and the prediction was \"The microcomputer kit was sold by Heathkit.\" The LLM knew that the answer and result were the same and gave us a \"correct\" label. Awesome.\n",
    "\n",
    "For #2 it knew they were not the same and gave us an \"incorrect\" label"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d2745752",
   "metadata": {},
   "source": [
    "## Querying Tabular Data\n",
    "\n",
    "*[LangChain Querying Tabular Data Docs](https://python.langchain.com/en/latest/use_cases/tabular.html)*\n",
    "\n",
    "The most common type of data in the world sits in tabular form (ok, ok, besides unstructured data). It is super powerful to be able to query this data with LangChain and pass it through to an LLM \n",
    "\n",
    "* **Deep Dive** - Coming Soon\n",
    "* **Examples** - TBD\n",
    "* **Use Cases:** Use LLMs to query data about users, do data analysis, get real time information from your DBs\n",
    "\n",
    "For futher reading check out \"Agents + Tabular Data\" ([Pandas](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/pandas.html), [SQL](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/sql_database.html), [CSV](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/csv.html))\n",
    "\n",
    "Let's query an SQLite DB with natural language. We'll look at the [San Francisco Trees](https://data.sfgov.org/City-Infrastructure/Street-Tree-List/tkzw-k3nq) dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "9b19c2d0",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain import OpenAI, SQLDatabase, SQLDatabaseChain\n",
    "\n",
    "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "294a4e7f",
   "metadata": {},
   "source": [
    "We'll start off by specifying where our data is and get the connection ready"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "6044d54e",
   "metadata": {},
   "outputs": [],
   "source": [
    "sqlite_db_path = 'data/San_Francisco_Trees.db'\n",
    "db = SQLDatabase.from_uri(f\"sqlite:///{sqlite_db_path}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "203eedd4",
   "metadata": {},
   "source": [
    "Then we'll create a chain that take our LLM, and DB. I'm setting `verbose=True` so you can see what is happening underneath the hood."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "dccf0957",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/gregorykamradt/opt/anaconda3/lib/python3.9/site-packages/langchain/chains/sql_database/base.py:63: UserWarning: Directly instantiating an SQLDatabaseChain with an llm is deprecated. Please instantiate with llm_chain argument or using the from_llm class method.\n",
      "  warnings.warn(\n"
     ]
    }
   ],
   "source": [
    "db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "id": "99cdbc44",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new SQLDatabaseChain chain...\u001b[0m\n",
      "How many Species of trees are there in San Francisco?\n",
      "SQLQuery:\u001b[32;1m\u001b[1;3mSELECT COUNT(DISTINCT \"qSpecies\") FROM \"SFTrees\";\u001b[0m\n",
      "SQLResult: \u001b[33;1m\u001b[1;3m[(578,)]\u001b[0m\n",
      "Answer:\u001b[32;1m\u001b[1;3mThere are 578 Species of trees in San Francisco.\u001b[0m\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'There are 578 Species of trees in San Francisco.'"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "db_chain.run(\"How many Species of trees are there in San Francisco?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6bd61598",
   "metadata": {},
   "source": [
    "This is awesome! There are actually a few steps going on here.\n",
    "\n",
    "**Steps:**\n",
    "1. Find which table to use\n",
    "2. Find which column to use\n",
    "3. Construct the correct sql query\n",
    "4. Execute that query\n",
    "5. Get the result\n",
    "6. Return a natural language reponse back\n",
    "\n",
    "Let's confirm via pandas"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "id": "299ff6ca",
   "metadata": {},
   "outputs": [],
   "source": [
    "import sqlite3\n",
    "import pandas as pd\n",
    "\n",
    "# Connect to the SQLite database\n",
    "connection = sqlite3.connect(sqlite_db_path)\n",
    "\n",
    "# Define your SQL query\n",
    "query = \"SELECT count(distinct qSpecies) FROM SFTrees\"\n",
    "\n",
    "# Read the SQL query into a Pandas DataFrame\n",
    "df = pd.read_sql_query(query, connection)\n",
    "\n",
    "# Close the connection\n",
    "connection.close()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "id": "f1b2dd89",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "578\n"
     ]
    }
   ],
   "source": [
    "# Display the result in the first column first cell\n",
    "print(df.iloc[0,0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c48b5a42",
   "metadata": {},
   "source": [
    "Nice! The answers match."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "04293535",
   "metadata": {},
   "source": [
    "## Code Understanding\n",
    "\n",
    "*[LangChain Code Understanding Docs](https://python.langchain.com/en/latest/use_cases/code.html)*\n",
    "\n",
    "One of the most exciting abilities of LLMs is code undestanding. People around the world are leveling up their output in both speed & quality due to AI help. A big part of this is having a LLM that can understand code and help you with a particular task.\n",
    "\n",
    "* **Deep Dive** - Coming Soon\n",
    "* **Examples** - TBD\n",
    "* **Use Cases:** Co-Pilot-esque functionality that can help answer questions from a specific library, help you generate new code"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "id": "f3101c11",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Helper to read local files\n",
    "import os\n",
    "\n",
    "# Vector Support\n",
    "from langchain.vectorstores import FAISS\n",
    "from langchain.embeddings.openai import OpenAIEmbeddings\n",
    "\n",
    "# Model and chain\n",
    "from langchain.chat_models import ChatOpenAI\n",
    "\n",
    "# Text splitters\n",
    "from langchain.text_splitter import CharacterTextSplitter\n",
    "from langchain.document_loaders import TextLoader\n",
    "\n",
    "llm = ChatOpenAI(model_name='gpt-3.5-turbo', openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2c4dfd6d",
   "metadata": {},
   "source": [
    "We will do the Vectorstore dance again"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "id": "8a9247e8",
   "metadata": {},
   "outputs": [],
   "source": [
    "embeddings = OpenAIEmbeddings(disallowed_special=(), openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bf12eb2c",
   "metadata": {},
   "source": [
    "I put a small python package [The Fuzz](https://github.com/seatgeek/thefuzz) (personal indie favorite) in the data folder of this repo.\n",
    "\n",
    "The loop below will go through each file in the library and load it up as a doc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "id": "bd3973a2",
   "metadata": {},
   "outputs": [],
   "source": [
    "root_dir = 'data/thefuzz'\n",
    "docs = []\n",
    "\n",
    "# Go through each folder\n",
    "for dirpath, dirnames, filenames in os.walk(root_dir):\n",
    "    \n",
    "    # Go through each file\n",
    "    for file in filenames:\n",
    "        try: \n",
    "            # Load up the file as a doc and split\n",
    "            loader = TextLoader(os.path.join(dirpath, file), encoding='utf-8')\n",
    "            docs.extend(loader.load_and_split())\n",
    "        except Exception as e: \n",
    "            pass"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "136cae5e",
   "metadata": {},
   "source": [
    "Let's look at an example of a document. It's just code!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "id": "85a39161",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You have 175 documents\n",
      "\n",
      "------ Start Document ------\n",
      "import unittest\n",
      "import re\n",
      "import pycodestyle\n",
      "\n",
      "from thefuzz import fuzz\n",
      "from thefuzz import process\n",
      "from thefuzz import utils\n",
      "from thefuzz.string_processing import StringProcessor\n",
      "\n",
      "\n",
      "class StringProcessingTest(unittest.TestCase):\n",
      "    def test_replace_non_letters_non_numbers_with_whitespace(self):\n",
      "    \n"
     ]
    }
   ],
   "source": [
    "print (f\"You have {len(docs)} documents\\n\")\n",
    "print (\"------ Start Document ------\")\n",
    "print (docs[0].page_content[:300])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "02634791",
   "metadata": {},
   "source": [
    "Embed and store them in a docstore. This will make an API call to OpenAI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "id": "94427072",
   "metadata": {},
   "outputs": [],
   "source": [
    "docsearch = FAISS.from_documents(docs, embeddings)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "id": "2071f3c4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get our retriever ready\n",
    "qa = RetrievalQA.from_chain_type(llm=llm, chain_type=\"stuff\", retriever=docsearch.as_retriever())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "id": "0536b828",
   "metadata": {},
   "outputs": [],
   "source": [
    "query = \"What function do I use if I want to find the most similar item in a list of items?\"\n",
    "output = qa.run(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "id": "b9074fb6",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You can use the `process.extractOne()` function from `thefuzz` package to find the most similar item in a list of items. Here's an example:\n",
      "\n",
      "```\n",
      "from thefuzz import process\n",
      "\n",
      "choices = [\"apple\", \"banana\", \"orange\", \"pear\"]\n",
      "query = \"pineapple\"\n",
      "\n",
      "best_match = process.extractOne(query, choices)\n",
      "print(best_match)\n",
      "```\n",
      "\n",
      "This would output `(u'apple', 36)`, which means that the most similar item to \"pineapple\" in the list of choices is \"apple\", with a similarity score of 36.\n"
     ]
    }
   ],
   "source": [
    "print (output)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "f53860e6",
   "metadata": {},
   "outputs": [],
   "source": [
    "query = \"Can you write the code to use the process.extractOne() function? Only respond with code. No other text or explanation\"\n",
    "output = qa.run(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "27e56a5d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "import fuzzywuzzy.process as process\n",
      "\n",
      "choices = [\n",
      "    \"new york mets vs chicago cubs\",\n",
      "    \"chicago cubs at new york mets\",\n",
      "    \"atlanta braves vs pittsbugh pirates\",\n",
      "    \"new york yankees vs boston red sox\"\n",
      "]\n",
      "\n",
      "query = \"new york mets at chicago cubs\"\n",
      "\n",
      "best = process.extractOne(query, choices)\n",
      "print(best[0])\n"
     ]
    }
   ],
   "source": [
    "print (output)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c7deae05",
   "metadata": {},
   "source": [
    "[¡Shibby!](https://thumbs.gfycat.com/WateryBeneficialDeermouse-size_restricted.gif)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f3b2783a",
   "metadata": {},
   "source": [
    "## Interacting with APIs\n",
    "\n",
    "*[LangChain API Interaction Docs](https://python.langchain.com/en/latest/use_cases/apis.html)*\n",
    "\n",
    "If the data or action you need is behind an API, you'll need your LLM to interact with APIs\n",
    "\n",
    "* **Deep Dive** - Coming Soon\n",
    "* **Examples** - TBD\n",
    "* **Use Cases:** Understand a request from a user and carry out an action, be able to automate more real-world workflows\n",
    "\n",
    "This topic is closely related to Agents and Plugins, though we'll look at a simple use case for this section. For more information, check out [LangChain + plugins](https://python.langchain.com/en/latest/use_cases/agents/custom_agent_with_plugin_retrieval_using_plugnplai.html) documentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "id": "352685c0",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.chains import APIChain\n",
    "from langchain.llms import OpenAI\n",
    "\n",
    "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e6b834fe",
   "metadata": {},
   "source": [
    "LangChain's APIChain has the ability to read API documentation and understand which endpoint it needs to call.\n",
    "\n",
    "In this case I wrote (purposefully sloppy) API documentation to demonstrate how this works"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "id": "3ff4b986",
   "metadata": {},
   "outputs": [],
   "source": [
    "api_docs = \"\"\"\n",
    "\n",
    "BASE URL: https://restcountries.com/\n",
    "\n",
    "API Documentation:\n",
    "\n",
    "The API endpoint /v3.1/name/{name} Used to find informatin about a country. All URL parameters are listed below:\n",
    "    - name: Name of country - Ex: italy, france\n",
    "    \n",
    "The API endpoint /v3.1/currency/{currency} Uesd to find information about a region. All URL parameters are listed below:\n",
    "    - currency: 3 letter currency. Example: USD, COP\n",
    "    \n",
    "Woo! This is my documentation\n",
    "\"\"\"\n",
    "\n",
    "chain_new = APIChain.from_llm_and_api_docs(llm, api_docs, verbose=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "221aa3a6",
   "metadata": {},
   "source": [
    "Let's try to make an API call that is meant for the country endpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "id": "e6d9cae4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new APIChain chain...\u001b[0m\n",
      "\u001b[32;1m\u001b[1;3m https://restcountries.com/v3.1/name/france\u001b[0m\n",
      "\u001b[33;1m\u001b[1;3m[{\"name\":{\"common\":\"France\",\"official\":\"French Republic\",\"nativeName\":{\"fra\":{\"official\":\"République française\",\"common\":\"France\"}}},\"tld\":[\".fr\"],\"cca2\":\"FR\",\"ccn3\":\"250\",\"cca3\":\"FRA\",\"cioc\":\"FRA\",\"independent\":true,\"status\":\"officially-assigned\",\"unMember\":true,\"currencies\":{\"EUR\":{\"name\":\"Euro\",\"symbol\":\"€\"}},\"idd\":{\"root\":\"+3\",\"suffixes\":[\"3\"]},\"capital\":[\"Paris\"],\"altSpellings\":[\"FR\",\"French Republic\",\"République française\"],\"region\":\"Europe\",\"subregion\":\"Western Europe\",\"languages\":{\"fra\":\"French\"},\"translations\":{\"ara\":{\"official\":\"الجمهورية الفرنسية\",\"common\":\"فرنسا\"},\"bre\":{\"official\":\"Republik Frañs\",\"common\":\"Frañs\"},\"ces\":{\"official\":\"Francouzská republika\",\"common\":\"Francie\"},\"cym\":{\"official\":\"French Republic\",\"common\":\"France\"},\"deu\":{\"official\":\"Französische Republik\",\"common\":\"Frankreich\"},\"est\":{\"official\":\"Prantsuse Vabariik\",\"common\":\"Prantsusmaa\"},\"fin\":{\"official\":\"Ranskan tasavalta\",\"common\":\"Ranska\"},\"fra\":{\"official\":\"République française\",\"common\":\"France\"},\"hrv\":{\"official\":\"Francuska Republika\",\"common\":\"Francuska\"},\"hun\":{\"official\":\"Francia Köztársaság\",\"common\":\"Franciaország\"},\"ita\":{\"official\":\"Repubblica francese\",\"common\":\"Francia\"},\"jpn\":{\"official\":\"フランス共和国\",\"common\":\"フランス\"},\"kor\":{\"official\":\"프랑스 공화국\",\"common\":\"프랑스\"},\"nld\":{\"official\":\"Franse Republiek\",\"common\":\"Frankrijk\"},\"per\":{\"official\":\"جمهوری فرانسه\",\"common\":\"فرانسه\"},\"pol\":{\"official\":\"Republika Francuska\",\"common\":\"Francja\"},\"por\":{\"official\":\"República Francesa\",\"common\":\"França\"},\"rus\":{\"official\":\"Французская Республика\",\"common\":\"Франция\"},\"slk\":{\"official\":\"Francúzska republika\",\"common\":\"Francúzsko\"},\"spa\":{\"official\":\"República francés\",\"common\":\"Francia\"},\"srp\":{\"official\":\"Француска Република\",\"common\":\"Француска\"},\"swe\":{\"official\":\"Republiken Frankrike\",\"common\":\"Frankrike\"},\"tur\":{\"official\":\"Fransa Cumhuriyeti\",\"common\":\"Fransa\"},\"urd\":{\"official\":\"جمہوریہ فرانس\",\"common\":\"فرانس\"},\"zho\":{\"official\":\"法兰西共和国\",\"common\":\"法国\"}},\"latlng\":[46.0,2.0],\"landlocked\":false,\"borders\":[\"AND\",\"BEL\",\"DEU\",\"ITA\",\"LUX\",\"MCO\",\"ESP\",\"CHE\"],\"area\":551695.0,\"demonyms\":{\"eng\":{\"f\":\"French\",\"m\":\"French\"},\"fra\":{\"f\":\"Française\",\"m\":\"Français\"}},\"flag\":\"\\uD83C\\uDDEB\\uD83C\\uDDF7\",\"maps\":{\"googleMaps\":\"https://goo.gl/maps/g7QxxSFsWyTPKuzd7\",\"openStreetMaps\":\"https://www.openstreetmap.org/relation/1403916\"},\"population\":67391582,\"gini\":{\"2018\":32.4},\"fifa\":\"FRA\",\"car\":{\"signs\":[\"F\"],\"side\":\"right\"},\"timezones\":[\"UTC-10:00\",\"UTC-09:30\",\"UTC-09:00\",\"UTC-08:00\",\"UTC-04:00\",\"UTC-03:00\",\"UTC+01:00\",\"UTC+02:00\",\"UTC+03:00\",\"UTC+04:00\",\"UTC+05:00\",\"UTC+10:00\",\"UTC+11:00\",\"UTC+12:00\"],\"continents\":[\"Europe\"],\"flags\":{\"png\":\"https://flagcdn.com/w320/fr.png\",\"svg\":\"https://flagcdn.com/fr.svg\",\"alt\":\"The flag of France is composed of three equal vertical bands of blue, white and red.\"},\"coatOfArms\":{\"png\":\"https://mainfacts.com/media/images/coats_of_arms/fr.png\",\"svg\":\"https://mainfacts.com/media/images/coats_of_arms/fr.svg\"},\"startOfWeek\":\"monday\",\"capitalInfo\":{\"latlng\":[48.87,2.33]},\"postalCode\":{\"format\":\"#####\",\"regex\":\"^(\\\\d{5})$\"}}]\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "' France is an officially-assigned, independent country located in Western Europe. Its capital is Paris and its official language is French. Its currency is the Euro (€). It has a population of 67,391,582 and its borders are with Andorra, Belgium, Germany, Italy, Luxembourg, Monaco, Spain, and Switzerland.'"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain_new.run('Can you tell me information about france?')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "09235fc3",
   "metadata": {},
   "source": [
    "Let's try to make an API call that is meant for the currency endpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "id": "c2735073",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new APIChain chain...\u001b[0m\n",
      "\u001b[32;1m\u001b[1;3m https://restcountries.com/v3.1/currency/COP\u001b[0m\n",
      "\u001b[33;1m\u001b[1;3m[{\"name\":{\"common\":\"Colombia\",\"official\":\"Republic of Colombia\",\"nativeName\":{\"spa\":{\"official\":\"República de Colombia\",\"common\":\"Colombia\"}}},\"tld\":[\".co\"],\"cca2\":\"CO\",\"ccn3\":\"170\",\"cca3\":\"COL\",\"cioc\":\"COL\",\"independent\":true,\"status\":\"officially-assigned\",\"unMember\":true,\"currencies\":{\"COP\":{\"name\":\"Colombian peso\",\"symbol\":\"$\"}},\"idd\":{\"root\":\"+5\",\"suffixes\":[\"7\"]},\"capital\":[\"Bogotá\"],\"altSpellings\":[\"CO\",\"Republic of Colombia\",\"República de Colombia\"],\"region\":\"Americas\",\"subregion\":\"South America\",\"languages\":{\"spa\":\"Spanish\"},\"translations\":{\"ara\":{\"official\":\"جمهورية كولومبيا\",\"common\":\"كولومبيا\"},\"bre\":{\"official\":\"Republik Kolombia\",\"common\":\"Kolombia\"},\"ces\":{\"official\":\"Kolumbijská republika\",\"common\":\"Kolumbie\"},\"cym\":{\"official\":\"Gweriniaeth Colombia\",\"common\":\"Colombia\"},\"deu\":{\"official\":\"Republik Kolumbien\",\"common\":\"Kolumbien\"},\"est\":{\"official\":\"Colombia Vabariik\",\"common\":\"Colombia\"},\"fin\":{\"official\":\"Kolumbian tasavalta\",\"common\":\"Kolumbia\"},\"fra\":{\"official\":\"République de Colombie\",\"common\":\"Colombie\"},\"hrv\":{\"official\":\"Republika Kolumbija\",\"common\":\"Kolumbija\"},\"hun\":{\"official\":\"Kolumbiai Köztársaság\",\"common\":\"Kolumbia\"},\"ita\":{\"official\":\"Repubblica di Colombia\",\"common\":\"Colombia\"},\"jpn\":{\"official\":\"コロンビア共和国\",\"common\":\"コロンビア\"},\"kor\":{\"official\":\"콜롬비아 공화국\",\"common\":\"콜롬비아\"},\"nld\":{\"official\":\"Republiek Colombia\",\"common\":\"Colombia\"},\"per\":{\"official\":\"جمهوری کلمبیا\",\"common\":\"کلمبیا\"},\"pol\":{\"official\":\"Republika Kolumbii\",\"common\":\"Kolumbia\"},\"por\":{\"official\":\"República da Colômbia\",\"common\":\"Colômbia\"},\"rus\":{\"official\":\"Республика Колумбия\",\"common\":\"Колумбия\"},\"slk\":{\"official\":\"Kolumbijská republika\",\"common\":\"Kolumbia\"},\"spa\":{\"official\":\"República de Colombia\",\"common\":\"Colombia\"},\"srp\":{\"official\":\"Република Колумбија\",\"common\":\"Колумбија\"},\"swe\":{\"official\":\"Republiken Colombia\",\"common\":\"Colombia\"},\"tur\":{\"official\":\"Kolombiya Cumhuriyeti\",\"common\":\"Kolombiya\"},\"urd\":{\"official\":\"جمہوریہ کولمبیا\",\"common\":\"کولمبیا\"},\"zho\":{\"official\":\"哥伦比亚共和国\",\"common\":\"哥伦比亚\"}},\"latlng\":[4.0,-72.0],\"landlocked\":false,\"borders\":[\"BRA\",\"ECU\",\"PAN\",\"PER\",\"VEN\"],\"area\":1141748.0,\"demonyms\":{\"eng\":{\"f\":\"Colombian\",\"m\":\"Colombian\"},\"fra\":{\"f\":\"Colombienne\",\"m\":\"Colombien\"}},\"flag\":\"\\uD83C\\uDDE8\\uD83C\\uDDF4\",\"maps\":{\"googleMaps\":\"https://goo.gl/maps/RdwTG8e7gPwS62oR6\",\"openStreetMaps\":\"https://www.openstreetmap.org/relation/120027\"},\"population\":50882884,\"gini\":{\"2019\":51.3},\"fifa\":\"COL\",\"car\":{\"signs\":[\"CO\"],\"side\":\"right\"},\"timezones\":[\"UTC-05:00\"],\"continents\":[\"South America\"],\"flags\":{\"png\":\"https://flagcdn.com/w320/co.png\",\"svg\":\"https://flagcdn.com/co.svg\",\"alt\":\"The flag of Colombia is composed of three horizontal bands of yellow, blue and red, with the yellow band twice the height of the other two bands.\"},\"coatOfArms\":{\"png\":\"https://mainfacts.com/media/images/coats_of_arms/co.png\",\"svg\":\"https://mainfacts.com/media/images/coats_of_arms/co.svg\"},\"startOfWeek\":\"monday\",\"capitalInfo\":{\"latlng\":[4.71,-74.07]}}]\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "' The currency of Colombia is the Colombian peso (COP), symbolized by the \"$\" sign.'"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain_new.run('Can you tell me about the currency COP?')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2d5be7e0",
   "metadata": {},
   "source": [
    "In both cases the APIChain read the instructions and understood which API call it needed to make.\n",
    "\n",
    "Once the response returned, it was parsed and then my question was answered. Awesome 🐒"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "90e0f275",
   "metadata": {},
   "source": [
    "## Chatbots\n",
    "\n",
    "*[LangChain Chatbot Docs](https://python.langchain.com/en/latest/use_cases/chatbots.html)*\n",
    "\n",
    "Chatbots use many of the tools we've already looked at with the addition of an important topic: Memory. There are a ton of different [types of memory](https://python.langchain.com/en/latest/modules/memory/how_to_guides.html), tinker to see which is best for you.\n",
    "\n",
    "* **Deep Dive** - Coming Soon\n",
    "* **Examples** - [ChatBase](https://www.chatbase.co/?via=greg) (Affiliate link), [NexusGPT](https://twitter.com/achammah1/status/1649482899253501958?s=20), [ChatPDF](https://www.chatpdf.com/)\n",
    "* **Use Cases:** Have a real time interaction with a user, provide an approachable UI for users to ask natural language questions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "id": "7dca0672",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import OpenAI\n",
    "from langchain import LLMChain\n",
    "from langchain.prompts.prompt import PromptTemplate\n",
    "\n",
    "# Chat specific components\n",
    "from langchain.memory import ConversationBufferMemory"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53b86e88",
   "metadata": {},
   "source": [
    "For this use case I'm going to show you how to customize the context that is given to a chatbot.\n",
    "\n",
    "You could pass instructions on how the bot should respond, but also any additional relevant information it needs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "id": "547aefa1",
   "metadata": {},
   "outputs": [],
   "source": [
    "template = \"\"\"\n",
    "You are a chatbot that is unhelpful.\n",
    "Your goal is to not help the user but only make jokes.\n",
    "Take what the user is saying and make a joke out of it\n",
    "\n",
    "{chat_history}\n",
    "Human: {human_input}\n",
    "Chatbot:\"\"\"\n",
    "\n",
    "prompt = PromptTemplate(\n",
    "    input_variables=[\"chat_history\", \"human_input\"], \n",
    "    template=template\n",
    ")\n",
    "memory = ConversationBufferMemory(memory_key=\"chat_history\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "id": "475822a0",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm_chain = LLMChain(\n",
    "    llm=OpenAI(openai_api_key=openai_api_key), \n",
    "    prompt=prompt, \n",
    "    verbose=True, \n",
    "    memory=memory\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "id": "20ae6e3d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
      "Prompt after formatting:\n",
      "\u001b[32;1m\u001b[1;3m\n",
      "You are a chatbot that is unhelpful.\n",
      "Your goal is to not help the user but only make jokes.\n",
      "Take what the user is saying and make a joke out of it\n",
      "\n",
      "\n",
      "Human: Is an pear a fruit or vegetable?\n",
      "Chatbot:\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "' Yes, an pear is a fruit of confusion!'"
      ]
     },
     "execution_count": 64,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "llm_chain.predict(human_input=\"Is an pear a fruit or vegetable?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "id": "bd87e2a9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
      "Prompt after formatting:\n",
      "\u001b[32;1m\u001b[1;3m\n",
      "You are a chatbot that is unhelpful.\n",
      "Your goal is to not help the user but only make jokes.\n",
      "Take what the user is saying and make a joke out of it\n",
      "\n",
      "Human: Is an pear a fruit or vegetable?\n",
      "AI:  Yes, an pear is a fruit of confusion!\n",
      "Human: What was one of the fruits I first asked you about?\n",
      "Chatbot:\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "' I think it was the fruit of knowledge!'"
      ]
     },
     "execution_count": 65,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "llm_chain.predict(human_input=\"What was one of the fruits I first asked you about?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8db86471",
   "metadata": {},
   "source": [
    "Notice how my 1st interaction was put into the prompt of my 2nd interaction. This is the memory piece at work.\n",
    "\n",
    "There are many ways to structure a conversation, check out the different ways on the [docs](https://python.langchain.com/en/latest/use_cases/chatbots.html)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "144e0d09",
   "metadata": {},
   "source": [
    "## Agents\n",
    "\n",
    "*[LangChain Agent Docs](https://python.langchain.com/en/latest/modules/agents.html)*\n",
    "\n",
    "Agents are one of the hottest [🔥](https://media.tenor.com/IH7C6xNbkuoAAAAC/so-hot-right-now-trending.gif) topics in LLMs. Agents are the decision makers that can look a data, reason about what the next action should be, and execute that action for you via tools\n",
    "\n",
    "* **Deep Dive** - [Introduction to agents](https://youtu.be/2xxziIWmaSA?t=1972), [LangChain Agents Webinar](https://www.crowdcast.io/c/46erbpbz609r), much deeper dive coming soon\n",
    "* **Examples** - TBD\n",
    "* **Use Cases:** Run programs autonomously without the need for human input\n",
    "\n",
    "Examples of advanced uses of agents appear in [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/Auto-GPT)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "id": "df6d2853",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Helpers\n",
    "import os\n",
    "import json\n",
    "\n",
    "from langchain.llms import OpenAI\n",
    "\n",
    "# Agent imports\n",
    "from langchain.agents import load_tools\n",
    "from langchain.agents import initialize_agent\n",
    "\n",
    "# Tool imports\n",
    "from langchain.agents import Tool\n",
    "from langchain.utilities import GoogleSearchAPIWrapper\n",
    "from langchain.utilities import TextRequestsWrapper"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47e7ab35",
   "metadata": {},
   "source": [
    "For this example I'm going to pull google search results. You may want to do this if you need a list of websites for a research project.\n",
    "\n",
    "You can sign up for both of these keys at the urls below\n",
    "\n",
    "[GOOGLE_API_KEY](https://console.cloud.google.com/apis/credentials)\n",
    "[GOOGLE_CSE_ID](https://programmablesearchengine.google.com/controlpanel/create)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "id": "c5fb5850",
   "metadata": {},
   "outputs": [],
   "source": [
    "GOOGLE_CSE_ID = os.getenv('GOOGLE_CSE_ID', 'YourAPIKeyIfNotSet')\n",
    "GOOGLE_API_KEY = os.getenv('GOOGLE_API_KEY', 'YourAPIKeyIfNotSet')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "id": "ef374dfa",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = OpenAI(temperature=0, openai_api_key=openai_api_key)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a3235ccc",
   "metadata": {},
   "source": [
    "Initialize both the tools you'll be using. For this example we'll search google and also give the LLM the ability to execute python code"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "id": "55903997",
   "metadata": {},
   "outputs": [],
   "source": [
    "search = GoogleSearchAPIWrapper(google_api_key=GOOGLE_API_KEY, google_cse_id=GOOGLE_CSE_ID)\n",
    "\n",
    "requests = TextRequestsWrapper()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7859aed9",
   "metadata": {},
   "source": [
    "Put both your tools in a toolkit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "id": "7e60591c",
   "metadata": {},
   "outputs": [],
   "source": [
    "toolkit = [\n",
    "    Tool(\n",
    "        name = \"Search\",\n",
    "        func=search.run,\n",
    "        description=\"useful for when you need to search google to answer questions about current events\"\n",
    "    ),\n",
    "    Tool(\n",
    "        name = \"Requests\",\n",
    "        func=requests.get,\n",
    "        description=\"Useful for when you to make a request to a URL\"\n",
    "    ),\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "21f7c19e",
   "metadata": {},
   "source": [
    "Create your agent by giving it the tools, LLM and the type of agent that it should be"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "id": "1d4ad2ec",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent = initialize_agent(toolkit, llm, agent=\"zero-shot-react-description\", verbose=True, return_intermediate_steps=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "da11d1dc",
   "metadata": {},
   "source": [
    "Now ask it a question, I'm going to give it one that it should go to Google for"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "id": "b027ee69",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
      "\u001b[32;1m\u001b[1;3m I need to find out what the capital of Canada is.\n",
      "Action: Search\n",
      "Action Input: \"capital of Canada\"\u001b[0m\n",
      "Observation: \u001b[36;1m\u001b[1;3mLooking to build credit or earn rewards? Compare our rewards, Guaranteed secured and other Guaranteed credit cards. Canada's capital is Ottawa and its three largest metropolitan areas are Toronto, Montreal, and Vancouver. Canada. A vertical triband design (red, white, red) ... Browse available job openings at Capital One - CA. ... Together, we will build one of Canada's leading information-based technology companies – join us, ... Ottawa is the capital city of Canada. It is located in the southern portion of the province of Ontario, at the confluence of the Ottawa River and the Rideau ... Shopify Capital offers small business funding in the form of merchant cash advances to eligible merchants in Canada. If you live in Canada and need ... Download Capital One Canada and enjoy it on your iPhone, iPad and iPod touch. ... Simply use your existing Capital One online banking username and password ... A leader in the alternative asset space, TPG was built for a distinctive approach, managing assets through a principled focus on innovation. We're Canada's largest credit union by membership because we prioritize people, not profits. Let's build the right plan to reach your financial goals, together. The national capital is Ottawa, Canada's fourth largest city. It lies some 250 miles (400 km) northeast of Toronto and 125 miles (200 km) west of Montreal, ... Finding Value Across the Capital Structure: Limited Recourse Capital Notes. Limited Recourse Capital Notes are an evolving segment of the Canadian fixed-income ...\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
      "Final Answer: Ottawa is the capital of Canada.\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'Ottawa is the capital of Canada.'"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "response = agent({\"input\":\"What is the capital of canada?\"})\n",
    "response['output']"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7db969cf",
   "metadata": {},
   "source": [
    "Great, that's correct. Now let's ask a question that requires listing the currect directory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "id": "8e516015",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
      "\u001b[32;1m\u001b[1;3m I need to find out what the comments are about\n",
      "Action: Search\n",
      "Action Input: \"comments on https://news.ycombinator.com/item?id=34425779\"\u001b[0m\n",
      "Observation: \u001b[36;1m\u001b[1;3mAbout a month after we started Y Combinator we came up with the phrase that ... Action Input: \"comments on https://news.ycombinator.com/item?id=34425779\" .\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I now know the comments are about Y Combinator\n",
      "Final Answer: The comments on the webpage are about Y Combinator.\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'The comments on the webpage are about Y Combinator.'"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "response = agent({\"input\":\"Tell me what the comments are about on this webpage https://news.ycombinator.com/item?id=34425779\"})\n",
    "response['output']"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ad644451",
   "metadata": {},
   "source": [
    "## FIN\n",
    "\n",
    "Wow! You made it all the way down to the bottom.\n",
    "\n",
    "Where do you go from here?\n",
    "\n",
    "The world of AI is massive and use cases will continue to grow. I'm personally most excited about the idea of use cases we don't know about yet.\n",
    "\n",
    "What else should we add to this list?\n",
    "\n",
    "Check out this [repo's ReadMe](https://github.com/gkamradt/langchain-tutorials) for more inspiration\n",
    "Check out more tutorials on [YouTube](https://www.youtube.com/@DataIndependent)\n",
    "\n",
    "I'd love to see what projects you build. Tag me on [Twitter](https://twitter.com/GregKamradt)!\n",
    "\n",
    "Have something you'd like to edit? See our [contribution guide](https://github.com/gkamradt/langchain-tutorials) and throw up a PR"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}


================================================
FILE: README.md
================================================
# Learn LangChain

Overview, Tutorial, and Examples of [LangChain](https://langchain.readthedocs.io/en/latest/)

See the accompanying tutorials on [YouTube](https://www.youtube.com/channel/UCyR2Ct3pDOeZSRyZH5hPO-Q)

If you want to get updated when new tutorials are out, get them [delivered to your inbox](https://prodigious-knitter-7293.ck.page/3bd9b7cea6)

If you're new to Jupyter Notebooks or Colab, check out [this video](https://www.youtube.com/watch?v=HW29067qVWk)

### **New To LangChain?**
Recommended Learning Path:
1. LangChain CookBook Part 1: 7 Core Concepts - [Code](https://github.com/gkamradt/langchain-tutorials/blob/main/LangChain%20Cookbook%20Part%201%20-%20Fundamentals.ipynb), [Video](https://youtu.be/2xxziIWmaSA)
2. LangChain CookBook Part 2: 9 Use Cases - [Code](https://github.com/gkamradt/langchain-tutorials/blob/main/LangChain%20Cookbook%20Part%202%20-%20Use%20Cases.ipynb), [Video](https://youtu.be/vGP4pQdCocw)
3. Explore the projects below and jump into the deep dives

Prompt Engineering (my favorite resources):
1. [Prompt Engineering Overview](https://www.youtube.com/watch?v=dOxUroR57xs) by [Elvis Saravia](https://twitter.com/omarsar0)
2. [ChatGPT Prompt Engineering for Developers](https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers/) - Prompt engineering basics straight from OpenAI
3. [Brex's Prompt Engineering Guide](https://github.com/brexhq/prompt-engineering)

## 🤖 **Project Gallery**

🐇 Beginner = Entry level projects to practice LangChain

🐒 Intermediate = In depth use of LangChain

🦈 Advanced = Advanced or custom implementations of LangChain

### **📝 Summarization** - *Deep Dive: [Code](https://github.com/gkamradt/langchain-tutorials/blob/main/data_generation/5%20Levels%20Of%20Summarization%20-%20Novice%20To%20Expert.ipynb), [Video](https://youtu.be/qaPMdcCqtWk)*
| Project    | Contact | Difficulty | Open Sourced? |  Notes | 
| - | ----------- | ---------- | :-: | ---------- |
| [SummarizePaper.com](https://www.summarizepaper.com/)      | Quentin Kral       | 🐒 Intermediate | ✅ [Code](https://github.com/summarizepaper/summarizepaper) | Summarize arXiv papers | 

<br>

### ❓ Question and Answering Over Documents
| Project      | Contact | Difficulty | Open Sourced? |  Notes | 
| ----------- | ----------- | ---------- | :-: | ---------- |
| [ChatPDF](https://github.com/akshata29/chatpdf)      | [Ashish Talati](https://github.com/akshata29)       | 🐒 Intermediate | ✅ [Code](https://github.com/akshata29/chatpdf) | Chat and Ask on your own data | 

<br>

### **📦 Extraction**
| Project      | Contact | Difficulty | Open Sourced? |  Notes | 
| ----------- | ----------- | ---------- | :-: | ---------- |
| [Kor](https://eyurtsev.github.io/kor/)      | [Eugene Yurtsev](https://twitter.com/veryboldbagel)       | 🐒 Intermediate | ✅ [Code](https://github.com/eyurtsev/kor) | This is a half-baked prototype that “helps” you extract structured data from text using large language models (LLMs) 🧩. | 
| [OpeningAttributes](https://twitter.com/GregKamradt/status/1643027796850253824)      | [@gregkamradt](https://twitter.com/GregKamradt)       | 🐇 Beginner | ✅ [Code](https://github.com/gkamradt/langchain-tutorials/blob/main/data_generation/Expert%20Structured%20Output%20(Using%20Kor).ipynb) | Extract technologies & tools from job descriptions | 

<br>

### **🔍 Evaluation** 
| Project      | Contact | Difficulty | Open Sourced? |  Notes | 
| ----------- | ----------- | ---------- | :-: | ---------- |
| [Auto-Evaluator](https://autoevaluator.langchain.com/)      | [@RLanceMartin](https://twitter.com/RLanceMartin)       | 🦈 Advanced | ✅ [Code](https://github.com/langchain-ai/auto-evaluator) | Evaluate Q&A Chains | 

<br>

### **📊 Querying Tabular Data** 
| Project      | Contact | Difficulty | Open Sourced? |  Notes | 
| ----------- | ----------- | ---------- | :-: | ---------- |
| TBD | | | | | 

<br>

### **💻 Code Understanding**
| Project      | Contact | Difficulty | Open Sourced? |  Notes | 
| ----------- | ----------- | ---------- | :-: | ---------- |
| TBD | | | | | 

<br>

### **🌐 Interacting with APIs**
| Project      | Contact | Difficulty | Open Sourced? |  Notes | 
| ----------- | ----------- | ---------- | :-: | ---------- |
| TBD | | | | | 

<br>

### **💬 Chatbots**
| Project      | Contact | Difficulty | Open Sourced? |  Notes | 
| ----------- | ----------- | ---------- | :-: | ---------- |
| [LangChain ChatBot](https://github.com/Haste171/langchain-chatbot)      | [David Peterson](https://github.com/Haste171)       | 🐒 Intermediate | ✅ [Code](https://github.com/Haste171/langchain-chatbot) | Input your PDF documents and analyze, ask questions, or do calculations on the data. |

<br>

### **🤖 Agents**
| Project      | Contact | Difficulty | Open Sourced? |  Notes | 
| ----------- | ----------- | ---------- | :-: | ---------- |
| [Agents Via Vocode](https://twitter.com/vocodehq/status/1653104377010483201)      | [@vocode](https://twitter.com/vocodehq)       | 🐒 Intermediate | ✅ [Code](https://github.com/vocodedev/vocode-python) | Agents making phone calls to order pizza |
| [NexusGPT](https://twitter.com/achammah1/status/1649482899253501958?s=20)      | [@achammah1](https://twitter.com/achammah1)       | 🐒 Intermediate | | AI Freelancer Platform. [Discord](https://discord.gg/Tttk8z9U5x) | 

### **👽 Other 👽**
| Project      | Contact | Difficulty | Open Sourced? |  Notes | 
| ----------- | ----------- | ---------- | :-: | ---------- |
| [Slack-GPT](https://github.com/martinseanhunt/slack-gpt)      | [@martinseanhunt](https://twitter.com/martinseanhunt)       | 🐒 Intermediate | ✅ [Code](https://github.com/martinseanhunt/slack-gpt) | A simple starter for a Slack app / chatbot that uses the Bolt.js Slack app framework, Langchain, openAI and a Pinecone vectorstore to provide LLM generated answers to user questions based on a custom data set. | 

## 💁 Contributing

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of updating code, better documentation, or project to feature.

Submit a PR with notes.

This repo and series is provided by [DataIndependent](https://dataindependent.com/) and run by [Greg Kamradt](https://twitter.com/GregKamradt)

================================================
FILE: SUMMARY.md
================================================
# Table of contents

* [Learn LangChain](README.md)


================================================
FILE: agents/Agents + ZapierToolkit.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "a07e6c78",
   "metadata": {},
   "source": [
    "### Zapier Natural Language Actions API\n",
    "Full docs here: https://nla.zapier.com/api/v1/dynamic/docs"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a994f3cd",
   "metadata": {},
   "source": [
    "### Using An Agent"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "cc720bda",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import OpenAI\n",
    "from langchain.agents import initialize_agent\n",
    "from langchain.agents.agent_toolkits import ZapierToolkit\n",
    "from langchain.utilities.zapier import ZapierNLAWrapper\n",
    "import os"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "b572d4a7",
   "metadata": {
    "hide_input": false
   },
   "outputs": [],
   "source": [
    "os.environ[\"OPENAI_API_KEY\"] = 'YourAPIKey'\n",
    "os.environ[\"ZAPIER_NLA_API_KEY\"] = 'YourAPIKey'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "ded109ef",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = OpenAI(temperature=0)\n",
    "zapier = ZapierNLAWrapper()\n",
    "toolkit = ZapierToolkit.from_zapier_nla_wrapper(zapier)\n",
    "agent = initialize_agent(toolkit.get_tools(), llm, agent=\"zero-shot-react-description\", verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "5edadb8b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Twitter: Create Tweet\n",
      "A wrapper around Zapier NLA actions. The input to this tool is a natural language instruction, for example \"get the latest email from my bank\" or \"send a slack message to the #general channel\". Each tool will have params associated with it that are specified as a list. You MUST take into account the params when creating the instruction. For example, if the params are ['Message_Text', 'Channel'], your instruction should be something like 'send a slack message to the #general channel with the text hello world'. Another example: if the params are ['Calendar', 'Search_Term'], your instruction should be something like 'find the meeting in my personal calendar at 3pm'. Do not make up params, they will be explicitly specified in the tool description. If you do not have enough information to fill in the params, just say 'not enough information provided in the instruction, missing <param>'. If you get a none or null response, STOP EXECUTION, do not try to another tool!This tool specifically used for: Twitter: Create Tweet, and has params: ['Message']\n",
      "\n",
      "\n",
      "\n",
      "Giphy: Find GIF\n",
      "A wrapper around Zapier NLA actions. The input to this tool is a natural language instruction, for example \"get the latest email from my bank\" or \"send a slack message to the #general channel\". Each tool will have params associated with it that are specified as a list. You MUST take into account the params when creating the instruction. For example, if the params are ['Message_Text', 'Channel'], your instruction should be something like 'send a slack message to the #general channel with the text hello world'. Another example: if the params are ['Calendar', 'Search_Term'], your instruction should be something like 'find the meeting in my personal calendar at 3pm'. Do not make up params, they will be explicitly specified in the tool description. If you do not have enough information to fill in the params, just say 'not enough information provided in the instruction, missing <param>'. If you get a none or null response, STOP EXECUTION, do not try to another tool!This tool specifically used for: Giphy: Find GIF, and has params: ['Search']\n",
      "\n",
      "\n",
      "\n",
      "Slack: Send Direct Message\n",
      "A wrapper around Zapier NLA actions. The input to this tool is a natural language instruction, for example \"get the latest email from my bank\" or \"send a slack message to the #general channel\". Each tool will have params associated with it that are specified as a list. You MUST take into account the params when creating the instruction. For example, if the params are ['Message_Text', 'Channel'], your instruction should be something like 'send a slack message to the #general channel with the text hello world'. Another example: if the params are ['Calendar', 'Search_Term'], your instruction should be something like 'find the meeting in my personal calendar at 3pm'. Do not make up params, they will be explicitly specified in the tool description. If you do not have enough information to fill in the params, just say 'not enough information provided in the instruction, missing <param>'. If you get a none or null response, STOP EXECUTION, do not try to another tool!This tool specifically used for: Slack: Send Direct Message, and has params: ['Message_Text', 'To_Username']\n",
      "\n",
      "\n",
      "\n",
      "Gmail: Create Draft\n",
      "A wrapper around Zapier NLA actions. The input to this tool is a natural language instruction, for example \"get the latest email from my bank\" or \"send a slack message to the #general channel\". Each tool will have params associated with it that are specified as a list. You MUST take into account the params when creating the instruction. For example, if the params are ['Message_Text', 'Channel'], your instruction should be something like 'send a slack message to the #general channel with the text hello world'. Another example: if the params are ['Calendar', 'Search_Term'], your instruction should be something like 'find the meeting in my personal calendar at 3pm'. Do not make up params, they will be explicitly specified in the tool description. If you do not have enough information to fill in the params, just say 'not enough information provided in the instruction, missing <param>'. If you get a none or null response, STOP EXECUTION, do not try to another tool!This tool specifically used for: Gmail: Create Draft, and has params: ['Body', 'To', 'Subject']\n",
      "\n",
      "\n",
      "\n",
      "Slack: Send Channel Message\n",
      "A wrapper around Zapier NLA actions. The input to this tool is a natural language instruction, for example \"get the latest email from my bank\" or \"send a slack message to the #general channel\". Each tool will have params associated with it that are specified as a list. You MUST take into account the params when creating the instruction. For example, if the params are ['Message_Text', 'Channel'], your instruction should be something like 'send a slack message to the #general channel with the text hello world'. Another example: if the params are ['Calendar', 'Search_Term'], your instruction should be something like 'find the meeting in my personal calendar at 3pm'. Do not make up params, they will be explicitly specified in the tool description. If you do not have enough information to fill in the params, just say 'not enough information provided in the instruction, missing <param>'. If you get a none or null response, STOP EXECUTION, do not try to another tool!This tool specifically used for: Slack: Send Channel Message, and has params: ['Message_Text', 'Channel']\n",
      "\n",
      "\n",
      "\n",
      "Gmail: Find Email\n",
      "A wrapper around Zapier NLA actions. The input to this tool is a natural language instruction, for example \"get the latest email from my bank\" or \"send a slack message to the #general channel\". Each tool will have params associated with it that are specified as a list. You MUST take into account the params when creating the instruction. For example, if the params are ['Message_Text', 'Channel'], your instruction should be something like 'send a slack message to the #general channel with the text hello world'. Another example: if the params are ['Calendar', 'Search_Term'], your instruction should be something like 'find the meeting in my personal calendar at 3pm'. Do not make up params, they will be explicitly specified in the tool description. If you do not have enough information to fill in the params, just say 'not enough information provided in the instruction, missing <param>'. If you get a none or null response, STOP EXECUTION, do not try to another tool!This tool specifically used for: Gmail: Find Email, and has params: ['Search_String']\n",
      "\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "for tool in toolkit.get_tools():\n",
    "    print (tool.name)\n",
    "    print (tool.description)\n",
    "    print (\"\\n\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "42dd7a75",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent.run(\"\"\"Summarize the last email I received from greg at Data Independent.\n",
    "                Send the summary to the trending domains channel in slack.\"\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "70222dec",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent.run(\"Get the last email I received from greg at Data Independent. Summarize the reply and create a tweet\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f8809e78",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent.run(\"\"\"Get the last email I received from greg at Data Independent.\n",
    "              Create a draft email in gmail back to Greg with a good positive reply\"\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "60698fa2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
      "\u001b[32;1m\u001b[1;3m I need to find the email and then find a gif and send it to a slack channel\n",
      "Action: Gmail: Find Email\n",
      "Action Input: Find the last email I received from greg@DataIndependent.com\u001b[0m\n",
      "Observation: \u001b[33;1m\u001b[1;3m{\"from__email\": \"greg@dataindependent.com\", \"from__name\": \"Greg Kamradt\", \"body_plain\": \"Hey Greg,\\r\\n\\r\\nThis is Braden from VC Ventures. I love what you are doing at Thimble and\\r\\nwe think it's super cool. We'd love to collaborate and see how you'd like\\r\\nto partner.\\r\\n\\r\\nWe are happy to provide introductions, funding, or set you up with ideas.\\r\\n\\r\\nWhat can we help with?\\r\\n\\r\\nChat soon,\\r\\n\\r\\nBraden\\r\\n\", \"message_url\": \"https://mail.google.com/mail/u/0/#inbox/186e9069bdbf14a9\", \"subject\": \"We'd love to collaborate!\", \"date\": \"Wed, 15 Mar 2023 23:05:58 -0700\", \"to__emails\": \"thimbleai@gmail.com\", \"attachment_count\": \"0\", \"raw__payload__headers__Delivered-To\": \"thimbleai@gmail.com\", \"message_id\": \"186e9069bdbf14a9\"}\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I have the email, now I need to find a gif\n",
      "Action: Giphy: Find GIF\n",
      "Action Input: Find a good gif that matches the intent of the email\u001b[0m\n",
      "Observation: \u001b[33;1m\u001b[1;3mnull\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I need to find another gif\n",
      "Action: Giphy: Find GIF\n",
      "Action Input: Find a good gif that matches the intent of the email\u001b[0m\n",
      "Observation: \u001b[33;1m\u001b[1;3mnull\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I need to find another gif\n",
      "Action: Giphy: Find GIF\n",
      "Action Input: Find a good gif that matches the intent of the email\u001b[0m\n",
      "Observation: \u001b[33;1m\u001b[1;3mnull\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I need to find another gif\n",
      "Action: Giphy: Find GIF\n",
      "Action Input: Find a good gif that matches the intent of the email\u001b[0m\n",
      "Observation: \u001b[33;1m\u001b[1;3mnull\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m I have the gif, now I need to send it to a slack channel\n",
      "Action: Slack: Send Channel Message\n",
      "Action Input: Send the gif to trending domains in slack\u001b[0m"
     ]
    },
    {
     "ename": "KeyboardInterrupt",
     "evalue": "",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mKeyboardInterrupt\u001b[0m                         Traceback (most recent call last)",
      "\u001b[0;32m/var/folders/5c/csjfqsk97xz704h7v3fzjqph0000gn/T/ipykernel_28927/4128131430.py\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m agent.run(\"\"\"Get the last email I received from greg@DataIndependent.com\n\u001b[0m\u001b[1;32m      2\u001b[0m               Find a good gif that matches the intent of the email and send the gif to trending domains in slack\"\"\")\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/langchain/chains/base.py\u001b[0m in \u001b[0;36mrun\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m    211\u001b[0m             \u001b[0;32mif\u001b[0m \u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m!=\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    212\u001b[0m                 \u001b[0;32mraise\u001b[0m \u001b[0mValueError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"`run` supports only one positional argument.\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 213\u001b[0;31m             \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0moutput_keys\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    214\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    215\u001b[0m         \u001b[0;32mif\u001b[0m \u001b[0mkwargs\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/langchain/chains/base.py\u001b[0m in \u001b[0;36m__call__\u001b[0;34m(self, inputs, return_only_outputs)\u001b[0m\n\u001b[1;32m    114\u001b[0m         \u001b[0;32mexcept\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mKeyboardInterrupt\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mException\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    115\u001b[0m             \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcallback_manager\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mon_chain_error\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mverbose\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mverbose\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 116\u001b[0;31m             \u001b[0;32mraise\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    117\u001b[0m         \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcallback_manager\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mon_chain_end\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0moutputs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mverbose\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mverbose\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    118\u001b[0m         \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprep_outputs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minputs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0moutputs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mreturn_only_outputs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/langchain/chains/base.py\u001b[0m in \u001b[0;36m__call__\u001b[0;34m(self, inputs, return_only_outputs)\u001b[0m\n\u001b[1;32m    111\u001b[0m         )\n\u001b[1;32m    112\u001b[0m         \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 113\u001b[0;31m             \u001b[0moutputs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_call\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minputs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    114\u001b[0m         \u001b[0;32mexcept\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mKeyboardInterrupt\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mException\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    115\u001b[0m             \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcallback_manager\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mon_chain_error\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mverbose\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mverbose\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/langchain/agents/agent.py\u001b[0m in \u001b[0;36m_call\u001b[0;34m(self, inputs)\u001b[0m\n\u001b[1;32m    497\u001b[0m         \u001b[0;31m# We now enter the agent loop (until it returns something).\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    498\u001b[0m         \u001b[0;32mwhile\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_should_continue\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0miterations\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 499\u001b[0;31m             next_step_output = self._take_next_step(\n\u001b[0m\u001b[1;32m    500\u001b[0m                 \u001b[0mname_to_tool_map\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcolor_mapping\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minputs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mintermediate_steps\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    501\u001b[0m             )\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/langchain/agents/agent.py\u001b[0m in \u001b[0;36m_take_next_step\u001b[0;34m(self, name_to_tool_map, color_mapping, inputs, intermediate_steps)\u001b[0m\n\u001b[1;32m    421\u001b[0m             \u001b[0mllm_prefix\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"\"\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mreturn_direct\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0magent\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mllm_prefix\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    422\u001b[0m             \u001b[0;31m# We then call the tool on the tool input to get an observation\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 423\u001b[0;31m             observation = tool.run(\n\u001b[0m\u001b[1;32m    424\u001b[0m                 \u001b[0moutput\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtool_input\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    425\u001b[0m                 \u001b[0mverbose\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mverbose\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/langchain/tools/base.py\u001b[0m in \u001b[0;36mrun\u001b[0;34m(self, tool_input, verbose, start_color, color, **kwargs)\u001b[0m\n\u001b[1;32m     69\u001b[0m         \u001b[0;32mexcept\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mException\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mKeyboardInterrupt\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     70\u001b[0m             \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcallback_manager\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mon_tool_error\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mverbose\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mverbose\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 71\u001b[0;31m             \u001b[0;32mraise\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m     72\u001b[0m         self.callback_manager.on_tool_end(\n\u001b[1;32m     73\u001b[0m             \u001b[0mobservation\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mverbose\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mverbose\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcolor\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcolor\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/langchain/tools/base.py\u001b[0m in \u001b[0;36mrun\u001b[0;34m(self, tool_input, verbose, start_color, color, **kwargs)\u001b[0m\n\u001b[1;32m     66\u001b[0m         )\n\u001b[1;32m     67\u001b[0m         \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 68\u001b[0;31m             \u001b[0mobservation\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_run\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtool_input\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m     69\u001b[0m         \u001b[0;32mexcept\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mException\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mKeyboardInterrupt\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     70\u001b[0m             \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcallback_manager\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mon_tool_error\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mverbose\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mverbose\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/langchain/tools/zapier/tool.py\u001b[0m in \u001b[0;36m_run\u001b[0;34m(self, instructions)\u001b[0m\n\u001b[1;32m    119\u001b[0m     \u001b[0;32mdef\u001b[0m \u001b[0m_run\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minstructions\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    120\u001b[0m         \u001b[0;34m\"\"\"Use the Zapier NLA tool to return a list of all exposed user actions.\"\"\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 121\u001b[0;31m         \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mapi_wrapper\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrun_as_str\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0maction_id\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minstructions\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mparams\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    122\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    123\u001b[0m     \u001b[0;32masync\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0m_arun\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0m_\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/langchain/utilities/zapier.py\u001b[0m in \u001b[0;36mrun_as_str\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m    140\u001b[0m         \"\"\"Same as run, but returns a stringified version of the JSON for\n\u001b[1;32m    141\u001b[0m         insertting back into an LLM.\"\"\"\n\u001b[0;32m--> 142\u001b[0;31m         \u001b[0mdata\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrun\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    143\u001b[0m         \u001b[0;32mreturn\u001b[0m \u001b[0mjson\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdumps\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    144\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/langchain/utilities/zapier.py\u001b[0m in \u001b[0;36mrun\u001b[0;34m(self, action_id, instructions, params)\u001b[0m\n\u001b[1;32m    119\u001b[0m         \u001b[0msession\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_get_session\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    120\u001b[0m         \u001b[0mrequest\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_get_action_request\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0maction_id\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minstructions\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mparams\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 121\u001b[0;31m         \u001b[0mresponse\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0msession\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msession\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprepare_request\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mrequest\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    122\u001b[0m         \u001b[0mresponse\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mraise_for_status\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    123\u001b[0m         \u001b[0;32mreturn\u001b[0m \u001b[0mresponse\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mjson\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m\"result\"\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/requests/sessions.py\u001b[0m in \u001b[0;36msend\u001b[0;34m(self, request, **kwargs)\u001b[0m\n\u001b[1;32m    699\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    700\u001b[0m         \u001b[0;31m# Send the request\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 701\u001b[0;31m         \u001b[0mr\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0madapter\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mrequest\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    702\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    703\u001b[0m         \u001b[0;31m# Total elapsed time of the request (approximately)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/requests/adapters.py\u001b[0m in \u001b[0;36msend\u001b[0;34m(self, request, stream, timeout, verify, cert, proxies)\u001b[0m\n\u001b[1;32m    487\u001b[0m         \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    488\u001b[0m             \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mchunked\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 489\u001b[0;31m                 resp = conn.urlopen(\n\u001b[0m\u001b[1;32m    490\u001b[0m                     \u001b[0mmethod\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mrequest\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmethod\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    491\u001b[0m                     \u001b[0murl\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0murl\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/urllib3/connectionpool.py\u001b[0m in \u001b[0;36murlopen\u001b[0;34m(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)\u001b[0m\n\u001b[1;32m    701\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    702\u001b[0m             \u001b[0;31m# Make the request on the httplib connection object.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 703\u001b[0;31m             httplib_response = self._make_request(\n\u001b[0m\u001b[1;32m    704\u001b[0m                 \u001b[0mconn\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    705\u001b[0m                 \u001b[0mmethod\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/urllib3/connectionpool.py\u001b[0m in \u001b[0;36m_make_request\u001b[0;34m(self, conn, method, url, timeout, chunked, **httplib_request_kw)\u001b[0m\n\u001b[1;32m    447\u001b[0m                     \u001b[0;31m# Python 3 (including for exceptions like SystemExit).\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    448\u001b[0m                     \u001b[0;31m# Otherwise it looks like a bug in the code.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 449\u001b[0;31m                     \u001b[0msix\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mraise_from\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    450\u001b[0m         \u001b[0;32mexcept\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mSocketTimeout\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mBaseSSLError\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mSocketError\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    451\u001b[0m             \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_raise_timeout\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0merr\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0murl\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0murl\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtimeout_value\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mread_timeout\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/urllib3/packages/six.py\u001b[0m in \u001b[0;36mraise_from\u001b[0;34m(value, from_value)\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/site-packages/urllib3/connectionpool.py\u001b[0m in \u001b[0;36m_make_request\u001b[0;34m(self, conn, method, url, timeout, chunked, **httplib_request_kw)\u001b[0m\n\u001b[1;32m    442\u001b[0m                 \u001b[0;31m# Python 3\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    443\u001b[0m                 \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 444\u001b[0;31m                     \u001b[0mhttplib_response\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mconn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgetresponse\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    445\u001b[0m                 \u001b[0;32mexcept\u001b[0m \u001b[0mBaseException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    446\u001b[0m                     \u001b[0;31m# Remove the TypeError from the exception chain in\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/http/client.py\u001b[0m in \u001b[0;36mgetresponse\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m   1375\u001b[0m         \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1376\u001b[0m             \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1377\u001b[0;31m                 \u001b[0mresponse\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mbegin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m   1378\u001b[0m             \u001b[0;32mexcept\u001b[0m \u001b[0mConnectionError\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1379\u001b[0m                 \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mclose\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/http/client.py\u001b[0m in \u001b[0;36mbegin\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m    318\u001b[0m         \u001b[0;31m# read until we get a non-100 response\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    319\u001b[0m         \u001b[0;32mwhile\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 320\u001b[0;31m             \u001b[0mversion\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mstatus\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mreason\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_read_status\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    321\u001b[0m             \u001b[0;32mif\u001b[0m \u001b[0mstatus\u001b[0m \u001b[0;34m!=\u001b[0m \u001b[0mCONTINUE\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    322\u001b[0m                 \u001b[0;32mbreak\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/http/client.py\u001b[0m in \u001b[0;36m_read_status\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m    279\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    280\u001b[0m     \u001b[0;32mdef\u001b[0m \u001b[0m_read_status\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 281\u001b[0;31m         \u001b[0mline\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreadline\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0m_MAXLINE\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"iso-8859-1\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    282\u001b[0m         \u001b[0;32mif\u001b[0m \u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mline\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m>\u001b[0m \u001b[0m_MAXLINE\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    283\u001b[0m             \u001b[0;32mraise\u001b[0m \u001b[0mLineTooLong\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"status line\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/socket.py\u001b[0m in \u001b[0;36mreadinto\u001b[0;34m(self, b)\u001b[0m\n\u001b[1;32m    702\u001b[0m         \u001b[0;32mwhile\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    703\u001b[0m             \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 704\u001b[0;31m                 \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_sock\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrecv_into\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mb\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    705\u001b[0m             \u001b[0;32mexcept\u001b[0m \u001b[0mtimeout\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    706\u001b[0m                 \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_timeout_occurred\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/ssl.py\u001b[0m in \u001b[0;36mrecv_into\u001b[0;34m(self, buffer, nbytes, flags)\u001b[0m\n\u001b[1;32m   1240\u001b[0m                   \u001b[0;34m\"non-zero flags not allowed in calls to recv_into() on %s\"\u001b[0m \u001b[0;34m%\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1241\u001b[0m                   self.__class__)\n\u001b[0;32m-> 1242\u001b[0;31m             \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mread\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnbytes\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbuffer\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m   1243\u001b[0m         \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1244\u001b[0m             \u001b[0;32mreturn\u001b[0m \u001b[0msuper\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrecv_into\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbuffer\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnbytes\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mflags\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/opt/anaconda3/lib/python3.9/ssl.py\u001b[0m in \u001b[0;36mread\u001b[0;34m(self, len, buffer)\u001b[0m\n\u001b[1;32m   1098\u001b[0m         \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1099\u001b[0m             \u001b[0;32mif\u001b[0m \u001b[0mbuffer\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1100\u001b[0;31m                 \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_sslobj\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mread\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbuffer\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m   1101\u001b[0m             \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1102\u001b[0m                 \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_sslobj\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mread\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mKeyboardInterrupt\u001b[0m: "
     ]
    }
   ],
   "source": [
    "agent.run(\"\"\"Get the last email I received from greg@DataIndependent.com\n",
    "              Find a good gif that matches the intent of the email and send the gif to trending domains in slack\"\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1fd1ebc6",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent.run(\"\"\"Create a tweet that says, 'langchain + zapier is great'. \\\n",
    "Draft an email in gmail to greg @ data independent sharing my tweet with a personalized message\"\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f25901a0",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}


================================================
FILE: agents/Agents.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "5170bf5b",
   "metadata": {},
   "source": [
    "# Agents - Make OpenAI Do Things For you"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8d9faa06",
   "metadata": {},
   "source": [
    "*[Source](https://langchain.readthedocs.io/en/latest/modules/agents/getting_started.html)*\n",
    "* **Agent** - Agents use an LLM to determine which actions to take and in what order. An action can either be using a tool and observing its output, or returning to the user.\n",
    "\n",
    "Parameters when creating an agent:\n",
    "* **Tool:** A function that performs a specific duty. This can be things like: Google Search, Database lookup, Python REPL, other chains. The interface for a tool is currently a function that is expected to have a string as an input, with a string as an output.\n",
    "* **LLM:** The language model powering the agent.\n",
    "* **Agent:** The agent to use. This should be a string that references a support agent class. Because this notebook focuses on the simplest, highest level API, this only covers using the standard supported agents. If you want to implement a custom agent, see the documentation for custom agents (coming soon)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "id": "7f247903",
   "metada

Download .txt

gitextract_j4ji0gy0/

├── .gitignore
├── LangChain Cookbook Part 1 - Fundamentals.ipynb
├── LangChain Cookbook Part 2 - Use Cases.ipynb
├── README.md
├── SUMMARY.md
├── agents/
│   ├── Agents + ZapierToolkit.ipynb
│   └── Agents.ipynb
├── bots/
│   └── Twitter_Reply_Bot/
│       └── Twitter Reply Bot Notebook.ipynb
├── chains/
│   └── Chain Types.ipynb
├── chatapi/
│   └── ChatAPI + LangChain Basics.ipynb
├── data/
│   ├── LinkedInIndustries.csv
│   ├── LinkedInSubIndustries.csv
│   ├── PaulGrahamEssayMedium/
│   │   ├── fr.txt
│   │   ├── guidetoinvestors.txt
│   │   ├── mit.txt
│   │   ├── notnot.txt
│   │   ├── popular.txt
│   │   ├── re.txt
│   │   ├── road.txt
│   │   ├── start.txt
│   │   ├── startupfunding.txt
│   │   ├── startupideas.txt
│   │   ├── wealth.txt
│   │   └── worked.txt
│   ├── PaulGrahamEssaySmall/
│   │   ├── cred.txt
│   │   ├── disc.txt
│   │   ├── fix.txt
│   │   ├── fp.txt
│   │   ├── getideas.txt
│   │   ├── lwba.txt
│   │   ├── nft.txt
│   │   ├── noob.txt
│   │   ├── nov.txt
│   │   ├── pow.txt
│   │   ├── prop62.txt
│   │   ├── rootsoflisp.txt
│   │   ├── rss.txt
│   │   ├── todo.txt
│   │   └── twitter.txt
│   ├── PaulGrahamEssays/
│   │   ├── 13sentences.txt
│   │   ├── 5founders.txt
│   │   ├── 6631327.txt
│   │   ├── 95.txt
│   │   ├── ace.txt
│   │   ├── addiction.txt
│   │   ├── airbnb.txt
│   │   ├── airbnbs.txt
│   │   ├── alien.txt
│   │   ├── altair.txt
│   │   ├── ambitious.txt
│   │   ├── america.txt
│   │   ├── angelinvesting.txt
│   │   ├── aord.txt
│   │   ├── apple.txt
│   │   ├── artistsship.txt
│   │   ├── avg.txt
│   │   ├── badeconomy.txt
│   │   ├── before.txt
│   │   ├── better.txt
│   │   ├── bias.txt
│   │   ├── boss.txt
│   │   ├── bronze.txt
│   │   ├── bubble.txt
│   │   ├── charisma.txt
│   │   ├── cities.txt
│   │   ├── college.txt
│   │   ├── colleges.txt
│   │   ├── conformism.txt
│   │   ├── control.txt
│   │   ├── convergence.txt
│   │   ├── convince.txt
│   │   ├── copy.txt
│   │   ├── corpdev.txt
│   │   ├── cred.txt
│   │   ├── credentials.txt
│   │   ├── desres.txt
│   │   ├── determination.txt
│   │   ├── die.txt
│   │   ├── diff.txt
│   │   ├── disagree.txt
│   │   ├── disc.txt
│   │   ├── discover.txt
│   │   ├── distraction.txt
│   │   ├── divergence.txt
│   │   ├── donate.txt
│   │   ├── ds.txt
│   │   ├── early.txt
│   │   ├── earnest.txt
│   │   ├── ecw.txt
│   │   ├── equity.txt
│   │   ├── essay.txt
│   │   ├── ffb.txt
│   │   ├── fh.txt
│   │   ├── fix.txt
│   │   ├── fn.txt
│   │   ├── founders.txt
│   │   ├── foundersatwork.txt
│   │   ├── foundervisa.txt
│   │   ├── fp.txt
│   │   ├── fr.txt
│   │   ├── fundraising.txt
│   │   ├── future.txt
│   │   ├── gap.txt
│   │   ├── gba.txt
│   │   ├── genius.txt
│   │   ├── getideas.txt
│   │   ├── gh.txt
│   │   ├── good.txt
│   │   ├── goodart.txt
│   │   ├── goodtaste.txt
│   │   ├── googles.txt
│   │   ├── growth.txt
│   │   ├── guidetoinvestors.txt
│   │   ├── hackernews.txt
│   │   ├── head.txt
│   │   ├── herd.txt
│   │   ├── heresy.txt
│   │   ├── heroes.txt
│   │   ├── highres.txt
│   │   ├── hiresfund.txt
│   │   ├── hiring.txt
│   │   ├── hp.txt
│   │   ├── hs.txt
│   │   ├── hubs.txt
│   │   ├── hundred.txt
│   │   ├── hw.txt
│   │   ├── hwh.txt
│   │   ├── icad.txt
│   │   ├── ideas.txt
│   │   ├── identity.txt
│   │   ├── iflisp.txt
│   │   ├── ineq.txt
│   │   ├── inequality.txt
│   │   ├── investors.txt
│   │   ├── invtrend.txt
│   │   ├── island.txt
│   │   ├── javacover.txt
│   │   ├── jessica.txt
│   │   ├── judgement.txt
│   │   ├── kate.txt
│   │   ├── kids.txt
│   │   ├── know.txt
│   │   ├── ladder.txt
│   │   ├── langdes.txt
│   │   ├── laundry.txt
│   │   ├── lesson.txt
│   │   ├── lies.txt
│   │   ├── love.txt
│   │   ├── lwba.txt
│   │   ├── mac.txt
│   │   ├── makersschedule.txt
│   │   ├── marginal.txt
│   │   ├── maybe.txt
│   │   ├── mean.txt
│   │   ├── microsoft.txt
│   │   ├── mit.txt
│   │   ├── mod.txt
│   │   ├── name.txt
│   │   ├── nerds.txt
│   │   ├── newideas.txt
│   │   ├── newthings.txt
│   │   ├── nft.txt
│   │   ├── noob.txt
│   │   ├── noop.txt
│   │   ├── notnot.txt
│   │   ├── nov.txt
│   │   ├── nthings.txt
│   │   ├── opensource.txt
│   │   ├── organic.txt
│   │   ├── orth.txt
│   │   ├── own.txt
│   │   ├── patentpledge.txt
│   │   ├── pgh.txt
│   │   ├── philosophy.txt
│   │   ├── pinch.txt
│   │   ├── polls.txt
│   │   ├── popular.txt
│   │   ├── pow.txt
│   │   ├── power.txt
│   │   ├── prcmc.txt
│   │   ├── procrastination.txt
│   │   ├── progbot.txt
│   │   ├── prop62.txt
│   │   ├── property.txt
│   │   ├── publishing.txt
│   │   ├── pypar.txt
│   │   ├── ramenprofitable.txt
│   │   ├── randomness.txt
│   │   ├── re.txt
│   │   ├── read.txt
│   │   ├── real.txt
│   │   ├── really.txt
│   │   ├── relres.txt
│   │   ├── revolution.txt
│   │   ├── richnow.txt
│   │   ├── road.txt
│   │   ├── ronco.txt
│   │   ├── rootsoflisp.txt
│   │   ├── rss.txt
│   │   ├── safe.txt
│   │   ├── say.txt
│   │   ├── schlep.txt
│   │   ├── seesv.txt
│   │   ├── segway.txt
│   │   ├── selfindulgence.txt
│   │   ├── sfp.txt
│   │   ├── siliconvalley.txt
│   │   ├── simply.txt
│   │   ├── smart.txt
│   │   ├── softwarepatents.txt
│   │   ├── spam.txt
│   │   ├── speak.txt
│   │   ├── start.txt
│   │   ├── startupfunding.txt
│   │   ├── startuphubs.txt
│   │   ├── startupideas.txt
│   │   ├── startuplessons.txt
│   │   ├── startupmistakes.txt
│   │   ├── stuff.txt
│   │   ├── submarine.txt
│   │   ├── sun.txt
│   │   ├── superangels.txt
│   │   ├── swan.txt
│   │   ├── tablets.txt
│   │   ├── talk.txt
│   │   ├── taste.txt
│   │   ├── think.txt
│   │   ├── todo.txt
│   │   ├── top.txt
│   │   ├── trolls.txt
│   │   ├── twitter.txt
│   │   ├── unions.txt
│   │   ├── usa.txt
│   │   ├── useful.txt
│   │   ├── users.txt
│   │   ├── vb.txt
│   │   ├── vcsqueeze.txt
│   │   ├── venturecapital.txt
│   │   ├── vw.txt
│   │   ├── want.txt
│   │   ├── wealth.txt
│   │   ├── web20.txt
│   │   ├── webstartups.txt
│   │   ├── weird.txt
│   │   ├── whyyc.txt
│   │   ├── wisdom.txt
│   │   ├── word.txt
│   │   ├── words.txt
│   │   ├── work.txt
│   │   ├── worked.txt
│   │   ├── writing44.txt
│   │   ├── wtax.txt
│   │   ├── yahoo.txt
│   │   ├── ycombinator.txt
│   │   └── ycstart.txt
│   ├── PaulGrahamEssaysLarge/
│   │   ├── addiction.txt
│   │   ├── aord.txt
│   │   ├── apple.txt
│   │   ├── avg.txt
│   │   ├── before.txt
│   │   ├── bias.txt
│   │   ├── boss.txt
│   │   ├── copy.txt
│   │   ├── corpdev.txt
│   │   ├── desres.txt
│   │   ├── diff.txt
│   │   ├── ecw.txt
│   │   ├── founders.txt
│   │   ├── foundervisa.txt
│   │   ├── gap.txt
│   │   ├── gba.txt
│   │   ├── gh.txt
│   │   ├── goodtaste.txt
│   │   ├── hubs.txt
│   │   ├── iflisp.txt
│   │   ├── island.txt
│   │   ├── know.txt
│   │   ├── langdes.txt
│   │   ├── laundry.txt
│   │   ├── love.txt
│   │   ├── mod.txt
│   │   ├── newideas.txt
│   │   ├── nft.txt
│   │   ├── philosophy.txt
│   │   ├── popular.txt
│   │   ├── pow.txt
│   │   ├── rootsoflisp.txt
│   │   ├── rss.txt
│   │   ├── siliconvalley.txt
│   │   ├── startuplessons.txt
│   │   ├── submarine.txt
│   │   ├── sun.txt
│   │   ├── superangels.txt
│   │   ├── todo.txt
│   │   ├── unions.txt
│   │   ├── useful.txt
│   │   ├── vb.txt
│   │   ├── vcsqueeze.txt
│   │   ├── vw.txt
│   │   ├── want.txt
│   │   ├── web20.txt
│   │   ├── weird.txt
│   │   ├── wisdom.txt
│   │   └── worked.txt
│   ├── San_Francisco_Trees.csv
│   ├── Transcripts/
│   │   ├── MFMPod/
│   │   │   ├── mfm_pod_alex.txt
│   │   │   ├── mfm_pod_rob.txt
│   │   │   └── mfm_pod_steph.txt
│   │   └── acme_co_v2.txt
│   ├── matching_tone_samples.json
│   ├── muir_lake_tahoe_in_winter.txt
│   ├── state_of_the_union.txt
│   └── thefuzz/
│       ├── .editorconfig
│       ├── .github/
│       │   └── workflows/
│       │       └── ci.yml
│       ├── .gitignore
│       ├── .travis.yml
│       ├── CHANGES.rst
│       ├── LICENSE.txt
│       ├── MANIFEST.in
│       ├── README.rst
│       ├── benchmarks.py
│       ├── data/
│       │   └── titledata.csv
│       ├── release
│       ├── setup.py
│       ├── test_thefuzz.py
│       ├── test_thefuzz_hypothesis.py
│       ├── test_thefuzz_pytest.py
│       ├── thefuzz/
│       │   ├── StringMatcher.py
│       │   ├── StringMatcher.pyi
│       │   ├── __init__.py
│       │   ├── fuzz.py
│       │   ├── fuzz.pyi
│       │   ├── process.py
│       │   ├── process.pyi
│       │   ├── string_processing.py
│       │   ├── string_processing.pyi
│       │   ├── utils.py
│       │   └── utils.pyi
│       └── tox.ini
├── data_generation/
│   ├── 5 Levels Of Summarization - Novice To Expert.ipynb
│   ├── Advanced Retrieval With LangChain.ipynb
│   ├── Ask A Book Questions.ipynb
│   ├── Clean and Standardize Data.ipynb
│   ├── Custom Files Question & Answer.ipynb
│   ├── Expert Structured Output (Using Function Calling).ipynb
│   ├── Expert Structured Output (Using Kor).ipynb
│   ├── Exploring ChatGPT Function Calling.ipynb
│   ├── Instructing LLMs To Match Tone.ipynb
│   ├── Personalized Email Generation.ipynb
│   ├── Retrieval_With_MMR.ipynb
│   ├── Topic Modeling With Language Models.ipynb
│   ├── Using LLMs To Summarize Personal Research.ipynb
│   └── Working With Call or Video Transcripts.ipynb
├── getting_started/
│   └── Quickstart Guide.ipynb
├── loaders/
│   ├── Google Drive Loader.ipynb
│   └── YouTube Loader.ipynb
├── requirements.txt
└── tutorials/
    ├── Google Drive Loader.ipynb
    ├── Twitter_Reply_Bot/
    │   └── Twitter Reply Bot Notebook.ipynb
    └── YouTube Loader.ipynb

Download .txt

SYMBOL INDEX (147 symbols across 15 files)

FILE: data/thefuzz/benchmarks.py
  function print_result_from_timeit (line 45) | def print_result_from_timeit(stmt='pass', setup='pass', number=1000000):

FILE: data/thefuzz/setup.py
  function open_file (line 16) | def open_file(fname):

FILE: data/thefuzz/test_thefuzz.py
  class StringProcessingTest (line 11) | class StringProcessingTest(unittest.TestCase):
    method test_replace_non_letters_non_numbers_with_whitespace (line 12) | def test_replace_non_letters_non_numbers_with_whitespace(self):
    method test_dont_condense_whitespace (line 21) | def test_dont_condense_whitespace(self):
  class UtilsTest (line 29) | class UtilsTest(unittest.TestCase):
    method setUp (line 30) | def setUp(self):
    method tearDown (line 48) | def tearDown(self):
    method test_ascii_only (line 51) | def test_ascii_only(self):
    method test_fullProcess (line 55) | def test_fullProcess(self):
    method test_fullProcessForceAscii (line 59) | def test_fullProcessForceAscii(self):
  class RatioTest (line 64) | class RatioTest(unittest.TestCase):
    method setUp (line 66) | def setUp(self):
    method tearDown (line 99) | def tearDown(self):
    method testEqual (line 102) | def testEqual(self):
    method testCaseInsensitive (line 107) | def testCaseInsensitive(self):
    method testPartialRatio (line 111) | def testPartialRatio(self):
    method testTokenSortRatio (line 114) | def testTokenSortRatio(self):
    method testPartialTokenSortRatio (line 117) | def testPartialTokenSortRatio(self):
    method testTokenSetRatio (line 125) | def testTokenSetRatio(self):
    method testPartialTokenSetRatio (line 132) | def testPartialTokenSetRatio(self):
    method testQuickRatioEqual (line 135) | def testQuickRatioEqual(self):
    method testQuickRatioCaseInsensitive (line 138) | def testQuickRatioCaseInsensitive(self):
    method testQuickRatioNotEqual (line 141) | def testQuickRatioNotEqual(self):
    method testWRatioEqual (line 144) | def testWRatioEqual(self):
    method testWRatioCaseInsensitive (line 147) | def testWRatioCaseInsensitive(self):
    method testWRatioPartialMatch (line 150) | def testWRatioPartialMatch(self):
    method testWRatioMisorderedMatch (line 154) | def testWRatioMisorderedMatch(self):
    method testWRatioStr (line 158) | def testWRatioStr(self):
    method testQRatioStr (line 161) | def testQRatioStr(self):
    method testEmptyStringsScore100 (line 164) | def testEmptyStringsScore100(self):
    method testIssueSeven (line 168) | def testIssueSeven(self):
    method testRatioUnicodeString (line 178) | def testRatioUnicodeString(self):
    method testPartialRatioUnicodeString (line 184) | def testPartialRatioUnicodeString(self):
    method testWRatioUnicodeString (line 190) | def testWRatioUnicodeString(self):
    method testQRatioUnicodeString (line 208) | def testQRatioUnicodeString(self):
    method testQratioForceAscii (line 226) | def testQratioForceAscii(self):
    method testQRatioForceAscii (line 236) | def testQRatioForceAscii(self):
    method testTokenSetForceAscii (line 246) | def testTokenSetForceAscii(self):
    method testTokenSortForceAscii (line 256) | def testTokenSortForceAscii(self):
  class ValidatorTest (line 267) | class ValidatorTest(unittest.TestCase):
    method setUp (line 268) | def setUp(self):
    method testCheckForNone (line 271) | def testCheckForNone(self):
    method testCheckEmptyString (line 285) | def testCheckEmptyString(self):
  class ProcessTest (line 300) | class ProcessTest(unittest.TestCase):
    method setUp (line 302) | def setUp(self):
    method testGetBestChoice1 (line 327) | def testGetBestChoice1(self):
    method testGetBestChoice2 (line 332) | def testGetBestChoice2(self):
    method testGetBestChoice3 (line 337) | def testGetBestChoice3(self):
    method testGetBestChoice4 (line 342) | def testGetBestChoice4(self):
    method testWithProcessor (line 347) | def testWithProcessor(self):
    method testWithScorer (line 358) | def testWithScorer(self):
    method testWithCutoff (line 391) | def testWithCutoff(self):
    method testWithCutoff2 (line 412) | def testWithCutoff2(self):
    method testEmptyStrings (line 427) | def testEmptyStrings(self):
    method testNullStrings (line 441) | def testNullStrings(self):
    method test_list_like_extract (line 455) | def test_list_like_extract(self):
    method test_dict_like_extract (line 465) | def test_dict_like_extract(self):
    method test_dedupe (line 480) | def test_dedupe(self):
    method test_simplematch (line 498) | def test_simplematch(self):
  class TestCodeFormat (line 509) | class TestCodeFormat(unittest.TestCase):
    method test_pep8_conformance (line 510) | def test_pep8_conformance(self):

FILE: data/thefuzz/test_thefuzz_hypothesis.py
  function scorers_processors (line 15) | def scorers_processors():
  function full_scorers_processors (line 41) | def full_scorers_processors():
  function test_identical_strings_extracted (line 66) | def test_identical_strings_extracted(scorer, processor, data):
  function test_only_identical_strings_extracted (line 111) | def test_only_identical_strings_extracted(scorer, processor, data):

FILE: data/thefuzz/test_thefuzz_pytest.py
  function test_process_warning (line 4) | def test_process_warning(caplog):

FILE: data/thefuzz/thefuzz/StringMatcher.py
  class StringMatcher (line 14) | class StringMatcher:
    method _reset_cache (line 17) | def _reset_cache(self):
    method __init__ (line 21) | def __init__(self, isjunk=None, seq1='', seq2=''):
    method set_seqs (line 27) | def set_seqs(self, seq1, seq2):
    method set_seq1 (line 31) | def set_seq1(self, seq1):
    method set_seq2 (line 35) | def set_seq2(self, seq2):
    method get_opcodes (line 39) | def get_opcodes(self):
    method get_editops (line 47) | def get_editops(self):
    method get_matching_blocks (line 55) | def get_matching_blocks(self):
    method ratio (line 61) | def ratio(self):
    method quick_ratio (line 66) | def quick_ratio(self):
    method real_quick_ratio (line 72) | def real_quick_ratio(self):
    method distance (line 76) | def distance(self):

FILE: data/thefuzz/thefuzz/StringMatcher.pyi
  class StringMatcher (line 8) | class StringMatcher:
    method _reset_cache (line 9) | def _reset_cache(self) -> None:
    method __init__ (line 16) | def __init__(self, isjunk: Optional[bool] = ..., seq1: str = ..., seq2...
    method set_seqs (line 17) | def set_seqs(self, seq1: str, seq2: str) -> None: ...
    method set_seq1 (line 18) | def set_seq1(self, seq1: str) -> None: ...
    method set_seq2 (line 19) | def set_seq2(self, seq2: str) -> None: ...
    method get_opcodes (line 20) | def get_opcodes(self) -> OpcodeT: ...
    method get_editops (line 21) | def get_editops(self) -> EditOpcodeT: ...
    method get_matching_blocks (line 22) | def get_matching_blocks(self) -> MatchingBlocksT: ...
    method ratio (line 23) | def ratio(self) -> float: ...
    method quick_ratio (line 24) | def quick_ratio(self) -> float: ...
    method real_quick_ratio (line 25) | def real_quick_ratio(self) -> float: ...
    method distance (line 26) | def distance(self) -> int: ...

FILE: data/thefuzz/thefuzz/fuzz.py
  function ratio (line 22) | def ratio(s1, s2):
  function partial_ratio (line 32) | def partial_ratio(s1, s2):
  function _process_and_sort (line 73) | def _process_and_sort(s, force_ascii, full_process=True):
  function _token_sort (line 89) | def _token_sort(s1, s2, partial=True, force_ascii=True, full_process=True):
  function token_sort_ratio (line 99) | def token_sort_ratio(s1, s2, force_ascii=True, full_process=True):
  function partial_token_sort_ratio (line 106) | def partial_token_sort_ratio(s1, s2, force_ascii=True, full_process=True):
  function _token_set (line 114) | def _token_set(s1, s2, partial=True, force_ascii=True, full_process=True):
  function token_set_ratio (line 166) | def token_set_ratio(s1, s2, force_ascii=True, full_process=True):
  function partial_token_set_ratio (line 170) | def partial_token_set_ratio(s1, s2, force_ascii=True, full_process=True):
  function QRatio (line 179) | def QRatio(s1, s2, force_ascii=True, full_process=True):
  function UQRatio (line 208) | def UQRatio(s1, s2, full_process=True):
  function WRatio (line 222) | def WRatio(s1, s2, force_ascii=True, full_process=True):
  function UWRatio (line 300) | def UWRatio(s1, s2, full_process=True):

FILE: data/thefuzz/thefuzz/fuzz.pyi
  function ratio (line 1) | def ratio(s1: str, s2: str) -> int: ...
  function partial_ratio (line 2) | def partial_ratio(s1: str, s2: str) -> int: ...
  function _process_and_sort (line 3) | def _process_and_sort(s: str, force_ascii: bool, full_process: bool = .....
  function _token_sort (line 4) | def _token_sort(s1: str, s2: str, partial: bool = ..., force_ascii: bool...
  function token_sort_ratio (line 5) | def token_sort_ratio(s1: str, s2: str, force_ascii: bool = ..., full_pro...
  function partial_token_sort_ratio (line 6) | def partial_token_sort_ratio(s1: str, s2: str, force_ascii: bool = ..., ...
  function _token_set (line 7) | def _token_set(s1: str, s2: str, partial: bool = ..., force_ascii: bool ...
  function token_set_ratio (line 8) | def token_set_ratio(s1: str, s2: str, force_ascii: bool = ..., full_proc...
  function partial_token_set_ratio (line 9) | def partial_token_set_ratio(s1: str, s2: str, force_ascii: bool = ..., f...
  function QRatio (line 10) | def QRatio(s1: str, s2: str, force_ascii: bool = ..., full_process: bool...
  function UQRatio (line 11) | def UQRatio(s1: str, s2: str, full_process: bool = ...) -> int: ...
  function WRatio (line 12) | def WRatio(s1: str, s2: str, force_ascii: bool = ..., full_process: bool...
  function UWRatio (line 13) | def UWRatio(s1: str, s2: str, full_process: bool = ...) -> int: ...

FILE: data/thefuzz/thefuzz/process.py
  function extractWithoutOrder (line 18) | def extractWithoutOrder(query, choices, processor=default_processor, sco...
  function extract (line 124) | def extract(query, choices, processor=default_processor, scorer=default_...
  function extractBests (line 174) | def extractBests(query, choices, processor=default_processor, scorer=def...
  function extractOne (line 199) | def extractOne(query, choices, processor=default_processor, scorer=defau...
  function dedupe (line 227) | def dedupe(contains_dupes, threshold=70, scorer=fuzz.token_set_ratio):

FILE: data/thefuzz/thefuzz/process.pyi
  function extractWithoutOrder (line 13) | def extractWithoutOrder(query: str, choices: Mapping[str, str], processo...
  function extractWithoutOrder (line 17) | def extractWithoutOrder(query: str, choices: Sequence[str], processor: P...

FILE: data/thefuzz/thefuzz/string_processing.py
  class StringProcessor (line 4) | class StringProcessor:
    method replace_non_letters_non_numbers_with_whitespace (line 14) | def replace_non_letters_non_numbers_with_whitespace(cls, a_string):

FILE: data/thefuzz/thefuzz/string_processing.pyi
  class StringProcessor (line 1) | class StringProcessor(object):
    method replace_non_letters_non_numbers_with_whitespace (line 3) | def replace_non_letters_non_numbers_with_whitespace(cls, a_string: str...

FILE: data/thefuzz/thefuzz/utils.py
  function validate_string (line 6) | def validate_string(s):
  function check_for_equivalence (line 19) | def check_for_equivalence(func):
  function check_for_none (line 28) | def check_for_none(func):
  function check_empty_string (line 37) | def check_empty_string(func):
  function ascii_only (line 50) | def ascii_only(s):
  function make_type_consistent (line 54) | def make_type_consistent(s1, s2):
  function full_process (line 63) | def full_process(s, force_ascii=False):
  function intr (line 79) | def intr(n):

FILE: data/thefuzz/thefuzz/utils.pyi
  function validate_string (line 6) | def validate_string(s: str) -> bool: ...
  function check_for_equivalence (line 7) | def check_for_equivalence(func: TCallable) -> TCallable: ...
  function check_for_none (line 8) | def check_for_none(func: TCallable) -> TCallable: ...
  function check_empty_string (line 9) | def check_empty_string(func: TCallable) -> TCallable: ...
  function asciionly (line 10) | def asciionly(s: str) -> str: ...
  function asciidammit (line 11) | def asciidammit(s: Union[str, bytes]) -> str: ...
  function make_type_consistent (line 12) | def make_type_consistent(s1: str, s2: str) -> Tuple[str, str]: ...
  function full_process (line 13) | def full_process(s: str, force_ascii: bool = ...) -> str: ...
  function intr (line 14) | def intr(n: float) -> int: ...

Download .json

Condensed preview — 360 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (6,378K chars).

[
  {
    "path": ".gitignore",
    "chars": 3090,
    "preview": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packagi"
  },
  {
    "path": "LangChain Cookbook Part 1 - Fundamentals.ipynb",
    "chars": 65713,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"359697d5\",\n   \"metadata\": {},\n   \"source\": [\n    \"# LangChain Co"
  },
  {
    "path": "LangChain Cookbook Part 2 - Use Cases.ipynb",
    "chars": 80888,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"359697d5\",\n   \"metadata\": {},\n   \"source\": [\n    \"# LangChain Co"
  },
  {
    "path": "README.md",
    "chars": 6270,
    "preview": "# Learn LangChain\n\nOverview, Tutorial, and Examples of [LangChain](https://langchain.readthedocs.io/en/latest/)\n\nSee the"
  },
  {
    "path": "SUMMARY.md",
    "chars": 52,
    "preview": "# Table of contents\n\n* [Learn LangChain](README.md)\n"
  },
  {
    "path": "agents/Agents + ZapierToolkit.ipynb",
    "chars": 41744,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"a07e6c78\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Zapier Nat"
  },
  {
    "path": "agents/Agents.ipynb",
    "chars": 7932,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"5170bf5b\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Agents - Mak"
  },
  {
    "path": "bots/Twitter_Reply_Bot/Twitter Reply Bot Notebook.ipynb",
    "chars": 6716,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"6d336eed\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Twitter Repl"
  },
  {
    "path": "chains/Chain Types.ipynb",
    "chars": 207333,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"id\": \"e5490ab7\",\n   \"metadata\": {},\n   \"outputs\":"
  },
  {
    "path": "chatapi/ChatAPI + LangChain Basics.ipynb",
    "chars": 13514,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"e2376f66\",\n   \"metadata\": {},\n   \"source\": [\n    \"[Official docu"
  },
  {
    "path": "data/LinkedInIndustries.csv",
    "chars": 382,
    "preview": "Industry\r\nCorporate Services\r\nRecreation & Travel\r\nLegal\r\nWellness & Fitness\r\nEntertainment\r\nConsumer Goods\r\nDesign\r\nArt"
  },
  {
    "path": "data/LinkedInSubIndustries.csv",
    "chars": 5011,
    "preview": "Industry,SubIndustry\r\nCorporate Services,Accounting\r\nRecreation & Travel,Airlines/Aviation\r\nLegal,Alternative Dispute Re"
  },
  {
    "path": "data/PaulGrahamEssayMedium/fr.txt",
    "chars": 60488,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nSeptember 2013Most startups that raise money do it more than"
  },
  {
    "path": "data/PaulGrahamEssayMedium/guidetoinvestors.txt",
    "chars": 35171,
    "preview": "April 2007(This essay is derived from a keynote talk at the 2007 ASES Summit\nat Stanford.)The world of investors is a fo"
  },
  {
    "path": "data/PaulGrahamEssayMedium/mit.txt",
    "chars": 36018,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nOctober 2006(This essay is derived from a talk at MIT.)Till "
  },
  {
    "path": "data/PaulGrahamEssayMedium/notnot.txt",
    "chars": 34567,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nMarch 2007(This essay is derived from talks at the 2007 \nSta"
  },
  {
    "path": "data/PaulGrahamEssayMedium/popular.txt",
    "chars": 43269,
    "preview": "May 2001(This article was written as a kind of business plan for a\nnew language.\nSo it is missing (because it takes for "
  },
  {
    "path": "data/PaulGrahamEssayMedium/re.txt",
    "chars": 42080,
    "preview": "January 2016One advantage of being old is that you can see change happen in\nyour lifetime.  A lot of the change I've see"
  },
  {
    "path": "data/PaulGrahamEssayMedium/road.txt",
    "chars": 68899,
    "preview": "September 2001\n(This article explains why much of the next generation of software\nmay be server-based, what that will me"
  },
  {
    "path": "data/PaulGrahamEssayMedium/start.txt",
    "chars": 54365,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nMarch 2005(This essay is derived from a talk at the Harvard "
  },
  {
    "path": "data/PaulGrahamEssayMedium/startupfunding.txt",
    "chars": 50832,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nNovember 2005\nVenture funding works like gears.  A typical s"
  },
  {
    "path": "data/PaulGrahamEssayMedium/startupideas.txt",
    "chars": 40553,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nNovember 2012The way to get startup ideas is not to try to t"
  },
  {
    "path": "data/PaulGrahamEssayMedium/wealth.txt",
    "chars": 50316,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nMay 2004\n(This essay was originally published in Hackers \n& "
  },
  {
    "path": "data/PaulGrahamEssayMedium/worked.txt",
    "chars": 75754,
    "preview": "February 2021Before college the two main things I worked on, outside of school,\r\nwere writing and programming. I didn't "
  },
  {
    "path": "data/PaulGrahamEssaySmall/cred.txt",
    "chars": 1383,
    "preview": "April 2020I recently saw a \nvideo \nof TV journalists and politicians confidently\nsaying that the coronavirus would be no"
  },
  {
    "path": "data/PaulGrahamEssaySmall/disc.txt",
    "chars": 1261,
    "preview": "January 2017Because biographies of famous scientists tend to \nedit out their mistakes, we underestimate the \ndegree of r"
  },
  {
    "path": "data/PaulGrahamEssaySmall/fix.txt",
    "chars": 1301,
    "preview": "\nKevin Kelleher suggested an interesting way to compare programming\nlanguages: to describe each in terms of the problem "
  },
  {
    "path": "data/PaulGrahamEssaySmall/fp.txt",
    "chars": 1094,
    "preview": "December 2019I've seen the same pattern in many different fields: even though\nlots of people have worked hard in the fie"
  },
  {
    "path": "data/PaulGrahamEssaySmall/getideas.txt",
    "chars": 789,
    "preview": "January 2023(Someone fed my essays into GPT to make something that could answer\nquestions based on them, then asked it w"
  },
  {
    "path": "data/PaulGrahamEssaySmall/lwba.txt",
    "chars": 298,
    "preview": "\nAfter a link to \nBeating the Averages was posted on slashdot, \nsome readers wanted to hear in more detail \nabout the sp"
  },
  {
    "path": "data/PaulGrahamEssaySmall/nft.txt",
    "chars": 1742,
    "preview": "May 2021Noora Health, a nonprofit I've \nsupported for years, just launched\na new NFT. It has a dramatic name, Save Thous"
  },
  {
    "path": "data/PaulGrahamEssaySmall/noob.txt",
    "chars": 1978,
    "preview": "January 2020When I was young, I thought old people had everything figured out.\nNow that I'm old, I know this isn't true."
  },
  {
    "path": "data/PaulGrahamEssaySmall/nov.txt",
    "chars": 1506,
    "preview": "November 2019If you discover something new, there's a significant chance you'll be\naccused of some form of heresy.To dis"
  },
  {
    "path": "data/PaulGrahamEssaySmall/pow.txt",
    "chars": 655,
    "preview": "January 2017People who are powerful but uncharismatic will tend to be disliked.\nTheir power makes them a target for crit"
  },
  {
    "path": "data/PaulGrahamEssaySmall/prop62.txt",
    "chars": 1018,
    "preview": "November 2016If you're a California voter, there is an important proposition\non your ballot this year: Proposition 62, w"
  },
  {
    "path": "data/PaulGrahamEssaySmall/rootsoflisp.txt",
    "chars": 2056,
    "preview": "May 2001\n\n(I wrote this article to help myself understand exactly\nwhat McCarthy discovered.  You don't need to know this"
  },
  {
    "path": "data/PaulGrahamEssaySmall/rss.txt",
    "chars": 55,
    "preview": "Aaron Swartz created a scraped\nfeed\nof the essays page."
  },
  {
    "path": "data/PaulGrahamEssaySmall/todo.txt",
    "chars": 1285,
    "preview": "April 2012A palliative care nurse called Bronnie Ware made a list of the\nbiggest regrets\nof the dying.  Her list seems p"
  },
  {
    "path": "data/PaulGrahamEssaySmall/twitter.txt",
    "chars": 810,
    "preview": "April 2009Om Malik is the most recent of many people\nto ask why Twitter is such a big deal.The reason is that it's a new"
  },
  {
    "path": "data/PaulGrahamEssays/13sentences.txt",
    "chars": 7470,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\n\nWatch how this essay was\nwritten.\n\n\n\n\nFebruary 2009One of t"
  },
  {
    "path": "data/PaulGrahamEssays/5founders.txt",
    "chars": 4203,
    "preview": "April 2009Inc recently asked me who I thought were the 5 most\ninteresting startup founders of the last 30 years.  How do"
  },
  {
    "path": "data/PaulGrahamEssays/6631327.txt",
    "chars": 3888,
    "preview": "March 2006, rev August 2009A couple days ago I found to my surprise that I'd been granted a\npatent.\nIt issued in 2003, b"
  },
  {
    "path": "data/PaulGrahamEssays/95.txt",
    "chars": 5404,
    "preview": "December 2014American technology companies want the government to make immigration\neasier because they say they can't fi"
  },
  {
    "path": "data/PaulGrahamEssays/ace.txt",
    "chars": 19160,
    "preview": "December 2020As I was deciding what to write about next, I was surprised to find\nthat two separate essays I'd been plann"
  },
  {
    "path": "data/PaulGrahamEssays/addiction.txt",
    "chars": 7436,
    "preview": "July 2010What hard liquor, cigarettes, heroin, and crack have in common is\nthat they're all more concentrated forms of l"
  },
  {
    "path": "data/PaulGrahamEssays/airbnb.txt",
    "chars": 7384,
    "preview": "March 2011Yesterday Fred Wilson published a remarkable post about missing\nAirbnb.   VCs miss good startups all the time,"
  },
  {
    "path": "data/PaulGrahamEssays/airbnbs.txt",
    "chars": 6046,
    "preview": "December 2020To celebrate Airbnb's IPO and to help future founders, I thought\nit might be useful to explain what was spe"
  },
  {
    "path": "data/PaulGrahamEssays/alien.txt",
    "chars": 3948,
    "preview": "October 2022If there were intelligent beings elsewhere in the universe, they'd\r\nshare certain truths in common with us. "
  },
  {
    "path": "data/PaulGrahamEssays/altair.txt",
    "chars": 2128,
    "preview": "February 2015One of the most valuable exercises you can try if you want to\nunderstand startups is to look at the most su"
  },
  {
    "path": "data/PaulGrahamEssays/ambitious.txt",
    "chars": 21226,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nMarch 2012One of the more surprising things I've noticed whi"
  },
  {
    "path": "data/PaulGrahamEssays/america.txt",
    "chars": 27586,
    "preview": "May 2006(This essay is derived from a keynote at Xtech.)Startups happen in clusters.  There are a lot of them in Silicon"
  },
  {
    "path": "data/PaulGrahamEssays/angelinvesting.txt",
    "chars": 22379,
    "preview": "March 2009(This essay is derived from a talk at AngelConf.)When we sold our startup in 1998 I thought one day I'd do som"
  },
  {
    "path": "data/PaulGrahamEssays/aord.txt",
    "chars": 8501,
    "preview": "October 2015When I talk to a startup that's been operating for more than 8 or\n9 months, the first thing I want to know i"
  },
  {
    "path": "data/PaulGrahamEssays/apple.txt",
    "chars": 12406,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nNovember 2009I don't think Apple realizes how badly the App "
  },
  {
    "path": "data/PaulGrahamEssays/artistsship.txt",
    "chars": 7623,
    "preview": "November 2008One of the differences between big companies and startups is that\nbig companies tend to have developed proc"
  },
  {
    "path": "data/PaulGrahamEssays/avg.txt",
    "chars": 25387,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nApril 2001, rev. April 2003(This article is derived from a t"
  },
  {
    "path": "data/PaulGrahamEssays/badeconomy.txt",
    "chars": 6079,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nOctober 2008The economic situation is apparently so grim tha"
  },
  {
    "path": "data/PaulGrahamEssays/before.txt",
    "chars": 25529,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nOctober 2014(This essay is derived from a guest lecture in S"
  },
  {
    "path": "data/PaulGrahamEssays/better.txt",
    "chars": 25360,
    "preview": "January 2003(This article was given as a talk at the 2003 Spam Conference.\nIt describes the work I've done to improve th"
  },
  {
    "path": "data/PaulGrahamEssays/bias.txt",
    "chars": 3368,
    "preview": "October 2015This will come as a surprise to a lot of people, but in some cases\nit's possible to detect bias in a selecti"
  },
  {
    "path": "data/PaulGrahamEssays/boss.txt",
    "chars": 14317,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nMarch 2008, rev. June 2008Technology tends to separate norma"
  },
  {
    "path": "data/PaulGrahamEssays/bronze.txt",
    "chars": 17754,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nApril 2005This summer, as an \nexperiment, some \nfriends and "
  },
  {
    "path": "data/PaulGrahamEssays/bubble.txt",
    "chars": 21193,
    "preview": "September 2004(This essay is derived from an invited talk at ICFP 2004.)I had a front row seat for the Internet Bubble,\n"
  },
  {
    "path": "data/PaulGrahamEssays/charisma.txt",
    "chars": 8780,
    "preview": "\nNovember 2004, corrected June 2006Occam's razor says we should prefer the simpler of two explanations.\nI begin by remin"
  },
  {
    "path": "data/PaulGrahamEssays/cities.txt",
    "chars": 20147,
    "preview": "May 2008\nGreat cities attract ambitious people.  You can sense it when you\nwalk around one.  In a hundred subtle ways, t"
  },
  {
    "path": "data/PaulGrahamEssays/college.txt",
    "chars": 20690,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\n\nMarch 2005(Parts of this essay began as replies to students"
  },
  {
    "path": "data/PaulGrahamEssays/colleges.txt",
    "chars": 12521,
    "preview": "September 2007A few weeks ago I had a thought so heretical that it really surprised\nme. It may not matter all that much "
  },
  {
    "path": "data/PaulGrahamEssays/conformism.txt",
    "chars": 11987,
    "preview": "July 2020One of the most revealing ways to classify people is by the degree\nand aggressiveness of their conformism. Imag"
  },
  {
    "path": "data/PaulGrahamEssays/control.txt",
    "chars": 4344,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nDecember 2010Someone we funded is talking to VCs now, and as"
  },
  {
    "path": "data/PaulGrahamEssays/convergence.txt",
    "chars": 8887,
    "preview": "March 2009About twenty years ago people noticed computers and TV were on a\ncollision course and started to speculate abo"
  },
  {
    "path": "data/PaulGrahamEssays/convince.txt",
    "chars": 20931,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nAugust 2013When people hurt themselves lifting heavy things,"
  },
  {
    "path": "data/PaulGrahamEssays/copy.txt",
    "chars": 5365,
    "preview": "July 2006\nWhen I was in high school I spent a lot of time imitating bad\nwriters.  What we studied in English classes was"
  },
  {
    "path": "data/PaulGrahamEssays/corpdev.txt",
    "chars": 7086,
    "preview": "January 2015Corporate Development, aka corp dev, is the group within companies\nthat buys other companies. If you're talk"
  },
  {
    "path": "data/PaulGrahamEssays/cred.txt",
    "chars": 1383,
    "preview": "April 2020I recently saw a \nvideo \nof TV journalists and politicians confidently\nsaying that the coronavirus would be no"
  },
  {
    "path": "data/PaulGrahamEssays/credentials.txt",
    "chars": 13985,
    "preview": "December 2008A few months ago I read a New York Times article on South\nKorean cram schools that said \n  Admission to the"
  },
  {
    "path": "data/PaulGrahamEssays/desres.txt",
    "chars": 14988,
    "preview": "January 2003(This article is derived from a keynote talk at the fall 2002 meeting\nof NEPLS.)Visitors to this country are"
  },
  {
    "path": "data/PaulGrahamEssays/determination.txt",
    "chars": 9053,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nSeptember 2009Like all investors, we spend a lot of time try"
  },
  {
    "path": "data/PaulGrahamEssays/die.txt",
    "chars": 10807,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nAugust 2007(This is a talk I gave at the last \nY Combinator "
  },
  {
    "path": "data/PaulGrahamEssays/diff.txt",
    "chars": 4231,
    "preview": "December 2001 (rev. May 2002)\n\n(This article came about in response to some questions on\nthe LL1 mailing list.  It is no"
  },
  {
    "path": "data/PaulGrahamEssays/disagree.txt",
    "chars": 8930,
    "preview": "March 2008The web is turning writing into a conversation.  Twenty years ago,\nwriters wrote and readers read.  The web le"
  },
  {
    "path": "data/PaulGrahamEssays/disc.txt",
    "chars": 1261,
    "preview": "January 2017Because biographies of famous scientists tend to \nedit out their mistakes, we underestimate the \ndegree of r"
  },
  {
    "path": "data/PaulGrahamEssays/discover.txt",
    "chars": 7525,
    "preview": "September 2009When meeting people you don't know very well, the convention is\nto seem extra friendly.  You smile and say"
  },
  {
    "path": "data/PaulGrahamEssays/distraction.txt",
    "chars": 6295,
    "preview": "Note: The strategy described at the end of this essay didn't work.\nIt would work for a while, and then I'd gradually fin"
  },
  {
    "path": "data/PaulGrahamEssays/divergence.txt",
    "chars": 7802,
    "preview": "December 2008(I originally wrote this at the request of a company producing\na report about entrepreneurship.  Unfortunat"
  },
  {
    "path": "data/PaulGrahamEssays/donate.txt",
    "chars": 2891,
    "preview": "March 2021The secret curse of the nonprofit world is restricted donations.\nIf you haven't been involved with nonprofits,"
  },
  {
    "path": "data/PaulGrahamEssays/ds.txt",
    "chars": 24983,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nJuly 2013One of the most common types of advice we give at Y"
  },
  {
    "path": "data/PaulGrahamEssays/early.txt",
    "chars": 14128,
    "preview": "October 2020One of the biggest things holding people back from doing great work\nis the fear of making something lame. An"
  },
  {
    "path": "data/PaulGrahamEssays/earnest.txt",
    "chars": 9583,
    "preview": "December 2020Jessica and I have certain words that have special significance\nwhen we're talking about startups. The high"
  },
  {
    "path": "data/PaulGrahamEssays/ecw.txt",
    "chars": 6259,
    "preview": "December 2014If the world were static, we could have monotonically increasing\nconfidence in our beliefs.  The more (and "
  },
  {
    "path": "data/PaulGrahamEssays/equity.txt",
    "chars": 6106,
    "preview": "July 2007An investor wants to give you money for a certain percentage of\nyour startup.  Should you take it?  You're abou"
  },
  {
    "path": "data/PaulGrahamEssays/essay.txt",
    "chars": 26048,
    "preview": "September 2004Remember the essays you had to write in high school?\nTopic sentence, introductory paragraph,\nsupporting pa"
  },
  {
    "path": "data/PaulGrahamEssays/ffb.txt",
    "chars": 4794,
    "preview": "August 2003\nWe may be able to improve the accuracy of Bayesian spam filters\nby having them follow links to see what's\nwa"
  },
  {
    "path": "data/PaulGrahamEssays/fh.txt",
    "chars": 7621,
    "preview": "January 2020(I originally intended this for startup founders, who are often\nsurprised by the attention they get as their"
  },
  {
    "path": "data/PaulGrahamEssays/fix.txt",
    "chars": 1301,
    "preview": "\nKevin Kelleher suggested an interesting way to compare programming\nlanguages: to describe each in terms of the problem "
  },
  {
    "path": "data/PaulGrahamEssays/fn.txt",
    "chars": 7447,
    "preview": "May 2021Most people think of nerds as quiet, diffident people. In ordinary\nsocial situations they are — as quiet and dif"
  },
  {
    "path": "data/PaulGrahamEssays/founders.txt",
    "chars": 4542,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nOctober 2010\n\n(I wrote this for Forbes, who asked me to writ"
  },
  {
    "path": "data/PaulGrahamEssays/foundersatwork.txt",
    "chars": 4514,
    "preview": "January 2007(Foreword to Jessica Livingston's \nFounders at Work.)Apparently sprinters reach their highest speed right ou"
  },
  {
    "path": "data/PaulGrahamEssays/foundervisa.txt",
    "chars": 2261,
    "preview": "\n\nApril 2009I usually avoid politics, but since we now seem to have an administration that's open to suggestions, I'm go"
  },
  {
    "path": "data/PaulGrahamEssays/fp.txt",
    "chars": 1094,
    "preview": "December 2019I've seen the same pattern in many different fields: even though\nlots of people have worked hard in the fie"
  },
  {
    "path": "data/PaulGrahamEssays/fr.txt",
    "chars": 60488,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nSeptember 2013Most startups that raise money do it more than"
  },
  {
    "path": "data/PaulGrahamEssays/fundraising.txt",
    "chars": 27883,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nAugust 2008Raising money is the second hardest part of start"
  },
  {
    "path": "data/PaulGrahamEssays/future.txt",
    "chars": 22261,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nAugust 2010Two years ago I\nwrote about what I called \"a huge"
  },
  {
    "path": "data/PaulGrahamEssays/gap.txt",
    "chars": 32640,
    "preview": "May 2004When people care enough about something to do it well, those who\ndo it best tend to be far better than everyone "
  },
  {
    "path": "data/PaulGrahamEssays/gba.txt",
    "chars": 11455,
    "preview": "April 2004To the popular press, \"hacker\" means someone who breaks\ninto computers.  Among programmers it means a good pro"
  },
  {
    "path": "data/PaulGrahamEssays/genius.txt",
    "chars": 14996,
    "preview": "November 2019Everyone knows that to do great work you need both natural ability\nand determination. But there's a third i"
  },
  {
    "path": "data/PaulGrahamEssays/getideas.txt",
    "chars": 789,
    "preview": "January 2023(Someone fed my essays into GPT to make something that could answer\nquestions based on them, then asked it w"
  },
  {
    "path": "data/PaulGrahamEssays/gh.txt",
    "chars": 29511,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nJuly 2004(This essay is derived from a talk at Oscon 2004.)\n"
  },
  {
    "path": "data/PaulGrahamEssays/good.txt",
    "chars": 16697,
    "preview": "April 2008(This essay is derived from a talk at the 2008 Startup School.)About a month after we started Y Combinator we "
  },
  {
    "path": "data/PaulGrahamEssays/goodart.txt",
    "chars": 20206,
    "preview": "December 2006I grew up believing that taste is just a matter of personal preference.\nEach person has things they like, b"
  },
  {
    "path": "data/PaulGrahamEssays/goodtaste.txt",
    "chars": 6051,
    "preview": "November 2021(This essay is derived from a talk at the Cambridge Union.)When I was a kid, I'd have said there wasn't. My"
  },
  {
    "path": "data/PaulGrahamEssays/googles.txt",
    "chars": 7683,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nApril 2008Umair Haque \nwrote recently that the reason there "
  },
  {
    "path": "data/PaulGrahamEssays/growth.txt",
    "chars": 31064,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nSeptember 2012A startup is a company designed to grow fast. "
  },
  {
    "path": "data/PaulGrahamEssays/guidetoinvestors.txt",
    "chars": 35171,
    "preview": "April 2007(This essay is derived from a keynote talk at the 2007 ASES Summit\nat Stanford.)The world of investors is a fo"
  },
  {
    "path": "data/PaulGrahamEssays/hackernews.txt",
    "chars": 16425,
    "preview": "February 2009Hacker News was two years\nold last week.  Initially it was supposed to be a side project—an\napplication to "
  },
  {
    "path": "data/PaulGrahamEssays/head.txt",
    "chars": 10650,
    "preview": "August 2007A good programmer working intensively on his own code can hold it\nin his mind the way a mathematician holds a"
  },
  {
    "path": "data/PaulGrahamEssays/herd.txt",
    "chars": 6455,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nAugust 2013The biggest component in most investors' opinion "
  },
  {
    "path": "data/PaulGrahamEssays/heresy.txt",
    "chars": 12442,
    "preview": "April 2022One of the most surprising things I've witnessed in my lifetime is\nthe rebirth of the concept of heresy.In his"
  },
  {
    "path": "data/PaulGrahamEssays/heroes.txt",
    "chars": 15004,
    "preview": "April 2008There are some topics I save up because they'll be so much fun to\nwrite about.  This is one of them: a list of"
  },
  {
    "path": "data/PaulGrahamEssays/highres.txt",
    "chars": 9048,
    "preview": "December 2008For nearly all of history the success of a society was proportionate\nto its ability to assemble large and d"
  },
  {
    "path": "data/PaulGrahamEssays/hiresfund.txt",
    "chars": 4221,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nSeptember 2010The reason startups have been using \nmore conv"
  },
  {
    "path": "data/PaulGrahamEssays/hiring.txt",
    "chars": 27142,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nMay 2005(This essay is derived from a talk at the Berkeley C"
  },
  {
    "path": "data/PaulGrahamEssays/hp.txt",
    "chars": 31807,
    "preview": "May 2003(This essay is derived from a guest lecture at Harvard, which incorporated\nan earlier talk at Northeastern.)When"
  },
  {
    "path": "data/PaulGrahamEssays/hs.txt",
    "chars": 28151,
    "preview": "January 2005(I wrote this talk for a\nhigh school.  I never actually \ngave it, because the school authorities vetoed the "
  },
  {
    "path": "data/PaulGrahamEssays/hubs.txt",
    "chars": 10243,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nOctober 2011If you look at a list of US cities sorted by pop"
  },
  {
    "path": "data/PaulGrahamEssays/hundred.txt",
    "chars": 27891,
    "preview": "April 2003(This essay is derived from a keynote talk at PyCon 2003.)It's hard to predict what\nlife will be like in a hun"
  },
  {
    "path": "data/PaulGrahamEssays/hw.txt",
    "chars": 2440,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nOctober 2012One advantage of Y Combinator's early, broad foc"
  },
  {
    "path": "data/PaulGrahamEssays/hwh.txt",
    "chars": 18024,
    "preview": "June 2021It might not seem there's much to learn about how to work hard.\nAnyone who's been to school knows what it entai"
  },
  {
    "path": "data/PaulGrahamEssays/icad.txt",
    "chars": 33903,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nMay 2002\n\n\n\n\"We were after the C++ programmers. We managed t"
  },
  {
    "path": "data/PaulGrahamEssays/ideas.txt",
    "chars": 22241,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nOctober 2005(This essay is derived from a talk at the 2005 \n"
  },
  {
    "path": "data/PaulGrahamEssays/identity.txt",
    "chars": 5215,
    "preview": "February 2009I finally realized today why politics and religion yield such\nuniquely useless discussions.As a rule, any m"
  },
  {
    "path": "data/PaulGrahamEssays/iflisp.txt",
    "chars": 2457,
    "preview": "May 2003If Lisp is so great, why don't more people use it?  I was    \nasked this question by a student in the audience a"
  },
  {
    "path": "data/PaulGrahamEssays/ineq.txt",
    "chars": 19871,
    "preview": "\nJanuary 2016Since the 1970s, economic inequality in the US has increased\ndramatically. And in particular, the rich have"
  },
  {
    "path": "data/PaulGrahamEssays/inequality.txt",
    "chars": 16399,
    "preview": "August 2005(This essay is derived from a talk at Defcon 2005.)Suppose you wanted to get rid of economic inequality.  The"
  },
  {
    "path": "data/PaulGrahamEssays/investors.txt",
    "chars": 15949,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nAugust 2006, rev. April 2007, September 2010In a few days it"
  },
  {
    "path": "data/PaulGrahamEssays/invtrend.txt",
    "chars": 16640,
    "preview": "June 2013(This talk was written for an audience of investors.)Y Combinator has now funded 564 startups including the cur"
  },
  {
    "path": "data/PaulGrahamEssays/island.txt",
    "chars": 4068,
    "preview": "July 2006I've discovered a handy test for figuring out what you're addicted\nto.  Imagine you were going to spend the wee"
  },
  {
    "path": "data/PaulGrahamEssays/javacover.txt",
    "chars": 7613,
    "preview": "April 2001This essay developed out of conversations I've had with\nseveral other programmers about why Java smelled suspi"
  },
  {
    "path": "data/PaulGrahamEssays/jessica.txt",
    "chars": 11161,
    "preview": "November 2015A few months ago an article about Y Combinator said that early on\nit had been a \"one-man show.\"  It's sadly"
  },
  {
    "path": "data/PaulGrahamEssays/judgement.txt",
    "chars": 4365,
    "preview": "April 2007There are two different ways people judge you.  Sometimes judging\nyou correctly is the end goal.  But there's "
  },
  {
    "path": "data/PaulGrahamEssays/kate.txt",
    "chars": 4705,
    "preview": "August 2009Kate Courteau is the architect who designed Y Combinator's office.\nRecently we managed to recruit her to help"
  },
  {
    "path": "data/PaulGrahamEssays/kids.txt",
    "chars": 8146,
    "preview": "December 2019Before I had kids, I was afraid of having kids. Up to that point I\nfelt about kids the way the young August"
  },
  {
    "path": "data/PaulGrahamEssays/know.txt",
    "chars": 3685,
    "preview": "December 2014I've read Villehardouin's chronicle of the Fourth Crusade at least\ntwo times, maybe three.  And yet if I ha"
  },
  {
    "path": "data/PaulGrahamEssays/ladder.txt",
    "chars": 3374,
    "preview": "August 2005Thirty years ago, one was supposed to work one's way up the corporate\nladder.  That's less the rule now.  Our"
  },
  {
    "path": "data/PaulGrahamEssays/langdes.txt",
    "chars": 16954,
    "preview": "May 2001\n\n(These are some notes I made\nfor a panel discussion on programming language design\nat MIT on May 10, 2001.)1. "
  },
  {
    "path": "data/PaulGrahamEssays/laundry.txt",
    "chars": 24563,
    "preview": "October 2004\nAs E. B. White said, \"good writing is rewriting.\"  I didn't\nrealize this when I was in school.  In writing,"
  },
  {
    "path": "data/PaulGrahamEssays/lesson.txt",
    "chars": 22383,
    "preview": "December 2019\nThe most damaging thing you learned in school wasn't something you\nlearned in any specific class. It was l"
  },
  {
    "path": "data/PaulGrahamEssays/lies.txt",
    "chars": 29222,
    "preview": "May 2008Adults lie constantly to kids.  I'm not saying we should stop, but\nI think we should at least examine which lies"
  },
  {
    "path": "data/PaulGrahamEssays/love.txt",
    "chars": 25517,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nJanuary 2006To do something well you have to like it.   That"
  },
  {
    "path": "data/PaulGrahamEssays/lwba.txt",
    "chars": 298,
    "preview": "\nAfter a link to \nBeating the Averages was posted on slashdot, \nsome readers wanted to hear in more detail \nabout the sp"
  },
  {
    "path": "data/PaulGrahamEssays/mac.txt",
    "chars": 5505,
    "preview": "March 2005All the best hackers \nI know are gradually switching to Macs.  My\nfriend Robert said his whole research group "
  },
  {
    "path": "data/PaulGrahamEssays/makersschedule.txt",
    "chars": 6498,
    "preview": "\n\n\n\n\n\"...the mere consciousness of an engagement will sometimes worry a whole day.\" Charles Dickens\n\n\n\n\nJuly 2009One re"
  },
  {
    "path": "data/PaulGrahamEssays/marginal.txt",
    "chars": 34511,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nJune 2006(This essay is derived from talks at Usenix 2006 an"
  },
  {
    "path": "data/PaulGrahamEssays/maybe.txt",
    "chars": 10677,
    "preview": "February 2009A lot of cities look at Silicon Valley and ask \"How could we make\nsomething like that happen here?\"  The \no"
  },
  {
    "path": "data/PaulGrahamEssays/mean.txt",
    "chars": 6522,
    "preview": "November 2014It struck me recently how few of the most successful people I know\nare mean.  There are exceptions, but rem"
  },
  {
    "path": "data/PaulGrahamEssays/microsoft.txt",
    "chars": 7212,
    "preview": "April 2007A few days ago I suddenly realized Microsoft was dead.  I was talking\nto a young startup founder about how Goo"
  },
  {
    "path": "data/PaulGrahamEssays/mit.txt",
    "chars": 36018,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nOctober 2006(This essay is derived from a talk at MIT.)Till "
  },
  {
    "path": "data/PaulGrahamEssays/mod.txt",
    "chars": 3898,
    "preview": "December 2019There are two distinct ways to be politically moderate: on purpose\nand by accident. Intentional moderates a"
  },
  {
    "path": "data/PaulGrahamEssays/name.txt",
    "chars": 4148,
    "preview": "August 2015If you have a US startup called X and you don't have x.com, you\nshould probably change your name.The reason i"
  },
  {
    "path": "data/PaulGrahamEssays/nerds.txt",
    "chars": 31835,
    "preview": "February 2003When we were in junior high school, my friend Rich and I made a map\nof the school lunch tables according to"
  },
  {
    "path": "data/PaulGrahamEssays/newideas.txt",
    "chars": 7732,
    "preview": "May 2021There's one kind of opinion I'd be very afraid to express publicly.\nIf someone I knew to be both a domain expert"
  },
  {
    "path": "data/PaulGrahamEssays/newthings.txt",
    "chars": 6809,
    "preview": "February 2008The fiery reaction to the release of Arc had\nan unexpected consequence: it made me realize I had a design\np"
  },
  {
    "path": "data/PaulGrahamEssays/nft.txt",
    "chars": 1742,
    "preview": "May 2021Noora Health, a nonprofit I've \nsupported for years, just launched\na new NFT. It has a dramatic name, Save Thous"
  },
  {
    "path": "data/PaulGrahamEssays/noob.txt",
    "chars": 1978,
    "preview": "January 2020When I was young, I thought old people had everything figured out.\nNow that I'm old, I know this isn't true."
  },
  {
    "path": "data/PaulGrahamEssays/noop.txt",
    "chars": 3003,
    "preview": "\nThere is a kind of mania for object-oriented programming at the moment, but\n\nsome of the smartest programmers I know ar"
  },
  {
    "path": "data/PaulGrahamEssays/notnot.txt",
    "chars": 34567,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nMarch 2007(This essay is derived from talks at the 2007 \nSta"
  },
  {
    "path": "data/PaulGrahamEssays/nov.txt",
    "chars": 1506,
    "preview": "November 2019If you discover something new, there's a significant chance you'll be\naccused of some form of heresy.To dis"
  },
  {
    "path": "data/PaulGrahamEssays/nthings.txt",
    "chars": 7926,
    "preview": "September 2009I bet you the current issue of Cosmopolitan has an article\nwhose title begins with a number. \"7 Things He "
  },
  {
    "path": "data/PaulGrahamEssays/opensource.txt",
    "chars": 24644,
    "preview": "August 2005(This essay is derived from a talk at Oscon 2005.)Lately companies have been paying more attention to open so"
  },
  {
    "path": "data/PaulGrahamEssays/organic.txt",
    "chars": 5624,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nApril 2010The best way to come up with startup ideas is to a"
  },
  {
    "path": "data/PaulGrahamEssays/orth.txt",
    "chars": 4525,
    "preview": "July 2020\n\n\n\n\"Few people are capable of expressing with equanimity opinions which differ from the prejudices of their so"
  },
  {
    "path": "data/PaulGrahamEssays/own.txt",
    "chars": 13964,
    "preview": "June 2021A few days ago, on the way home from school, my nine year old son\ntold me he couldn't wait to get home to write"
  },
  {
    "path": "data/PaulGrahamEssays/patentpledge.txt",
    "chars": 4001,
    "preview": "August 2011I realized recently that we may be able to solve part of the patent\nproblem without waiting for the governmen"
  },
  {
    "path": "data/PaulGrahamEssays/pgh.txt",
    "chars": 14877,
    "preview": "April 2016(This is a talk I gave at an event called Opt412 in Pittsburgh.\nMuch of it will apply to other towns.  But not"
  },
  {
    "path": "data/PaulGrahamEssays/philosophy.txt",
    "chars": 27969,
    "preview": "September 2007In high school I decided I was going to study philosophy in college.\nI had several motives, some more hono"
  },
  {
    "path": "data/PaulGrahamEssays/pinch.txt",
    "chars": 9006,
    "preview": "December 2014Many startups go through a point a few months before they die where\nalthough they have a significant amount"
  },
  {
    "path": "data/PaulGrahamEssays/polls.txt",
    "chars": 3518,
    "preview": "November 2004\nA lot of people are writing now about \nwhy Kerry lost.  Here I want to\nexamine a more specific question: w"
  },
  {
    "path": "data/PaulGrahamEssays/popular.txt",
    "chars": 43269,
    "preview": "May 2001(This article was written as a kind of business plan for a\nnew language.\nSo it is missing (because it takes for "
  },
  {
    "path": "data/PaulGrahamEssays/pow.txt",
    "chars": 655,
    "preview": "January 2017People who are powerful but uncharismatic will tend to be disliked.\nTheir power makes them a target for crit"
  },
  {
    "path": "data/PaulGrahamEssays/power.txt",
    "chars": 17092,
    "preview": "May 2002\n\n\n\n\"The quantity of meaning compressed into a small space by \nalgebraic signs, is another circumstance that fac"
  },
  {
    "path": "data/PaulGrahamEssays/prcmc.txt",
    "chars": 7360,
    "preview": "July 2008At this year's startup school, David Heinemeier Hansson gave a\n talk\nin which he suggested that startup founder"
  },
  {
    "path": "data/PaulGrahamEssays/procrastination.txt",
    "chars": 10115,
    "preview": "December 2005The most impressive people I know are all terrible procrastinators.\nSo could it be that procrastination isn"
  },
  {
    "path": "data/PaulGrahamEssays/progbot.txt",
    "chars": 5475,
    "preview": "1993\n\n(This essay is from the introduction to On Lisp.)\nIt's a long-standing principle of programming style that the fun"
  },
  {
    "path": "data/PaulGrahamEssays/prop62.txt",
    "chars": 1018,
    "preview": "November 2016If you're a California voter, there is an important proposition\non your ballot this year: Proposition 62, w"
  },
  {
    "path": "data/PaulGrahamEssays/property.txt",
    "chars": 5549,
    "preview": "March 2012As a child I read a book of stories about a famous judge in eighteenth\ncentury Japan called Ooka Tadasuke.  On"
  },
  {
    "path": "data/PaulGrahamEssays/publishing.txt",
    "chars": 10323,
    "preview": "September 2009Publishers of all types, from news to music, are unhappy that\nconsumers won't pay for content anymore.  At"
  },
  {
    "path": "data/PaulGrahamEssays/pypar.txt",
    "chars": 2502,
    "preview": "August 2004In a recent talk I said something that upset a lot of\npeople: that you could get smarter programmers to work "
  },
  {
    "path": "data/PaulGrahamEssays/ramenprofitable.txt",
    "chars": 10569,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nJuly 2009Now that the term \"ramen profitable\" has become wid"
  },
  {
    "path": "data/PaulGrahamEssays/randomness.txt",
    "chars": 3248,
    "preview": "April 2006, rev August 2009Plato quotes Socrates as saying \"the unexamined life is not worth\nliving.\"  Part of what he m"
  },
  {
    "path": "data/PaulGrahamEssays/re.txt",
    "chars": 42080,
    "preview": "January 2016One advantage of being old is that you can see change happen in\nyour lifetime.  A lot of the change I've see"
  },
  {
    "path": "data/PaulGrahamEssays/read.txt",
    "chars": 2402,
    "preview": "November 2022In the science fiction books I read as a kid, reading had often\nbeen replaced by some more efficient way of"
  },
  {
    "path": "data/PaulGrahamEssays/real.txt",
    "chars": 4567,
    "preview": "April 2021When intellectuals talk about the death penalty, they talk about\nthings like whether it's permissible for the "
  },
  {
    "path": "data/PaulGrahamEssays/really.txt",
    "chars": 29388,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nOctober 2009(This  essay is derived from a talk at the 2009 "
  },
  {
    "path": "data/PaulGrahamEssays/relres.txt",
    "chars": 5696,
    "preview": "\n\nWant to start a startup?  Get funded by\nY Combinator.\n\n\n\n\nMarch 2009A couple days ago I finally got being a good start"
  },
  {
    "path": "data/PaulGrahamEssays/revolution.txt",
    "chars": 7927,
    "preview": "April 2009Recently I realized I'd been holding two ideas in my head that would explode if combined.The first is that sta"
  },
  {
    "path": "data/PaulGrahamEssays/richnow.txt",
    "chars": 14565,
    "preview": "April 2021Every year since 1982, Forbes magazine has published a list of the\nrichest Americans. If we compare the 100 ri"
  },
  {
    "path": "data/PaulGrahamEssays/road.txt",
    "chars": 68899,
    "preview": "September 2001\n(This article explains why much of the next generation of software\nmay be server-based, what that will me"
  },
  {
    "path": "data/PaulGrahamEssays/ronco.txt",
    "chars": 3471,
    "preview": "January 2015No one, VC or angel, has invested in more of the top startups than\nRon Conway.  He knows what happened in ev"
  },
  {
    "path": "data/PaulGrahamEssays/rootsoflisp.txt",
    "chars": 2056,
    "preview": "May 2001\n\n(I wrote this article to help myself understand exactly\nwhat McCarthy discovered.  You don't need to know this"
  },
  {
    "path": "data/PaulGrahamEssays/rss.txt",
    "chars": 55,
    "preview": "Aaron Swartz created a scraped\nfeed\nof the essays page."
  },
  {
    "path": "data/PaulGrahamEssays/safe.txt",
    "chars": 4392,
    "preview": "August 2015I recently got an email from a founder that helped me understand\nsomething important: why it's safe for start"
  }
]

// ... and 160 more files (download for full content)

About this extraction

This page contains the full source code of the gkamradt/langchain-tutorials GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 360 files (55.9 MB), approximately 1.6M tokens, and a symbol index with 147 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Extract another repo