[
  {
    "path": ".gitignore",
    "content": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\npip-wheel-metadata/\nshare/python-wheels/\n*.egg-info/\n.installed.cfg\n*.egg\nMANIFEST\n\n# PyInstaller\n#  Usually these files are written by a python script from a template\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.nox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*.cover\n*.py,cover\n.hypothesis/\n.pytest_cache/\n\n# Translations\n*.mo\n*.pot\n\n# Django stuff:\n*.log\nlocal_settings.py\ndb.sqlite3\ndb.sqlite3-journal\n\n# Flask stuff:\ninstance/\n.webassets-cache\n\n# Scrapy stuff:\n.scrapy\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\ntarget/\n\n# Jupyter Notebook\n.ipynb_checkpoints\n\n# IPython\nprofile_default/\nipython_config.py\n\n# pyenv\n.python-version\n\n# pipenv\n#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.\n#   However, in case of collaboration, if having platform-specific dependencies or dependencies\n#   having no cross-platform support, pipenv may install dependencies that don't work, or not\n#   install all needed dependencies.\n#Pipfile.lock\n\n# PEP 582; used by e.g. github.com/David-OConnor/pyflow\n__pypackages__/\n\n# Celery stuff\ncelerybeat-schedule\ncelerybeat.pid\n\n# SageMath parsed files\n*.sage.py\n\n# Environments\n.env\n.venv\nenv/\nvenv/\nENV/\nenv.bak/\nvenv.bak/\n\n# Spyder project settings\n.spyderproject\n.spyproject\n\n# Rope project settings\n.ropeproject\n\n# mkdocs documentation\n/site\n\n# mypy\n.mypy_cache/\n.dmypy.json\ndmypy.json\n\n# Pyre type checker\n.pyre/\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2022 Ivan Bestvina\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "MANIFEST.in",
    "content": "include README.rst\n"
  },
  {
    "path": "README.md",
    "content": "<p align=\"center\">\n  <img width=\"300\" src=\"https://raw.githubusercontent.com/ibestvina/datasloth/main/media/datasloth.png\">\n</p>\n\n# DataSloth\n_Natural language Pandas queries and data generation powered by GPT-3_\n\n\n<p align=\"center\">\n  <img width=\"800\" src=\"https://raw.githubusercontent.com/ibestvina/datasloth/main/media/quick_example.png\">\n</p>\n\n\n## Installation\n`pip install datasloth`\n\n## Usage\n\nIn order for DataSloth to work, you must have a working [OpenAI API key](https://beta.openai.com/account/api-keys) set in your environment variable, or provide it to the DataSloth object. For more info, refer to this [guide](https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety).\n\nDataSloth automatically discovers all Pandas dataframes in your namespace (filtering out names starting with an underscode). Before you load any data, import DataSloth and create the `sloth`:\n\n```python\nfrom datasloth import DataSloth\nsloth = DataSloth()\n```\n\nNext, load any data you want to use. Try naming your dataframes and columns in a meaningful way, as DataSloth uses these names to understand what the data is about.\n\nOnce your data is loaded, simply run\n\n`sloth.query('...')`\n\nto query the data.\n\n\n### Improving results\n\nTo improve the results, you can set custom descriptions of your tables:\n\n`df.sloth.description = 'Verbose description of the table'`\n\nBy default, table descriptions consist of information about each column in the table. You can include this default description in your custom one by adding a `{COLUMNS_SUMMARY}` placeholder. See the detailed example notebook in the examples folder for more information.\n\n### Solving issues\n\nA lot of times, if the returned data is not correct, or not fully formatted the way you want, it helps to rephrase the question or give specific pointers to how the final data should look like. 
To better understand where things might have gone wrong, pass `show_query=True` to `sloth.query()`, or run `sloth.show_last_query()` after the query has finished to print the generated SQL query (without rerunning the engine).\n\n## Data generation\n\nDataSloth is also able to generate random data with the `generate` function. For example, running:\n```python\nsloth.generate(\n    description=\"people from Mars, with very space-sounding names, and strange taste in ice cream\", \n    columns=['First Name', 'Last Name', 'Date Of Birth', 'Country', 'City', 'Favourite Ice Cream'],\n    n_rows=15\n)\n```\nproduces something like this:\n\n| First Name | Last Name | Date Of Birth | Country |             City | Favourite Ice Cream |\n|-----------:|----------:|--------------:|--------:|-----------------:|--------------------:|\n|     Glorza |    Mangal |    06/12/2079 |    Mars |      Pryus Mater |   Celestial Delight |\n|      Yalza |     Krang |    09/21/2084 |    Mars | Valles Marineris |           Moon Mist |\n|     Tralza |     Vomar |    04/17/2074 |    Mars |     Syrtis Major |        Mars Mud Pie |\n|      Dalza |     Ralad |    01/02/2088 |    Mars |  Hellas Planitia |     Alien Abduction |\n|      Halza |     Wular |    11/04/2092 |    Mars |     Olympus Mons |     Martian Sunrise |\n\nNote that the results of the `generate` function are random, and different on each call.\n"
  },
  {
    "path": "README.rst",
    "content": "\nDataSloth\n=========\n\n*Natural language Pandas queries and data generation powered by GPT-3*\n\n\nInstallation\n------------\n\n``pip install datasloth``\n\nUsage\n-----\n\nIn order for DataSloth to work, you must have a working `OpenAI API\nkey <https://beta.openai.com/account/api-keys>`__ set in your\nenvironment variable, or provide it to the DataSloth object. For more\ninfo, refer to this\n`guide <https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety>`__.\n\nDataSloth automatically discovers all Pandas dataframes in your\nnamespace (filtering out names starting with an underscode). Before you\nload any data, import DataSloth and create the ``sloth``:\n\n.. code:: python\n\n   from datasloth import DataSloth\n   sloth = DataSloth()\n\nNext, load any data you want to use. Try naming your dataframes and\ncolumns in a meaningful way, as DataSloth uses these names to understand\nwhat the data is about.\n\nOnce your data is loaded, simply run\n\n``sloth.query('...')``\n\nto query the data.\n"
  },
  {
    "path": "datasloth/__init__.py",
    "content": "import os\nimport inspect\nimport re\nimport pandas as pd\nfrom pandas.api.extensions import register_dataframe_accessor\nfrom pandas.api.types import is_string_dtype, is_numeric_dtype, is_datetime64_any_dtype\nfrom sqlalchemy import desc\nfrom pandasql import sqldf, PandaSQLException\nimport openai\n\n\n@pd.api.extensions.register_dataframe_accessor(\"sloth\")\nclass SlothAccessor:\n    \"\"\"\n    Pandas Dataframe accessor to add '.sloth.description' field to dataframes,\n    and manage column summaries used by DataSloth.\n    \"\"\"\n    def __init__(self, pandas_obj: pd.DataFrame) -> None:\n        self._validate(pandas_obj)\n        self._obj = pandas_obj\n        self._description = '{COLUMNS_SUMMARY}'\n\n    @staticmethod\n    def _validate(obj):\n        pass\n\n    @property\n    def description(self) -> str:\n        return self._description.format(COLUMNS_SUMMARY=self.columns_summary())\n\n    @description.setter\n    def description(self, value: str) -> None:\n        \"\"\"\n        Set additional description manually to inform the language engine about this table.\n        Use '{COLUMNS_SUMMARY}' to include the default column summary in the description.\n        By default, description is set only to this summary. 
To reset it, set description to None.\n        \"\"\"\n        if value is None:\n            self._description = '{COLUMNS_SUMMARY}'\n        else:\n            self._description = value\n    \n    def columns_summary(self) -> str:\n        \"\"\"\n        Returns columns summary of the dataframe, in the \"table\" format containing\n        column names, data types and additional info about columns.\n        \"\"\"\n        summary_lines = ['|column name|data type|info|']\n        for col_name in self._obj:\n            col = self._obj[col_name]\n            summary_lines.append(f'|{col_name}|{col.dtype}|{column_info(col)}|')\n        return '\\n'.join(summary_lines)\n        \n    \nclass DataSloth():\n    prompt_format = \"\"\"\n\nMake sure to join in tables if information from multiple tables is needed for a task.\n\nTask: percentage of True values of column X in table Y\n```\nSQL query for SQLite:\nSELECT (SUM(CASE WHEN X = 'True' THEN 1.0 END) / COUNT(*)) * 100 AS percentage\nFROM Y\n```\n\nTask: count of rows in table T where date is equal to 11th of August 1993\n```\nSQL query for SQLite:\nSELECT COUNT(*) AS row_count\nFROM T\nWHERE date(date) = date('1993-08-11')\n```\n\nTask: {QUERY}\nSQL query for SQLite:\n```\n\"\"\"\n\n    def __init__(self, openai_api_key=None) -> None:\n        if openai_api_key:\n            openai.api_key = openai_api_key\n        else:\n            openai.api_key = os.getenv(\"OPENAI_API_KEY\")\n        if not openai.api_key:\n            raise Exception(\n                \"OpenAI API key is not set. 
Either pass it to DataSloth(openai_api_key='...'), \"\\\n                \"set openai.api_key = '...', or set the OPENAI_API_KEY environment variable.\"\n            )\n        self.last_prompt = None\n        self.last_gpt_response = None\n\n    @staticmethod\n    def dataframes_summary(env=None, ignore='^_') -> str:\n        \"\"\"\n        Summary of all DataFrames available in the namespace, ignoring those matching the 'ignore' regex.\n        \"\"\"\n        summary_lines = ['Tables available in the database, with their additional information, are:']\n        table_count = 0\n        # Treat a missing env as empty instead of failing on None.items()\n        for name, value in (env or {}).items():\n            if isinstance(value, pd.DataFrame) and (not ignore or not re.match(ignore, name)):\n                summary_lines += [\n                    f\"\\n\\nTable name: {name}\",\n                    value.sloth.description\n                ]\n                table_count += 1\n        if not table_count:\n            return None\n        return '\\n'.join(summary_lines)\n\n    def query(self, query, env=None, show_query=False):\n        \"\"\"\n        Query all Pandas DataFrames available in the namespace with a natural language query.\n        To limit the tables used in the query, set the 'env' variable to a dict of tables\n        (keys are table names, and values are table objects), or set it to globals() or locals().\n        To learn more, check pandasql docs.\n        \"\"\"\n        env = env or get_outer_frame_variables()\n        # Lowercase the first character; slicing keeps this safe for an empty string\n        query = query[:1].lower() + query[1:]\n        prompt = self.dataframes_summary(env)\n        if not prompt:\n            print('No dataframes found')\n            return\n        prompt += DataSloth.prompt_format.format(QUERY=query)\n        response = openai.Completion.create(\n            model=\"gpt-3.5-turbo-instruct\", # as per OpenAI deprecations guide: https://platform.openai.com/docs/deprecations/instructgpt-models\n            prompt=prompt,\n            temperature=0,\n            max_tokens=1000,\n  
          top_p=1,\n            frequency_penalty=0,\n            presence_penalty=0,\n            stop=[\"\\n```\\n\"]\n        )\n        sql_query = response['choices'][0]['text']\n        sql_query = sql_query.replace('```', '')\n        self.last_prompt = (prompt, sql_query)\n        if show_query:\n            print(sql_query)\n        try:\n            result = sqldf(sql_query, env)\n        except PandaSQLException:\n            result = None\n            print('Unsuccessful. Try rephrasing your query, or add additional table descriptions in df.sloth.description.')\n            print('You can inspect the generated prompt and GPT response in sloth.show_last_prompt().')\n        return result\n\n    def generate(self, description, columns, n_rows=10):\n        \"\"\"\n        Generates a random dataset based on the description and a list of columns.\n        \"\"\"\n        rows = []\n        while len(rows) < n_rows:\n            prompt = f'Fill the table below with {min(n_rows - len(rows) + 5, 30)} random rows about {description}\\n\\n'\n            prompt += f\"|{'|'.join(columns)}|\\n\"\n            prompt += f\"|{'|'.join(['-'*len(col) for col in columns])}|\\n|\"\n            response = openai.Completion.create(\n                model=\"gpt-3.5-turbo-instruct\", # as per OpenAI deprecations guide: https://platform.openai.com/docs/deprecations/instructgpt-models\n                prompt=prompt,\n                temperature=0.8,\n                max_tokens=1000,\n                top_p=1,\n                frequency_penalty=0,\n                presence_penalty=0,\n            )\n            response = '|' + response['choices'][0]['text']\n            new_rows = [row[1:-1].split('|') for row in response.split('\\n') if not re.match('^[- |]*$', row)]\n            new_rows = [row for row in new_rows if len(row) == len(columns)]\n            rows += new_rows\n            prompt = response + prompt\n\n        df = pd.DataFrame(rows, 
columns=columns).head(n_rows)\n        return df\n\n    def show_last_prompt(self):\n        \"\"\"Print the prompt and the GPT response from the last sloth.query() call.\"\"\"\n        if self.last_prompt:\n            print(self.last_prompt[0])\n            print(f'[->]\\n{self.last_prompt[1]}')\n\n    # Alias kept so that existing calls to the old private name keep working\n    _last_prompt = show_last_prompt\n\n    def show_last_query(self):\n        \"\"\"Print the SQL query generated in the last sloth.query() call.\"\"\"\n        if self.last_prompt:\n            print(self.last_prompt[1])\n\n# Code copied from pandasql\ndef get_outer_frame_variables():\n    \"\"\" Get a dict of local and global variables of the first outer frame from another file. \"\"\"\n    cur_filename = inspect.getframeinfo(inspect.currentframe()).filename\n    outer_frame = next(f\n                       for f in inspect.getouterframes(inspect.currentframe())\n                       if f.filename != cur_filename)\n    variables = {}\n    variables.update(outer_frame.frame.f_globals)\n    variables.update(outer_frame.frame.f_locals)\n    return variables\n\ndef column_info(col):\n    \"\"\"Info about a specific column, different depending on its type\"\"\"\n    if is_string_dtype(col) or col.dtype == 'category':\n        unique = col.unique().tolist()\n        summary = 'unique values: ' + ', '.join(map(str, unique[:30]))\n        if len(unique) > 30:\n            summary += '...'\n    elif col.dtype == 'bool':\n        summary = \"values: 0, 1\"\n    elif is_numeric_dtype(col):\n        summary = f\"min={col.min()}, max={col.max()}\"\n    elif is_datetime64_any_dtype(col):\n        summary = f\"first={col.min()}, last={col.max()}\"\n    else:\n        summary = ''\n    return summary\n"
  },
  {
    "path": "examples/datasloth_detailed_example.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# DataSloth\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"from datasloth import DataSloth\\n\",\n    \"import pandas as pd\\n\",\n    \"import seaborn as sns\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"# Make sure your OpenAI API key is set in the OPENAI_API_KEY env variable, or provide it as an argument to DataSloth()\\n\",\n    \"sloth = DataSloth()\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>survived</th>\\n\",\n       \"      <th>pclass</th>\\n\",\n       \"      <th>sex</th>\\n\",\n       \"      <th>age</th>\\n\",\n       \"      <th>sibsp</th>\\n\",\n       \"      <th>parch</th>\\n\",\n       \"      <th>fare</th>\\n\",\n       \"      <th>embarked</th>\\n\",\n       \"      <th>class</th>\\n\",\n       \"      <th>who</th>\\n\",\n       \"      <th>adult_male</th>\\n\",\n       \"      <th>deck</th>\\n\",\n       \"      <th>embark_town</th>\\n\",\n       \"      
<th>alive</th>\\n\",\n       \"      <th>alone</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>3</td>\\n\",\n       \"      <td>male</td>\\n\",\n       \"      <td>22.0</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>7.2500</td>\\n\",\n       \"      <td>S</td>\\n\",\n       \"      <td>Third</td>\\n\",\n       \"      <td>man</td>\\n\",\n       \"      <td>True</td>\\n\",\n       \"      <td>NaN</td>\\n\",\n       \"      <td>Southampton</td>\\n\",\n       \"      <td>no</td>\\n\",\n       \"      <td>False</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>1</th>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>female</td>\\n\",\n       \"      <td>38.0</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>71.2833</td>\\n\",\n       \"      <td>C</td>\\n\",\n       \"      <td>First</td>\\n\",\n       \"      <td>woman</td>\\n\",\n       \"      <td>False</td>\\n\",\n       \"      <td>C</td>\\n\",\n       \"      <td>Cherbourg</td>\\n\",\n       \"      <td>yes</td>\\n\",\n       \"      <td>False</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>2</th>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>3</td>\\n\",\n       \"      <td>female</td>\\n\",\n       \"      <td>26.0</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>7.9250</td>\\n\",\n       \"      <td>S</td>\\n\",\n       \"      <td>Third</td>\\n\",\n       \"      <td>woman</td>\\n\",\n       \"      <td>False</td>\\n\",\n       \"      <td>NaN</td>\\n\",\n       \"      <td>Southampton</td>\\n\",\n       \"      <td>yes</td>\\n\",\n       \"      <td>True</td>\\n\",\n       \"    </tr>\\n\",\n      
 \"    <tr>\\n\",\n       \"      <th>3</th>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>female</td>\\n\",\n       \"      <td>35.0</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>53.1000</td>\\n\",\n       \"      <td>S</td>\\n\",\n       \"      <td>First</td>\\n\",\n       \"      <td>woman</td>\\n\",\n       \"      <td>False</td>\\n\",\n       \"      <td>C</td>\\n\",\n       \"      <td>Southampton</td>\\n\",\n       \"      <td>yes</td>\\n\",\n       \"      <td>False</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>4</th>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>3</td>\\n\",\n       \"      <td>male</td>\\n\",\n       \"      <td>35.0</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>8.0500</td>\\n\",\n       \"      <td>S</td>\\n\",\n       \"      <td>Third</td>\\n\",\n       \"      <td>man</td>\\n\",\n       \"      <td>True</td>\\n\",\n       \"      <td>NaN</td>\\n\",\n       \"      <td>Southampton</td>\\n\",\n       \"      <td>no</td>\\n\",\n       \"      <td>True</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"   survived  pclass     sex   age  sibsp  parch     fare embarked  class  \\\\\\n\",\n       \"0         0       3    male  22.0      1      0   7.2500        S  Third   \\n\",\n       \"1         1       1  female  38.0      1      0  71.2833        C  First   \\n\",\n       \"2         1       3  female  26.0      0      0   7.9250        S  Third   \\n\",\n       \"3         1       1  female  35.0      1      0  53.1000        S  First   \\n\",\n       \"4         0       3    male  35.0      0      0   8.0500        S  Third   \\n\",\n       \"\\n\",\n       \"     who  adult_male deck  embark_town alive  alone  \\n\",\n       \"0    man     
   True  NaN  Southampton    no  False  \\n\",\n       \"1  woman       False    C    Cherbourg   yes  False  \\n\",\n       \"2  woman       False  NaN  Southampton   yes   True  \\n\",\n       \"3  woman       False    C  Southampton   yes  False  \\n\",\n       \"4    man        True  NaN  Southampton    no   True  \"\n      ]\n     },\n     \"execution_count\": 2,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# Main dataset to show datasloth capabilities\\n\",\n    \"titanic = sns.load_dataset('titanic')\\n\",\n    \"titanic.head()\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 4,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"SELECT COUNT(*) AS survived_men\\n\",\n      \"FROM titanic\\n\",\n      \"WHERE sex = 'male' AND survived = 1\\n\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>survived_men</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>109</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"   
survived_men\\n\",\n       \"0           109\"\n      ]\n     },\n     \"execution_count\": 4,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# Example 1: we do not need to specify exact labels in our data. Here, 'men' is automatically converted to 'male'.\\n\",\n    \"sloth.query(\\\"Number of men which survived the titanic\\\", show_query=True)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 5,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"SELECT AVG(fare) AS avg_fare\\n\",\n      \"FROM titanic\\n\",\n      \"WHERE alone = 1 AND sex = 'male'\\n\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>avg_fare</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>16.713358</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"    avg_fare\\n\",\n       \"0  16.713358\"\n      ]\n     },\n     \"execution_count\": 5,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# 
Example 2: loosely specified statistics\\n\",\n    \"sloth.query(\\\"Average fare paid by men who traveled alone\\\", show_query=True)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 6,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"SELECT (SUM(CASE WHEN survived = 1 AND sex = 'male' THEN 1.0 END) / COUNT(*)) * 100 AS percentage\\n\",\n      \"FROM titanic\\n\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>percentage</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>12.233446</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"   percentage\\n\",\n       \"0   12.233446\"\n      ]\n     },\n     \"execution_count\": 6,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# Example 3: more complex stats\\n\",\n    \"sloth.query(\\\"Percentage of male survivors\\\", show_query=True)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 7,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": 
\"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"SELECT sex, (SUM(CASE WHEN survived = 1 THEN 1.0 END) / COUNT(*)) * 100 AS percentage\\n\",\n      \"FROM titanic\\n\",\n      \"GROUP BY sex\\n\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>sex</th>\\n\",\n       \"      <th>percentage</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>female</td>\\n\",\n       \"      <td>74.203822</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>1</th>\\n\",\n       \"      <td>male</td>\\n\",\n       \"      <td>18.890815</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"      sex  percentage\\n\",\n       \"0  female   74.203822\\n\",\n       \"1    male   18.890815\"\n      ]\n     },\n     \"execution_count\": 7,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# Example 4: group aggregations\\n\",\n    \"sloth.query(\\\"Calculate the percentage of survivors per sex\\\", show_query=True)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 8,\n   \"metadata\": 
{},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>pclass</th>\\n\",\n       \"      <th>meal_type</th>\\n\",\n       \"      <th>n_courses</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>breakfast</td>\\n\",\n       \"      <td>10</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>1</th>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>lunch</td>\\n\",\n       \"      <td>15</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>2</th>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>dinner</td>\\n\",\n       \"      <td>20</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>3</th>\\n\",\n       \"      <td>2</td>\\n\",\n       \"      <td>breakfast</td>\\n\",\n       \"      <td>5</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>4</th>\\n\",\n       \"      <td>2</td>\\n\",\n       \"      <td>lunch</td>\\n\",\n       \"      <td>6</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>5</th>\\n\",\n       \"      <td>2</td>\\n\",\n       \"      <td>dinner</td>\\n\",\n       \"   
   <td>7</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>6</th>\\n\",\n       \"      <td>3</td>\\n\",\n       \"      <td>breakfast</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>7</th>\\n\",\n       \"      <td>3</td>\\n\",\n       \"      <td>lunch</td>\\n\",\n       \"      <td>2</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>8</th>\\n\",\n       \"      <td>3</td>\\n\",\n       \"      <td>dinner</td>\\n\",\n       \"      <td>3</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"   pclass  meal_type  n_courses\\n\",\n       \"0       1  breakfast         10\\n\",\n       \"1       1      lunch         15\\n\",\n       \"2       1     dinner         20\\n\",\n       \"3       2  breakfast          5\\n\",\n       \"4       2      lunch          6\\n\",\n       \"5       2     dinner          7\\n\",\n       \"6       3  breakfast          1\\n\",\n       \"7       3      lunch          2\\n\",\n       \"8       3     dinner          3\"\n      ]\n     },\n     \"execution_count\": 8,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# Introducing another dataframe into the namespace\\n\",\n    \"classes = pd.DataFrame({\\n\",\n    \"    'pclass': [1, 1, 1, 2, 2, 2, 3, 3, 3],\\n\",\n    \"    'meal_type': ['breakfast', 'lunch', 'dinner'] * 3, \\n\",\n    \"    'n_courses': [10, 15, 20, 5, 6, 7, 1, 2, 3]\\n\",\n    \"})\\n\",\n    \"classes\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 9,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"SELECT sex, (SUM(CASE WHEN survived = '1' THEN 1.0 END) / COUNT(*)) * 100 AS percentage\\n\",\n      \"FROM titanic\\n\",\n    
  \"JOIN classes ON titanic.pclass = classes.pclass\\n\",\n      \"WHERE meal_type = 'breakfast' AND n_courses > 5\\n\",\n      \"GROUP BY sex\\n\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>sex</th>\\n\",\n       \"      <th>percentage</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>female</td>\\n\",\n       \"      <td>96.808511</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>1</th>\\n\",\n       \"      <td>male</td>\\n\",\n       \"      <td>36.885246</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"      sex  percentage\\n\",\n       \"0  female   96.808511\\n\",\n       \"1    male   36.885246\"\n      ]\n     },\n     \"execution_count\": 9,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# Example 5: automatically joining with other tables in the namespace\\n\",\n    \"sloth.query(\\\"Calculate the percentage of survivors of people who had more than 5 courses for breakfast. 
Do it per sex.\\\", show_query=True)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 10,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>code</th>\\n\",\n       \"      <th>date</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>S</td>\\n\",\n       \"      <td>1912-04-10</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>1</th>\\n\",\n       \"      <td>C</td>\\n\",\n       \"      <td>1912-04-10</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>2</th>\\n\",\n       \"      <td>Q</td>\\n\",\n       \"      <td>1912-04-11</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"  code       date\\n\",\n       \"0    S 1912-04-10\\n\",\n       \"1    C 1912-04-10\\n\",\n       \"2    Q 1912-04-11\"\n      ]\n     },\n     \"execution_count\": 10,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# Another table, with departure dates from each port\\n\",\n    \"# Note that the table and column names do not explain what the 
information is about\\n\",\n    \"table_por_dep = pd.DataFrame({'code': ['S', 'C', 'Q'], 'date': pd.to_datetime(['1912-04-10', '1912-04-10', '1912-04-11'])})\\n\",\n    \"table_por_dep\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 11,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"SELECT COUNT(*) AS female_passengers\\n\",\n      \"FROM titanic\\n\",\n      \"WHERE sex = 'female'\\n\",\n      \"AND date(embarked) = date('1912-04-11')\\n\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>female_passengers</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>0</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"   female_passengers\\n\",\n       \"0                  0\"\n      ]\n     },\n     \"execution_count\": 11,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# Sloth is not able to make the connection correctly, as it does not know that departure dates are stored in that other table\\n\",\n    
\"sloth.query(\\\"Count female passengers who departed on 11th of April\\\", show_query=True)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 12,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"SELECT COUNT(*) AS female_passengers\\n\",\n      \"FROM titanic\\n\",\n      \"INNER JOIN table_por_dep ON titanic.embarked = table_por_dep.code\\n\",\n      \"WHERE date(table_por_dep.date) = date('1912-04-11')\\n\",\n      \"AND titanic.sex = 'female'\\n\"\n     ]\n    },\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>female_passengers</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>36</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"   female_passengers\\n\",\n       \"0                 36\"\n      ]\n     },\n     \"execution_count\": 12,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# To help, we add the table description\\n\",\n    \"# Note the use of a COLUMNS_SUMMARY placeholder to still keep the default description 
in.\\n\",\n    \"table_por_dep.sloth.description = \\\\\\n\",\n    \"\\\"Departure date table, to be joined to the main Titanic table on the 'embarked' code. \\\\n{COLUMNS_SUMMARY}\\\"\\n\",\n    \"\\n\",\n    \"sloth.query(\\\"Count female passengers who departed from their port on 11th of April\\\", show_query=True)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": []\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Data generation\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 7,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>First Name</th>\\n\",\n       \"      <th>Last Name</th>\\n\",\n       \"      <th>Date Of Birth</th>\\n\",\n       \"      <th>Country</th>\\n\",\n       \"      <th>City</th>\\n\",\n       \"      <th>Favourite Ice Cream</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>Glorza</td>\\n\",\n       \"      <td>Mangal</td>\\n\",\n       \"      <td>06/12/2079</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Pryus Mater</td>\\n\",\n       \"      <td>Celestial 
Delight</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>1</th>\\n\",\n       \"      <td>Yalza</td>\\n\",\n       \"      <td>Krang</td>\\n\",\n       \"      <td>09/21/2084</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Valles Marineris</td>\\n\",\n       \"      <td>Moon Mist</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>2</th>\\n\",\n       \"      <td>Tralza</td>\\n\",\n       \"      <td>Vomar</td>\\n\",\n       \"      <td>04/17/2074</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Syrtis Major</td>\\n\",\n       \"      <td>Mars Mud Pie</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>3</th>\\n\",\n       \"      <td>Dalza</td>\\n\",\n       \"      <td>Ralad</td>\\n\",\n       \"      <td>01/02/2088</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Hellas Planitia</td>\\n\",\n       \"      <td>Alien Abduction</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>4</th>\\n\",\n       \"      <td>Halza</td>\\n\",\n       \"      <td>Wular</td>\\n\",\n       \"      <td>11/04/2092</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Olympus Mons</td>\\n\",\n       \"      <td>Martian Sunrise</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>5</th>\\n\",\n       \"      <td>Kalza</td>\\n\",\n       \"      <td>Lopal</td>\\n\",\n       \"      <td>03/09/2073</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Ares Vallis</td>\\n\",\n       \"      <td>Red Planet</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>6</th>\\n\",\n       \"      <td>Malza</td>\\n\",\n       \"      <td>Bomar</td>\\n\",\n       \"      <td>07/14/2081</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Terra Cimmeria</td>\\n\",\n       \"      <td>Mars Bar</td>\\n\",\n       \"    </tr>\\n\",\n    
   \"    <tr>\\n\",\n       \"      <th>7</th>\\n\",\n       \"      <td>Nalza</td>\\n\",\n       \"      <td>Kamar</td>\\n\",\n       \"      <td>12/25/2085</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Utopia Planitia</td>\\n\",\n       \"      <td>Espresso crunch</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>8</th>\\n\",\n       \"      <td>Ralza</td>\\n\",\n       \"      <td>Fomar</td>\\n\",\n       \"      <td>02/11/2070</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Arsia Mons</td>\\n\",\n       \"      <td>Cotton candy</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>9</th>\\n\",\n       \"      <td>Salza</td>\\n\",\n       \"      <td>Soldar</td>\\n\",\n       \"      <td>05/16/2078</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Tharsis Montes</td>\\n\",\n       \"      <td>Butterscotch</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>10</th>\\n\",\n       \"      <td>Talza</td>\\n\",\n       \"      <td>Womar</td>\\n\",\n       \"      <td>10/28/2080</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Mangala Valles</td>\\n\",\n       \"      <td>Cookies and Cream</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>11</th>\\n\",\n       \"      <td>Ulza</td>\\n\",\n       \"      <td>Dalad</td>\\n\",\n       \"      <td>06/01/2072</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Elysium Planitia</td>\\n\",\n       \"      <td>Green Tea</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>12</th>\\n\",\n       \"      <td>Vulza</td>\\n\",\n       \"      <td>Ropal</td>\\n\",\n       \"      <td>04/14/2087</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Cydonia Mensae</td>\\n\",\n       \"      <td>Mint chocolate chip</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       
\"      <th>13</th>\\n\",\n       \"      <td>Zalza</td>\\n\",\n       \"      <td>Bular</td>\\n\",\n       \"      <td>07/11/2089</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Isidis Planitia</td>\\n\",\n       \"      <td>Rocky Road</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>14</th>\\n\",\n       \"      <td>Blorza</td>\\n\",\n       \"      <td>Fomar</td>\\n\",\n       \"      <td>09/08/2076</td>\\n\",\n       \"      <td>Mars</td>\\n\",\n       \"      <td>Tempe Terra</td>\\n\",\n       \"      <td>Vanilla</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"   First Name Last Name Date Of Birth Country              City  \\\\\\n\",\n       \"0      Glorza    Mangal    06/12/2079    Mars       Pryus Mater   \\n\",\n       \"1       Yalza     Krang    09/21/2084    Mars  Valles Marineris   \\n\",\n       \"2      Tralza     Vomar    04/17/2074    Mars      Syrtis Major   \\n\",\n       \"3       Dalza     Ralad    01/02/2088    Mars   Hellas Planitia   \\n\",\n       \"4       Halza     Wular    11/04/2092    Mars      Olympus Mons   \\n\",\n       \"5       Kalza     Lopal    03/09/2073    Mars       Ares Vallis   \\n\",\n       \"6       Malza     Bomar    07/14/2081    Mars    Terra Cimmeria   \\n\",\n       \"7       Nalza     Kamar    12/25/2085    Mars   Utopia Planitia   \\n\",\n       \"8       Ralza     Fomar    02/11/2070    Mars        Arsia Mons   \\n\",\n       \"9       Salza    Soldar    05/16/2078    Mars    Tharsis Montes   \\n\",\n       \"10      Talza     Womar    10/28/2080    Mars    Mangala Valles   \\n\",\n       \"11       Ulza     Dalad    06/01/2072    Mars  Elysium Planitia   \\n\",\n       \"12      Vulza     Ropal    04/14/2087    Mars    Cydonia Mensae   \\n\",\n       \"13      Zalza     Bular    07/11/2089    Mars   Isidis Planitia   \\n\",\n       \"14     Blorza     
Fomar    09/08/2076    Mars       Tempe Terra   \\n\",\n       \"\\n\",\n       \"    Favourite Ice Cream  \\n\",\n       \"0     Celestial Delight  \\n\",\n       \"1             Moon Mist  \\n\",\n       \"2          Mars Mud Pie  \\n\",\n       \"3       Alien Abduction  \\n\",\n       \"4       Martian Sunrise  \\n\",\n       \"5            Red Planet  \\n\",\n       \"6              Mars Bar  \\n\",\n       \"7       Espresso crunch  \\n\",\n       \"8          Cotton candy  \\n\",\n       \"9          Butterscotch  \\n\",\n       \"10    Cookies and Cream  \\n\",\n       \"11            Green Tea  \\n\",\n       \"12  Mint chocolate chip  \\n\",\n       \"13           Rocky Road  \\n\",\n       \"14              Vanilla  \"\n      ]\n     },\n     \"execution_count\": 7,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# Given a table description and a list of columns, DataSloth can generate some random data\\n\",\n    \"sloth.generate(\\n\",\n    \"    \\\"people from Mars, with very space-sounding names, and strange taste in ice cream\\\", \\n\",\n    \"    ['First Name', 'Last Name', 'Date Of Birth', 'Country', 'City', 'Favourite Ice Cream'],\\n\",\n    \"    n_rows=15\\n\",\n    \")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": []\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Python 3 (ipykernel)\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.10.4\"\n  },\n  \"vscode\": {\n   \"interpreter\": {\n    \"hash\": 
\"fa2753a9fc1c7a7f868f370d31058bd0275fd3cd078c4899cfafe3ad2d226086\"\n   }\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 4\n}\n"
  },
  {
    "path": "examples/datasloth_quick_example.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"from datasloth import DataSloth\\n\",\n    \"sloth = DataSloth()\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>survived</th>\\n\",\n       \"      <th>pclass</th>\\n\",\n       \"      <th>sex</th>\\n\",\n       \"      <th>age</th>\\n\",\n       \"      <th>sibsp</th>\\n\",\n       \"      <th>parch</th>\\n\",\n       \"      <th>fare</th>\\n\",\n       \"      <th>embarked</th>\\n\",\n       \"      <th>class</th>\\n\",\n       \"      <th>who</th>\\n\",\n       \"      <th>adult_male</th>\\n\",\n       \"      <th>deck</th>\\n\",\n       \"      <th>embark_town</th>\\n\",\n       \"      <th>alive</th>\\n\",\n       \"      <th>alone</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>3</td>\\n\",\n       \"      <td>male</td>\\n\",\n       \"      <td>22.0</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>7.2500</td>\\n\",\n       \"      
<td>S</td>\\n\",\n       \"      <td>Third</td>\\n\",\n       \"      <td>man</td>\\n\",\n       \"      <td>True</td>\\n\",\n       \"      <td>NaN</td>\\n\",\n       \"      <td>Southampton</td>\\n\",\n       \"      <td>no</td>\\n\",\n       \"      <td>False</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>1</th>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>female</td>\\n\",\n       \"      <td>38.0</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>71.2833</td>\\n\",\n       \"      <td>C</td>\\n\",\n       \"      <td>First</td>\\n\",\n       \"      <td>woman</td>\\n\",\n       \"      <td>False</td>\\n\",\n       \"      <td>C</td>\\n\",\n       \"      <td>Cherbourg</td>\\n\",\n       \"      <td>yes</td>\\n\",\n       \"      <td>False</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"   survived  pclass     sex   age  sibsp  parch     fare embarked  class  \\\\\\n\",\n       \"0         0       3    male  22.0      1      0   7.2500        S  Third   \\n\",\n       \"1         1       1  female  38.0      1      0  71.2833        C  First   \\n\",\n       \"\\n\",\n       \"     who  adult_male deck  embark_town alive  alone  \\n\",\n       \"0    man        True  NaN  Southampton    no  False  \\n\",\n       \"1  woman       False    C    Cherbourg   yes  False  \"\n      ]\n     },\n     \"execution_count\": 2,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"import seaborn as sns\\n\",\n    \"titanic = sns.load_dataset('titanic')\\n\",\n    \"titanic.head(2)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 3,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style 
scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n       \"      <th></th>\\n\",\n       \"      <th>avg_fare</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>16.713358</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"    avg_fare\\n\",\n       \"0  16.713358\"\n      ]\n     },\n     \"execution_count\": 3,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"sloth.query(\\\"Average fare paid by men who traveled alone\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 4,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/html\": [\n       \"<div>\\n\",\n       \"<style scoped>\\n\",\n       \"    .dataframe tbody tr th:only-of-type {\\n\",\n       \"        vertical-align: middle;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe tbody tr th {\\n\",\n       \"        vertical-align: top;\\n\",\n       \"    }\\n\",\n       \"\\n\",\n       \"    .dataframe thead th {\\n\",\n       \"        text-align: right;\\n\",\n       \"    }\\n\",\n       \"</style>\\n\",\n       \"<table border=\\\"1\\\" class=\\\"dataframe\\\">\\n\",\n       \"  <thead>\\n\",\n       \"    <tr style=\\\"text-align: right;\\\">\\n\",\n  
     \"      <th></th>\\n\",\n       \"      <th>survived</th>\\n\",\n       \"      <th>pclass</th>\\n\",\n       \"      <th>sex</th>\\n\",\n       \"      <th>age</th>\\n\",\n       \"      <th>sibsp</th>\\n\",\n       \"      <th>parch</th>\\n\",\n       \"      <th>fare</th>\\n\",\n       \"      <th>embarked</th>\\n\",\n       \"      <th>class</th>\\n\",\n       \"      <th>who</th>\\n\",\n       \"      <th>adult_male</th>\\n\",\n       \"      <th>deck</th>\\n\",\n       \"      <th>embark_town</th>\\n\",\n       \"      <th>alive</th>\\n\",\n       \"      <th>alone</th>\\n\",\n       \"    </tr>\\n\",\n       \"  </thead>\\n\",\n       \"  <tbody>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>0</th>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>female</td>\\n\",\n       \"      <td>50.0</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>28.7125</td>\\n\",\n       \"      <td>C</td>\\n\",\n       \"      <td>First</td>\\n\",\n       \"      <td>woman</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>C</td>\\n\",\n       \"      <td>Cherbourg</td>\\n\",\n       \"      <td>no</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>1</th>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>female</td>\\n\",\n       \"      <td>2.0</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>2</td>\\n\",\n       \"      <td>151.5500</td>\\n\",\n       \"      <td>S</td>\\n\",\n       \"      <td>First</td>\\n\",\n       \"      <td>child</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>C</td>\\n\",\n       \"      <td>Southampton</td>\\n\",\n       \"      <td>no</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"    </tr>\\n\",\n       \"    <tr>\\n\",\n       \"      <th>2</th>\\n\",\n       \"      <td>0</td>\\n\",\n       \"    
  <td>1</td>\\n\",\n       \"      <td>female</td>\\n\",\n       \"      <td>25.0</td>\\n\",\n       \"      <td>1</td>\\n\",\n       \"      <td>2</td>\\n\",\n       \"      <td>151.5500</td>\\n\",\n       \"      <td>S</td>\\n\",\n       \"      <td>First</td>\\n\",\n       \"      <td>woman</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"      <td>C</td>\\n\",\n       \"      <td>Southampton</td>\\n\",\n       \"      <td>no</td>\\n\",\n       \"      <td>0</td>\\n\",\n       \"    </tr>\\n\",\n       \"  </tbody>\\n\",\n       \"</table>\\n\",\n       \"</div>\"\n      ],\n      \"text/plain\": [\n       \"   survived  pclass     sex   age  sibsp  parch      fare embarked  class  \\\\\\n\",\n       \"0         0       1  female  50.0      0      0   28.7125        C  First   \\n\",\n       \"1         0       1  female   2.0      1      2  151.5500        S  First   \\n\",\n       \"2         0       1  female  25.0      1      2  151.5500        S  First   \\n\",\n       \"\\n\",\n       \"     who  adult_male deck  embark_town alive  alone  \\n\",\n       \"0  woman           0    C    Cherbourg    no      1  \\n\",\n       \"1  child           0    C  Southampton    no      0  \\n\",\n       \"2  woman           0    C  Southampton    no      0  \"\n      ]\n     },\n     \"execution_count\": 4,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"sloth.query(\\\"All first class women who did not survive\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": []\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Python 3 (ipykernel)\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": 
\"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.10.4\"\n  },\n  \"vscode\": {\n   \"interpreter\": {\n    \"hash\": \"fa2753a9fc1c7a7f868f370d31058bd0275fd3cd078c4899cfafe3ad2d226086\"\n   }\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 4\n}\n"
  },
  {
    "path": "setup.cfg",
    "content": "[metadata]\ndescription-file = README.md"
  },
  {
    "path": "setup.py",
    "content": "from setuptools import setup\n\ndef readme():\n    # setup.cfg points at README.md, so read the same file here\n    with open('README.md', encoding='utf-8') as f:\n        return f.read()\n\nsetup(\n    name='datasloth',\n    version='0.4',\n    description='Natural language Pandas queries and data generation',\n    url='https://github.com/ibestvina/datasloth',\n    author='Ivan Bestvina',\n    author_email='ivan.bestvina@gmail.com',\n    license='MIT',\n    packages=['datasloth'],\n    zip_safe=False,\n    install_requires=[\n        'openai',\n        'pandas',\n        'pandasql'\n    ],\n    long_description=readme(),\n    long_description_content_type='text/markdown',\n)"
  }
]