Full Code of ContinuumIO/intake for AI

Repository: ContinuumIO/intake
Branch: master
Commit: a04cec104a9b
Files: 237
Total size: 1.1 MB

Directory structure:
gitextract_g6quolyo/

├── .ci-coveragerc
├── .coveragerc
├── .gitattributes
├── .github/
│   └── workflows/
│       ├── main.yaml
│       ├── pre-commit.yml
│       └── pypipublish.yaml
├── .gitignore
├── .pre-commit-config.yaml
├── LICENSE
├── MANIFEST.in
├── README.md
├── README_refactor.md
├── docs/
│   ├── Makefile
│   ├── README.md
│   ├── environment.yml
│   ├── make.bat
│   ├── make_api.py
│   ├── plugins.py
│   ├── plugins.yaml
│   ├── requirements.txt
│   └── source/
│       ├── _static/
│       │   ├── .keep
│       │   ├── css/
│       │   │   └── custom.css
│       │   └── images/
│       │       └── plotting_example.html
│       ├── api.rst
│       ├── api2.rst
│       ├── api_base.rst
│       ├── api_other.rst
│       ├── api_user.rst
│       ├── catalog.rst
│       ├── changelog.rst
│       ├── code-of-conduct.rst
│       ├── community.rst
│       ├── conf.py
│       ├── contributing.rst
│       ├── data-packages.rst
│       ├── deployments.rst
│       ├── examples.rst
│       ├── glossary.rst
│       ├── gui.rst
│       ├── guide.rst
│       ├── index.rst
│       ├── index_v1.rst
│       ├── making-plugins.rst
│       ├── overview.rst
│       ├── persisting.rst
│       ├── plotting.rst
│       ├── plugin-directory.rst
│       ├── quickstart.rst
│       ├── reference.rst
│       ├── roadmap.rst
│       ├── scope2.rst
│       ├── start.rst
│       ├── tools.rst
│       ├── tour2.rst
│       ├── transforms.rst
│       ├── use_cases.rst
│       ├── user2.rst
│       └── walkthrough2.rst
├── examples/
│   └── Take2.ipynb
├── intake/
│   ├── __init__.py
│   ├── catalog/
│   │   ├── __init__.py
│   │   ├── base.py
│   │   ├── default.py
│   │   ├── entry.py
│   │   ├── exceptions.py
│   │   ├── gui.py
│   │   ├── local.py
│   │   ├── tests/
│   │   │   ├── __init__.py
│   │   │   ├── cache_data/
│   │   │   │   └── states.csv
│   │   │   ├── catalog.yml
│   │   │   ├── catalog1.yml
│   │   │   ├── catalog_alias.yml
│   │   │   ├── catalog_caching.yml
│   │   │   ├── catalog_dup_parameters.yml
│   │   │   ├── catalog_dup_sources.yml
│   │   │   ├── catalog_hierarchy.yml
│   │   │   ├── catalog_named.yml
│   │   │   ├── catalog_non_dict.yml
│   │   │   ├── catalog_search/
│   │   │   │   ├── example_packages/
│   │   │   │   │   ├── ep/
│   │   │   │   │   │   └── __init__.py
│   │   │   │   │   └── ep-0.1.dist-info/
│   │   │   │   │       └── entry_points.txt
│   │   │   │   └── yaml.yml
│   │   │   ├── catalog_union_1.yml
│   │   │   ├── catalog_union_2.yml
│   │   │   ├── conftest.py
│   │   │   ├── data_source_missing.yml
│   │   │   ├── data_source_name_non_string.yml
│   │   │   ├── data_source_non_dict.yml
│   │   │   ├── data_source_value_non_dict.yml
│   │   │   ├── dot-nest.yaml
│   │   │   ├── entry1_1.csv
│   │   │   ├── entry1_2.csv
│   │   │   ├── example1_source.py
│   │   │   ├── example_plugin_dir/
│   │   │   │   └── example2_source.py
│   │   │   ├── multi_plugins.yaml
│   │   │   ├── multi_plugins2.yaml
│   │   │   ├── obsolete_data_source_list.yml
│   │   │   ├── obsolete_params_list.yml
│   │   │   ├── params_missing_required.yml
│   │   │   ├── params_name_non_string.yml
│   │   │   ├── params_non_dict.yml
│   │   │   ├── params_value_bad_choice.yml
│   │   │   ├── params_value_bad_type.yml
│   │   │   ├── params_value_non_dict.yml
│   │   │   ├── plugins_non_dict.yml
│   │   │   ├── plugins_source_missing.yml
│   │   │   ├── plugins_source_missing_key.yml
│   │   │   ├── plugins_source_non_dict.yml
│   │   │   ├── plugins_source_non_list.yml
│   │   │   ├── plugins_source_non_string.yml
│   │   │   ├── small.npy
│   │   │   ├── test_alias.py
│   │   │   ├── test_catalog_save.py
│   │   │   ├── test_core.py
│   │   │   ├── test_default.py
│   │   │   ├── test_discovery.py
│   │   │   ├── test_gui.py
│   │   │   ├── test_local.py
│   │   │   ├── test_parameters.py
│   │   │   ├── test_reload_integration.py
│   │   │   ├── test_utils.py
│   │   │   ├── test_zarr.py
│   │   │   └── util.py
│   │   ├── utils.py
│   │   └── zarr.py
│   ├── config.py
│   ├── conftest.py
│   ├── container/
│   │   ├── __init__.py
│   │   └── base.py
│   ├── interface/
│   │   ├── __init__.py
│   │   ├── base.py
│   │   ├── catalog/
│   │   │   ├── __init__.py
│   │   │   ├── add.py
│   │   │   └── search.py
│   │   ├── gui.py
│   │   └── source/
│   │       ├── __init__.py
│   │       └── defined_plots.py
│   ├── readers/
│   │   ├── __init__.py
│   │   ├── catalogs.py
│   │   ├── convert.py
│   │   ├── datatypes.py
│   │   ├── entry.py
│   │   ├── examples.py
│   │   ├── importlist.py
│   │   ├── metadata.py
│   │   ├── mixins.py
│   │   ├── namespaces.py
│   │   ├── output.py
│   │   ├── readers.py
│   │   ├── search.py
│   │   ├── tests/
│   │   │   ├── __init__.py
│   │   │   ├── cats/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── stac_data/
│   │   │   │   │   ├── 1.0.0/
│   │   │   │   │   │   ├── catalog/
│   │   │   │   │   │   │   ├── catalog.json
│   │   │   │   │   │   │   └── child-catalog.json
│   │   │   │   │   │   ├── collection/
│   │   │   │   │   │   │   ├── collection.json
│   │   │   │   │   │   │   ├── simple-item.json
│   │   │   │   │   │   │   └── zarr-collection.json
│   │   │   │   │   │   ├── item/
│   │   │   │   │   │   │   └── zarr-item.json
│   │   │   │   │   │   └── itemcollection/
│   │   │   │   │   │       └── example-search.json
│   │   │   │   │   └── 1.0.0beta2/
│   │   │   │   │       └── earthsearch/
│   │   │   │   │           ├── readme.md
│   │   │   │   │           └── single-file-stac.json
│   │   │   │   ├── test_sql.py
│   │   │   │   ├── test_stac.py
│   │   │   │   ├── test_thredds.py
│   │   │   │   └── test_tiled.py
│   │   │   ├── test_basic.py
│   │   │   ├── test_consistency.py
│   │   │   ├── test_dict.py
│   │   │   ├── test_errors.py
│   │   │   ├── test_reader.py
│   │   │   ├── test_search.py
│   │   │   ├── test_up.py
│   │   │   ├── test_utils.py
│   │   │   └── test_workflows.py
│   │   ├── transform.py
│   │   ├── user_parameters.py
│   │   └── utils.py
│   ├── source/
│   │   ├── __init__.py
│   │   ├── base.py
│   │   ├── csv.py
│   │   ├── derived.py
│   │   ├── discovery.py
│   │   ├── jsonfiles.py
│   │   ├── npy.py
│   │   ├── tests/
│   │   │   ├── __init__.py
│   │   │   ├── alias.yaml
│   │   │   ├── cached.yaml
│   │   │   ├── data.zarr/
│   │   │   │   ├── .zarray
│   │   │   │   └── 0
│   │   │   ├── der.yaml
│   │   │   ├── footer_csvs/
│   │   │   │   ├── sample_fewfooters.csv
│   │   │   │   ├── sample_manyfooters.csv
│   │   │   │   └── sample_nofooters.csv
│   │   │   ├── pipeline.yaml
│   │   │   ├── plugin_searchpath/
│   │   │   │   ├── collision_foo/
│   │   │   │   │   └── __init__.py
│   │   │   │   ├── collision_foo2/
│   │   │   │   │   └── __init__.py
│   │   │   │   ├── driver_with_entrypoints/
│   │   │   │   │   └── __init__.py
│   │   │   │   ├── driver_with_entrypoints-0.1.dist-info/
│   │   │   │   │   └── entry_points.txt
│   │   │   │   ├── intake_foo/
│   │   │   │   │   └── __init__.py
│   │   │   │   └── not_intake_foo/
│   │   │   │       └── __init__.py
│   │   │   ├── sample1.csv
│   │   │   ├── sample2_1.csv
│   │   │   ├── sample2_2.csv
│   │   │   ├── sample3_2.csv
│   │   │   ├── sources.yaml
│   │   │   ├── test_base.py
│   │   │   ├── test_csv.py
│   │   │   ├── test_derived.py
│   │   │   ├── test_discovery.py
│   │   │   ├── test_json.py
│   │   │   ├── test_npy.py
│   │   │   ├── test_text.py
│   │   │   ├── test_tiled.py
│   │   │   ├── test_utils.py
│   │   │   └── util.py
│   │   ├── textfiles.py
│   │   ├── tiled.py
│   │   ├── utils.py
│   │   └── zarr.py
│   ├── tests/
│   │   ├── __init__.py
│   │   ├── catalog1.yml
│   │   ├── catalog2.yml
│   │   ├── catalog_inherit_params.yml
│   │   ├── catalog_nested.yml
│   │   ├── catalog_nested_sub.yml
│   │   ├── test_config.py
│   │   ├── test_top_level.py
│   │   └── test_utils.py
│   ├── util_tests.py
│   └── utils.py
├── pyproject.toml
├── readthedocs.yml
└── scripts/
    └── ci/
        ├── environment-pip.yml
        ├── environment-py310.yml
        ├── environment-py311.yml
        ├── environment-py312.yml
        ├── environment-py313.yml
        └── environment-py314.yml

================================================
FILE CONTENTS
================================================

================================================
FILE: .ci-coveragerc
================================================
[run]
omit = *tests/*, */_version.py


================================================
FILE: .coveragerc
================================================
[run]
omit =
    */tests/*
    */test_*.py
    *_version.py
source =
    intake
[report]
show_missing = True


================================================
FILE: .gitattributes
================================================
intake/_version.py export-subst


================================================
FILE: .github/workflows/main.yaml
================================================
name: CI

on:
  push:
    branches: "*"
  pull_request:
    branches: master

jobs:
  test:
    name: ${{ matrix.OS }}-${{ matrix.CONDA_ENV }}-pytest
    runs-on: ${{ matrix.OS }}
    strategy:
      fail-fast: false
      matrix:
        OS: [ubuntu-latest, windows-latest]
        CONDA_ENV: [py310, py311, py312, py313, py314, pip]
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup conda
        uses: conda-incubator/setup-miniconda@v3
        with:
          environment-file: scripts/ci/environment-${{ matrix.CONDA_ENV }}.yml

      - name: pip-install
        shell: bash -l {0}
        run: |
          pip install . --no-deps

      - name: Run Tests
        shell: bash -l {0}
        run: |
          pytest -v intake/readers


================================================
FILE: .github/workflows/pre-commit.yml
================================================
name: pre-commit

on:
  pull_request:
    branches:
      - '*'
  push:
    branches: [master]
  workflow_dispatch:

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v4
      with:
          python-version: "3.11"
    - uses: pre-commit/action@v3.0.0


================================================
FILE: .github/workflows/pypipublish.yaml
================================================
name: Upload Python Package

on:
  release:
    types: [created]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.x"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install build setuptools setuptools-scm wheel twine
      - name: Build and publish
        env:
          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
        run: |
          # The repository ships pyproject.toml and no setup.py, so build via PEP 517
          python -m build
          twine upload dist/*


================================================
FILE: .gitignore
================================================
.DS_Store
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
_version.py

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
.pytest_cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/
docs/source/plugin-list.html

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

# jetbrains/pycharm
.idea/


================================================
FILE: .pre-commit-config.yaml
================================================
# This is the configuration for pre-commit, a local framework for managing pre-commit hooks
#   Check out the docs at: https://pre-commit.com/

default_stages: [pre-commit]
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
    -   id: check-builtin-literals
    -   id: check-case-conflict
    -   id: check-docstring-first
    -   id: check-executables-have-shebangs
    -   id: check-toml
    -   id: detect-private-key
    -   id: end-of-file-fixer
    -   id: trailing-whitespace
-   repo: https://github.com/ambv/black
    rev: 23.1.0
    hooks:
    -   id: black
-   repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: v0.0.249
    hooks:
    -   id: ruff  # See 'setup.cfg' for args
        args: [intake]
        files: intake/
-   repo: https://github.com/hoxbro/clean_notebook
    rev: 0.1.5
    hooks:
      - id: clean-notebook
ci:
  autofix_prs: false
  autoupdate_schedule: quarterly


================================================
FILE: LICENSE
================================================
Copyright (c) 2017, Anaconda, Inc.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.

Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


================================================
FILE: MANIFEST.in
================================================
prune .github
prune docs
prune examples
prune scripts
global-exclude test*.py *.yml *.yaml *.csv calvert* *.png *.json


================================================
FILE: README.md
================================================
# Intake: Take 2

**A general python package for describing, loading and processing data**

![Logo](https://github.com/intake/intake/raw/master/logo-small.png)

[![Build Status](https://github.com/intake/intake/workflows/CI/badge.svg)](https://github.com/intake/intake/actions)
[![Documentation Status](https://readthedocs.org/projects/intake/badge/?version=latest)](http://intake.readthedocs.io/en/latest/?badge=latest)


*Taking the pain out of data access and distribution*

Intake is an open-source package to:

- describe your data declaratively
- gather data sets into catalogs
- search catalogs and services to find the right data you need
- load, transform and output data in many formats
- work with third party remote storage and compute platforms

Documentation is available at [Read the Docs](http://intake.readthedocs.io/en/latest).

Please report issues at https://github.com/intake/intake/issues

Install
-------

Recommended method using conda:
```bash
conda install -c conda-forge intake
```

You can also install with `pip`, in which case you choose how many of the optional
dependencies to install. The simplest install has the fewest requirements:

```bash
pip install intake
```

Note that you may well need specific drivers and other plugins, which usually have additional
dependencies of their own.

Development
-----------
 * Create a development Python environment with the required dependencies, ideally with `conda`.
   The requirements can be found in the yml files in the `scripts/ci/` directory of this repo.
   * e.g. `conda env create -f scripts/ci/environment-py311.yml` and then `conda activate test_env`
 * Install intake using `pip install -e .`
 * Use `pytest` to run tests.
 * Create a fork on github to be able to submit PRs.
 * We respect, but do not enforce, pep8 standards; all new code should be covered by tests.

Support
-------

Work on this repository is supported in part by:

"Anaconda, Inc. - Advancing AI through open source."

<a href="https://anaconda.com/"><img src="https://camo.githubusercontent.com/b8555ef2222598ed37ce38ac86955febbd25de7619931bb7dd3c58432181d3b6/68747470733a2f2f626565776172652e6f72672f636f6d6d756e6974792f6d656d626572732f616e61636f6e64612f616e61636f6e64612d6c617267652e706e67" alt="anaconda logo" width="40%"/></a>


================================================
FILE: README_refactor.md
================================================
## Intake Take2

Intake has been extensively rewritten to produce Intake Take2,
https://github.com/intake/intake/pull/737 .
This will now become the version of the ``main`` branch and be released as v2.0.0. The
main documentation will move to describing V2, and V1 will not be further developed.
Existing users of the legacy version ("v1") may find their code breaks and will need
a version pin, although we aim to support most legacy workflows via backward compatibility.

To install, you would do the following

```shell
> pip install intake
or
> conda install intake
```

To get v1:

```shell
> pip install "intake<2"
or
> conda install "intake<2"
```

This README is being kept to describe why the rewrite was done and considerations that
went into it.

### Motivation for the rewrite.

The main way to get the most out of Intake v1 has been by editing YAML files. This is
how the documentation is structured. Yes, you could use intake.open_* to seed them, but then
you will find a strong discontinuity between the documentation of the driver and the third
party library that actually does the reading.

This made it very unlikely that a novice data-oriented Python user would turn into someone
that can create even the simplest catalogs. They will certainly never use more advanced
features like parametrisation or derived datasets. The new model eases users in and lends
itself to being overlaid with graphical/wizard interfaces (i.e., in jupyterlab or in
preparation for use with
[anaconda.cloud](https://docs.anaconda.com/free/anaconda-notebooks/notebook-data-catalog/)).

### Main changes

This is a total rewrite. Backward compatibility is desired and some v1 sources have been
rewritten to use the v2 readers.

#### Simplification

We are dropping features that added complexity but were only rarely used.

- the server; the Intake server was never production-ready, and most
 use-cases can be provided by [tiled](https://blueskyproject.io/tiled/)
- the caching/persist stuff; files can be persisted by fsspec, and we maintain the ability to
 write to various formats
- explicit dependence on dask; dask is just one of many possible compute engines and
 we should not be tied to one
- less added functionality in the readers (like file pattern stuff)
- explicit dependence on hvplot (but you can still choose to use it)
- the CLI


#### New structure

Many new classes have appeared. From an intake-savvy point of view, the biggest change is
the splitting of "drivers" into "data" and "reader". I view them as the objective description
of what the dataset is (e.g., "this is CSV at this URL") versus how you might load it
("call pandas with these arguments"). This strongly implies that you might want to read the
same data in different ways. Crucially, it makes the readers much easier to write.

Here is the Awkward reader for parquet files. Particularly for files, often all you need to do
is specify which function will do the read and what keyword accepts the URL.
```python
class AwkwardParquet(Awkward):
    implements = {datatypes.Parquet}
    imports = {"awkward", "pyarrow"}
    func = "awkward:from_parquet"
    url_arg = "path"
```

The imports are declared and deferred until needed, so there is no need to make all those intake-*
repos with their own dependencies. (Of course, you might still want to declare packages
and requirements; we are considering whether catalogs should have requirements, but this is
better suited to something like conda-project). The arguments accepted are the same as for the
target function, and the method `.doc()` will show this.
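
As a rough illustration of the deferred-import idea (not Intake's actual implementation), the
sketch below resolves a `"module:function"` string only when a read is requested. The
`LazyFunctionReader` class and its `arg_name` attribute are hypothetical names, and the
standard-library `json` module stands in for a heavy optional dependency.

```python
import importlib


class LazyFunctionReader:
    """Hypothetical stand-in for a reader class; not part of Intake."""

    func = "json:loads"  # "module:attribute", resolved only on demand
    arg_name = "s"  # keyword of the target function that receives the data

    def _resolve(self):
        # The import happens here, at call time, not when the class is defined.
        module_name, _, attr = self.func.partition(":")
        module = importlib.import_module(module_name)
        return getattr(module, attr)

    def read(self, data, **kwargs):
        # Extra keyword arguments are passed straight to the target function.
        kwargs[self.arg_name] = data
        return self._resolve()(**kwargs)


reader = LazyFunctionReader()
result = reader.read('{"a": 1}')
print(result)
```

Because the target is named by a string, defining such a class costs nothing if the target
package is not installed; the failure, if any, is postponed until `read()` is called.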


### New features

- recommendation system to try to guess the right data type from a URL or existing function call,
 and readers that can use that type (and for each, tells you the instance it makes and provides docs).
 Can be extended to "I have this type but I want this other type, what
 set of steps get me there"
- embracing any compute engines as first-class (e.g., duck, dask, ray, spark) or none
- no constraints on the types of data that can/should be returned
- pipeline building tools, including explicit conversion, types operations, generalised getattr and
 getitem (like dask delayed) and apply. Most of these available as "transform" attributes, including
 new namespaces like "reader.np.max(..)" will call numpy on whatever the reader makes, but lazily.
- output functions, as a special type of "conversion", returning a new data description for further
 manipulation. This is effectively caching (would like to add conditions to the pipeline, only load and
 convert if converted version doesn't already exist).
- generalised derived datasets, including functions of multiple intake inputs. A data or any reader
 output might be the input of any other reader, forming a graph. Picking a specific output from those
 possible gives you the pipeline, ready for execution. Any such pipelines could be encoded in a catalog.
- user parameters are similar to before, but are also pluggable; a few types are provided.
 Some helper methods have been made
 to walk data/reader kwargs and extract default values as parameters, replacing their original value
 with a reference to the parameter. The parameters are hierarchical catalog->data->reader

Some examples of each of these exist in the current state of the code. There are many, many more to
write, but the functions themselves are really simple. The aim is composition, easy crowd
sourcing and a high bus factor.
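
To make the "generalised getattr and getitem (like dask delayed) and apply" item in the list
above more concrete, here is a minimal sketch of a lazy pipeline. It is an assumption-laden
toy, not the Intake classes: `LazyPipeline`, `apply` and `read` are hypothetical names chosen
for illustration, and the loader is a plain Python function.

```python
import operator


class LazyPipeline:
    """Hypothetical lazy pipeline: records steps and runs them only on read()."""

    def __init__(self, loader, steps=()):
        self._loader = loader  # zero-argument callable that produces the data
        self._steps = tuple(steps)  # functions applied in order to the loaded value

    def apply(self, func, *args, **kwargs):
        # Return a new pipeline with one more step; nothing is executed yet.
        def step(value):
            return func(value, *args, **kwargs)

        return LazyPipeline(self._loader, self._steps + (step,))

    def __getitem__(self, key):
        return self.apply(operator.getitem, key)

    def __getattr__(self, name):
        # Only called for names not found normally: defer the attribute lookup.
        if name.startswith("_"):
            raise AttributeError(name)
        return self.apply(getattr, name)

    def read(self):
        value = self._loader()
        for step in self._steps:
            value = step(value)
        return value


calls = []


def load():
    calls.append("loaded")
    return {"values": [3, 1, 2]}


pipeline = LazyPipeline(load)["values"].apply(sorted).apply(max)
loaded_before_read = list(calls)  # still empty: building the pipeline loads nothing
result = pipeline.read()
print(result)
```

Each operation returns a new pipeline object, so a partially built pipeline can be reused as
the starting point for several different outputs.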

### Work to follow

- thorough search capability, which will need some thoughts in this context
- compatibility with remaining existing intake plugins
- the catalog serialisation currently uses custom YAML tags, but this should not be necessary
- add those magic methods that make pipelines work on descriptions on catalogs, not just
 materialised readers.
- metadata conventions, to persist basic dataset properties (e.g., based on frictionlessdata spec)
 and validation as a pipeline operation you can do to any data entry using any available reader that
 can produce the info
- probably much more - I will need help!

### Unanswered questions

- actual functions and classes are now embedded into any YAML serialised catalog as strings. These
 are imported/instantiated when the reader is instantiated from its description. So arbitrary
 code execution is possible, but not at catalog parse time. We only have a loose permissions config
 story around this
- this implementation maintains the distinction between "descriptions" (which have templated values
 and user parameters) and readers (which only have concrete values and real instances). Is this a
 major confusion we somehow want to eliminate in V2?
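
The distinction between a templated description and a concrete reader, together with the
catalog->data->reader parameter hierarchy mentioned earlier, can be sketched as follows. This
is only an illustration under stated assumptions: the `resolve` helper is hypothetical, the
`{name}` template syntax is Python's `str.format` style rather than whatever Intake uses, and
the rule that a more specific level overrides a more general one is assumed, not confirmed.

```python
from collections import ChainMap


def resolve(description, *parameter_levels):
    """Fill "{name}" templates in a description's keyword arguments.

    parameter_levels are given most specific first (reader, data, catalog),
    so a more specific level overrides a more general one (an assumption).
    """
    params = ChainMap(*parameter_levels)
    return {
        key: value.format_map(params) if isinstance(value, str) else value
        for key, value in description.items()
    }


# Hypothetical parameter values at each level of the hierarchy
catalog_params = {"bucket": "example-bucket", "year": 2020}
data_params = {"year": 2023}
reader_params = {}

# The "description" holds templates; the resolved dict holds only concrete values
description = {"url": "s3://{bucket}/data_{year}.csv", "header": 0}
concrete = resolve(description, reader_params, data_params, catalog_params)
print(concrete)
```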


================================================
FILE: docs/Makefile
================================================
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line.
SPHINXOPTS    =
SPHINXBUILD   = sphinx-build
SPHINXPROJ    = intake
SOURCEDIR     = source
BUILDDIR      = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Run custom script to build HTML table of plugins
html: Makefile
	python plugins.py
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)


================================================
FILE: docs/README.md
================================================
# Building Documentation

An environment with several prerequisites is needed to build the
documentation.  Create this with:

## First option for environment

```bash
conda create -n intake-docs python=3.8 pandas dask python-snappy appdirs -c conda-forge -y
conda activate intake-docs
```

Additional pip packages, listed in `./requirements.txt`, are required to
build the docs:

```bash
pip install -r requirements.txt
```

## Second option for environment

A conda environment with pip packages included is in `environment.yml` of the current directory, and you may create it with:

```bash
conda env create
conda activate intake-docs
```

## Build docs

To make HTML documentation:

```bash
make html
```

Outputs to `build/html/index.html`


================================================
FILE: docs/environment.yml
================================================
name: intake-docs
channels:
  - conda-forge

dependencies:
  - appdirs
  - python=3.12
  - dask
  - numpy
  - pandas
  - msgpack-python
  - msgpack-numpy
  - requests
  - tornado
  - jinja2
  - python-snappy
  - pyyaml
  - hvplot
  - platformdirs
  - panel
  - bokeh
  - docutils
  - sphinx
  - sphinx_rtd_theme
  - numpydoc
  - entrypoints
  - aiohttp


================================================
FILE: docs/make.bat
================================================
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build
set SPHINXPROJ=intake

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.http://sphinx-doc.org/
	exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%

:end
popd


================================================
FILE: docs/make_api.py
================================================
import os
import sys
import intake


def run(path):
    """Write the API reference page to ``<path>/source/api2.rst``."""
    fn = os.path.join(path, "source", "api2.rst")
    with open(fn, "w") as f:
        print(
            f"""
API Reference
=============

User Functions
--------------

.. autosummary::
    intake.config.Config
    intake.readers.datatypes.recommend
    intake.readers.convert.auto_pipeline
    intake.readers.entry.Catalog
    intake.readers.entry.DataDescription
    intake.readers.entry.ReaderDescription
    intake.readers.readers.recommend
    intake.readers.readers.reader_from_call

.. autoclass:: intake.config.Config
    :members:

.. autofunction:: intake.readers.datatypes.recommend

.. autofunction:: intake.readers.convert.auto_pipeline

.. autoclass:: intake.readers.entry.Catalog
    :members:

.. autoclass:: intake.readers.entry.DataDescription
    :members:

.. autoclass:: intake.readers.entry.ReaderDescription
    :members:

.. autofunction:: intake.readers.readers.recommend

.. autofunction:: intake.readers.readers.reader_from_call

Base Classes
------------

These may be subclassed by developers

.. autosummary::""",
            file=f,
        )
        bases = (
            "intake.readers.datatypes.BaseData",
            "intake.readers.readers.BaseReader",
            "intake.readers.convert.BaseConverter",
            "intake.readers.namespaces.Namespace",
            "intake.readers.search.SearchBase",
            "intake.readers.user_parameters.BaseUserParameter",
        )
        for base in bases:
            print("  ", base, file=f)
        print(file=f)
        for base in bases:
            print(
                f""".. autoclass:: {base}
   :members:
""",
                file=f,
            )

        print(
            """

Data Classes
------------

.. autosummary::""",
            file=f,
        )
        for cls in sorted(intake.readers.subclasses(intake.BaseData), key=lambda c: c.qname()):
            print("  ", cls.qname().replace(":", "."), file=f)
        print(
            """

Reader Classes
--------------

Includes readers, transformers, converters and output classes.

.. autosummary::""",
            file=f,
        )
        for cls in sorted(intake.readers.subclasses(intake.BaseReader), key=lambda c: c.qname()):
            print("  ", cls.qname().replace(":", "."), file=f)


if __name__ == "__main__":
    here = os.path.abspath(os.path.dirname(sys.argv[0]))
    run(here)
else:
    here = os.path.abspath(os.path.dirname(__file__))


================================================
FILE: docs/plugins.py
================================================
import asyncio

import aiohttp
import pandas as pd
import yaml


def format_package_links(package_name, repo_link):
    return f'<a href="{repo_link}">{package_name}</a>'


def format_repo_link(repo_link):
    if "http" not in repo_link:
        return f"https://github.com/{repo_link}/"
    return repo_link


def format_badge_html(badge, link):
    return f'<a href="{link}"><img src="{badge}"/></a>'


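async def check_ok(client, url):
    """Return True if ``url`` responds OK.

    anaconda.org pages that ask for authentication are treated as missing.
    """
    async with client.get(url) as r:
        if "anaconda.org" in url:
            body = await r.text()
            if "requires authentication" in body:
                return False
        return r.ok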


async def check_all_ok(urls):
    async with aiohttp.client.ClientSession() as c:
        coroutines = [check_ok(c, url) for url in urls]
        return await asyncio.gather(*coroutines)


def generate_plugin_table():
    # Use a context manager so the file handle is closed after parsing
    with open("plugins.yaml", "rb") as f:
        plugins = yaml.safe_load(f)
    plugin_df = pd.DataFrame(plugins)
    plugin_df = plugin_df.rename(columns={"description": "Description", "drivers": "Drivers"})

    plugin_df["short_name"] = plugin_df["name"].apply(lambda x: x.split("/")[-1])
    plugin_df["repo_links"] = plugin_df["repo"].apply(format_repo_link)
    plugin_df["conda_package"] = plugin_df["conda_package"].fillna(plugin_df["short_name"])
    plugin_df["ci_yaml"] = plugin_df["ci_yaml"].fillna("main.yaml")

    # CI badges
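    # Index rows by column label; positional x[0] on a label-indexed row is deprecated in pandas
    plugin_df["ci_badges"] = plugin_df[["repo_links", "ci_yaml"]].apply(
        lambda x: f"{x['repo_links']}/actions/workflows/{x['ci_yaml']}/badge.svg", axis=1
    )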
    plugin_df["ci_links"] = plugin_df["repo_links"].apply(lambda x: f"{x}/actions")
    ci_badges_ok = asyncio.run(check_all_ok(plugin_df["ci_badges"]))

    # Docs badges
    plugin_df["docs_badges"] = plugin_df["short_name"].apply(
        lambda x: f"https://readthedocs.org/projects/{x}/badge/?version=latest"
    )
    plugin_df["docs_links"] = plugin_df["short_name"].apply(
        lambda x: f"https://{x}.readthedocs.io/en/latest/?badge=latest"
    )
    docs_badges_ok = asyncio.run(check_all_ok(plugin_df["docs_links"]))

    # PyPi badges
    plugin_df["pypi_badges"] = plugin_df["short_name"].apply(
        lambda x: f"https://img.shields.io/pypi/v/{x}.svg?maxAge=3600"
    )
    plugin_df["pypi_links"] = plugin_df["short_name"].apply(
        lambda x: f"https://pypi.org/project/{x}"
    )
    pypi_badges_ok = asyncio.run(check_all_ok(plugin_df["pypi_links"]))

    # Conda badges
    # Index each row by label: integer keys on a string-labelled Series are deprecated in pandas.
    plugin_df["conda_badges"] = plugin_df[["conda_channel", "conda_package"]].apply(
        lambda x: f"https://img.shields.io/conda/vn/{x['conda_channel']}/{x['conda_package']}.svg?colorB=4488ff&label={x['conda_channel']}&style=flat",
        axis=1,
    )
    plugin_df["conda_links"] = plugin_df[["conda_channel", "conda_package"]].apply(
        lambda x: f"https://anaconda.org/{x['conda_channel']}/{x['conda_package']}", axis=1
    )
    conda_badges_ok = asyncio.run(check_all_ok(plugin_df["conda_links"]))

    # Conda defaults badges
    plugin_df["conda_defaults_links"] = plugin_df["conda_package"].apply(
        lambda x: f"https://anaconda.org/anaconda/{x}"
    )
    plugin_df["conda_defaults_badges"] = plugin_df["conda_package"].apply(
        lambda x: f"https://img.shields.io/conda/vn/anaconda/{x}.svg?colorB=4488ff&label=defaults&style=flat"
    )
    conda_defaults_badges_ok = asyncio.run(check_all_ok(plugin_df["conda_defaults_links"]))

    # Conda forge badges
    plugin_df["conda_forge_links"] = plugin_df["conda_package"].apply(
        lambda x: f"https://anaconda.org/conda-forge/{x}"
    )
    plugin_df["conda_forge_badges"] = plugin_df["conda_package"].apply(
        lambda x: f"https://img.shields.io/conda/vn/conda-forge/{x}.svg?colorB=4488ff&style=flat"
    )
    conda_forge_badges_ok = asyncio.run(check_all_ok(plugin_df["conda_forge_links"]))

    plugin_df["Package Name"] = plugin_df[["name", "repo_links"]].apply(
        lambda x: format_package_links(*x), axis=1
    )
    plugin_df["CI"] = plugin_df[["ci_badges", "ci_links"]][ci_badges_ok].apply(
        lambda x: format_badge_html(*x), axis=1
    )
    plugin_df["Docs"] = plugin_df[["docs_badges", "docs_links"]][docs_badges_ok].apply(
        lambda x: format_badge_html(*x), axis=1
    )
    plugin_df["PyPi"] = plugin_df[["pypi_badges", "pypi_links"]][pypi_badges_ok].apply(
        lambda x: format_badge_html(*x), axis=1
    )
    plugin_df["conda"] = plugin_df[["conda_badges", "conda_links"]][conda_badges_ok].apply(
        lambda x: format_badge_html(*x), axis=1
    )
    plugin_df["conda_forge"] = plugin_df[["conda_forge_badges", "conda_forge_links"]][
        conda_forge_badges_ok
    ].apply(lambda x: format_badge_html(*x), axis=1)
    plugin_df["conda_defaults"] = plugin_df[["conda_defaults_badges", "conda_defaults_links"]][
        conda_defaults_badges_ok
    ].apply(lambda x: format_badge_html(*x), axis=1)

    plugin_df = plugin_df.fillna("")

    # Concat conda badges
    plugin_df["Conda"] = plugin_df["conda"] + plugin_df["conda_forge"] + plugin_df["conda_defaults"]

    plugin_df.to_html(
        "source/plugin-list.html",
        escape=False,
        justify="left",
        index=False,
        border=0,
        classes="table_wrapper",
        columns=["Package Name", "Description", "Drivers", "CI", "Docs", "PyPi", "Conda"],
        col_space=["auto", "auto", "auto", "90px", "90px", "90px", "90px"],
    )


if __name__ == "__main__":
    print("Generating custom plugin table... ", end="")
    generate_plugin_table()
    print("done")


================================================
FILE: docs/plugins.yaml
================================================
- name: intake
  repo: intake/intake
  description: Built into Intake
  drivers: catalog, csv, intake_remote, ndzarr, numpy, textfiles, yaml_file_cat, yaml_files_cat, zarr_cat, json, jsonl

- name: intake-astro
  repo: intake/intake-astro
  description: Table and array loading of FITS astronomical data
  drivers: fits_array, fits_table
  conda_channel: intake

- name: intake-accumulo
  repo: intake/intake-accumulo
  description: Apache Accumulo clustered data storage
  drivers: accumulo

- name: intake-avro
  repo: intake/intake-avro
  description: Apache Avro data serialization format
  drivers: avro_table, avro_sequence

- name: intake-bluesky
  repo: nsls-ii/intake-bluesky
  description: Search and retrieve data in the <a href="https://nsls-ii.github.io/bluesky">bluesky</a> data model

- name: intake-dcat
  repo: CityOfLosAngeles/intake-dcat
  description: Browse and load data from <a href="https://www.w3.org/TR/vocab-dcat">DCAT</a> catalogs
  drivers: dcat

- name: intake-dremio
  repo: intake/intake-dremio
  description: Scan tables and send SQL queries to a <a href="https://docs.dremio.com/">Dremio</a> server
  drivers: dremio

- name: intake-duckdb
  repo: intake/intake-duckdb
  description: Load DuckDB tables and build catalogs from DuckDB backends
  drivers: duckdb, duckdb_cat

- name: intake-dynamodb
  repo: informatics-lab/intake-dynamodb
  description: Link to Amazon DynamoDB
  drivers: dynamodb
  conda_channel: informaticslab
  conda_package: intake_dynamodb

- name: intake-elasticsearch
  repo: intake/intake-elasticsearch
  description: Elasticsearch search and analytics engine
  drivers: elasticsearch_seq, elasticsearch_table

- name: intake-esm
  repo: NCAR/intake-esm
  description: Plugin for building and loading intake catalogs for earth system dataset holdings, such as <a href="https://cmip.llnl.gov/">CMIP</a> (Coupled Model Intercomparison Project) and CESM Large Ensemble datasets

- name: intake-geopandas
  repo: informatics-lab/intake_geopandas
  description: Load from ESRI Shape Files, GeoJSON, and geospatial databases with geopandas
  drivers: geojson, postgis, shapefile, spatialite, regionmask
  conda_channel: informaticslab

- name: intake-google-analytics
  repo: intake/intake-google-analytics
  description: Run Google Analytics queries and load data as a DataFrame
  drivers: google_analytics_query

- name: intake-hbase
  repo: intake/intake-hbase
  description: Apache HBase database
  drivers: hbase
  conda_channel: intake

- name: intake-iris
  repo: informatics-lab/intake-iris
  description: Load netCDF and GRIB files with IRIS
  drivers: grib, netcdf
  conda_channel: informaticslab
  conda_package: intake_iris

- name: intake-metabase
  repo: continuumio/intake-metabase
  description: Generate catalogs and load tables as DataFrames from Metabase
  drivers: metabase_catalog, metabase_table

- name: intake-mongo
  repo: intake/intake-mongo
  description: MongoDB NoSQL query
  drivers: mongo
  conda_channel: intake

- name: intake-nested-yaml-catalog
  repo: zillow/intake-nested-yaml-catalog
  description: Plugin supporting a single YAML hierarchical catalog to organize datasets and avoid a data swamp
  drivers: nested_yaml_cat

- name: intake-netflow
  repo: intake/intake-netflow
  description: Netflow packet format
  drivers: netflow
  conda_channel: intake

- name: intake-notebook
  repo: informatics-lab/intake-notebook
  description: Experimental plugin to access parameterised notebooks through intake and execute them via papermill
  drivers: ipynb
  conda_channel: informaticslab

- name: intake-odbc
  repo: intake/intake-odbc
  description: ODBC database
  drivers: odbc
  conda_channel: intake

- name: intake-parquet
  repo: intake/intake-parquet
  description: Apache Parquet file format
  drivers: parquet

- name: intake-pattern-catalog
  repo: DTN-Public/intake-pattern-catalog
  description: Plugin for specifying a file-path pattern which can represent a number of different entries
  drivers: pattern_cat

- name: intake-pcap
  repo: intake/intake-pcap
  description: PCAP network packet format
  drivers: pcap

- name: intake-postgres
  repo: intake/intake-postgres
  description: PostgreSQL database
  drivers: postgres
  conda_channel: intake

- name: intake-s3-manifests
  repo: informatics-lab/intake-s3-manifests
  drivers: s3_manifest
  conda_channel: informaticslab
  conda_package: intake_s3_manifests

- name: intake-salesforce
  repo: sophiamyang/intake-salesforce
  description: Generate catalogs and load tables as DataFrames from Salesforce
  drivers: salesforce_catalog, salesforce_table

- name: intake-sdmx
  repo: dr-leo/intake_sdmx
  description: Plugin for SDMX-compliant data sources such as BIS, ECB, ESTAT, INSEE, ILO, UN, UNICEF, World Bank and more
  drivers: sdmx_dataset

- name: intake-sklearn
  repo: AlbertDeFusco/intake-sklearn
  description: Load scikit-learn models from Pickle files
  drivers: sklearn

- name: intake-solr
  repo: intake/intake-solr
  description: Apache Solr search platform
  drivers: solr
  conda_channel: intake

- name: intake-stac
  repo: intake/intake-stac
  description: Intake Driver for <a href="https://stacspec.org/">SpatioTemporal Asset Catalogs (STAC)</a>

- name: intake-stripe
  repo: sophiamyang/intake-stripe
  description: Generate catalogs and load tables as DataFrames from Stripe
  drivers: stripe_catalog, stripe_table

- name: intake-spark
  repo: intake/intake-spark
  description: Data processed by Apache Spark
  drivers: spark_cat, spark_rdd, spark_dataframe

- name: intake-sql
  repo: intake/intake-sql
  description: Generic SQL queries via SQLAlchemy
  drivers: sql_cat, sql, sql_auto, sql_manual

- name: intake-sqlite
  repo: catalyst-cooperative/intake-sqlite
  description: Local caching of remote SQLite DBs and queries via SQLAlchemy
  drivers: sqlite_cat, sqlite, sqlite_auto, sqlite_manual

- name: intake-splunk
  repo: intake/intake-splunk
  description: Splunk machine data query
  drivers: splunk
  conda_channel: intake

- name: intake-streamz
  repo: intake/intake-streamz
  description: Real-time event processing using Streamz
  drivers: streamz

- name: intake-thredds
  repo: NCAR/intake-thredds
  ci_yaml: ci.yaml
  description: Intake interface to THREDDS data catalogs
  drivers: thredds_cat, thredds_merged_source

- name: intake-xarray
  repo: intake/intake-xarray
  description: Load netCDF, Zarr and other multi-dimensional data
  drivers: xarray_image, netcdf, grib, opendap, rasterio, remote-xarray, zarr

- name: intake-dataframe-catalog
  repo: ACCESS-NRI/intake-dataframe-catalog
  ci_yaml: ci.yml
  description: A searchable table of intake sources and associated metadata
  drivers: df_catalog
  conda_channel: accessnri


================================================
FILE: docs/requirements.txt
================================================
sphinx
sphinx_rtd_theme
numpydoc
panel
hvplot
entrypoints


================================================
FILE: docs/source/_static/.keep
================================================


================================================
FILE: docs/source/_static/css/custom.css
================================================
div.prompt {
  display: none
}

div.logo-block img {
  display: none !important
}

.table_wrapper{
  display: block;
  overflow-x: auto;
}

.table_wrapper td, .table_wrapper th {
  padding: 2px;
}

.table_wrapper tr:nth-child(even) {
  background: #E0E0E0;
}


================================================
FILE: docs/source/_static/images/plotting_example.html
================================================

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <title>HoloPlot Plot</title>

<link rel="stylesheet" href="https://cdn.pydata.org/bokeh/release/bokeh-0.12.16.min.css" type="text/css" />

<script type="text/javascript" src="https://cdn.pydata.org/bokeh/release/bokeh-0.12.16.min.js"></script>
<script type="text/javascript">
    Bokeh.set_log_level("info");
</script>
    </head>
    <body>

        <div class="bk-root">
            <div class="bk-plotdiv" id="a54b0d41-2dbd-4cbd-abe8-b158862abe73"></div>
        </div>

        <script type="application/json" id="de6014ac-0cb8-4cb5-aff2-696c243a0f51">
          {"cf966cb6-e460-4e42-870e-e1fe77bb6636":{"roots":{"references":[{"attributes":{"below":[{"id":"4c2875d5-77df-4591-9438-8d68a338a1d8","type":"LinearAxis"}],"left":[{"id":"a9c6f8b1-9ea1-41ca-a889-60f708773bc7","type":"LinearAxis"}],"min_border_bottom":10,"min_border_left":10,"min_border_right":10,"min_border_top":10,"plot_height":300,"plot_width":700,"renderers":[{"id":"4c2875d5-77df-4591-9438-8d68a338a1d8","type":"LinearAxis"},{"id":"49b73a93-7160-40db-bf35-7a5671949656","type":"Grid"},{"id":"a9c6f8b1-9ea1-41ca-a889-60f708773bc7","type":"LinearAxis"},{"id":"c62b66c7-6be2-4af9-a12f-aafc8e32caff","type":"Grid"},{"id":"8d9c612d-75cb-4181-a127-97ed515133b2","type":"BoxAnnotation"},{"id":"2215446e-e645-4546-acea-13fc8fa81a4a","type":"Legend"},{"id":"0d4d8d8c-cbc8-4eba-8c60-867a3ce11b30","type":"GlyphRenderer"},{"id":"f5632049-54e4-4001-905f-2e37f0480b55","type":"GlyphRenderer"},{"id":"125ade69-dfa3-4df1-90e0-238b813d7202","type":"GlyphRenderer"}],"title":{"id":"96b34104-9f23-4702-9783-868ea5114e49","type":"Title"},"toolbar":{"id":"aebf021b-befc-4d6b-9ffd-a4e1aadceb03","type":"Toolbar"},"x_range":{"id":"5059be37-ae81-463d-8db0-292d6b4b5a54","type":"Range1d"},"x_scale":{"id":"8223a783-c027-473c-9217-f60e3ac36aa0","type":"LinearScale"},"y_range":{"id":"94497823-8347-4b0e-b26c-fe2aa2abbd31","type":"Range1d"},"y_scale":{"id":"d0b1b2a0-7f3c-4115-be37-2b6913b3b801","type":"LinearScale"}},"id":"9268281a-2417-46c5-9182-d5ebb5689dc2","subtype":"Figure","type":"Plot"},{"attributes":{"label":{"value":"Violent Crime 
rate"},"renderers":[{"id":"0d4d8d8c-cbc8-4eba-8c60-867a3ce11b30","type":"GlyphRenderer"}]},"id":"44735ebd-4ba2-41b2-a24b-6c318fc39393","type":"LegendItem"},{"attributes":{},"id":"5277ef26-8da1-45ef-b505-c9855b23f975","type":"Selection"},{"attributes":{},"id":"0d192563-9260-41da-a32c-f5ccbc73bd4c","type":"BasicTickFormatter"},{"attributes":{},"id":"c05057cb-f16d-4349-b03d-c09cdc6ba559","type":"Selection"},{"attributes":{"plot":null,"text":"","text_color":{"value":"black"},"text_font_size":{"value":"12pt"}},"id":"96b34104-9f23-4702-9783-868ea5114e49","type":"Title"},{"attributes":{"data_source":{"id":"2b2287e6-2706-4b63-aedd-a6bf32979687","type":"ColumnDataSource"},"glyph":{"id":"c15dc676-9a32-46ac-acfd-f99f4eb8e731","type":"Line"},"hover_glyph":null,"muted_glyph":{"id":"4f29323a-f6a0-4040-aaaa-19ff6880b09d","type":"Line"},"nonselection_glyph":{"id":"fd981887-f190-42fe-8cee-dfb35ef077e1","type":"Line"},"selection_glyph":null,"view":{"id":"ec9bc3f8-1164-478d-8a1e-ac8f0b55492d","type":"CDSView"}},"id":"125ade69-dfa3-4df1-90e0-238b813d7202","type":"GlyphRenderer"},{"attributes":{"grid_line_color":{"value":null},"plot":{"id":"9268281a-2417-46c5-9182-d5ebb5689dc2","subtype":"Figure","type":"Plot"},"ticker":{"id":"132f30e9-c8c4-4c71-a5b2-98ab2f2d40ef","type":"BasicTicker"}},"id":"49b73a93-7160-40db-bf35-7a5671949656","type":"Grid"},{"attributes":{},"id":"ffb0bc45-c827-4f96-ae52-ad134c44d0b3","type":"UnionRenderers"},{"attributes":{},"id":"3bb836e5-3444-4da9-9a78-ff97ff6f536a","type":"BasicTickFormatter"},{"attributes":{"label":{"value":"Burglary rate"},"renderers":[{"id":"125ade69-dfa3-4df1-90e0-238b813d7202","type":"GlyphRenderer"}]},"id":"ebda79be-5fa7-4bab-9636-7e7f956bcaa9","type":"LegendItem"},{"attributes":{"callback":null,"end":2014,"start":1960},"id":"5059be37-ae81-463d-8db0-292d6b4b5a54","type":"Range1d"},{"attributes":{"line_alpha":0.2,"line_color":"#1f77b4","line_width":2,"x":{"field":"Year"},"y":{"field":"Rate (per 100k 
people)"}},"id":"087d0ef0-6a69-4746-ba34-828ff44f60c8","type":"Line"},{"attributes":{"line_alpha":0.2,"line_color":"#2ca02c","line_width":2,"x":{"field":"Year"},"y":{"field":"Rate (per 100k people)"}},"id":"4f29323a-f6a0-4040-aaaa-19ff6880b09d","type":"Line"},{"attributes":{"line_alpha":0.1,"line_color":"#2ca02c","line_width":2,"x":{"field":"Year"},"y":{"field":"Rate (per 100k people)"}},"id":"fd981887-f190-42fe-8cee-dfb35ef077e1","type":"Line"},{"attributes":{},"id":"132f30e9-c8c4-4c71-a5b2-98ab2f2d40ef","type":"BasicTicker"},{"attributes":{"axis_label":"Year","bounds":"auto","formatter":{"id":"3bb836e5-3444-4da9-9a78-ff97ff6f536a","type":"BasicTickFormatter"},"major_label_orientation":"horizontal","plot":{"id":"9268281a-2417-46c5-9182-d5ebb5689dc2","subtype":"Figure","type":"Plot"},"ticker":{"id":"132f30e9-c8c4-4c71-a5b2-98ab2f2d40ef","type":"BasicTicker"}},"id":"4c2875d5-77df-4591-9438-8d68a338a1d8","type":"LinearAxis"},{"attributes":{"axis_label":"Rate (per 100k people)","bounds":"auto","formatter":{"id":"0d192563-9260-41da-a32c-f5ccbc73bd4c","type":"BasicTickFormatter"},"major_label_orientation":"horizontal","plot":{"id":"9268281a-2417-46c5-9182-d5ebb5689dc2","subtype":"Figure","type":"Plot"},"ticker":{"id":"c685bdab-a075-4101-ae8a-b6968cb1d905","type":"BasicTicker"}},"id":"a9c6f8b1-9ea1-41ca-a889-60f708773bc7","type":"LinearAxis"},{"attributes":{"source":{"id":"2b2287e6-2706-4b63-aedd-a6bf32979687","type":"ColumnDataSource"}},"id":"ec9bc3f8-1164-478d-8a1e-ac8f0b55492d","type":"CDSView"},{"attributes":{},"id":"c685bdab-a075-4101-ae8a-b6968cb1d905","type":"BasicTicker"},{"attributes":{},"id":"150786bd-42e2-4f3f-ad4d-35e7ce7581e4","type":"ResetTool"},{"attributes":{"dimension":1,"grid_line_color":{"value":null},"plot":{"id":"9268281a-2417-46c5-9182-d5ebb5689dc2","subtype":"Figure","type":"Plot"},"ticker":{"id":"c685bdab-a075-4101-ae8a-b6968cb1d905","type":"BasicTicker"}},"id":"c62b66c7-6be2-4af9-a12f-aafc8e32caff","type":"Grid"},{"attributes":{"callback":null,"da
ta":{"Rate (per 100k people)":{"__ndarray__":"zczMzMwcZEAzMzMzM8NjQJqZmZmZSWRAZmZmZmYGZUAzMzMzM9NnQGZmZmZmBmlAAAAAAACAa0BmZmZmZqZvQGZmZmZmpnJAMzMzMzOLdEAAAAAAALh2QAAAAAAAwHhAAAAAAAAQeUBmZmZmZhZ6QJqZmZmZ0XxAzczMzMx8fkDNzMzMzDx9QGZmZmZmvn1AzczMzMwcf0AzMzMzMyeBQM3MzMzMpIJAAAAAAACMgkBmZmZmZtaBQM3MzMzM0IBAMzMzMzPfgEDNzMzMzHCBQM3MzMzMYINAAAAAAAAkg0DNzMzMzASEQDMzMzMz14RAzczMzMzMhkCamZmZmbGHQJqZmZmZrYdAzczMzMxYh0DNzMzMzEyGQAAAAAAAZIVAzczMzMzkg0AAAAAAABiDQM3MzMzMvIFAAAAAAABYgEAAAAAAAKh/QAAAAAAAiH9AZmZmZmbmfkDNzMzMzLx9QDMzMzMz83xAAAAAAABQfUDNzMzMzPR9QM3MzMzMfH1AmpmZmZmpfEBmZmZmZv56QAAAAAAASHlAmpmZmZkxeEDNzMzMzDx4QJqZmZmZsXdAMzMzMzN7d0A=","dtype":"float64","shape":[55]},"Rate_left_parenthesis_per_100k_people_right_parenthesis":{"__ndarray__":"zczMzMwcZEAzMzMzM8NjQJqZmZmZSWRAZmZmZmYGZUAzMzMzM9NnQGZmZmZmBmlAAAAAAACAa0BmZmZmZqZvQGZmZmZmpnJAMzMzMzOLdEAAAAAAALh2QAAAAAAAwHhAAAAAAAAQeUBmZmZmZhZ6QJqZmZmZ0XxAzczMzMx8fkDNzMzMzDx9QGZmZmZmvn1AzczMzMwcf0AzMzMzMyeBQM3MzMzMpIJAAAAAAACMgkBmZmZmZtaBQM3MzMzM0IBAMzMzMzPfgEDNzMzMzHCBQM3MzMzMYINAAAAAAAAkg0DNzMzMzASEQDMzMzMz14RAzczMzMzMhkCamZmZmbGHQJqZmZmZrYdAzczMzMxYh0DNzMzMzEyGQAAAAAAAZIVAzczMzMzkg0AAAAAAABiDQM3MzMzMvIFAAAAAAABYgEAAAAAAAKh/QAAAAAAAiH9AZmZmZmbmfkDNzMzMzLx9QDMzMzMz83xAAAAAAABQfUDNzMzMzPR9QM3MzMzMfH1AmpmZmZmpfEBmZmZmZv56QAAAAAAASHlAmpmZmZkxeEDNzMzMzDx4QJqZmZmZsXdAMzMzMzN7d0A=","dtype":"float64","shape":[55]},"Variable":["Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime 
rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate","Violent Crime rate"],"Year":[1960,1961,1962,1963,1964,1965,1966,1967,1968,1969,1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982,1983,1984,1985,1986,1987,1988,1989,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014]},"selected":{"id":"82b27fb8-dcc0-4828-9fe8-cb11fa8c716d","type":"Selection"},"selection_policy":{"id":"7fa57198-d15f-4a50-bad8-d6730c3d1eac","type":"UnionRenderers"}},"id":"c408e98a-be3e-4937-954c-9e712fa19f2c","type":"ColumnDataSource"},{"attributes":{"line_color":"#1f77b4","line_width":2,"x":{"field":"Year"},"y":{"field":"Rate (per 100k 
people)"}},"id":"f5cb752e-2308-4d2e-a511-0a1c60baf228","type":"Line"},{"attributes":{"bottom_units":"screen","fill_alpha":{"value":0.5},"fill_color":{"value":"lightgrey"},"left_units":"screen","level":"overlay","line_alpha":{"value":1.0},"line_color":{"value":"black"},"line_dash":[4,4],"line_width":{"value":2},"plot":null,"render_mode":"css","right_units":"screen","top_units":"screen"},"id":"8d9c612d-75cb-4181-a127-97ed515133b2","type":"BoxAnnotation"},{"attributes":{},"id":"1c3e9b46-f773-44ef-9d3f-5b7734e5a267","type":"SaveTool"},{"attributes":{},"id":"5bd2845d-0741-40fa-83fb-b04651091fb8","type":"PanTool"},{"attributes":{},"id":"d0b1b2a0-7f3c-4115-be37-2b6913b3b801","type":"LinearScale"},{"attributes":{},"id":"e6b98c50-b4c7-4a21-8f0f-f1afcdb8eb23","type":"WheelZoomTool"},{"attributes":{"overlay":{"id":"8d9c612d-75cb-4181-a127-97ed515133b2","type":"BoxAnnotation"}},"id":"da5e614b-d585-4e75-95f5-6a59eea5e0bc","type":"BoxZoomTool"},{"attributes":{"line_alpha":0.1,"line_color":"#1f77b4","line_width":2,"x":{"field":"Year"},"y":{"field":"Rate (per 100k 
people)"}},"id":"4aff6e88-4363-4255-b78b-c71223315006","type":"Line"},{"attributes":{"source":{"id":"c408e98a-be3e-4937-954c-9e712fa19f2c","type":"ColumnDataSource"}},"id":"de890536-a3b0-4809-af7a-f51c8ae52099","type":"CDSView"},{"attributes":{"active_drag":"auto","active_inspect":"auto","active_scroll":"auto","active_tap":"auto","tools":[{"id":"3481d497-d16e-4103-802f-c7f49222fe53","type":"HoverTool"},{"id":"1c3e9b46-f773-44ef-9d3f-5b7734e5a267","type":"SaveTool"},{"id":"5bd2845d-0741-40fa-83fb-b04651091fb8","type":"PanTool"},{"id":"e6b98c50-b4c7-4a21-8f0f-f1afcdb8eb23","type":"WheelZoomTool"},{"id":"da5e614b-d585-4e75-95f5-6a59eea5e0bc","type":"BoxZoomTool"},{"id":"150786bd-42e2-4f3f-ad4d-35e7ce7581e4","type":"ResetTool"}]},"id":"aebf021b-befc-4d6b-9ffd-a4e1aadceb03","type":"Toolbar"},{"attributes":{"callback":null,"end":1684.1,"start":58.3},"id":"94497823-8347-4b0e-b26c-fe2aa2abbd31","type":"Range1d"},{"attributes":{"line_color":"#ff7f0e","line_width":2,"x":{"field":"Year"},"y":{"field":"Rate (per 100k 
people)"}},"id":"6d895348-af34-47fb-8db9-7aed49b78720","type":"Line"},{"attributes":{"click_policy":"mute","items":[{"id":"44735ebd-4ba2-41b2-a24b-6c318fc39393","type":"LegendItem"},{"id":"f8870c4e-c6c9-46b6-ac8b-82c4df0ff009","type":"LegendItem"},{"id":"ebda79be-5fa7-4bab-9636-7e7f956bcaa9","type":"LegendItem"}],"plot":{"id":"9268281a-2417-46c5-9182-d5ebb5689dc2","subtype":"Figure","type":"Plot"}},"id":"2215446e-e645-4546-acea-13fc8fa81a4a","type":"Legend"},{"attributes":{"data_source":{"id":"c408e98a-be3e-4937-954c-9e712fa19f2c","type":"ColumnDataSource"},"glyph":{"id":"f5cb752e-2308-4d2e-a511-0a1c60baf228","type":"Line"},"hover_glyph":null,"muted_glyph":{"id":"087d0ef0-6a69-4746-ba34-828ff44f60c8","type":"Line"},"nonselection_glyph":{"id":"4aff6e88-4363-4255-b78b-c71223315006","type":"Line"},"selection_glyph":null,"view":{"id":"de890536-a3b0-4809-af7a-f51c8ae52099","type":"CDSView"}},"id":"0d4d8d8c-cbc8-4eba-8c60-867a3ce11b30","type":"GlyphRenderer"},{"attributes":{"callback":null,"data":{"Rate (per 100k 
people)":{"__ndarray__":"zczMzMwMTkBmZmZmZiZNQJqZmZmZ2U1AZmZmZmbmTkDNzMzMzAxRQM3MzMzM7FFAMzMzMzMzVEAzMzMzM7NZQJqZmZmZeWBAzczMzMyMYkAzMzMzM4NlQAAAAAAAgGdAZmZmZmaWZkAzMzMzM+NmQJqZmZmZKWpAmpmZmZmZa0CamZmZmeloQGZmZmZm1mdAmpmZmZl5aEDNzMzMzExrQDMzMzMzY29AZmZmZmYmcECamZmZmdltQGZmZmZmFmtAZmZmZma2aUCamZmZmSlqQAAAAAAAQGxAZmZmZma2akAzMzMzM8NrQJqZmZmZSW1AzczMzMwEcEAzMzMzMwtxQDMzMzMze3BAAAAAAAAAcECamZmZmbltQM3MzMzMnGtAzczMzMw8aUBmZmZmZkZnQAAAAAAAsGRAMzMzMzPDYkAAAAAAACBiQAAAAAAAkGJAMzMzMzNDYkAAAAAAANBhQGZmZmZmFmFAmpmZmZmZYUAAAAAAAMBiQJqZmZmZiWJAzczMzMw8YkAzMzMzM6NgQDMzMzMz011AmpmZmZl5XEBmZmZmZkZcQAAAAAAAQFtAzczMzMyMWUA=","dtype":"float64","shape":[55]},"Rate_left_parenthesis_per_100k_people_right_parenthesis":{"__ndarray__":"zczMzMwMTkBmZmZmZiZNQJqZmZmZ2U1AZmZmZmbmTkDNzMzMzAxRQM3MzMzM7FFAMzMzMzMzVEAzMzMzM7NZQJqZmZmZeWBAzczMzMyMYkAzMzMzM4NlQAAAAAAAgGdAZmZmZmaWZkAzMzMzM+NmQJqZmZmZKWpAmpmZmZmZa0CamZmZmeloQGZmZmZm1mdAmpmZmZl5aEDNzMzMzExrQDMzMzMzY29AZmZmZmYmcECamZmZmdltQGZmZmZmFmtAZmZmZma2aUCamZmZmSlqQAAAAAAAQGxAZmZmZma2akAzMzMzM8NrQJqZmZmZSW1AzczMzMwEcEAzMzMzMwtxQDMzMzMze3BAAAAAAAAAcECamZmZmbltQM3MzMzMnGtAzczMzMw8aUBmZmZmZkZnQAAAAAAAsGRAMzMzMzPDYkAAAAAAACBiQAAAAAAAkGJAMzMzMzNDYkAAAAAAANBhQGZmZmZmFmFAmpmZmZmZYUAAAAAAAMBiQJqZmZmZiWJAzczMzMw8YkAzMzMzM6NgQDMzMzMz011AmpmZmZl5XEBmZmZmZkZcQAAAAAAAQFtAzczMzMyMWUA=","dtype":"float64","shape":[55]},"Variable":["Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery 
rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate","Robbery rate"],"Year":[1960,1961,1962,1963,1964,1965,1966,1967,1968,1969,1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982,1983,1984,1985,1986,1987,1988,1989,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014]},"selected":{"id":"5277ef26-8da1-45ef-b505-c9855b23f975","type":"Selection"},"selection_policy":{"id":"080c73c2-3a77-45d1-83b4-7e294d83271c","type":"UnionRenderers"}},"id":"26627c94-a925-4257-add4-d88fc980dcc4","type":"ColumnDataSource"},{"attributes":{"callback":null,"renderers":[{"id":"0d4d8d8c-cbc8-4eba-8c60-867a3ce11b30","type":"GlyphRenderer"},{"id":"f5632049-54e4-4001-905f-2e37f0480b55","type":"GlyphRenderer"},{"id":"125ade69-dfa3-4df1-90e0-238b813d7202","type":"GlyphRenderer"}],"tooltips":[["Variable","@{Variable}"],["Year","@{Year}"],["Rate (per 100k people)","@{Rate_left_parenthesis_per_100k_people_right_parenthesis}"]]},"id":"3481d497-d16e-4103-802f-c7f49222fe53","type":"HoverTool"},{"attributes":{"line_alpha":0.2,"line_color":"#ff7f0e","line_width":2,"x":{"field":"Year"},"y":{"field":"Rate (per 100k people)"}},"id":"b39f13cb-fc20-4a4f-aac9-2b702f99c6b1","type":"Line"},{"attributes":{},"id":"8223a783-c027-473c-9217-f60e3ac36aa0","type":"LinearScale"},{"attributes":{"source":{"id":"26627c94-a925-4257-add4-d88fc980dcc4","type":"ColumnDataSource"}},"id":"342bcecb-3198-4223-a63d-02a2b07986eb","type":"CDSView"},{"attributes":{"line_alpha":0.1,"line_color":"#ff7f0e","line_width":2,"x":{"field":"Year"},"y":{"field":"Rate (per 100k 
people)"}},"id":"44bd8031-3f2c-48c0-8850-51598e239176","type":"Line"},{"attributes":{},"id":"7fa57198-d15f-4a50-bad8-d6730c3d1eac","type":"UnionRenderers"},{"attributes":{},"id":"080c73c2-3a77-45d1-83b4-7e294d83271c","type":"UnionRenderers"},{"attributes":{"data_source":{"id":"26627c94-a925-4257-add4-d88fc980dcc4","type":"ColumnDataSource"},"glyph":{"id":"6d895348-af34-47fb-8db9-7aed49b78720","type":"Line"},"hover_glyph":null,"muted_glyph":{"id":"b39f13cb-fc20-4a4f-aac9-2b702f99c6b1","type":"Line"},"nonselection_glyph":{"id":"44bd8031-3f2c-48c0-8850-51598e239176","type":"Line"},"selection_glyph":null,"view":{"id":"342bcecb-3198-4223-a63d-02a2b07986eb","type":"CDSView"}},"id":"f5632049-54e4-4001-905f-2e37f0480b55","type":"GlyphRenderer"},{"attributes":{},"id":"82b27fb8-dcc0-4828-9fe8-cb11fa8c716d","type":"Selection"},{"attributes":{"callback":null,"data":{"Rate (per 100k people)":{"__ndarray__":"mpmZmZnJf0AzMzMzMzeAQJqZmZmZuYBAMzMzMzMDgkCamZmZmdWDQJqZmZmZtYRAAAAAAACIhkDNzMzMzNSJQGZmZmZmIo1AzczMzMzAjkCamZmZmfOQQAAAAAAALpJAMzMzMzPTkUAAAAAAABqTQM3MzMzMdpZAZmZmZmbwl0DNzMzMzKCWQDMzMzMzL5ZAZmZmZmZqlkCamZmZmZ+XQGZmZmZmUJpAzczMzMy8mUAAAAAAAECXQM3MzMzM6pRAAAAAAADGk0DNzMzMzC6UQDMzMzMzF5VAzczMzMzelEDNzMzMzJCUQGZmZmZmDpRAzczMzMxAk0BmZmZmZpCTQJqZmZmZQZJAzczMzMwukUBmZmZmZkiQQAAAAAAA2I5AAAAAAACIjUBmZmZmZraMQJqZmZmZ+YpAMzMzMzMTiEBmZmZmZsaGQGZmZmZmLodAAAAAAABYh0AAAAAAACiHQGZmZmZm0oZAMzMzMzO3hkDNzMzMzOiGQM3MzMzMsIZAAAAAAADohkCamZmZmW2GQAAAAAAA6IVAZmZmZmbqhUCamZmZmQGFQDMzMzMzE4NAAAAAAAD0gEA=","dtype":"float64","shape":[55]},"Rate_left_parenthesis_per_100k_people_right_parenthesis":{"__ndarray__":"mpmZmZnJf0AzMzMzMzeAQJqZmZmZuYBAMzMzMzMDgkCamZmZmdWDQJqZmZmZtYRAAAAAAACIhkDNzMzMzNSJQGZmZmZmIo1AzczMzMzAjkCamZmZmfOQQAAAAAAALpJAMzMzMzPTkUAAAAAAABqTQM3MzMzMdpZAZmZmZmbwl0DNzMzMzKCWQDMzMzMzL5ZAZmZmZmZqlkCamZmZmZ+XQGZmZmZmUJpAzczMzMy8mUAAAAAAAECXQM3MzMzM6pRAAAAAAADGk0DNzMzMzC6UQDMzMzMzF5VAzczMzMzelEDNzMzMzJCUQGZmZmZmDpRAzczMzMxAk0BmZmZmZpCTQJqZmZmZQZJAzczMzMwukUBmZmZmZkiQQAAAAAAA2I5AAAAAAACIjUBm
ZmZmZraMQJqZmZmZ+YpAMzMzMzMTiEBmZmZmZsaGQGZmZmZmLodAAAAAAABYh0AAAAAAACiHQGZmZmZm0oZAMzMzMzO3hkDNzMzMzOiGQM3MzMzMsIZAAAAAAADohkCamZmZmW2GQAAAAAAA6IVAZmZmZmbqhUCamZmZmQGFQDMzMzMzE4NAAAAAAAD0gEA=","dtype":"float64","shape":[55]},"Variable":["Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate","Burglary rate"],"Year":[1960,1961,1962,1963,1964,1965,1966,1967,1968,1969,1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982,1983,1984,1985,1986,1987,1988,1989,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014]},"selected":{"id":"c05057cb-f16d-4349-b03d-c09cdc6ba559","type":"Selection"},"selection_policy":{"id":"ffb0bc45-c827-4f96-ae52-ad134c44d0b3","type":"UnionRenderers"}},"id":"2b2287e6-2706-4b63-aedd-a6bf32979687","type":"ColumnDataSource"},{"attributes":{"line_color":"#2ca02c","line_width":2,"x":{"field":"Year"},"y":{"field":"Rate (per 100k people)"}},"id":"c15dc676-9a32-46ac-acfd-f99f4eb8e731","type":"Line"},{"attributes":{"label":{"value":"Robbery 
rate"},"renderers":[{"id":"f5632049-54e4-4001-905f-2e37f0480b55","type":"GlyphRenderer"}]},"id":"f8870c4e-c6c9-46b6-ac8b-82c4df0ff009","type":"LegendItem"}],"root_ids":["9268281a-2417-46c5-9182-d5ebb5689dc2"]},"title":"Bokeh Application","version":"0.12.16"}}
        </script>
        <script type="text/javascript">
          (function() {
            var fn = function() {
              Bokeh.safely(function() {
                (function(root) {
                  function embed_document(root) {

                  var docs_json = document.getElementById('de6014ac-0cb8-4cb5-aff2-696c243a0f51').textContent;
                  var render_items = [{"docid":"cf966cb6-e460-4e42-870e-e1fe77bb6636","elementid":"a54b0d41-2dbd-4cbd-abe8-b158862abe73","modelid":"9268281a-2417-46c5-9182-d5ebb5689dc2"}];
                  root.Bokeh.embed.embed_items(docs_json, render_items);

                  }
                  if (root.Bokeh !== undefined) {
                    embed_document(root);
                  } else {
                    var attempts = 0;
                    var timer = setInterval(function(root) {
                      if (root.Bokeh !== undefined) {
                        embed_document(root);
                        clearInterval(timer);
                      }
                      attempts++;
                      if (attempts > 100) {
                        console.log("Bokeh: ERROR: Unable to run BokehJS code because BokehJS library is missing")
                        clearInterval(timer);
                      }
                    }, 10, root)
                  }
                })(window);
              });
            };
            if (document.readyState != "loading") fn();
            else document.addEventListener("DOMContentLoaded", fn);
          })();
        </script>
    </body>
</html>


================================================
FILE: docs/source/api.rst
================================================
API
===

Auto-generated reference

.. toctree::
   :maxdepth: 1

   api_user.rst
   api_base.rst
   api_other.rst

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/api2.rst
================================================

.. _api2:

API Reference
=============

User Functions
--------------

.. autosummary::
    intake.config.Config
    intake.readers.datatypes.recommend
    intake.readers.convert.auto_pipeline
    intake.readers.convert.path
    intake.readers.entry.Catalog
    intake.readers.entry.DataDescription
    intake.readers.entry.ReaderDescription
    intake.readers.readers.recommend
    intake.readers.readers.reader_from_call

.. autoclass:: intake.config.Config
    :members:

.. autofunction:: intake.readers.datatypes.recommend

.. autofunction:: intake.readers.convert.auto_pipeline

.. autoclass:: intake.readers.entry.Catalog
    :members:

.. autoclass:: intake.readers.entry.DataDescription
    :members:

.. autoclass:: intake.readers.entry.ReaderDescription
    :members:

.. autofunction:: intake.readers.readers.recommend

.. autofunction:: intake.readers.readers.reader_from_call

.. _base:

Base Classes
------------

These may be subclassed by developers.

.. autosummary::
   intake.readers.datatypes.BaseData
   intake.readers.readers.BaseReader
   intake.readers.convert.BaseConverter
   intake.readers.namespaces.Namespace
   intake.readers.search.SearchBase
   intake.readers.user_parameters.BaseUserParameter

.. autoclass:: intake.readers.datatypes.BaseData
   :members:

.. autoclass:: intake.readers.readers.BaseReader
   :members:

.. autoclass:: intake.readers.convert.BaseConverter
   :members:

.. autoclass:: intake.readers.namespaces.Namespace
   :members:

.. autoclass:: intake.readers.search.SearchBase
   :members:

.. autoclass:: intake.readers.user_parameters.BaseUserParameter
   :members:

.. _data:

Data Classes
------------

.. autosummary::
   intake.readers.datatypes.ASDF
   intake.readers.datatypes.AVRO
   intake.readers.datatypes.CSV
   intake.readers.datatypes.Catalog
   intake.readers.datatypes.CatalogAPI
   intake.readers.datatypes.CatalogFile
   intake.readers.datatypes.DICOM
   intake.readers.datatypes.DeltalakeTable
   intake.readers.datatypes.Excel
   intake.readers.datatypes.FITS
   intake.readers.datatypes.Feather1
   intake.readers.datatypes.Feather2
   intake.readers.datatypes.FileData
   intake.readers.datatypes.GDALRasterFile
   intake.readers.datatypes.GDALVectorFile
   intake.readers.datatypes.GRIB2
   intake.readers.datatypes.GeoJSON
   intake.readers.datatypes.GeoPackage
   intake.readers.datatypes.HDF5
   intake.readers.datatypes.Handle
   intake.readers.datatypes.HuggingfaceDataset
   intake.readers.datatypes.IcebergDataset
   intake.readers.datatypes.JPEG
   intake.readers.datatypes.JSONFile
   intake.readers.datatypes.KerasModel
   intake.readers.datatypes.Literal
   intake.readers.datatypes.MatlabArray
   intake.readers.datatypes.MatrixMarket
   intake.readers.datatypes.NetCDF3
   intake.readers.datatypes.Nifti
   intake.readers.datatypes.NumpyFile
   intake.readers.datatypes.ORC
   intake.readers.datatypes.OpenDAP
   intake.readers.datatypes.PNG
   intake.readers.datatypes.Parquet
   intake.readers.datatypes.PickleFile
   intake.readers.datatypes.Prometheus
   intake.readers.datatypes.PythonSourceCode
   intake.readers.datatypes.RawBuffer
   intake.readers.datatypes.SKLearnPickleModel
   intake.readers.datatypes.SQLQuery
   intake.readers.datatypes.SQLite
   intake.readers.datatypes.STACJSON
   intake.readers.datatypes.Service
   intake.readers.datatypes.Shapefile
   intake.readers.datatypes.TFRecord
   intake.readers.datatypes.THREDDSCatalog
   intake.readers.datatypes.TIFF
   intake.readers.datatypes.Text
   intake.readers.datatypes.TileDB
   intake.readers.datatypes.TiledDataset
   intake.readers.datatypes.TiledService
   intake.readers.datatypes.WAV
   intake.readers.datatypes.XML
   intake.readers.datatypes.YAMLFile
   intake.readers.datatypes.Zarr

.. _reader:

Reader Classes
--------------

Includes readers, transformers, converters and output classes.

.. autosummary::
   intake.readers.catalogs.EarthdataCatalogReader
   intake.readers.catalogs.EarthdataReader
   intake.readers.catalogs.HuggingfaceHubCatalog
   intake.readers.catalogs.SKLearnExamplesCatalog
   intake.readers.catalogs.SQLAlchemyCatalog
   intake.readers.catalogs.STACIndex
   intake.readers.catalogs.StacCatalogReader
   intake.readers.catalogs.StacSearch
   intake.readers.catalogs.StackBands
   intake.readers.catalogs.THREDDSCatalogReader
   intake.readers.catalogs.TensorFlowDatasetsCatalog
   intake.readers.catalogs.TiledCatalogReader
   intake.readers.catalogs.TorchDatasetsCatalog
   intake.readers.convert.ASDFToNumpy
   intake.readers.convert.BaseConverter
   intake.readers.convert.DaskArrayToTileDB
   intake.readers.convert.DaskDFToPandas
   intake.readers.convert.DaskToRay
   intake.readers.convert.DeltaQueryToDask
   intake.readers.convert.DeltaQueryToDaskGeopandas
   intake.readers.convert.DicomToNumpy
   intake.readers.convert.DuckToPandas
   intake.readers.convert.FITSToNumpy
   intake.readers.convert.GenericFunc
   intake.readers.convert.HuggingfaceToRay
   intake.readers.convert.NibabelToNumpy
   intake.readers.convert.NumpyToTileDB
   intake.readers.convert.PandasToGeopandas
   intake.readers.convert.PandasToMetagraph
   intake.readers.convert.PandasToPolars
   intake.readers.convert.PandasToRay
   intake.readers.convert.Pipeline
   intake.readers.convert.PolarsEager
   intake.readers.convert.PolarsLazy
   intake.readers.convert.PolarsToPandas
   intake.readers.convert.RayToDask
   intake.readers.convert.RayToPandas
   intake.readers.convert.RayToSpark
   intake.readers.convert.SparkDFToRay
   intake.readers.convert.TileDBToNumpy
   intake.readers.convert.TileDBToPandas
   intake.readers.convert.TiledNodeToCatalog
   intake.readers.convert.TiledSearch
   intake.readers.convert.ToHvPlot
   intake.readers.convert.TorchToRay
   intake.readers.output.CatalogToJson
   intake.readers.output.DaskArrayToZarr
   intake.readers.output.GeopandasToFile
   intake.readers.output.MatplotlibToPNG
   intake.readers.output.NumpyToNumpyFile
   intake.readers.output.PandasToCSV
   intake.readers.output.PandasToFeather
   intake.readers.output.PandasToHDF5
   intake.readers.output.PandasToParquet
   intake.readers.output.Repr
   intake.readers.output.ToMatplotlib
   intake.readers.output.XarrayToNetCDF
   intake.readers.output.XarrayToZarr
   intake.readers.readers.ASDFReader
   intake.readers.readers.Awkward
   intake.readers.readers.AwkwardAVRO
   intake.readers.readers.AwkwardJSON
   intake.readers.readers.AwkwardParquet
   intake.readers.readers.Condition
   intake.readers.readers.CupyNumpyReader
   intake.readers.readers.CupyTextReader
   intake.readers.readers.DaskAwkwardJSON
   intake.readers.readers.DaskAwkwardParquet
   intake.readers.readers.DaskCSV
   intake.readers.readers.DaskDF
   intake.readers.readers.DaskDeltaLake
   intake.readers.readers.DaskHDF
   intake.readers.readers.DaskJSON
   intake.readers.readers.DaskNPYStack
   intake.readers.readers.DaskParquet
   intake.readers.readers.DaskSQL
   intake.readers.readers.DaskZarr
   intake.readers.readers.DeltaReader
   intake.readers.readers.DicomReader
   intake.readers.readers.DuckCSV
   intake.readers.readers.DuckDB
   intake.readers.readers.DuckJSON
   intake.readers.readers.DuckParquet
   intake.readers.readers.DuckSQL
   intake.readers.readers.FITSReader
   intake.readers.readers.FileByteReader
   intake.readers.readers.FileExistsReader
   intake.readers.readers.FileReader
   intake.readers.readers.GeoPandasReader
   intake.readers.readers.GeoPandasTabular
   intake.readers.readers.HandleToUrlReader
   intake.readers.readers.HuggingfaceReader
   intake.readers.readers.KerasAudio
   intake.readers.readers.KerasImageReader
   intake.readers.readers.KerasModelReader
   intake.readers.readers.KerasText
   intake.readers.readers.NibabelNiftiReader
   intake.readers.readers.NumpyReader
   intake.readers.readers.NumpyText
   intake.readers.readers.NumpyZarr
   intake.readers.readers.Pandas
   intake.readers.readers.PandasCSV
   intake.readers.readers.PandasExcel
   intake.readers.readers.PandasFeather
   intake.readers.readers.PandasHDF5
   intake.readers.readers.PandasORC
   intake.readers.readers.PandasParquet
   intake.readers.readers.PandasSQLAlchemy
   intake.readers.readers.Polars
   intake.readers.readers.PolarsAvro
   intake.readers.readers.PolarsCSV
   intake.readers.readers.PolarsDeltaLake
   intake.readers.readers.PolarsExcel
   intake.readers.readers.PolarsFeather
   intake.readers.readers.PolarsIceberg
   intake.readers.readers.PolarsJSON
   intake.readers.readers.PolarsParquet
   intake.readers.readers.PrometheusMetricReader
   intake.readers.readers.PythonModule
   intake.readers.readers.RasterIOXarrayReader
   intake.readers.readers.Ray
   intake.readers.readers.RayBinary
   intake.readers.readers.RayCSV
   intake.readers.readers.RayDeltaLake
   intake.readers.readers.RayJSON
   intake.readers.readers.RayParquet
   intake.readers.readers.RayText
   intake.readers.readers.Retry
   intake.readers.readers.SKImageReader
   intake.readers.readers.SKLearnExampleReader
   intake.readers.readers.SKLearnModelReader
   intake.readers.readers.ScipyMatlabReader
   intake.readers.readers.ScipyMatrixMarketReader
   intake.readers.readers.SparkCSV
   intake.readers.readers.SparkDataFrame
   intake.readers.readers.SparkDeltaLake
   intake.readers.readers.SparkParquet
   intake.readers.readers.SparkText
   intake.readers.readers.TFORC
   intake.readers.readers.TFPublicDataset
   intake.readers.readers.TFRecordReader
   intake.readers.readers.TFSQL
   intake.readers.readers.TFTextreader
   intake.readers.readers.TileDBDaskReader
   intake.readers.readers.TileDBReader
   intake.readers.readers.TiledClient
   intake.readers.readers.TiledNode
   intake.readers.readers.TorchDataset
   intake.readers.readers.XArrayDatasetReader
   intake.readers.readers.YAMLCatalogReader
   intake.readers.transform.DataFrameColumns
   intake.readers.transform.GetItem
   intake.readers.transform.Method
   intake.readers.transform.PysparkColumns
   intake.readers.transform.THREDDSCatToMergedDataset
   intake.readers.transform.XarraySel


================================================
FILE: docs/source/api_base.rst
================================================
Base Classes
------------

This is a reference API class listing, useful mainly for developers.

.. autosummary::
   intake.source.base.DataSourceBase
   intake.source.base.DataSource
   intake.catalog.Catalog
   intake.catalog.entry.CatalogEntry
   intake.catalog.local.UserParameter
   intake.source.derived.AliasSource
   intake.source.base.Schema

.. autoclass:: intake.source.base.DataSource
   :members:

   .. attribute:: plot

      Accessor for HVPlot methods.  See :doc:`plotting` for more details.

.. autoclass:: intake.catalog.Catalog
   :members:

.. autoclass:: intake.catalog.entry.CatalogEntry
   :members:

.. autoclass:: intake.catalog.local.UserParameter
   :members:

.. autoclass:: intake.source.derived.AliasSource
   :members:

.. autoclass:: intake.source.base.Schema
   :members:

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/api_other.rst
================================================
Other Classes
=============

GUI
---

.. autosummary::

   intake.interface.base.Base
   intake.interface.base.BaseSelector
   intake.interface.base.BaseView
   intake.interface.catalog.add.FileSelector
   intake.interface.catalog.add.URLSelector
   intake.interface.catalog.add.CatAdder
   intake.interface.catalog.search.Search
   intake.interface.source.defined_plots.Plots

.. autoclass:: intake.interface.base.Base
   :members:

.. autoclass:: intake.interface.base.BaseSelector
   :members:

.. autoclass:: intake.interface.base.BaseView
   :members:

.. autoclass:: intake.interface.catalog.add.FileSelector
   :members:

.. autoclass:: intake.interface.catalog.add.URLSelector
   :members:

.. autoclass:: intake.interface.catalog.add.CatAdder
   :members:

.. autoclass:: intake.interface.catalog.search.Search
   :members:

.. autoclass:: intake.interface.source.defined_plots.Plots
   :members:

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/api_user.rst
================================================
End User
--------

These are reference class and function definitions likely to be useful to everyone.

.. autosummary::
   intake.open_catalog
   intake.registry
   intake.register_driver
   intake.unregister_driver
   intake.source.csv.CSVSource
   intake.source.textfiles.TextFilesSource
   intake.source.jsonfiles.JSONFileSource
   intake.source.jsonfiles.JSONLinesFileSource
   intake.source.npy.NPySource
   intake.source.zarr.ZarrArraySource
   intake.catalog.local.YAMLFileCatalog
   intake.catalog.local.YAMLFilesCatalog
   intake.catalog.zarr.ZarrGroupCatalog

.. autofunction::
   intake.open_catalog

.. attribute:: intake.registry

   Mapping from plugin names to the DataSource classes that implement them. These are the
   names that should appear in the ``driver:`` key of each source definition in a
   catalog. See :doc:`plugin-directory` for more details.

.. attribute:: intake.open_

   Set of functions, one for each plugin, for direct opening of a data source. The names are derived from the names of
   the plugins in the registry at import time.


Source classes
''''''''''''''

.. autoclass:: intake.source.csv.CSVSource
   :members: __init__, discover, read_partition, read, to_dask

.. autoclass:: intake.source.zarr.ZarrArraySource
   :members: __init__, discover, read_partition, read, to_dask

.. autoclass:: intake.source.textfiles.TextFilesSource
   :members: __init__, discover, read_partition, read, to_dask

.. autoclass:: intake.source.jsonfiles.JSONFileSource
   :members: __init__, discover, read

.. autoclass:: intake.source.jsonfiles.JSONLinesFileSource
   :members: __init__, discover, read, head

.. autoclass:: intake.source.npy.NPySource
   :members: __init__, discover, read_partition, read, to_dask

.. autoclass:: intake.catalog.local.YAMLFileCatalog
   :members: __init__, reload, search, walk

.. autoclass:: intake.catalog.local.YAMLFilesCatalog
   :members: __init__, reload, search, walk

.. autoclass:: intake.catalog.zarr.ZarrGroupCatalog
   :members: __init__, reload, search, walk, to_zarr

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/catalog.rst
================================================
Catalogs
========

Data catalogs provide an abstraction that allows you to externally define, and optionally share, descriptions of
datasets, called *catalog entries*.  A catalog entry for a dataset includes information like:

* The name of the Intake driver that can load the data
* Arguments to the ``__init__()`` method of the driver
* Metadata provided by the catalog author (such as field descriptions and types, or data provenance)

In addition, Intake allows the arguments to data sources to be templated, with the variables explicitly
expressed as "user parameters". The given arguments are rendered using ``jinja2``, the
values of named user parameters, and any overrides.
The parameters also offer validation of the allowed types and values, for both the template
values and the final arguments passed to the data source. The parameters are named and described, to
indicate to the user what they are for. This kind of structure can be used to, for example,
choose between two parts of a given data source, like "latest" and "stable"; see the ``entry1_part`` entry in
the example below.

The user of the catalog can always override any template or argument value at the time
that they access a given source.

The Catalog class
-----------------

In Intake, a ``Catalog`` instance is an object with one or more named entries.
The entries might be read from a static file (e.g., YAML, see the next section), from
an Intake server or from any other data service that has a driver. Drivers which
create catalogs are ordinary DataSource classes, except that they have the container
type "catalog", and do not return data products via the ``read()`` method.

For example, you might choose to instantiate the base class and fill in some entries
explicitly in your code:

.. code-block:: python

    from intake.catalog import Catalog
    from intake.catalog.local import LocalCatalogEntry
    mycat = Catalog.from_dict({
        'source1': LocalCatalogEntry(name, description, driver, args=...),
        ...
        })

Alternatively, subclasses of ``Catalog`` can define how entries are created from
whichever file format or service they interact with, examples including ``RemoteCatalog``
and `SQLCatalog`_. These generate entries based on their respective targets; some
provide advanced search capabilities executed on the server.

.. _SQLCatalog: https://intake-sql.readthedocs.io/en/latest/api.html#intake_sql.SQLCatalog


YAML Format
-----------

Intake catalogs can most simply be described with YAML files. This is very common
in the tutorials and this documentation, because it is simple to understand, yet demonstrates
the many features of Intake. Note that YAML files are also the easiest way to share
a catalog, simply by copying to a publicly-available location such as a cloud storage
bucket.
Here is an example:

.. code-block:: yaml

    metadata:
      version: 1
      parameters:
        file_name:
          type: str
          description: default file name for child entries
          default: example_file_name
    sources:
      example:
        description: test
        driver: random
        args: {}

      entry1_full:
        description: entry1 full
        metadata:
          foo: 'bar'
          bar: [1, 2, 3]
        driver: csv
        args: # passed to the open() method
          urlpath: '{{ CATALOG_DIR }}/entry1_*.csv'

      entry1_part:
        description: entry1 part
        parameters: # User parameters
          part:
            description: section of the data
            type: str
            default: "stable"
            allowed: ["latest", "stable"]
        driver: csv
        args:
          urlpath: '{{ CATALOG_DIR }}/entry1_{{ part }}.csv'

      entry2:
        description: entry2
        driver: csv
        args:
          # file_name parameter will be inherited from file-level parameters, so will
          # default to "example_file_name"
          urlpath: '{{ CATALOG_DIR }}/entry2/{{ file_name }}.csv'


Metadata
''''''''

Arbitrary extra descriptive information can go into the metadata section. Some fields will be
claimed for internal use and some fields may be restricted to local reading; but for now the only
field that is expected is ``version``, which will be updated when a breaking change is made to the
file format. Any catalog will have ``.metadata`` and ``.version`` attributes available.

Note that each source also has its own metadata.

The metadata section can also contain ``parameters`` which will be inherited by the sources in
the file (note that these sources can augment these parameters, or override them with their own
parameters).

Extra drivers
'''''''''''''

The ``driver:`` entry of a data source specification can be a driver name, as has been shown in the examples so far.
It can also be an absolute class path to use for the data source, in which case there will be no ambiguity about how
to load the data. That is the preferred way to be explicit, when the driver name alone is not enough
(see `Driver Selection`_, below).

A catalog file can also list modules containing extra drivers in a ``plugins`` section:

.. code-block:: yaml

    plugins:
      source:
        - module: intake.catalog.tests.example1_source
    sources:
      ...

However, you do not, in general, need to do this, since the ``driver:`` field of
each source can also explicitly refer to the plugin class.

Sources
'''''''

The majority of a catalog file is composed of data sources, which are named data sets that can be loaded for the user.
Catalog authors describe the contents of a data set, how to load it, and optionally offer some customization of the
returned data.  Each data source has several attributes:

- ``name``: The canonical name of the source.  Best practice is to compose source names from valid Python identifiers.
  This allows Intake to support things like tab completion of data source names on catalog objects.
  For example, ``monthly_downloads`` is a good source
  name.
- ``description``: Human readable description of the source.  To help catalog browsing tools, the description should be
  Markdown.

- ``driver``: Name of the Intake :term:`Driver` to use with this source.  Must either already be installed in the current
  Python environment (i.e. with conda or pip) or loaded in the ``plugins`` section of the file. Can be a simple
  driver name like "csv" or the full path to the implementation class like "package.module.Class".

- ``args``: Keyword arguments to the init method of the driver.  Arguments may use template expansion.

- ``metadata``: Any metadata keys that should be attached to the data source when opened.  These will be supplemented
  by additional metadata provided by the driver.  Catalog authors can use whatever key names they would like, with the
  exception that keys starting with a leading underscore are reserved for future internal use by Intake.

- ``direct_access``: Control whether the data is directly accessed by the client, or proxied through a catalog server.
  See :ref:`remote-catalogs` for more details.

- ``parameters``: A dictionary of data source parameters.  See below for more details.

Caching Source Files Locally
''''''''''''''''''''''''''''

*This method of defining the cache with a dedicated block is deprecated; see the Remote Access
section, below*

To enable caching on the first read of remote data source files, add the ``cache`` section with the
following attributes:

- ``argkey``: The args section key which contains the URL(s) of the data to be cached.
- ``type``: One of the keys in the cache registry (``intake.source.cache.registry``), referring to an implementation of caching behaviour. The default is "file" for the caching of one or more files.

Example:

.. code-block:: yaml

  test_cache:
    description: cache a csv file from the local filesystem
    driver: csv
    cache:
      - argkey: urlpath
        type: file
    args:
      urlpath: '{{ CATALOG_DIR }}/cache_data/states.csv'

The ``cache_dir`` defaults to ``~/.intake/cache``, and can be specified in the intake configuration
file or ``INTAKE_CACHE_DIR``
environment variable, or at runtime using the ``"cache_dir"`` key of the configuration.
The special value ``"catdir"`` implies that cached files will appear in the same directory as the
catalog file in which the data source is defined, within a directory named "intake_cache". These will
not appear in the cache usage reported by the CLI.

Optionally, the cache section can have a ``regex`` attribute, that modifies the path of the cache on
the disk. By default, the cache path is made by concatenating ``cache_dir``, dataset name, hash of
the url, and the url itself (without the protocol). The ``regex`` attribute allows part of the
url (the matching part) to be removed.
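
As a rough illustration of the path layout just described, the sketch below builds a cache path using only the
standard library. It is not Intake's implementation: the function name ``sketch_cache_path``, the use of MD5 as the
hash, and the example URL are all assumptions made for the example.

```python
import hashlib
import posixpath
import re


def sketch_cache_path(cache_dir, dataset_name, url, regex=None):
    """Illustrative only: cache_dir / dataset name / hash of URL / URL path."""
    # Strip the protocol prefix, e.g. "s3://" or "https://"
    path = re.sub(r"^[a-zA-Z][a-zA-Z0-9+.-]*://", "", url)
    if regex is not None:
        # Remove the part of the path that matches the regex
        path = re.sub(regex, "", path, count=1)
    # MD5 is an assumption; the text above only says "hash of the url"
    url_hash = hashlib.md5(url.encode()).hexdigest()
    return posixpath.join(cache_dir, dataset_name, url_hash, path.lstrip("/"))


url = "s3://bucket/deep/dir/states.csv"  # hypothetical remote file
plain = sketch_cache_path("/tmp/cache", "states", url)
short = sketch_cache_path("/tmp/cache", "states", url, regex=r"bucket/deep/dir/")
```

Here ``plain`` keeps the full ``bucket/deep/dir/states.csv`` tail, while ``short`` ends directly in ``states.csv``
because the matching part was removed.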

Caching can be disabled at runtime for all sources regardless of the catalog specification::

    from intake.config import conf

    conf['cache_disabled'] = True

By default, progress bars are shown during downloads if the package ``tqdm`` is
available, but this can be disabled (e.g., for
consoles that don't support complex text) with::

    conf['cache_download_progress'] = False

or, equivalently, the environment variable ``INTAKE_CACHE_PROGRESS``.


The types of caching that are supported are listed in ``intake.source.cache.registry``; see
the docstrings of each for specific parameters that should appear in the cache block.


It is possible to work with compressed source files by setting ``type: compressed`` in the cache specification.
By default the compression type is inferred from the file extension, otherwise it can be set by assigning the ``decomp``
variable to any of the options listed in ``intake.source.decompress.decomp``.
This will extract all the file(s) in the compressed file referenced by urlpath and store them in the cache directory.

In cases where miscellaneous files are present in the compressed file, a ``regex_filter`` parameter can be used. Only
the extracted filenames that match the pattern will be loaded. The cache path is prepended to the filename, so it is
necessary to include a wildcard at the beginning of the pattern.

Example:

.. code-block:: yaml

  test_compressed:
    driver: csv
    args:
      urlpath: 'compressed_file.tar.gz'
    cache:
      - type: compressed
        decomp: tgz
        argkey: urlpath
        regex_filter: '.*data.csv'
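
The need for the leading wildcard can be seen in a small standard-library sketch. The extracted paths and the
helper ``keep`` are hypothetical, and the use of ``re.match`` (which anchors at the start of the string) is an
assumption about how the filter is applied, not a statement about Intake's code.

```python
import re

# Hypothetical paths of files extracted into the cache directory
extracted = [
    "/home/user/.intake/cache/abc123/archive/data.csv",
    "/home/user/.intake/cache/abc123/archive/README.txt",
]


def keep(paths, regex_filter):
    # re.match only succeeds if the pattern matches from the first character,
    # so the pattern has to account for the leading cache path
    return [p for p in paths if re.match(regex_filter, p)]


with_wildcard = keep(extracted, r".*data.csv")
without_wildcard = keep(extracted, r"data.csv")
```

With the wildcard, only the CSV file is kept; without it, nothing matches because every path starts with the
cache directory rather than with ``data.csv``.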

Templating
----------

Intake catalog files support Jinja2 templating for driver arguments. Any occurrence of
a substring like ``{{field}}`` will be replaced by the value of the user parameters with
that same name, or the value explicitly provided by the user. For how to specify these user parameters,
see the next section.

Some additional values are available for templating. The following is always available:
``CATALOG_DIR``, the full path to the directory containing the YAML catalog file.  This is especially useful
for constructing paths relative to the catalog directory to locate data files and custom drivers.
For example, the search for CSV files for the two "entry1" blocks, above, will happen in the same directory as
where the catalog file was found.

The following functions *may* be available. Since these execute code, the user of a catalog may decide
whether they trust those functions or not.

- ``env("USER")``: look in the set environment variables for the named variable
- ``client_env("USER")``: exactly the same, except that when using a client-server topology, the
  value will come from the environment of the client.
- ``shell("get_login thisuser -t")``: execute the command, and use the output as the value. The
  output will be trimmed of any trailing whitespace.
- ``client_shell("get_login thisuser -t")``: exactly the same, except that when using a client-server
  topology, the value will come from the system of the client.

The reason for the "client" versions of the functions is to prevent leakage of potentially sensitive
information between client and server by controlling where lookups happen. When working without a server,
only the ones without "client" are used.

An example:

.. code-block:: yaml

    sources:
      personal_source:
        description: This source needs your username
        args:
          url: "http://server:port/user/{{env(USER)}}"

Here, if the user is named "blogs", the ``url`` argument will resolve to
``"http://server:port/user/blogs"``; if the environment variable is not defined, it will
resolve to ``"http://server:port/user/"``.
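
The behaviour in this example can be mimicked with a few lines of standard-library Python. This is a simplified
stand-in based on a regular expression, not the Jinja2 rendering that Intake performs; the function name
``expand_env`` and the dictionaries passed as the environment are assumptions made for the illustration.

```python
import os
import re

# Matches {{env(NAME)}} as well as {{ env("NAME") }}
ENV_CALL = re.compile(r"\{\{\s*env\(\s*[\"']?(\w+)[\"']?\s*\)\s*\}\}")


def expand_env(template, environ=os.environ):
    """Replace each env(...) call with the value of the named variable."""
    # A variable that is not set expands to the empty string
    return ENV_CALL.sub(lambda m: environ.get(m.group(1), ""), template)


url = "http://server:port/user/{{env(USER)}}"
with_user = expand_env(url, {"USER": "blogs"})
without_user = expand_env(url, {})
```

Passing an explicit mapping instead of ``os.environ`` keeps the sketch deterministic; calling
``expand_env(url)`` would look the name up in the real environment.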

.. _paramdefs:

Parameter Definition
--------------------

Source parameters
'''''''''''''''''

A source definition can contain a "parameters" block.
Expressed in YAML, a parameter may look as follows:

.. code-block:: yaml

    parameters:
      name:
        description: name to use  # human-readable text for what this parameter means
        type: str  # one of bool, str, int, float, list[str | int | float], datetime, mlist
        default: normal  # optional, value to assume if user does not override
        allowed: ["normal", "strange"]  # optional, list of values that are OK, for validation
        min: "n"  # optional, minimum allowed, for validation
        max: "t"  # optional, maximum allowed, for validation

A parameter, not to be confused with an :term:`argument`,
can have one of two uses:

- to provide values for variables to be used in templating the arguments. *If* the pattern "{{name}}" exists in
  any of the source arguments, it will be replaced by the value of the parameter. If the user provides
  a value (e.g., ``source = cat.entry(name="something")``), that will be used, otherwise the default value. If
  there is no user input or default, the empty value appropriate for type is used. The ``default`` field allows
  for the same function expansion as listed for arguments, above.

- *If* an argument with the same name as the parameter exists, its value, after any templating, will be
  coerced to the given type of the parameter and validated against the allowed/max/min. It is therefore possible
  to use the string templating system (e.g., to get a value from the environment), but pass the final value as,
  for example, an integer. It makes no sense to provide a default for this case (the argument already has a value),
  but providing a default will not raise an exception.

- the "mlist" type is special: it means that the input must be a list, whose values are chosen from the
  allowed list. This is the only type where the parameter value is not the same type as the allowed list's
  values, e.g., if a list of str is set for ``allowed``, a list of str must also be the final value.

Note: the ``datetime`` type accepts multiple values:
Python datetime, ISO8601 string, Unix timestamp int, "now" and "today".
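
The coercion and validation rules above can be summarised in a short sketch. The functions ``validate`` and
``validate_mlist`` are hypothetical helpers written for this illustration; they do not reproduce the
``UserParameter`` class, and the ``datetime`` handling is left out.

```python
def validate(value, type_, allowed=None, min_=None, max_=None):
    """Coerce value to type_, then check it against allowed/min/max."""
    value = type_(value)
    if allowed is not None and value not in allowed:
        raise ValueError(f"{value!r} is not one of {allowed}")
    if min_ is not None and value < min_:
        raise ValueError(f"{value!r} is below the minimum {min_!r}")
    if max_ is not None and value > max_:
        raise ValueError(f"{value!r} is above the maximum {max_!r}")
    return value


def validate_mlist(values, allowed):
    """'mlist': the input must be a list whose items all come from allowed."""
    if not isinstance(values, list):
        raise ValueError("an mlist value must be a list")
    bad = [v for v in values if v not in allowed]
    if bad:
        raise ValueError(f"{bad} not in {allowed}")
    return values


# A templated string such as "8080" can end up as an integer argument
port = validate("8080", int, min_=1, max_=65535)
# The parameter from the YAML snippet above
name = validate("normal", str, allowed=["normal", "strange"], min_="n", max_="t")
```

A value outside the allowed list, such as ``validate("odd", str, allowed=["normal", "strange"])``, raises
``ValueError`` in this sketch.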

Catalog parameters
''''''''''''''''''

You can also define user parameters at the catalog level. This applies the parameter to
all entries within that catalog, without having to define it for each and every entry.
Furthermore, catalogs nested within the catalog will also inherit the parameter(s).

For example, with the following spec

.. code-block:: yaml

    metadata:
      version: 1
      parameters:
        bucket:
          type: str
          description: description
          default: test_bucket
    sources:
      param_source:
        driver: parquet
        description: description
        args:
          urlpath: s3://{{bucket}}/file.parquet
      subcat:
        driver: yaml_file
        path: "{{CATALOG_DIR}}/other.yaml"

If ``cat`` is the corresponding catalog instance,
the URL of source ``cat.param_source`` will evaluate to "s3://test_bucket/file.parquet" by default, but
the parameter can be overridden with ``cat.param_source(bucket="other_bucket")``. Also, any
entries of ``subcat``, another catalog referenced from here, would also have the "bucket"-named
parameter attached to all sources. Of course, those sources do not need to make use of the
parameter.

To change the default, we can generate a new instance:

.. code-block:: python

    cat2 = cat(bucket="production")  # sets default value of "bucket" for cat2
    subcat = cat.subcat(bucket="production")  # sets default only for the nested catalog

Of course, in these situations you can still override the value of the parameter for any
source, or pass explicit values for the arguments of the source, as normal.
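
The order of precedence described in this section (a value passed for one source, then a default set
on a catalog instance, then the default in the spec) can be sketched in plain Python. This is only a
model of the behaviour, not Intake's implementation; ``resolve_urlpath`` and its arguments are
hypothetical names.

```python
from collections import ChainMap

# Default taken from the "parameters" block of the example spec
catalog_defaults = {"bucket": "test_bucket"}


def resolve_urlpath(template, user_values=None, catalog_overrides=None):
    """Illustrative only: earlier mappings in the ChainMap win over later ones."""
    values = ChainMap(user_values or {}, catalog_overrides or {}, catalog_defaults)
    out = template
    for name in values:
        out = out.replace("{{" + name + "}}", str(values[name]))
    return out


template = "s3://{{bucket}}/file.parquet"
default_url = resolve_urlpath(template)
cat2_url = resolve_urlpath(template, catalog_overrides={"bucket": "production"})
call_url = resolve_urlpath(
    template,
    user_values={"bucket": "other_bucket"},
    catalog_overrides={"bucket": "production"},
)
```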

For cases where the catalog is not defined in a YAML spec, the argument ``user_parameters``
to the constructor takes the same form as ``parameters`` above: a dict of user parameters,
either as ``UserParameter`` instances or as a dictionary spec for each one.

Templating parameters
'''''''''''''''''''''

Template functions can also be used in parameters (see `Templating`_, above), but you can use the available functions directly without the extra `{{...}}`.

For example, this catalog entry uses the ``env("HOME")`` functionality as described to set a default based on the user's home directory.

.. code-block:: yaml

    sources:
      variabledefault:
        description: "This entry leads to an example csv file in the user's home directory by default, but the user can pass root='somepath' to override that."
        driver: csv
        args:
          path: "{{root}}/example.csv"
        parameters:
          root:
            description: "root path"
            type: str
            default: "env(HOME)"
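
To make the mechanism concrete, here is a minimal sketch of how a default written as ``env(NAME)``
could be expanded and then substituted into an argument. It is a simplified model and not Intake's
actual code; the helper names ``expand_default`` and ``render`` and the variable ``EXAMPLE_ROOT`` are
assumptions made for the demonstration.

```python
import os
import re


def expand_default(default):
    """Illustrative only: turn a default of the form env(NAME) into the value of $NAME."""
    match = re.fullmatch(r"""env\(["']?(\w+)["']?\)""", default)
    if match is None:
        return default  # a plain literal is used unchanged
    return os.environ.get(match.group(1), "")


def render(template, **params):
    """Illustrative only: replace each {{name}} with the matching parameter value."""
    for key, value in params.items():
        template = template.replace("{{" + key + "}}", str(value))
    return template


os.environ["EXAMPLE_ROOT"] = "/data"  # hypothetical variable, set only for this demo
root = expand_default("env(EXAMPLE_ROOT)")
path = render("{{root}}/example.csv", root=root)
```

Passing an explicit value, as in ``render("{{root}}/example.csv", root="somepath")``, bypasses the
default entirely.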


Driver Selection
----------------

In some cases, it may be possible that multiple backends are capable of loading from the same data
format or service. Sometimes, this may mean two drivers with unique names, or a single driver
with a parameter to choose between the different backends.

However, it is possible that multiple drivers for reading a particular type of data
also share the same driver name: for example, both the
intake-iris and the intake-xarray packages contain drivers with the name ``"netcdf"``, which
are capable of reading the same files, but with different backends. Here we will describe the
various possibilities of coping with this situation. Intake's plugin system makes it easy to encode such choices.

It may be
acceptable to use any driver which claims to handle that data type, or to give the option of
which driver to use to the user, or it may be necessary to specify which precise driver(s) are
appropriate for that particular data. Intake allows all of these possibilities, even if the
backend drivers require extra arguments.

Specifying a single driver explicitly, rather than using a generic name, would look like this:

.. code-block:: yaml

    sources:
      example:
        description: test
        driver: package.module.PluginClass
        args: {}

It is also possible to describe a list of drivers with the same syntax. The first one
found will be the one used. Note that the class imports will only happen at data source
instantiation, i.e., when the entry is selected from the catalog.

.. code-block:: yaml

    sources:
      example:
        description: test
        driver:
          - package.module.PluginClass
          - another_package.PluginClass2
        args: {}
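
The "first one found" rule can be pictured with the following sketch, which tries each dotted path in
turn and returns the first class that imports successfully. It is not Intake's implementation;
``first_available`` is a hypothetical helper, and standard-library names stand in for real driver
classes.

```python
import importlib


def first_available(dotted_paths):
    """Illustrative only: return the first class that can actually be imported."""
    for dotted in dotted_paths:
        module_name, _, class_name = dotted.rpartition(".")
        try:
            module = importlib.import_module(module_name)
            return getattr(module, class_name)
        except (ImportError, AttributeError, ValueError):
            continue  # not installed, or no such class: try the next candidate
    raise ImportError(f"none of {dotted_paths!r} could be imported")


# The first entry does not exist, so the second one is selected
cls = first_available(["not_installed_pkg.Reader", "collections.OrderedDict"])
```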

These alternative plugins can also be given data-source specific names, allowing the
user to choose at load time with ``driver=`` as a parameter. Additional arguments may also
be required for each option (which, as usual, may include user parameters); however, the
same global arguments will be passed to all of the drivers listed.


.. code-block:: yaml

    sources:
      example:
        description: test
        driver:
          first:
            class: package.module.PluginClass
            args:
              specific_thing: 9
          second:
            class: another_package.PluginClass2
        args: {}

Remote Access
-------------

(see also :ref:`remote_data` for the implementation details)

Many drivers support reading directly from remote data sources such as HTTP, S3 or GCS. In these cases,
the path to read from is usually given with a protocol prefix such as ``gcs://``. Additional dependencies
will typically be required (``requests``, ``s3fs``, ``gcsfs``, etc.), and any data package
should specify these.  Further parameters
may be necessary for communicating with the storage backend and, by convention, the driver should take
a parameter ``storage_options`` containing arguments to pass to the backend. Some
remote backends may also make use of environment variables or config files to
determine their default behaviour.

The special template variable "CATALOG_DIR" may be used to construct relative URLs in the arguments to
a source. In such cases, if the filesystem used to load that catalog contained arguments, then
the ``storage_options`` of that file system will be extracted and passed to the source. Therefore, all
sources which can accept general URLs (beyond just local paths) must make sure to accept this
argument.

As an example of using ``storage_options``, the following
two sources would allow for reading CSV data from S3 and GCS backends without
authentication (anonymous access), respectively

.. code-block:: yaml

   sources:
     s3_csv:
       driver: csv
       description: "Publicly accessible CSV data on S3; requires s3fs"
       args:
         urlpath: s3://bucket/path/*.csv
         storage_options:
           anon: true
     gcs_csv:
       driver: csv
       description: "Publicly accessible CSV data on GCS; requires gcsfs"
       args:
         urlpath: gcs://bucket/path/*.csv
         storage_options:
           token: "anon"

.. _caching:

**Using S3 Profiles**

An AWS profile may be specified as an argument under ``storage_options`` via the following format:

.. code-block:: yaml

      args:
        urlpath: s3://bucket/path/*.csv
        storage_options:
          profile: aws-profile-name


Caching
'''''''

URLs interpreted by ``fsspec`` offer `automatic caching`_. For example, to enable
file-based caching for the first source above, you can do:

.. code-block:: yaml

   sources:
     s3_csv:
       driver: csv
       description: "Publicly accessible CSV data on S3; requires s3fs"
       args:
         urlpath: simplecache::s3://bucket/path/*.csv
         storage_options:
           s3:
             anon: true

Here we have added the "simplecache" to the URL (this caching backend does not store any
metadata about the cached file) and specified that the "anon" parameter is
meant as an argument to s3, not to the caching mechanism. As each file in
s3 is accessed, it will first be downloaded and then the local version
used instead.

.. _automatic caching: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally

You can tailor how the caching works. In particular the location of the local
storage can be set with the ``cache_storage`` parameter (under the "simplecache"
group of storage_options, of course) - otherwise they are stored in a temporary
location only for the duration of the current python session. The cache location
is particularly useful in conjunction with an environment variable, or
relative to "{{CATALOG_DIR}}", wherever the catalog was loaded from.

Please see the ``fsspec`` documentation for the full set of cache types and their
various options.
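
For catalogs built in code rather than YAML, the same arguments can be assembled as a dictionary. The
sketch below only constructs the arguments and does not open any data; ``cached_csv_args`` and the
environment variable ``EXAMPLE_CACHE_DIR`` are hypothetical, while the ``s3``, ``simplecache`` and
``cache_storage`` keys are the ones discussed above.

```python
import os


def cached_csv_args(urlpath, cache_dir):
    """Illustrative only: arguments that route reads of urlpath through simplecache."""
    return {
        "urlpath": "simplecache::" + urlpath,
        "storage_options": {
            # options for the S3 backend
            "s3": {"anon": True},
            # options for the caching layer
            "simplecache": {"cache_storage": cache_dir},
        },
    }


# Take the cache location from an environment variable, with a fallback
cache_dir = os.path.join(os.environ.get("EXAMPLE_CACHE_DIR", "/tmp"), "intake_cache")
args = cached_csv_args("s3://bucket/path/*.csv", cache_dir)
```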

Local Catalogs
--------------

A Catalog can be loaded from a YAML file on the local filesystem by creating a Catalog object:

.. code-block:: python

    from intake import open_catalog
    cat = open_catalog('catalog.yaml')

Then sources can be listed:

.. code-block:: python

    list(cat)

and data sources are loaded via their name:

.. code-block:: python

    data = cat.entry_part1

and you can optionally configure new instances of the source to define user parameters
or override arguments by calling either of:

.. code-block:: python

    data = cat.entry_part1.configure_new(part='1')
    data = cat.entry_part1(part='1')  # this is a convenience shorthand

Intake also supports loading a catalog from all of the files ending in ``.yml`` and ``.yaml`` in a directory, or by using an
explicit glob-string. Note that the URL provided may refer to a remote storage system by passing a protocol
specifier such as ``s3://`` or ``gcs://``:

.. code-block:: python

    cat = open_catalog('/research/my_project/catalog.d/')

Intake Catalog objects will automatically reload changes or new additions to catalog files and directories on disk.
These changes will not affect already-opened data sources.


Catalog Nesting
---------------

A catalog is just another type of data source for Intake. For example, you can print a YAML
specification corresponding to a catalog as follows:

.. code-block:: python

    cat = intake.open_catalog('cat.yaml')
    print(cat.yaml())

results in:

.. code-block:: yaml

    sources:
      cat:
        args:
          path: cat.yaml
        description: ''
        driver: intake.catalog.local.YAMLFileCatalog
        metadata: {}

The point here is that this can be included in another catalog.
(It would, of course, be better to include a description and the full path of the catalog
file here.)
If the entry above were saved to another file, "root.yaml", and the
original catalog contained an entry, ``data``, you could access it as:

.. code-block:: python

    root = intake.open_catalog('root.yaml')
    root.cat.data



It is, therefore, possible to build up a hierarchy of catalogs referencing each other.
These can, of course, include remote URLs and indeed catalog sources other than simple files (all the
tables on a SQL server, for instance). Plus, since the argument and parameter system also
applies to entries such as the example above, it would be possible to give the user a runtime
choice of multiple catalogs to pick between, or have this decision depend on an environment
variable.

.. _remote-catalogs:

Server Catalogs
---------------

Intake also includes a server which can share an Intake catalog over HTTP
(or HTTPS with the help of a TLS-enabled reverse proxy).  From the user perspective, remote catalogs function
identically to local catalogs:

.. code-block:: python

    cat = open_catalog('intake://catalog1:5000')
    list(cat)

The difference is that operations on the catalog translate to requests sent to the catalog server.  Catalog servers
provide access to data sources in one of two modes:

* Direct access: In this mode, the catalog server tells the client how to load the data, but the client uses its
  local drivers to make the connection.  This requires the client has the required driver already installed *and*
  has direct access to the files or data servers that the driver will connect to.

* Proxied access: In this mode, the catalog server uses its local drivers to open the data source and stream the data
  over the network to the client.  The client does not need *any* special drivers to read the data, and can read data
  from files and data servers that it cannot access, as long as the catalog server has the required access.

Whether a particular catalog entry supports direct or proxied access is determined by the ``direct_access`` option:


- ``forbid`` (default): Force all clients to proxy data through the catalog server

- ``allow``: If the client has the required driver, access the source directly, otherwise proxy the data through the
  catalog server.

- ``force``: Force all clients to access the data directly.  If they do not have the required driver, an exception will
  be raised.

Note that when the client is loading a data source via direct access, the catalog server will need to send the driver
arguments to the client.  Do not include sensitive credentials in a data source that allows direct access.
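
The three settings can be summarised as a small decision function. This is a model of the rules
listed above and not code taken from the Intake server or client; the name ``access_mode`` and the
choice of exception type are assumptions.

```python
def access_mode(direct_access, client_has_driver):
    """Illustrative only: decide how a client would read a source."""
    if direct_access == "forbid":
        return "proxy"
    if direct_access == "allow":
        return "direct" if client_has_driver else "proxy"
    if direct_access == "force":
        if not client_has_driver:
            raise RuntimeError("the driver required for direct access is not installed")
        return "direct"
    raise ValueError(f"unknown direct_access setting: {direct_access!r}")


mode = access_mode("allow", client_has_driver=False)
```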

Client Authorization Plugins
''''''''''''''''''''''''''''

Intake servers can check if clients are authorized to access the catalog as a whole, or individual catalog entries.
Typically a matched pair of server-side plugin (called an "auth plugin") and a client-side plugin (called a "client
auth plugin") need to be enabled for authorization checks to work.  This feature is still in early development, but see
module ``intake.auth.secret`` for a demonstration pair of server and client classes implementing auth via a shared
secret.
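
The general idea of a shared-secret check can be sketched as follows. This is not the code in
``intake.auth.secret``; the function ``is_authorized`` and the header name ``x-example-secret`` are
hypothetical and chosen only for illustration.

```python
import hmac


def is_authorized(headers, shared_secret, header_name="x-example-secret"):
    """Illustrative only: accept a request whose header carries the shared secret."""
    supplied = headers.get(header_name, "")
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(supplied.encode("utf-8"), shared_secret.encode("utf-8"))


ok = is_authorized({"x-example-secret": "s3cret"}, "s3cret")
rejected = is_authorized({}, "s3cret")
```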

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/changelog.rst
================================================
Changelog
=========

2.0.4
-----

Released March 19, 2024

- re-enable v1 entrypoint sources
- expose recommend functions higher up (e.g., intake.recommend)
- add more geo types and readers, including pmtiles
- add migration guide

2.0.3
-----

Released February 29, 2024

- fix v1 caches
- more docs

2.0.0
-----

Released Jan 31, 2024

- complete rewrite of the package, see main docs page

0.7.0
-----

Released May 29, 2023

- be able to override arguments when using a source defined in an entry-point
- make sources usable without explicit dependence on dask: zarr, textfiles, csv
- removed some explicit usage (but not all) of dask throughout the codebase
- new dataframe pipeline transform source

.. _v0.6.8:

0.6.8
-----

Released March 11, 2023

- user parameter parsed as string before conversion to given type
- numpy source becomes first to have read() path avoid dask
- when registering drivers dynamically, corresponding open_* functions
  will be created automatically (plus refactor/cleanup of the discovery code)
- docs config and style updates; the list of plugins to automatically
  pull in status badges
- catalog .gui attribute will make top-level GUI instance instead of
  cut down one-catalog version
- pre-commit checks added and consistent code style applied


.. _v0.6.7:

0.6.7
-----

Released February 13, 2023

- server fix for upstream dask change giving newlines in report
- editable plots, based on hvPlot's "explorer"
- remove "text" input to YAMLFileCatalog
- GUI bug fixes
- allow catalog TTL as None

.. _v0.6.6:

0.6.6
-----

Released on August 26, 2022.

- Fixed bug in json and jsonl driver.
- Ensure description is retained in the catalog.
- Fix cache issue when running inside a notebook.
- Add templating parameters.
- Plotting api keeps hold of hvplot calls to allow other plots to be made.
- docs updates
- fix urljoin for server via proxy

.. _v0.6.5:

0.6.5
-----

Released on January 9, 2022.

- Added link to intake-google-analytics.
- Add tiled driver.
- Add json and jsonl drivers.
- Allow parameters to be passed through catalog.
- Add mlist type which allows inputs from a known list of values.

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/code-of-conduct.rst
================================================
Code of Conduct
===============

All participants in the fsspec community are expected to adhere to a Code of Conduct.

As contributors and maintainers of this project, and in the interest of
fostering an open and welcoming community, we pledge to respect all people who
contribute through reporting issues, posting feature requests, updating
documentation, submitting pull requests or patches, and other activities.

We are committed to making participation in this project a harassment-free
experience for everyone, treating everyone as unique humans deserving of
respect.

Examples of unacceptable behaviour by participants include:

- The use of sexualized language or imagery
- Personal attacks
- Trolling or insulting/derogatory comments
- Public or private harassment
- Publishing other's private information, such as physical or electronic
  addresses, without explicit permission
- Other unethical or unprofessional conduct

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviours that they deem inappropriate,
threatening, offensive, or harmful.

By adopting this Code of Conduct, project maintainers commit themselves
to fairly and consistently applying these principles to every aspect of
managing this project. Project maintainers who do not follow or enforce
the Code of Conduct may be permanently removed from the project team.

This code of conduct applies both within project spaces and in public
spaces when an individual is representing the project or its community.

If you feel the code of conduct has been violated, please report the
incident to the fsspec core team.

Reporting
---------

If you believe someone is violating the Code of Conduct, we ask that you report it
to the Project by emailing community@anaconda.com. All reports will be kept
confidential. In some cases we may determine that a public statement will need
to be made. If that's the case, the identities of all victims and reporters
will remain confidential unless those individuals instruct us otherwise.
If you believe anyone is in physical danger, please notify appropriate law
enforcement first.

In your report please include:

- Your contact info
- Names (real, nicknames, or pseudonyms) of any individuals involved.
  If there were other witnesses besides you, please try to include them as well.
- When and where the incident occurred. Please be as specific as possible.
- Your account of what occurred. If there is a publicly available record
  please include a link.
- Any extra context you believe existed for the incident.
- If you believe this incident is ongoing.
- If you believe any member of the core team has a conflict of interest
  in adjudicating the incident.
- What, if any, corrective response you believe would be appropriate.
- Any other information you believe we should have.

Core team members are obligated to maintain confidentiality with regard
to the reporter and details of an incident.

What happens next?
~~~~~~~~~~~~~~~~~~

You will receive an email acknowledging receipt of your complaint.
The core team will immediately meet to review the incident and determine:

- What happened.
- Whether this event constitutes a code of conduct violation.
- Who the bad actor was.
- Whether this is an ongoing situation, or if there is a threat to anyone's
  physical safety.
- If this is determined to be an ongoing incident or a threat to physical safety,
  the working groups' immediate priority will be to protect everyone involved.

If a member of the core team is one of the named parties, they will not be
included in any discussions, and will not be provided with any confidential
details from the reporter.

If anyone on the core team believes they have a conflict of interest in
adjudicating on a reported issue, they will inform the other core team
members, and exempt themselves from any discussion about the issue.
Following this declaration, they will not be provided with any confidential
details from the reporter.

Once the working group has a complete account of the events they will make a
decision as to how to respond. Responses may include:

- Nothing (if we determine no violation occurred).
- A private reprimand from the working group to the individual(s) involved.
- A public reprimand.
- An imposed vacation.
- A permanent or temporary ban from some or all spaces (GitHub repositories, etc.)
- A request for a public or private apology.

We'll respond within one week to the person who filed the report with either a
resolution or an explanation of why the situation is not yet resolved.

Once we've determined our final action, we'll contact the original reporter
to let them know what action (if any) we'll be taking. We'll take into account
feedback from the reporter on the appropriateness of our response, but we
don't guarantee we'll act on it.

Acknowledgement
---------------

This CoC is modified from the one by `BeeWare`_, which in turn refers to
the `Contributor Covenant`_ and the `Django`_ project.

.. _BeeWare: https://beeware.org/community/behavior/code-of-conduct/
.. _Contributor Covenant: https://www.contributor-covenant.org/version/1/3/0/code-of-conduct/
.. _Django: https://www.djangoproject.com/conduct/reporting/

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/community.rst
================================================
Community
=========

Intake is used and developed by individuals at a variety of institutions.  It
is open source (`license <https://github.com/intake/intake/blob/master/LICENSE>`_)
and sits within the broader Python numeric ecosystem commonly referred to as
PyData or SciPy.

Discussion
----------

Conversation happens in the following places:

1.  **Usage questions** are directed to `Stack Overflow with the #intake tag`_.
    Intake developers monitor this tag.
2.  **Bug reports and feature requests** are managed on the `GitHub issue
    tracker`_. Individual intake plugins are managed in separate repositories
    each with its own issue tracker. Please consult the :doc:`plugin-directory`
    for a list of available plugins.
3.  **Chat** occurs at `gitter.im/ContinuumIO/intake
    <https://gitter.im/ContinuumIO/intake>`_.  Note that
    because gitter chat is not searchable by future users we discourage usage
    questions and bug reports on gitter and instead ask people to use Stack
    Overflow or GitHub.
4.  **Monthly community meeting** happens the first Thursday of the month at
    9:00 US Central Time. See `<https://github.com/intake/intake/issues/596>`_,
    with a reminder sent out on the gitter channel. Strictly informal chatter.


.. _`Stack Overflow with the #intake tag`: https://stackoverflow.com/questions/tagged/intake
.. _`GitHub issue tracker`: https://github.com/intake/intake/issues/


Asking for help
---------------

We welcome usage questions and bug reports from all users, even those who are
new to using the project.  There are a few things you can do to improve the
likelihood of quickly getting a good answer.

1.  **Ask questions in the right place**:  We strongly prefer the use
    of Stack Overflow or GitHub issues over Gitter chat.  GitHub and
    Stack Overflow are more easily searchable by future users, and are therefore a more
    efficient use of everyone's time.  Gitter chat is strictly reserved for
    developer and community discussion.

    If you have a general question about how something should work or
    want best practices then use Stack Overflow.  If you think you have found a
    bug then use GitHub.

2.  **Ask only in one place**: Please restrict yourself to posting your
    question in only one place (likely Stack Overflow or GitHub) and don't post
    in both.

3.  **Create a minimal example**:  It is ideal to create `minimal, complete,
    verifiable examples <https://stackoverflow.com/help/mcve>`_.  This
    significantly reduces the time that answerers spend understanding your
    situation, resulting in higher quality answers more quickly.

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/conf.py
================================================
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# intake documentation build configuration file, created by
# sphinx-quickstart on Jan 8 09:15:00 2018.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys

sys.path.insert(0, os.path.abspath("../.."))
import intake

# -- General configuration ------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    "sphinx.ext.intersphinx",
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
    "numpydoc",
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]

# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
# source_suffix = ['.rst', '.md']
source_suffix = ".rst"

# The master toctree document.
master_doc = "index"

# General information about the project.
project = "intake"
copyright = "2018, Anaconda"
author = "Anaconda"

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = intake.__version__.split("+")[0]
# The full version, including alpha/beta/rc tags.
release = intake.__version__

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = "en"

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# These patterns also affect html_static_path and html_extra_path
exclude_patterns = ["**.ipynb_checkpoints"]

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = "sphinx"

# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = False


# -- Options for HTML output ----------------------------------------------

# The theme to use for HTML and HTML Help pages.  See the documentation for
# a list of builtin themes.
#

html_theme = "sphinx_rtd_theme"
html_favicon = "_static/images/favicon.ico"

# on_rtd is whether we are on readthedocs.org
on_rtd = os.environ.get("READTHEDOCS", None) == "True"

if not on_rtd:
    # only import and set the theme if we're building docs locally
    # otherwise, readthedocs.org uses their theme by default, so no need to specify it
    import sphinx_rtd_theme

    html_theme = "sphinx_rtd_theme"


# Theme options are theme-specific and customize the look and feel of a theme
# further.  For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}

# Default title is "<project> v<release> documentation"
html_title = "Intake documentation"

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]
html_css_files = [
    "css/custom.css",
]


# -- Options for HTMLHelp output ------------------------------------------

# Output file base name for HTML help builder.
htmlhelp_basename = "intakedoc"


# -- Options for LaTeX output ---------------------------------------------

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    #
    # 'papersize': 'letterpaper',
    # The font size ('10pt', '11pt' or '12pt').
    #
    # 'pointsize': '10pt',
    # Additional stuff for the LaTeX preamble.
    #
    # 'preamble': '',
    # Latex figure (float) alignment
    #
    # 'figure_align': 'htbp',
}

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
#  author, documentclass [howto, manual, or own class]).
latex_documents = [
    (master_doc, "intake.tex", "Intake Documentation", "Anaconda", "manual"),
]


# -- Options for manual page output ---------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [(master_doc, "intake", "Intake Documentation", [author], 1)]


# -- Options for Texinfo output -------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
#  dir menu entry, description, category)
texinfo_documents = [
    (
        master_doc,
        "intake",
        "Intake Documentation",
        author,
        "intake",
        "Fast data ingestion for Python.",
        "Miscellaneous",
    ),
]


# Example configuration for intersphinx: refer to the Python standard library.
intersphinx_mapping = {"python": ("https://docs.python.org/", None)}


# Config numpydoc
numpydoc_show_class_members = False
numpydoc_show_inherited_class_members = False
numpydoc_class_members_toctree = False


================================================
FILE: docs/source/contributing.rst
================================================
Contributor guide
=================

``intake`` is an open-source project (see the LICENSE). We welcome contributions from
the public, for code, including fixes and new features, documentation and anything else
that will help make this repo better. Even posting issues can be very useful, as they
will help others to make the necessary changes to alleviate the issue.

Development process
-------------------

Development of ``intake`` happens on `github`_. There you will find options for
creating issues and commenting on existing issues and pull-requests (PRs). You may
wish to "watch" the repo, to be notified of changes as they occur. You must have an
account on github to be able to interact here, but this is free. By default, you
will be notified of changes (e.g., new comments) on any issue or PR you have interacted
with.

In order to propose changes to the repo itself, you will need to create a PR. This is
done by following these steps:

1. clone the repo. There are many ways to do this, but most common is the following command,
   which will create a local directory ``intake/`` containing the code, metadata, docs and
   version control information.

.. code-block:: shell

   $ git clone https://github.com/intake/intake


2. create a fork of the repo using the github web interface. Your fork will probably live in
   your private github namespace. Set this as a remote inside your local copy of the repo

.. code-block:: shell

   $ git remote add fork https://github.com/<username>/intake

3. make changes locally in a new branch. First you create the branch, and then add commits
   to that branch. Here are suggested ways to do this. Note that git is *very* flexible and
   there are many ways to achieve each step.

.. code-block:: shell

   $ git checkout -b <new branch name>
   $ git commit -a

4. When your branch is in a suitable state, `push` your work to your branch. github will prompt
   you with a URL to create the PR, or navigate to your fork and branch in the web interface to
   create the PR there.

.. code-block:: shell

   $ git push fork

5. After review from a maintainer, you may wish to push more commits to your branch as required,
   and your PR may be accepted ("merged") or rejected ("closed").

.. _github: https://github.com/intake/intake

Guidelines
----------

To make contributing as smooth as possible, we recommend the following.

1. Always follow the project's Code of Conduct when interacting with other humans.

2. Please describe as clearly as possible what your intent is. In the case of issues, this
   might include pasting the whole traceback you have seen following an error, listing the
   versions of ``intake`` and its dependencies that you have installed, describing the
   circumstances when you saw a problem or would like better behaviour. Ideally, you would
   include code that allows maintainers to fully reproduce your steps.

3. When submitting changes, make sure that you describe what the changes achieve and how.
   Ideally, all code should be covered by tests included in the same PR, and that run to
   completion as part of CI (see below).

4. New functions and classes should include reasonable
   `style`, e.g., appropriate labels and hierarchy, indentation and other code formatting
   matching the rest of the docs, and docstrings and comments as appropriate. A "precommit"
   set of linters is available to run against your code, and runs as part of CI to enforce
   a minimal set of style rules. To run these locally on every commit, you can run this in the
   repo root:

.. code-block:: shell

   $ pre-commit install

5. Additions to the prose documentation (under docs/source/) should be included for new
   or altered features. After the initial full release, we will be maintaining a changelog.

Testing
-------

This repo uses ``pytest`` for testing. You can install test dependencies, for example with
this command run in the repo root. There are many optional dependencies for specific tests,
and we recommend that you use ``pytest.importorskip`` in tests that need these or additional
packages, so that they will not fail for developers without those dependencies. **Do**, however,
edit one or more files in scripts/ci/, to ensure that added tests will execute in at least one
of the CI runs.
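
As a minimal sketch of this pattern (not taken from the Intake test suite), a test module
can guard an optional import as follows. Here ``json`` merely stands in for an optional
package so that the example runs anywhere, and ``test_roundtrip`` is a hypothetical test.

.. code-block:: python

   import pytest

   # Skip the whole module if the optional dependency cannot be imported.
   # ASSUMPTION: "json" is a stand-in for a real optional package.
   json = pytest.importorskip("json")


   def test_roundtrip():
       # only runs when the import above succeeded
       assert json.loads(json.dumps({"a": 1})) == {"a": 1}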

The easiest way to bootstrap a development environment in order to run tests as they will in
CI is to use conda-env, e.g.:

.. code-block:: shell

   $ conda env create -y -f scripts/ci/environment-py313.yml
   $ conda activate test_env

To run the tests:

.. code-block:: shell

   $ pytest -v

Note that ensuring coverage is optional, but recommended.

Adding docs
-----------

Docstrings, prose text and examples/tutorials are eagerly accepted! We, as coders, often
are late to fully document our work, and all contributions are welcome. Separate instructions
can be found in the docs/README.md file.

In addition, full notebook examples may be added in the examples/ directory, but you
should be sure to add instructions on the appropriate environment or other preparation
required to run them.

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/data-packages.rst
================================================
Making Data Packages
====================

Intake can be used to create :term:`Data packages`, so that you can easily distribute
your catalogs - others can just "install data". Since you may also want to distribute
custom catalogues, perhaps with visualisations, and driver code, packaging these things
together is a great convenience. Indeed, packaging gives you the opportunity to
version-tag your distribution and to declare the requirements needed to be able to
use the data. This is a common pattern for distributing code for python and other
languages, but not commonly seen for data artifacts.

The current version of Intake allows making data packages using standard python
tools (to be installed, for example, using ``pip``).
The previous, now deprecated, technique is still described below, under
:ref:`condapack` and is specific to the `conda` packaging system.

Python packaging solution
-------------------------

Intake allows you to register data artifacts (catalogs and data sources) in the
metadata of a python package. This means, that when you install that package, intake
will automatically know of the registered items, and they will appear within the
"builtin" catalog ``intake.cat``.

Here we assume that you understand what is meant by a python package (i.e., a
folder containing ``__init__.py`` and other code, config and data files).
Furthermore, you should familiarise yourself with what is required for
bundling such a package into a *distributable* package (one with a ``setup.py``)
by reading the `official packaging documentation`_.

.. _official packaging documentation: https://packaging.python.org/tutorials/packaging-projects/

The `intake examples`_ repository contains a full tutorial for packaging and distributing
intake data and/or catalogs for ``pip`` and ``conda``, see the directory
"data_package/".

.. _intake examples: https://github.com/intake/intake-examples

Entry points definition
'''''''''''''''''''''''

Intake uses the concept of `entry points` to define the entries that are defined
by a given package. Entry points provide a mechanism to register metadata about a
package at install time, so that it can easily be found by other packages such as Intake.
Entry points was originally a `separate package`_, but is included in the standard
library as of python 3.8 (you will not need to install it, as Intake requires it).

All you need to do to register an entry in ``intake.cat`` is:

- define a data source somewhere in your package. This object can
  be of any type that makes sense to Intake, including Catalogs, and sources
  that have drivers defined in the very same package. Obviously, if you can have
  catalogs, you can populate these however you wish, including with more catalogs.
  You need not be restricted to simply loading in YAML files.
- include a block in your call to ``setup`` in ``setup.py`` with code something like

.. code-block:: python

    entry_points={
        'intake.catalogs': [
            'sea_cat = intake_example_package:cat',
            'sea_data = intake_example_package:data'
        ]
    }

Here only the lines with "sea_cat" and "sea_data" are specific to the example
package, the rest is required boilerplate. Each of those two lines defines a name
for the data entry (before the "=" sign) and the location to load from, in
module:object format.

- install the package using ``pip``, ``python setup.py``, or package it for ``conda``

.. _separate package: https://github.com/takluyver/entrypoints

Intake's process
''''''''''''''''

When Intake is imported, it investigates all registered entry points with the
``"intake.catalogs"`` group. It will go through and assign each name to the
given location of the final object. In the above example, ``intake.cat.sea_cat``
would be associated with the ``cat`` object in the ``intake_example_package``
package, and so on.

Note that Intake does **not** immediately import the given package or module, because imports
can sometimes be expensive, and if you have a lot of data packages, it might cause
a slow-down every time that Intake is imported. Instead, a placeholder entry is
created, and whenever the entry is accessed, that's when the particular package
will be imported.

.. code-block:: python

    In [1]: import intake

    In [2]: intake.cat.sea_cat  # does not import yet
    Out[2]: <Entry containing Catalog named sea_cat>

    In [3]: cat = intake.cat.sea_cat()  # imports now

    In [4]: cat   # this data source happens to be a catalog
    Out[4]: <Intake catalog: sea>

(note here the parentheses - this explicitly initialises the source, and normally
you don't have to do this)
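
The lazy lookup described above can be sketched with the standard library alone. This is
an illustration of the entry-points mechanism, not Intake's actual implementation; the
helper name ``discover`` is hypothetical.

.. code-block:: python

    from importlib.metadata import entry_points


    def discover(group="intake.catalogs"):
        """Map entry names to unloaded entry points for the given group."""
        eps = entry_points()
        if hasattr(eps, "select"):
            # Python 3.10+ returns a selectable collection
            found = eps.select(group=group)
        else:
            # older versions return a dict of group -> entry points
            found = eps.get(group, [])
        return {ep.name: ep for ep in found}


    placeholders = discover()
    for name, ep in placeholders.items():
        # ep.value is the "module:object" string; nothing is imported
        # until ep.load() is called
        print(name, "->", ep.value)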

.. _condapack:

Pure conda solution
-------------------

This packaging method is deprecated, but still available.

Combined with the `Conda Package Manager <https://conda.io/docs/>`_, Intake
makes it possible to create :term:`Data packages` which can be installed and upgraded just like
software packages.  This offers several advantages:

  * Distributing Catalogs and Drivers becomes as easy as ``conda install``
  * Data packages can be versioned, improving reproducibility in some cases
  * Data packages can depend on the libraries required for reading
  * Data packages can be self-describing using Intake catalog files
  * Applications that need certain Catalogs can include data packages in their dependency list

In this tutorial, we give a walk-through to enable you to distribute any
Catalogs to others, so that they can access the data using Intake without worrying about where it
resides or how it should be loaded.

Implementation
''''''''''''''

The function ``intake.catalog.default.load_combo_catalog`` searches for YAML catalog files in a number
of places at import. All entries in these catalogs are flattened and placed in the "builtin"
``intake.cat``.

The places searched are:

  * a platform-specific user directory as given by the `appdirs`_ package
  * in the environment's "/share/intake" data directory, where the location of the current environment
    is found from virtualenv or conda environment variables
  * in directories listed in the "INTAKE_PATH" environment variable or "catalog_path" config parameter

.. _appdirs: https://github.com/ActiveState/appdirs
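
The search locations can be pictured with the following sketch, which is not Intake's own
code. The helper ``candidate_dirs`` is hypothetical, the ``appdirs`` user directory is left
out to keep the example free of third-party imports, and splitting ``INTAKE_PATH`` on
``os.pathsep`` is an assumption.

.. code-block:: python

    import os
    import sys


    def candidate_dirs(environ=None):
        """List directories that might hold catalog files (illustrative only)."""
        environ = os.environ if environ is None else environ
        dirs = []
        # environment data directory: prefer conda/virtualenv variables,
        # falling back to the running interpreter's prefix
        prefix = (environ.get("CONDA_PREFIX")
                  or environ.get("VIRTUAL_ENV")
                  or sys.prefix)
        dirs.append(os.path.join(prefix, "share", "intake"))
        # ASSUMPTION: entries in INTAKE_PATH are separated by os.pathsep
        for path in environ.get("INTAKE_PATH", "").split(os.pathsep):
            if path:
                dirs.append(path)
        return dirs


    print(candidate_dirs())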

Defining a Package
''''''''''''''''''

The steps involved in creating a data package are:

1. Identifying a dataset, which can be accessed via a URL or included directly as one or more files in the package.

2. Creating a package containing:

   * an intake catalog file
   * a ``meta.yaml`` file (description of the data, version, requirements, etc.)
   * a script to copy the data

3. Building the package using the command ``conda build``.

4. Uploading the package to a package repository such as `Anaconda Cloud <https://anaconda.org>`_ or your own private
   repository.

Data packages are standard conda packages that install an Intake catalog file into the user's conda environment
(``$CONDA_PREFIX/share/intake``).  A data package does not necessarily imply there are data files inside the package.
A data package could describe remote data sources (such as files in S3) and take up very little space on disk.

These packages are considered ``noarch`` packages, so that one package can be installed on any platform, with any
version of Python (or no Python at all).  The easiest way to create such a package is using a
`conda build <https://conda.io/docs/commands/build/conda-build.html>`_ recipe.

Conda-build recipes are stored in a directory that contains files like:

  * ``meta.yaml`` - description of package metadata
  * ``build.sh`` - script for building/installing package contents (on Linux/macOS)
  * other files needed by the package (catalog files and data files for data packages)

An example that packages up data from a Github repository would look like this:

.. code-block:: yaml

    # meta.yaml
    package:
      version: '1.0.0'
      name: 'data-us-states'

    source:
      git_rev: v1.0.0
      git_url: https://github.com/CivilServiceUSA/us-states

    build:
      number: 0
      noarch: generic

    requirements:
      run:
        - intake
      build: []

    about:
      description: Data about US states from CivilServices (https://civil.services/)
      license: MIT
      license_family: MIT
      summary: Data about US states from CivilServices

The key part of a data package recipe (different from typical conda recipes) is the ``build`` section:

.. code-block:: yaml

    build:
      number: 0
      noarch: generic

This will create a package that can be installed on any platform, regardless of the platform where the package is
built.  If you need to rebuild a package, the build number can be incremented to ensure users get the latest version when they run ``conda update``.

The corresponding ``build.sh`` file in the recipe looks like this:

.. code-block:: bash

    #!/bin/bash

    mkdir -p $PREFIX/share/intake/civilservices
    cp $SRC_DIR/data/states.csv $PREFIX/share/intake/civilservices
    cp $RECIPE_DIR/us_states.yaml $PREFIX/share/intake/

The ``$SRC_DIR`` variable refers to any source tree checked out (from Github or other service), and the
``$RECIPE_DIR`` refers to the directory where the ``meta.yaml`` is located.

Finishing out this example, the catalog file for this data source looks like this:

.. code-block:: yaml

    sources:
      states:
        description: US state information from [CivilServices](https://civil.services/)
        driver: csv
        args:
          urlpath: '{{ CATALOG_DIR }}/civilservices/states.csv'
        metadata:
          origin_url: 'https://github.com/CivilServiceUSA/us-states/blob/v1.0.0/data/states.csv'

The ``{{ CATALOG_DIR }}`` Jinja2 variable is used to construct a path relative to where the catalog file was installed.

To build the package, you must have conda-build installed:

.. code-block:: bash

    conda install conda-build

Building the package requires no special arguments:

.. code-block:: bash

    conda build my_recipe_dir

Conda-build will display the path of the built package, which you will need in order to upload it.

If you want your data package to be publicly available on `Anaconda Cloud <https://anaconda.org>`_, you can install
the anaconda-client utility:

.. code-block:: bash

    conda install anaconda-client

Then you can register your Anaconda Cloud credentials and upload the package:

.. code-block:: bash

    anaconda login
    anaconda upload /Users/intake_user/anaconda/conda-bld/noarch/data-us-states-1.0.0-0.tar.bz2

Best Practices
--------------

Versioning
''''''''''

* Versions for data packages should be used to indicate changes in the data values or schema.  This allows applications
  to easily pin to the specific data version they depend on.

* Putting data files into a package ensures reproducibility by allowing a version number to be associated with files
  on disk.  This can consume quite a bit of disk space for the user, however. Large data files are not generally
  included in pip or conda packages so, if possible, you should reference the data assets in an external place where they
  can be loaded.

Packaging
'''''''''

* Packages that refer to remote data sources (such as databases and REST APIs) need to think about authentication.
  Do not include authentication credentials inside a data package.  They should be obtained from the environment.

* Data packages should depend on the Intake plugins required to read the data, or Intake itself.

* You may well want to break any driver code out into a separate package so that it can be updated
  independent of the data. The data package would then depend on the driver package.
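
As a sketch of the credentials advice above, a data package's loading code could look up
a secret at run time rather than storing it in the package. The variable name
``MY_DATA_TOKEN`` and the helper ``get_token`` are hypothetical.

.. code-block:: python

    import os


    def get_token(environ=None):
        """Fetch an access token from the environment, never from the package."""
        environ = os.environ if environ is None else environ
        token = environ.get("MY_DATA_TOKEN")  # hypothetical variable name
        if not token:
            raise RuntimeError("Set MY_DATA_TOKEN before loading this data")
        return token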

Nested catalogs
'''''''''''''''

As noted above, entries will appear in the users' builtin
catalog as ``intake.cat.*``. In the case that the catalog has multiple entries, it may be desirable
to put the entries below a namespace as ``intake.cat.data_package.*``. This can be achieved by having
one catalog containing the (several) data sources, with only a single top-level entry pointing to
it. This catalog could be defined in a YAML file, created using any other catalog driver, or constructed
in the code, e.g.:

.. code-block:: python

    from intake.catalog import Catalog
    from intake.catalog.local import LocalCatalogEntry as Entry

    # descr and my_input_list (an iterable of (name, url) pairs) are
    # placeholders that you supply
    cat = Catalog()
    cat._entries = {name: Entry(name, descr, driver='package.module.driver',
                                args={"urlpath": url})
                    for name, url in my_input_list}

If your package contains many sources of different types, you may even nest the catalogs, i.e.,
have a top-level whose contents are also catalogs.

.. code-block:: python

    e = Entry('first_cat', 'sample', driver='catalog')
    e._default_source = cat
    top_level = Catalog()
    top_level._entries = {'first_cat': e, ...}

where your entry point might look something like: ``"my_cat = my_package:top_level"``. You could achieve the same
with multiple YAML files.

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/deployments.rst
================================================
Deployment Scenarios
--------------------

In the following sections, we will describe some of the ways in which Intake is used in real
production systems. These go well beyond the typical YAML files presented in the quickstart
and examples sections, which are necessarily short and simple, and do not demonstrate the
full power of Intake.

Sharing YAML files
~~~~~~~~~~~~~~~~~~

This is the simplest scenario, and amply described in these documents. The primary
advantage is simplicity: it is enough to put a file in an accessible place (even
a gist or repo), in order
for someone else to be able to discover and load that data. Furthermore, such
files can easily refer to one-another, to build up a full tree of data assets with
minimum pain. Since YAML files are
text, this also lends itself to working well with version control systems.
Furthermore, all sources can describe themselves as YAML, and the
``export`` and ``upload`` commands can produce an efficient format (possibly remote) together
with YAML definition in a single step.

Pangeo
~~~~~~

The `Pangeo`_ collaboration uses Intake to catalog their data holdings, which are generally
in various forms of netCDF-compliant formats, massive multi-dimensional arrays with data
relating to earth and climate science and meteorology. On their cloud-based platform,
containers start up jupyter-lab sessions which have Intake installed, and therefore can
simply pick and load the data that each researcher needs - often requiring large Dask
clusters to actually do the processing.

A `static <https://pangeo-data.github.io/pangeo-datastore/>`__ rendering of the catalog
contents is available, so that users can browse the holdings
without even starting a python session. This rendering is produced by CI on the
`repo <https://github.com/pangeo-data/pangeo-datastore>`__ whenever new definitions are
added, and it also checks (using Intake) that each definition is indeed loadable.

Pangeo also developed intake-stac, which can talk to STAC servers to make real-time
queries and parse the results into Intake data sources. STAC is a standard for
spatio-temporal data assets, and indexes massive amounts of cloud-stored data.

.. _Pangeo: http://pangeo.io/

Anaconda Enterprise
~~~~~~~~~~~~~~~~~~~

Intake will be the basis of the data access and cataloging service within
`Anaconda Enterprise`_, running as a micro-service in a container, and offering data
source definitions to users. The access control, who gets to see which data-set,
and serving of credentials to be able to read from the various data storage services,
will all be handled by the platform and be fully configurable by admins.

.. _Anaconda Enterprise: https://www.anaconda.com/enterprise/

National Center for Atmospheric Research
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

NCAR has developed `intake-esm`_, a mechanism for creating file-based Intake catalogs
for climate data from project efforts such as the `Coupled Model Intercomparison Project (CMIP)`_
and the `Community Earth System Model (CESM) Large Ensemble Project`_.
These projects produce a huge amount of climate data persisted on tape and disk storage components
across multiple (of the order ~300,000) netCDF files. Finding, investigating and loading these files into data array containers
such as `xarray` can be a daunting task due to the large number of files a user may be interested in.
``Intake-esm`` addresses this issue in three steps:

- `Datasets Catalog Curation`_ in form of YAML files. These YAML files provide information about data locations,
  access pattern,  directory structure, etc. ``intake-esm`` uses these YAML files in conjunction with file name templates
  to construct a local database. Each row in this database consists of a set of metadata such as ``experiment``,
  ``modeling realm``, ``frequency`` corresponding to data contained in one netCDF file.

.. code-block:: python

   cat = intake.open_esm_metadatastore(catalog_input_definition="GLADE-CMIP5")


- Search and Discovery: once the database is built, ``intake-esm`` can be used for searching and discovering
  climate datasets, eliminating the need for the user to know the specific locations (file paths) of
  their data sets of interest:

.. code-block:: python

   sub_cat = cat.search(variable=['hfls'], frequency='mon', modeling_realm='atmos', institute=['CCCma', 'CNRM-CERFACS'])

- Access: when the user is satisfied with the results of their query, they can ask ``intake-esm``
  to load the actual netCDF files into xarray datasets:

.. code-block:: python

   dsets = cat.to_xarray(decode_times=True, chunks={'time': 50})

.. _intake-esm: https://github.com/NCAR/intake-esm
.. _Datasets Catalog Curation: https://github.com/NCAR/intake-esm-datastore
.. _Coupled Model Intercomparison Project (CMIP): https://www.wcrp-climate.org/wgcm-cmip
.. _Community Earth System Model (CESM) Large Ensemble Project: http://www.cesm.ucar.edu/projects/community-projects/LENS/

Brookhaven Archive
~~~~~~~~~~~~~~~~~~

The `Bluesky`_ project uses Intake to dynamically query a MongoDB instance, which
holds the details of experimental and simulation data catalogs, to return a
custom Catalog for every query. Data-sets can then be loaded into python, or the original
raw data can be accessed ...

.. _Bluesky: https://github.com/bluesky/intake-bluesky

Zillow
~~~~~~

Zillow is developing Intake to meet the needs of their datalake access layer (DAL),
to encapsulate the highly hierarchical nature of their data. Of particular importance
is the ability to provide different versions (testing/production, and different
storage formats) of the same logical dataset, depending on
whether it is being read on a laptop versus the production infrastructure ...


.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/examples.rst
================================================
Examples
========

Here we list links to notebooks and other code demonstrating the use of Intake in various
scenarios. The first section is of general interest to various users, and the sections that
follow tend to be more specific about particular features and workflows.

Many of the entries here include a link to Binder, which is a service that lets you execute
code live in a notebook environment. This is a great way to experience using Intake.
It can take a while, sometimes, for Binder to come up; please have patience.

See also the `examples`_ repository, containing data-sets which can be built and installed
as conda packages.

.. _examples: https://github.com/intake/intake-examples/

General
-------

- Basic Data scientist workflow: using Intake
  [`Static <https://github.com/intake/intake-examples/blob/master/tutorial/data_scientist.ipynb>`__]
  [`Executable <https://mybinder.org/v2/gh/intake/intake-examples/master?filepath=tutorial%2Fdata_scientist.ipynb>`__].

- Workflow for creating catalogs: a Data Engineer's approach to Intake
  [`Static <https://github.com/intake/intake-examples/blob/master/tutorial/data_engineer.ipynb>`__]
  [`Executable <https://mybinder.org/v2/gh/intake/intake-examples/master?filepath=tutorial%2Fdata_engineer.ipynb>`__]

Developer
---------

Tutorials delving deeper into the Internals of Intake, for those who wish to contribute

- How you would go about writing a new plugin
  [`Static <https://github.com/intake/intake-examples/blob/master/tutorial/dev.ipynb>`__]
  [`Executable <https://mybinder.org/v2/gh/intake/intake-examples/master?filepath=tutorial%2Fdev.ipynb>`__]

Features
--------

More specific examples of Intake functionality

- Caching:

    - New-style data package creation [`Static <https://github.com/intake/intake-examples/tree/master/data_package>`__]

    - Using automatically cached data-files
      [`Static <https://github.com/mmccarty/intake-blog/blob/master/examples/caching.ipynb>`__]
      [`Executable <https://mybinder.org/v2/gh/mmccarty/intake-blog/master?filepath=examples%2Fcaching.ipynb>`__]

    - Earth science demonstration of cached dataset
      [`Static <https://github.com/mmccarty/intake-blog/blob/master/examples/Walker_Lake.ipynb>`__]
      [`Executable <https://mybinder.org/v2/gh/mmccarty/intake-blog/master?filepath=examples%2FWalker_Lake.ipynb>`__]

- File-name pattern parsing:

    - Satellite imagery, science workflow
      [`Static <https://github.com/jsignell/intake-blog/blob/master/path-as-pattern/landsat.ipynb>`__]
      [`Executable <https://mybinder.org/v2/gh/jsignell/intake-blog/master?filepath=path-as-pattern%2Flandsat.ipynb>`__]

    - How to set up pattern parsing
      [`Static <https://github.com/jsignell/intake-blog/blob/master/path-as-pattern/csv.ipynb>`__]
      [`Executable <https://mybinder.org/v2/gh/jsignell/intake-blog/master?filepath=path-as-pattern%2Fcsv.ipynb>`__]

- Custom catalogs:

    - A custom intake plugin that adapts DCAT catalogs
      [`Static <https://github.com/CityOfLosAngeles/intake-dcat/blob/master/examples/demo.ipynb>`__]
      [`Executable <https://mybinder.org/v2/gh/CityOfLosAngeles/intake-dcat/master?urlpath=lab%2Ftree%2Fexamples%2Fdemo.ipynb>`__]


Data
----

- `Anaconda package data`_, originally announced in `this blog`_
- `Planet Four Catalog`_, originally from https://www.planetfour.org/results
- The official Intake `examples`_

.. _Anaconda package data: https://github.com/ContinuumIO/anaconda-package-data
.. _this blog: https://www.anaconda.com/announcing-public-anaconda-package-download-data/
.. _Planet Four Catalog: https://github.com/michaelaye/p4catalog

Blogs
-----

These are Intake-related articles that may be of interest.

- `Discovering and Exploring Data in a Graphical Interface`_
- `Taking the Pain out of Data Access`_
- `Caching Data on First Read Makes Future Analysis Faster`_
- `Parsing Data from Filenames and Paths`_
- `Intake for cataloguing Spark`_
- `Intake released on Conda-Forge`_

.. _Discovering and Exploring Data in a Graphical Interface: https://www.anaconda.com/intake-discovering-and-exploring-data-in-a-graphical-interface/
.. _Intake for cataloguing Spark: https://www.anaconda.com/intake-for-cataloging-spark/
.. _Taking the Pain out of Data Access: https://www.anaconda.com/intake-taking-the-pain-out-of-data-access/
.. _Caching Data on First Read Makes Future Analysis Faster: https://www.anaconda.com/intake-caching-data-on-first-read-makes-future-analysis-faster/
.. _Parsing Data from Filenames and Paths: https://www.anaconda.com/intake-parsing-data-from-filenames-and-paths/
.. _Intake released on Conda-Forge: https://www.anaconda.com/intake-released-on-conda-forge/

Talks
-----

- `__init__ podcast interview (May 2019)`_
- `AnacondaCon (March 2019)`_
- `PyData DC (November 2018)`_
- `PyData NYC (October 2018)`_
- `ESIP tech dive (November 2018)`_

.. _\__init__ podcast interview (May 2019): https://www.pythonpodcast.com/intake-data-catalog-episode-213/
.. _ESIP tech dive (November 2018): https://www.youtube.com/watch?v=PSD7r3JFml0&feature=youtu.be
.. _PyData DC (November 2018): https://www.youtube.com/watch?v=OvZFtePHKXw
.. _PyData NYC (October 2018): https://www.youtube.com/watch?v=pjkMmJQfTb8
.. _AnacondaCon (March 2019): https://www.youtube.com/watch?v=oyZJrROQzUs

News
----

- See our `Wiki`_ page

.. _Wiki: https://github.com/intake/intake/wiki/Community-News

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/glossary.rst
================================================
Glossary
========

.. glossary::

    Argument
        One of a set of values passed to a function or class. In the Intake sense, this usually is the
        set of key-value pairs defined in the "args" section of a source definition; unless the user
        overrides, these will be used for instantiating the source.

    Cache
        Local copies of remote files. Intake allows for download-on-first-use for data-sources,
        so that subsequent access is much faster. The
        format of the files is unchanged in this case, but may be decompressed.

    Catalog
        An inventory of entries, each of which corresponds to a specific :term:`Data-set`. Within these docs, a catalog is
        most commonly defined in a :term:`YAML` file, for simplicity, but there are other possibilities, such as connecting to an Intake
        server or another third-party data service, like a SQL database. Thus, catalogs form a hierarchy: any
        catalog can contain other, nested catalogs.

    Catalog file
        A :term:`YAML` specification file which contains a list of named entries describing how to load data
        sources. :doc:`catalog`.

    Conda
        A package and environment management tool for the python ecosystem, see the `conda website`_. Conda ensures
        dependencies and correct versions are installed for you, provides precompiled, binary-compatible software,
        and extends to many languages beyond python, such as R, javascript and C.

    Conda package
        A single installable item which the :term:`Conda` application can install. A package may include
        a :term:`Catalog`, data-files and maybe some additional code. It will also include a specification of the
        dependencies that it requires (e.g., Intake and any additional :term:`Driver`), so that Conda can install those
        automatically. Packages can be created locally, or can be found on `anaconda.org`_ or other package
        repositories.

    Container
        One of the supported data formats. Each :term:`Driver` outputs its data in one of these. The
        containers correspond to familiar data structures for end-analysis, such as list-of-dicts, Numpy nd-array or
        Pandas data-frame.

    Data-set
        A specific assemblage of data. The type of data (tabular, multi-dimensional or something else) and the format
        (file type, data service type) are all attributes of the data-set. In addition, in the context of Intake,
        data-sets are usually entries within a :term:`Catalog` with additional descriptive text and metadata and
        a specification of *how* to load the data.

    Data Source
        An Intake specification for a specific :term:`Data-set`. In most cases, the two terms are
        synonymous.

    Data User
        A person who uses data to produce models and other inferences/conclusions. This
        person generally uses standard python analysis packages like Numpy, Pandas, SKLearn and may produce
        graphical output. They will want to be able to find the right data for a given job, and for
        the data to be available in a standard format as quickly and
        easily as possible. In many organisations, the appropriate job title may be Data Scientist, but
        research scientists and BI/analysts also fit this description.

    Data packages
        Data packages are standard conda packages that install an Intake catalog file into the user’s conda
        environment ($CONDA_PREFIX/share/intake). A data package does not necessarily imply there are data files
        inside the package. A data package could describe remote data sources (such as files in S3) and take up
        very little space on disk.

    Data Provider
        A person whose main objective is to curate data sources, get them into appropriate
        formats, describe the contents, and disseminate the data to those that need to use them. Such a person
        may care about the specifics of the storage format and backing store, the right number of fields
        to keep and removing bad data. They may have a good idea of the best way to visualise any given
        data-set. In an organisation, this job may be known as Data Engineer, but it could as easily be
        done by a member of the IT team. These people are the most likely to author :term:`Catalogs<Catalog>`.

    Developer
        A person who writes or fixes code. In the context of Intake, a developer may make new format
        :term:`Drivers<Driver>`, create authentication systems or add functionality to Intake itself. They can
        take existing code for loading data in other projects, and use Intake to add extra functionality to it,
        for instance, remote data access, parallel processing, or file-name parsing.

    Driver
        The thing that does the work of reading the data for a catalog entry is known as a driver, often referred
        to using a simple name such as "csv". Intake
        has a plugin architecture, and new drivers can be created or installed, and specific catalogs/data-sets may
        require particular drivers for their contained data-sets. If installed as :term:`Conda` packages, then
        these requirements will be automatically installed for you. The driver's output will be a :term:`Container`,
        and often the code is a simpler layer over existing functionality in a third-party package.

    GUI
        A Graphical User Interface. Intake comes with a GUI for finding and selecting data-sets, see :doc:`gui`.

    IT
        The Information Technology team for an organisation. Such a team may have
        control of the computing infrastructure and security (sys-ops), and may well act as gate-keepers when
        exposing data for use by other colleagues. Commonly, IT has stronger policy enforcement requirements
        than other groups, for instance requiring all data-set copy actions to be logged centrally.

    Persist
        A process of making a local version of a data-source. One canonical format is used for each
        of the container types, optimised for quick and parallel access. This is particularly useful
        if the data takes a long time to acquire, perhaps because it is the result of a complex
        query on a remote service. The resultant output can be set to expire and be automatically
        refreshed, see :doc:`persisting`. Not to be confused with the :term:`cache`.

    Plugin
        Modular extra functionality for Intake, provided by a package that is installed separately. The most common type of
        plugin will be for a :term:`Driver` to load some particular data format; but other parts of Intake are
        pluggable, such as authentication mechanisms for the server.

    Server
        A remote source for Intake catalogs. The server will
        provide data source specifications (i.e., a remote :term:`Catalog`), and may also provide the raw data, in situations
        where the client is not able or not allowed to access it directly. As such, the server can act as a gatekeeper of
        the data for security and monitoring purposes. The implementation of the server in Intake is accessible as the
        ``intake-server`` command, and acts as a reference: other implementations can easily be created for
        specific circumstances.

    TTL
        Time-to-live, how long before the given entity is considered to have expired. Usually in seconds.

    User Parameter
        A data source definition can contain a "parameters" section, which can act as explicit decision indicators
        for the user, or as validation and type coercion for the definition's :term:`Argument` s. See
        :ref:`paramdefs`.

    YAML
        A text-based format for expressing data with a dictionary (key-value) and list structure, with a limited
        number of data-types, such as strings and numbers. YAML uses indentations to nest objects, making it easy
        to read and write for humans, compared to JSON. Intake's catalogs and config are usually expressed in YAML
        files.


.. _conda website: https://conda.io/docs/
.. _anaconda.org: http://anaconda.org

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/gui.rst
================================================
GUI
===

Using the GUI
-------------

**Note**: the GUI requires ``panel`` and ``bokeh`` to
be available in the current environment.

The Intake top-level singleton ``intake.gui`` gives access to a graphical data browser
within the Jupyter notebook. To expose it, simply enter it into a code cell (Jupyter
automatically displays the last object in a code cell).

.. image:: _static/images/gui_builtin.png

New instances of the GUI are also available by instantiating ``intake.interface.gui.GUI``,
where you can specify a list of catalogs to initially include.

The GUI contains three main areas:

- a **list of catalogs**. The "builtin" catalog, displayed by default, includes data-sets installed
  in the system, the same as ``intake.cat``.

- a **list of sources** within the currently selected catalog.

- a **description** of the currently selected source.


Catalogs
--------
Selecting a catalog from the list will display nested catalogs below the parent and display
source entries from the catalog in the **list of sources**.

Below the **list of catalogs** is a row of buttons that are used for adding, removing and
searching-within catalogs:

-  **Add**: opens a sub-panel for adding catalogs to the interface, by either browsing for a local
   YAML file or by entering a URL for a catalog, which can be a remote file or Intake server

-  **Remove**: deletes the currently selected catalog from the list

-  **Search**: opens a sub-panel for finding entries in the currently selected catalog (and its
   sub-catalogs)

Add Catalogs
~~~~~~~~~~~~

The Add button (+) exposes a sub-panel with two main ways to add catalogs to the interface:

.. image:: _static/images/gui_add.png

This panel has a tab to load files from **local**; from that you can navigate around the filesystem
using the arrow or by editing the path directly. Use the home button to get back to the starting
place. Select the catalog file you need. Use the "Add Catalog" button to add the catalog to the list
above.

.. image:: _static/images/gui_add_local.png

Another tab loads a catalog from **remote**. Any URL is valid here, including cloud locations,
``"gcs://bucket/..."``, and intake servers, ``"intake://server:port"``. Without a protocol
specifier, this can be a local path. Again, use the "Add Catalog" button to add
the catalog to the list above.

.. image:: _static/images/gui_add_remote.png

Finally, you can add catalogs to the interface in code, using the ``.add()`` method,
which can take filenames, remote URLs or existing ``Catalog`` instances.

Remove Catalogs
~~~~~~~~~~~~~~~

The Remove button (-) deletes the currently selected catalog from the list. It is important to
note that this action does not have any impact on files, it only affects what shows up in the list.

.. image:: _static/images/gui_remove.png

Search
~~~~~~

The sub-panel opened by the Search button (🔍) allows the user to search within the selected catalog.

.. image:: _static/images/gui_search.png

From the Search sub-panel the user enters free-form text. Since some catalogs contain nested sub-catalogs,
the Depth selector allows the search to be limited to the stated number of nesting levels.
This may be necessary, since, in theory, catalogs can contain circular references,
and therefore allow for infinite recursion.

.. image:: _static/images/gui_search_inputs.png

Upon execution of the search, the currently selected catalog will be searched. Entries will
be considered to match if any of the entered words is found in the description of the entry (this
is case-insensitive). If any matches are found, a new entry will be made in the catalog list,
with the suffix "_search".
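
The matching rule and the depth limit described above can be sketched in plain Python.
This is only an illustration, not Intake's search code: the nested-dict catalog and the
``matches`` and ``search`` helpers are hypothetical stand-ins.

```python
def matches(description, query):
    """Return True if any word of ``query`` occurs in ``description``, ignoring case."""
    text = description.lower()
    return any(word in text for word in query.lower().split())


def search(catalog, query, depth=2):
    """Search a nested dict of {name: description or sub-catalog}, at most ``depth`` levels deep."""
    found = {}
    if depth < 1:
        # The depth limit also guards against circular references.
        return found
    for name, entry in catalog.items():
        if isinstance(entry, dict):
            # Nested catalog: recurse with one level less and prefix the entry names.
            for sub_name, desc in search(entry, query, depth - 1).items():
                found[f"{name}.{sub_name}"] = desc
        elif matches(entry, query):
            found[name] = entry
    return found


catalog = {
    "sea_ice": "Arctic/Antarctic Sea Ice",
    "nested": {"taxi": "NYC taxi trips", "ice_cores": "Greenland ice cores"},
}
results = search(catalog, "ICE weather", depth=2)
```

With ``depth=1`` the nested catalog is not descended into, so only top-level entries can match.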

.. image:: _static/images/gui_search_cat.png

Sources
-------
Selecting a source from the list updates the description text on the left-side of the gui.

Below the **list of sources** is a row of buttons for inspecting the selected data source:

-  **Plot**: opens a sub-panel for viewing the pre-defined (specified in the yaml) plots
   for the selected source.

Plot
~~~~

The Plot button (📊) opens a sub-panel with an area for viewing pre-defined plots.

.. image:: _static/images/gui_plot.png

These plots are specified in the catalog yaml and that yaml can be displayed by
checking the box next to "show yaml".

.. image:: _static/images/gui_plot_yaml.png

The holoviews object can be retrieved from the gui using ``intake.interface.source.plot.pane.object``,
and you can then use it in Python or export it to a file.

Interactive Visualization
'''''''''''''''''''''''''

If you have installed the optional extra packages `dfviz`_ and `xrviz`_, you can
interactively plot your dataframe or array data, respectively.

.. image:: _static/images/custom_button.png

.. _dfviz: https://dfviz.readthedocs.io/
.. _xrviz: https://xrviz.readthedocs.io/

The button "customize" will be available for data sources of the appropriate type.
Click this to open the interactive interface. If you have not selected a predefined
plot (or there are none), then the interface will start without any prefilled
values, but if you do first select a plot, then the interface will have its options
pre-filled from that plot's definition.

For specific instructions on how to use the interfaces (which can also be used
independently of the Intake GUI), please navigate to the linked documentation.

Note that the final parameters that are sent to ``hvPlot`` to produce the output
each time a plot is updated are explicitly available in YAML format, so that
you can save the state as a "predefined plot" in the catalog. The same set of
parameters can also be used in code, with ``datasource.plot(...)``.

.. image:: _static/images/YAMLtab.png

Using the Selection
-------------------

Once catalogs are loaded and the desired sources have been identified and selected,
the selected sources will be available at the ``.sources`` attribute (``intake.gui.sources``).
Each source entry has informational methods available and can be opened as a data source,
as with any catalog entry:

.. code-block:: python

   In [ ]: source_entry = intake.gui.sources[0]
           source_entry
   Out   :
   name: sea_ice_origin
   container: dataframe
   plugin: ['csv']
   description: Arctic/Antarctic Sea Ice
   direct_access: forbid
   user_parameters: []
   metadata:
   args:
     urlpath: https://timeseries.weebly.com/uploads/2/1/0/8/21086414/sea_ice.csv

   In [ ]: data_source = source_entry()  # may specify parameters here
           data_source.read()
   Out   : < some data >

   In [ ]: source_entry.plot()  # or skip data source step
   Out   : < graphics>

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/guide.rst
================================================
User Guide
----------

More detailed information about specific parts of Intake, such as how to author catalogs,
how to use the graphical interface, plotting, etc.

.. toctree::
    :maxdepth: 1

    gui.rst
    catalog.rst
    tools.rst
    persisting.rst
    plotting.rst
    plugin-directory.rst
    transforms.rst

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/index.rst
================================================
.. raw:: html

   <img src="_static/images/logo.png" alt="Intake Logo" style="float:right;width:94px;height:60px;">

.. _take2:

Intake Take2
============

*Taking the pain out of data access and distribution*

Intake is an open-source package to:

- describe your data declaratively
- gather data sets into catalogs
- search catalogs and services to find the right data you need
- load, transform and output data in many formats
- work with third party remote storage and compute platforms

This is the start of the documentation for the alpha version of Intake: Take2, a
rewrite of Intake (henceforth referred to as legacy or V1). We will give an
introduction to the ideas of Intake in general and specifically how to use this
release. Go directly to the walkthrough and examples, or read the following motivation
and declarations of scope.

.. note::

    We are making Take2 as a full release. It is still "beta" in the sense that we will be adding
    many data types, readers and transformers, and are prepared to revisit the API in general. The
    reason for not using a pre-release or RC is that users never see these.


.. warning::

    Looking for :ref:`v1` documentation? You may have just installed Intake and found that
    Take2 broke things for you, so you might wish to pin to an older version. Or stick around
    and find out why you might wish to update your code. All old "sources", whether still working
    or not, should be considered deprecated.


.. toctree::
    :maxdepth: 2

    scope2.rst
    user2.rst
    walkthrough2.rst
    tour2.rst
    api2.rst

    code-of-conduct.rst
    contributing.rst

    index_v1.rst

Install
-------

To install Intake Take2:

.. code-block:: bash

    conda install -c conda-forge intake
    # or
    pip install intake

Please leave issues and discussions on our `repo page`_.

.. _repo page: https://github.com/intake/intake

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

These docs pages collect anonymous tracking data using goatcounter, and the
dashboard is available to the public: https://intake.goatcounter.com/ .


.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/index_v1.rst
================================================
.. raw:: html

   <img src="_static/images/logo.png" alt="Intake Logo" style="float:right;width:94px;height:60px;">

.. _v1:

Intake Legacy
=============

*Taking the pain out of data access and distribution*


Intake is a lightweight package for finding, investigating, loading and disseminating data. It will appeal to different
groups for some of the reasons below, but is useful for all and acts as a common platform that everyone can use to
smooth the progression of data from developers and providers to users.

.. warning::

    This is the Legacy documentation for Intake pre-v2. To install, please pin your versions to
    "<2". You should expect old catalogs and sources to continue working, but a lot has changed,
    so we encourage all comers to read the new documentation and adapt their catalogs and code
    if possible.

    Looking for :ref:`take2` ?

Intake contains the following main components. You *do not* need to use them all! The
library is modular, only use the parts you need:

* A set of **data loaders** (:term:`Drivers<Driver>`) with a common interface, so that you can
  investigate or load anything, from local or remote, with the exact same call, turning it into data structures
  that you already know how to manipulate, such as arrays and data-frames.
* A **Cataloging system** (:term:`Catalogs<Catalog>`) for listing data sources, their metadata and parameters,
  and referencing which of the Drivers should load each. The catalogs form a hierarchical,
  searchable structure, which can be backed by files, Intake servers or third-party
  data services
* Sets of **convenience functions** to apply to various data sources, such as data-set
  persistence, automatic concatenation and metadata inference and the ability to
  distribute catalogs and data sources using simple packaging abstractions.
* A **GUI layer** accessible in the Jupyter notebook or as a standalone webserver, which
  allows you to find and navigate catalogs, investigate data sources, and plot either
  predefined visualisations or interactively find the right view yourself
* A **client-server protocol** to allow for arbitrary data cataloging services or to
  serve the data itself, with a pluggable auth model.


:term:`Data User`
-----------------

.. raw:: html

   <img src="_static/images/line.png" alt="Line graph" style="float:left;width:160px;height:120px;padding-right:25px">

* Intake loads the data for a range of formats and types (see :ref:`plugin-directory`) into containers you already use,
  like Pandas dataframes, Python lists, NumPy arrays, and more
* Intake loads, then gets out of your way
* GUI search and introspect data-sets in :term:`Catalogs<Catalog>`: quickly find what you need to do your work
* Install data-sets and automatically get requirements
* Leverage cloud resources and distributed computing.

See the executable tutorial:

.. image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gh/intake/intake-examples/master?filepath=tutorial%2Fdata_scientist.ipynb

:term:`Data Provider`
---------------------

.. raw:: html

   <img src="_static/images/grid.png" alt="Grid" style="float:right;width:160px;height:120px;">

* Simple spec to define data sources
* Single point of truth, no more copy&paste
* Distribute data using packages, shared files or a server
* Update definitions in-place
* Parametrise user options
* Make use of additional functionality like filename parsing and caching.

See the executable tutorial:

.. image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gh/intake/intake-examples/master?filepath=tutorial%2Fdata_engineer.ipynb

:term:`IT`
----------

.. raw:: html

   <img src="_static/images/terminal.png" alt="FA-terminal" style="float:right;width:80px;height:80px">

* Create catalogs out of established departmental practices
* Provide data access credentials via Intake parameters
* Use server-client architecture as gatekeeper:

   * add authentication methods
   * add monitoring point; track the data-sets being accessed.

* Hook Intake into proprietary data access systems.

:term:`Developer`
-----------------

.. raw:: html

   <img src="_static/images/code.png" alt="Python code" style="float:left;width:200px;height:90px;padding-right:25px">

* Turn boilerplate code into a reusable :term:`Driver`
* Pluggable architecture of Intake allows for many points to add and improve
* Open, simple code-base -- come and get involved on `github`_!

.. _github: https://github.com/intake/intake


See the executable tutorial:

.. image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gh/intake/intake-examples/master?filepath=tutorial%2Fdev.ipynb

First steps
===========

The :doc:`start` document contains the sections that all users new to Intake should
read through. :ref:`usecases` shows specific problems that Intake solves.
For a brief demonstration, which you can execute locally, go to :doc:`quickstart`.
For a general description of all of the components of Intake and how they fit together, go
to :doc:`overview`. Finally, for some notebooks using Intake and articles about Intake, go
to :doc:`examples` and `intake-examples`_.
These and other documentation pages will make reference to concepts that
are defined in the :doc:`glossary`.

.. _intake-examples: https://github.com/intake/intake-examples

|

|

.. toctree::
    :maxdepth: 1
    :hidden:

    start.rst
    guide.rst
    reference.rst
    roadmap.rst
    glossary.rst
    community.rst


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/making-plugins.rst
================================================
Making Drivers
==============

The goal of the Intake plugin system is to make it very simple to implement a :term:`Driver` for a new data source, without
any special knowledge of Dask or the Intake catalog system.

Assumptions
-----------

Although Intake is very flexible about data, there are some basic assumptions that a driver must satisfy.

Data Model
''''''''''

Intake currently supports 3 kinds of containers, representing the most common data models used in Python:

* dataframe
* ndarray
* python (list of Python objects, usually dictionaries)

Although a driver can load *any* type of data into any container, and new container types can be added to the list
above, it is reasonable to expect that the number of container types remains small. Declaring a container type is
only informational for the user when read locally, but streaming of data from a server requires that the container type
be known to both server and client.

A given driver must only return one kind of container.  If a file format (such as HDF5) could reasonably be
interpreted as two different data models depending on usage (such as a dataframe or an ndarray), then two different
drivers need to be created with different names.  If a driver returns the ``python`` container, it should document
what Python objects will appear in the list.

The source of data should be essentially permanent and immutable.  That is, loading the data should not destroy or
modify the data, nor should closing the data source destroy the data either.  When a data source is serialized and
sent to another host, it will need to be reopened at the destination, which may cause queries to be re-executed and
files to be reopened.  Data sources that treat readers as "consumers" and remove data once read will cause erratic
behavior, so Intake is not suitable for accessing things like FIFO message queues.

Schema
''''''

The schema of a data source is a detailed description of the data, which can be known by loading only metadata or by
loading only some small representative portion of the data. It is information to present to the user about the data
that they are considering loading, and may be important in the case of server-client communication. In the latter
context, the contents of the schema must be serializable by ``msgpack`` (i.e., numbers, strings, lists and
dictionaries only).

There may be unknown parts of
the schema before the whole data is read.  Drivers may require this unknown information in the
``__init__()`` method (or the catalog spec), or do some kind of partial data inspection to determine the schema; or,
more simply, the unknown parts may be given as ``None`` values.
Regardless of method used, the
time spent figuring out the schema ahead of time should be short and not scale with the size of the data.

Typical fields in a schema dictionary are ``npartitions``, ``dtype``, ``shape``, etc., which will be more appropriate
for some drivers/data-types than others.
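
As a rough illustration of the "partial data inspection" approach, the sketch below builds a
schema-like dictionary from a CSV header and a handful of rows, leaving the row count as ``None``.
The ``sketch_schema`` helper and its type-guessing rule are hypothetical, and ``json`` is used
here only as a stand-in check for serializability.

```python
import csv
import io
import json
from itertools import islice


def sketch_schema(fileobj, sample_rows=5):
    """Build a schema-like dict from the header and the first few rows only."""
    reader = csv.DictReader(fileobj)
    sample = list(islice(reader, sample_rows))
    columns = reader.fieldnames or []

    def guess(values):
        # Very naive inference: all-integer columns are "int64", anything else "object".
        try:
            for value in values:
                int(value)
            return "int64"
        except ValueError:
            return "object"

    return {
        "dtype": {col: guess([row[col] for row in sample]) for col in columns},
        "shape": (None, len(columns)),  # row count is unknown without a full read
        "npartitions": 1,
        "extra_metadata": {},
    }


data = io.StringIO("x,y,label\n1,10,a\n2,20,b\n3,30,c\n")
schema = sketch_schema(data)
serialized = json.dumps(schema)  # contains only numbers, strings, lists, dicts and None
```

The cost of this inspection depends on ``sample_rows`` rather than on the size of the file.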

Partitioning
''''''''''''

Data sources are assumed to be *partitionable*.  A data partition is a randomly accessible fragment of the data.
In the case of sequential and data-frame sources, partitions are numbered, starting from zero, and correspond to
contiguous chunks of data divided along the first
dimension of the data structure. In general, any partitioning scheme is conceivable, such as a tuple-of-ints to
index the chunks of a large numerical array.

Not all data sources can be partitioned.  For example, file
formats without sufficient indexing often can only be read from beginning to end.  In these cases, the DataSource
object should report that there is only 1 partition.  However, it often makes sense for a data source to be able to
represent a directory of files, in which case each file will correspond to one partition.
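
The directory-of-files case might look like the following sketch, in which each file is one
partition and partitions can be read in any order. ``DirectoryOfCSVs`` is a hypothetical,
standard-library-only stand-in and does not derive from any Intake class.

```python
import csv
import tempfile
from pathlib import Path


class DirectoryOfCSVs:
    """Hypothetical partitioned source: one CSV file corresponds to one partition."""

    def __init__(self, directory):
        self.directory = Path(directory)  # nothing is opened or listed yet
        self._files = None

    def _list_files(self):
        if self._files is None:
            self._files = sorted(self.directory.glob("*.csv"))
        return self._files

    @property
    def npartitions(self):
        return len(self._list_files())

    def read_partition(self, i):
        files = self._list_files()
        if not 0 <= i < len(files):
            raise IndexError(f"partition {i} is outside [0, {len(files)})")
        with files[i].open(newline="") as f:
            return list(csv.DictReader(f))

    def read_chunked(self):
        # Chunking at the partition level.
        for i in range(self.npartitions):
            yield self.read_partition(i)

    def read(self):
        return [row for chunk in self.read_chunked() for row in chunk]


with tempfile.TemporaryDirectory() as tmp:
    for n in range(3):
        Path(tmp, f"part{n}.csv").write_text(f"x,y\n{n},{n * 10}\n")
    source = DirectoryOfCSVs(tmp)
    count = source.npartitions
    second = source.read_partition(1)  # random access, no need to read partition 0 first
    everything = source.read()
```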

Metadata
''''''''

Once opened, a DataSource object can have arbitrary metadata associated with it.  The metadata for a data source
should be a dictionary that can be serialized as JSON.  This metadata comes from the following sources:

1. A data catalog entry can associate fixed metadata with the data source.  This is helpful for data formats that do
   not have any support for metadata within the file format.

2. The driver handling the data source may have some general metadata associated with the state of the system at the
   time of access, available even before loading any data-specific information.

3. A driver can add additional metadata when the schema is loaded for the data source.  This allows metadata embedded
   in the data source to be exported.

From the user perspective, all of the metadata should be loaded once the data source has loaded the rest of the
schema (after ``discover()``, ``read()``, ``to_dask()``, etc have been called).
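
A minimal sketch of combining the three sources of metadata into one JSON-serializable
dictionary is shown below. The ``combine_metadata`` helper is hypothetical, and the precedence
used (later sources override earlier ones) is an assumption made for illustration, not
documented Intake behaviour.

```python
import json


def combine_metadata(catalog_md, driver_md, schema_md):
    """Merge metadata dicts; later arguments win on key clashes (assumed order)."""
    merged = {}
    for part in (catalog_md, driver_md, schema_md):
        merged.update(part or {})
    # Fail early if something cannot be serialized as JSON.
    json.dumps(merged)
    return merged


metadata = combine_metadata(
    {"owner": "ops", "units": "K"},      # fixed metadata from the catalog entry
    {"driver_version": "0.0.1"},         # general metadata from the driver
    {"units": "degC", "c": 3},           # metadata found while loading the schema
)
```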


Subclassing ``intake.source.base.DataSourceBase``
-------------------------------------------------

Every Intake driver class should be a subclass of ``intake.source.base.DataSource``.
The class should have the following attributes to identify itself:

- ``name``: The short name of the driver.  This should be a valid python identifier.
  You should not include the
  word ``intake`` in the driver name.

- ``version``: A version string for the driver.  This may be reported to the user by tools
  based on Intake, but has
  no semantic importance.

- ``container``: The container type of data sources created by this object, e.g.,
  ``dataframe``, ``ndarray``, or
  ``python``, one of the keys of ``intake.container.container_map``.
  For simplicity, a driver may only return one type of container.  If a particular
  source of data could
  be used in multiple ways (such as HDF5 files interpreted as dataframes or as ndarrays),
  two drivers must be created.
  These two drivers can be part of the same Python package.

- ``partition_access``: Do the data sources returned by this driver have multiple
  partitions?  This may help tools in
  the future make more optimal decisions about how to present data.  If in doubt
  (or the answer depends on init
  arguments), ``True`` will always result in correct behavior, even if the data
  source has only one partition.

The ``__init__()`` method should always accept a keyword argument ``metadata``, a
dictionary of metadata from the
catalog to associate with the source.  This dictionary must be serializable as JSON.

The ``DataSourceBase`` class has a small number of methods which should be overridden.
Here is an example producing a
data-frame::

    import intake.source.base
    import pandas as pd

    class FooSource(intake.source.base.DataSource):
        container = 'dataframe'
        name = 'foo'
        version = '0.0.1'
        partition_access = True

        def __init__(self, a, b, metadata=None):
            # Do init here with a and b
            super(FooSource, self).__init__(
                metadata=metadata
            )

        def _get_schema(self):
            return intake.source.base.Schema(
                datashape=None,
                dtype={'x': "int64", 'y': "int64"},
                shape=(None, 2),
                npartitions=2,
                extra_metadata=dict(c=3, d=4)
            )

        def _get_partition(self, i):
            # Return the appropriate container of data here
            return pd.DataFrame({'x': [1, 2, 3], 'y': [10, 20, 30]})

        def read(self):
            self._load_metadata()
            return pd.concat([self.read_partition(i) for i in range(self.npartitions)])

        def _close(self):
            # close any files, sockets, etc
            pass

Most of the work typically happens in the following methods:

- ``__init__()``: Should be very lightweight and fast.  No files or network resources should be opened, and no
  significant memory should be allocated yet.  Data sources might be serialized immediately.  The default implementation
  of the pickle protocol in the base class will record all the arguments to ``__init__()`` and recreate the object with
  those arguments when unpickled, assuming the class has no side effects.

- ``_get_schema()``: May open files and network resources and return as much of the schema as possible in a small
  amount of *approximately* constant time. Typically, imports of packages needed by the source only happen here.
  The ``npartitions`` and ``extra_metadata`` attributes must be correct
  when ``_get_schema`` returns.  Further keys such as ``dtype``, ``shape``, etc., should reflect the container type of
  the data-source, and can be ``None`` if not easily knowable, or include ``None`` for some elements. File-based
  sources should use fsspec to open a local or remote URL, and pass ``storage_options`` to it. This ensures
  compatibility and extra features such as caching. If the backend can only deal with local files, you may
  still want to use ``fsspec.open_local`` to allow for caching.

- ``_get_partition(self, i)``: Should return all of the data from partition id ``i``, where ``i`` is typically an
  integer, but may be something more complex.
  The base class will automatically verify that ``i`` is in the range ``[0, npartitions)``, so no range checking is
  required in the typical case.

- ``_close(self)``: Close any network or file handles and deallocate any significant memory.  Note that these
  resources may need to be reopened/reallocated if a read is called again later.
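
The idea of recording the arguments to ``__init__()`` and recreating the object from them when
unpickled can be sketched with the standard ``pickle`` protocol. This is not Intake's base-class
code: ``RecordsInitArgs``, ``_rebuild`` and ``FooLike`` are hypothetical names used only to show
the mechanism.

```python
import pickle
import threading


def _rebuild(cls, args, kwargs):
    """Recreate an instance by calling the class with its original arguments."""
    return cls(*args, **kwargs)


class RecordsInitArgs:
    """Hypothetical base class that remembers how each instance was created."""

    def __new__(cls, *args, **kwargs):
        obj = super().__new__(cls)
        obj._captured = (args, kwargs)
        return obj

    def __reduce__(self):
        # Only the constructor arguments are pickled, never the runtime state.
        args, kwargs = self._captured
        return (_rebuild, (type(self), args, kwargs))


class FooLike(RecordsInitArgs):
    def __init__(self, urlpath, metadata=None):
        # Lightweight: store arguments, open nothing.
        self.urlpath = urlpath
        self.metadata = metadata or {}
        self._handle = None


original = FooLike("data/*.csv", metadata={"a": 1})
original._handle = threading.Lock()  # stand-in for an open, unpicklable resource
clone = pickle.loads(pickle.dumps(original))
```

Because only the arguments travel, the clone starts with no open handle and must reopen its
resources on first use, which is why ``__init__()`` should have no side effects.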

The full set of user methods of interest are as follows:

- ``discover(self)``: Read the source attributes, like ``npartitions``, etc.  As with ``_get_schema()`` above, this
  method is assumed to be fast, and make a best effort to set attributes. The output should be serializable, if the
  source is to be used on a server; the details contained will be used for creating a remote-source on the client.

- ``read(self)``: Return all the data in one in-memory container.

- ``read_chunked(self)``: Return an iterator that returns contiguous chunks of the data.  The chunking is generally
  assumed to be at the partition level, but could be finer grained if desired.

- ``read_partition(self, i)``: Returns the data for a given partition id.  It is assumed that reading a given
  partition does not require reading the data that precedes it.  If ``i`` is out of range, an ``IndexError`` should
  be raised.

- ``to_dask(self)``: Return a (lazy) Dask data structure corresponding to this data source.  It should be assumed
  that the data can be read from the Dask workers, so the loads can be done in future tasks.  For further information,
  see the `Dask documentation <https://dask.pydata.org/en/latest/>`_.

- ``close(self)``: Close network or file handles and deallocate memory.  If other methods are called after ``close()``,
  the source is automatically reopened.

- ``to_*``: for some sources, it makes sense to provide alternative outputs aside from the base container
  (dataframe, array, ...) and Dask variants.

Note that all of these methods typically call ``_get_schema``, to make sure that the source has been
initialised.

Subclassing ``intake.source.base.DataSource``
---------------------------------------------

``DataSource`` provides the same functionality as ``DataSourceBase``, but has some additional mixin
classes to provide some extras. A developer may choose to derive from ``DataSource`` to get all of
these, or from ``DataSourceBase`` and make their own choice of mixins to support.

- ``HoloviewsMixin``: provides plotting and GUI capabilities via the `holoviz`_ stack

- ``PersistMixin``: allows for storing a local copy in a default format for the given
  container type

- ``CacheMixin``: allows for local storage of data files for a source. Deprecated,
  you should use one of the caching mechanisms in ``fsspec``.

.. _holoviz: https://holoviz.org/index.html

.. _driver-discovery:

Driver Discovery
----------------

Intake discovers available drivers in three different ways, described below.
After the discovery phase, Intake will automatically create
``open_[driver_name]`` convenience functions under the ``intake`` module
namespace.  Calling a function like ``open_csv()`` is equivalent to
instantiating the corresponding data-source class.

Entrypoints
'''''''''''

If you are packaging your driver into an installable package to be shared, you
should add the following to the package's ``setup.py``:

.. code-block:: python

   setup(
       ...
       entry_points={
           'intake.drivers': [
               'some_format_name = some_package.and_maybe_a_submodule:YourDriverClass',
               ...
           ]
       },
   )

.. important::

   Some critical details of Python's entrypoints feature:

   * Note the unusual syntax of the entrypoints. Each item is given as one long
     string, with the ``=`` as part of the string. Modules are separated by
     ``.``, and the final object name is preceded by ``:``.
   * The right hand side of the equals sign must point to where the object is
     *actually defined*. If ``YourDriverClass`` is defined in
     ``foo/bar.py`` and imported into ``foo/__init__.py`` you might expect
     ``foo:YourDriverClass`` to work, but it does not. You must spell out
     ``foo.bar:YourDriverClass``.

Entry points are a way for Python packages to advertise objects with some
common interface. When Intake is imported, it discovers all packages installed
in the current environment that advertise ``'intake.drivers'`` in this way.

Most packages that define intake drivers have a dependency on ``intake``
itself, for example in order to use intake's base classes. This can create a
circular dependency: importing the package imports intake, which tries
to discover and import packages that define drivers. To avoid this pitfall,
just ensure that ``intake`` is imported first thing in your package's
``__init__.py``. This ensures that the driver-discovery code runs first. Note
that you are *not* required to make your package depend on intake. The rule is
that *if* you import ``intake`` you must import it first thing. If you do not
import intake, there is no circularity.

Configuration
'''''''''''''

The intake configuration file can be used to:

* Specify precedence in the event of name collisions---for example, if two different
  ``csv`` drivers are installed.
* Disable a troublesome driver.
* Manually make intake aware of a driver, which can be useful for
  experimentation and early development until a ``setup.py`` with an
  entrypoint is prepared.
* Assign a driver to a name other than the one assigned by the driver's
  author.

The commandline invocation

.. code-block:: bash

   intake drivers enable some_format_name some_package.and_maybe_a_submodule.YourDriverClass

is equivalent to adding this to your intake configuration file:

.. code-block:: yaml

   drivers:
     some_format_name: some_package.and_maybe_a_submodule.YourDriverClass

You can also disable a troublesome driver

.. code-block:: bash

   intake drivers disable some_format_name

which is equivalent to

.. code-block:: yaml

   drivers:
     some_format_name: false

Deprecated: Package Scan
''''''''''''''''''''''''

When Intake is imported, it will search the Python module path (which by default includes ``site-packages`` and other
directories in your ``$PYTHONPATH``) for packages starting with ``intake\_`` and discover DataSource subclasses inside
those packages to register.  Drivers will be registered based on the ``name`` attribute of the object.
By convention, drivers should have names that are lowercase, valid Python identifiers that do not contain the word
``intake``.

This approach is deprecated because it is limiting (requires the package to
begin with "intake\_") and because the package scan can be slow. Using
entrypoints is strongly encouraged. The package scan *may* be disabled by
default in some future release of intake. During the transition period, if a
package named ``intake_*`` provides an entrypoint for a given name, that will
take precedence over any drivers gleaned from the package scan having that
name. If intake discovers any names from the package scan for which there are
no entrypoints, it will issue a ``FutureWarning``.
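
To see which installed packages would be candidates for such a scan, the name-based search can
be approximated with the standard library. This is a simplified sketch of the idea only, not
Intake's actual scanning code, and the helper name is hypothetical.

```python
import pkgutil


def find_intake_prefixed_packages(path=None):
    """Return names of importable modules/packages that start with 'intake_'.

    ``path`` is a list of directories to search; None means all of sys.path.
    """
    return sorted(
        info.name
        for info in pkgutil.iter_modules(path)
        if info.name.startswith("intake_")
    )
```

Walking every directory on the module path like this is part of why the scan can be slow
compared with reading entrypoint metadata.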

Python API to Driver Discovery
''''''''''''''''''''''''''''''

.. autofunction:: intake.source.discovery.drivers.register_driver
.. autofunction:: intake.source.discovery.drivers.enable
.. autofunction:: intake.source.discovery.drivers.disable

.. _remote_data:

Remote Data
-----------

For drivers loading from files, the author should be aware that it is easy to implement loading
from files stored in remote services. A simplistic case is demonstrated by the included CSV driver,
which simply passes a URL to Dask, which in turn can interpret the URL as a remote data service,
and use the ``storage_options`` as required (see the Dask documentation on `remote data`_).

.. _remote data: http://dask.pydata.org/en/latest/remote-data-services.html

More advanced usage, where a Dask loader does not already exist, will likely rely on
`fsspec.open_files`_. Use this function to produce lazy ``OpenFile`` objects for local
or remote data, based on a URL, which will have a protocol designation and possibly contain
glob "*" characters. Additional parameters may be passed to ``open_files``, which should,
by convention, be supplied by a driver argument named ``storage_options`` (a dictionary).

.. _fsspec.open_files: https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspec.open_files

To use an ``OpenFile`` object, make it concrete by using a context:


.. code-block:: python

    import fsspec

    # at setup, to discover the number of files/partitions
    set_of_open_files = fsspec.open_files(urlpath, mode='rb', **storage_options)

    # when actually loading data; here we loop over all files, but maybe we just do one partition
    for an_open_file in set_of_open_files:
        # `with` causes the object to become concrete until the end of the block
        with an_open_file as f:
            # do things with f, which is a file-like object
            f.seek(0)
            data = f.read()

The ``textfiles`` builtin driver implements this mechanism, as an example.


Structured File Paths
---------------------

The CSV driver sets up an example of how to gather data which is encoded in file paths
(such as ``'data_{site}_.csv'``) and return that data in the output.
Other drivers could also follow the same structure where data is being loaded from a
set of filenames. Typically this would apply to data-frame output.
This is possible as long as the driver has access to each of the file paths at some
point in ``_get_schema``. Once the file paths are known, the driver developer can use the helper
functions defined in ``intake.source.utils`` to get the values for each field in the pattern
for each file in the list. These values should then be added to the data, a process which
normally would happen within the ``_get_schema`` method.

The ``PatternMixin`` defines driver properties such as ``urlpath``, ``path_as_pattern``, and ``pattern``.
The implementation might look something like this::

    import intake.source.base
    from intake.source.utils import reverse_formats

    class FooSource(intake.source.base.DataSource, intake.source.base.PatternMixin):
        def __init__(self, a, b, path_as_pattern, urlpath, metadata=None):
            # Do init here with a and b
            self.path_as_pattern = path_as_pattern
            self.urlpath = urlpath

            super(FooSource, self).__init__(
                container='dataframe',
                metadata=metadata
            )

        def _get_schema(self):
            # read in the data; file_paths is the list of files matched by urlpath
            values_by_field = reverse_formats(self.pattern, file_paths)
            # add these fields and map values to the data
            return data


Since dask already has a specific method for including the file paths in the output dataframe,
in the CSV driver we set ``include_path_column=True``, to get a dataframe where one of the
columns contains all the file paths. In this case, `add these fields and values to data`
is a mapping between the categorical file paths column and the ``values_by_field``.

In other drivers where each file is read in independently the driver developer
can set the new fields on the data from each file before concatenating.
This pattern looks more like::

    import intake.source.base
    from intake.source.utils import reverse_format

    class FooSource(intake.source.base.DataSource):
        ...

        def _get_schema(self):
            # get list of file paths
            for path in file_paths:
                # read in the file
                values_by_field = reverse_format(self.pattern, path)
                # add these fields and values to the data
            # concatenate the datasets
            return data


To toggle this path-as-pattern behavior on and off, the CSV and intake-xarray drivers
use the boolean ``path_as_pattern`` keyword argument.
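
To make concrete what "reversing" a path pattern means, here is a simplified, standalone
stand-in built on the standard library. It is *not* the implementation of ``reverse_format``
in ``intake.source.utils``: ``extract_fields`` is a hypothetical helper that only handles
named fields, ignores format specifications, and returns every value as a string.

```python
import re
from string import Formatter


def extract_fields(pattern, path):
    """Recover the values of named fields in ``pattern`` from a concrete ``path``."""
    regex = ""
    for literal, field, _spec, _conversion in Formatter().parse(pattern):
        regex += re.escape(literal)          # literal text must match exactly
        if field is not None:
            regex += f"(?P<{field}>.+?)"     # each field becomes a named group
    match = re.fullmatch(regex, path)
    if match is None:
        raise ValueError(f"{path!r} does not match pattern {pattern!r}")
    return match.groupdict()


fields = extract_fields("data_{site}_.csv", "data_A_.csv")
# fields == {"site": "A"}
```

A driver would call something like this once per file and attach the resulting values as new
columns (or coordinates) on the data read from that file.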

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/overview.rst
================================================
Overview
========

Introduction
------------

This page describes the technical design of Intake, with brief details of the aims of the project and
components of the library.

Why Intake?
-----------

Intake solves a related set of problems:

* Python API standards for loading data (such as DB-API 2.0) are optimized for transactional databases and query results
  that are processed one row at a time.

* Libraries that do load data in bulk tend to each have their own API for doing so, which adds friction when switching
  data formats.

* Loading data into a distributed data structure (like those found in Dask and Spark) often requires writing a separate
  loader.

* Abstractions often focus on just one data model (tabular, n-dimensional array, or semi-structured), when many projects
  need to work with multiple kinds of data.

Intake has the explicit goal of **not** defining a computational expression
system.  Intake plugins load the data into containers (e.g., arrays or data-frames) that
provide their data processing features.  As a result, it is
very easy to make a new Intake plugin with a relatively small amount of Python.

Structure
---------

Intake is a Python library for accessing data in a simple and uniform way.  It consists of three parts:

1. A lightweight plugin system for adding data loader :term:`drivers<Driver>` for new file formats and servers
   (like databases, REST endpoints or other cataloging services)

2. A cataloging system for specifying these sources in simple :term:`YAML` syntax, or with plugins that read source specs
   from some external data service

3. A server-client architecture that can share data catalog metadata over the network, or even stream the data directly
   to clients if needed

Intake supports loading data into standard Python containers. The list can be easily extended,
but the currently supported list is:

* Pandas Dataframes - tabular data

* NumPy Arrays - tensor data

* Python lists of dictionaries - semi-structured data

Additionally, Intake can load data into distributed data structures.  Currently it supports Dask, a flexible parallel
computing library with distributed containers like `dask.dataframe <https://dask.pydata.org/en/latest/dataframe.html>`_,
`dask.array <https://dask.pydata.org/en/latest/array.html>`_,
and `dask.bag <https://dask.pydata.org/en/latest/bag.html>`_.
In the future, other distributed computing systems could use Intake to create similar data structures.

Concepts
--------

Intake is built out of four core concepts:

* Data Source classes: the "driver" plugins that each implement loading of some specific type of data into Python, with
  plugin-specific arguments.

* Data Source: An object that represents a reference to a data source.  Data source instances have methods for loading the
  data into standard containers, like Pandas DataFrames, but do not load any data until specifically requested.

* Catalog: An inventory of catalog entries, each of which defines a Data Source. Catalog objects can be created from
  local YAML definitions, by connecting
  to remote servers, or by some driver that knows how to query an external data service.

* Catalog Entry: A named data source held internally by catalog objects, which generate
  data source instances when accessed.
  The catalog entry includes metadata about the source, as well as the name of the
  driver and arguments. Arguments can be parameterized, allowing one entry to return
  different subsets of data depending on the user request.

The business of a plugin is to go from some data format (bunch of files or some remote service)
to a ":term:`Container`" of the data (e.g., data-frame), a thing on which you can perform further analysis.
Drivers can be used directly by the user, or indirectly through data catalogs.  Data sources can be pickled, sent over
the network to other hosts, and reopened (assuming the remote system has access to the required files or servers).

See also the :doc:`glossary`.

Future Directions
-----------------

Ongoing work for enhancements, as well as requests for plugins, etc., can be found at the
`issue tracker <https://github.com/intake/intake/issues>`_. See the :ref:`roadmap`
for general mid- and long-term goals.

.. raw:: html

    <script data-goatcounter="https://intake.goatcounter.com/count"
        async src="//gc.zgo.at/count.js"></script>


================================================
FILE: docs/source/persisting.rst
================================================
.. _persisting:

Persisting Data
===============

(this is an experimental new feature, expect enhancements and changes)

Introduction
------------

As defined in the glossary, to :term:`Persist` is to convert data into the storage format
most appropriate for the container type, and save a copy of this for rapid lookup in the future.
This is of great potential benefit where the creation or transfer of the original data source
takes some time.

This is not to be confused with the file :term:`Cache`.

Usage
-----

Any :term:`Data Source` has a method ``.persist()``. The only option that you will need to
pick is a :term:`TTL`, the number of seconds that the persisted version lasts before
expiry (leave as ``None`` for no expiry). This creates a local copy in the persist
directory, which may be in ``~/.intake/persist``, but can be configured.

Each container type (dataframe, array, ...) will have its own implementation of persistence,
and a particular file storage format associated. The call to ``.persist()`` may take
arguments to tune how the local files are created, and in some cases may require additional
optional packages to be installed.

Example::

    cat = intake.open_catalog('mycat.yaml')  # load a remote cat
    source = cat.csvsource()  # source pointing to remote data
    source.persist()

    source = cat.csvsource()  # future use now gives local intake_parquet.ParquetSource

You can control whether a catalog will automatically give you the persisted version of a
source in this way using the argument ``persist_mode``. For example, to ignore locally
persisted versions, you could have done::

    cat = intake.open_catalog('mycat.yaml', persist_mode='never')
    # or
    source = cat.csvsource(persist_mode='never')

Note that if you give a TTL (in seconds), then the original source will be accessed
and a new persisted version written transparently when the old persisted version has expired.

Note that after persisting, the original source will have ``source.has_been_persisted == True``
and the persisted source (i.e., the one loaded from local files) will have
``source.is_persisted == True``.
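
The expiry rule can be pictured with a small standalone sketch. It assumes the time at which
the persisted copy was written is recorded as a Unix timestamp; the function and parameter
names are hypothetical and this is not Intake's own bookkeeping code.

```python
import time


def persisted_copy_expired(written_at, ttl, now=None):
    """Return True if a copy written at ``written_at`` has outlived ``ttl`` seconds."""
    if ttl is None:
        return False  # no TTL given: the persisted copy never expires
    now = time.time() if now is None else now
    return now - written_at > ttl
```

When such a check reports expiry, the original source is read again and a fresh persisted
version replaces the old one.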

Export
------

A similar concept to Persist, Export allows you to make a copy of some data source, in the
format appropriate for its container, and place this data-set in whichever location suits you,
including remote locations. This functionality (``source.export()``) does *not* touch the persist
store; instead, it returns a YAML text representation of the output, so that you can put it into
a catalog of your own. It would be this catalog that you share with other people.

Note that "exported" data-sources like this do contain the information of the original source they
were made from in their metadata, so you can recreate the original source, if you want to, and
read from there.

Persisting to Remote
--------------------
SYMBOL INDEX (1445 symbols across 96 files)

FILE: docs/make_api.py
  function run (line 6) | def run(path):

FILE: docs/plugins.py
  function format_package_links (line 8) | def format_package_links(package_name, repo_link):
  function format_repo_link (line 12) | def format_repo_link(repo_link):
  function format_badge_html (line 18) | def format_badge_html(badge, link):
  function check_ok (line 22) | async def check_ok(client, url):
  function check_all_ok (line 31) | async def check_all_ok(urls):
  function generate_plugin_table (line 37) | def generate_plugin_table():

FILE: intake/__init__.py
  function __getattr__ (line 58) | def __getattr__(attr):
  function __dir__ (line 96) | def __dir__(*_, **__):
  function open_catalog (line 101) | def open_catalog(uri=None, **kwargs):

FILE: intake/catalog/__init__.py
  function _make_builtin (line 13) | def _make_builtin():
  function __getattr__ (line 21) | def __getattr__(name):

FILE: intake/catalog/base.py
  class VersionError (line 19) | class VersionError(Exception):
  class Catalog (line 23) | class Catalog(DataSource):
    method __init__ (line 48) | def __init__(
    method from_dict (line 131) | def from_dict(cls, entries, **kwargs):
    method kwargs (line 152) | def kwargs(self):
    method _make_entries_container (line 155) | def _make_entries_container(self):
    method _load (line 179) | def _load(self):
    method force_reload (line 183) | def force_reload(self):
    method reload (line 188) | def reload(self):
    method version (line 194) | def version(self):
    method search (line 199) | def search(self, text, depth=2):
    method filter (line 224) | def filter(self, func):
    method walk (line 257) | def walk(self, sofar=None, prefix=None, depth=2):
    method items (line 289) | def items(self):
    method values (line 294) | def values(self):
    method serialize (line 299) | def serialize(self):
    method save (line 328) | def save(self, url, storage_options=None):
    method _get_entry (line 345) | def _get_entry(self, name):
    method configure_new (line 357) | def configure_new(self, **kwargs):
    method _get_entries (line 384) | def _get_entries(self):
    method __iter__ (line 387) | def __iter__(self):
    method keys (line 391) | def keys(self):
    method __len__ (line 395) | def __len__(self):
    method __contains__ (line 398) | def __contains__(self, key):
    method __dir__ (line 402) | def __dir__(self):
    method _ipython_key_completions_ (line 413) | def _ipython_key_completions_(self):
    method __repr__ (line 416) | def __repr__(self):
    method __getattr__ (line 419) | def __getattr__(self, item):
    method __setitem__ (line 432) | def __setitem__(self, key, entry):
    method pop (line 448) | def pop(self, key):
    method __getitem__ (line 462) | def __getitem__(self, key):
    method discover (line 502) | def discover(self):
    method _close (line 510) | def _close(self):
    method gui (line 515) | def gui(self):

FILE: intake/catalog/default.py
  function load_user_catalog (line 21) | def load_user_catalog():
  function user_data_dir (line 30) | def user_data_dir():
  function load_global_catalog (line 35) | def load_global_catalog():
  function conda_prefix (line 48) | def conda_prefix():
  function which (line 57) | def which(program):
  function global_data_dir (line 64) | def global_data_dir():
  function load_combo_catalog (line 82) | def load_combo_catalog():

FILE: intake/catalog/entry.py
  class CatalogEntry (line 11) | class CatalogEntry(DictSerialiseMixin):
    method __init__ (line 18) | def __init__(self, getenv=True, getshell=False):
    method describe (line 24) | def describe(self):
    method get (line 42) | def get(self, **user_parameters):
    method __call__ (line 58) | def __call__(self, persist=None, **kwargs):
    method container (line 66) | def container(self):
    method container (line 70) | def container(self, cont):
    method plots (line 76) | def plots(self):
    method _ipython_display_ (line 80) | def _ipython_display_(self):
    method _yaml (line 96) | def _yaml(self):
    method __iter__ (line 99) | def __iter__(self):
    method __getitem__ (line 106) | def __getitem__(self, item):
    method __repr__ (line 119) | def __repr__(self):
    method gui (line 123) | def gui(self):

FILE: intake/catalog/exceptions.py
  class CatalogException (line 9) | class CatalogException(Exception):
  class PermissionDenied (line 13) | class PermissionDenied(CatalogException):
  class ShellPermissionDenied (line 19) | class ShellPermissionDenied(PermissionDenied):
    method __init__ (line 22) | def __init__(self, msg=None):
  class EnvironmentPermissionDenied (line 28) | class EnvironmentPermissionDenied(PermissionDenied):
    method __init__ (line 31) | def __init__(self, msg=None):
  class ValidationError (line 37) | class ValidationError(CatalogException):
    method __init__ (line 40) | def __init__(self, message, errors):
  class DuplicateKeyError (line 45) | class DuplicateKeyError(ValidationError):
    method __init__ (line 48) | def __init__(self, context, context_mark, problem, problem_mark):
  class ObsoleteError (line 55) | class ObsoleteError(ValidationError):
  class ObsoleteParameterError (line 59) | class ObsoleteParameterError(ObsoleteError):
    method __init__ (line 60) | def __init__(self):
  class ObsoleteDataSourceError (line 78) | class ObsoleteDataSourceError(ObsoleteError):
    method __init__ (line 79) | def __init__(self):

FILE: intake/catalog/gui.py
  class GUI (line 13) | class GUI(object):
    method __init__ (line 14) | def __init__(self, *args, **kwargs):
    method __repr__ (line 17) | def __repr__(self):
  class GUI (line 25) | class GUI(object):
    method __init__ (line 26) | def __init__(self, *args, **kwargs):
    method __repr__ (line 29) | def __repr__(self):

FILE: intake/catalog/local.py
  class UserParameter (line 28) | class UserParameter(DictSerialiseMixin):
    method __init__ (line 53) | def __init__(
    method __repr__ (line 86) | def __repr__(self):
    method describe (line 91) | def describe(self):
    method expand_defaults (line 105) | def expand_defaults(self, client=False, getenv=True, getshell=False):
    method validate (line 114) | def validate(self, value):
  class LocalCatalogEntry (line 138) | class LocalCatalogEntry(CatalogEntry):
    method __init__ (line 141) | def __init__(
    method name (line 231) | def name(self):
    method describe (line 234) | def describe(self):
    method _create_open_args (line 254) | def _create_open_args(self, user_parameters):
    method get (line 307) | def get(self, **user_parameters):
    method clear_cached_default_source (line 325) | def clear_cached_default_source(self):
  class CatalogParser (line 333) | class CatalogParser(object):
    method __init__ (line 336) | def __init__(self, data, getenv=True, getshell=False, context=None):
    method ok (line 345) | def ok(self):
    method data (line 349) | def data(self):
    method errors (line 353) | def errors(self):
    method warnings (line 357) | def warnings(self):
    method error (line 360) | def error(self, msg, obj, key=None):
    method warning (line 366) | def warning(self, msg, obj, key=None):
    method _parse_plugins (line 372) | def _parse_plugins(self, data):
    method _getitem (line 411) | def _getitem(self, obj, key, dtype, required=True, default=None, choic...
    method _parse_user_parameter (line 437) | def _parse_user_parameter(self, name, data):
    method _parse_data_source (line 455) | def _parse_data_source(self, name, data):
    method _parse_data_source_local (line 465) | def _parse_data_source_local(self, name, data):
    method _parse_data_sources (line 524) | def _parse_data_sources(self, data):
    method _parse (line 557) | def _parse(self, data):
  function get_dir (line 573) | def get_dir(path):
  class YAMLFileCatalog (line 587) | class YAMLFileCatalog(Catalog):
    method __init__ (line 595) | def __init__(self, path=None, text=None, autoreload=True, **kwargs):
    method _load (line 619) | def _load(self, reload=False):
    method add (line 654) | def add(self, source, name=None, path=None, storage_options=None):
    method parse (line 710) | def parse(self, text):
    method name_from_path (line 758) | def name_from_path(self):
  class YAMLFilesCatalog (line 766) | class YAMLFilesCatalog(Catalog):
    method __init__ (line 774) | def __init__(self, path, flatten=True, **kwargs):
    method _load (line 794) | def _load(self):
  class MergedCatalog (line 868) | class MergedCatalog(Catalog):
    method __init__ (line 873) | def __init__(self, catalogs, *args, **kwargs):
    method _load (line 877) | def _load(self):
  class EntrypointEntry (line 883) | class EntrypointEntry(CatalogEntry):
    method __init__ (line 889) | def __init__(self, entrypoint):
    method __repr__ (line 895) | def __repr__(self):
    method name (line 899) | def name(self):
    method describe (line 902) | def describe(self):
    method __call__ (line 915) | def __call__(self, **kwargs):
  class EntrypointsCatalog (line 925) | class EntrypointsCatalog(Catalog):
    method __init__ (line 930) | def __init__(self, *args, entrypoints_group="intake.catalogs", paths=N...
    method _load (line 935) | def _load(self):

FILE: intake/catalog/tests/catalog_search/example_packages/ep/__init__.py
  class TestCatalog (line 1) | class TestCatalog:

FILE: intake/catalog/tests/conftest.py
  function catalog1 (line 9) | def catalog1():

FILE: intake/catalog/tests/example1_source.py
  class ExampleSource (line 11) | class ExampleSource(DataSource):
    method __init__ (line 17) | def __init__(self, **kwargs):

FILE: intake/catalog/tests/example_plugin_dir/example2_source.py
  class Ex2Plugin (line 11) | class Ex2Plugin(DataSource):
    method __init__ (line 17) | def __init__(self):

FILE: intake/catalog/tests/test_alias.py
  function test_simple (line 16) | def test_simple():
  function test_mapping (line 25) | def test_mapping():
  function test_other_cat (line 40) | def test_other_cat():
  function test_alias (line 55) | def test_alias():

FILE: intake/catalog/tests/test_catalog_save.py
  function test_catalog_description (line 11) | def test_catalog_description(tmpdir):

FILE: intake/catalog/tests/test_core.py
  function test_no_entry (line 6) | def test_no_entry():
  function test_regression (line 14) | def test_regression():

FILE: intake/catalog/tests/test_default.py
  function test_which (line 15) | def test_which():
  function test_load (line 20) | def test_load():

FILE: intake/catalog/tests/test_discovery.py
  function test_catalog_discovery (line 7) | def test_catalog_discovery():
  function test_deferred_import (line 23) | def test_deferred_import():

FILE: intake/catalog/tests/test_gui.py
  function panel_importable (line 10) | def panel_importable():
  function test_cat_no_panel_does_not_raise_errors (line 23) | def test_cat_no_panel_does_not_raise_errors(catalog1):
  function test_cat_no_panel_display_gui (line 28) | def test_cat_no_panel_display_gui(catalog1):
  function test_cat_gui (line 33) | def test_cat_gui(catalog1):
  function test_entry_no_panel_does_not_raise_errors (line 39) | def test_entry_no_panel_does_not_raise_errors(catalog1):
  function test_entry_no_panel_display_gui (line 44) | def test_entry_no_panel_display_gui(catalog1):

FILE: intake/catalog/tests/test_local.py
  function abspath (line 25) | def abspath(filename):
  function test_local_catalog (line 29) | def test_local_catalog(catalog1):
  function test_get_items (line 87) | def test_get_items(catalog1):
  function test_nested (line 92) | def test_nested(catalog1):
  function test_nested_gets_name_from_super (line 102) | def test_nested_gets_name_from_super(catalog1):
  function test_hash (line 110) | def test_hash(catalog1):
  function test_getitem (line 114) | def test_getitem(catalog1):
  function test_source_plugin_config (line 120) | def test_source_plugin_config(catalog1):
  function test_metadata (line 127) | def test_metadata(catalog1):
  function test_use_source_plugin_from_config (line 132) | def test_use_source_plugin_from_config(catalog1):
  function test_get_dir (line 136) | def test_get_dir():
  function test_entry_dir_function (line 154) | def test_entry_dir_function(catalog1):
  function test_user_parameter_default_value (line 170) | def test_user_parameter_default_value(dtype, expected):
  function test_user_parameter_repr (line 175) | def test_user_parameter_repr():
  function test_user_parameter_coerce_value (line 200) | def test_user_parameter_coerce_value(dtype, given, expected):
  function test_user_parameter_coerce_special_datetime (line 206) | def test_user_parameter_coerce_special_datetime(given):
  function test_user_parameter_coerce_min (line 219) | def test_user_parameter_coerce_min(dtype, given, expected):
  function test_user_parameter_coerce_max (line 232) | def test_user_parameter_coerce_max(dtype, given, expected):
  function test_user_parameter_coerce_allowed (line 244) | def test_user_parameter_coerce_allowed(dtype, given, expected):
  function test_user_parameter_validation_range (line 249) | def test_user_parameter_validation_range():
  function test_user_parameter_validation_allowed (line 266) | def test_user_parameter_validation_allowed():
  function test_user_pars_list (line 281) | def test_user_pars_list():
  function test_user_pars_mlist (line 298) | def test_user_pars_mlist():
  function test_parser_validation_error (line 336) | def test_parser_validation_error(filename):
  function test_parser_obsolete_error (line 348) | def test_parser_obsolete_error(filename):
  function test_union_catalog (line 353) | def test_union_catalog():
  function test_persist_local_cat (line 391) | def test_persist_local_cat(temp_cache):
  function test_empty_catalog (line 405) | def test_empty_catalog():
  function test_nonexistent_error (line 410) | def test_nonexistent_error():
  function test_duplicate_data_sources (line 415) | def test_duplicate_data_sources():
  function test_duplicate_parameters (line 423) | def test_duplicate_parameters():
  function temp_catalog_file (line 432) | def temp_catalog_file():
  function test_catalog_file_removal (line 455) | def test_catalog_file_removal(temp_catalog_file):
  function test_flatten_duplicate_error (line 465) | def test_flatten_duplicate_error():
  function test_multi_cat_names (line 485) | def test_multi_cat_names():
  function test_name_of_builtin (line 502) | def test_name_of_builtin():
  function test_cat_with_declared_name (line 509) | def test_cat_with_declared_name():
  function test_cat_with_no_declared_name_gets_name_from_dir_if_file_named_catalog (line 523) | def test_cat_with_no_declared_name_gets_name_from_dir_if_file_named_cata...
  function test_default_expansions (line 534) | def test_default_expansions():
  function test_remote_cat (line 569) | def test_remote_cat(http_server):
  function test_multi_plugins (line 576) | def test_multi_plugins():
  function test_no_plugins (line 620) | def test_no_plugins():
  function test_explicit_entry_driver (line 632) | def test_explicit_entry_driver():
  function test_getitem_and_getattr (line 643) | def test_getitem_and_getattr():
  function test_dot_names (line 664) | def test_dot_names():
  function test_listing (line 679) | def test_listing(catalog1):
  function test_dict_save (line 685) | def test_dict_save():
  function test_dict_save_complex (line 701) | def test_dict_save_complex():
  function test_dict_adddel (line 725) | def test_dict_adddel():
  function test_filter (line 740) | def test_filter():
  function test_from_dict_with_data_source (line 758) | def test_from_dict_with_data_source():
  function test_no_instance (line 769) | def test_no_instance():
  function test_fsspec_integration (line 779) | def test_fsspec_integration():
  function test_cat_add (line 816) | def test_cat_add(tmpdir):
  function test_no_entries_items (line 834) | def test_no_entries_items(catalog1):
  function test_cat_dictlike (line 857) | def test_cat_dictlike(catalog1):
  function test_inherit_params (line 863) | def test_inherit_params(inherit_params_cat):
  function test_runtime_overwrite_params (line 867) | def test_runtime_overwrite_params(inherit_params_cat):
  function test_local_param_overwrites (line 874) | def test_local_param_overwrites(inherit_params_cat):
  function test_local_and_global_params (line 878) | def test_local_and_global_params(inherit_params_cat):
  function test_search_inherit_params (line 885) | def test_search_inherit_params(inherit_params_cat):
  function test_multiple_cats_params (line 892) | def test_multiple_cats_params(inherit_params_multiple_cats):

FILE: intake/catalog/tests/test_parameters.py
  class NoSource (line 10) | class NoSource(DataSource):
    method __init__ (line 11) | def __init__(self, **kwargs):
  function test_simplest (line 19) | def test_simplest():
  function test_cache_default_source (line 25) | def test_cache_default_source():
  function test_parameter_default (line 42) | def test_parameter_default():
  function test_maybe_default_from_env (line 49) | def test_maybe_default_from_env():
  function test_up_override_and_render (line 76) | def test_up_override_and_render():
  function test_user_explicit_override (line 83) | def test_user_explicit_override():
  function test_auto_env_expansion (line 91) | def test_auto_env_expansion():
  function test_validate_up (line 148) | def test_validate_up():
  function test_validate_par (line 165) | def test_validate_par():
  function test_mlist_parameter (line 178) | def test_mlist_parameter():
  function test_explicit_overrides (line 191) | def test_explicit_overrides():
  function test_extra_arg (line 214) | def test_extra_arg():
  function test_unknown (line 220) | def test_unknown():
  function test_catalog_passthrough (line 232) | def test_catalog_passthrough():

FILE: intake/catalog/tests/test_reload_integration.py
  function teardown_module (line 27) | def teardown_module(module):
  function intake_server_with_config (line 35) | def intake_server_with_config(intake_server):
  function test_reload_updated_config (line 59) | def test_reload_updated_config(intake_server_with_config):
  function test_reload_updated_directory (line 80) | def test_reload_updated_directory(intake_server_with_config):
  function test_reload_missing_remote_directory (line 104) | def test_reload_missing_remote_directory(intake_server):
  function test_reload_missing_local_directory (line 138) | def test_reload_missing_local_directory(tempdir):

FILE: intake/catalog/tests/test_utils.py
  function test_expand_templates (line 13) | def test_expand_templates():
  function test_expand_nested_template (line 20) | def test_expand_nested_template():
  function test_coerce_datetime (line 44) | def test_coerce_datetime(test_input, expected):
  function test_flatten (line 48) | def test_flatten():
  function test_coerce (line 71) | def test_coerce(value, dtype, expected):

FILE: intake/catalog/tests/test_zarr.py
  function temp_zarr (line 17) | def temp_zarr():
  function test_zarr_catalog (line 71) | def test_zarr_catalog(temp_zarr, consolidated):
  function test_zarr_entries_in_yaml_catalog (line 127) | def test_zarr_entries_in_yaml_catalog(temp_zarr):

FILE: intake/catalog/tests/util.py
  function assert_items_equal (line 11) | def assert_items_equal(a, b):
  class TestingSource (line 15) | class TestingSource(base.DataSource):
    method __init__ (line 23) | def __init__(self, *args, **kwargs):
    method _load_metadata (line 29) | def _load_metadata(self):
    method _get_partition (line 32) | def _get_partition(self, _):
  function register (line 36) | def register():

FILE: intake/catalog/utils.py
  function flatten (line 19) | def flatten(iterable):
  function reload_on_change (line 40) | def reload_on_change(f):
  function clamp (line 49) | def clamp(value, lower=0, upper=sys.maxsize):
  function _j_getenv (line 54) | def _j_getenv(x, default=""):
  function _j_getshell (line 64) | def _j_getshell(x):
  function _j_passthrough (line 75) | def _j_passthrough(x, funcname):
  function _expand (line 83) | def _expand(p, context, all_vars, client, getenv, getshell):
  function expand_templates (line 116) | def expand_templates(pars, context, return_left=False, client=False, get...
  function expand_defaults (line 144) | def expand_defaults(default, client=False, getenv=True, getshell=False):
  function merge_pars (line 186) | def merge_pars(params, user_inputs, spec_pars, client=False, getenv=True...
  function coerce_datetime (line 265) | def coerce_datetime(v=None):
  function with_str_parse (line 282) | def with_str_parse(value, rule):
  function coerce (line 309) | def coerce(dtype, value):
  class RemoteCatalogError (line 347) | class RemoteCatalogError(Exception):
  function _has_catalog_dir (line 351) | def _has_catalog_dir(args):

FILE: intake/catalog/zarr.py
  class ZarrGroupCatalog (line 5) | class ZarrGroupCatalog(Catalog):
    method __init__ (line 13) | def __init__(
    method _load (line 47) | def _load(self):
    method to_zarr (line 102) | def to_zarr(self):

FILE: intake/config.py
  function cfile (line 43) | def cfile():
  class Config (line 47) | class Config(dict):
    method __init__ (line 58) | def __init__(self, filename=None, **kwargs):
    method reset (line 65) | def reset(self):
    method save (line 70) | def save(self, fn=None):
    method _unset (line 87) | def _unset(self, temp):
    method set (line 92) | def set(self, update_dict=None, **kw):
    method __getitem__ (line 124) | def __getitem__(self, item):
    method get (line 132) | def get(self, key, default=None):
    method reload_all (line 137) | def reload_all(self):
    method load (line 142) | def load(self, fn=None):
    method load_env (line 159) | def load_env(self):
  function intake_path_dirs (line 183) | def intake_path_dirs(path):

FILE: intake/conftest.py
  class TestSource (line 21) | class TestSource(DataSource):
    method __init__ (line 25) | def __init__(self, **kwargs):
    method _get_schema (line 29) | def _get_schema(self):
  function tmp_config_path (line 37) | def tmp_config_path(tmp_path):
  function env (line 55) | def env(temp_cache, tempdir):
  function inherit_params_cat (line 65) | def inherit_params_cat():
  function inherit_params_multiple_cats (line 73) | def inherit_params_multiple_cats():
  function inherit_params_subcat (line 82) | def inherit_params_subcat():

FILE: intake/container/__init__.py
  function register_container (line 1) | def register_container(*_, **__):

FILE: intake/container/base.py
  class RemoteSource (line 1) | class RemoteSource:
  function get_partition (line 5) | def get_partition(*_, **__):

FILE: intake/interface/__init__.py
  function do_import (line 12) | def do_import():
  function __getattr__ (line 42) | def __getattr__(attr):
  function output_notebook (line 48) | def output_notebook(*_, **__):

FILE: intake/interface/base.py
  function enable_widget (line 21) | def enable_widget(widget, enable=True):
  function coerce_to_list (line 26) | def coerce_to_list(items, preprocess=None):
  class Base (line 38) | class Base(object):
    method __init__ (line 69) | def __init__(self, visible=True, visible_callback=None, logo=False):
    method panel (line 75) | def panel(self):
    method panel (line 81) | def panel(self, panel):
    method servable (line 84) | def servable(self, *args, **kwargs):
    method show (line 87) | def show(self, *args, **kwargs):
    method __repr__ (line 90) | def __repr__(self):
    method _repr_mimebundle_ (line 94) | def _repr_mimebundle_(self, *args, **kwargs):
    method setup (line 101) | def setup(self):
    method visible (line 106) | def visible(self):
    method visible (line 111) | def visible(self, visible):
    method unwatch (line 124) | def unwatch(self):
    method __getstate__ (line 133) | def __getstate__(self):
    method __setstate__ (line 137) | def __setstate__(self, state):
    method from_state (line 145) | def from_state(cls, state):
  class BaseSelector (line 156) | class BaseSelector(Base):
    method labels (line 171) | def labels(self):
    method items (line 176) | def items(self):
    method items (line 181) | def items(self, items):
    method _create_options (line 186) | def _create_options(self, items):
    method options (line 195) | def options(self):
    method options (line 200) | def options(self, new):
    method add (line 212) | def add(self, items):
    method remove (line 226) | def remove(self, items):
    method selected (line 234) | def selected(self):
    method selected (line 244) | def selected(self, new):
  class BaseView (line 259) | class BaseView(Base):
    method __getstate__ (line 260) | def __getstate__(self, include_source=True):
    method __setstate__ (line 272) | def __setstate__(self, state):
    method source (line 281) | def source(self):
    method source (line 285) | def source(self, source):

FILE: intake/interface/catalog/add.py
  class FileSelector (line 21) | class FileSelector(Base):
    method __init__ (line 51) | def __init__(self, filters=["yaml", "yml"], done_callback=None, **kwar...
    method setup (line 58) | def setup(self):
    method protocol_changed (line 88) | def protocol_changed(self, *_):
    method go_clicked (line 93) | def go_clicked(self, *_):
    method path (line 100) | def path(self):
    method url (line 104) | def url(self):
    method move_up (line 108) | def move_up(self, arg=None):
    method go_home (line 112) | def go_home(self, arg=None):
    method make_options (line 116) | def make_options(self, arg=None):
    method move_down (line 135) | def move_down(self, *events):
    method __getstate__ (line 148) | def __getstate__(self):
    method __setstate__ (line 152) | def __setstate__(self, state):
  class URLSelector (line 161) | class URLSelector(Base):
    method __init__ (line 181) | def __init__(self, **kwargs):
    method setup (line 185) | def setup(self):
    method url (line 190) | def url(self):
    method __getstate__ (line 194) | def __getstate__(self):
    method __setstate__ (line 198) | def __setstate__(self, state):
  class CatAdder (line 206) | class CatAdder(Base):
    method __init__ (line 232) | def __init__(self, done_callback=None, **kwargs):
    method setup (line 242) | def setup(self):
    method cat_url (line 253) | def cat_url(self):
    method cat (line 263) | def cat(self):
    method add_cat (line 272) | def add_cat(self, arg=None):

FILE: intake/interface/catalog/search.py
  class Search (line 10) | class Search:
    method __init__ (line 20) | def __init__(self, done_callback: callable):
    method go (line 28) | def go(self, *_):

FILE: intake/interface/gui.py
  class GUI (line 16) | class GUI:
    method __init__ (line 30) | def __init__(self, cats=None):
    method _repr_mimebundle_ (line 79) | def _repr_mimebundle_(self, *args, **kwargs):
    method show (line 83) | def show(self, *args, **kwargs):
    method __repr__ (line 86) | def __repr__(self):
    method cat_selected (line 89) | def cat_selected(self, *_):
    method update_catsel (line 121) | def update_catsel(self):
    method add_catalog (line 124) | def add_catalog(self, cat, name=None, **_):
    method source_selected (line 136) | def source_selected(self, *_):
    method plot_clicked (line 155) | def plot_clicked(self, *_):
    method searched (line 164) | def searched(self, searchstring: str):
    method add_clicked (line 170) | def add_clicked(self, *_):
    method sub_clicked (line 178) | def sub_clicked(self, *_):
    method remove_cat (line 182) | def remove_cat(self, catname, done=True):
    method search_clicked (line 189) | def search_clicked(self, *_):
    method cats (line 198) | def cats(self):
    method sources (line 203) | def sources(self):
    method source_instance (line 208) | def source_instance(self):
  function get_catlist (line 212) | def get_catlist(catnames, children, outlist=None, seen=None):

FILE: intake/interface/source/defined_plots.py
  class Event (line 24) | class Event:
    method __init__ (line 25) | def __init__(self, plot, **kwargs):
  class Plots (line 31) | class Plots(BaseView):
    method __init__ (line 61) | def __init__(self, source=None, **kwargs):
    method setup (line 69) | def setup(self):
    method source (line 128) | def source(self, source):
    method has_plots (line 146) | def has_plots(self):
    method options (line 151) | def options(self):
    method selected (line 156) | def selected(self):
    method selected (line 161) | def selected(self, selected):
    method watch (line 165) | def watch(self, callback):
    method plot_selected (line 168) | def plot_selected(self, *events):
    method name_changed (line 175) | def name_changed(self, *events):
    method cancel (line 189) | def cancel(self, _):
    method interact_action (line 196) | def interact_action(self, _):
    method interact (line 213) | def interact(self, _):
    method draw (line 267) | def draw(self):
    method _plot_object (line 292) | def _plot_object(self, selected):
    method _create (line 301) | def _create(self):
    method _edit (line 312) | def _edit(self):
    method _clone (line 318) | def _clone(self):
    method _rename (line 333) | def _rename(self):
    method _delete (line 346) | def _delete(self):
    method __getstate__ (line 355) | def __getstate__(self, include_source=True):
    method __setstate__ (line 366) | def __setstate__(self, state):

FILE: intake/readers/__init__.py
  function recommend (line 14) | def recommend(data, *args, reader=False, **kwargs):

FILE: intake/readers/catalogs.py
  class TiledLazyEntries (line 16) | class TiledLazyEntries(LazyDict):
    method __init__ (line 19) | def __init__(self, client):
    method __getitem__ (line 22) | def __getitem__(self, item: str) -> ReaderDescription:
    method __len__ (line 36) | def __len__(self):
    method __iter__ (line 39) | def __iter__(self):
    method __repr__ (line 42) | def __repr__(self):
  class TiledCatalogReader (line 46) | class TiledCatalogReader(BaseReader):
    method _read (line 57) | def _read(self, data, **kwargs):
  class SQLAlchemyCatalog (line 71) | class SQLAlchemyCatalog(BaseReader):
    method _read (line 82) | def _read(self, data, views=True, schema=None, **kwargs):
  class StacCatalogReader (line 99) | class StacCatalogReader(BaseReader):
    method _read (line 113) | def _read(
    method _get_reader (line 185) | def _get_reader(asset, signer=None, prefer=None, metadata=None):
  class StackBands (line 220) | class StackBands(BaseReader):
    method _read (line 231) | def _read(self, data, bands: list[str], concat_dim: str = "band", sign...
  class StacSearch (line 311) | class StacSearch(BaseReader):
    method __init__ (line 320) | def __init__(self, metadata=None, **kwargs):
    method _read (line 323) | def _read(self, data, query=None, **kwargs):
  class STACIndex (line 352) | class STACIndex(BaseReader):
    method _read (line 359) | def _read(self, *args, **kwargs):
  class THREDDSCatalog (line 389) | class THREDDSCatalog(Catalog):
  class THREDDSCatalogReader (line 397) | class THREDDSCatalogReader(BaseReader):
    method _read (line 408) | def _read(self, data, make="both", **kwargs):
  class HuggingfaceHubCatalog (line 449) | class HuggingfaceHubCatalog(BaseReader):
    method _read (line 482) | def _read(self, *args, with_community_datasets: bool = False, **kwargs):
  class SKLearnExamplesCatalog (line 504) | class SKLearnExamplesCatalog(BaseReader):
    method _read (line 535) | def _read(self, **kw):
  class TorchDatasetsCatalog (line 549) | class TorchDatasetsCatalog(BaseReader):
    method _read (line 570) | def _read(self, rootdir: str, *args, **kwargs):
  class TensorFlowDatasetsCatalog (line 603) | class TensorFlowDatasetsCatalog(BaseReader):
    method _read (line 624) | def _read(self, *args, **kwargs):
  class EarthdataReader (line 634) | class EarthdataReader(BaseReader):
    method _read (line 650) | def _read(self, concept, **kwargs):
  class EarthdataCatalogReader (line 659) | class EarthdataCatalogReader(BaseReader):
    method _read (line 689) | def _read(self, temporal=("1980-01-01", "2023-11-10"), **kwargs):

FILE: intake/readers/convert.py
  class ImportsProperty (line 19) | class ImportsProperty:
    method __get__ (line 22) | def __get__(self, obj, cls):
  class BaseConverter (line 32) | class BaseConverter(BaseReader):
    method run (line 44) | def run(self, x, *args, **kwargs):
    method _read (line 52) | def _read(self, *args, data=None, **kwargs):
  class GenericFunc (line 65) | class GenericFunc(BaseConverter):
    method _read (line 72) | def _read(self, *args, data=None, func=None, data_kwarg=None, **kwargs):
  class SameType (line 84) | class SameType:
  class DuckToPandas (line 88) | class DuckToPandas(BaseConverter):
    method run (line 92) | def run(self, x, *args, with_arrow=True, **kwargs):
  class PandasToDuck (line 106) | class PandasToDuck(BaseConverter):
    method run (line 116) | def run(
  class DaskDFToPandas (line 146) | class DaskDFToPandas(BaseConverter):
    method run (line 154) | def run(self, x, *args, **kwargs):
  class PandasToGeopandas (line 158) | class PandasToGeopandas(BaseConverter):
  class XarrayToPandas (line 163) | class XarrayToPandas(BaseConverter):
  class PandasToXarray (line 168) | class PandasToXarray(BaseConverter):
  class ToHvPlot (line 173) | class ToHvPlot(BaseConverter):
    method run (line 185) | def run(self, data, explorer: bool = False, **kw):
  class RayToPandas (line 195) | class RayToPandas(BaseConverter):
  class PandasToRay (line 200) | class PandasToRay(BaseConverter):
  class RayToDask (line 205) | class RayToDask(BaseConverter):
  class DaskToRay (line 210) | class DaskToRay(BaseConverter):
  class HuggingfaceToRay (line 215) | class HuggingfaceToRay(BaseConverter):
  class TorchToRay (line 220) | class TorchToRay(BaseConverter):
  class SparkDFToRay (line 225) | class SparkDFToRay(BaseConverter):
  class RayToSpark (line 230) | class RayToSpark(BaseConverter):
  class TiledNodeToCatalog (line 235) | class TiledNodeToCatalog(BaseConverter):
    method run (line 238) | def run(self, x, **kw):
  class TiledSearch (line 261) | class TiledSearch(BaseConverter):
    method run (line 266) | def run(self, x, *arg, **kw):
  class TileDBToNumpy (line 272) | class TileDBToNumpy(BaseConverter):
    method run (line 275) | def run(self, x, *args, **kwargs):
  class TileDBToPandas (line 280) | class TileDBToPandas(BaseConverter):
    method run (line 286) | def run(self, x, *args, **kwargs):
  class DaskArrayToTileDB (line 290) | class DaskArrayToTileDB(BaseConverter):
    method run (line 295) | def run(self, x, uri, **kwargs):
  class NumpyToTileDB (line 299) | class NumpyToTileDB(BaseConverter):
    method run (line 306) | def run(self, x, uri, **kwargs):
  class DeltaQueryToDask (line 310) | class DeltaQueryToDask(BaseConverter):
    method _read (line 314) | def _read(self, reader, query, *args, **kwargs):
  class DeltaQueryToDaskGeopandas (line 322) | class DeltaQueryToDaskGeopandas(BaseConverter):
    method _read (line 326) | def _read(self, reader, query, *args, **kwargs):
  class GeoDataFrameToSTACCatalog (line 336) | class GeoDataFrameToSTACCatalog(BaseConverter):
    method _un_arr (line 341) | def _un_arr(cls, data):
    method read (line 351) | def read(self, data, *args, **kwargs):
  class PandasToMetagraph (line 371) | class PandasToMetagraph(BaseConverter):
  class NibabelToNumpy (line 376) | class NibabelToNumpy(BaseConverter):
  class DicomToNumpy (line 381) | class DicomToNumpy(BaseConverter):
    method run (line 385) | def run(self, x, *args, **kwargs):
  class FITSToNumpy (line 389) | class FITSToNumpy(BaseConverter):
    method run (line 393) | def run(self, x, extension=None):
  class ASDFToNumpy (line 409) | class ASDFToNumpy(BaseConverter):
    method run (line 413) | def run(self, x, tree_path: str | list[str], **kwargs):
  class PolarsLazy (line 421) | class PolarsLazy(BaseConverter):
  class PolarsEager (line 426) | class PolarsEager(BaseConverter):
  class PolarsToPandas (line 431) | class PolarsToPandas(BaseConverter):
    method run (line 435) | def run(self, x, *args, **kwargs):
  class PandasToPolars (line 439) | class PandasToPolars(BaseConverter):
  class DataFrameToMetadata (line 444) | class DataFrameToMetadata(BaseConverter):
    method run (line 449) | def run(self, x, *args, **kwargs):
  class GGUFToLlamaCPPService (line 470) | class GGUFToLlamaCPPService(BaseConverter):
    method run (line 473) | def run(self, x, **kwargs):
  class LLamaCPPServiceToOpenAIService (line 477) | class LLamaCPPServiceToOpenAIService(BaseConverter):
    method run (line 482) | def run(self, x, options=None):
  class OpenAIServiceToOpenAIClient (line 488) | class OpenAIServiceToOpenAIClient(BaseConverter):
    method run (line 491) | def run(self, x):
  function convert_class (line 495) | def convert_class(data, out_type: str):
  function convert_classes (line 514) | def convert_classes(in_type: str):
  class Pipeline (line 531) | class Pipeline(readers.BaseReader):
    method __init__ (line 540) | def __init__(
    method steps (line 564) | def steps(self):
    method __call__ (line 567) | def __call__(self, *args, **kwargs):
    method __repr__ (line 576) | def __repr__(self):
    method output_doc (line 584) | def output_doc(self):
    method doc (line 590) | def doc(self):
    method doc_n (line 593) | def doc_n(self, n):
    method _read_stage_n (line 597) | def _read_stage_n(self, stage, discover=False, **kwargs):
    method _read (line 624) | def _read(self, discover=False, **kwargs):
    method apply (line 634) | def apply(self, func, *arg, output_instance=None, **kwargs):
    method first_n_stages (line 641) | def first_n_stages(self, n: int):
    method discover (line 665) | def discover(self, **kwargs):
    method with_step (line 668) | def with_step(self, step, out_instance):
    method read_stepwise (line 679) | def read_stepwise(self, breakpoint=0):
  class PipelineExecution (line 690) | class PipelineExecution:
    method __init__ (line 700) | def __init__(self, pipeline, breakpoint=0):
    method __repr__ (line 708) | def __repr__(self):
    method cont (line 711) | def cont(self):
    method step (line 718) | def step(self, **kw):
  function conversions_graph (line 736) | def conversions_graph(avoid=None, allow_wildcard=True):
  function plot_conversion_graph (line 775) | def plot_conversion_graph(filename) -> None:
  function path (line 785) | def path(
  function auto_pipeline (line 819) | def auto_pipeline(

FILE: intake/readers/datatypes.py
  class BaseData (line 21) | class BaseData(Tokenizable):
    method __init__ (line 34) | def __init__(self, metadata: dict[str, Any] | None = None):
    method _filepattern (line 39) | def _filepattern(cls):
    method _mimetypes (line 44) | def _mimetypes(cls):
    method possible_readers (line 48) | def possible_readers(self):
    method possible_outputs (line 55) | def possible_outputs(self):
    method to_reader_cls (line 60) | def to_reader_cls(
    method to_reader (line 89) | def to_reader(self, outtype: str | None = None, reader: str | None = N...
    method to_entry (line 104) | def to_entry(self):
    method __repr__ (line 112) | def __repr__(self):
    method auto_pipeline (line 116) | def auto_pipeline(self, outtype: str | tuple[str]):
  class FileData (line 123) | class FileData(BaseData):
    method __init__ (line 126) | def __init__(self, url, storage_options: dict | None = None, metadata:...
  class Service (line 132) | class Service(BaseData):
    method __init__ (line 135) | def __init__(self, url, options=None, metadata=None):
  class Catalog (line 141) | class Catalog(BaseData):
  class PMTiles (line 147) | class PMTiles(FileData):
  class DuckDB (line 155) | class DuckDB(FileData):
  class Parquet (line 163) | class Parquet(FileData):
  class CSV (line 173) | class CSV(FileData):
  class CSVPattern (line 181) | class CSVPattern(CSV):
  class Text (line 190) | class Text(FileData):
  class XML (line 201) | class XML(FileData):
  class THREDDSCatalog (line 210) | class THREDDSCatalog(XML):
  class PNG (line 221) | class PNG(FileData):
  class JPEG (line 230) | class JPEG(FileData):
  class WAV (line 239) | class WAV(FileData):
  class NetCDF3 (line 248) | class NetCDF3(FileData):
  class HDF5 (line 257) | class HDF5(FileData):
    method __init__ (line 265) | def __init__(
  class Zarr (line 282) | class Zarr(FileData):
    method __init__ (line 290) | def __init__(
  class IcechunkRepo (line 306) | class IcechunkRepo(FileData):
    method __init__ (line 314) | def __init__(
  class MatlabArray (line 334) | class MatlabArray(FileData):
    method __init__ (line 340) | def __init__(self, path, variable=None):
  class MatrixMarket (line 346) | class MatrixMarket(FileData):
  class Excel (line 352) | class Excel(FileData):
  class TIFF (line 361) | class TIFF(FileData):
  class GRIB2 (line 371) | class GRIB2(FileData):
  class FITS (line 380) | class FITS(FileData):
  class ASDF (line 389) | class ASDF(FileData):
  class DICOM (line 397) | class DICOM(FileData):
  class Nifti (line 406) | class Nifti(FileData):
  class OpenDAP (line 415) | class OpenDAP(Service):
  class SQLQuery (line 421) | class SQLQuery(BaseData):
    method __init__ (line 427) | def __init__(self, conn, query, metadata=None):
  class Prometheus (line 433) | class Prometheus(Service):
    method __init__ (line 438) | def __init__(
  class LlamaCPPService (line 460) | class LlamaCPPService(Service):
    method open (line 468) | def open(self):
  class OpenAIService (line 475) | class OpenAIService(Service):
    method __init__ (line 481) | def __init__(
  class SQLite (line 492) | class SQLite(FileData):
  class AVRO (line 500) | class AVRO(FileData):
  class ORC (line 509) | class ORC(FileData):
  class YAMLFile (line 517) | class YAMLFile(FileData):
  class CatalogFile (line 525) | class CatalogFile(Catalog, YAMLFile):
  class CatalogAPI (line 529) | class CatalogAPI(Catalog, Service):
  class JSONFile (line 535) | class JSONFile(FileData):
  class GeoJSON (line 544) | class GeoJSON(JSONFile):
  class Shapefile (line 551) | class Shapefile(FileData):
  class FlatGeoBuf (line 562) | class FlatGeoBuf(FileData):
  class GeoPackage (line 570) | class GeoPackage(SQLite):
  class STACJSON (line 576) | class STACJSON(JSONFile):
  class TiledService (line 583) | class TiledService(CatalogAPI):
  class TiledDataset (line 587) | class TiledDataset(Service):
  class TileDB (line 593) | class TileDB(Service):
  class IcebergDataset (line 601) | class IcebergDataset(JSONFile):
  class DeltalakeTable (line 608) | class DeltalakeTable(FileData):
  class NumpyFile (line 616) | class NumpyFile(FileData):
  class RawBuffer (line 625) | class RawBuffer(FileData):
    method __init__ (line 631) | def __init__(
  class Literal (line 642) | class Literal(BaseData):
    method __init__ (line 645) | def __init__(self, data, metadata=None):
  class Handle (line 650) | class Handle(JSONFile):
  class Feather2 (line 661) | class Feather2(FileData):
  class Feather1 (line 668) | class Feather1(FileData):
  class PythonSourceCode (line 675) | class PythonSourceCode(FileData):
  class GDALRasterFile (line 682) | class GDALRasterFile(FileData):
  class GDALVectorFile (line 692) | class GDALVectorFile(FileData):
  class HuggingfaceDataset (line 705) | class HuggingfaceDataset(BaseData):
    method __init__ (line 710) | def __init__(self, name, split=None, metadata=None):
  class TFRecord (line 716) | class TFRecord(FileData):
  class KerasModel (line 723) | class KerasModel(FileData):
  class GGUF (line 730) | class GGUF(FileData):
  class SafeTensors (line 740) | class SafeTensors(FileData):
  class PickleFile (line 752) | class PickleFile(FileData):
  class ModelConfig (line 758) | class ModelConfig(FileData):
  class SKLearnPickleModel (line 769) | class SKLearnPickleModel(PickleFile):
  function recommend (line 787) | def recommend(

FILE: intake/readers/entry.py
  class DataDescription (line 38) | class DataDescription(Tokenizable):
    method __init__ (line 46) | def __init__(
    method __repr__ (line 58) | def __repr__(self):
    method to_data (line 64) | def to_data(self, user_parameters=None, **kwargs):
    method __call__ (line 71) | def __call__(self, **kwargs):
    method get_kwargs (line 74) | def get_kwargs(
    method extract_parameter (line 89) | def extract_parameter(
  class ReaderDescription (line 107) | class ReaderDescription(Tokenizable):
    method __init__ (line 115) | def __init__(
    method check_imports (line 129) | def check_imports(self):
    method get_kwargs (line 136) | def get_kwargs(self, user_parameters=None, **kwargs) -> dict[str, Any]:
    method extract_parameter (line 162) | def extract_parameter(self, name: str, path=None, value=None, cls=Simp...
    method to_reader (line 176) | def to_reader(self, user_parameters=None, **kwargs):
    method to_cat (line 188) | def to_cat(self, name=None):
    method __call__ (line 194) | def __call__(self, user_parameters=None, **kwargs):
    method from_dict (line 198) | def from_dict(cls, data):
    method __repr__ (line 207) | def __repr__(self):
  class Catalog (line 215) | class Catalog(Tokenizable):
    method __init__ (line 218) | def __init__(
    method add_entry (line 239) | def add_entry(
    method _ipython_key_completions_ (line 292) | def _ipython_key_completions_(self):
    method delete (line 295) | def delete(self, name, recursive=False):
    method extract_parameter (line 311) | def extract_parameter(
    method move_parameter (line 346) | def move_parameter(self, from_entity: str, to_entity: str, parameter_n...
    method promote_parameter_name (line 356) | def promote_parameter_name(self, parameter_name: str, level: str = "ca...
    method __getattr__ (line 403) | def __getattr__(self, item):
    method to_yaml_file (line 413) | def to_yaml_file(self, path: str, **storage_options):
    method from_yaml_file (line 424) | def from_yaml_file(path: str, **kwargs):
    method from_entries (line 441) | def from_entries(cls, data: dict, metadata=None):
    method from_dict (line 449) | def from_dict(cls, data):
    method get_entity (line 469) | def get_entity(self, item: str):
    method get_aliases (line 492) | def get_aliases(self, entity: str):
    method search (line 496) | def search(self, expr) -> Catalog:
    method __getitem__ (line 518) | def __getitem__(self, item):
    method _rehydrate (line 537) | def _rehydrate(self, val):
    method __delitem__ (line 558) | def __delitem__(self, key):
    method __delattr__ (line 571) | def __delattr__(self, item):
    method _find_iter (line 574) | def _find_iter(self, thing):
    method __contains__ (line 582) | def __contains__(self, thing):
    method __call__ (line 594) | def __call__(self, **kwargs):
    method __iter__ (line 612) | def __iter__(self):
    method __len__ (line 615) | def __len__(self) -> int:
    method __dir__ (line 618) | def __dir__(self) -> Iterable[str]:
    method __add__ (line 621) | def __add__(self, other: Catalog | DataDescription):
    method __iadd__ (line 634) | def __iadd__(self, other: Catalog | ReaderDescription):
    method __repr__ (line 643) | def __repr__(self):
    method __setitem__ (line 653) | def __setitem__(self, name: str, entry):
    method rename (line 661) | def rename(self, old: str, new: str, clobber=True):
    method name (line 668) | def name(self):
    method give_name (line 674) | def give_name(self, tok: str, name: str, clobber=True):

FILE: intake/readers/examples.py
  function ms_building_parquet (line 5) | def ms_building_parquet():
  function ms_delta_buildings (line 76) | def ms_delta_buildings():

FILE: intake/readers/importlist.py
  function process_entries (line 23) | def process_entries():

FILE: intake/readers/mixins.py
  class PipelineMixin (line 12) | class PipelineMixin(Completable):
    method __getattr__ (line 15) | def __getattr__(self, item):
    method __getitem__ (line 33) | def __getitem__(self, item: str):
    method __dir__ (line 52) | def __dir__(self):
    method _namespaces (line 56) | def _namespaces(self):
    method output_doc (line 62) | def output_doc(cls):
    method apply (line 67) | def apply(self, func, *args, output_instance=None, **kwargs):
    method transform (line 83) | def transform(self):
  class Functioner (line 90) | class Functioner(Completable):
    method __init__ (line 93) | def __init__(self, reader, funcdict):
    method _ipython_key_completions_ (line 97) | def _ipython_key_completions_(self):
    method __getitem__ (line 100) | def __getitem__(self, item):
    method __repr__ (line 125) | def __repr__(self):
    method __call__ (line 130) | def __call__(self, func, *args, output_instance=None, **kwargs):
    method methods (line 143) | def methods(self):
    method __dir__ (line 152) | def __dir__(self):
    method __getattr__ (line 155) | def __getattr__(self, item):

FILE: intake/readers/namespaces.py
  class Namespace (line 16) | class Namespace(Completable):
    method __init__ (line 22) | def __init__(self, reader):
    method _funcs (line 27) | def _funcs(cls) -> Iterable[str]:
    method __dir__ (line 34) | def __dir__(self) -> Iterable[str]:
    method __getattr__ (line 39) | def __getattr__(self, item):
    method __repr__ (line 48) | def __repr__(self):
  class FuncHolder (line 52) | class FuncHolder:
    method __init__ (line 55) | def __init__(self, reader, func):
    method __call__ (line 59) | def __call__(self, *args, **kwargs):
  class np (line 63) | class np(Namespace):
  class ak (line 68) | class ak(Namespace):
  class xr (line 73) | class xr(Namespace):
  class pd (line 78) | class pd(Namespace):
  class pl (line 83) | class pl(Namespace):
  function get_namespaces (line 88) | def get_namespaces(reader):

FILE: intake/readers/output.py
  class PandasToParquet (line 31) | class PandasToParquet(BaseConverter):
    method run (line 37) | def run(self, x, url, storage_options=None, metadata=None, **kwargs):
  class PandasToCSV (line 42) | class PandasToCSV(BaseConverter):
    method run (line 48) | def run(self, x, url, storage_options=None, metadata=None, **kwargs):
  class PandasToHDF5 (line 53) | class PandasToHDF5(BaseConverter):
    method run (line 59) | def run(self, x, url, table, storage_options=None, metadata=None, **kw...
  class PandasToFeather (line 64) | class PandasToFeather(BaseConverter):
    method run (line 70) | def run(self, x, url, storage_options=None, metadata=None, **kwargs):
  class XarrayToNetCDF (line 76) | class XarrayToNetCDF(BaseConverter):
    method run (line 79) | def run(self, x, url, group="", metadata=None, **kwargs):
  class XarrayToZarr (line 84) | class XarrayToZarr(BaseConverter):
    method run (line 88) | def run(self, x, url, group="", storage_options=None, metadata=None, *...
  class DaskArrayToZarr (line 93) | class DaskArrayToZarr(BaseConverter):
    method run (line 97) | def run(self, x, url, group="", storage_options=None, metadata=None, *...
  class NumpyToNumpyFile (line 107) | class NumpyToNumpyFile(BaseConverter):
    method run (line 113) | def run(self, x, path, *args, storage_options=None, metadata=None, **k...
  class ToMatplotlib (line 122) | class ToMatplotlib(BaseConverter):
    method run (line 130) | def run(self, x, **kwargs):
  class MatplotlibToPNG (line 137) | class MatplotlibToPNG(BaseConverter):
    method run (line 146) | def run(self, x, url, metadata=None, storage_options=None, **kwargs):
  class GeopandasToFile (line 152) | class GeopandasToFile(BaseConverter):
    method run (line 160) | def run(self, x, url, metadata=None, **kwargs):
  class Repr (line 165) | class Repr(BaseConverter):
  class IPythonDisplay (line 172) | class IPythonDisplay(BaseConverter):
    method run (line 176) | def run(self, x, **kwargs):
  class CatalogToJson (line 199) | class CatalogToJson(BaseConverter):
    method run (line 203) | def run(self, x, url, metadata=None, storage_options=None, **kwargs):

FILE: intake/readers/readers.py
  class BaseReader (line 23) | class BaseReader(Tokenizable, PipelineMixin):
    method __init__ (line 32) | def __init__(
    method __repr__ (line 73) | def __repr__(self):
    method __call__ (line 76) | def __call__(self, *args, **kwargs):
    method doc (line 85) | def doc(cls):
    method discover (line 95) | def discover(self, **kwargs):
    method _func (line 104) | def _func(self):
    method read (line 110) | def read(self, *args, **kwargs):
    method _read (line 123) | def _read(self, *args, **kwargs):
    method to_entry (line 127) | def to_entry(self):
    method to_cat (line 138) | def to_cat(self, name=None):
    method data (line 143) | def data(self):
    method to_reader (line 155) | def to_reader(self, outtype: tuple[str] | str | None = None, reader: s...
    method auto_pipeline (line 159) | def auto_pipeline(self, outtype: str | tuple[str], avoid: list[str] | ...
  class FileReader (line 165) | class FileReader(BaseReader):
    method _read (line 172) | def _read(self, data, **kw):
  class OpenFilesReader (line 179) | class OpenFilesReader(FileReader):
  class PanelImageViewer (line 186) | class PanelImageViewer(FileReader):
  class FileByteReader (line 193) | class FileByteReader(FileReader):
    method discover (line 199) | def discover(self, data=None, **kwargs):
    method _read (line 204) | def _read(self, data, **kwargs):
  class FileTextReader (line 212) | class FileTextReader(FileReader):
    method discover (line 218) | def discover(self, data=None, encoding=None, **kwargs):
    method _read (line 225) | def _read(self, data, encoding=None, **kwargs):
  class FileSizeReader (line 235) | class FileSizeReader(FileReader):
    method _read (line 239) | def _read(self, data, **kw):
  class Pandas (line 245) | class Pandas(FileReader):
  class PandasParquet (line 251) | class PandasParquet(Pandas):
  class PandasFeather (line 258) | class PandasFeather(Pandas):
  class PandasORC (line 265) | class PandasORC(Pandas):
  class PandasExcel (line 272) | class PandasExcel(Pandas):
  class PandasSQLAlchemy (line 279) | class PandasSQLAlchemy(BaseReader):
    method discover (line 285) | def discover(self, **kwargs):
    method _read (line 290) | def _read(self, data, **kwargs):
  class DaskDF (line 295) | class DaskDF(FileReader):
    method discover (line 300) | def discover(self, **kwargs):
  class DaskParquet (line 304) | class DaskParquet(DaskDF):
  class DaskGeoParquet (line 311) | class DaskGeoParquet(DaskParquet):
  class DaskHDF (line 317) | class DaskHDF(DaskDF):
    method _read (line 323) | def _read(self, data, **kw):
  class DaskJSON (line 327) | class DaskJSON(DaskDF):
  class DaskDeltaLake (line 333) | class DaskDeltaLake(DaskDF):
  class DaskSQL (line 340) | class DaskSQL(BaseReader):
    method _read (line 345) | def _read(self, data, index_col, **kw):
  class DaskNPYStack (line 350) | class DaskNPYStack(FileReader):
  class DaskZarr (line 361) | class DaskZarr(FileReader):
    method _read (line 367) | def _read(self, data, **kwargs):
  class NumpyZarr (line 376) | class NumpyZarr(FileReader):
    method _read (line 382) | def _read(self, data, **kwargs):
  class DuckDB (line 388) | class DuckDB(BaseReader):
    method discover (line 395) | def discover(self, **kwargs):
    method _duck (line 399) | def _duck(cls, data, conn=None):
  class DuckParquet (line 428) | class DuckParquet(DuckDB, FileReader):
    method _read (line 431) | def _read(self, data, **kwargs):
  class DuckCSV (line 435) | class DuckCSV(DuckDB, FileReader):
    method _read (line 438) | def _read(self, data, **kwargs):
  class DuckJSON (line 442) | class DuckJSON(DuckDB, FileReader):
    method _read (line 445) | def _read(self, data, **kwargs):
  class DuckSQL (line 449) | class DuckSQL(DuckDB):
    method _read (line 452) | def _read(self, data, **kwargs):
  class SparkDataFrame (line 458) | class SparkDataFrame(FileReader):
    method discover (line 464) | def discover(self, **kwargs):
  class SparkCSV (line 468) | class SparkCSV(SparkDataFrame):
    method _read (line 471) | def _read(self, data, **kwargs):
  class SparkParquet (line 475) | class SparkParquet(SparkDataFrame):
    method _read (line 478) | def _read(self, data, **kwargs):
  class SparkText (line 482) | class SparkText(SparkDataFrame):
    method _read (line 485) | def _read(self, data, **kwargs):
  class SparkDeltaLake (line 489) | class SparkDeltaLake(SparkDataFrame):
    method _read (line 493) | def _read(self, data, **kw):
  class HuggingfaceReader (line 498) | class HuggingfaceReader(BaseReader):
    method _read (line 504) | def _read(self, data, *args, **kwargs):
  class SKLearnExampleReader (line 508) | class SKLearnExampleReader(BaseReader):
    method _read (line 513) | def _read(self, name, **kw):
  class LlamaServerReader (line 522) | class LlamaServerReader(BaseReader):
    method _short_kwargs_docs (line 594) | def _short_kwargs_docs(cls):
    method _find_executable (line 599) | def _find_executable(cls):
    method check_imports (line 610) | def check_imports(cls):
    method _local_model_path (line 615) | def _local_model_path(self, data, callback=DEFAULT_CALLBACK):
    method _read (line 643) | def _read(self, data, log_file="llama-cpp.log", **kwargs):
  class LlamaCPPCompletion (line 707) | class LlamaCPPCompletion(BaseReader):
    method _read (line 712) | def _read(self, data, prompt: str = "", *args, **kwargs):
  class LlamaCPPEmbedding (line 723) | class LlamaCPPEmbedding(BaseReader):
    method _read (line 728) | def _read(self, data, prompt: str = "", *args, **kwargs):
  class OpenAIReader (line 739) | class OpenAIReader(BaseReader):
    method _read (line 744) | def _read(self, data, **kwargs):
  class OpenAICompletion (line 751) | class OpenAICompletion(BaseReader):
    method _read (line 757) | def _read(self, data, messages: list[dict], *args, model="gtp-3.5-turb...
  class TorchDataset (line 771) | class TorchDataset(BaseReader):
    method _read (line 774) | def _read(self, modname, funcname, rootdir, **kw):
  class TFPublicDataset (line 785) | class TFPublicDataset(BaseReader):
    method _read (line 790) | def _read(self, name, *args, **kwargs):
  class TFTextreader (line 794) | class TFTextreader(FileReader):
  class TFORC (line 802) | class TFORC(FileReader):
  class TFSQL (line 810) | class TFSQL(BaseReader):
    method _read (line 816) | def _read(self, data, **kwargs):
  class KerasImageReader (line 820) | class KerasImageReader(FileReader):
  class KerasText (line 828) | class KerasText(FileReader):
  class KerasAudio (line 836) | class KerasAudio(FileReader):
  class KerasModelReader (line 844) | class KerasModelReader(FileReader):
  class TFRecordReader (line 852) | class TFRecordReader(FileReader):
  class SKLearnModelReader (line 860) | class SKLearnModelReader(FileReader):
    method _read (line 868) | def _read(self, data, **kw):
  class Awkward (line 873) | class Awkward(FileReader):
  class AwkwardParquet (line 879) | class AwkwardParquet(Awkward):
    method discover (line 885) | def discover(self, **kwargs):
  class DaskAwkwardParquet (line 890) | class DaskAwkwardParquet(AwkwardParquet):
    method discover (line 895) | def discover(self, **kwargs):
  class AwkwardJSON (line 899) | class AwkwardJSON(Awkward):
  class AwkwardAVRO (line 905) | class AwkwardAVRO(Awkward):
  class DaskAwkwardJSON (line 911) | class DaskAwkwardJSON(Awkward):
    method discover (line 917) | def discover(self, **kwargs):
  class HandleToUrlReader (line 921) | class HandleToUrlReader(BaseReader):
    method _extract (line 933) | def _extract(cls, meta, base):
    method _read (line 946) | def _read(self, data, base="https://hdl.handle.net/api/handles", **kwa...
  class PandasCSV (line 957) | class PandasCSV(Pandas):
    method discover (line 962) | def discover(self, **kw):
  class PandasHDF5 (line 970) | class PandasHDF5(Pandas):
    method _read (line 975) | def _read(self, data, **kw):
  class DaskCSV (line 982) | class DaskCSV(DaskDF):
  class DaskText (line 988) | class DaskText(FileReader):
    method discover (line 996) | def discover(self, n=10, **kwargs):
  class DaskCSVPattern (line 1000) | class DaskCSVPattern(DaskCSV):
    method _read (line 1009) | def _read(self, data, **kw):
  class Polars (line 1030) | class Polars(FileReader):
    method discover (line 1035) | def discover(self, **kwargs):
  class PolarsDeltaLake (line 1041) | class PolarsDeltaLake(Polars):
  class PolarsAvro (line 1046) | class PolarsAvro(Polars):
  class PolarsFeather (line 1052) | class PolarsFeather(Polars):
  class PolarsParquet (line 1057) | class PolarsParquet(Polars):
  class PolarsCSV (line 1062) | class PolarsCSV(Polars):
  class PolarsJSON (line 1067) | class PolarsJSON(Polars):
  class PolarsIceberg (line 1072) | class PolarsIceberg(Polars):
  class PolarsExcel (line 1078) | class PolarsExcel(Polars):
  class Ray (line 1084) | class Ray(FileReader):
    method discover (line 1090) | def discover(self, **kwargs):
    method _read (line 1093) | def _read(self, data, **kw):
  class RayParquet (line 1105) | class RayParquet(Ray):
  class RayCSV (line 1110) | class RayCSV(Ray):
  class RayJSON (line 1115) | class RayJSON(Ray):
  class RayText (line 1120) | class RayText(Ray):
  class RayBinary (line 1125) | class RayBinary(Ray):
  class RayDeltaLake (line 1130) | class RayDeltaLake(Ray):
  class DeltaReader (line 1137) | class DeltaReader(FileReader):
  class TiledNode (line 1146) | class TiledNode(BaseReader):
    method _read (line 1152) | def _read(self, data, **kwargs):
  class TiledClient (line 1158) | class TiledClient(BaseReader):
    method _read (line 1163) | def _read(self, data, as_client=True, dask=False, **kwargs):
  class TileDBReader (line 1177) | class TileDBReader(BaseReader):
    method _read (line 1183) | def _read(self, data, attribute=None, **kwargs):
  class TileDBDaskReader (line 1187) | class TileDBDaskReader(BaseReader):
    method _read (line 1193) | def _read(self, data, attribute=None, **kwargs):
  class PythonModule (line 1197) | class PythonModule(BaseReader):
    method _read (line 1201) | def _read(self, data, module_name=None, **kwargs):
  class SKImageReader (line 1212) | class SKImageReader(FileReader):
  class NumpyText (line 1220) | class NumpyText(FileReader):
    method _read (line 1226) | def _read(self, data, **kw):
  class NumpyReader (line 1233) | class NumpyReader(NumpyText):
  class CupyNumpyReader (line 1238) | class CupyNumpyReader(NumpyText):
  class CupyTextReader (line 1245) | class CupyTextReader(CupyNumpyReader):
  class XArrayDatasetReader (line 1250) | class XArrayDatasetReader(FileReader):
    method _read (line 1270) | def _read(self, data, open_local=False, **kw):
  class XArrayPatternReader (line 1351) | class XArrayPatternReader(XArrayDatasetReader):
    method _read (line 1368) | def _read(self, data, open_local=False, pattern=None, **kw):
  class RasterIOXarrayReader (line 1391) | class RasterIOXarrayReader(FileReader):
    method _read (line 1398) | def _read(self, data, concat_kwargs=None, **kwargs):
  class GeoPandasReader (line 1417) | class GeoPandasReader(FileReader):
    method _read (line 1433) | def _read(self, data, with_fsspec=None, **kwargs):
  class GeoPandasTabular (line 1448) | class GeoPandasTabular(FileReader):
    method _read (line 1456) | def _read(self, data, **kwargs):
  class ScipyMatlabReader (line 1471) | class ScipyMatlabReader(FileReader):
    method _read (line 1477) | def _read(self, data, **kwargs):
  class ScipyMatrixMarketReader (line 1481) | class ScipyMatrixMarketReader(FileReader):
    method _read (line 1487) | def _read(self, data, **kw):
  class NibabelNiftiReader (line 1492) | class NibabelNiftiReader(FileReader):
    method _read (line 1499) | def _read(self, data, **kw):
  class FITSReader (line 1504) | class FITSReader(FileReader):
    method _read (line 1510) | def _read(self, data, **kw):
  class ASDFReader (line 1518) | class ASDFReader(FileReader):
    method _read (line 1524) | def _read(self, data, **kw):
  class DicomReader (line 1532) | class DicomReader(FileReader):
    method _read (line 1540) | def _read(self, data, **kw):
  class Condition (line 1545) | class Condition(BaseReader):
    method _read (line 1546) | def _read(
  class PMTileReader (line 1561) | class PMTileReader(BaseReader):
    method _read (line 1566) | def _read(self, data):
  class FileExistsReader (line 1582) | class FileExistsReader(BaseReader):
    method _read (line 1587) | def _read(self, data, *args, **kwargs):
  class YAMLCatalogReader (line 1595) | class YAMLCatalogReader(FileReader):
  class PrometheusMetricReader (line 1603) | class PrometheusMetricReader(BaseReader):
    method _read (line 1610) | def _read(self, data: datatypes.Prometheus, *args, **kwargs):
  class Retry (line 1631) | class Retry(BaseReader):
    method _read (line 1637) | def _read(
  function recommend (line 1682) | def recommend(data):
  function reader_from_call (line 1702) | def reader_from_call(func: str, *args, join_lines=False, **kwargs) -> Ba...

FILE: intake/readers/search.py
  class SearchBase (line 11) | class SearchBase:
    method filter (line 17) | def filter(self, entry: ReaderDescription) -> bool:
    method __or__ (line 22) | def __or__(self, other):
    method __and__ (line 25) | def __and__(self, other):
    method __inv__ (line 28) | def __inv__(self):
  class Or (line 32) | class Or(SearchBase):
    method __init__ (line 33) | def __init__(self, first: SearchBase, second: SearchBase):
    method filter (line 37) | def filter(self, entry: ReaderDescription) -> bool:
  class And (line 41) | class And(SearchBase):
    method __init__ (line 42) | def __init__(self, first: SearchBase, second: SearchBase):
    method filter (line 46) | def filter(self, entry: ReaderDescription) -> bool:
  class Not (line 50) | class Not(SearchBase):
    method __init__ (line 51) | def __init__(self, first: SearchBase):
    method filter (line 54) | def filter(self, entry: ReaderDescription) -> bool:
  class Any (line 58) | class Any(SearchBase):
    method __init__ (line 59) | def __init__(self, *terms: tuple[SearchBase, ...]):
    method filter (line 62) | def filter(self, entry: ReaderDescription) -> bool:
  class All (line 66) | class All(SearchBase):
    method __init__ (line 67) | def __init__(self, *terms: tuple[SearchBase, ...]):
    method filter (line 70) | def filter(self, entry: ReaderDescription) -> bool:
  class Text (line 74) | class Text(SearchBase):
    method __init__ (line 77) | def __init__(self, text: str):
    method filter (line 80) | def filter(self, entry: ReaderDescription) -> bool:
  class Importable (line 84) | class Importable(SearchBase):
    method filter (line 91) | def filter(self, entry: ReaderDescription) -> bool:
  class EnvironmentSatisfied (line 95) | class EnvironmentSatisfied(SearchBase):
    method filter (line 107) | def filter(self, entry: ReaderDescription) -> bool:
    method _is_consistent (line 115) | def _is_consistent(env, output=False):

FILE: intake/readers/tests/cats/test_sql.py
  function postgres_with_data (line 13) | def postgres_with_data(postgresql):
  function test_pg_pandas (line 25) | def test_pg_pandas(postgres_with_data):
  function test_pg_duck_with_pandas_input (line 41) | def test_pg_duck_with_pandas_input(postgres_with_data):
  function sqlite_with_data (line 54) | def sqlite_with_data(tmpdir):
  function test_sqlite_pandas (line 66) | def test_sqlite_pandas(sqlite_with_data):
  function test_sqlite_duck_with_pandas_input (line 76) | def test_sqlite_duck_with_pandas_input(sqlite_with_data):
  function test_pandas_duck_pandas (line 85) | def test_pandas_duck_pandas(sqlite_with_data):
  function test_cat (line 101) | def test_cat(sqlite_with_data):

FILE: intake/readers/tests/cats/test_stac.py
  function test_1 (line 13) | def test_1():
  function test_bands (line 22) | def test_bands():

FILE: intake/readers/tests/cats/test_thredds.py
  function test_1 (line 10) | def test_1():

FILE: intake/readers/tests/cats/test_tiled.py
  function tiled_server (line 13) | def tiled_server():
  function test_catalog_workflow (line 41) | def test_catalog_workflow(tiled_server):

FILE: intake/readers/tests/test_basic.py
  function test1 (line 9) | def test1():
  function test_recommend_filetype (line 17) | def test_recommend_filetype():
  function test_recommend_reader (line 26) | def test_recommend_reader():
  function test_data_metadata (line 39) | def test_data_metadata():

FILE: intake/readers/tests/test_consistency.py
  function test_readers (line 10) | def test_readers(cls):
  function test_data (line 26) | def test_data(cls):
  function test_filereaders (line 35) | def test_filereaders(cls):
  function test_converters (line 40) | def test_converters(cls):

FILE: intake/readers/tests/test_dict.py
  function test_yaml_roundtrip (line 5) | def test_yaml_roundtrip():

FILE: intake/readers/tests/test_errors.py
  function test_func_ser (line 6) | def test_func_ser():

FILE: intake/readers/tests/test_reader.py
  function test_reader_from_call (line 8) | def test_reader_from_call():
  function xarray_dataset (line 28) | def xarray_dataset():
  function test_xarray_pattern (line 51) | def test_xarray_pattern(tmpdir, xarray_dataset):
  function test_xarray_dataset_remote_url_glob_str (line 77) | def test_xarray_dataset_remote_url_glob_str(tmpdir, xarray_dataset):
  function icechunk_xr_repo (line 109) | def icechunk_xr_repo(tmpdir):
  function test_icechunk (line 146) | def test_icechunk(icechunk_xr_repo):

FILE: intake/readers/tests/test_search.py
  class NotImportable (line 6) | class NotImportable(BaseReader):
  function test_1 (line 12) | def test_1():

FILE: intake/readers/tests/test_up.py
  function test_basic (line 4) | def test_basic():
  function test_named_options (line 27) | def test_named_options():

FILE: intake/readers/tests/test_utils.py
  class OnlyOkeKey (line 6) | class OnlyOkeKey(LazyDict):
    method __getitem__ (line 7) | def __getitem__(self, item):
    method __iter__ (line 12) | def __iter__(self):
  function test_lazy_dict (line 16) | def test_lazy_dict():

FILE: intake/readers/tests/test_workflows.py
  function dataframe_file (line 14) | def dataframe_file():
  function df (line 21) | def df(dataframe_file):
  function test_pipelines_in_catalogs (line 25) | def test_pipelines_in_catalogs(dataframe_file, df):
  function test_pipeline_steps (line 45) | def test_pipeline_steps(dataframe_file, df):
  function test_parameters (line 72) | def test_parameters(dataframe_file, monkeypatch):
  function test_namespace (line 96) | def test_namespace(dataframe_file):
  function fails (line 106) | def fails(x):
  function test_retry (line 114) | def test_retry(dataframe_file):
  function dir_non_empty (line 133) | def dir_non_empty(d):
  function test_custom_cache (line 139) | def test_custom_cache(dataframe_file, tmpdir, df):
  function test_cat_mapper (line 169) | def test_cat_mapper(dataframe_file):

FILE: intake/readers/transform.py
  class DataFrameColumns (line 10) | class DataFrameColumns(BaseConverter):
    method run (line 14) | def run(self, x, columns, **_):
  class XarraySel (line 18) | class XarraySel(BaseConverter):
    method run (line 22) | def run(self, x, indexers, **_):
  class THREDDSCatToMergedDataset (line 26) | class THREDDSCatToMergedDataset(BaseConverter):
    method run (line 29) | def run(self, cat, path, driver="h5netcdf", xarray_kwargs=None, concat...
  class PysparkColumns (line 78) | class PysparkColumns(BaseConverter):
    method run (line 81) | def run(self, x, columns, **_):
  class Method (line 85) | class Method(BaseConverter):
    method run (line 93) | def run(self, x, *args, method_name: str = "", **kw):
  class GetItem (line 101) | class GetItem(BaseConverter):
    method _read (line 110) | def _read(self, item, data=None):
  function identity (line 114) | def identity(x):
  class CatalogMapper (line 118) | class CatalogMapper(BaseConverter):
    method run (line 121) | def run(

FILE: intake/readers/user_parameters.py
  class BaseUserParameter (line 25) | class BaseUserParameter(Tokenizable):
    method __init__ (line 28) | def __init__(self, default, description=""):
    method __repr__ (line 32) | def __repr__(self):
    method set_default (line 36) | def set_default(self, value):
    method with_default (line 44) | def with_default(self, value):
    method coerce (line 55) | def coerce(self, value):
    method _validate (line 59) | def _validate(self, value):
    method validate (line 62) | def validate(self, value) -> bool:
    method to_dict (line 72) | def to_dict(self):
  class SimpleUserParameter (line 78) | class SimpleUserParameter(BaseUserParameter):
    method __init__ (line 81) | def __init__(self, dtype: type = object, **kw):
    method _dtype (line 88) | def _dtype(self):
    method coerce (line 91) | def coerce(self, value):
    method _validate (line 96) | def _validate(self, value):
  class OptionsUserParameter (line 100) | class OptionsUserParameter(SimpleUserParameter):
    method __init__ (line 103) | def __init__(self, options, dtype=object, **kw):
    method _validate (line 107) | def _validate(self, value):
  class NamedOptionsUserParameter (line 111) | class NamedOptionsUserParameter(SimpleUserParameter):
    method __init__ (line 118) | def __init__(self, options, default, dtype=object, keytype=str, **kw):
    method coerce (line 123) | def coerce(self, value):
    method _validate (line 126) | def _validate(self, value):
  class MultiOptionUserParameter (line 132) | class MultiOptionUserParameter(OptionsUserParameter):
    method __init__ (line 138) | def __init__(self, options: list | tuple, dtype=object, **kw):
    method coerce_one (line 141) | def coerce_one(self, value):
    method coerce (line 144) | def coerce(self, value):
    method _validate (line 147) | def _validate(self, value):
  class BoundedNumberUserParameter (line 151) | class BoundedNumberUserParameter(SimpleUserParameter):
    method __init__ (line 154) | def __init__(self, dtype=float, max_value=None, min_value=None, **kw):
    method _validate (line 159) | def _validate(self, value):
  class NoMatch (line 173) | class NoMatch(ValueError):
  function register_template (line 177) | def register_template(name):
  function env (line 199) | def env(match, up):
  function data (line 205) | def data(match, up):
  function imp (line 230) | def imp(match, up):
  function unpickle (line 243) | def unpickle(match, up):
  function _set_values (line 252) | def _set_values(up, arguments):
  function set_values (line 280) | def set_values(user_parameters: dict[str, BaseUserParameter], arguments:...

FILE: intake/readers/utils.py
  class SecurityError (line 15) | class SecurityError(RuntimeError):
  function subclasses (line 19) | def subclasses(cls: type) -> set:
  function merge_dicts (line 32) | def merge_dicts(*dicts: dict) -> dict:
  function nested_keys_to_dict (line 64) | def nested_keys_to_dict(kw: dict[str, Any]) -> dict:
  function find_funcs (line 101) | def find_funcs(val, tokens={}):
  class LazyDict (line 134) | class LazyDict(Mapping):
    method __getitem__ (line 139) | def __getitem__(self, item):
    method __len__ (line 142) | def __len__(self):
    method keys (line 145) | def keys(self):
    method __contains__ (line 150) | def __contains__(self, item):
    method __iter__ (line 153) | def __iter__(self):
  class PartlyLazyDict (line 157) | class PartlyLazyDict(LazyDict):
    method __init__ (line 160) | def __init__(self, *mappings):
    method keys (line 169) | def keys(self):
    method __len__ (line 175) | def __len__(self):
    method __iter__ (line 178) | def __iter__(self):
    method __getitem__ (line 181) | def __getitem__(self, item):
    method __setitem__ (line 189) | def __setitem__(self, key, value):
    method update (line 192) | def update(self, data):
    method copy (line 198) | def copy(self):
  class FormatWithPassthrough (line 202) | class FormatWithPassthrough(dict):
    method __getitem__ (line 205) | def __getitem__(self, item):
  function check_imports (line 213) | def check_imports(*imports: Iterable[str]) -> bool:
  class Completable (line 226) | class Completable:
    method check_imports (line 231) | def check_imports(cls):
    method tab_completion_fixer (line 235) | def tab_completion_fixer(item):
  class Tokenizable (line 249) | class Tokenizable(Completable):
    method _dic_for_comp (line 260) | def _dic_for_comp(self):
    method _token (line 269) | def _token(self):
    method token (line 275) | def token(self):
    method __hash__ (line 285) | def __hash__(self):
    method __eq__ (line 289) | def __eq__(self, other):
    method qname (line 295) | def qname(cls):
    method to_dict (line 299) | def to_dict(self):
    method pprint (line 303) | def pprint(self):
    method from_dict (line 310) | def from_dict(cls, data):
  function to_dict (line 320) | def to_dict(thing):
  function make_cls (line 332) | def make_cls(cls: str | type, kwargs: dict):
  function descend_to_path (line 339) | def descend_to_path(path: str | list, kwargs: dict | list | tuple, name:...
  function extract_by_path (line 358) | def extract_by_path(path: str, cls: type, name: str, kwargs: dict, **kw)...
  function _by_value (line 365) | def _by_value(val, up, name):
  function extract_by_value (line 383) | def extract_by_value(value: Any, cls: type, name: str, kwargs: dict, **k...
  function replace_values (line 390) | def replace_values(val, needle, replace):
  function one_to_one (line 410) | def one_to_one(it: Iterable) -> dict:
  function all_to_one (line 414) | def all_to_one(it: Iterable, one: Any) -> dict:
  function camel_to_snake (line 423) | def camel_to_snake(name: str) -> str:
  function snake_to_camel (line 430) | def snake_to_camel(name: str) -> str:
  function pattern_to_glob (line 435) | def pattern_to_glob(pattern: str) -> str:
  function safe_dict (line 477) | def safe_dict(x):
  function port_in_use (line 488) | def port_in_use(host, port=None):
  function find_free_port (line 513) | def find_free_port():
  function _is_tok (line 523) | def _is_tok(s: str) -> bool:

FILE: intake/source/__init__.py
  class DriverRegistry (line 17) | class DriverRegistry(MappingView):
    method __init__ (line 24) | def __init__(self, drivers_source=drivers):
    method __getitem__ (line 27) | def __getitem__(self, item):
    method __iter__ (line 37) | def __iter__(self):
    method keys (line 40) | def keys(self):
    method __len__ (line 43) | def __len__(self):
    method __repr__ (line 46) | def __repr__(self):
    method __contains__ (line 49) | def __contains__(self, item):
  function import_name (line 61) | def import_name(name):
  function get_plugin_class (line 78) | def get_plugin_class(name):

FILE: intake/source/base.py
  class Schema (line 16) | class Schema(dict):
    method __getattr__ (line 17) | def __getattr__(self, item):
  class NoEntry (line 21) | class NoEntry(AttributeError):
  class DataSourceBase (line 25) | class DataSourceBase(DictSerialiseMixin):
    method __init__ (line 49) | def __init__(self, storage_options=None, metadata=None):
    method _get_schema (line 56) | def _get_schema(self):
    method _get_partition (line 60) | def _get_partition(self, i):
    method __eq__ (line 67) | def __eq__(self, other):
    method __hash__ (line 74) | def __hash__(self):
    method _close (line 77) | def _close(self):
    method _load_metadata (line 81) | def _load_metadata(self):
    method _yaml (line 90) | def _yaml(self):
    method yaml (line 115) | def yaml(self):
    method _ipython_display_ (line 124) | def _ipython_display_(self):
    method __repr__ (line 136) | def __repr__(self):
    method is_persisted (line 140) | def is_persisted(self):
    method has_been_persisted (line 145) | def has_been_persisted(self):
    method _get_cache (line 149) | def _get_cache(self, urlpath):
    method discover (line 153) | def discover(self):
    method read (line 164) | def read(self):
    method read_chunked (line 171) | def read_chunked(self):
    method read_partition (line 177) | def read_partition(self, i):
    method to_dask (line 189) | def to_dask(self):
    method to_spark (line 193) | def to_spark(self):
    method entry (line 206) | def entry(self):
    method configure_new (line 211) | def configure_new(self, **kwargs):
    method describe (line 235) | def describe(self):
    method close (line 239) | def close(self):
    method __enter__ (line 244) | def __enter__(self):
    method __exit__ (line 248) | def __exit__(self, exc_type, exc_value, traceback):
  class DataSource (line 252) | class DataSource(DataSourceBase):
  class PatternMixin (line 262) | class PatternMixin:

FILE: intake/source/csv.py
  class CSVSource (line 14) | class CSVSource(DataSource):
    method __init__ (line 23) | def __init__(self, urlpath, storage_options=None, metadata=None, **kwa...
    method discover (line 28) | def discover(self):
    method to_dask (line 31) | def to_dask(self):
    method read (line 34) | def read(self):
    method to_spark (line 37) | def to_spark(self):

FILE: intake/source/derived.py
  function _kwargs_string (line 11) | def _kwargs_string(kwargs_dict):
  class PipelineStepError (line 15) | class PipelineStepError(CatalogException):
  class MissingTargetError (line 19) | class MissingTargetError(CatalogException):
    method __init__ (line 20) | def __init__(self, source, step_index, method, target):
  function get_source (line 28) | def get_source(target, cat, kwargs, cat_kwargs):
  class AliasSource (line 38) | class AliasSource(DataSource):
    method __init__ (line 64) | def __init__(self, target, mapping=None, metadata=None, kwargs=None, c...
    method _get_source (line 88) | def _get_source(self):
    method discover (line 99) | def discover(self):
    method read (line 103) | def read(self):
    method read_partition (line 107) | def read_partition(self, i):
    method read_chunked (line 111) | def read_chunked(self):
    method to_dask (line 115) | def to_dask(self):
  function first (line 120) | def first(targets, cat, kwargs, cat_kwargs):
  function first_discoverable (line 130) | def first_discoverable(targets, cat, kwargs, cat_kwargs):
  class DerivedSource (line 146) | class DerivedSource(DataSource):
    method __init__ (line 159) | def __init__(
    method _validate_params (line 198) | def _validate_params(self):
    method _pick (line 205) | def _pick(self):
  class GenericTransform (line 216) | class GenericTransform(DerivedSource):
    method _validate_params (line 234) | def _validate_params(self):
    method _get_schema (line 239) | def _get_schema(self):
    method to_dask (line 244) | def to_dask(self):
    method read (line 252) | def read(self):
  class DataFrameTransform (line 257) | class DataFrameTransform(GenericTransform):
    method to_dask (line 269) | def to_dask(self):
    method _get_schema (line 275) | def _get_schema(self):
    method read (line 285) | def read(self):
  class Columns (line 289) | class Columns(DataFrameTransform):
    method __init__ (line 301) | def __init__(self, columns, **kwargs):
    method pick_columns (line 312) | def pick_columns(self, df):
  class DataFramePipeline (line 316) | class DataFramePipeline(DataFrameTransform):
    method __init__ (line 365) | def __init__(self, steps, **kwargs):
    method _get_sources (line 370) | def _get_sources(self):
    method pipeline (line 375) | def pipeline(self, df):

FILE: intake/source/discovery.py
  class DriverSouces (line 18) | class DriverSouces:
    method __init__ (line 23) | def __init__(self, config=None, do_scan=None):
    method package_scan (line 43) | def package_scan(self):
    method package_scan (line 47) | def package_scan(self, val):
    method from_entrypoints (line 50) | def from_entrypoints(self):
    method from_conf (line 61) | def from_conf(self):
    method __setitem__ (line 64) | def __setitem__(self, key, value):
    method __delitem__ (line 68) | def __delitem__(self, key):
    method scanned (line 73) | def scanned(self):
    method disabled (line 76) | def disabled(self):
    method registered (line 82) | def registered(self):
    method enabled_plugins (line 96) | def enabled_plugins(self):
    method register_driver (line 99) | def register_driver(self, name, value, clobber=False, do_enable=False):
    method unregister_driver (line 126) | def unregister_driver(self, name):
    method enable (line 131) | def enable(self, name, driver=None):
    method disable (line 156) | def disable(self, name):
  function _load_entrypoint (line 178) | def _load_entrypoint(entrypoint):
  function _normalize (line 197) | def _normalize(name):
  class ConfigurationError (line 207) | class ConfigurationError(Exception):

FILE: intake/source/jsonfiles.py
  class JSONFileSource (line 8) | class JSONFileSource(DataSource):
    method __init__ (line 20) | def __init__(
    method read (line 70) | def read(self):
    method _load_metadata (line 83) | def _load_metadata(self):
    method _get_schema (line 86) | def _get_schema(self):
  class JSONLinesFileSource (line 90) | class JSONLinesFileSource(DataSource):
    method __init__ (line 100) | def __init__(
    method _open (line 150) | def _open(self):
    method read (line 166) | def read(self):
    method head (line 170) | def head(self, nrows: int = 100):
    method _load_metadata (line 177) | def _load_metadata(self):
    method _get_schema (line 180) | def _get_schema(self):

FILE: intake/source/npy.py
  class NPySource (line 13) | class NPySource(DataSource):
    method __init__ (line 27) | def __init__(self, path, storage_options=None, metadata=None):
    method to_dask (line 43) | def to_dask(self):
    method read (line 46) | def read(self):

FILE: intake/source/tests/plugin_searchpath/collision_foo/__init__.py
  class FooPlugin (line 11) | class FooPlugin(DataSource):

FILE: intake/source/tests/plugin_searchpath/collision_foo2/__init__.py
  class FooPlugin (line 11) | class FooPlugin(DataSource):

FILE: intake/source/tests/plugin_searchpath/driver_with_entrypoints/__init__.py
  class SomeTestDriver (line 1) | class SomeTestDriver:

FILE: intake/source/tests/plugin_searchpath/intake_foo/__init__.py
  class FooPlugin (line 11) | class FooPlugin(DataSource):
    method __init__ (line 17) | def __init__(self, **kwargs):

FILE: intake/source/tests/plugin_searchpath/not_intake_foo/__init__.py
  class FooPlugin (line 11) | class FooPlugin(DataSource):
    method __init__ (line 17) | def __init__(self, **kwargs):

FILE: intake/source/tests/test_base.py
  function test_datasource_base_method_exceptions (line 20) | def test_datasource_base_method_exceptions():
  function test_name (line 34) | def test_name():
  function test_datasource_base_context_manager (line 40) | def test_datasource_base_context_manager():
  class MockDataSourceDataFrame (line 49) | class MockDataSourceDataFrame(base.DataSource):
    method __init__ (line 58) | def __init__(self, a, b):
    method _get_schema (line 67) | def _get_schema(self):
    method _get_partition (line 77) | def _get_partition(self, i):
    method read (line 87) | def read(self):
    method to_dask (line 90) | def to_dask(self):
    method _close (line 98) | def _close(self):
  function source_dataframe (line 103) | def source_dataframe():
  function test_datasource_discover (line 107) | def test_datasource_discover(source_dataframe):
  function check_df (line 132) | def check_df(data):
  function test_datasource_read (line 140) | def test_datasource_read(source_dataframe):
  function check_df_parts (line 146) | def check_df_parts(parts):
  function test_datasource_read_chunked (line 152) | def test_datasource_read_chunked(source_dataframe):
  function test_datasource_read_partition (line 158) | def test_datasource_read_partition(source_dataframe):
  function test_datasource_read_partition_out_of_range (line 166) | def test_datasource_read_partition_out_of_range(source_dataframe):
  function test_datasource_to_dask (line 174) | def test_datasource_to_dask(source_dataframe):
  function test_datasource_close (line 180) | def test_datasource_close(source_dataframe):
  function test_datasource_context_manager (line 186) | def test_datasource_context_manager(source_dataframe):
  function test_datasource_pickle (line 193) | def test_datasource_pickle(source_dataframe):
  class MockDataSourcePython (line 200) | class MockDataSourcePython(base.DataSource):
    method __init__ (line 209) | def __init__(self, a, b):
    method _get_schema (line 218) | def _get_schema(self):
    method _get_partition (line 223) | def _get_partition(self, i):
    method read (line 233) | def read(self):
    method to_dask (line 236) | def to_dask(self):
    method _close (line 244) | def _close(self):
  function source_python (line 249) | def source_python():
  function test_datasource_python_discover (line 253) | def test_datasource_python_discover(source_python):
  function test_datasource_python_read (line 277) | def test_datasource_python_read(source_python):
  function test_datasource_python_to_dask (line 288) | def test_datasource_python_to_dask(source_python):
  function test_yaml_method (line 299) | def test_yaml_method(source_python):
  function test_alias_fail (line 306) | def test_alias_fail():
  function test_reconfigure (line 313) | def test_reconfigure():
  function test_import_name (line 333) | def test_import_name(data):

FILE: intake/source/tests/test_csv.py
  function data_filenames (line 22) | def data_filenames():
  function sample1_datasource (line 34) | def sample1_datasource(data_filenames):
  function sample2_datasource (line 39) | def sample2_datasource(data_filenames):
  function sample_pattern_datasource (line 44) | def sample_pattern_datasource(data_filenames):
  function sample_list_datasource (line 49) | def sample_list_datasource(data_filenames):
  function sample_list_datasource_with_glob (line 54) | def sample_list_datasource_with_glob(data_filenames):
  function sample_list_datasource_with_path_as_pattern_str (line 59) | def sample_list_datasource_with_path_as_pattern_str(data_filenames):
  function sample_pattern_datasource_with_cache (line 67) | def sample_pattern_datasource_with_cache(data_filenames):
  function footer_csv_dir (line 81) | def footer_csv_dir():
  function sample_datasource_with_skipfooter (line 86) | def sample_datasource_with_skipfooter(request, footer_csv_dir):
  function test_csv_plugin (line 92) | def test_csv_plugin():
  function test_open (line 99) | def test_open(data_filenames):
  function test_discover (line 106) | def test_discover(sample1_datasource):
  function test_read_dask (line 115) | def test_read_dask(sample1_datasource, data_filenames):
  function test_read_pandas (line 125) | def test_read_pandas(sample1_datasource, data_filenames):
  function test_read_list (line 134) | def test_read_list(sample_list_datasource, data_filenames):
  function test_read_list_with_glob (line 152) | def test_read_list_with_glob(sample_list_datasource_with_glob, data_file...
  function test_read_chunked (line 172) | def test_read_chunked(sample1_datasource, data_filenames):
  function check_read_pattern_output (line 181) | def check_read_pattern_output(df, df_part):
  function test_read_pattern_dask (line 215) | def test_read_pattern_dask(sample_pattern_datasource):
  function test_read_pattern_pandas (line 222) | def test_read_pattern_pandas(sample_pattern_datasource):
  function test_read_pattern_with_cache (line 231) | def test_read_pattern_with_cache(sample_pattern_datasource_with_cache):
  function test_read_pattern_with_path_as_pattern_str (line 238) | def test_read_pattern_with_path_as_pattern_str(sample_list_datasource_wi...
  function test_read_partition (line 245) | def test_read_partition(sample2_datasource, data_filenames):
  function test_to_dask (line 260) | def test_to_dask(sample1_datasource, data_filenames):
  function test_plot (line 269) | def test_plot(sample1_datasource):
  function test_close (line 277) | def test_close(sample1_datasource, data_filenames):
  function test_pickle (line 286) | def test_pickle(sample1_datasource):
  function test_skipfooter (line 296) | def test_skipfooter(sample_datasource_with_skipfooter, footer_csv_dir):

FILE: intake/source/tests/test_derived.py
  function pipe_cat (line 14) | def pipe_cat():
  function test_columns (line 19) | def test_columns():
  function _pick_columns (line 26) | def _pick_columns(df, columns):
  function test_df_transform (line 30) | def test_df_transform():
  function test_barebones (line 37) | def test_barebones():
  function test_other_cat (line 44) | def test_other_cat():
  function test_pipeline_no_loc (line 50) | def test_pipeline_no_loc(pipe_cat):
  function test_pipeline_failed (line 58) | def test_pipeline_failed(pipe_cat):
  function test_pipeline_cols (line 63) | def test_pipeline_cols(pipe_cat):
  function test_pipeline_accessor (line 76) | def test_pipeline_accessor(pipe_cat):
  function test_pipeline_assign (line 82) | def test_pipeline_assign(pipe_cat):
  function test_pipeline_assign_value (line 89) | def test_pipeline_assign_value(pipe_cat):
  function test_pipeline_concat (line 96) | def test_pipeline_concat(pipe_cat):
  function test_pipeline_merge (line 104) | def test_pipeline_merge(pipe_cat):
  function test_pipeline_merge_fail (line 116) | def test_pipeline_merge_fail(pipe_cat):
  function test_pipeline_join (line 121) | def test_pipeline_join(pipe_cat):
  function test_pipeline_join_fail (line 145) | def test_pipeline_join_fail(pipe_cat):
  function test_pipeline_func (line 150) | def test_pipeline_func(pipe_cat):
  function test_pipeline_apply (line 160) | def test_pipeline_apply(pipe_cat):
  function test_groupby_apply (line 169) | def test_groupby_apply(pipe_cat):
  function test_groupby_transform (line 176) | def test_groupby_transform(pipe_cat):
  function test_pipeline_dask (line 182) | def test_pipeline_dask(pipe_cat):

FILE: intake/source/tests/test_discovery.py
  function extra_pythonpath (line 21) | def extra_pythonpath():
  function test_package_scan (line 34) | def test_package_scan(extra_pythonpath, tmp_config_path):
  function test_discover_cli (line 45) | def test_discover_cli(extra_pythonpath, tmp_config_path):
  function test_discover (line 76) | def test_discover(extra_pythonpath, tmp_config_path):
  function test_enable_and_disable (line 101) | def test_enable_and_disable(extra_pythonpath, tmp_config_path):
  function test_register_and_unregister (line 131) | def test_register_and_unregister(extra_pythonpath, tmp_config_path):
  function test_discover_collision (line 148) | def test_discover_collision(extra_pythonpath, tmp_config_path):

FILE: intake/source/tests/test_json.py
  function json_file (line 23) | def json_file(request, tmp_path) -> str:
  function jsonl_file (line 33) | def jsonl_file(request, tmp_path) -> str:
  function test_jsonfile (line 42) | def test_jsonfile(json_file: str):
  function test_jsonfile_none (line 49) | def test_jsonfile_none(json_file: str):
  function test_jsonfile_discover (line 63) | def test_jsonfile_discover(json_file: str):
  function test_jsonlfile (line 69) | def test_jsonlfile(jsonl_file: str):
  function test_jsonfilel_none (line 81) | def test_jsonfilel_none(jsonl_file: str):
  function test_jsonfilel_discover (line 100) | def test_jsonfilel_discover(json_file: str):
  function test_jsonl_head (line 106) | def test_jsonl_head(jsonl_file: str):

FILE: intake/source/tests/test_npy.py
  function test_one_file (line 22) | def test_one_file(tempdir, shape):
  function test_multi_file (line 41) | def test_multi_file(tempdir, shape):
  function test_zarr_minimal (line 79) | def test_zarr_minimal():
  function test_zarr_parts (line 91) | def test_zarr_parts():

FILE: intake/source/tests/test_text.py
  function test_textfiles (line 20) | def test_textfiles(tempdir):
  function test_complex_text (line 34) | def test_complex_text(tempdir, comp):
  function test_complex_bytes (line 67) | def test_complex_bytes(tempdir, comp, pars):
  function test_text_persist (line 91) | def test_text_persist(temp_cache):
  function test_text_export (line 98) | def test_text_export(temp_cache):

FILE: intake/source/tests/test_tiled.py
  function server (line 15) | def server():
  function test_simple (line 38) | def test_simple(server):

FILE: intake/source/tests/util.py
  function verify_plugin_interface (line 9) | def verify_plugin_interface(plugin):
  function verify_datasource_interface (line 15) | def verify_datasource_interface(source):
  function zscore (line 37) | def zscore(s):
  function reverse (line 41) | def reverse(s):

FILE: intake/source/textfiles.py
  class TextFilesSource (line 13) | class TextFilesSource(base.DataSource):
    method __init__ (line 28) | def __init__(
    method read (line 71) | def read(self):
    method to_spark (line 80) | def to_spark(self):

FILE: intake/source/tiled.py
  class TiledCatalog (line 5) | class TiledCatalog(Catalog):
    method __init__ (line 18) | def __init__(self, server, path=None):
    method search (line 49) | def search(self, query, type="text"):
    method __getitem__ (line 62) | def __getitem__(self, item):
  class TiledSource (line 81) | class TiledSource(DataSource):
    method __init__ (line 91) | def __init__(self, uri="", path="", instance=None, metadata=None):
    method discover (line 121) | def discover(self):
    method to_dask (line 132) | def to_dask(self):
    method read (line 136) | def read(self):
    method _yaml (line 139) | def _yaml(self):

FILE: intake/source/utils.py
  function tokenize (line 10) | def tokenize(*args, **kwargs):
  function _validate_format_spec (line 21) | def _validate_format_spec(format_spec):
  function _get_parts_of_format_string (line 29) | def _get_parts_of_format_string(resolved_string, literal_texts, format_s...
  function reverse_format (line 73) | def reverse_format(format_string, resolved_string):
  function reverse_formats (line 155) | def reverse_formats(format_string, resolved_strings):

FILE: intake/source/zarr.py
  class ZarrArraySource (line 13) | class ZarrArraySource(DataSource):
    method __init__ (line 26) | def __init__(self, urlpath, storage_options=None, component=None, meta...
    method to_dask (line 50) | def to_dask(self):
    method read (line 53) | def read(self):

FILE: intake/tests/test_config.py
  function test_load_conf (line 18) | def test_load_conf(conf):
  function test_pathdirs (line 32) | def test_pathdirs():
  function test_load_env (line 51) | def test_load_env(conf):

FILE: intake/tests/test_top_level.py
  function user_catalog (line 24) | def user_catalog():
  function tmp_path_catalog (line 35) | def tmp_path_catalog():
  function test_autoregister_open (line 48) | def test_autoregister_open():
  function test_default_catalogs (line 52) | def test_default_catalogs():
  function test_user_catalog (line 58) | def test_user_catalog(user_catalog):
  function test_open_styles (line 63) | def test_open_styles(tmp_path_catalog):
  function test_path_catalog (line 83) | def test_path_catalog(tmp_path_catalog):
  function test_bad_open (line 91) | def test_bad_open():
  function test_bad_open_helptext (line 102) | def test_bad_open_helptext():
  function test_output_notebook (line 113) | def test_output_notebook():
  function test_old_usage (line 118) | def test_old_usage():
  function test_no_imports (line 123) | def test_no_imports():
  function tmp_path_catalog_nested (line 142) | def tmp_path_catalog_nested():
  function test_nested_catalog_access (line 150) | def test_nested_catalog_access(tmp_path_catalog_nested):

FILE: intake/tests/test_utils.py
  function test_windows_file_path (line 17) | def test_windows_file_path():
  function test_make_path_posix_removes_double_sep (line 24) | def test_make_path_posix_removes_double_sep():
  function test_noops (line 38) | def test_noops(path):
  function test_roundtrip_file_path (line 43) | def test_roundtrip_file_path():
  function test_yaml_tuples (line 50) | def test_yaml_tuples():
  function copy_test_file (line 57) | def copy_test_file(filename, target_dir):

FILE: intake/util_tests.py
  function tempdir (line 16) | def tempdir():
  function temp_conf (line 26) | def temp_conf(conf):

FILE: intake/utils.py
  function import_name (line 24) | def import_name(name):
  function make_path_posix (line 35) | def make_path_posix(path):
  function no_duplicates_constructor (line 43) | def no_duplicates_constructor(loader, node, deep=False):
  function tuple_constructor (line 67) | def tuple_constructor(loader, node, deep=False):
  function represent_dictionary_order (line 71) | def represent_dictionary_order(self, dict_data):
  function no_duplicate_yaml (line 79) | def no_duplicate_yaml():
  function yaml_load (line 93) | def yaml_load(stream):
  function classname (line 99) | def classname(ob):
  class DictSerialiseMixin (line 109) | class DictSerialiseMixin(object):
    method __new__ (line 112) | def __new__(cls, *args, **kwargs):
    method classname (line 121) | def classname(self):
    method __dask_tokenize__ (line 124) | def __dask_tokenize__(self):
    method __getstate__ (line 131) | def __getstate__(self):
    method __setstate__ (line 149) | def __setstate__(self, state):
    method __hash__ (line 156) | def __hash__(self):
    method __eq__ (line 161) | def __eq__(self, other):
  function remake_instance (line 165) | def remake_instance(data):
  function pretty_describe (line 178) | def pretty_describe(object, nestedness=0, indent=2):
  function decode_datetime (line 189) | def decode_datetime(obj):
  function encode_datetime (line 206) | def encode_datetime(obj):
  class RegistryView (line 212) | class RegistryView(collections.abc.Mapping):
    method __init__ (line 222) | def __init__(self, registry):
    method __repr__ (line 225) | def __repr__(self):
    method __getitem__ (line 228) | def __getitem__(self, key):
    method __iter__ (line 231) | def __iter__(self):
    method __len__ (line 234) | def __len__(self):
    method update (line 239) | def update(self, *args, **kwargs):
    method __setitem__ (line 250) | def __setitem__(self, key, value):
    method __delitem__ (line 261) | def __delitem__(self, key):
  class DriverRegistryView (line 273) | class DriverRegistryView(RegistryView):
  class ContainerRegistryView (line 281) | class ContainerRegistryView(RegistryView):
  class ModuleImporter (line 289) | class ModuleImporter:
    method __init__ (line 290) | def __init__(self, destination):
    method __getattribute__ (line 294) | def __getattribute__(self, item):
  function is_notebook (line 307) | def is_notebook() -> bool:
  function is_fsspec_url (line 329) | def is_fsspec_url(s: str) -> bool:

Condensed preview: 237 files (1,226K chars total), each showing path, character count, and a content snippet.
[
  {
    "path": ".ci-coveragerc",
    "chars": 37,
    "preview": "[run]\nomit = *tests/*, */_version.py\n"
  },
  {
    "path": ".coveragerc",
    "chars": 109,
    "preview": "[run]\nomit =\n    */tests/*\n    */test_*.py\n    *_version.py\nsource =\n    intake\n[report]\nshow_missing = True\n"
  },
  {
    "path": ".gitattributes",
    "chars": 32,
    "preview": "intake/_version.py export-subst\n"
  },
  {
    "path": ".github/workflows/main.yaml",
    "chars": 777,
    "preview": "name: CI\n\non:\n  push:\n    branches: \"*\"\n  pull_request:\n    branches: master\n\njobs:\n  test:\n    name: ${{ matrix.OS }}-$"
  },
  {
    "path": ".github/workflows/pre-commit.yml",
    "chars": 325,
    "preview": "name: pre-commit\n\non:\n  pull_request:\n    branches:\n      - '*'\n  push:\n    branches: [master]\n  workflow_dispatch:\n\njob"
  },
  {
    "path": ".github/workflows/pypipublish.yaml",
    "chars": 666,
    "preview": "name: Upload Python Package\n\non:\n  release:\n    types: [created]\n\njobs:\n  deploy:\n    runs-on: ubuntu-latest\n    steps:\n"
  },
  {
    "path": ".gitignore",
    "chars": 1250,
    "preview": ".DS_Store\n# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n_version.py\n\n# C extensions\n*.so\n\n# "
  },
  {
    "path": ".pre-commit-config.yaml",
    "chars": 951,
    "preview": "# This is the configuration for pre-commit, a local framework for managing pre-commit hooks\n#   Check out the docs at: h"
  },
  {
    "path": "LICENSE",
    "chars": 1286,
    "preview": "Copyright (c) 2017, Anaconda, Inc.\nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or with"
  },
  {
    "path": "MANIFEST.in",
    "chars": 119,
    "preview": "prune .github\nprune docs\nprune examples\nprune scripts\nglobal-exclude test*.py *.yml *.yaml *.csv calvert* *.png *.json\n"
  },
  {
    "path": "README.md",
    "chars": 2317,
    "preview": "# Intake: Take 2\n\n**A general python package for describing, loading and processing data**\n\n![Logo](https://github.com/i"
  },
  {
    "path": "README_refactor.md",
    "chars": 6671,
    "preview": "## Intake Take2\n\nIntake has been extensively rewritten to produce Intake Take2,\nhttps://github.com/intake/intake/pull/73"
  },
  {
    "path": "docs/Makefile",
    "chars": 765,
    "preview": "# Minimal makefile for Sphinx documentation\n#\n\n# You can set these variables from the command line.\nSPHINXOPTS    =\nSPHI"
  },
  {
    "path": "docs/README.md",
    "chars": 746,
    "preview": "# Building Documentation\n\nAn environment with several prerequisites is needed to build the\ndocumentation.  Create this w"
  },
  {
    "path": "docs/environment.yml",
    "chars": 353,
    "preview": "name: intake-docs\nchannels:\n  - conda-forge\n\ndependencies:\n  - appdirs\n  - python=3.12\n  - dask\n  - numpy\n  - pandas\n  -"
  },
  {
    "path": "docs/make.bat",
    "chars": 814,
    "preview": "@ECHO OFF\r\n\r\npushd %~dp0\r\n\r\nREM Command file for Sphinx documentation\r\n\r\nif \"%SPHINXBUILD%\" == \"\" (\r\n\tset SPHINXBUILD=sp"
  },
  {
    "path": "docs/make_api.py",
    "chars": 2469,
    "preview": "import os\nimport sys\nimport intake\n\n\ndef run(path):\n    fn = os.path.join(path, \"source\", \"api2.rst\")\n    with open(fn, "
  },
  {
    "path": "docs/plugins.py",
    "chars": 5441,
    "preview": "import asyncio\n\nimport aiohttp\nimport pandas as pd\nimport yaml\n\n\ndef format_package_links(package_name, repo_link):\n    "
  },
  {
    "path": "docs/plugins.yaml",
    "chars": 6745,
    "preview": "- name: intake\n  repo: intake/intake\n  description: Builtin to Intake\n  drivers: catalog, csv, intake_remote, ndzarr, nu"
  },
  {
    "path": "docs/requirements.txt",
    "chars": 58,
    "preview": "sphinx\nsphinx_rtd_theme\nnumpydoc\npanel\nhvplot\nentrypoints\n"
  },
  {
    "path": "docs/source/_static/.keep",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "docs/source/_static/css/custom.css",
    "chars": 244,
    "preview": "div.prompt {\n  display: none\n}\n\ndiv.logo-block img {\n  display: none !important\n}\n\n.table_wrapper{\n  display: block;\n  o"
  },
  {
    "path": "docs/source/_static/images/plotting_example.html",
    "chars": 21810,
    "preview": "\n<!DOCTYPE html>\n<html lang=\"en\">\n    <head>\n        <meta charset=\"utf-8\">\n        <title>HoloPlot Plot</title>\n\n<link "
  },
  {
    "path": "docs/source/api.rst",
    "chars": 249,
    "preview": "API\n===\n\nAuto-generated reference\n\n.. toctree::\n   :maxdepth: 1\n\n   api_user.rst\n   api_base.rst\n   api_other.rst\n\n.. ra"
  },
  {
    "path": "docs/source/api2.rst",
    "chars": 10065,
    "preview": "\n.. _api2:\n\nAPI Reference\n=============\n\nUser Functions\n--------------\n\n.. autosummary::\n    intake.config.Config\n    in"
  },
  {
    "path": "docs/source/api_base.rst",
    "chars": 941,
    "preview": "Base Classes\n------------\n\nThis is a reference API class listing, useful mainly for developers.\n\n.. autosummary::\n   int"
  },
  {
    "path": "docs/source/api_other.rst",
    "chars": 1041,
    "preview": "Other Classes\n=============\n\nGUI\n---\n\n.. autosummary::\n\n   intake.interface.base.Base\n   intake.interface.base.BaseSelec"
  },
  {
    "path": "docs/source/api_user.rst",
    "chars": 2183,
    "preview": "End User\n--------\n\nThese are reference class and function definitions likely to be useful to everyone.\n\n.. autosummary::"
  },
  {
    "path": "docs/source/catalog.rst",
    "chars": 28191,
    "preview": "Catalogs\n========\n\nData catalogs provide an abstraction that allows you to externally define, and optionally share, desc"
  },
  {
    "path": "docs/source/changelog.rst",
    "chars": 2284,
    "preview": "Changelog\n=========\n\n2.0.4\n-----\n\nReleased March 19, 2024\n\n- re-enable v1 entrypoint sources\n- expose recommend function"
  },
  {
    "path": "docs/source/code-of-conduct.rst",
    "chars": 5532,
    "preview": "Code of Conduct\n===============\n\nAll participants in the fsspec community are expected to adhere to a Code of Conduct.\n\n"
  },
  {
    "path": "docs/source/community.rst",
    "chars": 2763,
    "preview": "Community\n=========\n\nIntake is used and developed by individuals at a variety of institutions.  It\nis open source (`lice"
  },
  {
    "path": "docs/source/conf.py",
    "chars": 5679,
    "preview": "#!/usr/bin/env python3\n# -*- coding: utf-8 -*-\n#\n# intake documentation build configuration file, created by\n# sphinx-qu"
  },
  {
    "path": "docs/source/contributing.rst",
    "chars": 5228,
    "preview": "Contributor guide\n=================\n\n``intake`` is an open-source project (see the LICENSE). We welcome contributions fr"
  },
  {
    "path": "docs/source/data-packages.rst",
    "chars": 13134,
    "preview": "Making Data Packages\n====================\n\nIntake can used to create :term:`Data packages`, so that you can easily distr"
  },
  {
    "path": "docs/source/deployments.rst",
    "chars": 5854,
    "preview": "Deployment Scenarios\n--------------------\n\nIn the following sections, we will describe some of the ways in which Intake "
  },
  {
    "path": "docs/source/examples.rst",
    "chars": 5535,
    "preview": "Examples\n========\n\nHere we list links to notebooks and other code demonstrating the use of Intake in various\nscenarios. "
  },
  {
    "path": "docs/source/glossary.rst",
    "chars": 8290,
    "preview": "Glossary\n========\n\n.. glossary::\n\n    Argument\n        One of a set of values passed to a function or class. In the Inta"
  },
  {
    "path": "docs/source/gui.rst",
    "chars": 6721,
    "preview": "GUI\n===\n\nUsing the GUI\n-------------\n\n**Note**: the GUI requires ``panel`` and ``bokeh`` to\nbe available in the current "
  },
  {
    "path": "docs/source/guide.rst",
    "chars": 453,
    "preview": "User Guide\n----------\n\nMore detailed information about specific parts of Intake, such as how to author catalogs,\nhow to "
  },
  {
    "path": "docs/source/index.rst",
    "chars": 2262,
    "preview": ".. raw:: html\n\n   <img src=\"_static/images/logo.png\" alt=\"Intake Logo\" style=\"float:right;width:94px;height:60px;\">\n\n.. "
  },
  {
    "path": "docs/source/index_v1.rst",
    "chars": 5736,
    "preview": ".. raw:: html\n\n   <img src=\"_static/images/logo.png\" alt=\"Intake Logo\" style=\"float:right;width:94px;height:60px;\">\n\n.. "
  },
  {
    "path": "docs/source/making-plugins.rst",
    "chars": 20880,
    "preview": "Making Drivers\n==============\n\nThe goal of the Intake plugin system is to make it very simple to implement a :term:`Driv"
  },
  {
    "path": "docs/source/overview.rst",
    "chars": 4365,
    "preview": "Overview\n========\n\nIntroduction\n------------\n\nThis page describes the technical design of Intake, with brief details of "
  },
  {
    "path": "docs/source/persisting.rst",
    "chars": 5532,
    "preview": ".. _persisting:\n\nPersisting Data\n===============\n\n(this is an experimental new feature, expect enhancements and changes)"
  },
  {
    "path": "docs/source/plotting.rst",
    "chars": 7903,
    "preview": "Plotting\n========\n\nIntake provides a plotting API based on the `hvPlot <https://hvplot.holoviz.org/index.html>`_ library"
  },
  {
    "path": "docs/source/plugin-directory.rst",
    "chars": 824,
    "preview": ".. _plugin-directory:\n\nPlugin Directory\n================\n\nThis is a list of known projects which install driver plugins "
  },
  {
    "path": "docs/source/quickstart.rst",
    "chars": 10481,
    "preview": "Quickstart\n==========\n\nThis guide will show you how to get started using Intake to read data, and give you a flavour\nof "
  },
  {
    "path": "docs/source/reference.rst",
    "chars": 263,
    "preview": "Reference\n---------\n\n\n.. toctree::\n    :maxdepth: 1\n\n    api.rst\n    changelog.rst\n    making-plugins.rst\n    data-packa"
  },
  {
    "path": "docs/source/roadmap.rst",
    "chars": 4323,
    "preview": ".. _roadmap:\n\nRoadmap\n=======\n\nSome high-level work that we expect to be achieved on the time-scale of months. This list"
  },
  {
    "path": "docs/source/scope2.rst",
    "chars": 5098,
    "preview": "Scope\n=====\n\nHere we lay out what Intake is, why you might want to use it, main features and also\na few reasons you may "
  },
  {
    "path": "docs/source/start.rst",
    "chars": 443,
    "preview": ".. _start:\n\nStart here\n----------\n\nThese documents will familiarise you with Intake, show you some basic usage and examp"
  },
  {
    "path": "docs/source/tools.rst",
    "chars": 7762,
    "preview": "Command Line Tools\n==================\n\nThe package installs two executable commands: for starting the catalog server; an"
  },
  {
    "path": "docs/source/tour2.rst",
    "chars": 8310,
    "preview": "Developers' Package Tour\n========================\n\nGeneral Guidelines\n------------------\n\nIntake is an open source proje"
  },
  {
    "path": "docs/source/transforms.rst",
    "chars": 7187,
    "preview": "Dataset Transforms\n------------------\n\naka. derived datasets.\n\n.. warning::\n    experimental feature, the API may change"
  },
  {
    "path": "docs/source/use_cases.rst",
    "chars": 18425,
    "preview": ".. _usecases:\n\nUse Cases - I want to...\n========================\n\nHere follows a list of specific things that people may"
  },
  {
    "path": "docs/source/user2.rst",
    "chars": 8566,
    "preview": ".. catalog_user:\n\nCatalog User\n============\n\nSo someone has sent you an Intake URL or other way to load a catalog. What "
  },
  {
    "path": "docs/source/walkthrough2.rst",
    "chars": 12834,
    "preview": "Creator Walkthrough\n===================\n\nAs soon as you have used a catalog, you may wonder how to create the - look no "
  },
  {
    "path": "examples/Take2.ipynb",
    "chars": 9939,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"id\": \"f5f6d518-dfe8-4651-b315-44bb55b885be\",\n "
  },
  {
    "path": "intake/__init__.py",
    "chars": 6789,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/__init__.py",
    "chars": 875,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/base.py",
    "chars": 17744,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/default.py",
    "chars": 3147,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/entry.py",
    "chars": 4010,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/exceptions.py",
    "chars": 2514,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/gui.py",
    "chars": 1076,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2019, Anaconda, I"
  },
  {
    "path": "intake/catalog/local.py",
    "chars": 34222,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/tests/__init__.py",
    "chars": 328,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/tests/cache_data/states.csv",
    "chars": 30103,
    "preview": "\"state\",\"slug\",\"code\",\"nickname\",\"website\",\"admission_date\",\"admission_number\",\"capital_city\",\"capital_url\",\"population\""
  },
  {
    "path": "intake/catalog/tests/catalog.yml",
    "chars": 730,
    "preview": "plugins:\n  source:\n    - module: intake.catalog.tests.example1_source\nsources:\n  use_example1:\n    description: example1"
  },
  {
    "path": "intake/catalog/tests/catalog1.yml",
    "chars": 2001,
    "preview": "name: name_in_cat\nmetadata:\n  test: true\nplugins:\n  source:\n    - module: intake.catalog.tests.example1_source\n    - mod"
  },
  {
    "path": "intake/catalog/tests/catalog_alias.yml",
    "chars": 1536,
    "preview": "sources:\n  input_data:\n    description: a local data file\n    driver: csv\n    args:\n      urlpath: '{{ CATALOG_DIR }}/ca"
  },
  {
    "path": "intake/catalog/tests/catalog_caching.yml",
    "chars": 2750,
    "preview": "metadata:\n  test: true\nplugins:\n  source:\n    - module: intake.catalog.tests.example1_source\n    - module: intake.catalo"
  },
  {
    "path": "intake/catalog/tests/catalog_dup_parameters.yml",
    "chars": 514,
    "preview": "sources:\n  entry1_part:\n    description: entry1 part\n    parameters:\n      part:\n        description: a\n        type: st"
  },
  {
    "path": "intake/catalog/tests/catalog_dup_sources.yml",
    "chars": 461,
    "preview": "sources:\n  entry1_part:\n    description: entry1 part\n    parameters:\n      part:\n        description: a\n        type: st"
  },
  {
    "path": "intake/catalog/tests/catalog_hierarchy.yml",
    "chars": 577,
    "preview": "sources:\n  a.b.c:\n    description: abc\n    driver: csv\n    args:\n      urlpath: '{{ CATALOG_DIR }}/entry1_*.csv'\n  a.b.d"
  },
  {
    "path": "intake/catalog/tests/catalog_named.yml",
    "chars": 274,
    "preview": "name: name_in_spec\ndescription: This is a catalog with a description in the yaml\nmetadata:\n  some: thing\nplugins:\n  sour"
  },
  {
    "path": "intake/catalog/tests/catalog_non_dict.yml",
    "chars": 12,
    "preview": "- 1\n- 2\n- 3\n"
  },
  {
    "path": "intake/catalog/tests/catalog_search/example_packages/ep/__init__.py",
    "chars": 27,
    "preview": "class TestCatalog:\n    ...\n"
  },
  {
    "path": "intake/catalog/tests/catalog_search/example_packages/ep-0.1.dist-info/entry_points.txt",
    "chars": 39,
    "preview": "[intake.catalogs]\nep1 = ep:TestCatalog\n"
  },
  {
    "path": "intake/catalog/tests/catalog_search/yaml.yml",
    "chars": 169,
    "preview": "plugins:\n  source:\n    - module: intake.catalog.tests.example1_source\nsources:\n  use_example1:\n    description: example1"
  },
  {
    "path": "intake/catalog/tests/catalog_union_1.yml",
    "chars": 169,
    "preview": "plugins:\n  source:\n    - module: intake.catalog.tests.example1_source\nsources:\n  use_example1:\n    description: example1"
  },
  {
    "path": "intake/catalog/tests/catalog_union_2.yml",
    "chars": 771,
    "preview": "plugins:\n  source:\n    - module: intake.catalog.tests.example1_source\n    - module: intake.catalog.tests.example2_source"
  },
  {
    "path": "intake/catalog/tests/conftest.py",
    "chars": 194,
    "preview": "import os.path\n\nimport pytest\n\nfrom intake import open_catalog\n\n\n@pytest.fixture\ndef catalog1():\n    path = os.path.dirn"
  },
  {
    "path": "intake/catalog/tests/data_source_missing.yml",
    "chars": 22,
    "preview": "plugins:\n  source: []\n"
  },
  {
    "path": "intake/catalog/tests/data_source_name_non_string.yml",
    "chars": 18,
    "preview": "sources:\n  1: foo\n"
  },
  {
    "path": "intake/catalog/tests/data_source_non_dict.yml",
    "chars": 13,
    "preview": "sources: foo\n"
  },
  {
    "path": "intake/catalog/tests/data_source_value_non_dict.yml",
    "chars": 18,
    "preview": "sources:\n  foo: 1\n"
  },
  {
    "path": "intake/catalog/tests/dot-nest.yaml",
    "chars": 603,
    "preview": "sources:\n  self:\n    description: this cat\n    driver: yaml_file_cat\n    args:\n      path: \"{{CATALOG_DIR}}/dot-nest.yam"
  },
  {
    "path": "intake/catalog/tests/entry1_1.csv",
    "chars": 67,
    "preview": "name,score,rank\nAlice1,100.5,1\nBob1,50.3,2\nCharlie1,25,3\nEve1,25,3\n"
  },
  {
    "path": "intake/catalog/tests/entry1_2.csv",
    "chars": 67,
    "preview": "name,score,rank\nAlice2,100.5,1\nBob2,50.3,2\nCharlie2,25,3\nEve2,25,3\n"
  },
  {
    "path": "intake/catalog/tests/example1_source.py",
    "chars": 614,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/tests/example_plugin_dir/example2_source.py",
    "chars": 567,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/tests/multi_plugins.yaml",
    "chars": 1562,
    "preview": "sources:\n  tables0:\n    args:\n        urlpath: \"{{ CATALOG_DIR }}/files*\"\n    description: \"short form\"\n    driver:\n    "
  },
  {
    "path": "intake/catalog/tests/multi_plugins2.yaml",
    "chars": 216,
    "preview": "sources:\n  tables6:\n    args:\n        urlpath: \"{{ CATALOG_DIR }}/files*\"\n    description: \"incompatible plugins\"\n    dr"
  },
  {
    "path": "intake/catalog/tests/obsolete_data_source_list.yml",
    "chars": 37,
    "preview": "sources:\n  - name: a\n    driver: csv\n"
  },
  {
    "path": "intake/catalog/tests/obsolete_params_list.yml",
    "chars": 62,
    "preview": "sources:\n  a:\n    driver: csv\n    parameters:\n      - name: b\n"
  },
  {
    "path": "intake/catalog/tests/params_missing_required.yml",
    "chars": 33,
    "preview": "sources:\n  a:\n    description: A\n"
  },
  {
    "path": "intake/catalog/tests/params_name_non_string.yml",
    "chars": 58,
    "preview": "sources:\n  a:\n    driver: csv\n    parameters:\n      1: {}\n"
  },
  {
    "path": "intake/catalog/tests/params_non_dict.yml",
    "chars": 48,
    "preview": "sources:\n  a:\n    driver: csv\n    parameters: b\n"
  },
  {
    "path": "intake/catalog/tests/params_value_bad_choice.yml",
    "chars": 99,
    "preview": "sources:\n  a:\n    driver: csv\n    parameters:\n      b:\n        description: B\n        type: string\n"
  },
  {
    "path": "intake/catalog/tests/params_value_bad_type.yml",
    "chars": 96,
    "preview": "sources:\n  a:\n    driver: csv\n    parameters:\n      b:\n        description: 1\n        type: str\n"
  },
  {
    "path": "intake/catalog/tests/params_value_non_dict.yml",
    "chars": 57,
    "preview": "sources:\n  a:\n    driver: csv\n    parameters:\n      b: 1\n"
  },
  {
    "path": "intake/catalog/tests/plugins_non_dict.yml",
    "chars": 23,
    "preview": "plugins: 0\nsources: {}\n"
  },
  {
    "path": "intake/catalog/tests/plugins_source_missing.yml",
    "chars": 34,
    "preview": "plugins:\n  s0urce: []\nsources: {}\n"
  },
  {
    "path": "intake/catalog/tests/plugins_source_missing_key.yml",
    "chars": 53,
    "preview": "plugins:\n  source:\n    - directory: /tmp\nsources: {}\n"
  },
  {
    "path": "intake/catalog/tests/plugins_source_non_dict.yml",
    "chars": 44,
    "preview": "plugins:\n  source:\n    - module\nsources: {}\n"
  },
  {
    "path": "intake/catalog/tests/plugins_source_non_list.yml",
    "chars": 38,
    "preview": "plugins:\n  source: module\nsources: {}\n"
  },
  {
    "path": "intake/catalog/tests/plugins_source_non_string.yml",
    "chars": 47,
    "preview": "plugins:\n  source:\n    - module: 0\nsources: {}\n"
  },
  {
    "path": "intake/catalog/tests/test_alias.py",
    "chars": 1984,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/tests/test_catalog_save.py",
    "chars": 756,
    "preview": "\"\"\"\nTest saving catalogs.\n\"\"\"\nimport os\n\nimport intake\nfrom intake.catalog import Catalog\nfrom intake.catalog.local impo"
  },
  {
    "path": "intake/catalog/tests/test_core.py",
    "chars": 307,
    "preview": "import pytest\n\nfrom intake.catalog.base import Catalog\n\n\ndef test_no_entry():\n    cat = Catalog()\n    cat2 = cat.configu"
  },
  {
    "path": "intake/catalog/tests/test_default.py",
    "chars": 725,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/tests/test_discovery.py",
    "chars": 1117,
    "preview": "import os\nimport sys\n\nfrom ..local import EntrypointsCatalog, MergedCatalog, YAMLFilesCatalog\n\n\ndef test_catalog_discove"
  },
  {
    "path": "intake/catalog/tests/test_gui.py",
    "chars": 1447,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2019, Anaconda, I"
  },
  {
    "path": "intake/catalog/tests/test_local.py",
    "chars": 26108,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/tests/test_parameters.py",
    "chars": 7339,
    "preview": "import os\n\nimport pytest\n\nimport intake\nfrom intake.catalog.local import LocalCatalogEntry, UserParameter\nfrom intake.so"
  },
  {
    "path": "intake/catalog/tests/test_reload_integration.py",
    "chars": 3575,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/tests/test_utils.py",
    "chars": 2471,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/tests/test_zarr.py",
    "chars": 5372,
    "preview": "import os\nimport shutil\nimport tempfile\n\nimport pytest\n\nfrom intake import open_catalog\nfrom intake.catalog.zarr import "
  },
  {
    "path": "intake/catalog/tests/util.py",
    "chars": 1015,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/utils.py",
    "chars": 11580,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/catalog/zarr.py",
    "chars": 3307,
    "preview": "from .base import Catalog\nfrom .local import LocalCatalogEntry\n\n\nclass ZarrGroupCatalog(Catalog):\n    \"\"\"A catalog of th"
  },
  {
    "path": "intake/config.py",
    "chars": 6050,
    "preview": "\"\"\"Intake config manipulations and persistence\"\"\"\n\n# -------------------------------------------------------------------"
  },
  {
    "path": "intake/conftest.py",
    "chars": 2541,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/container/__init__.py",
    "chars": 42,
    "preview": "def register_container(*_, **__):\n    ...\n"
  },
  {
    "path": "intake/container/base.py",
    "chars": 68,
    "preview": "class RemoteSource:\n    ...\n\n\ndef get_partition(*_, **__):\n    pass\n"
  },
  {
    "path": "intake/interface/__init__.py",
    "chars": 1304,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/interface/base.py",
    "chars": 9051,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2019, Anaconda, I"
  },
  {
    "path": "intake/interface/catalog/__init__.py",
    "chars": 328,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2019, Anaconda, I"
  },
  {
    "path": "intake/interface/catalog/add.py",
    "chars": 9399,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2019, Anaconda, I"
  },
  {
    "path": "intake/interface/catalog/search.py",
    "chars": 1023,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2019, Anaconda, I"
  },
  {
    "path": "intake/interface/gui.py",
    "chars": 7928,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2019, Anaconda, I"
  },
  {
    "path": "intake/interface/source/__init__.py",
    "chars": 328,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2019, Anaconda, I"
  },
  {
    "path": "intake/interface/source/defined_plots.py",
    "chars": 14604,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2022, Anaconda, I"
  },
  {
    "path": "intake/readers/__init__.py",
    "chars": 1413,
    "preview": "from intake.readers.datatypes import *  # noqa: F403\nfrom intake.readers.readers import *  # noqa: F403\nfrom intake.read"
  },
  {
    "path": "intake/readers/catalogs.py",
    "chars": 24163,
    "preview": "\"\"\"Data readers which create Catalog objects\"\"\"\n\nfrom __future__ import annotations\n\nimport itertools\nimport json\n\nimpor"
  },
  {
    "path": "intake/readers/convert.py",
    "chars": 27987,
    "preview": "\"\"\"Convert between python representations of data\n\nBy convention, functions here do not change the data, just how it is "
  },
  {
    "path": "intake/readers/datatypes.py",
    "chars": 26421,
    "preview": "\"\"\"Enumerates all the sorts of data that Intake knows about\"\"\"\n\nfrom __future__ import annotations\n\nimport re\nfrom itert"
  },
  {
    "path": "intake/readers/entry.py",
    "chars": 25801,
    "preview": "\"\"\"Description of the ways to load a data set\n\nThese are the definitions as they would appear in a Catalog: they may hav"
  },
  {
    "path": "intake/readers/examples.py",
    "chars": 2600,
    "preview": "\"\"\"This module can contain examples of complex Intake use we wish to refer to\"\"\"\nimport operator\n\n\ndef ms_building_parqu"
  },
  {
    "path": "intake/readers/importlist.py",
    "chars": 1957,
    "preview": "\"\"\"Imports made my intake when it itself is imported\n\nSince \"plugins\" are just subclasses of things like intake.readers."
  },
  {
    "path": "intake/readers/metadata.py",
    "chars": 1375,
    "preview": "\"\"\"Some types and meanings of fields that can be expected in metadata dictionaries\n\nMetadata should be JSON-serializable"
  },
  {
    "path": "intake/readers/mixins.py",
    "chars": 6643,
    "preview": "\"\"\"Helpers for creating pipelines\"\"\"\n\nfrom __future__ import annotations\n\nimport re\nfrom itertools import chain\n\nfrom in"
  },
  {
    "path": "intake/readers/namespaces.py",
    "chars": 2705,
    "preview": "\"\"\"Add module accessors to pipelines, providing functions appropriate for its output\n\nThe code here allow something like"
  },
  {
    "path": "intake/readers/output.py",
    "chars": 7157,
    "preview": "\"\"\"Serialise and output data into persistent formats\n\nThis is how to \"export\" data from Intake.\n\nBy convention, function"
  },
  {
    "path": "intake/readers/readers.py",
    "chars": 57358,
    "preview": "\"\"\"Classes for reading data into a python objects\"\"\"\n\nfrom __future__ import annotations\n\nimport inspect\nimport itertool"
  },
  {
    "path": "intake/readers/search.py",
    "chars": 4099,
    "preview": "\"\"\"Find datasets meeting some complex criteria\"\"\"\n\nfrom __future__ import annotations\n\nfrom intake.readers.entry import "
  },
  {
    "path": "intake/readers/tests/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "intake/readers/tests/cats/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "intake/readers/tests/cats/stac_data/1.0.0/catalog/catalog.json",
    "chars": 347,
    "preview": "{\n  \"type\": \"Catalog\",\n  \"id\": \"test\",\n  \"stac_version\": \"1.0.0\",\n  \"description\": \"test catalog\",\n  \"links\": [\n    {\n  "
  },
  {
    "path": "intake/readers/tests/cats/stac_data/1.0.0/catalog/child-catalog.json",
    "chars": 242,
    "preview": "{\n  \"type\": \"Catalog\",\n  \"id\": \"test\",\n  \"stac_version\": \"1.0.0\",\n  \"description\": \"child catalog\",\n  \"links\": [\n    {\n "
  },
  {
    "path": "intake/readers/tests/cats/stac_data/1.0.0/collection/collection.json",
    "chars": 1765,
    "preview": "{\n  \"id\": \"simple-collection\",\n  \"type\": \"Collection\",\n  \"stac_extensions\": [\n    \"https://stac-extensions.github.io/eo/"
  },
  {
    "path": "intake/readers/tests/cats/stac_data/1.0.0/collection/simple-item.json",
    "chars": 2733,
    "preview": "{\n  \"stac_version\": \"1.0.0\",\n  \"stac_extensions\": [\n    \"https://stac-extensions.github.io/projection/v1.0.0/schema.json"
  },
  {
    "path": "intake/readers/tests/cats/stac_data/1.0.0/collection/zarr-collection.json",
    "chars": 14074,
    "preview": "{\n  \"type\": \"Collection\",\n  \"id\": \"daymet-daily-hi\",\n  \"stac_version\": \"1.0.0\",\n  \"description\": \"{{ collection.descript"
  },
  {
    "path": "intake/readers/tests/cats/stac_data/1.0.0/item/zarr-item.json",
    "chars": 14375,
    "preview": "{\n  \"type\": \"Feature\",\n  \"stac_version\": \"1.0.0\",\n  \"id\": \"daymet-daily-hi\",\n  \"properties\": {\n    \"cube:dimensions\": {\n"
  },
  {
    "path": "intake/readers/tests/cats/stac_data/1.0.0/itemcollection/example-search.json",
    "chars": 14456,
    "preview": "{\n  \"id\": \"mysearchresults\",\n  \"stac_version\": \"1.0.0-beta.2\",\n  \"stac_extensions\": [\"single-file-stac\"],\n  \"description"
  },
  {
    "path": "intake/readers/tests/cats/stac_data/1.0.0beta2/earthsearch/readme.md",
    "chars": 788,
    "preview": "Generated with:\n\n```python\nimport satsearch\nimport pystac\nimport json\n\nbbox = [35.48, -3.24, 35.58, -3.14]\ndates = '2020"
  },
  {
    "path": "intake/readers/tests/cats/stac_data/1.0.0beta2/earthsearch/single-file-stac.json",
    "chars": 185474,
    "preview": "{\n  \"id\": \"STAC\",\n  \"description\": \"Single file STAC\",\n  \"stac_version\": \"1.0.0-beta.2\",\n  \"stac_extensions\": [\"single-f"
  },
  {
    "path": "intake/readers/tests/cats/test_sql.py",
    "chars": 3570,
    "preview": "import os\n\nimport pandas as pd\nimport pytest\n\nfrom intake.readers import catalogs, datatypes, readers\n\npytest.importorsk"
  },
  {
    "path": "intake/readers/tests/cats/test_stac.py",
    "chars": 911,
    "preview": "import os\n\nimport pytest\n\nimport intake.readers.datatypes\n\nhere = os.path.dirname(os.path.abspath(__file__))\ncat_url = o"
  },
  {
    "path": "intake/readers/tests/cats/test_thredds.py",
    "chars": 673,
    "preview": "import pytest\n\nimport intake.readers\n\npytest.importorskip(\"siphon\")\npytest.importorskip(\"xarray\")\npytest.importorskip(\"h"
  },
  {
    "path": "intake/readers/tests/cats/test_tiled.py",
    "chars": 1459,
    "preview": "import shlex\nimport subprocess\nimport time\n\nimport pytest\n\nimport intake.readers.datatypes\n\ntiled = pytest.importorskip("
  },
  {
    "path": "intake/readers/tests/test_basic.py",
    "chars": 1448,
    "preview": "import os\n\nfrom intake.readers import datatypes, readers, entry\n\nhere = os.path.dirname(__file__)\ntestdir = os.path.absp"
  },
  {
    "path": "intake/readers/tests/test_consistency.py",
    "chars": 1735,
    "preview": "import pytest\nimport intake\nfrom intake.readers.utils import subclasses\nfrom intake.readers.readers import FileReader\nfr"
  },
  {
    "path": "intake/readers/tests/test_dict.py",
    "chars": 544,
    "preview": "import intake.readers\nfrom intake.readers import entry\n\n\ndef test_yaml_roundtrip():\n    cat = entry.Catalog()\n    cat[\"o"
  },
  {
    "path": "intake/readers/tests/test_errors.py",
    "chars": 345,
    "preview": "import pytest\n\nimport intake\n\n\ndef test_func_ser():\n    class A:\n        def get(self):\n            def inner():\n       "
  },
  {
    "path": "intake/readers/tests/test_reader.py",
    "chars": 4398,
    "preview": "import tempfile\n\nimport pytest\n\nimport intake\n\n\ndef test_reader_from_call():\n    import pandas as pd\n\n    df = pd.DataFr"
  },
  {
    "path": "intake/readers/tests/test_search.py",
    "chars": 1188,
    "preview": "from intake.readers.entry import Catalog, ReaderDescription\nfrom intake.readers.readers import BaseReader\nfrom intake.re"
  },
  {
    "path": "intake/readers/tests/test_up.py",
    "chars": 1096,
    "preview": "import pytest\n\n\ndef test_basic():\n    from intake.readers import user_parameters as up\n\n    p = up.SimpleUserParameter(d"
  },
  {
    "path": "intake/readers/tests/test_utils.py",
    "chars": 673,
    "preview": "import pytest\n\nfrom intake.readers.utils import LazyDict, PartlyLazyDict\n\n\nclass OnlyOkeKey(LazyDict):\n    def __getitem"
  },
  {
    "path": "intake/readers/tests/test_workflows.py",
    "chars": 5697,
    "preview": "import os\n\nimport fsspec\nimport pytest\n\nimport intake.readers\nfrom intake.readers import convert, readers, utils, entry\n"
  },
  {
    "path": "intake/readers/transform.py",
    "chars": 4744,
    "preview": "\"\"\"Manipulate data: functions that change the data but not the container type\n\"\"\"\nfrom __future__ import annotations\n\nim"
  },
  {
    "path": "intake/readers/user_parameters.py",
    "chars": 9946,
    "preview": "\"\"\"\nParametrization of data/reader entries, as they appear in Catalogs\n\nParameters can be used to template values across"
  },
  {
    "path": "intake/readers/utils.py",
    "chars": 15888,
    "preview": "from __future__ import annotations\n\nimport importlib.metadata\nimport numbers\nimport re\nimport typing\nfrom functools impo"
  },
  {
    "path": "intake/source/__init__.py",
    "chars": 2506,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/source/base.py",
    "chars": 8275,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/source/csv.py",
    "chars": 1254,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/source/derived.py",
    "chars": 16318,
    "preview": "from copy import deepcopy\nfrom functools import lru_cache, partial\nfrom textwrap import dedent\n\nfrom .. import open_cata"
  },
  {
    "path": "intake/source/discovery.py",
    "chars": 6429,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/source/jsonfiles.py",
    "chars": 5489,
    "preview": "import contextlib\nimport json\nfrom itertools import islice\n\nfrom intake.source.base import DataSource\n\n\nclass JSONFileSo"
  },
  {
    "path": "intake/source/npy.py",
    "chars": 1551,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/source/tests/__init__.py",
    "chars": 328,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/source/tests/alias.yaml",
    "chars": 401,
    "preview": "sources:\n  csvs:\n    driver: textfiles\n    args:\n      urlpath: '{{ CATALOG_DIR }}/*.csv'\n  yamls:\n    driver: textfiles"
  },
  {
    "path": "intake/source/tests/cached.yaml",
    "chars": 1077,
    "preview": "sources:\n  calvert:\n    driver: csv\n    args:\n      urlpath: '{{ CATALOG_DIR }}/calvert_uk.zip'\n    cache:\n      - type:"
  },
  {
    "path": "intake/source/tests/data.zarr/.zarray",
    "chars": 312,
    "preview": "{\n    \"chunks\": [\n        10\n    ],\n    \"compressor\": {\n        \"blocksize\": 0,\n        \"clevel\": 5,\n        \"cname\": \"l"
  },
  {
    "path": "intake/source/tests/der.yaml",
    "chars": 224,
    "preview": "sources:\n  base:\n    driver: csv\n    args:\n      urlpath: \"{{CATALOG_DIR}}/sample1.csv\"\n  cols:\n    driver: intake.sourc"
  },
  {
    "path": "intake/source/tests/footer_csvs/sample_fewfooters.csv",
    "chars": 128,
    "preview": "name,score,rank\nAlice,100.5,1\nBob,50.3,2\nCharlie,25,3\nEve,25,3\n1. This is footer #1\n2. This is another footer row, with "
  },
  {
    "path": "intake/source/tests/footer_csvs/sample_manyfooters.csv",
    "chars": 271,
    "preview": "name,score,rank\nAlice,100.5,1\nBob,50.3,2\nCharlie,25,3\nEve,25,3\n1. This is footer #1\n2. This is another footer row, with "
  },
  {
    "path": "intake/source/tests/footer_csvs/sample_nofooters.csv",
    "chars": 63,
    "preview": "name,score,rank\nAlice,100.5,1\nBob,50.3,2\nCharlie,25,3\nEve,25,3\n"
  },
  {
    "path": "intake/source/tests/pipeline.yaml",
    "chars": 4114,
    "preview": "sources:\n  df:\n    driver: csv\n    args:\n      urlpath: '{{ CATALOG_DIR }}/sample1.csv'\n  df1:\n    driver: csv\n    args:"
  },
  {
    "path": "intake/source/tests/plugin_searchpath/collision_foo/__init__.py",
    "chars": 496,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/source/tests/plugin_searchpath/collision_foo2/__init__.py",
    "chars": 496,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/source/tests/plugin_searchpath/driver_with_entrypoints/__init__.py",
    "chars": 30,
    "preview": "class SomeTestDriver:\n    ...\n"
  },
  {
    "path": "intake/source/tests/plugin_searchpath/driver_with_entrypoints-0.1.dist-info/entry_points.txt",
    "chars": 75,
    "preview": "[intake.drivers]\nsome_test_driver = driver_with_entrypoints:SomeTestDriver\n"
  },
  {
    "path": "intake/source/tests/plugin_searchpath/intake_foo/__init__.py",
    "chars": 544,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/source/tests/plugin_searchpath/not_intake_foo/__init__.py",
    "chars": 549,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2012 - 2018, Anaconda, I"
  },
  {
    "path": "intake/source/tests/sample1.csv",
    "chars": 63,
    "preview": "name,score,rank\nAlice,100.5,1\nBob,50.3,2\nCharlie,25,3\nEve,25,3\n"
  },
  {
    "path": "intake/source/tests/sample2_1.csv",
    "chars": 67,
    "preview": "name,score,rank\nAlice1,100.5,1\nBob1,50.3,2\nCharlie1,25,3\nEve1,25,3\n"
  },
  {
    "path": "intake/source/tests/sample2_2.csv",
    "chars": 67,
    "preview": "name,score,rank\nAlice2,100.5,1\nBob2,50.3,2\nCharlie2,25,3\nEve2,25,3\n"
  }
]

// ... and 37 more files (download for full content)

About this extraction

This page contains the full source code of the ContinuumIO/intake GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 237 files (1.1 MB), approximately 307.3k tokens, and a symbol index with 1445 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
