Full Code of blue-yonder/tsfresh for AI

Repository: blue-yonder/tsfresh
Branch: main
Commit: 69e50a565101
Files: 137
Total size: 8.4 MB

Directory structure:
gitextract_1ae_uw_y/

├── .coveragerc
├── .github/
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug-report.md
│   │   └── config.yml
│   └── workflows/
│       ├── benchmark_default_branch.yml
│       ├── deploy.yml
│       ├── stylecheck.yml
│       ├── test.yml
│       └── test_all.yml
├── .gitignore
├── .pre-commit-config.yaml
├── .readthedocs.yml
├── AUTHORS.rst
├── CHANGES.rst
├── Dockerfile
├── Dockerfile.testing
├── LICENSE.txt
├── Makefile
├── README.md
├── binder/
│   └── requirements.txt
├── docs/
│   ├── Makefile
│   ├── _static/
│   │   ├── .gitignore
│   │   └── theme_override.css
│   ├── _templates/
│   │   └── module_functions_template.rst
│   ├── api/
│   │   ├── modules.rst
│   │   ├── tsfresh.convenience.rst
│   │   ├── tsfresh.examples.rst
│   │   ├── tsfresh.feature_extraction.rst
│   │   ├── tsfresh.feature_selection.rst
│   │   ├── tsfresh.rst
│   │   ├── tsfresh.scripts.rst
│   │   ├── tsfresh.transformers.rst
│   │   └── tsfresh.utilities.rst
│   ├── authors.rst
│   ├── changes.rst
│   ├── conf.py
│   ├── images/
│   │   └── rolling_mechanism_drawio_template.xml
│   ├── index.rst
│   ├── license.rst
│   └── text/
│       ├── data_formats.rst
│       ├── faq.rst
│       ├── feature_calculation.rst
│       ├── feature_extraction_settings.rst
│       ├── feature_filtering.rst
│       ├── forecasting.rst
│       ├── how_to_add_custom_feature.rst
│       ├── how_to_contribute.rst
│       ├── introduction.rst
│       ├── large_data.rst
│       ├── list_of_features.rst
│       ├── quick_start.rst
│       ├── sklearn_transformers.rst
│       └── tsfresh_on_a_cluster.rst
├── notebooks/
│   ├── 01 Feature Extraction and Selection.ipynb
│   ├── 02 sklearn Pipeline.ipynb
│   ├── 03 Feature Extraction Settings.ipynb
│   ├── 04 Multiclass Selection Example.ipynb
│   ├── 05 Timeseries Forecasting.ipynb
│   ├── advanced/
│   │   ├── 05 Timeseries Forecasting (multiple ids).ipynb
│   │   ├── compare-runtimes-of-feature-calculators.ipynb
│   │   ├── feature_extraction_with_datetime_index.ipynb
│   │   ├── friedrich_coefficients.ipynb
│   │   ├── inspect_dft_features.ipynb
│   │   ├── perform-PCA-on-extracted-features.ipynb
│   │   └── visualize-benjamini-yekutieli-procedure.ipynb
│   └── pipeline.pkl
├── setup.cfg
├── setup.py
├── tests/
│   ├── __init__.py
│   ├── benchmark.py
│   ├── fixtures.py
│   ├── integrations/
│   │   ├── __init__.py
│   │   ├── examples/
│   │   │   ├── __init__.py
│   │   │   ├── test_driftbif_simulation.py
│   │   │   ├── test_har_dataset.py
│   │   │   └── test_robot_execution_failures.py
│   │   ├── test_bindings.py
│   │   ├── test_feature_extraction.py
│   │   ├── test_full_pipeline.py
│   │   ├── test_notebooks.py
│   │   └── test_relevant_feature_extraction.py
│   └── units/
│       ├── __init__.py
│       ├── feature_extraction/
│       │   ├── __init__.py
│       │   ├── test_data.py
│       │   ├── test_extraction.py
│       │   ├── test_feature_calculations.py
│       │   └── test_settings.py
│       ├── feature_selection/
│       │   ├── __init__.py
│       │   ├── test_checks.py
│       │   ├── test_fdr_control.py
│       │   ├── test_feature_significance.py
│       │   ├── test_relevance.py
│       │   ├── test_selection.py
│       │   └── test_significance_tests.py
│       ├── scripts/
│       │   ├── __init__.py
│       │   └── test_run_tsfresh.py
│       ├── transformers/
│       │   ├── __init__.py
│       │   ├── test_feature_augmenter.py
│       │   ├── test_feature_selector.py
│       │   ├── test_per_column_imputer.py
│       │   └── test_relevant_feature_augmenter.py
│       └── utilities/
│           ├── __init__.py
│           ├── test_dataframe_functions.py
│           ├── test_distribution.py
│           └── test_string_manipilations.py
└── tsfresh/
    ├── __init__.py
    ├── convenience/
    │   ├── __init__.py
    │   ├── bindings.py
    │   └── relevant_extraction.py
    ├── defaults.py
    ├── examples/
    │   ├── __init__.py
    │   ├── driftbif_simulation.py
    │   ├── har_dataset.py
    │   └── robot_execution_failures.py
    ├── feature_extraction/
    │   ├── __init__.py
    │   ├── data.py
    │   ├── extraction.py
    │   ├── feature_calculators.py
    │   └── settings.py
    ├── feature_selection/
    │   ├── __init__.py
    │   ├── relevance.py
    │   ├── selection.py
    │   └── significance_tests.py
    ├── scripts/
    │   ├── __init__.py
    │   ├── data.txt
    │   ├── measure_execution_time.py
    │   ├── run_tsfresh.py
    │   └── test_timing.py
    ├── transformers/
    │   ├── __init__.py
    │   ├── feature_augmenter.py
    │   ├── feature_selector.py
    │   ├── per_column_imputer.py
    │   └── relevant_feature_augmenter.py
    └── utilities/
        ├── __init__.py
        ├── dataframe_functions.py
        ├── distribution.py
        ├── profiling.py
        └── string_manipulation.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .coveragerc
================================================
# .coveragerc to control coverage.py
[run]
relative_files = True
branch = True
source = ./tsfresh
omit = tsfresh/utilities/profiling.py
       tsfresh/__init__.py
       tsfresh/feature_extraction/__init__.py
       tsfresh/examples/__init__.py
       tsfresh/utilities/__init__.py
       tsfresh/transformers/__init__.py
       tsfresh/convenience/__init__.py
       tsfresh/examples/har_dataset.py
       tsfresh/examples/robot_execution_failures.py
       tsfresh/examples/driftbif_simulation.py
       tsfresh/examples/test_tsfresh_baseline_dataset.py
       tsfresh/scripts/test_timing.py
       tsfresh/scripts/measure_execution_time.py
       /opt/hostedtoolcache/Python/*

[report]
# Regexes for lines to exclude from consideration
exclude_lines =
    # Have to re-enable the standard pragma
    pragma: no cover

    # Don't complain about missing debug-only code:
    def __repr__
    if self\.debug

    # Don't complain if tests don't hit defensive assertion code:
    raise AssertionError
    raise NotImplementedError

    # Don't complain if non-runnable code isn't run:
    if 0:
    if __name__ == .__main__.:


================================================
FILE: .github/ISSUE_TEMPLATE/bug-report.md
================================================
---
name: Bug Report
about: Report a bug
labels:
  - bug
---

<!--
Thank you very much for filing a bug report!
We, the maintainers, are happy to help you. When opening an issue, please provide the following information to us.

If your issue is more of a question about how to use tsfresh for your use case, please have a look at
the Q&A: https://github.com/blue-yonder/tsfresh/discussions/categories/q-a
-->

**The problem**:

<!-- Please briefly describe your problem.

We recommend including a self-contained, copy-pastable example that reproduces the issue if possible.
If you need data to showcase your problem, please include a very small test dataset.

If you need some help on how to create good bug reports, have a look at these resources:

- Craft Minimal Bug Reports http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports
- Minimal Complete Verifiable Examples https://stackoverflow.com/help/mcve

Please format your error messages and code examples properly
using the formatting options, and check the "Preview" before sending your
bug report.

That makes diagnosing your issue much easier.
-->

**Anything else we need to know?**:

**Environment**:

- Python version:
- Operating System:
- tsfresh version:
- Install method (conda, pip, source):


================================================
FILE: .github/ISSUE_TEMPLATE/config.yml
================================================
blank_issues_enabled: false
contact_links:
  - name: "Q&A and general Questions"
    url: https://github.com/blue-yonder/tsfresh/discussions/categories/q-a
    about: "If you have questions on how to use `tsfresh` for your use case, please ask on GitHub Discussions."
  - name: "Discussion and feature requests"
    url: https://github.com/blue-yonder/tsfresh/discussions
    about: "For anything else, please use the general discussion forums."


================================================
FILE: .github/workflows/benchmark_default_branch.yml
================================================
# Store benchmark results as an artifact
name: Benchmark the default branch
on:
  # Only run on the default branch
  push:
    branches:
      - main

jobs:
  benchmark:
    name: Run pytest-benchmark
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 pip install ".[testing]"
      - name: Run benchmark
        run: |
          cd tests
          pytest benchmark.py --benchmark-min-rounds=4 --benchmark-only -n 0 --no-cov --benchmark-json output.json
      - name: Upload the file
        uses: actions/upload-artifact@v4
        with:
          name: benchmark_results
          path: tests/output.json


================================================
FILE: .github/workflows/deploy.yml
================================================
name: Upload Python Package

on:
  release:
    types: [created]

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.8"
          cache: 'pip' # caching pip dependencies
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install setuptools wheel twine
      - name: Build and publish
        env:
          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
        run: |
          python setup.py sdist bdist_wheel
          twine upload dist/*


================================================
FILE: .github/workflows/stylecheck.yml
================================================
---
name: Python Style Check
on: [pull_request]

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
      - uses: pre-commit/action@v3.0.1


================================================
FILE: .github/workflows/test.yml
================================================
---
name: Test
on:
  pull_request:
  push:
    branches:
      - main

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.9", "3.10", "3.11", "3.12"]
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          cache: 'pip' # caching pip dependencies

      - name: Install dependencies
        run: |
          # Do all the installations
          python -m pip install --upgrade pip wheel setuptools

          SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 pip install ".[testing]"

          # Print out the pip versions for debugging
          pip list

      - name: Test with pytest
        run: |
          pytest --junitxml=junit/test-results-${{ matrix.python-version }}.xml --cov-report=xml tests

      - name: Upload pytest test results
        uses: actions/upload-artifact@v4
        with:
          name: pytest-results-${{ matrix.python-version }}
          path: junit/test-results-${{ matrix.python-version }}.xml
        # Use always() to always run this step to publish test results when there are test failures
        if: ${{ always() }}

      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3


================================================
FILE: .github/workflows/test_all.yml
================================================
---
name: Test Default Branch
on:
  push:
    branches:
      - main

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4

    - name: Set up Python 3.9
      uses: actions/setup-python@v5
      with:
        python-version: 3.9
        cache: 'pip' # caching pip dependencies

    - name: Install dependencies
      run: |
        # Do all the installations
        python -m pip install --upgrade pip wheel setuptools
        SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 pip install ".[testing]"

        # Print out the pip versions for debugging
        pip list

    - name: Test with pytest
      run: |
        # Include notebook tests
        rm -rf .coverage.*
        export TEST_NOTEBOOKS=y
        pytest --junitxml=junit/test-results.xml --cov-report=xml tests

    - name: Upload pytest test results
      uses: actions/upload-artifact@v4
      with:
        name: pytest-results
        path: junit/test-results.xml
      # Use always() to always run this step to publish test results when there are test failures
      if: ${{ always() }}


================================================
FILE: .gitignore
================================================
#others
*.lock

# Temporary and binary files
*~
*.py[cod]
*.so
*.cfg
!setup.cfg
*.orig
*.log
*.pot
__pycache__/*
.cache/*
.*.swp

# Project files
.ropeproject
.project
.pydevproject
.settings
.idea
.vscode

# Package files
*.egg
*.eggs/
.installed.cfg
*.egg-info

# Unittest and coverage
htmlcov/*
.coverage
.coverage.*
.tox
junit.xml
coverage.xml

# Build and docs folder/files
build/*
dist/*
sdist/*
docs/_build/*
cover/*
MANIFEST
docs/text/_generated/

# Virtual environment
venv*
activate

# IPython notebooks
.ipynb_checkpoints

# examples data directory
tsfresh/examples/data/
tsfresh/notebooks/data/
# ds_store
*.DS_Store

# dask
dask-worker-space
dask-worker-space/

# python version files
.python_version

# tox log files
.tox


================================================
FILE: .pre-commit-config.yaml
================================================
---
repos:
  - repo: https://github.com/psf/black
    rev: 22.12.0
    hooks:
      - id: black
        language_version: python3
        exclude: "^docs/.*$"
  - repo: https://github.com/pycqa/isort
    rev: 5.12.0
    hooks:
      - id: isort
        args:
          - "--profile"
          - "black"
        exclude: "^docs/.*$"
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files


================================================
FILE: .readthedocs.yml
================================================
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the version of Python and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.10"

# Build documentation in the docs/ directory with Sphinx
sphinx:
  configuration: docs/conf.py

# Optionally declare the Python requirements required to build your docs
python:
  install:
   - method: pip
     path: .
     extra_requirements:
        - docs


================================================
FILE: AUTHORS.rst
================================================


Authors
==========


Core Development Team
---------------------

- Maximilian Christ (`maximilianchrist.com <http://maximilianchrist.com>`_, `max.christ@me.com <max.christ@me.com>`_)
- Nils Braun  (`nilslennartbraun@gmail.com <nilslennartbraun@gmail.com>`_)
- Julius Neuffer (`julius.neuffer@blue-yonder.com <julius.neuffer@blue-yonder.com>`_)
- Andreas W. Kempa-Liehr (`a.kempa-liehr@auckland.ac.nz <https://unidirectory.auckland.ac.nz/profile/akem134>`_)

Contributions
-------------

- Alan Larson
- Alex Broekhof
- Alex F
- Alex Kennedy
- Alex Loosley
- Andrew Van Benschoten
- Ben Auffarth
- Brian Sang
- Brunno Vanelli
- Chris Chow
- Chris Holdgraf
- Christoph Hösler
- cnzero
- CYHSM
- Dan Mazur
- Daniel Azevedo
- Daniel Gomez
- Daniel Naftalovich
- Delyan
- Denis Barbier
- Derrick
- Dhruv Srikanth
- Dillon Wong
- Dimitris Spathis
- Dominic White
- earthgecko
- Emanuele Fumagalli
- Erlend Aune
- Evans Doe Ocansey
- Ezekiel Kruglick
- Filip Malkowski
- filipj8
- Florian Aspart
- flyingdutchman23
- Fujimoto Seiji
- George Wambold
- Greg Bodeker
- Gregor Koehler
- Gustavo Bertoli
- Haris Sahovic
- HaveF
- He Kaisheng
- Henry Swaffield
- Igor Pechersky
- J. Kleint
- James Myatt
- Jean-Francois Zinque
- Jeroen Van Der Donckt
- Justin Hong
- Justin White
- kartikey-vyas
- koho
- Lifepillar
- lupupu
- Mal Curtis
- Mario Kahlhofer
- Markus Frey
- Marx
- Matúš Tomlein
- Maximilian Lohmann
- meer1992
- mendel5
- Michele Tonutti
- Moritz Gelb
- Nigel Bosch
- Niklas Haas
- Oli
- Omar Gutiérre
- patrjon
- Paul Fornia
- Paul Voigt
- Quan Gu
- rhysimu
- Ricardo Emanuel Vaz Vargas
- Roman Yurchak
- Roy Wedge
- Sarius2009
- Scott Simmons
- Sean M. Law
- Sergey Shepelev
- Soledad Galli
- Stephan Müller
- supergitacc
- Tan Tao-Lin
- Teo Bucci
- Thibault Blanc
- Thibault de Boissiere
- Timo Klerx
- Vincent Tang
- Will Koehrsen
- Wojciech Indyk
- yairst
- YamaByte
- yitao-yu


================================================
FILE: CHANGES.rst
================================================
=========
Changelog
=========

tsfresh uses `Semantic Versioning <http://semver.org/>`_

Version 0.21.1
==============
- Bugfixes/Typos/Documentation:
    - Remove pkg_resources in preparation for its deprecation. (#1116)

Version 0.21.0
==============
- Breaking Change
    - Drop support for python 3.7 and 3.8 (#1100)
- Bugfixes/Typos/Documentation:
    - Fix incompatibility with scipy versions 1.15 and higher by relying on the
      ``pywavelets`` package for cwt (#1097)
    - Improve code quality of feature extractors (#1103)
    - Improve developer experience with tox, bisect and docker (#1093, #994, #1102)

Version 0.20.3
==============
- Bugfixes/Typos/Documentation:
    * Fixes issue #1073: Updated setup.cfg to require newer scipy version (#1081)
    * extract_relevant_features now passes chunksize to extract_features (#1083)
    * Fix code and tests for numpy >= 2.0 (#1085)
    * Update tsfresh.feature_extraction.feature_calculators.skewness to make it consistent with the design principle of not ignoring nan (#1066)
    * Fix spelling/grammar in pipeline notebook (#1082)
    * Added recommendation to revert thread limitations (#1069)
    * Fix the 01 example notebook to not leak information between train and test set
    * Feature calculator return type documentation (#1070)

Version 0.20.2
==============
- Added Features
    - Make Dask and Distributed optional dependencies (#1061)
    - View and Set N Jobs (#1029)

- Bugfixes/Typos/Documentation:
    - Extra notes on parallelization efficiencies (#1046)
    - Update doc extraction settings for clarity and formatting (#1033)
    - Typos (#1031, #1034, #1049, #1048)

Version 0.20.1
==============

- Added Features
    - Make tsfresh compatible with numpy 1.24 (#1018) and pandas 2.0 (#1028)

- Bugfixes/Typos/Documentation:
    - Use pandas Index.equals in check_if_pandas_series (#963)
    - Updates to package layout, CI/CD and developer setup


Version 0.20.0
==============

- Breaking Change
    - The matrixprofile package becomes an optional dependency

- Bugfixes/Typos/Documentation:
    - Fix feature extraction of Friedrich coefficients for pandas>1.3.5
    - Fix file paths after example notebooks were moved

Version 0.19.0
==============

- Breaking Change
    - Drop Python 3.6 support due to dependency on statsmodels 0.13

- Added Features
    - Improve documentation (#831, #834, #851, #853, #870)
    - Add absolute_maximum and mean_n_absolute_max features (#833)
    - Make settings picklable (#845, #847, #910)
    - Disable multiprocessing for `n_jobs=1` (#852)
    - Add black, isort, and pre-commit (#876)

- Bugfixes/Typos/Documentation:
    - Fix conversion of time-series into sequence for lempel_ziv_complexity (#806)
    - Fix range count config (#827)
    - Reword documentation (#893)
    - Fix statsmodels deprecation issues (#898, #912)
    - Fix typo in requirements (#903)
    - Bump statsmodels to v0.13 (#
    - Updated references


Version 0.18.0
==============

- Added Features
    - Allow arbitrary rolling sizes (#766)
    - Allow for multiclass significance tests (#762)
    - Add multiclass option to RelevantFeatureAugmenter (#782)
    - Addition of matrix_profile feature (#793)
    - Added new query similarity counter feature (#798)
    - Add root mean square feature (#813)
- Bugfixes/Typos/Documentation:
    - Do not send coverage of notebook tests to codecov (#759)
    - Fix typos in notebook (#757, #780)
    - Fix output format of `make_forecasting_frame` (#758)
    - Fix badges and remove benchmark test
    - Fix BY notebook plot (#760)
    - Ts forecast example improvement (#763)
    - Also suppress warnings in dask (#769)
    - Update relevant_feature_augmenter.py (#779)
    - Fix column names in quick_start.rst (#778)
    - Improve relevance table function documentation (#781)
    - Fixed #789 Typo in "how to add custom feature" (#790)
    - Convert to the correct type on warnings (#799)
    - Fix minor typos in the docs (#802)
    - Add unwanted filetypes to gitignore (#819)
    - Fix build and test failures (#815)
    - Fix imputing docu (#800)
    - Bump the scikit-learn version (#822)

Version 0.17.0
==============

We changed the default branch from "master" to "main".

- Breaking Change
    - Changed constructed id in roll_time_series from string to tuple (#700)
    - Same for add_sub_time_series_index (#720)
- Added Features
    - Implemented the Lempel-Ziv-Complexity and the Fourier Entropy (#688)
    - Prevent #524 by adding an assert for common identifiers (#690)
    - Added permutation entropy (#691)
    - Added a logo :-) (#694)
    - Implemented the benford distribution feature (#689)
    - Reworked the notebooks (#701, #704)
    - Speed up the result pivoting (#705)
    - Add a test for the dask bindings (#719)
    - Refactor input data iteration to need less memory (#707)
    - Added benchmark tests (#710)
    - Make dask a possible input format (#736)
- Bugfixes:
    - Fixed a bug in the selection, that caused all regression tasks with un-ordered index to be wrong (#715)
    - Fixed readthedocs (#695, #696)
    - Fix spark and dask after #705 and for non-id named id columns (#712)
    - Fix in the forecasting notebook (#729)
    - Let tsfresh choose the value column if possible (#722)
    - Move from coveralls github action to codecov (#734)
    - Improve speed of data processing (#735)
    - Fix for newer, more strict pandas versions (#737)
    - Fix documentation for feature calculators (#743)

Version 0.16.0
==============

- Breaking Change
    - Fix the sorting of the parameters in the feature names (#656)
      The feature names consist of a sorted list of all parameters now.
      That used to be true for all non-combiner features, and is now also true for combiner features.
      If you relied on the actual feature name, this is a breaking change.
    - Change the id after the rolling (#668)
      Now, the old id of your data is still kept. Additionally, we improved the way
      dataframes without a time column are rolled and how the new sub-time series
      are named.
      Also, the documentation was improved a lot.
- Added Features
    - Added variation coefficient (#654)
    - Added the datetimeindex explanation from the notebook to the docs (#661)
    - Optimize RelevantFeatureAugmenter to avoid re-extraction (#669)
    - Added a function `add_sub_time_series_index` (#666)
    - Added Dockerfile
    - Speed optimizations and speed testing script (#681)
- Bugfixes
    - Increase the extracted `ar` coefficients to the full parameter range. (#662)
    - Documentation fixes (#663, #664, #665)
    - Rewrote the `sample_entropy` feature calculator (#681)
      It is now faster and (hopefully) more correct.
      But your results will change!
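
The naming rule from the breaking change above (all parameters appear sorted in the feature name) can be illustrated with a small sketch. The helper `feature_name` is hypothetical and not part of tsfresh; the `kind__calculator__key_value` layout and the quoting of string values are assumptions made for this illustration.

```python
def feature_name(kind, calculator, params=None):
    """Build a column name of the form kind__calculator__key_value__... (illustrative only)."""
    parts = [kind, calculator]
    # Sorting by parameter name makes the result independent of dict order.
    for key in sorted(params or {}):
        value = params[key]
        # Assumption: string values are wrapped in double quotes, others use str().
        rendered = f'"{value}"' if isinstance(value, str) else str(value)
        parts.append(f"{key}_{rendered}")
    return "__".join(parts)


# The same parameters given in two different orders.
a = feature_name("value", "agg_linear_trend", {"f_agg": "max", "chunk_len": 5, "attr": "slope"})
b = feature_name("value", "agg_linear_trend", {"attr": "slope", "chunk_len": 5, "f_agg": "max"})
print(a)
print(a == b)
```

Because the keys are sorted before joining, both calls produce the same name. Code that matched feature columns by their exact string before this release may need its expected names updated.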


Version 0.15.1
==============

- Changelog and documentation fixes

Version 0.15.0
==============

- Added Features
    - Add count_above and count_below feature (#632)
    - Add convenience bindings for dask dataframes and pyspark dataframes (#651)
- Bugfixes
    - Fix documentation build and feature table in sphinx (#637, #631, #627)
    - Add scripts to API documentation
    - Skip dask test for older python versions (#649)
    - Add missing distributor keyword (#648)
    - Fix tuple input for cwt (#645)

Version 0.14.1
==============

- Fix travis deployment

Version 0.14.0
==============

- Breaking Change
    - Replace Benjamini-Hochberg implementation with statsmodels implementation (#570)
- Refactoring and Documentation
    - travis.yml (#605)
    - gitignore (#608)
    - Fix docstring of c3 (#590)
    - Feature/pep8 (#607)
- Added Features
    - Improve test coverage (#609)
    - Add "autolag" parameter to augmented_dickey_fuller() (#612)
- Bugfixes
    - Feature/pep8 (#607)
    - Fix filtering on warnings with multiprocessing on Windows (#610)
    - Remove outdated logging config (#621)
    - Replace Benjamini-Hochberg implementation with statsmodels implementation (#570)
    - Fix the kernel and the naming of a notebook (#626)
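
For readers unfamiliar with the Benjamini-Hochberg procedure mentioned above, the following is a minimal pure-Python sketch of the step-up rule. It is neither the tsfresh code nor the statsmodels implementation; the function name `benjamini_hochberg` and its signature are chosen for this illustration only.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected (illustrative only)."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k / m * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    # Reject the hypotheses belonging to the largest_k smallest p-values.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# The largest p-value passes its threshold, so every smaller one is rejected too,
# even though 0.03 and 0.04 individually exceed their own thresholds.
decisions = benjamini_hochberg([0.01, 0.04, 0.03, 0.045], alpha=0.05)
print(decisions)
```

The step-up behaviour is the point of the example: the decision depends on the largest rank that passes, not on each p-value in isolation.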


Version 0.13.0
==============

- Drop python 2.7 support (#568)
- Fixed bugs
    - Fix cache in friedrich_coefficients and agg_linear_trend (#593)
    - Added a check for wrong column names and a test for this check (#586)
    - Make sure to not install the tests folder (#599)
    - Make sure there is at least a single column which we can use for data (#589)
    - Avoid division by zero in energy_ratio_by_chunks (#588)
    - Ensure that get_moment() uses float computations (#584)
    - Preserve index when column_value and column_kind not provided (#576)
    - Add @set_property("input", "pd.Series") when needed (#582)
    - Fix off-by-one error in longest strike features (fixes #577) (#578)
    - Add `set_property` import (#572)
    - Fix typo (#571)
    - Fix indexing of melted normalized input (#563)
    - Fix travis (#569)
- Remove warnings (#583)
- Update to newest python version (#594)
- Optimizations
    - Early return from change_quantiles if ql >= qh (#591)
    - Optimize mean_second_derivative_central (#587)
    - Improve performance with Numpy's sum function (#567)
    - Optimize mean_change (fixes issue #542) and correct documentation (#574)


Version 0.12.0
==============

- fixed bugs
    - wrong calculation of friedrich coefficients
    - feature selection selected too many features
    - an ignored max_timeshift parameter in roll_time_series
- add deprecation warning for python 2
- added support for index based features
- new feature calculator
    - linear_trend_timewise
- enable the RelevantFeatureAugmenter to be used in cross validated pipelines
- increased scipy dependency to 1.2.0


Version 0.11.2
==============
- change chunking in energy_ratio_by_chunks to use all data points
- fix warning for spkt_welch_density
- adapt default settings for "value_count" and "range_count"
- added
    - maxlag parameter to agg_autocorrelation function
- now, the kind column of the input DataFrame is cast as str, old derived FC_Settings can become invalid
- only set default_fc_parameters to ComprehensiveFCParameters() if kind_to_fc_parameters is also None in `extract_features`
- removed pyscaffold
- use an asymptotic algorithm to derive Kendall's tau


Version 0.11.1
==============
- general performance improvements
- removed hard pinning of dependencies
- fixed bugs
    - the stock price forecasting notebook
    - the multi classification notebook

Version 0.11.0
==============
- new feature calculators:
    - fft_aggregated
    - cid_ce
- renamed mean_second_derivate_central to mean_second_derivative_central
- add warning if no relevant features were found in feature selection
- add columns_to_ignore parameter to from_columns method
- add distribution module, contains support for distributed feature extraction on Dask

Version 0.10.1
==============
- split test suite into unit and integration tests
- fixed the following bugs
    - use name of value column as time series kind
    - prevent the spawning of subprocesses which lead to high memory consumption
    - fix deployment from travis to pypi

Version 0.10.0
==============
- new feature calculators:
    - partial autocorrelation
- added list of calculated features to documentation
- added two ipython notebooks to
    - illustrate PCA on features
    - illustrate the Benjamini Yekutieli procedure
- fixed the following bugs
    - improper quotation of Dickey-Fuller settings

Version 0.9.0
=============
- new feature calculators:
    - ratio_beyond_r_sigma
    - energy_ratio_by_chunks
    - number_crossing_m
    - c3
    - angle & abs for fft coefficients
    - agg_autocorrelation
    - p-Value and usedLag for augmented_dickey_fuller
    - change_quantiles
- changed the calculation of the following features:
    - fft_coefficients
    - autocorrelation
    - time_reversal_asymmetry_statistic
- removed the following feature calculators:
    - large_number_of_peak
    - mean_autocorrelation
    - mean_abs_change_quantiles
- add support for multi classification in the feature selection
- improved description of the rolling mechanism
- added the make_forecasting_frame function for forecasting tasks
- internally ditched the pandas representation of the time series, yielding drastic speed improvements
- replaced feature calculator types from aggregate/aggregate with parameter/apply to simple/combiner
- add test for the ipython notebooks
- added notebook to inspect dft features
- make sure that RelevantFeatureAugmenter always imputes
- fixed the following bugs
    - impute was replacing whole columns by mean
    - fft coefficients were only calculated on a truncated part
    - allow to suppress warnings from impute function
    - added missing lag in time_reversal_asymmetry_statistic

Version 0.8.1
=============
- new features:
    - linear trend
    - agg trend
- new sklearn compatible transformers
    - PerColumnImputer
- fixed bugs
    - make mannwhitneyu method compatible with scipy > v0.18.0
- added caching to travis
- internally, added serial calculation of features

Version 0.8.0
=============
- Breaking API changes:
    - removal of the feature extraction settings object, replaced by keyword arguments and a plain dictionary (fc_parameters)
    - removal of the feature selection settings object, replaced by keyword arguments
- added notebook with examples of new API
- added chapter in docs about the new API
- adjusted old notebooks and documentation to new API

Version 0.7.1
=============

- added a maximum shift parameter to the rolling utility
- added a FAQ entry about how to use tsfresh on windows
- drastically decreased the runtime of the following features
    - cwt_coefficient
    - index_mass_quantile
    - number_peaks
    - large_standard_deviation
    - symmetry_looking
- removed baseline unit tests
- bugfixes:
    - per-sample parallel imputing was done on chunks, which gave non-deterministic results
    - imputing on dtypes other than float32 did not work properly
- several improvements to documentation

Version 0.7.0
=============

- new rolling utility to use tsfresh for time series forecasting tasks
- bugfixes:
    - index_mass_quantile was using global index of time series container
    - an index with same name as id_column was breaking parallelization
    - friedrich_coefficients and max_langevin_fixed_point were occasionally stalling

Version 0.6.0
=============

- progress bar for feature selection
- new feature: estimation of largest fixed point of deterministic dynamics
- new notebook: demonstration of how to use tsfresh in a pipeline with train and test datasets
- remove no logging handler warning
- fixed bug in the RelevantFeatureAugmenter regarding the evaluate_only_added_features parameters

Version 0.5.0
=============

- new example: driftbif simulation
- further improvements of the parallelization
- language improvements in the documentation
- performance improvements for some features
- performance improvements for the impute function
- new feature and feature renaming: sum_of_recurring_values, sum_of_recurring_data_points

Version 0.4.0
=============

- fixed several bugs: checking of UCI dataset, out of index error for mean_abs_change_quantiles
- added a progress bar denoting the progress of the extraction process
- added parallelization per sample
- added unit tests for comparing results of feature extraction to older snapshots
- added "high_comp_cost" attribute
- added ReasonableFeatureExtraction settings only calculating features without "high_comp_cost" attribute

Version 0.3.1
=============

- fixed several bugs: closing multiprocessing pools / index out of range cwt calculator / division by 0 in index_mass_quantile
- now all warnings are disabled by default
- for time series data with a single type, the name of the value column is used as the feature prefix

Version 0.3.0
=============

- fixed bug with parsing of "NUMBER_OF_CPUS" environment variable
- now features are calculated in parallel for each type

Version 0.2.0
=============

- now p-values are calculated in parallel
- fixed bugs for constant features
- allow time series columns to be named 0
- moved uci repository datasets to github mirror
- added feature calculator sample_entropy
- added MinimalFeatureExtraction settings
- fixed bug in calculation of fourier coefficients

Version 0.1.2
=============

- added support for python 3.5.2
- fixed bug with the naming of the features that made the naming of features non-deterministic

Version 0.1.1
=============

- mainly fixes for the read-the-docs documentation, the pypi readme and so on

Version 0.1.0
=============

- Initial version :)


================================================
FILE: Dockerfile
================================================
# Define builder and base image
FROM python:3.8-slim AS base
FROM python:3.8 AS builder

LABEL maintainer="nilslennartbraun@gmail.com"

# Install tsfresh from source into the builder image
ADD . /source
WORKDIR /source
RUN pip3 install --prefix=/install .

# Copy the installed sources to the base image
FROM base
COPY --from=builder /install /usr/local


================================================
FILE: Dockerfile.testing
================================================
# Bakes the python versions which tsfresh targets into a testing env
FROM ubuntu:22.04

SHELL ["/bin/bash", "-c"]

# These are required to build python from source
RUN apt-get update && apt-get install -y \
    python3-pip \
    curl \
    clang \
    git \
    build-essential \
    libssl-dev \
    libreadline-dev \
    zlib1g-dev \
    libbz2-dev \
    libsqlite3-dev \
    llvm \
    libncurses5-dev \
    libgdbm-dev \
    libnss3-dev \
    libffi-dev \
    liblzma-dev \
    libgmp-dev \
    libmpfr-dev \
    && apt-get clean


RUN curl https://pyenv.run | bash

# For interactive use (if any), this is an edge case.
RUN echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc && \
    echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc && \
    echo 'eval "$(pyenv init -)"' >> ~/.bashrc

ENV PYENV_ROOT="/root/.pyenv"
ENV PATH="$PYENV_ROOT/bin:$PATH"
ENV PATH="$PYENV_ROOT/shims:$PATH"

ARG PYTHON_VERSIONS
RUN for version in $PYTHON_VERSIONS; do \
    echo Installing $version; \
    # Band aid for https://github.com/pyenv/pyenv/issues/1738
    # since this also appears to apply to 3.7.X
    if [[ $version =~ ^3\.7\..*$ ]]; then \
      echo Using clang to compile $version; \
      CC=clang pyenv install $version || exit 1; \
    else \
      pyenv install $version || exit 1; \
    fi; \
    done

RUN pyenv global $PYTHON_VERSIONS

RUN pip install tox

WORKDIR /tsfresh

# Requires adding safe.directory so that tsfresh can build when the
# repo is mounted.
# Note cannot do this at build time as no git directory exists
CMD ["/bin/bash", "-c", "git config --global --add safe.directory /tsfresh && tox -r -p auto"]


================================================
FILE: LICENSE.txt
================================================
MIT LICENCE

Copyright (c) 2016 Maximilian Christ, Blue Yonder GmbH

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
documentation files (the "Software"), to deal in the Software without restriction, including without limitation the
rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit
persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the
Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


================================================
FILE: Makefile
================================================
WORKDIR := /tsfresh
TEST_IMAGE := tsfresh-test-image
TEST_DOCKERFILE := Dockerfile.testing
TEST_CONTAINER := tsfresh-test-container
PYTHON_VERSIONS := "3.9 3.10 3.11 3.12"

# Tests `PYTHON_VERSIONS`, provided they are also
# specified in setup.cfg `envlist`
test-all-testenv: build-docker-testenv run-docker-tests clean

build-docker-testenv:
	docker build \
			-f $(TEST_DOCKERFILE) \
			-t $(TEST_IMAGE) \
			--build-arg PYTHON_VERSIONS=$(PYTHON_VERSIONS) \
			.

run-docker-tests:
	docker run --rm \
			--name $(TEST_CONTAINER) \
			-v .:$(WORKDIR) \
			-v build_artifacts:$(WORKDIR)/build \
			-v tox_artifacts:$(WORKDIR)/.tox \
			-v egg_artifacts:$(WORKDIR)/tsfresh.egg-info \
			$(TEST_IMAGE)

clean:
	rm -rf .tox build/ dist/ *.egg-info

bisect:
	@if [ -z "$(GOOD_COMMIT)" ]; then \
		echo "Error: GOOD_COMMIT is required. Usage: make bisect GOOD_COMMIT=<commit_hash>."; \
		echo "Assumes that the current checked-out commit is a known bad commit, and bisects from there."; \
		exit 1; \
	fi
	git bisect start
	git bisect bad
	git bisect good $(GOOD_COMMIT)
	git bisect run pytest
	git bisect reset

.PHONY: build-docker-testenv clean run-docker-tests test-all-testenv bisect


================================================
FILE: README.md
================================================
<div align="center">
  <img width="70%" src="./docs/images/tsfresh_logo.svg">
</div>

-----------------

# tsfresh

[![Documentation Status](https://readthedocs.org/projects/tsfresh/badge/?version=latest)](https://tsfresh.readthedocs.io/en/latest/?badge=latest)
[![Build Status](https://github.com/blue-yonder/tsfresh/workflows/Test%20Default%20Branch/badge.svg)](https://github.com/blue-yonder/tsfresh/actions)
[![codecov](https://codecov.io/gh/blue-yonder/tsfresh/branch/main/graph/badge.svg)](https://codecov.io/gh/blue-yonder/tsfresh)
[![license](https://img.shields.io/github/license/mashape/apistatus.svg)](https://github.com/blue-yonder/tsfresh/blob/main/LICENSE.txt)
[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/blue-yonder/tsfresh/main?filepath=notebooks)
[![Downloads](https://pepy.tech/badge/tsfresh)](https://pepy.tech/project/tsfresh)

This repository contains the *TSFRESH* python package. The abbreviation stands for

*"Time Series Feature extraction based on scalable hypothesis tests"*.

The package provides systematic time-series feature extraction by combining established algorithms from statistics, time-series analysis, signal processing, and nonlinear dynamics with a robust feature selection algorithm. In this context, the term *time-series* is interpreted in the broadest possible sense, such that any type of sampled data or even event sequences can be characterised.

## Spend less time on feature engineering

Data Scientists often spend most of their time either cleaning data or building features.
While we cannot change the first thing, the second can be automated.
*TSFRESH* frees your time spent on building features by extracting them automatically.
Hence, you have more time to study the newest deep learning paper, read hacker news or build better models.


## Automatic extraction of 100s of features

*TSFRESH* automatically extracts 100s of features from time series.
These features describe basic characteristics of the time series, such as the number of peaks or the average or maximal value, as well as more complex ones such as the time reversal symmetry statistic.

![The features extracted from an exemplary time series](docs/images/introduction_ts_exa_features.png)

The set of features can then be used to construct statistical or machine learning models on the time series to be used for example in regression or
classification tasks.
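
To make the idea of a feature concrete, the following minimal sketch computes three of the characteristics mentioned above (mean, maximum and number of peaks) by hand in plain Python. It is an illustration only and not how *TSFRESH* implements its feature calculators; the helper names `number_of_peaks` and `simple_features` are hypothetical.

```python
from statistics import mean


def number_of_peaks(series):
    """Count strict local maxima: points larger than both direct neighbours."""
    return sum(
        1
        for left, mid, right in zip(series, series[1:], series[2:])
        if mid > left and mid > right
    )


def simple_features(series):
    """Map one time series to a flat dictionary of named feature values."""
    return {
        "mean": mean(series),
        "maximum": max(series),
        "number_of_peaks": number_of_peaks(series),
    }


# Toy time series, made up for this illustration.
series = [1, 3, 2, 5, 4, 4, 6, 1]
features = simple_features(series)
print(features)
```

Each time series is reduced to one row of named values, which is the shape a statistical or machine learning model expects as input.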

## Forget irrelevant features

Time series often contain noise, redundancies or irrelevant information.
As a result, most of the extracted features will not be useful for the machine learning task at hand.

To avoid extracting irrelevant features, the *TSFRESH* package has a built-in filtering procedure.
This filtering procedure evaluates the explanatory power and importance of each characteristic for the regression or classification tasks at hand.

It is based on the well-developed theory of hypothesis testing and uses a multiple test procedure.
As a result, the filtering process mathematically controls the percentage of irrelevant extracted features.
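
This section does not name the multiple test procedure. As a sketch of how such a procedure can bound the expected share of irrelevant features among the selected ones (under suitable assumptions on the individual tests), the following example applies the Benjamini-Hochberg procedure to a list of p-values, one per feature. The choice of procedure, the function name `benjamini_hochberg`, the parameter `fdr_level` and the p-values are assumptions made for illustration and are not necessarily what *TSFRESH* does internally.

```python
def benjamini_hochberg(p_values, fdr_level=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    A rejected hypothesis here means "this feature is considered relevant".
    """
    m = len(p_values)
    # Sort the feature indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= k / m * fdr_level.
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * fdr_level:
            largest_rank = rank
    # Reject the hypotheses belonging to the largest_rank smallest p-values.
    relevant = [False] * m
    for index in order[:largest_rank]:
        relevant[index] = True
    return relevant


# Hypothetical p-values, one per extracted feature.
p_values = [0.001, 0.30, 0.012, 0.85, 0.04]
relevant = benjamini_hochberg(p_values, fdr_level=0.05)
print(relevant)
```

Note that the feature with p-value 0.04 is not selected although it is below 0.05: the threshold for each feature depends on its rank among all tested features, which is what distinguishes a multiple test procedure from testing each feature in isolation.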

The *TSFRESH* package is described in the following open access paper:

* Christ, M., Braun, N., Neuffer, J., and Kempa-Liehr A.W. (2018).
   _Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh -- A Python package)._
   Neurocomputing 307, p. 72-77, [doi: 10.1016/j.neucom.2018.03.067](https://doi.org/10.1016/j.neucom.2018.03.067).

The FRESH algorithm is described in the following whitepaper:

* Christ, M., Kempa-Liehr, A.W., and Feindt, M. (2017).
    _Distributed and parallel time series feature extraction for industrial big data applications._
    ArXiv e-print 1610.07717,  [https://arxiv.org/abs/1610.07717](https://arxiv.org/abs/1610.07717).

Systematic time-series feature extraction even works for unsupervised problems:

* Teh, H.Y., Wang, K.I-K., Kempa-Liehr, A.W. (2021).
    _Expect the Unexpected: Unsupervised feature selection for automated sensor anomaly detection._
    IEEE Sensors Journal 15.16, p. 18033-18046, [doi: 10.1109/JSEN.2021.3084970](https://doi.org/10.1109/JSEN.2021.3084970).

* Teh, H.Y., Wang, K.I-K., Kempa-Liehr, A.W. (2025).
    _Feature-based normality models for anomaly detection._
    Sensors 25.4757, p. 1-25, [doi: 10.3390/s25154757](https://doi.org/10.3390/s25154757).

Because tsfresh provides time-series feature extraction essentially for free, you can concentrate on engineering new time series,
such as differences of signals from synchronous measurements, which provide even better time-series features:

* Kempa-Liehr, A.W., Oram, J., Wong, A., Finch, M., Besier, T. (2020).
    _Feature engineering workflow for activity recognition from synchronized inertial measurement units._
    In: Pattern Recognition. ACPR 2019. Ed. by M. Cree et al. Vol. 1180. Communications in Computer and Information Science (CCIS).
    Singapore: Springer, p. 223–231. [doi: 10.1007/978-981-15-3651-9_20](https://doi.org/10.1007/978-981-15-3651-9_20).

* Simmons, S., Jarvis, L., Dempsey, D., Kempa-Liehr, A.W. (2021).
    _Data Mining on Extremely Long Time-Series._
    In: 2021 International Conference on Data Mining Workshops (ICDMW). Ed. by B. Xue et al.
    Los Alamitos: IEEE, p. 1057-1066. [doi: 10.1109/ICDMW53433.2021.00137](https://doi.org/10.1109/ICDMW53433.2021.00137).

Systematic time-series feature engineering makes it possible to work with time-series samples of different lengths, because every time series is projected
into a well-defined feature space. This approach allows the design of robust machine learning algorithms in applications with missing data.

* Kennedy, A., Gemma, N., Rattenbury, N., Kempa-Liehr, A.W. (2021).
    _Modelling the projected separation of microlensing events using systematic time-series feature engineering._
    Astronomy and Computing 35.100460, p. 1–14, [doi: 10.1016/j.ascom.2021.100460](https://doi.org/10.1016/j.ascom.2021.100460)

Is your time-series classification problem imbalanced? There is a good chance that undersampling of time-series feature matrices
might solve your problem:

* Dempsey, D.E., Cronin, S.J., Mei, S., Kempa-Liehr, A.W. (2020).
    _Automatic precursor recognition and real-time forecasting of sudden explosive volcanic eruptions at Whakaari, New Zealand_.
    Nature Communications 11.3562, p. 1-8, [doi: 10.1038/s41467-020-17375-2](https://doi.org/10.1038/s41467-020-17375-2).
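
As a rough illustration of what undersampling a feature matrix means, the sketch below randomly drops rows of the over-represented classes until all classes have as many rows as the rarest one. It is a generic random undersampling in plain Python and not the procedure used in the paper cited above; the function name `undersample`, the `seed` parameter and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def undersample(feature_rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the rarest one."""
    rows_by_label = defaultdict(list)
    for row, label in zip(feature_rows, labels):
        rows_by_label[label].append(row)

    smallest = min(len(rows) for rows in rows_by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    balanced_rows, balanced_labels = [], []
    for label, rows in rows_by_label.items():
        # Draw `smallest` rows without replacement from each class.
        for row in rng.sample(rows, smallest):
            balanced_rows.append(row)
            balanced_labels.append(label)
    return balanced_rows, balanced_labels


# Six made-up feature vectors, skewed towards class 0.
feature_rows = [[0.1, 1.0], [0.2, 1.1], [0.3, 0.9], [0.2, 1.2], [2.5, 3.0], [2.7, 3.1]]
labels = [0, 0, 0, 0, 1, 1]
balanced_rows, balanced_labels = undersample(feature_rows, labels)
print(balanced_labels)
```

Because every time series has already been reduced to one feature row, balancing the classes only requires selecting rows; the raw time series do not need to be touched.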

You are not working with time-series, but with 2D and 3D images? Spatial variation sequences (SVS) are an excellent application for tsfresh:

* Jeune, H., Pechan, N., Reitsma, S., Kempa-Liehr, A.W. (2021).
    _Spatial Variation Sequences for Remote Sensing Applications with Small Sample Sizes._
    In: 2024 Image and Video Technology. 11th Pacific-Rim Symposium, Ed. by W.Q. Yan et al.,
    Lecture Notes in Computer Science (14403). Springer Nature: Singapore, p. 153-166. [doi: 10.1007/978-981-97-0376-0_12](https://doi.org/10.1007/978-981-97-0376-0_12).

* Koptev, I., Tian, J., Peel, E., Parker, R., Walker, C., Kempa-Liehr, A.W. (2025).
    _Interpretable Dimensionality Reduction in 3D Image Recognition with Small Sample Sizes_.
    Journal of Nondestructive Evaluation 44.44, p. 1-12, [doi: 10.1007/s10921-025-01183-z](https://doi.org/10.1007/s10921-025-01183-z).

Natural language processing of written texts is an example of applying systematic time-series feature engineering to event sequences,
which is described in the following open access paper:

* Tang, Y., Blincoe, K., Kempa-Liehr, A.W. (2020).
    _Enriching Feature Engineering for Short Text Samples by Language Time Series Analysis._
    EPJ Data Science 9.26, p. 1–59. [doi: 10.1140/epjds/s13688-020-00244-9](https://doi.org/10.1140/epjds/s13688-020-00244-9)



## Advantages of tsfresh

*TSFRESH* has several selling points, for example

1. it is field tested
2. it is unit tested
3. the filtering process is statistically/mathematically correct
4. it has comprehensive documentation
5. it is compatible with sklearn, pandas and numpy
6. it allows anyone to easily add their favorite features
7. it runs both on your local machine and on a cluster

## Next steps

If you are interested in the technical workings, go to see our comprehensive Read-The-Docs documentation at [http://tsfresh.readthedocs.io](http://tsfresh.readthedocs.io).

The algorithm, especially the filtering part, is also described in the paper mentioned above.

We appreciate any contributions. If you are interested in helping us make *TSFRESH* the biggest archive of feature extraction methods in Python, just head over to our [How-To-Contribute](http://tsfresh.readthedocs.io/en/latest/text/how_to_contribute.html) instructions.

If you want to try out `tsfresh` quickly or if you want to integrate it into your workflow, we also have a docker image available:

    docker pull nbraun/tsfresh


## Backwards compatibility

If you need to reproduce or update time-series features that were computed with the `matrixprofile` feature calculators, you need to create a Python 3.8 environment:

    conda create --name tsfresh__py_3.8 python=3.8
    conda activate tsfresh__py_3.8
    pip install tsfresh[matrixprofile]

## Acknowledgements

The research and development of *TSFRESH* was funded in part by the German Federal Ministry of Education and Research under grant number 01IS14004 (project iPRODICT).


================================================
FILE: binder/requirements.txt
================================================
-e .[testing]


================================================
FILE: docs/Makefile
================================================
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS    ?=
SPHINXBUILD   ?= sphinx-build
SOURCEDIR     = .
BUILDDIR      = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)


================================================
FILE: docs/_static/.gitignore
================================================
# Empty directory


================================================
FILE: docs/_static/theme_override.css
================================================
/* override table width restrictions. Taken from https://rackerlabs.github.io/docs-rackspace/tools/rtd-tables.html */
@media screen and (min-width: 767px) {

   .wy-table-responsive table td {
      /* !important prevents the common CSS stylesheets from overriding
         this as on RTD they are loaded after this stylesheet */
      white-space: normal !important;
   }

   .wy-table-responsive {
      overflow: visible !important;
   }
}


================================================
FILE: docs/_templates/module_functions_template.rst
================================================
.. currentmodule:: {{ fullname }}

{% block functions %}

.. autosummary::
{% for item in functions %}
   {{ item }}
{%- endfor %}

{% endblock %}


================================================
FILE: docs/api/modules.rst
================================================
tsfresh
=======

.. toctree::
   :maxdepth: 4

   tsfresh


================================================
FILE: docs/api/tsfresh.convenience.rst
================================================
tsfresh.convenience package
===========================

Submodules
----------

tsfresh.convenience.bindings module
-----------------------------------

.. automodule:: tsfresh.convenience.bindings
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.convenience.relevant\_extraction module
-----------------------------------------------

.. automodule:: tsfresh.convenience.relevant_extraction
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: tsfresh.convenience
   :members:
   :undoc-members:
   :show-inheritance:


================================================
FILE: docs/api/tsfresh.examples.rst
================================================
tsfresh.examples package
========================

Submodules
----------

tsfresh.examples.driftbif\_simulation module
--------------------------------------------

.. automodule:: tsfresh.examples.driftbif_simulation
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.examples.har\_dataset module
------------------------------------

.. automodule:: tsfresh.examples.har_dataset
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.examples.robot\_execution\_failures module
--------------------------------------------------

.. automodule:: tsfresh.examples.robot_execution_failures
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: tsfresh.examples
   :members:
   :undoc-members:
   :show-inheritance:


================================================
FILE: docs/api/tsfresh.feature_extraction.rst
================================================
tsfresh.feature\_extraction package
===================================

Submodules
----------

tsfresh.feature\_extraction.data module
---------------------------------------

.. automodule:: tsfresh.feature_extraction.data
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.feature\_extraction.extraction module
---------------------------------------------

.. automodule:: tsfresh.feature_extraction.extraction
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.feature\_extraction.feature\_calculators module
-------------------------------------------------------

.. automodule:: tsfresh.feature_extraction.feature_calculators
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.feature\_extraction.settings module
-------------------------------------------

.. automodule:: tsfresh.feature_extraction.settings
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: tsfresh.feature_extraction
   :members:
   :undoc-members:
   :show-inheritance:


================================================
FILE: docs/api/tsfresh.feature_selection.rst
================================================
tsfresh.feature\_selection package
==================================

Submodules
----------

tsfresh.feature\_selection.relevance module
-------------------------------------------

.. automodule:: tsfresh.feature_selection.relevance
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.feature\_selection.selection module
-------------------------------------------

.. automodule:: tsfresh.feature_selection.selection
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.feature\_selection.significance\_tests module
-----------------------------------------------------

.. automodule:: tsfresh.feature_selection.significance_tests
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: tsfresh.feature_selection
   :members:
   :undoc-members:
   :show-inheritance:


================================================
FILE: docs/api/tsfresh.rst
================================================
tsfresh package
===============

Subpackages
-----------

.. toctree::
   :maxdepth: 4

   tsfresh.convenience
   tsfresh.examples
   tsfresh.feature_extraction
   tsfresh.feature_selection
   tsfresh.scripts
   tsfresh.transformers
   tsfresh.utilities

Submodules
----------

tsfresh.defaults module
-----------------------

.. automodule:: tsfresh.defaults
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: tsfresh
   :members:
   :undoc-members:
   :show-inheritance:


================================================
FILE: docs/api/tsfresh.scripts.rst
================================================
tsfresh.scripts package
=======================

Submodules
----------

tsfresh.scripts.measure\_execution\_time module
-----------------------------------------------

.. automodule:: tsfresh.scripts.measure_execution_time
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.scripts.run\_tsfresh module
-----------------------------------

.. automodule:: tsfresh.scripts.run_tsfresh
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.scripts.test\_timing module
-----------------------------------

.. automodule:: tsfresh.scripts.test_timing
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: tsfresh.scripts
   :members:
   :undoc-members:
   :show-inheritance:


================================================
FILE: docs/api/tsfresh.transformers.rst
================================================
tsfresh.transformers package
============================

Submodules
----------

tsfresh.transformers.feature\_augmenter module
----------------------------------------------

.. automodule:: tsfresh.transformers.feature_augmenter
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.transformers.feature\_selector module
---------------------------------------------

.. automodule:: tsfresh.transformers.feature_selector
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.transformers.per\_column\_imputer module
------------------------------------------------

.. automodule:: tsfresh.transformers.per_column_imputer
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.transformers.relevant\_feature\_augmenter module
--------------------------------------------------------

.. automodule:: tsfresh.transformers.relevant_feature_augmenter
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: tsfresh.transformers
   :members:
   :undoc-members:
   :show-inheritance:


================================================
FILE: docs/api/tsfresh.utilities.rst
================================================
tsfresh.utilities package
=========================

Submodules
----------

tsfresh.utilities.dataframe\_functions module
---------------------------------------------

.. automodule:: tsfresh.utilities.dataframe_functions
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.utilities.distribution module
-------------------------------------

.. automodule:: tsfresh.utilities.distribution
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.utilities.profiling module
----------------------------------

.. automodule:: tsfresh.utilities.profiling
   :members:
   :undoc-members:
   :show-inheritance:

tsfresh.utilities.string\_manipulation module
---------------------------------------------

.. automodule:: tsfresh.utilities.string_manipulation
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: tsfresh.utilities
   :members:
   :undoc-members:
   :show-inheritance:


================================================
FILE: docs/authors.rst
================================================
.. _authors:
.. include:: ../AUTHORS.rst


================================================
FILE: docs/changes.rst
================================================
.. _changes:
.. include:: ../CHANGES.rst


================================================
FILE: docs/conf.py
================================================
# -*- coding: utf-8 -*-
#
# This file is execfile()d with the current directory set to its containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

import datetime
import sys

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
# sys.path.insert(0, os.path.abspath('.'))

# -- Hack for ReadTheDocs ------------------------------------------------------
# This hack is necessary since RTD does not issue `sphinx-apidoc` before running
# `sphinx-build -b html . _build/html`. See Issue:
# https://github.com/rtfd/readthedocs.org/issues/1139
# DON'T FORGET: Check the box "Install your project inside a virtualenv using
# setup.py install" in the RTD Advanced Settings.
import os

on_rtd = os.environ.get("READTHEDOCS", None) == "True"
if on_rtd:
    import inspect
    from sphinx.ext.apidoc import main

    __location__ = os.path.join(
        os.getcwd(), os.path.dirname(inspect.getfile(inspect.currentframe()))
    )

    output_dir = os.path.join(__location__, "../docs/api")
    module_dir = os.path.join(__location__, "../tsfresh")
    cmd_line_template = "sphinx-apidoc -f -o {outputdir} {moduledir}"
    cmd_line = cmd_line_template.format(outputdir=output_dir, moduledir=module_dir)
    main(cmd_line.split(" ")[1:])

# -- General configuration -----------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
    "sphinx.ext.intersphinx",
    "sphinx.ext.viewcode",
    "sphinx.ext.doctest",
    "sphinx.ext.imgmath",
    "sphinx.ext.napoleon",
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]

# The suffix of source filenames.
source_suffix = ".rst"

# The encoding of source files.
# source_encoding = 'utf-8-sig'

# The master toctree document.
master_doc = "index"

# General information about the project.
now = datetime.datetime.today()
project = "tsfresh"
copyright = "2023-{}, Maximilian Christ et al./ Blue Yonder GmbH".format(now.year)

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = ""  # Is set by calling `setup.py docs`
# The full version, including alpha/beta/rc tags.
release = ""  # Is set by calling `setup.py docs`

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
# language = None

# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
# today = ''
# Else, today_fmt is used as the format for a strftime call.
# today_fmt = '%B %d, %Y'

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ["_build", "api/tests*"]

# Boolean indicating whether to scan all found documents for autosummary
# directives, and to generate stub pages for each
autosummary_generate = True

# -- Options for HTML output ---------------------------------------------------

# The theme to use for HTML and HTML Help pages.  See the documentation for
# a list of builtin themes.
html_theme = "sphinx_rtd_theme"

# Theme options are theme-specific and customize the look and feel of a theme
# further.  For a list of options available for each theme, see the
# documentation.
html_theme_options = {"style_nav_header_background": "#51b63c"}

# The name for this set of Sphinx documents.  If None, it defaults to
# "<project> v<release> documentation".
try:
    from tsfresh import __version__ as version
except ImportError:
    pass
else:
    release = version

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]

# From https://rackerlabs.github.io/docs-rackspace/tools/rtd-tables.html
html_css_files = [
    'theme_override.css',
]

# Output file base name for HTML help builder.
htmlhelp_basename = "tsfresh-doc"


================================================
FILE: docs/images/rolling_mechanism_drawio_template.xml
================================================
<mxfile userAgent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36" version="6.9.1" editor="www.draw.io" type="device"><diagram id="172fcd8a-daa3-4aac-4999-7b3edecb1ef5" name="Page-1">7ZxJb9s4FIB/jYGZQwNu2o5Nms5cChTIAJM5MhJtE5EtQ6Zju79+KImyJZK21Vhb3PZQSI8SRX3vkXoLnQl+WOz+Sulq/i2JWDxBINpN8JcJQn7gyP8zwb4QuJ5XCGYpjwoRPAqe+A+mhEBJNzxi69qFIkliwVd1YZgslywUNRlN02Rbv2yaxPWnruiMGYKnkMam9F8eibl6LeQd5X8zPpuXT4ZuULS80PB1liabpXreBOFp/q9oXtCyL/Wi6zmNkm1FhB8n+CFNElEcLXYPLM7QltiK+76eaD2MO2VL0eQGDxd3vNF4w8oh5wMT+xJG/josuwFM8P12zgV7WtEwa91K7UvZXCxieQbl4TRZCqXPQJ6q3lkq2O7kEOHhxaU9sWTBRLqXl6gbXECKW5QtQaLYbY+aweo15lWllPdRZQyzQ9dHIPJAMTnBB4yej1MO6QwfB3TEx72Mh0VyaqlTvs4fwcPHo/CeLaPP2ZyV7ctkyTRePI4fkjhJ885wRJk/DaV8LdLklVVa3NBnL9NDSzlt0QGyQRSBbGhnEV9EqIQpi6ngb/XVw8ZVPeJ7wuVQDirEbl2DAaj3sE42acjUTdXZrPWDMKh3hLSOBE1nTBgd5Vo+vHaziXFZ8THPlWnqI+KpXLR5spTn8tUyeVXlJ9VlTJOTcwLDOgdoTglo0ycEp1XXdEr43ZFJE0FVawDaIXUwEYUq8BqRQsC7nlQwahty6jaEAGxmQy2YEGzw1RmQjOsPRwaOmozvDEcGjZqMvs70iqaBkzskGoCHQ+OMGw32hkPTwLcdEo1DhkMzbu8Pemg4NB26fy2gQcAdDs24/T3oB4OhQQ0cvvWcrrLDcJPG+/uUhq9MXM43HJMTWgDxqa0IwtXSM2Xur4qNmNicNrCZ3mBI43Ajg3E2QW4scsN5k4ez7HDKqNikbF02ye4rrQZw+f6iDrOedFDJimqGQolozGcZ41BSZVJ+n9HkcmifVcOCR1Ge+7Bpz5oiajWUQaaKiDXIa0FFplu6SlnEQ3HDvIMBeZOPu5IYjrAtbWPJ9LbCrYEjPFZuvoYNW7B1tQKbTvLvFdiiIk+Lfl1TRZ2tCKazfvsr8CGTOwTwBiHAWJcSWI615EYs3lxHSzBu4ARXi0ksfkm2tUJSJpAN8yTlP6RJ0bheXQpjul7z0Gbh1VCjqTVeWz0qZVcWjyDQciTvrR5BB53vqL3qETYd97Oq/l03tGoMw3q++RDC/qzqSVBfLiHuTvVmQGCoftDKoUaiaekQBKe113gFHHeiGmvlZQSasmkBTYO4Zkg0RE9U94hm3Dl87OqJ6h7RjDuHjwM95dgjmnHn8LGPhkMz7hw+gXoOv0c0487hE6yXUvtDU4aTY0Vj7GfpEc24d20QY9tGj2jG7Q4TY69Pf2icDtF0sJMOB3VUh4jgEiq/hX24jhk5PEOD1jU7lbWAmsHIYZ4toA5cD1O3JaZI+9SVuacLabzyC3kVUjPi2HeKVAL1I2JD6qMX7LaF1CMXkSLbdvo2kJru+LM5pz+elRLt04p7tFLTjd93irQnK9U3Y1iQdmalpvv/jG/ASl3NSssFrg8rNcOGfadI+7FSAoKLSLuyUtcMN74W1VUp/EZFyneWAuzzL1d5dRokdD2L2bdR0XJtcY+hk3/y3Hr+UqGQKE9oyLhv/+vpUi9TWHTpdqVL08Oe/vEM//z4HwYH1XPXkPi9fRjc23SyXX3RsSC1fhja+EmmWQ6Qho
pu0VDdHg3VDF1uwc82DNWCtDMPxgxdpKHiWzRU30wFdWaoZvRyC662YagWpJ0ZaoOix9XbjNiOi+dMH3eOOvtvUk3jantOrDiL7R6TShQ71KYk7GtbBXzNBWu6M8UhWkcNf9IulUH3lctW2QXr0wMm2u8TMQBnx0X0HZ1+7Xp5UIzgvdtkvAb7HVqzOXgHCK5a3R0qjfA7S7kcdub9fwHvN8ZgSGMkei35vcaoL0LGFuqWjNFBWqomOG+Mjnf2+p81Rnl6/As0xeXHv/KDH/8H</diagram></mxfile>


================================================
FILE: docs/index.rst
================================================
.. image:: images/tsfresh_logo.svg
   :width: 70 %
   :alt: some characteristics of the time series
   :align: center

=======
tsfresh
=======

This is the documentation of **tsfresh**.

tsfresh is a Python package.
It automatically calculates a large number of time series characteristics, the so-called features.
Furthermore, the package contains methods to evaluate the explanatory power and importance of such characteristics for
regression or classification tasks.

You can jump right into the package by looking into our :ref:`quick-start-label`.

Contents
========

The following chapters will explain the tsfresh package in detail:

.. toctree::
   :maxdepth: 1

   text/introduction
   text/quick_start
   text/data_formats
   text/sklearn_transformers
   text/list_of_features
   text/feature_extraction_settings
   text/feature_filtering
   text/how_to_add_custom_feature
   text/large_data
   text/tsfresh_on_a_cluster
   text/forecasting
   text/faq
   api/modules
   authors
   license
   changes
   text/how_to_contribute
   text/feature_calculation


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`


Acknowledgements
================

The research and development of TSFRESH was funded in part by the German Federal Ministry of Education and Research
under grant number 01IS14004 (project iPRODICT).


================================================
FILE: docs/license.rst
================================================
.. _license:

=======
License
=======

.. literalinclude:: ../LICENSE.txt


================================================
FILE: docs/text/data_formats.rst
================================================
.. _data-formats-label:

Data Formats
============

tsfresh offers three different options to specify the format of the time series data to use with the function
:func:`tsfresh.extract_features` (and all utility functions that expect a time series, for that
matter, like for example :func:`tsfresh.utilities.dataframe_functions.roll_time_series`).

Irrespective of the input format, tsfresh will always return the calculated features in the same output format
described below.

Typically, the input format options are :class:`pandas.DataFrame` objects, which we will discuss here, and also
Dask dataframes and PySpark computational graphs, which are discussed here :ref:`large-data-label`.

There are four important column types that
make up those DataFrames. Each will be described with an example from the robot failures dataset
(see :ref:`quick-start-label`).

:`column_id`: This column indicates which entities the time series belong to. Features will be extracted individually
    for each entity (id). The resulting feature matrix will contain one row per id.
    In the robot dataset, each robot is a different entity, so each robot has its own id.

:`column_sort`: This column contains values which allow sorting of the time series (e.g. sorting by time stamps).
    In general, having equidistant time steps, or the same time scale for the different ids and/or kinds, is not a requirement.
    Some features, however, might only make sense for equidistant time stamps.
    If you omit this column, the DataFrame is assumed to be already sorted in ascending order.
    Each of the robot sensor measurements has a time stamp which is used as the `column_sort`.

The following two columns only need to be specified for some data formats (see below):

:`column_value`: This column contains the actual values of the time series.
    This corresponds to the measured values of different sensors on the robots.

:`column_kind`: This column indicates the names of the different time series types (e.g. different sensors in an
    industrial application as in the robot dataset).
    For each kind of time series the features are calculated individually.

Important: None of these columns is allowed to contain ``NaN``, ``Inf`` or ``-Inf`` values.
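
Because such values are not allowed, it can help to check the input before calling the extraction.
The following is a minimal sketch using plain pandas. The example DataFrame and the decision to drop the affected
rows are illustrative assumptions; imputing the values may be more appropriate for your data.

.. code:: python

    import pandas as pd

    # Small example frame with one NaN and one Inf (made-up values)
    df = pd.DataFrame({
        "id": ["A", "A", "B"],
        "time": [1, 2, 1],
        "x": [0.5, float("nan"), 1.5],
        "y": [1.0, 2.0, float("inf")],
    })

    # Mark every cell that is NaN, Inf or -Inf
    invalid = df.isna() | df.isin([float("inf"), float("-inf")])

    # Names of the columns that contain at least one invalid cell
    has_invalid = invalid.any()
    invalid_columns = has_invalid[has_invalid].index.tolist()
    print(invalid_columns)

    # One possible remedy: drop the affected rows before the extraction
    df_clean = df[~invalid.any(axis=1)]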

In the following paragraphs, we describe the different input formats that are built on those columns:

* A flat DataFrame
* A stacked DataFrame
* A dictionary of flat DataFrames

The difference between a flat and a stacked DataFrame is indicated by specifying (or not) the parameters
``column_value`` and ``column_kind`` in the :func:`tsfresh.extract_features` function.

If you are unsure which one to choose, try either the flat or stacked DataFrame.

Input Option 1. Flat DataFrame or Wide DataFrame
------------------------------------------------

If both ``column_value`` and ``column_kind`` are set to ``None``, the time series data is assumed to be in a flat
DataFrame. This means that each different time series must be saved as its own column.

Example: Imagine you record the values of time series x and y for different objects A and B for three different
times t1, t2 and t3. Your resulting DataFrame may look like this:

+----+------+----------+----------+
| id | time | x        | y        |
+====+======+==========+==========+
| A  | t1   | x(A, t1) | y(A, t1) |
+----+------+----------+----------+
| A  | t2   | x(A, t2) | y(A, t2) |
+----+------+----------+----------+
| A  | t3   | x(A, t3) | y(A, t3) |
+----+------+----------+----------+
| B  | t1   | x(B, t1) | y(B, t1) |
+----+------+----------+----------+
| B  | t2   | x(B, t2) | y(B, t2) |
+----+------+----------+----------+
| B  | t3   | x(B, t3) | y(B, t3) |
+----+------+----------+----------+

Now, you want to calculate some features with tsfresh so you would pass:

.. code:: python

    column_id="id", column_sort="time", column_kind=None, column_value=None

to the extraction function, to extract features separately for all ids and separately for the x and y values.
You can also omit the ``column_kind=None, column_value=None`` as this is the default.

Input Option 2. Stacked DataFrame or Long DataFrame
---------------------------------------------------

If both ``column_value`` and ``column_kind`` are set, the time series data is assumed to be a stacked DataFrame.
This means that there are no separate columns for the different types of time series, but only one
value column and one kind column.
This representation has several advantages over the flat DataFrame.
For example, the time stamps of the different time series do not have to align.

Continuing our previous example, the DataFrame would look like this:

+----+------+------+----------+
| id | time | kind | value    |
+====+======+======+==========+
| A  | t1   | x    | x(A, t1) |
+----+------+------+----------+
| A  | t2   | x    | x(A, t2) |
+----+------+------+----------+
| A  | t3   | x    | x(A, t3) |
+----+------+------+----------+
| A  | t1   | y    | y(A, t1) |
+----+------+------+----------+
| A  | t2   | y    | y(A, t2) |
+----+------+------+----------+
| A  | t3   | y    | y(A, t3) |
+----+------+------+----------+
| B  | t1   | x    | x(B, t1) |
+----+------+------+----------+
| B  | t2   | x    | x(B, t2) |
+----+------+------+----------+
| B  | t3   | x    | x(B, t3) |
+----+------+------+----------+
| B  | t1   | y    | y(B, t1) |
+----+------+------+----------+
| B  | t2   | y    | y(B, t2) |
+----+------+------+----------+
| B  | t3   | y    | y(B, t3) |
+----+------+------+----------+

Then you would set:

.. code:: python

    column_id="id", column_sort="time", column_kind="kind", column_value="value"

to end up with the same extracted features.
You can also omit the value column and let ``tsfresh`` deduce it automatically.
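
If your data is stored in the flat layout, it can be reshaped into the stacked layout with plain pandas.
The sketch below uses ``pandas.DataFrame.melt``; the numeric values are made up for illustration, and the column
names ``kind`` and ``value`` are simply the ones used in the table above.

.. code:: python

    import pandas as pd

    # Flat (wide) DataFrame: one column per kind of time series
    flat = pd.DataFrame({
        "id": ["A", "A", "A", "B", "B", "B"],
        "time": [1, 2, 3, 1, 2, 3],
        "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        "y": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    })

    # Stacked (long) DataFrame: one row per (id, time, kind) combination
    stacked = flat.melt(
        id_vars=["id", "time"],
        value_vars=["x", "y"],
        var_name="kind",
        value_name="value",
    )
    print(stacked)

The resulting DataFrame has the four columns ``id``, ``time``, ``kind`` and ``value`` and twice as many rows as the
flat one, because each row of the flat DataFrame contributes one row per kind.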


Input Option 3. Dictionary of flat DataFrames
---------------------------------------------

Instead of passing a DataFrame which must be split up by its different kinds by tsfresh, you can also give a
dictionary mapping from the kind as string to a DataFrame containing only the time series data of that kind.
So essentially you are using a separate DataFrame for each kind of time series.

The data from the example can be split into two DataFrames resulting in the following dictionary:

{ "x":

    +----+------+----------+
    | id | time | value    |
    +====+======+==========+
    | A  | t1   | x(A, t1) |
    +----+------+----------+
    | A  | t2   | x(A, t2) |
    +----+------+----------+
    | A  | t3   | x(A, t3) |
    +----+------+----------+
    | B  | t1   | x(B, t1) |
    +----+------+----------+
    | B  | t2   | x(B, t2) |
    +----+------+----------+
    | B  | t3   | x(B, t3) |
    +----+------+----------+

,
"y":

   +----+------+----------+
   | id | time | value    |
   +====+======+==========+
   | A  | t1   | y(A, t1) |
   +----+------+----------+
   | A  | t2   | y(A, t2) |
   +----+------+----------+
   | A  | t3   | y(A, t3) |
   +----+------+----------+
   | B  | t1   | y(B, t1) |
   +----+------+----------+
   | B  | t2   | y(B, t2) |
   +----+------+----------+
   | B  | t3   | y(B, t3) |
   +----+------+----------+

}

You would pass this dictionary to tsfresh together with the following arguments:

.. code:: python

    column_id="id", column_sort="time", column_kind=None, column_value="value"


In this case we do not need to specify the kind column as the kind is the respective dictionary key.
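
Such a dictionary can be derived from a stacked DataFrame by grouping on the kind column.
This is a minimal sketch with made-up values; the variable name ``kind_to_df`` is chosen only for this example.

.. code:: python

    import pandas as pd

    stacked = pd.DataFrame({
        "id": ["A", "A", "B", "B", "A", "A", "B", "B"],
        "time": [1, 2, 1, 2, 1, 2, 1, 2],
        "kind": ["x", "x", "x", "x", "y", "y", "y", "y"],
        "value": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
    })

    # One DataFrame per kind; the kind itself becomes the dictionary key
    kind_to_df = {
        kind: group.drop(columns="kind").reset_index(drop=True)
        for kind, group in stacked.groupby("kind")
    }
    print(kind_to_df["y"])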

Output Format
-------------

The resulting feature matrix, containing the extracted features, is the same for all three input options.
It will always be a :class:`pandas.DataFrame` with the following layout:

+----+-------------+-----+-------------+-------------+-----+-------------+
| id | x_feature_1 | ... | x_feature_N | y_feature_1 | ... | y_feature_N |
+====+=============+=====+=============+=============+=====+=============+
| A  | ...         | ... | ...         | ...         | ... | ...         |
+----+-------------+-----+-------------+-------------+-----+-------------+
| B  | ...         | ... | ...         | ...         | ... | ...         |
+----+-------------+-----+-------------+-------------+-----+-------------+

where the x features are calculated using all x values (independently for A and B), the y features using all y values
(independently for A and B), and so on.

This DataFrame is also the expected input format to the feature selection algorithms used by tsfresh (e.g. the
:func:`tsfresh.select_features` function).


================================================
FILE: docs/text/faq.rst
================================================
FAQ
===


    1. **Does tsfresh support different time series lengths?**

       Yes, it supports different time series lengths. However, some feature calculators can demand a minimal length
       of the time series. If a shorter time series is passed to the calculator, a NaN is returned for those
       features.



    2. **Is it possible to extract features from rolling/shifted time series?**

       Yes, the :func:`tsfresh.utilities.dataframe_functions.roll_time_series` function allows you to conveniently create a rolled
       time series dataframe from your data. You just have to transform your data into one of the supported tsfresh
       :ref:`data-formats-label`.
       Then, :func:`tsfresh.utilities.dataframe_functions.roll_time_series` will return a DataFrame with the rolled time series,
       which you can pass to tsfresh.
       You can find more details here: :ref:`forecasting-label`.


    3. **How can I use tsfresh with Windows?**

       We recommend using `Anaconda <https://www.continuum.io/downloads#windows>`_. After installation, open the
       Anaconda Prompt, create an environment and set up tsfresh
       (please be aware that we are using multiprocessing, which can be `problematic <http://stackoverflow.com/questions/18204782/runtimeerror-on-windows-trying-python-multiprocessing>`_):

       .. code:: Bash

           conda create -n ENV_NAME python=VERSION
           conda install -n ENV_NAME pip requests numpy pandas scipy statsmodels patsy scikit-learn tqdm
           activate ENV_NAME
           pip install tsfresh


    4. **Does tsfresh support different sampling rates in the time series?**

        Yes! The feature calculators in tsfresh do not care about the sampling frequency.
        You will have to use the second input format, the stacked DataFrame (see :ref:`data-formats-label`).


    5. **Does tsfresh support irregularly spaced time series?**

       Yes, but be careful. As its name suggests, the ``column_sort`` (i.e., timestamp) parameter is only used to sort observations.
       Beyond sorting, tsfresh does not use the timestamp in calculations.
       While many features do not need a timestamp (or only need it for ordering), others will assume that observations are evenly spaced in time (e.g., one second between each observation).
       Since tsfresh ignores spacing, care should be taken when selecting features to use with a highly irregular series.

    6. **Even when just extracting the :class:`tsfresh.feature_extraction.settings.EfficientFCParameters`, tsfresh is taking a long time to run. Is there anything further I can do to speed up the processing?**

       If you are using parallelization (the default option), you may need to check that you are not over-provisioning your available CPU cores. Take a look at :ref:`notes-for-efficient-parallelization-label` for steps to eliminate this, which can speed up processing significantly.


================================================
FILE: docs/text/feature_calculation.rst
================================================
.. _feature-naming-label:

Feature Calculator Naming
=========================

tsfresh enforces a strict naming of the created features, which you have to follow whenever you create new feature
calculators.
This is due to the :func:`tsfresh.feature_extraction.settings.from_columns` method which needs to
deduce the following information from the feature name:

    * the time series that was used to calculate the feature
    * the feature calculator method that was used to derive the feature
    * all parameters that have been used to calculate the feature (optional)

Hence, to enable the :func:`tsfresh.feature_extraction.settings.from_columns` to deduce all the
necessary conditions, the features should be named in the following format:

    {time_series_name}__{feature_name}__{parameter name 1}_{parameter value 1}__[..]__{parameter name k}_{parameter value k}

Here, we assumed that {feature_name} has k parameters.

Examples of feature naming
''''''''''''''''''''''''''

So for example the following feature name:

    temperature_1__quantile__q_0.6

is the value of the feature :func:`tsfresh.feature_extraction.feature_calculators.quantile` for the time series
``temperature_1`` and a parameter value of ``q=0.6``. On the other hand, the feature named:

    Pressure 5__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_14__w_5

denotes the value of the feature :func:`tsfresh.feature_extraction.feature_calculators.cwt_coefficients` for
the time series ``Pressure 5`` under parameter values of ``widths=(2, 5, 10, 20)``, ``coeff=14`` and ``w=5``.
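
To illustrate how such a name can be taken apart again, here is a simplified sketch in plain Python.
It is not the code used by :func:`tsfresh.feature_extraction.settings.from_columns`; the helper name
``parse_feature_name`` is hypothetical, and the sketch assumes that neither the time series name contains a double
underscore nor any parameter value contains an underscore.

.. code:: python

    import ast


    def parse_feature_name(feature_name):
        """Split a feature name into (time series name, calculator, parameters)."""
        kind, calculator, *raw_parameters = feature_name.split("__")
        parameters = {}
        for raw in raw_parameters:
            # The value follows the last underscore, so parameter names that
            # themselves contain underscores stay intact.
            name, value = raw.rsplit("_", 1)
            try:
                parameters[name] = ast.literal_eval(value)
            except (ValueError, SyntaxError):
                parameters[name] = value  # keep unparsable values as strings
        return kind, calculator, parameters


    print(parse_feature_name("temperature_1__quantile__q_0.6"))
    print(parse_feature_name(
        "Pressure 5__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_14__w_5"
    ))

For the two names above, the sketch recovers the time series name, the calculator name and a dictionary of parameter
values, with ``widths`` parsed as a tuple.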


================================================
FILE: docs/text/feature_extraction_settings.rst
================================================
Feature extraction settings
===========================

When starting a new data science project involving time series you probably want to start by extracting a
comprehensive set of features. Later you can identify which features are relevant for the task at hand.
In the final stages, you probably want to fine-tune the parameters of the features to improve your models.

You can do all those things with tsfresh. So, you need to know how to control which features are calculated by tsfresh
and how one can adjust the parameters. In this section, we will clarify this.

For the lazy: Just let me calculate some features!
--------------------------------------------------

To calculate a comprehensive set of features, call the :func:`tsfresh.extract_features` function without
passing a ``default_fc_parameters`` or ``kind_to_fc_parameters`` object. This way you will be using the default options,
which use all the feature calculators in this package that we consider suitable to return by default.

For the advanced: How do I set the parameters for all kinds of time series?
----------------------------------------------------------------------------

After digging deeper into your data, you may want to calculate more of a certain type of feature and fewer of another
type. So, you need to use custom settings for the feature extractors. To do that with tsfresh you will have to use a
custom settings object:

>>> from tsfresh.feature_extraction import ComprehensiveFCParameters
>>> settings = ComprehensiveFCParameters()
>>> # Set here the options of the settings object as shown in the paragraphs below
>>> # ...
>>> from tsfresh.feature_extraction import extract_features
>>> extract_features(df, default_fc_parameters=settings)

The ``default_fc_parameters`` is expected to be a dictionary which maps feature calculator names
(the function names you can find in the :mod:`tsfresh.feature_extraction.feature_calculators` file) to a list
of dictionaries, which are the parameters with which the function will be called (as key value pairs). Each
function-parameter combination that is in this dict will be called during the extraction and will produce a feature.
If the function does not take any parameters, the value should be set to ``None``.

For example:

.. code:: python

    fc_parameters = {
        "length": None,
        "large_standard_deviation": [{"r": 0.05}, {"r": 0.1}]
    }

will produce three features: one by calling the
:func:`tsfresh.feature_extraction.feature_calculators.length` function without any parameters and two by calling
:func:`tsfresh.feature_extraction.feature_calculators.large_standard_deviation` with ``r = 0.05`` and ``r = 0.1``.

So you can control which features will be extracted, by adding or removing either keys or parameters from this dict.
It is as easy as that.
If you decide not to calculate the length feature here, you delete it from the dictionary:

.. code:: python

    del fc_parameters["length"]

And now, only the two other features are calculated.
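
To make the counting concrete, the following sketch enumerates the features that a settings dictionary yields for
one time series. The helper ``expected_feature_names`` and the kind name ``x`` are hypothetical and exist only for
this illustration; the sketch assumes calculators that return a single value per parameter set and follows the
naming scheme ``{time_series_name}__{feature_name}__{parameter name}_{parameter value}``.

.. code:: python

    def expected_feature_names(kind, fc_parameters):
        """List the feature names a settings dictionary yields for one kind."""
        names = []
        for calculator, parameter_sets in fc_parameters.items():
            if parameter_sets is None:
                # Calculator without parameters: exactly one feature
                names.append(f"{kind}__{calculator}")
                continue
            for parameters in parameter_sets:
                suffix = "__".join(f"{key}_{value}" for key, value in parameters.items())
                names.append(f"{kind}__{calculator}__{suffix}")
        return names


    fc_parameters = {
        "length": None,
        "large_standard_deviation": [{"r": 0.05}, {"r": 0.1}],
    }
    print(expected_feature_names("x", fc_parameters))  # three names

    del fc_parameters["length"]
    print(expected_feature_names("x", fc_parameters))  # two names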

For convenience, three dictionaries are predefined and can be used right away:

* :class:`tsfresh.feature_extraction.settings.ComprehensiveFCParameters`: includes all features without parameters and
  all features with parameters, each with different parameter combinations. This is the default for :func:`tsfresh.extract_features`
  if you do not hand in a ``default_fc_parameters`` at all.
* :class:`tsfresh.feature_extraction.settings.MinimalFCParameters`: includes only a handful of features
  and can be used for quick tests. The features which have the "minimal" attribute are used here.
* :class:`tsfresh.feature_extraction.settings.EfficientFCParameters`: Mostly the same features as in the
  :class:`tsfresh.feature_extraction.settings.ComprehensiveFCParameters`, but without features which are marked with the
  "high_comp_cost" attribute. This can be used if runtime performance plays a major role.

Theoretically, you could calculate an unlimited number of features with tsfresh by adding entry after entry to the
dictionary.


For the ambitious: How do I set the parameters for different types of time series?
----------------------------------------------------------------------------------

It is also possible to control the features to be extracted for the different kinds of time series individually.
You can do so by passing a ``kind_to_fc_parameters`` parameter to the :func:`tsfresh.extract_features` function.
It should be a dict mapping from kind names (as string) to ``fc_parameters`` objects,
which you would normally pass as an argument to the ``default_fc_parameters`` parameter.

So, for example the following code snippet:

.. code:: python

    kind_to_fc_parameters = {
        "temperature": {"mean": None},
        "pressure": {"maximum": None, "minimum": None}
    }

will extract the ``"mean"`` feature of the ``"temperature"`` time series and the ``"minimum"`` and ``"maximum"`` of the
``"pressure"`` time series.

The ``kind_to_fc_parameters`` argument will partly override the ``default_fc_parameters``. So, if you include a kind
name in the ``kind_to_fc_parameters`` parameter, its value will be used for that kind.
Other kinds will still use the ``default_fc_parameters``.
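
The override rule described above can be sketched in a few lines of plain Python. This is an illustration of the
described behavior, not the library code; the helper ``settings_for_kind``, the default settings and the kind
``humidity`` are assumptions made for this example.

.. code:: python

    def settings_for_kind(kind, default_fc_parameters, kind_to_fc_parameters=None):
        """Return the settings that apply to one kind of time series."""
        if kind_to_fc_parameters and kind in kind_to_fc_parameters:
            # A kind listed explicitly uses its own settings
            return kind_to_fc_parameters[kind]
        # All other kinds fall back to the default settings
        return default_fc_parameters


    default_fc_parameters = {"length": None}
    kind_to_fc_parameters = {
        "temperature": {"mean": None},
        "pressure": {"maximum": None, "minimum": None},
    }

    for kind in ["temperature", "pressure", "humidity"]:
        print(kind, settings_for_kind(kind, default_fc_parameters, kind_to_fc_parameters))

Here ``temperature`` and ``pressure`` use their own entries, while ``humidity`` is not listed and therefore uses the
default settings.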


A handy trick: Do I really have to create the dictionary by hand?
-----------------------------------------------------------------

Not necessarily. Let's assume you have a DataFrame of tsfresh features.
By using feature selection algorithms you find out that only a subgroup of features is relevant.

For this case, we provide the :func:`tsfresh.feature_extraction.settings.from_columns` function, which constructs the ``kind_to_fc_parameters``
dictionary from the column names of this filtered feature matrix, so that only the relevant features are extracted.

This can save a huge amount of time because you prevent the calculation of unnecessary features.
Let's illustrate this with an example:

.. code:: python

    from tsfresh import extract_features
    from tsfresh.feature_extraction.settings import from_columns

    # X_tsfresh contains the extracted tsfresh features
    X_tsfresh = extract_features(...)

    # which are now filtered to only contain relevant features
    X_tsfresh_filtered = some_feature_selection(X_tsfresh, y, ...)

    # we can easily construct the corresponding settings object
    kind_to_fc_parameters = from_columns(X_tsfresh_filtered)

The above code will construct for you the ``kind_to_fc_parameters`` dictionary that corresponds to the features and parameters (!) from
the tsfresh features that were filtered by the ``some_feature_selection`` feature selection algorithm.
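
To show what kind of dictionary comes out, here is a simplified, dependency-free sketch of the grouping step.
It is not the implementation of :func:`tsfresh.feature_extraction.settings.from_columns`; the helper
``settings_from_columns`` and the column names are hypothetical, and the sketch assumes that parameter values
contain no underscores.

.. code:: python

    import ast


    def settings_from_columns(columns):
        """Group feature names into a kind -> calculator -> parameters mapping."""
        kind_to_fc_parameters = {}
        for column in columns:
            kind, calculator, *raw_parameters = column.split("__")
            settings = kind_to_fc_parameters.setdefault(kind, {})
            if not raw_parameters:
                settings[calculator] = None  # calculator without parameters
                continue
            parameters = {}
            for raw in raw_parameters:
                name, value = raw.rsplit("_", 1)
                parameters[name] = ast.literal_eval(value)
            settings.setdefault(calculator, []).append(parameters)
        return kind_to_fc_parameters


    columns = [
        "temperature__mean",
        "pressure__maximum",
        "pressure__quantile__q_0.6",
        "pressure__quantile__q_0.9",
    ]
    print(settings_from_columns(columns))

The result maps ``temperature`` to ``{"mean": None}`` and ``pressure`` to a dictionary with ``maximum`` and the two
``quantile`` parameter sets, which is the structure expected by the ``kind_to_fc_parameters`` argument.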


================================================
FILE: docs/text/feature_filtering.rst
================================================
Feature filtering
=================


The all-relevant problem of feature selection is the identification of all strongly and weakly relevant attributes.
This problem is especially hard to solve for time series classification and regression in industrial applications such
as predictive maintenance or production line optimization, for which each label or regression target is associated with
several time series and meta-information simultaneously.

To limit the number of irrelevant features, tsfresh deploys the fresh algorithm (fresh stands for `FeatuRe Extraction
based on Scalable Hypothesis tests`) [1]_.

The algorithm is called by :func:`tsfresh.feature_selection.relevance.calculate_relevance_table`.
It is an efficient, scalable feature extraction algorithm, which filters the available features in an early stage of the
machine learning pipeline with respect to their significance for the classification or regression task, while
controlling the expected percentage of selected but irrelevant features.

The filtering process consists of three phases depicted in the following figure:

.. image:: ../images/feature_extraction_process_20160815_mc_1.png
   :scale: 70 %
   :alt: the time series
   :align: center

Phase 1 - Feature extraction
''''''''''''''''''''''''''''

Firstly, the algorithm characterizes time series with comprehensive and well-established feature mappings and considers
additional features describing meta-information.
The feature calculators used to derive the features are those in
:mod:`tsfresh.feature_extraction.feature_calculators`.

In the above figure, this corresponds to the change from raw time series to aggregated features.

Phase 2 - Feature significance testing
''''''''''''''''''''''''''''''''''''''

In a second step, each feature vector is individually and independently evaluated with respect to its significance for
predicting the target under investigation.
Those tests are located in the submodule :mod:`tsfresh.feature_selection.significance_tests`.
The result of these tests is a vector of p-values, quantifying the significance of each feature for predicting the
label/target.

In the above figure, this corresponds to the change from aggregated features to p-values.

Phase 3 - Multiple test procedure
'''''''''''''''''''''''''''''''''

The vector of p-values is evaluated on the basis of the Benjamini-Yekutieli procedure [2]_ in order to decide which features
to keep.
This multiple testing procedure is taken from the statsmodels package.

In the above figure, this corresponds to the change from p-values to selected features.
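
For readers who want to see the decision rule itself, the following is a from-scratch sketch of the
Benjamini-Yekutieli step-up procedure in plain Python. tsfresh relies on the statsmodels implementation, so this is
only an illustration; the function name, the parameter name ``fdr_level`` and the p-values are assumptions made for
this example.

.. code:: python

    def benjamini_yekutieli(p_values, fdr_level=0.05):
        """Return one boolean per p-value: True if the feature is kept."""
        m = len(p_values)
        # Correction factor for arbitrary dependence: c(m) = 1 + 1/2 + ... + 1/m
        c_m = sum(1.0 / i for i in range(1, m + 1))
        # Indices of the p-values in ascending order
        order = sorted(range(m), key=lambda i: p_values[i])

        # Find the largest rank k with p_(k) <= k / (m * c(m)) * fdr_level
        largest_k = 0
        for rank, index in enumerate(order, start=1):
            if p_values[index] <= rank / (m * c_m) * fdr_level:
                largest_k = rank

        # Keep the features that belong to the largest_k smallest p-values
        keep = [False] * m
        for index in order[:largest_k]:
            keep[index] = True
        return keep


    p_values = [0.6, 0.001, 0.041, 0.008, 0.039]
    print(benjamini_yekutieli(p_values))

With these five p-values and a level of 0.05, only the features with p-values 0.001 and 0.008 are kept, because the
third smallest p-value (0.039) already exceeds its threshold of about 0.013.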


References
''''''''''

    .. [1] Christ, M., Kempa-Liehr, A.W. and Feindt, M. (2016).
         Distributed and parallel time series feature extraction for industrial big data applications.
         ArXiv e-prints: 1610.07717 URL: http://adsabs.harvard.edu/abs/2016arXiv161007717C

    .. [2] Benjamini, Y. and Yekutieli, D. (2001).
        The control of the false discovery rate in multiple testing under dependency.
        Annals of statistics, 1165--1188


================================================
FILE: docs/text/forecasting.rst
================================================
.. _forecasting-label:

Rolling/Time series forecasting
===============================

Features extracted with *tsfresh* can be used for many different tasks, such as time series classification,
compression or forecasting.
This section explains how we can use the features for time series forecasting.

Let's say you have the price of a certain stock, e.g., Apple, for 100 time steps.
Now, you want to build a feature-based model to forecast future prices of the Apple stock.
You could remove the last price value (of today) and extract features from the time series until today to predict the price of today.
But this would only give you a single example to train.
Instead, you can repeat this process: for every day in your stock price time series, remove the current value, extract features for the time until this value and train to predict the value of the day (which you removed).
You can think of it as shifting a cut-out window over your sorted time series data: on each shift step you extract the data you see through your cut-out window to build a new, smaller time series and extract features only on this one.
Then you continue shifting.
In ``tsfresh``, the process of shifting a cut-out window over your data to create smaller time series cut-outs is called *rolling*.

Rolling is a way to turn a single time series into multiple time series, each of them ending one (or n) time step later than the one before.
The rolling utilities implemented in `tsfresh` help you in this process of reshaping (and rolling) your data into a format on which you can apply the usual :func:`tsfresh.extract_features` method.
This means that the step of extracting the time series windows and the feature extraction are separated.

Please note that "time" does not necessarily mean clock time here.
The "sort" column of a DataFrame in the supported :ref:`data-formats-label` gives a sequential state to the
individual measurements.
In the case of time series this can be the *time* dimension while in other cases, this can be a location, a frequency, etc.

The following image illustrates the process:

.. image:: ../images/rolling_mechanism_1.png
   :scale: 100 %
   :alt: The rolling mechanism
   :align: center


Another example can be found in streaming data, e.g., in Industry 4.0 applications.
Here, you typically get one new data row at a time and use it to, for example, predict machine failures. To train your model,
you could act as if you would stream the data, by feeding your classifier the data after one time step,
the data after the first two time steps, etc.

In tsfresh, rolling is implemented via the helper function :func:`tsfresh.utilities.dataframe_functions.roll_time_series`.
Further, we provide the :func:`tsfresh.utilities.dataframe_functions.make_forecasting_frame` method as a convenient
wrapper to quickly construct the container and target vector for a given sequence.

Let's walk through an example to see how it works:

The rolling mechanism
---------------------

We look into the following flat DataFrame example, which is a tsfresh suitable format (see :ref:`data-formats-label`).
Note, that rolling also works for all other time series formats.

+----+------+----+----+
| id | time | x  | y  |
+====+======+====+====+
| 1  |  1   | 1  | 5  |
+----+------+----+----+
| 1  |  2   | 2  | 6  |
+----+------+----+----+
| 1  |  3   | 3  | 7  |
+----+------+----+----+
| 1  |  4   | 4  | 8  |
+----+------+----+----+
| 2  |  8   | 10 | 12 |
+----+------+----+----+
| 2  |  9   | 11 | 13 |
+----+------+----+----+

In the above flat DataFrame, we measured the values from two sensors x and y for two different entities (id 1 and 2) at 4 and 2 time
steps, respectively (times 1, 2, 3, 4 for id 1 and times 8, 9 for id 2).

If you want to follow along, here is the python code to generate this data:

.. code:: python

   import pandas as pd
   df = pd.DataFrame({
      "id": [1, 1, 1, 1, 2, 2],
      "time": [1, 2, 3, 4, 8, 9],
      "x": [1, 2, 3, 4, 10, 11],
      "y": [5, 6, 7, 8, 12, 13],
   })

Now, we can use :func:`tsfresh.utilities.dataframe_functions.roll_time_series` to get consecutive sub-time series.
You could think of having a window sliding over your time series data and extracting all the data you can see through this window.
There are three parameters to tune for the window:

* `max_timeshift` defines how large the window is at maximum. The extracted time series will have a length of at most `max_timeshift + 1`
  (they can also be shorter, as time stamps at the beginning have fewer past values).
* `min_timeshift` defines the minimal size of each window. Shorter time series (usually at the beginning) will be omitted.
* Advanced: `rolling_direction` defines whether you want to slide in positive (increasing sort) or negative (decreasing sort) direction. You rarely need the negative direction, so you probably do not want to change the default. The absolute value of this parameter decides how much you want to shift per cut-out step.

The column parameters are the same as in the usual :ref:`data-formats-label`.

Let's see what will happen with our data sample:

.. code:: python

   from tsfresh.utilities.dataframe_functions import roll_time_series
   df_rolled = roll_time_series(df, column_id="id", column_sort="time")

The new data set consists only of values from the old data set, but with new ids.
Also, the sort column values (in this case ``time``) are copied.
If you group by ``id``, you will end up with the following parts (or "windows"):

+-------+-------+---+----+
| id    | time  | x |  y |
+=======+=======+===+====+
| (1,1) |    1  | 1 |  5 |
+-------+-------+---+----+

+-------+-------+---+----+
| id    | time  | x |  y |
+=======+=======+===+====+
| (1,2) |    1  | 1 |  5 |
+-------+-------+---+----+
| (1,2) |    2  | 2 |  6 |
+-------+-------+---+----+

+-------+-------+---+----+
| id    | time  | x |  y |
+=======+=======+===+====+
| (1,3) |    1  | 1 |  5 |
+-------+-------+---+----+
| (1,3) |    2  | 2 |  6 |
+-------+-------+---+----+
| (1,3) |    3  | 3 |  7 |
+-------+-------+---+----+

+-------+-------+---+----+
| id    | time  | x |  y |
+=======+=======+===+====+
| (1,4) |    1  | 1 |  5 |
+-------+-------+---+----+
| (1,4) |    2  | 2 |  6 |
+-------+-------+---+----+
| (1,4) |    3  | 3 |  7 |
+-------+-------+---+----+
| (1,4) |    4  | 4 |  8 |
+-------+-------+---+----+

+-------+-------+---+----+
| id    | time  | x |  y |
+=======+=======+===+====+
| (2,8) |    8  |10 | 12 |
+-------+-------+---+----+

+-------+-------+---+----+
| id    | time  | x |  y |
+=======+=======+===+====+
| (2,9) |    8  |10 | 12 |
+-------+-------+---+----+
| (2,9) |    9  |11 | 13 |
+-------+-------+---+----+
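
The windows above can be reproduced with a small dependency-free sketch of forward rolling.
It is not the implementation of :func:`tsfresh.utilities.dataframe_functions.roll_time_series`; the helper ``roll``
is hypothetical, and it assumes that a window is kept only if it contains more than ``min_timeshift`` rows.

.. code:: python

   from collections import defaultdict


   def roll(rows, max_timeshift=None, min_timeshift=0):
       """Return {(id, last_time): [rows in the window]} for forward rolling."""
       by_id = defaultdict(list)
       for row in rows:
           by_id[row["id"]].append(row)

       windows = {}
       for entity, series in by_id.items():
           series = sorted(series, key=lambda row: row["time"])
           for end in range(len(series)):
               # The window ends at position `end` and reaches back
               # at most `max_timeshift` steps.
               start = 0 if max_timeshift is None else max(0, end - max_timeshift)
               window = series[start:end + 1]
               # Windows that are too short are omitted.
               if len(window) > min_timeshift:
                   windows[(entity, series[end]["time"])] = window
       return windows


   rows = [
       {"id": 1, "time": 1, "x": 1, "y": 5},
       {"id": 1, "time": 2, "x": 2, "y": 6},
       {"id": 1, "time": 3, "x": 3, "y": 7},
       {"id": 1, "time": 4, "x": 4, "y": 8},
       {"id": 2, "time": 8, "x": 10, "y": 12},
       {"id": 2, "time": 9, "x": 11, "y": 13},
   ]
   windows = roll(rows)
   for new_id, window in windows.items():
       print(new_id, [row["x"] for row in window])

Calling ``roll(rows, max_timeshift=1)`` in this sketch limits every window to at most two rows, so the window
``(1, 4)`` then only contains the x values 3 and 4.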

Now, you can run the usual feature extraction procedure on the rolled data:

.. code:: python

   from tsfresh import extract_features
   df_features = extract_features(df_rolled, column_id="id", column_sort="time")

You will end up with features generated for each one of the parts above, which you can then use for training your forecasting model.

+----------+----------------+-----------------------------+-----+
| variable |  x__abs_energy |  x__absolute_sum_of_changes | ... |
+==========+================+=============================+=====+
| id       |                |                             | ... |
+----------+----------------+-----------------------------+-----+
| (1,1)    |            1.0 |                         0.0 | ... |
+----------+----------------+-----------------------------+-----+
| (1,2)    |            5.0 |                         1.0 | ... |
+----------+----------------+-----------------------------+-----+
| (1,3)    |           14.0 |                         2.0 | ... |
+----------+----------------+-----------------------------+-----+
| (1,4)    |           30.0 |                         3.0 | ... |
+----------+----------------+-----------------------------+-----+
| (2,8)    |          100.0 |                         0.0 | ... |
+----------+----------------+-----------------------------+-----+
| (2,9)    |          221.0 |                         1.0 | ... |
+----------+----------------+-----------------------------+-----+

For example, the features for id ``(1,3)`` are extracted using the data of ``id=1`` up to and including ``t=3`` (so ``t=1``, ``t=2`` and ``t=3``).
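
The two feature columns shown above can be checked by hand.
The following plain-Python sketch is not tsfresh code, and the helper functions are written only for illustration.
It assumes that ``abs_energy`` is the sum of the squared values and that ``absolute_sum_of_changes`` is the sum of the absolute differences between consecutive values:

.. code:: python

   def abs_energy(values):
       # Sum of the squared values of the window
       return sum(v * v for v in values)


   def absolute_sum_of_changes(values):
       # Sum of the absolute differences between consecutive values
       return sum(abs(b - a) for a, b in zip(values, values[1:]))


   window_1_3 = [1, 2, 3]  # x values of id=1 for t=1, t=2 and t=3
   print(abs_energy(window_1_3))               # 14
   print(absolute_sum_of_changes(window_1_3))  # 2

Both numbers agree with the row ``(1,3)`` of the feature table.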

If you want to train a model for forecasting, `tsfresh` also offers the function :func:`tsfresh.utilities.dataframe_functions.make_forecasting_frame`, which will help you match the target vector properly.
This process is visualized in the following figure.
It shows how the purple, rolled sub-timeseries are used as base for the construction of the feature matrix *X*
(if *f* is the `extract_features` function).
The green data points need to be predicted by the model and are used as rows in the target vector *y*.
Be aware that this only works for a one-dimensional time series of a single `id` and `kind`.

.. image:: ../images/rolling_mechanism_2.png
   :scale: 100 %
   :alt: The rolling mechanism
   :align: center
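
The pairing of rolled windows and target values shown in the figure can be sketched in plain Python.
The helper ``make_window_target_pairs`` is hypothetical and does not reproduce the return format of ``make_forecasting_frame``; it only illustrates which window belongs to which target for a single one-dimensional series:

.. code:: python

   def make_window_target_pairs(series):
       """Pair every growing window with the value that follows it."""
       pairs = []
       for end in range(1, len(series)):
           window = series[:end]  # rolled sub-time series (input for the features)
           target = series[end]   # the next value, i.e. the value to predict
           pairs.append((window, target))
       return pairs


   for window, target in make_window_target_pairs([1, 2, 3, 4]):
       print(window, "->", target)
   # [1] -> 2
   # [1, 2] -> 3
   # [1, 2, 3] -> 4

Note that the first value never appears as a target, because there is no earlier window to predict it from.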

Parameters and Implementation Notes
-----------------------------------

The above example demonstrates the overall rolling mechanism, which creates new time series.
Now, we discuss the naming convention for the new time series.

For identifying every subsequence, `tsfresh` uses the time stamp of the point that will be predicted together with the old identifier as "id".
For positive rolling, this `timeshift` is the last time stamp in the subsequence.
For negative rolling, it is the first one. For example, rolling the above dataframe in negative direction gives us:

+-------+------+----+----+
| id    | time |  x |  y |
+=======+======+====+====+
| (1,1) |    1 |  1 |  5 |
+-------+------+----+----+
| (1,1) |    2 |  2 |  6 |
+-------+------+----+----+
| (1,1) |    3 |  3 |  7 |
+-------+------+----+----+
| (1,1) |    4 |  4 |  8 |
+-------+------+----+----+
| (1,2) |    2 |  2 |  6 |
+-------+------+----+----+
| (1,2) |    3 |  3 |  7 |
+-------+------+----+----+
| (1,2) |    4 |  4 |  8 |
+-------+------+----+----+
| (1,3) |    3 |  3 |  7 |
+-------+------+----+----+
| (1,3) |    4 |  4 |  8 |
+-------+------+----+----+
| (1,4) |    4 |  4 |  8 |
+-------+------+----+----+
| (2,8) |    8 | 10 | 12 |
+-------+------+----+----+
| (2,8) |    9 | 11 | 13 |
+-------+------+----+----+
| (2,9) |    9 | 11 | 13 |
+-------+------+----+----+

which you could use to predict the current value using the future time series values (if that makes sense in your case).

Choosing a non-default `max_timeshift` or `min_timeshift` would make the extracted sub-time-series smaller or even remove them completely (e.g. with `min_timeshift = 1` the ``(1,1)`` (i.e. ``id=1,timeshift=1``) of the positive rolling case would disappear).
Using a ``rolling_direction`` with a larger absolute value (e.g. -2 or 2) will skip some of the windows (in this case, every second).
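
The naming convention and the effect of ``min_timeshift`` can be reproduced with a small plain-Python sketch.
This is not the implementation used by ``roll_time_series``: the function ``roll`` is made up for illustration, it works on ``(id, time, x)`` tuples instead of a dataframe, and it leaves out ``max_timeshift`` as well as step sizes larger than one.

.. code:: python

   from itertools import groupby


   def roll(rows, direction=1, min_timeshift=0):
       """Return {(id, timeshift): window} for rows given as (id, time, x) tuples."""
       windows = {}
       rows = sorted(rows)  # sort by id first, then by time
       for series_id, group in groupby(rows, key=lambda row: row[0]):
           group = list(group)
           for i in range(len(group)):
               if direction > 0:
                   window = group[: i + 1]  # everything up to and including row i
               else:
                   window = group[i:]       # row i and everything after it
               if len(window) > min_timeshift:  # drop windows that are too short
                   windows[(series_id, group[i][1])] = window
       return windows


   rows = [(1, 1, 1), (1, 2, 2), (1, 3, 3), (1, 4, 4), (2, 8, 10), (2, 9, 11)]
   print(list(roll(rows)))
   # [(1, 1), (1, 2), (1, 3), (1, 4), (2, 8), (2, 9)]
   print(list(roll(rows, min_timeshift=1)))
   # [(1, 2), (1, 3), (1, 4), (2, 9)]
   print(roll(rows, direction=-1)[(1, 3)])
   # [(1, 3, 3), (1, 4, 4)]

In this sketch, ``min_timeshift=1`` removes every window that contains only a single row, which includes the window ``(1,1)`` mentioned above.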


================================================
FILE: docs/text/how_to_add_custom_feature.rst
================================================
How to add a custom feature
===========================

If you want to extract custom made features from your time series, tsfresh allows you to do so in a few
simple steps:

Step 1. Decide which type of feature you want to implement
----------------------------------------------------------

tsfresh supports two types of feature calculation methods:

    *1.* simple

    *2.* combiner

The difference lies in the number of features calculated for a single time series.
The feature calculator is simple if it returns one feature (*1.*), and it is a combiner if it returns multiple features (*2.*).
So if you want to add a single feature, you should select *1.*, the simple feature calculator class.
If, however, it is better to calculate multiple features at the same time (e.g., to perform auxiliary calculations only
once for all features), then you should choose type *2.*.


Step 2. Write the feature calculator
------------------------------------

Depending on which type of feature calculator you are implementing, you can use the following feature calculator skeletons:

1. simple features
~~~~~~~~~~~~~~~~~~

You can write a simple feature calculator that returns exactly one feature, without parameters as follows:

.. code:: python

    from tsfresh.feature_extraction.feature_calculators import set_property


    @set_property("fctype", "simple")
    def your_feature_calculator(x):
        """
        The description of your feature

        :param x: the time series to calculate the feature of
        :type x: pandas.Series
        :return: the value of this feature
        :return type: bool, int or float
        """
        # Calculation of feature as float, int or bool
        result = f(x)
        return result

or with parameters:

.. code:: python

    @set_property("fctype", "simple")
    def your_feature_calculator(x, p1, p2, ...):
        """
        Description of your feature

        :param x: the time series to calculate the feature of
        :type x: pandas.Series
        :param p1: description of your parameter p1
        :type p1: type of your parameter p1
        :param p2: description of your parameter p2
        :type p2: type of your parameter p2
        ...
        :return: the value of this feature
        :return type: bool, int or float
        """
        # Calculation of feature as float, int or bool
        result = f(x, p1, p2)
        return result
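
As a concrete instance of the skeleton with parameters, the following sketch counts the values above a threshold.
The name ``count_above_threshold`` is made up for this example and is not part of tsfresh.
The ``try``/``except`` block is only there so that the snippet also runs without tsfresh installed; its stand-in decorator merely attaches the attribute to the function.

.. code:: python

    try:
        from tsfresh.feature_extraction.feature_calculators import set_property
    except ImportError:
        # Minimal stand-in (an assumption for this sketch, not tsfresh code)
        def set_property(key, value):
            def decorate(func):
                setattr(func, key, value)
                return func
            return decorate


    @set_property("fctype", "simple")
    def count_above_threshold(x, t):
        """
        Number of values in x that are strictly larger than t

        :param x: the time series to calculate the feature of
        :type x: iterable of numbers
        :param t: the threshold
        :type t: float
        :return: the value of this feature
        :return type: int
        """
        return sum(1 for value in x if value > t)


    print(count_above_threshold([1, 5, 2, 8], t=2))  # 2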


2. combiner features
~~~~~~~~~~~~~~~~~~~~

Alternatively, you can write a combiner feature calculator that returns multiple features as follows:

.. code:: python

    from tsfresh.feature_extraction.feature_calculators import set_property
    from tsfresh.utilities.string_manipulation import convert_to_output_format


    @set_property("fctype", "combiner")
    def your_feature_calculator(x, param):
        """
        Short description of your feature (should be a one liner as we parse the first line of the description)

        Long detailed description, add some equations, add some references, what kind of statistics is the feature
        capturing? When should you use it? When not?

        :param x: the time series to calculate the feature of
        :type x: pandas.Series
        :param param: contains dictionaries {"p1": x, "p2": y, ...} with p1 float, p2 int ...
        :type param: list
        :return: list of tuples (s, f) where s are the parameters, serialized as a string,
                 and f the respective feature value as bool, int or float
        :return type: pandas.Series
        """
        # Do some pre-processing if needed for all parameters
        # f is a function that calculates the feature value for each single parameter combination
        return [(convert_to_output_format(config), f(x, config)) for config in param]
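
The following sketch fills the combiner skeleton with a concrete calculation: the series is sorted once, and the sorted values are then reused for every parameter combination.
The feature ``value_at_rank`` and its parameter ``q`` are hypothetical.
The fallback definitions are simplified stand-ins so that the snippet also runs without tsfresh installed.

.. code:: python

    try:
        from tsfresh.feature_extraction.feature_calculators import set_property
        from tsfresh.utilities.string_manipulation import convert_to_output_format
    except ImportError:
        # Simplified stand-ins (assumptions for this sketch, not tsfresh code)
        def set_property(key, value):
            def decorate(func):
                setattr(func, key, value)
                return func
            return decorate

        def convert_to_output_format(config):
            return "__".join(f"{key}_{config[key]}" for key in sorted(config))


    @set_property("fctype", "combiner")
    def value_at_rank(x, param):
        """
        Value found at the relative position q of the sorted time series

        :param x: the time series to calculate the feature of
        :type x: iterable of numbers
        :param param: contains dictionaries {"q": q} with q float between 0 and 1
        :type param: list
        :return: list of tuples (s, f) where s are the parameters, serialized as a string,
                 and f the respective feature value
        :return type: list
        """
        ordered = sorted(x)  # shared pre-processing, done only once
        last = len(ordered) - 1
        return [
            (convert_to_output_format(config), ordered[int(config["q"] * last)])
            for config in param
        ]


    result = value_at_rank([3, 1, 2, 5, 4], [{"q": 0.0}, {"q": 0.5}, {"q": 1.0}])
    print([value for _, value in result])  # [1, 3, 5]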


Writing your own time-based feature calculators
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Writing your own time-based feature calculators is no different than usual. Only two new properties must be set using the `@set_property` decorator:

* Adding ``@set_property("input", "pd.Series")`` declares that the function expects a ``pd.Series`` rather than a ``numpy`` array as input.
  This allows the index to be used automatically.
* Adding ``@set_property("index_type", pd.DatetimeIndex)`` declares that the index of the input is a `DatetimeIndex`,
  allowing the function to perform calculations based on time data types.

For example, if we want to write a function that calculates the time between the first and last measurement, it could look something like this:

.. code:: python

    import pandas as pd
    from tsfresh.feature_extraction.feature_calculators import set_property


    @set_property("input", "pd.Series")
    @set_property("index_type", pd.DatetimeIndex)
    def timespan(x, param):
        ix = x.index

        # Get differences between the last timestamp and the first timestamp in seconds,
        # then convert to hours.
        times_seconds = (ix[-1] - ix[0]).total_seconds()
        return times_seconds / float(3600)


Step 3. Add custom settings for your feature
--------------------------------------------

Finally, you need to add your new custom feature to the extraction settings, otherwise it is not used
during extraction.
To do this, create a new settings object (by default, ``tsfresh`` uses the
:class:`tsfresh.feature_extraction.settings.ComprehensiveFCParameters`) and
add your function as a key to the dictionary.
As a value, either use ``None`` if your function does not need parameters or a list with the
parameters you want to use (as dictionaries).

.. code:: python

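    from tsfresh.feature_extraction.settings import ComprehensiveFCParameters
    settings = ComprehensiveFCParameters()
    settings[f] = [{"n": 1}, {"n": 2}]  # f is your feature calculator function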

After that, make sure you pass your newly created settings in the call to ``extract_features``.
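
How such a settings dictionary translates into calculator calls can be sketched in plain Python.
The function ``apply_settings`` and the two toy calculators are hypothetical, and the ``name__param_value`` naming of the results is an assumption made for this illustration.
The sketch only covers simple calculators and is not the code path used by ``extract_features``.

.. code:: python

    def mean(x):
        return sum(x) / len(x)


    def count_above_threshold(x, t):
        return sum(1 for value in x if value > t)


    # Function -> None (no parameters) or a list of parameter dictionaries
    settings = {
        mean: None,
        count_above_threshold: [{"t": 1}, {"t": 3}],
    }


    def apply_settings(x, settings):
        """Call every configured calculator once per parameter dictionary."""
        results = {}
        for func, parameter_list in settings.items():
            if parameter_list is None:
                results[func.__name__] = func(x)
            else:
                for params in parameter_list:
                    suffix = "__".join(f"{k}_{v}" for k, v in sorted(params.items()))
                    results[f"{func.__name__}__{suffix}"] = func(x, **params)
        return results


    print(apply_settings([1, 2, 3, 4], settings))
    # {'mean': 2.5, 'count_above_threshold__t_1': 3, 'count_above_threshold__t_3': 1}

Each parameter dictionary in the list therefore yields one feature per time series.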

Step 4. Make a pull request
---------------------------

We would be very happy if you contribute your custom features to tsfresh.

To do this, add your feature into the ``feature_calculators.py`` file and append your
feature (as a name) with safe default parameters to the ``name_to_param`` dictionary inside the
:class:`tsfresh.feature_extraction.settings.ComprehensiveFCParameters` constructor:

.. code:: python

    name_to_param.update({
        # here are the existing settings
        ...
        # Now the settings of your feature calculator
        "your_feature_calculator": [{"p1": x, "p2": y, ...} for x,y in ...],
    })

Keep in mind that the different feature extraction settings
(e.g. :class:`tsfresh.feature_extraction.settings.EfficientFCParameters`,
:class:`tsfresh.feature_extraction.settings.MinimalFCParameters` or
:class:`tsfresh.feature_extraction.settings.ComprehensiveFCParameters`) include different sets of
feature calculators. You can control which feature extraction settings object will include your new
feature calculator by giving your function attributes like "minimal" or "high_comp_cost". See the
classes in :mod:`tsfresh.feature_extraction.settings` for more information.

After that, add some tests and make a pull request to our `GitHub repo <https://github.com/blue-yonder/tsfresh>`_.
We happily accept partly implemented feature calculators, which we can finalize together.


================================================
FILE: docs/text/how_to_contribute.rst
================================================
How to contribute
=================

We want tsfresh to become the biggest archive of feature extraction methods in Python. To achieve this goal, we need
your help!

All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome. If you
want to add one or two interesting feature calculators, implement a new feature selection process or just fix 1-2 typos,
your help is appreciated.

If you want to help, just create a pull request on our GitHub page. To the new user, working with Git can sometimes be
confusing and frustrating. If you are not familiar with Git, you can also contact us by :ref:`email <authors>`.


Guidelines
''''''''''

There are three general coding paradigms that we believe in:

    1. **Keep it simple**. We believe that *"Programs should be written for people to read, and only incidentally for
       machines to execute."*.

    2. **Keep it documented** by at least including a docstring for each method and class. Do not describe what you are
       doing but why you are doing it.

    3. **Keep it tested**. We aim for a high test coverage.


There are two important copyright guidelines:

    4. Please do not include any data sets for which a licence is not available or commercial use is prohibited.
       Those can undermine the licence of the whole project.

    5. Do not use code snippets for which a licence is not available (e.g. from stackoverflow) or commercial use is
       prohibited. Those can undermine the licence of the whole project.

Further, there are some technical decisions we made:

    6. Clear the output of IPython notebooks. This improves the readability of related Git diffs.


Installation
''''''''''''

Install all the relevant python packages with

.. code::

    cd /path/to/tsfresh
    pip install -e ".[testing]"
    pre-commit install


The ``pip install -e`` command links the tsfresh package dynamically, which means that changes to the code will directly
show up, for example, in your test run. The last command installs the pre-commit hooks.


Test framework
''''''''''''''

After making your changes, you probably want to test them locally. To run our comprehensive suite of unit tests,
execute:


.. code::

    pytest


To test changes across multiple versions of Python, run:


.. code::

    tox -r -p auto


This will execute tests for the Python versions specified in `setup.cfg <https://github.com/blue-yonder/tsfresh/blob/main/setup.cfg>`_ using the `envlist` variable. For example, if `envlist` is set to `py37, py38`, the test suite will run for Python 3.7 and 3.8 on the local development platform, assuming the binaries for those versions are available locally. The exact Python microversions (e.g. `3.7.1` vs `3.7.2`) depend on what is installed on the local development machine.

A recommended way to manage multiple Python versions when testing locally is with `pyenv`, which enables organized installation and switching between versions.

In addition to running tests locally, you can also run them in a Dockerized testing environment:


.. code::

   make test-all-testenv


This command will initially take some time. However, subsequent invocations will be faster, and testing this way ensures a clean, consistent test environment, regardless of your local setup.


Documentation
'''''''''''''

Build the documentation after installing as described above with


.. code::

    pip install -e ".[docs]"
    cd docs
    make html

The finished documentation can be found in the docs/_build/html folder.


Styling
'''''''

We use black and isort for styling. They are automatically triggered on every commit after having installed pre-commit
(as described above).


We are looking forward to hearing from you! =)


PR Descriptions
'''''''''''''''

The PR should have a clear and descriptive title, along with a detailed description of the changes made, the problem being addressed, and any relevant tips for reviewers. An example of what this might look like is `here <https://github.com/blue-yonder/tsfresh/pull/994#issue-1509962136>`_.


================================================
FILE: docs/text/introduction.rst
================================================
Introduction
============

Why tsfresh?
------------

tsfresh is used for systematic feature engineering from time-series and other sequential data [1]_.
These data have in common that they are ordered by an independent variable.
The most common independent variable is time (time series).
Other examples for sequential data are reflectance and absorption spectra,
which have wavelength as their ordering dimension.
In order to keep things simple, we are simply referring to all different types of sequential data as time-series.

.. image:: ../images/introduction_ts_exa.png
   :scale: 70 %
   :alt: the time series
   :align: center

(and yes, it is pretty cold!)

Now you want to calculate different characteristics such as the maximum or minimum temperature, the average temperature
or the number of temporary temperature peaks:

.. image:: ../images/introduction_ts_exa_features.png
   :scale: 70 %
   :alt: some characteristics of the time series
   :align: center

Without tsfresh, you would have to calculate all those characteristics manually; tsfresh automates this process by
calculating and returning all those features for you.
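
To give an impression of what the manual route looks like, the following sketch computes a few of these characteristics in plain Python.
The temperature values are made up, and the peak definition used here (a value strictly larger than both direct neighbours) is an assumption for this example that may differ from the peak features shipped with tsfresh.

.. code:: python

    temperatures = [-3.0, -1.5, -2.0, 0.5, -0.5, -4.0, -2.5]  # made-up values

    maximum = max(temperatures)
    minimum = min(temperatures)
    average = sum(temperatures) / len(temperatures)

    # A "peak" is a value that is strictly larger than both direct neighbours
    peaks = sum(
        1
        for left, mid, right in zip(temperatures, temperatures[1:], temperatures[2:])
        if mid > left and mid > right
    )

    print(maximum, minimum, peaks)  # 0.5 -4.0 2

Every further characteristic needs its own hand-written code like this, for every time series.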

In addition, tsfresh is compatible with the Python libraries :mod:`pandas` and :mod:`scikit-learn`, so you can easily
integrate the feature extraction with your current routines.

What can we do with these features?
-----------------------------------

The extracted features can be used to describe the time series, i.e., often these features give new insights into the
time series and their dynamics. They can also be used to cluster time series and to train machine learning models that
perform classification or regression tasks on time series.

The tsfresh package has been successfully used in the following projects:

    * prediction of steel billets quality during a continuous casting process [2]_,
    * activity recognition from synchronized sensors [3]_,
    * volcanic eruption forecasting [4]_,
    * authorship attribution from written text samples [5]_,
    * characterisation of extrasolar planetary systems from time-series with missing data [6]_,
    * sensor anomaly detection [7]_,
    * and `many many more <https://scholar.google.de/scholar?cites=365611925060572663>`_.

What can't we do with tsfresh?
------------------------------

Currently, tsfresh is not suitable:

    * for streaming data (by streaming data we mean data that is usually used for online operations, while time series data is usually used for offline operations)
    * to train models on the extracted features (we do not want to reinvent the wheel; to train machine learning models, check out the Python package
      `scikit-learn <http://scikit-learn.org/stable/>`_)
    * for usage with highly irregular time series. tsfresh uses timestamps only to order observations. Many features are interval-agnostic (e.g., number of peaks) and can be determined for any series, but some other features (e.g., linear trend) assume equal spacing in time and should be used with care when this assumption is not met.

However, some of these use cases could be implemented. If you have an application in mind, open
an issue at `<https://github.com/blue-yonder/tsfresh/issues>`_, or feel free to contact us.
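
The caveat about irregular spacing can be made concrete with a small calculation.
The following plain-Python sketch (not tsfresh code, with made-up data) fits a least-squares slope once against the real time stamps and once against the mere order of the observations, which is all a feature sees if the time stamps are used only for sorting:

.. code:: python

    def slope(xs, ys):
        """Least-squares slope of ys against xs."""
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        var = sum((x - mean_x) ** 2 for x in xs)
        return cov / var


    timestamps = [0, 1, 2, 10]       # irregular: large gap before the last point
    values = [0.0, 1.0, 2.0, 10.0]   # grows by exactly 1 per time unit

    slope_by_time = slope(timestamps, values)             # uses the real time stamps
    slope_by_order = slope(range(len(values)), values)    # uses only the order
    print(slope_by_time, slope_by_order)

Against the real time stamps the slope is 1, while the order-based slope is roughly three times as large, because the gap before the last observation is ignored.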

What else is out there?
-----------------------

There is a MATLAB package called `hctsa <https://github.com/benfulcher/hctsa>`_ which can be used to automatically
extract features from time series.
It is also possible to use hctsa from within Python through the `pyopy <https://github.com/strawlab/pyopy>`_
package.
Other available packages are `featuretools <https://www.featuretools.com/>`_, `FATS <http://isadoranun.github.io/tsfeat/>`_ and `cesium <http://cesium-ml.org/>`_.

References
----------

   .. [1] Christ, M., Braun, N., Neuffer, J. and Kempa-Liehr A.W. (2018).
          *Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package)*.
          Neurocomputing 307 (2018) 72-77,
          `doi: 10.1016/j.neucom.2018.03.067 <https://doi.org/10.1016/j.neucom.2018.03.067>`_.
   .. [2] Christ, M., Kempa-Liehr, A.W. and Feindt, M. (2016).
          *Distributed and parallel time series feature extraction for industrial big data applications*.
          Asian Conference on Machine Learning (ACML), Workshop on Learning on Big Data (WLBD).
          `<https://arxiv.org/abs/1610.07717v1>`_.
   .. [3] Kempa-Liehr, A.W., Oram, J., Wong, A., Finch, M. and Besier, T. (2020).
          *Feature engineering workflow for activity recognition from synchronized inertial measurement units*.
          In: Pattern Recognition. ACPR 2019. Ed. by M. Cree et al. Vol. 1180.
          Communications in Computer and Information Science (CCIS).
          Singapore: Springer 2020, 223–231.
          `doi: 10.1007/978-981-15-3651-9_20 <https://doi.org/10.1007/978-981-15-3651-9_20>`_.
   .. [4] D. E. Dempsey, S. J. Cronin, S. Mei, and A. W. Kempa-Liehr (2020).
          *Automatic precursor recognition and real-time forecasting of sudden explosive volcanic eruptions at Whakaari, New Zealand*.
          Nature Communications 11.3562, pp. 1–8.
          `doi: 10.1038/s41467-020-17375-2 <https://dx.doi.org/10.1038/s41467-020-17375-2>`_.
   .. [5] Tang, Y., Blincoe, K., Kempa-Liehr, A.W. (2020).
          *Enriching Feature Engineering for Short Text Samples by Language Time Series Analysis*.
          EPJ Data Science 9.26 (2020), 1–59.
          `doi: 10.1140/epjds/s13688-020-00244-9 <https://doi.org/10.1140/epjds/s13688-020-00244-9>`_.
   .. [6] Kennedy, A., Gemma, N., Rattenbury, N., Kempa-Liehr, A.W. (2021).
          *Modelling the projected separation of microlensing events using systematic time-series feature engineering*.
          Astronomy and Computing 35.100460 (2021), 1–14,
          `doi: 10.1016/j.ascom.2021.100460 <https://doi.org/10.1016/j.ascom.2021.100460>`_.
   .. [7] Hui Yie Teh, Kevin I-Kai Wang, and Andreas W. Kempa-Liehr (2021).
          *Expect the Unexpected: Unsupervised feature selection for automated sensor anomaly detection*.
          IEEE Sensors Journal 15.16, pp. 18033–18046.
          `doi: 10.1109/JSEN.2021.3084970 <https://doi.org/10.1109/JSEN.2021.3084970>`_.


================================================
FILE: docs/text/large_data.rst
================================================
.. _large-data-label:

Large Input Data
================

If you are working with large time series data, you are probably facing multiple problems.
The two most important ones are:

* long execution times for feature extraction
* large memory consumption, even beyond what a single machine can handle

To solve the first problem, you can parallelize the computation as described in :ref:`tsfresh-on-a-cluster-label`.
Note that parallelization on your local computer is already turned on by default.

However, for larger data sets you need to handle both problems at the same time.
You have multiple options to do so, which we will discuss in the following paragraphs.

Dask - the simple way
---------------------

*tsfresh* accepts a `dask dataframe <https://docs.dask.org/en/latest/dataframe.html>`_ instead of a
pandas dataframe as input for the :func:`tsfresh.extract_features` function.
Dask dataframes allow you to scale your computation beyond your local memory (via partitioning the data internally)
and even to large clusters of machines.
Its dataframe API is very similar to pandas dataframes and might even be a drop-in replacement.

All arguments discussed in :ref:`data-formats-label` are also valid for dask dataframes.
The input data will be transformed into the correct format for *tsfresh* using dask methods
and the feature extraction will be added as additional computations to the computation graph.
You can then add additional computations to the result or trigger the computation as usual with ``.compute()``.

.. NOTE::

    The last step of the feature extraction is to bring all features into a tabular format.
    Especially for very large data samples, this computation can be a large
    performance bottleneck.
    We therefore recommend turning the pivoting off if you do not really need it
    and working with the un-pivoted data as much as possible.

For example, to read in data from parquet and do the feature extraction:

.. code::

    import dask.dataframe as dd
    from tsfresh import extract_features

    df = dd.read_parquet(...)

    X = extract_features(df,
                         column_id="id",
                         column_sort="time",
                         pivot=False)

    result = X.compute()
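
The difference between un-pivoted and pivoted results can be illustrated without dask.
In the following plain-Python sketch, the row layout ``(id, variable, value)`` and the feature names are assumptions chosen for illustration, and ``pivot`` is a hypothetical helper rather than a tsfresh function:

.. code:: python

    # Un-pivoted ("long") result: one row per (id, feature) pair
    long_rows = [
        (1, "x__maximum", 4.0),
        (1, "x__minimum", 1.0),
        (2, "x__maximum", 11.0),
        (2, "x__minimum", 10.0),
    ]


    def pivot(rows):
        """Turn (id, variable, value) rows into one dictionary of features per id."""
        table = {}
        for row_id, variable, value in rows:
            table.setdefault(row_id, {})[variable] = value
        return table


    print(pivot(long_rows))
    # {1: {'x__maximum': 4.0, 'x__minimum': 1.0}, 2: {'x__maximum': 11.0, 'x__minimum': 10.0}}

Pivoting has to bring together all rows that belong to the same id, which is why it can become expensive when the rows are spread over many partitions.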

Dask - more control
-------------------

The feature extraction method needs to perform some data transformations before it
can call the actual feature calculators.
If you want to optimize your data flow, you might want to have more control over how
exactly the feature calculation is added to your dask computation graph.

Therefore, it is also possible to add the feature extraction directly:


.. code::

    from tsfresh.convenience.bindings import dask_feature_extraction_on_chunk
    features = dask_feature_extraction_on_chunk(df_grouped,
                                                column_id="id",
                                                column_kind="kind",
                                                column_sort="time",
                                                column_value="value")

In this case however, ``df_grouped`` must already be in the correct format.
Check out the documentation of :func:`tsfresh.convenience.bindings.dask_feature_extraction_on_chunk`
for more information.
No pivoting will be performed in this case.

PySpark
-------

Similar to dask, it is also possible to pass the feature extraction into a Spark
computation graph.
You can find more information in the documentation of :func:`tsfresh.convenience.bindings.spark_feature_extraction_on_chunk`.


================================================
FILE: docs/text/list_of_features.rst
================================================
Overview on extracted features
==============================

*tsfresh* calculates a comprehensive number of features. All feature calculators are contained in the submodule:

.. autosummary::
   :toctree: _generated
   :template: module_functions_template.rst

    tsfresh.feature_extraction.feature_calculators


The following list contains all the feature calculations supported in the current version of *tsfresh*:

.. include:: _generated/tsfresh.feature_extraction.feature_calculators.rst


================================================
FILE: docs/text/quick_start.rst
================================================
.. _quick-start-label:

Quick Start
===========


Install tsfresh
---------------

As the compiled tsfresh package is hosted on the Python Package Index (PyPI), you can easily install it with pip:

.. code:: shell

   pip install tsfresh

If you need to work with large time series data that may not fit in memory, install tsfresh with
`Dask <https://www.dask.org>`_:

.. code:: shell

   pip install tsfresh[dask]

See also :ref:`large-data-label`.


Dive in
-------

Before boring yourself by reading the docs in detail, you can dive right into tsfresh with the following example:

We are given a data set containing robot failures as discussed in [1]_.
Each robot records time series from six different sensors.
For each sample denoted by a different id we are going to classify if the robot reports a failure or not.
From a machine learning point of view, our goal is to classify each group of time series.

To start, we load the data into Python

.. code:: python

    from tsfresh.examples.robot_execution_failures import download_robot_execution_failures, \
        load_robot_execution_failures
    download_robot_execution_failures()
    timeseries, y = load_robot_execution_failures()

and end up with a pandas.DataFrame `timeseries` having the following shape

.. code:: python

   print(timeseries.head())

+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|     | id  | time| F_x | F_y | F_z | T_x | T_y | T_z |
+=====+=====+=====+=====+=====+=====+=====+=====+=====+
| 0   | 1   | 0   | -1  | -1  | 63  | -3  | -1  | 0   |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| 1   | 1   | 1   | 0   | 0   | 62  | -3  | -1  | 0   |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| 2   | 1   | 2   | -1  | -1  | 61  | -3  | 0   | 0   |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| 3   | 1   | 3   | -1  | -1  | 63  | -2  | -1  | 0   |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| 4   | 1   | 4   | -1  | -1  | 63  | -3  | -1  | 0   |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+

The first column is the DataFrame index and has no meaning here.
There are six different time series (`F_x`, `F_y`, `F_z`, `T_x`, `T_y`, `T_z`) for the different sensors. The different robots are denoted by the `id` column.

On the other hand, ``y`` contains the information about which robot `id` reported a failure and which did not:

+---+---+
| 1 | 0 |
+---+---+
| 2 | 0 |
+---+---+
| 3 | 0 |
+---+---+
| 4 | 0 |
+---+---+
| 5 | 0 |
+---+---+
|...|...|
+---+---+

Here, for the samples with ids 1 to 5 no failure was reported.

In the following we illustrate the time series of the sample id 3 reporting no failure:

.. code:: python

    import matplotlib.pyplot as plt
    timeseries[timeseries['id'] == 3].plot(subplots=True, sharex=True, figsize=(10,10))
    plt.show()

.. image:: ../images/ts_example_robot_failures_nofail.png
   :alt: the time series for id 3 (no failure)
   :align: center

And for id 20 reporting a failure:


.. code:: python

    timeseries[timeseries['id'] == 20].plot(subplots=True, sharex=True, figsize=(10,10))
    plt.show()

.. image:: ../images/ts_example_robot_failures_fail.png
   :alt: the time series for id 20 (failure)
   :align: center

You can already see some differences by eye - but for successful machine learning we have to put these differences into
numbers.

This is where tsfresh comes into play.
It allows us to automatically extract over 1200 features from those six different time series for each robot.

For extracting all features, we do:

.. code:: python

    from tsfresh import extract_features
    extracted_features = extract_features(timeseries, column_id="id", column_sort="time")

You end up with the DataFrame `extracted_features` with more than 1200 different extracted features.
We will now first remove all ``NaN`` values (which were created by feature calculators that cannot be used on the
given data, e.g., because the statistics are too low), and then select only the relevant features:

.. code-block:: python

    from tsfresh import select_features
    from tsfresh.utilities.dataframe_functions import impute

    impute(extracted_features)
    features_filtered = select_features(extracted_features, y)


Only around 300 features were classified as relevant enough.
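
To make the imputing step more tangible, here is a plain-Python sketch for a single feature column.
It assumes that a ``NaN`` entry is replaced by the median of the finite values of that column; ``impute_column`` is a hypothetical helper, and the exact replacement rules of ``impute`` are described in its API documentation.

.. code:: python

    import math
    from statistics import median


    def impute_column(values):
        """Replace NaN entries by the median of the finite values of the column."""
        finite = [v for v in values if math.isfinite(v)]
        fill = median(finite)
        return [fill if math.isnan(v) else v for v in values]


    print(impute_column([1.0, float("nan"), 3.0, 5.0]))  # [1.0, 3.0, 3.0, 5.0]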

Further, you can even perform the extraction, imputing and filtering at the same time with the
:func:`tsfresh.extract_relevant_features` function:

.. code-block:: python

    from tsfresh import extract_relevant_features

    features_filtered_direct = extract_relevant_features(timeseries, y,
                                                         column_id='id', column_sort='time')


You can now use the features in the DataFrame `features_filtered` (which is equal to
`features_filtered_direct`) in conjunction with `y` to train your classification model.
You can find an example in the Jupyter notebook
`01 Feature Extraction and Selection.ipynb <https://github.com/blue-yonder/tsfresh/blob/main/notebooks/01%20Feature%20Extraction%20and%20Selection.ipynb>`_
where we train a RandomForestClassifier using the extracted features.

References

.. [1] http://archive.ics.uci.edu/ml/datasets/Robot+Execution+Failures


================================================
FILE: docs/text/sklearn_transformers.rst
================================================
.. _sklearn-transformers-label:

scikit-learn Transformers
=========================

tsfresh includes three scikit-learn compatible transformers, which allow you to easily incorporate feature extraction
and feature selection from time series into your existing machine learning pipelines.

The scikit-learn pipeline allows you to assemble several pre-processing steps that will be executed in sequence and
can thus be cross-validated together while setting different parameters (for more details about scikit-learn's
pipeline, take a look at the official documentation [1]_).
Our tsfresh transformers allow you to extract and filter the time series features during this pre-processing sequence.

The first two estimators in tsfresh are the :class:`~tsfresh.transformers.feature_augmenter.FeatureAugmenter`,
which extracts the features, and the :class:`~tsfresh.transformers.feature_selector.FeatureSelector`, which
performs the feature selection algorithm.
It is preferable to combine extracting and filtering of the features in a single step to avoid unnecessary feature
calculations.
Hence, the :class:`~tsfresh.transformers.relevant_feature_augmenter.RelevantFeatureAugmenter` combines both the
extraction and filtering of the features in a single step.

Example
-------

In the following example you see how we combine tsfresh's
:class:`~tsfresh.transformers.relevant_feature_augmenter.RelevantFeatureAugmenter` and a
:class:`~sklearn.ensemble.RandomForestClassifier` into a single pipeline. This pipeline can then fit both our
transformer and the classifier in one step.

.. code-block:: python

    from sklearn.pipeline import Pipeline
    from sklearn.ensemble import RandomForestClassifier
    from tsfresh.examples import load_robot_execution_failures
    from tsfresh.transformers import RelevantFeatureAugmenter
    import pandas as pd

    # Download dataset
    from tsfresh.examples.robot_execution_failures import download_robot_execution_failures
    download_robot_execution_failures()

    pipeline = Pipeline([
                ('augmenter', RelevantFeatureAugmenter(column_id='id', column_sort='time')),
                ('classifier', RandomForestClassifier()),
                ])

    df_ts, y = load_robot_execution_failures()
    X = pd.DataFrame(index=y.index)

    pipeline.set_params(augmenter__timeseries_container=df_ts)
    pipeline.fit(X, y)

The parameters of the :class:`~tsfresh.transformers.relevant_feature_augmenter.RelevantFeatureAugmenter` correspond to
the parameters of the top-level convenience function
:func:`~tsfresh.convenience.relevant_extraction.extract_relevant_features`.
In the above example, we only set the names of two columns ``column_id='id'``, ``column_sort='time'``
(see :ref:`data-formats-label` for more details on those parameters).

Because we cannot pass the time series container directly as a parameter to the augmenter step when calling fit or
transform on a :class:`sklearn.pipeline.Pipeline`, we have to set it manually by calling
``pipeline.set_params(augmenter__timeseries_container=df_ts)``.
In general, you can change the time series container from which the features are extracted by calling either the
pipeline's :func:`~sklearn.pipeline.Pipeline.set_params` method or the transformer's
:func:`~tsfresh.transformers.relevant_feature_augmenter.RelevantFeatureAugmenter.set_timeseries_container` method.

For further examples, see the Jupyter notebook ``02 sklearn Pipeline.ipynb`` in the notebooks folder of the tsfresh
GitHub repository.


References
----------

    .. [1] http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html


================================================
FILE: docs/text/tsfresh_on_a_cluster.rst
================================================
.. _tsfresh-on-a-cluster-label:

.. role:: python(code)
    :language: python

Parallelization
===============

The feature extraction, the feature selection, as well as the rolling, offer the possibility of parallelization.
By default, all of those tasks are parallelized by tsfresh.
Here we discuss the different settings to control the parallelization.
To achieve the best results for your use-case you should experiment with the parameters.

.. NOTE::
    This document describes parallelization to speed up processing time.
    If you are working with large amounts of data (which might not fit into memory),
    check :ref:`large-data-label`.

Please, let us know about your results tuning the below mentioned parameters! It will help improve the documentation as
well as the default settings.

Parallelization of Feature Selection
------------------------------------

We use a :class:`multiprocessing.Pool` to parallelize the calculation of the p-values for each feature. On
instantiation we set the Pool's number of worker processes to
`n_jobs`. This field defaults to
the number of processors on the current system. We recommend setting it to the maximum number of available (and
otherwise idle) processors.

The chunksize of the Pool's map function is another important parameter to consider. It can be set via the
`chunksize` field. By default, the choice is left to :class:`multiprocessing.Pool`. One data chunk is
defined as a single time series for one id and one kind. The chunksize is the
number of chunks that are submitted as one task to one worker process. If you
set the chunksize to 10, one worker task corresponds to
calculating all features for 10 id/kind time series combinations. If it is set
to None, heuristics that depend on the distributor are used to find the optimal
chunksize. The chunksize can have a crucial influence on cluster
performance and should be optimised in benchmarks for the problem at hand.
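
As a rough illustration of how the chunksize groups work into tasks, the following sketch enumerates the id/kind
combinations of a data set with 88 ids and 6 kinds and slices them into tasks of 10 chunks each.
The helper ``group_chunks`` is a hypothetical name used only for this illustration. It is not part of the tsfresh API
and does not claim to reproduce how tsfresh partitions data internally.

.. code:: python

    from itertools import product


    def group_chunks(ids, kinds, chunksize):
        """Group (id, kind) data chunks into tasks of at most `chunksize` chunks."""
        # One data chunk = one time series for one id and one kind.
        chunks = list(product(ids, kinds))
        # Consecutive slices of `chunksize` chunks form one worker task each.
        return [chunks[i:i + chunksize] for i in range(0, len(chunks), chunksize)]


    ids = range(1, 89)                                  # 88 ids
    kinds = ["F_x", "F_y", "F_z", "T_x", "T_y", "T_z"]  # 6 kinds
    tasks = group_chunks(ids, kinds, chunksize=10)

    print(len(tasks))      # 528 chunks in tasks of 10 -> 53 tasks
    print(len(tasks[-1]))  # the last task holds the remaining 8 chunks

A smaller chunksize yields more, shorter tasks (better load balancing, more scheduling overhead), while a larger one
yields fewer, longer tasks. Which trade-off is best depends on the data and the machine, so it is worth benchmarking.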

Parallelization of Feature Extraction
-------------------------------------

For the feature extraction tsfresh exposes the parameters
`n_jobs` and `chunksize`. Both behave similarly to the parameters
for the feature selection.

To do performance studies and profiling, it is sometimes useful to turn off parallelization. This can be
done by setting the parameter `n_jobs` to 0.

Parallelization beyond a single machine
---------------------------------------

The high volume of time series data can demand an analysis at scale.
So, time series need to be processed on a group of computational units instead of a single machine.

Accordingly, it may be necessary to distribute the extraction of time series features to a cluster.
It is possible to extract features with *tsfresh* in a distributed fashion.
In the following paragraphs we discuss how to set up a distributed *tsfresh*.

To distribute the calculation of features, we use a certain object, the Distributor class (located in the
:mod:`tsfresh.utilities.distribution` module).

Essentially, a Distributor organizes the application of feature calculators to data chunks.
It maps the feature calculators to the data chunks and then reduces them, meaning that it combines the results of the
individual mappings into one object, the feature matrix.

So, Distributor will, in the following order,

    1. calculate an optimal :python:`chunk_size`, based on the characteristics of the time series data
       (by :func:`~tsfresh.utilities.distribution.DistributorBaseClass.calculate_best_chunk_size`)

    2. split the time series data into chunks
       (by :func:`~tsfresh.utilities.distribution.DistributorBaseClass.partition`)

    3. distribute the application of the feature calculators to the data chunks
       (by :func:`~tsfresh.utilities.distribution.DistributorBaseClass.distribute`)

    4. combine the results into the feature matrix
       (by :func:`~tsfresh.utilities.distribution.DistributorBaseClass.map_reduce`)

    5. close all connections, shutdown all resources and clean everything
       (by :func:`~tsfresh.utilities.distribution.DistributorBaseClass.close`)
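
The steps above can be sketched with a small, self-contained toy class.
``ToyDistributor`` is a hypothetical stand-in written for this illustration: it borrows the method names from the list
above, but it runs everything serially, its chunk size heuristic is an assumption, and it is not the implementation
found in :mod:`tsfresh.utilities.distribution`.

.. code:: python

    from math import ceil


    class ToyDistributor:
        """Hypothetical illustration only; not tsfresh's DistributorBaseClass."""

        def __init__(self, n_workers):
            self.n_workers = n_workers
            self.closed = False

        def calculate_best_chunk_size(self, data_length):
            # Assumed heuristic: aim for roughly five tasks per worker.
            return max(1, ceil(data_length / (self.n_workers * 5)))

        def partition(self, data, chunk_size):
            # Split the list of data chunks into consecutive slices.
            return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

        def distribute(self, func, partitioned_chunks):
            # A real distributor would hand each slice to a worker; here it runs serially.
            return [[func(chunk) for chunk in part] for part in partitioned_chunks]

        def map_reduce(self, func, data, chunk_size=None):
            if chunk_size is None:
                chunk_size = self.calculate_best_chunk_size(len(data))
            parts = self.partition(data, chunk_size)
            results = self.distribute(func, parts)
            # Reduce step: flatten the per-task results into one list.
            return [item for part in results for item in part]

        def close(self):
            # Nothing to release in this toy; a real distributor would shut down its workers.
            self.closed = True


    def mean_feature(chunk):
        """Toy feature calculator: the mean of one (id, kind, values) chunk."""
        sample_id, kind, values = chunk
        return (sample_id, f"{kind}__mean", sum(values) / len(values))


    data = [(1, "F_x", [1.0, 2.0, 3.0]), (1, "F_y", [0.0, 4.0]), (2, "F_x", [5.0])]

    distributor = ToyDistributor(n_workers=2)
    features = distributor.map_reduce(mean_feature, data)
    distributor.close()
    print(features)

The point of the sketch is the separation of concerns: only ``distribute`` needs to know where the work runs, so
swapping the execution backend leaves the partitioning and the reduce step untouched.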

So, how can you use the Distributor to extract features with *tsfresh*?
You will have to pass :python:`distributor` as an argument to the :func:`~tsfresh.feature_extraction.extract_features`
method.


The following example shows how to define the MultiprocessingDistributor, which will distribute the calculations to a
local pool of worker processes:

.. code:: python

    from tsfresh.examples.robot_execution_failures import \
        download_robot_execution_failures, \
        load_robot_execution_failures
    from tsfresh.feature_extraction import extract_features
    from tsfresh.utilities.distribution import MultiprocessingDistributor

    # download and load some time series data
    download_robot_execution_failures()
    df, y = load_robot_execution_failures()

    # We construct a Distributor that will spread the calculations
    # over four worker processes on the local machine
    Distributor = MultiprocessingDistributor(n_workers=4,
                                             disable_progressbar=False,
                                             progressbar_title="Feature Extraction")

    # pass the Distributor object to the feature extraction,
    # along with the other parameters
    X = extract_features(timeseries_container=df,
                         column_id='id',
                         column_sort='time',
                         distributor=Distributor)

The following example corresponds to the existing multiprocessing *tsfresh* API, where you just specify the number of
jobs, without the need to construct the Distributor:

.. code:: python

    from tsfresh.examples.robot_execution_failures import \
        download_robot_execution_failures, \
        load_robot_execution_failures
    from tsfresh.feature_extraction import extract_features

    download_robot_execution_failures()
    df, y = load_robot_execution_failures()

    X = extract_features(timeseries_container=df,
                         column_id='id',
                         column_sort='time',
                         n_jobs=4)

Using dask to distribute the calculations
'''''''''''''''''''''''''''''''''''''''''

We provide a Distributor for the `dask framework <https://dask.pydata.org/en/latest/>`_, where
*"Dask is a flexible parallel computing library for analytic computing."*

.. NOTE::
    This part of the documentation only handles parallelizing the computation using
    a dask cluster. The input and output are still pandas objects.
    If you want to use dask's capabilities to scale to data beyond your local
    memory, have a look at :ref:`large-data-label`.

Dask is a great framework to distribute analytic calculations into clusters.
It scales up and down, meaning that you can also use it on a single machine.
The only thing that you will need to run *tsfresh* on a Dask cluster is the IP address and port number of the
`dask-scheduler <http://distributed.readthedocs.io/en/latest/setup.html>`_.

Let's say that your dask scheduler is running at ``192.168.0.1:8786``, then we can construct a
:class:`~tsfresh.utilities.distribution.ClusterDaskDistributor` that connects to the scheduler and distributes the
time series data and the calculation to a cluster:

.. code:: python

    from tsfresh.examples.robot_execution_failures import \
        download_robot_execution_failures, \
        load_robot_execution_failures
    from tsfresh.feature_extraction import extract_features
    from tsfresh.utilities.distribution import ClusterDaskDistributor

    download_robot_execution_failures()
    df, y = load_robot_execution_failures()

    Distributor = ClusterDaskDistributor(address="192.168.0.1:8786")

    X = extract_features(timeseries_container=df,
                         column_id='id',
                         column_sort='time',
                         distributor=Distributor)

Compared to the :class:`~tsfresh.utilities.distribution.MultiprocessingDistributor` example from above, we only had to
change one line to switch from one machine to a whole cluster.
It is as easy as that.
By changing the Distributor you can easily deploy your application to run on a cluster instead of your workstation.

You can also use a local Dask cluster on your machine to emulate a Dask network.
The following example shows how to set up a :class:`~tsfresh.utilities.distribution.LocalDaskDistributor` on a local cluster
of 3 workers:

.. code:: python

    from tsfresh.examples.robot_execution_failures import \
        download_robot_execution_failures, \
        load_robot_execution_failures
    from tsfresh.feature_extraction import extract_features
    from tsfresh.utilities.distribution import LocalDaskDistributor

    download_robot_execution_failures()
    df, y = load_robot_execution_failures()

    Distributor = LocalDaskDistributor(n_workers=3)

    X = extract_features(timeseries_container=df,
                         column_id='id',
                         column_sort='time',
                         distributor=Distributor)

Writing your own distributor
''''''''''''''''''''''''''''

If you want to use another framework instead of Dask, you will have to write your own Distributor.
To construct your custom Distributor, you need to define an object that inherits from the abstract base class
:class:`tsfresh.utilities.distribution.DistributorBaseClass`.
The :mod:`tsfresh.utilities.distribution` module contains more information about what you need to implement.

Notes for efficient parallelization
'''''''''''''''''''''''''''''''''''

By default tsfresh uses parallelization to distribute the single-threaded python code to the multiple cores available on the host machine.

However, this can create an issue known as over-provisioning. Many of the underlying Python libraries (e.g. NumPy) used in the feature calculators have C code implementations for their low-level processing. Those `also` try to spread their workload across as many cores as are available, which conflicts with the parallelization done by tsfresh.

Over-provisioning is inefficient because of the overheads of repeated context switching.

This issue can be solved by constraining the C libraries to single threads, using the following environment variables:

.. code:: python

    import os
    os.environ['OMP_NUM_THREADS'] = "1"
    os.environ['MKL_NUM_THREADS'] = "1"
    os.environ['OPENBLAS_NUM_THREADS'] = "1"

Put these lines at the beginning of your notebook/python script - before you call any tsfresh code or import any other module.

The more cores your host computer has, the more improvement in processing speed will be gained by implementing these environment changes. Speed increases of between 6x and 26x have been observed depending on the type of the host machine.

.. NOTE::
    If you intend to run machine learning pipelines after feature extraction, it is highly recommended to revert these changes.
    These settings were tested on a GPU-accelerated XGBoost classifier training task and showed a 9x speed reduction.
    Many popular machine learning libraries like scikit-learn and TensorFlow/PyTorch rely on parallelization across your CPU cores to speed up CPU-dependent low-level computations.
    Forcing those libraries to use only a single thread creates a bottleneck for downstream tasks, even if you are accelerating them with GPUs.
    For example, the CPU computation of the loss function after every PyTorch neural network forward pass will run on only one thread, while the GPU idles and waits for its completion before starting its backpropagation tasks.
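
One way to keep the single-thread settings confined to the feature extraction is to set the variables temporarily and
restore the previous values afterwards. The following is a minimal sketch using only the standard library.
The name ``single_threaded_env`` is hypothetical and not part of tsfresh.
Note the assumption involved: many numerical libraries read these variables only once, when they are first loaded, so
restoring them is only guaranteed to affect libraries imported or processes started afterwards.

.. code:: python

    import os
    from contextlib import contextmanager

    THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


    @contextmanager
    def single_threaded_env(variables=THREAD_VARS):
        """Set the thread-limiting variables to "1" and restore the old values on exit."""
        # Remember the previous values (None means the variable was unset).
        previous = {name: os.environ.get(name) for name in variables}
        os.environ.update({name: "1" for name in variables})
        try:
            yield
        finally:
            for name, value in previous.items():
                if value is None:
                    os.environ.pop(name, None)
                else:
                    os.environ[name] = value


    with single_threaded_env():
        # Import numerical libraries and run the tsfresh feature extraction here.
        inside = os.environ["OMP_NUM_THREADS"]

    print(inside)  # "1" while the block was active

If the downstream training runs in the same Python process and the numerical libraries are already loaded, it may be
more reliable to run the feature extraction and the training in separate processes or scripts.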


================================================
FILE: notebooks/01 Feature Extraction and Selection.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Feature Extraction and Selection"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This basic example shows how to use [tsfresh](https://tsfresh.readthedocs.io/) to extract useful features from multiple timeseries and use them to improve classification performance.\n",
    "\n",
    "We use the robot execution failure data set as an example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "\n",
    "import matplotlib.pylab as plt\n",
    "\n",
    "from tsfresh import extract_features, extract_relevant_features, select_features\n",
    "from tsfresh.utilities.dataframe_functions import impute\n",
    "from tsfresh.feature_extraction import ComprehensiveFCParameters\n",
    "\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.metrics import classification_report"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load and visualize data\n",
    "\n",
    "The data set documents 88 robot executions (each has a unique `id` between 1 and 88), which is a subset of the [Robot Execution Failures Data Set](https://archive.ics.uci.edu/ml/datasets/Robot+Execution+Failures). \n",
    "For the purpose of simplicity we are only differentiating between successful and failed executions (`y`).\n",
    "\n",
    "For each execution 15 force (F) and torque (T) samples are given, which were measured at regular time intervals for the spatial dimensions x, y, and z. \n",
    "Therefore each row of the data frame references a specific execution (`id`), a time index (`time`) and documents the respective measurements of 6 sensors (`F_x`, `F_y`, `F_z`, `T_x`, `T_y`, `T_z`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tsfresh.examples import robot_execution_failures\n",
    "\n",
    "robot_execution_failures.download_robot_execution_failures()\n",
    "df, y = robot_execution_failures.load_robot_execution_failures()\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's draw some example executions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df[df.id == 3][['time', 'F_x', 'F_y', 'F_z', 'T_x', 'T_y', 'T_z']].plot(x='time', title='Success example (id 3)', figsize=(12, 6));\n",
    "df[df.id == 20][['time', 'F_x', 'F_y', 'F_z', 'T_x', 'T_y', 'T_z']].plot(x='time', title='Failure example (id 20)', figsize=(12, 6));"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Extract Features"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can use the data to extract time series features using `tsfresh`.\n",
    "We want to extract features for each time series, that means for each robot execution (which is our `id`) and for each of the measured sensor values (`F_*` and `T_*`).\n",
    "\n",
    "You can think of it like this: tsfresh will produce a single row for each `id` and will calculate the features for each column (we call them \"kind\") separately.\n",
    "\n",
    "The `time` column is our sorting column.\n",
    "For an overview on the data formats of `tsfresh`, please have a look at [the documentation](https://tsfresh.readthedocs.io/en/latest/text/data_formats.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# We are very explicit here and specify the `default_fc_parameters`. If you remove this argument,\n",
    "# the ComprehensiveFCParameters (= all feature calculators) will also be used as default.\n",
    "# Have a look into the documentation (https://tsfresh.readthedocs.io/en/latest/text/feature_extraction_settings.html)\n",
    "# or one of the other notebooks to learn more about this.\n",
    "extraction_settings = ComprehensiveFCParameters()\n",
    "\n",
    "X = extract_features(df, column_id='id', column_sort='time',\n",
    "                     default_fc_parameters=extraction_settings,\n",
    "                     # we impute = remove all NaN features automatically\n",
    "                     impute_function=impute)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`X` now contains for each robot execution (= `id`) a single row, with all the features `tsfresh` calculated based on the measured time series values for this `id`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "\n",
    "Currently, 4674 non-NaN features are calculated. \n",
    "This number varies with the version of `tsfresh` and with your data.\n",
    "    \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Select Features"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using the hypothesis tests implemented in `tsfresh` (see [here](https://tsfresh.readthedocs.io/en/latest/text/feature_filtering.html) for more information) it is now possible to select only the relevant features out of this large dataset.\n",
    "\n",
    "`tsfresh` will do a hypothesis test for each of the features to check, if it is relevant for your given target.\n",
    "\n",
    "To not leak information between the train and the test set, we will only perform the selection on the train set"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_full_train, X_full_test, y_train, y_test = train_test_split(X, y, test_size=.4, random_state=42)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_filtered_train = select_features(X_full_train, y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "X_filtered_train.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "\n",
    "Currently, 423 non-NaN features survive the feature selection given this target.\n",
    "Again, this number will vary depending on your data, your target and the `tsfresh` version.\n",
    "    \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## Train and evaluate classifier"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's train a decision tree on the filtered as well as the full set of extracted features."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_filtered_train, X_filtered_test = X_full_train[X_filtered_train.columns], X_full_test[X_filtered_train.columns]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "classifier_full = DecisionTreeClassifier()\n",
    "classifier_full.fit(X_full_train, y_train)\n",
    "print(classification_report(y_test, classifier_full.predict(X_full_test)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "classifier_filtered = DecisionTreeClassifier()\n",
    "classifier_filtered.fit(X_filtered_train, y_train)\n",
    "print(classification_report(y_test, classifier_filtered.predict(X_filtered_test)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compared to using all features (`classifier_full`), using only the relevant features (`classifier_filtered`) achieves similar or better classification performance with much less data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "\n",
    "Please remember that the hypothesis test in `tsfresh` is a statistical test.\n",
    "You might get better performance with other feature selection methods (e.g. training a classifier with\n",
    "all but one feature to find its importance) - but in general the feature selection implemented\n",
    "in `tsfresh` will give you a very reasonable set of selected features.\n",
    "    \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Extraction and Filtering is the same as filtered Extraction"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Above, we performed the feature extraction and selection independently. \n",
    "If you are only interested in the list of selected features, you can run this in one step:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "extract_relevant_features(df, y, column_id='id', column_sort='time',\n",
    "                          default_fc_parameters=extraction_settings)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}


================================================
FILE: notebooks/02 sklearn Pipeline.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Feature Selection in a sklearn pipeline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook is quite similar to [the first example](./01%20Feature%20Extraction%20and%20Selection.ipynb).\n",
    "This time however, we use the `sklearn` pipeline API of `tsfresh`.\n",
    "If you want to learn more, have a look at [the documentation](https://tsfresh.readthedocs.io/en/latest/text/sklearn_transformers.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from sklearn.pipeline import Pipeline\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.ensemble import RandomForestClassifier\n",
    "from sklearn.metrics import classification_report\n",
    "\n",
    "from tsfresh.examples import load_robot_execution_failures\n",
    "from tsfresh.transformers import RelevantFeatureAugmenter\n",
    "from tsfresh.utilities.dataframe_functions import impute"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load and Prepare the Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check out the first example notebook to learn more about the data and format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tsfresh.examples.robot_execution_failures import download_robot_execution_failures\n",
    "download_robot_execution_failures() \n",
    "df_ts, y = load_robot_execution_failures()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We want to use the extracted features to predict for each of the robot executions, if it was a failure or not.\n",
    "Therefore our basic \"entity\" is a single robot execution given by a distinct `id`.\n",
    "\n",
    "A dataframe with these identifiers as index needs to be prepared for the pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = pd.DataFrame(index=y.index)\n",
    "\n",
    "# Split data into train and test set\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Build the pipeline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We build a sklearn pipeline that consists of a feature extraction step (`RelevantFeatureAugmenter`) with a subsequent `RandomForestClassifier`.\n",
    "\n",
    "The `RelevantFeatureAugmenter` takes roughly the same arguments as `extract_features` and `select_features` do."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ppl = Pipeline([\n",
    "        ('augmenter', RelevantFeatureAugmenter(column_id='id', column_sort='time')),\n",
    "        ('classifier', RandomForestClassifier())\n",
    "      ])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-warning\">\n",
    "    \n",
    "Here comes the tricky part!\n",
    "    \n",
    "The input to the pipeline will be our dataframe `X`, with one row per identifier.\n",
    "It is currently empty.\n",
    "But which time series data should the `RelevantFeatureAugmenter` use to actually extract the features from?\n",
    "\n",
    "We need to pass the time series data (stored in `df_ts`) to the transformer.\n",
    "    \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this case, df_ts contains the time series of both train and test set, if you have different dataframes for \n",
    "train and test set, you have to call set_params two times \n",
    "(see further below on how to deal with two independent data sets)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ppl.set_params(augmenter__timeseries_container=df_ts);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We are now ready to fit the pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ppl.fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The augmenter has used the input time series data to extract time series features for each of the identifiers in the `X_train` and selected only the relevant ones using the passed `y_train` as target.\n",
    "These features have been added to `X_train` as new columns.\n",
    "The classifier can now use these features during training."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prediction"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "During inference, the augmenter only extracts those features that it has found as being relevant in the training phase. The classifier predicts the target using these features."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_pred = ppl.predict(X_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So, finally we inspect the performance:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(classification_report(y_test, y_pred))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also find out which columns the augmenter has selected"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ppl.named_steps[\"augmenter\"].feature_selector.relevant_features"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "In this example we passed in an empty (except the index) `X_train` or `X_test` into the pipeline.\n",
    "However, you can also fill the input with other features you have (e.g. features extracted from the metadata)\n",
    "or even use other pipeline components before.\n",
    "    \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Separating the time series data containers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the example above we passed in a single `df_ts` into the `RelevantFeatureAugmenter`, which was used both for training and predicting.\n",
    "During training, only the data with the `id`s from `X_train` were extracted. The rest of the data are extracted during prediction.\n",
    "\n",
    "However, it is perfectly fine to call `set_params` twice: once before training and once before prediction. \n",
    "This can be handy if you for example dump the trained pipeline to disk and re-use it only later for prediction.\n",
    "You only need to make sure that the `id`s of the entities you use during training/prediction are actually present in the passed time series data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_ts_train = df_ts[df_ts[\"id\"].isin(y_train.index)]\n",
    "df_ts_test = df_ts[df_ts[\"id\"].isin(y_test.index)]"
   ]
  },
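  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before fitting or predicting, it can be worth checking explicitly that no `id` is missing from the container.\n",
    "The cell below is a minimal sketch on toy data: `missing_ids`, `toy_index` and `toy_ts` are hypothetical names introduced only for this illustration and are not part of `tsfresh`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "\n",
    "def missing_ids(index, timeseries_container, column_id=\"id\"):\n",
    "    # Hypothetical helper: ids in `index` without any rows in the container\n",
    "    return sorted(set(index) - set(timeseries_container[column_id]))\n",
    "\n",
    "\n",
    "# Toy stand-ins for `X_test.index` and `df_ts_test`\n",
    "toy_index = pd.Index([1, 2, 3])\n",
    "toy_ts = pd.DataFrame({\"id\": [1, 1, 2, 2], \"value\": [0.1, 0.2, 0.3, 0.4]})\n",
    "\n",
    "# id 3 has no time series rows, so it is reported as missing\n",
    "missing_ids(toy_index, toy_ts)"
   ]
  },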
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ppl.set_params(augmenter__timeseries_container=df_ts_train);\n",
    "ppl.fit(X_train, y_train);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pickle\n",
    "with open(\"pipeline.pkl\", \"wb\") as f:\n",
    "    pickle.dump(ppl, f)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Later: load the fitted model and do predictions on new, unseen data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pickle\n",
    "with open(\"pipeline.pkl\", \"rb\") as f:\n",
    "    ppk = pickle.load(f)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ppl.set_params(augmenter__timeseries_container=df_ts_test);\n",
    "y_pred = ppl.predict(X_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(classification_report(y_test, y_pred))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}


================================================
FILE: notebooks/03 Feature Extraction Settings.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Feature Calculator Settings"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By default, all feature calculators are used when you call `extract_features`.\n",
    "There could be multiple reasons why you do not want that:\n",
    "* you are only interested on a certain feature (or features)\n",
    "* you want to save time during extraction\n",
    "* you have ran the feature selection before and already know, which features are relevant\n",
    "\n",
    "For more information on these settings, please have a look into [the documentation](http://tsfresh.readthedocs.io/en/latest/text/feature_extraction_settings.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tsfresh.feature_extraction import extract_features\n",
    "from tsfresh.feature_extraction import settings\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Construct a time series container"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For testing, we construct the time series container that includes two sensor time series, \"temperature\" and \"pressure\", for two devices \"a\" and \"b\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.DataFrame({\"id\": [\"a\", \"a\", \"b\", \"b\"], \"temperature\": [1,2,3,1], \"pressure\": [-1, 2, -1, 7]})\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The `default_fc_parameters`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Which features are calculated by `tsfresh` is controlled by a dictionary that contains a mapping from feature calculator names to their parameters. \n",
    "This dictionary is called `fc_parameters`. \n",
    "It maps feature calculator names (= keys) to parameters (= values). \n",
    "Every key in the dictionary will be looked up as a function in `tsfresh.feature_extraction.feature_calculators` and be used to extract features.\n",
    "\n",
    "`tsfresh` comes with some predefined sets of `fc_parameters` dictionaries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "settings.ComprehensiveFCParameters, settings.EfficientFCParameters, settings.MinimalFCParameters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For example, to only calculate a very minimal set of features:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "settings_minimal = settings.MinimalFCParameters() \n",
    "settings_minimal"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each key stands for one of the feature calculators. \n",
    "The value are the parameters. If a feature calculator has no parameters, `None` is used as a value (and as these feature calculators are very simple, they all have no parameters)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This dictionary can passed to the extract method, resulting in a few basic time series beeing calculated:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_tsfresh = extract_features(df, column_id=\"id\", default_fc_parameters=settings_minimal)\n",
    "X_tsfresh.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By using the settings_minimal as value of the default_fc_parameters parameter, those settings are used for all type of time series. \n",
    "In this case, the `settings_minimal` dictionary is used for both \"temperature\" and \"pressure\" time series.\n",
    "\n",
    "Please note how the columns in the resulting dataframe depend both on the settings as well as the kinds of the data."
   ]
  },
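  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The column names follow the pattern `<kind>__<feature name>`, with any parameters appended for parameterized calculators.\n",
    "As a small sketch of how to read them, the cell below splits a few hard-coded, illustrative column names back into kind and feature; the list `example_columns` is typed in by hand rather than taken from an extraction run."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative column names in the `<kind>__<feature name>` scheme (hard-coded, not computed)\n",
    "example_columns = [\"temperature__mean\", \"temperature__length\", \"pressure__mean\", \"pressure__length\"]\n",
    "\n",
    "# Group the feature names by the kind of time series they were calculated on\n",
    "features_per_kind = {}\n",
    "for column in example_columns:\n",
    "    kind, feature = column.split(\"__\", 1)\n",
    "    features_per_kind.setdefault(kind, []).append(feature)\n",
    "\n",
    "features_per_kind"
   ]
  },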
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, lets say we want to remove the length feature and prevent it from beeing calculated. We just delete it from the dictionary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "del settings_minimal[\"length\"]\n",
    "settings_minimal"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, if we extract features for this reduced dictionary, the length feature will not be calculated"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_tsfresh = extract_features(df, column_id=\"id\", default_fc_parameters=settings_minimal)\n",
    "X_tsfresh.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The `kind_to_fc_parameters`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, lets say we do not want to calculate the same features for both type of time series. Instead there should be different sets of features for each kind.\n",
    "\n",
    "To do that, we can use the `kind_to_fc_parameters` parameter, which lets us specifiy which `fc_parameters` we want to use for which kind of time series:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fc_parameters_pressure = {\"length\": None, \n",
    "                          \"sum_values\": None}\n",
    "\n",
    "fc_parameters_temperature = {\"maximum\": None, \n",
    "                             \"minimum\": None}\n",
    "\n",
    "kind_to_fc_parameters = {\n",
    "    \"temperature\": fc_parameters_temperature,\n",
    "    \"pressure\": fc_parameters_pressure\n",
    "}\n",
    "\n",
    "print(kind_to_fc_parameters)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So, in this case, for sensor \"pressure\" both \"max\" and \"min\" are calculated. \n",
    "For the \"temperature\" signal, the length and sum\\_values features are extracted instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_tsfresh = extract_features(df, column_id=\"id\", kind_to_fc_parameters=kind_to_fc_parameters)\n",
    "X_tsfresh.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Extracting from data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After applying a feature selection algorithm to drop irrelevant feature columns you know which features are relevant and which are not.\n",
    "You can also use this information to only extract these relevant features in the first place.\n",
    "\n",
    "The provided `from_columns` method can be used to infer a settings dictionary from the dataframe containing the features.\n",
    "This dictionary can then for example be stored and be used in the next feature extraction."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Assuming `X_tsfresh` contains only our relevant features\n",
    "relevant_settings = settings.from_columns(X_tsfresh)\n",
    "relevant_settings"
   ]
  },
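  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of that round trip, the cell below extracts a small feature set, infers the settings from the resulting columns with `from_columns`, and passes them back through `kind_to_fc_parameters`.\n",
    "The toy frame `df_demo`, the chosen calculators and the variable names are only illustrative; `n_jobs=0` switches off multiprocessing for this tiny example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from tsfresh.feature_extraction import extract_features, settings\n",
    "\n",
    "df_demo = pd.DataFrame({\"id\": [\"a\", \"a\", \"b\", \"b\"], \"temperature\": [1, 2, 3, 1], \"pressure\": [-1, 2, -1, 7]})\n",
    "\n",
    "# First pass: pretend these two features are the ones that survived a selection step\n",
    "first_pass = extract_features(\n",
    "    df_demo, column_id=\"id\",\n",
    "    kind_to_fc_parameters={\"temperature\": {\"maximum\": None}, \"pressure\": {\"sum_values\": None}},\n",
    "    n_jobs=0, disable_progressbar=True,\n",
    ")\n",
    "\n",
    "# Infer the per-kind settings from the column names ...\n",
    "inferred_settings = settings.from_columns(first_pass)\n",
    "\n",
    "# ... and reuse them so that only these features are calculated again\n",
    "second_pass = extract_features(\n",
    "    df_demo, column_id=\"id\", kind_to_fc_parameters=inferred_settings,\n",
    "    n_jobs=0, disable_progressbar=True,\n",
    ")\n",
    "sorted(second_pass.columns)"
   ]
  },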
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## More complex dictionaries"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We provide `fc_parameters` dictionaries with larger sets of features.\n",
    "\n",
    "The `EfficientFCParameters` contain features and parameters that should be calculated quite fast:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "settings_efficient = settings.EfficientFCParameters()\n",
    "settings_efficient"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `ComprehensiveFCParameters` are the biggest set of features. It will take the longest to calculate"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "settings_comprehensive = settings.ComprehensiveFCParameters()\n",
    "settings_comprehensive"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Feature Calculator Parameters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "More complex feature calculators have parameters that you can use to tune the extracted features.\n",
    "The predefined settings (such as `ComprehensiveFCParameters`) already contain default values of these features.\n",
    "\n",
    "However for your own projects, you might want/need to tune them.\n",
    "\n",
    "In detail, the values in a `fc_parameters` dictionary contain a list of parameter dictionaries. \n",
    "When calculating the feature, each entry in the list of parameters will be used to calculate one feature.\n",
    "\n",
    "For example, lets have a look into the feature `large_standard_deviation`, which depends on a single parameter called `r` (it basically defines how large \"large\" is).\n",
    "The `ComprehensiveFCParameters` contains several default values for `r`. \n",
    "Each of them will be used to calculate a single feature:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "settings_comprehensive['large_standard_deviation']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you use these settings in feature extraction, that would trigger the calculation of 20 different `large_standard_deviation` features, one for `r=0.05` up to `r=0.95`.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "settings_tmp = {'large_standard_deviation': settings_comprehensive['large_standard_deviation']}\n",
    "\n",
    "X_tsfresh = extract_features(df, column_id=\"id\", default_fc_parameters=settings_tmp)\n",
    "X_tsfresh.columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you now want to change the parameters for a specific feature calculator, all you need to do is to change the dictionary values."
   ]
  }
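  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch, the following hand-written dictionary restricts `large_standard_deviation` to two values of `r`.\n",
    "The values 0.2 and 0.4 and the names `df_demo`, `custom_settings` and `X_custom` are chosen only for illustration; `n_jobs=0` switches off multiprocessing for this tiny example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from tsfresh.feature_extraction import extract_features\n",
    "\n",
    "df_demo = pd.DataFrame({\"id\": [\"a\", \"a\", \"b\", \"b\"], \"temperature\": [1, 2, 3, 1], \"pressure\": [-1, 2, -1, 7]})\n",
    "\n",
    "# One parameter dictionary per feature to calculate: two values of r give two features per kind\n",
    "custom_settings = {\"large_standard_deviation\": [{\"r\": 0.2}, {\"r\": 0.4}]}\n",
    "\n",
    "X_custom = extract_features(\n",
    "    df_demo, column_id=\"id\", default_fc_parameters=custom_settings,\n",
    "    n_jobs=0, disable_progressbar=True,\n",
    ")\n",
    "X_custom.columns"
   ]
  }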
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: notebooks/04 Multiclass Selection Example.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Multiclass Example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This example show shows how to use `tsfresh` to extract and select useful features from timeseries in a multiclass classification example. \n",
    "The underlying control of the false discovery rate (FDR) has been introduced by [Tang et al. (2020, Sec. 3.2)](https://doi.org/10.1140/epjds/s13688-020-00244-9).\n",
    "\n",
    "We use an example dataset of human activity recognition for this.\n",
    "The dataset consists of timeseries for 7352 accelerometer readings. \n",
    "Each reading represents an accelerometer reading for 2.56 sec at 50hz (for a total of 128 samples per reading). Furthermore, each reading corresponds one of six activities (walking, walking upstairs, walking downstairs, sitting, standing and laying).\n",
    "\n",
    "For more information go to https://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones\n",
    "\n",
    "This notebook follows the example in [the first notebook](./01%20Feature%20Extraction%20and%20Selection.ipynb), so we will go quickly over the extraction and focus on the more interesting feature selection in this case."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import matplotlib.pylab as plt\n",
    "\n",
    "from tsfresh import extract_features, extract_relevant_features, select_features\n",
    "from tsfresh.utilities.dataframe_functions import impute\n",
    "\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.metrics import classification_report\n",
    "\n",
    "import pandas as pd\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load and visualize data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tsfresh.examples.har_dataset import download_har_dataset, load_har_dataset, load_har_classes\n",
    "\n",
    "# fetch dataset from uci\n",
    "download_har_dataset()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": "        0         1         2         3         4         5         6    \\\n0  0.000181  0.010139  0.009276  0.005066  0.010810  0.004045  0.004757   \n1  0.001094  0.004550  0.002879  0.002247  0.003305  0.002416  0.001619   \n2  0.003531  0.002285 -0.000420 -0.003738 -0.006706 -0.003148  0.000733   \n3 -0.001772 -0.001311  0.000388  0.000408 -0.000355  0.000998  0.001109   \n4  0.000087 -0.000272  0.001022  0.003126  0.002284  0.000885  0.001933   \n\n        7         8         9    ...       118       119       120       121  \\\n0  0.006214  0.003307  0.007572  ...  0.001412 -0.001509  0.000060  0.000435   \n1  0.000981  0.000009 -0.000363  ... -0.000104 -0.000141  0.001333  0.001541   \n2  0.000668  0.002162 -0.000946  ...  0.000661  0.001853 -0.000268 -0.000394   \n3 -0.003149 -0.008882 -0.010483  ...  0.000458  0.002103  0.001358  0.000820   \n4  0.002270  0.002247  0.002175  ...  0.002529  0.003518 -0.000248 -0.002761   \n\n        122       123       124       125       126       127  \n0 -0.000819  0.000228 -0.000300 -0.001147 -0.000222  0.001576  \n1  0.001077 -0.000736 -0.003767 -0.004646 -0.002941 -0.001599  \n2 -0.000279 -0.000316  0.000144  0.001246  0.003117  0.002178  \n3 -0.000212 -0.001915 -0.001631 -0.000867 -0.001172 -0.000028  \n4  0.000252  0.003752  0.001626 -0.000698 -0.001223 -0.003328  \n\n[5 rows x 128 columns]",
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>0</th>\n      <th>1</th>\n      <th>2</th>\n      <th>3</th>\n      <th>4</th>\n      <th>5</th>\n      <th>6</th>\n      <th>7</th>\n      <th>8</th>\n      <th>9</th>\n      <th>...</th>\n      <th>118</th>\n      <th>119</th>\n      <th>120</th>\n      <th>121</th>\n      <th>122</th>\n      <th>123</th>\n      <th>124</th>\n      <th>125</th>\n      <th>126</th>\n      <th>127</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>0.000181</td>\n      <td>0.010139</td>\n      <td>0.009276</td>\n      <td>0.005066</td>\n      <td>0.010810</td>\n      <td>0.004045</td>\n      <td>0.004757</td>\n      <td>0.006214</td>\n      <td>0.003307</td>\n      <td>0.007572</td>\n      <td>...</td>\n      <td>0.001412</td>\n      <td>-0.001509</td>\n      <td>0.000060</td>\n      <td>0.000435</td>\n      <td>-0.000819</td>\n      <td>0.000228</td>\n      <td>-0.000300</td>\n      <td>-0.001147</td>\n      <td>-0.000222</td>\n      <td>0.001576</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>0.001094</td>\n      <td>0.004550</td>\n      <td>0.002879</td>\n      <td>0.002247</td>\n      <td>0.003305</td>\n      <td>0.002416</td>\n      <td>0.001619</td>\n      <td>0.000981</td>\n      <td>0.000009</td>\n      <td>-0.000363</td>\n      <td>...</td>\n      <td>-0.000104</td>\n      <td>-0.000141</td>\n      <td>0.001333</td>\n      <td>0.001541</td>\n      <td>0.001077</td>\n      <td>-0.000736</td>\n      <td>-0.003767</td>\n      <td>-0.004646</td>\n      <td>-0.002941</td>\n      <td>-0.001599</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>0.003531</td>\n    
  <td>0.002285</td>\n      <td>-0.000420</td>\n      <td>-0.003738</td>\n      <td>-0.006706</td>\n      <td>-0.003148</td>\n      <td>0.000733</td>\n      <td>0.000668</td>\n      <td>0.002162</td>\n      <td>-0.000946</td>\n      <td>...</td>\n      <td>0.000661</td>\n      <td>0.001853</td>\n      <td>-0.000268</td>\n      <td>-0.000394</td>\n      <td>-0.000279</td>\n      <td>-0.000316</td>\n      <td>0.000144</td>\n      <td>0.001246</td>\n      <td>0.003117</td>\n      <td>0.002178</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>-0.001772</td>\n      <td>-0.001311</td>\n      <td>0.000388</td>\n      <td>0.000408</td>\n      <td>-0.000355</td>\n      <td>0.000998</td>\n      <td>0.001109</td>\n      <td>-0.003149</td>\n      <td>-0.008882</td>\n      <td>-0.010483</td>\n      <td>...</td>\n      <td>0.000458</td>\n      <td>0.002103</td>\n      <td>0.001358</td>\n      <td>0.000820</td>\n      <td>-0.000212</td>\n      <td>-0.001915</td>\n      <td>-0.001631</td>\n      <td>-0.000867</td>\n      <td>-0.001172</td>\n      <td>-0.000028</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>0.000087</td>\n      <td>-0.000272</td>\n      <td>0.001022</td>\n      <td>0.003126</td>\n      <td>0.002284</td>\n      <td>0.000885</td>\n      <td>0.001933</td>\n      <td>0.002270</td>\n      <td>0.002247</td>\n      <td>0.002175</td>\n      <td>...</td>\n      <td>0.002529</td>\n      <td>0.003518</td>\n      <td>-0.000248</td>\n      <td>-0.002761</td>\n      <td>0.000252</td>\n      <td>0.003752</td>\n      <td>0.001626</td>\n      <td>-0.000698</td>\n      <td>-0.001223</td>\n      <td>-0.003328</td>\n    </tr>\n  </tbody>\n</table>\n<p>5 rows × 128 columns</p>\n</div>"
     },
     "metadata": {},
     "execution_count": 3
    }
   ],
   "source": [
    "df = load_har_dataset()\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "y = load_har_classes()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data is not in a typical time series format so far: \n",
    "the columns are the time steps whereas each row is a measurement of a different person.\n",
    "\n",
    "Therefore we bring it to a format where the time series of different persons are identified by an `id` and are order by time vertically."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "df[\"id\"] = df.index\n",
    "df = df.melt(id_vars=\"id\", var_name=\"time\").sort_values([\"id\", \"time\"]).reset_index(drop=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": "   id time     value\n0   0    0  0.000181\n1   0    1  0.010139\n2   0    2  0.009276\n3   0    3  0.005066\n4   0    4  0.010810",
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>id</th>\n      <th>time</th>\n      <th>value</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>0</td>\n      <td>0</td>\n      <td>0.000181</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>0</td>\n      <td>1</td>\n      <td>0.010139</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>0</td>\n      <td>2</td>\n      <td>0.009276</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>0</td>\n      <td>3</td>\n      <td>0.005066</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>0</td>\n      <td>4</td>\n      <td>0.010810</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 6
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "output_type": "display_data",
     "data": {
      "text/plain": "&lt;Figure size 432x288 with 1 Axes&gt;",
      "image/svg+xml": "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n  \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n<!-- Created with matplotlib (https://matplotlib.org/) -->\n<svg height=\"263.63625pt\" version=\"1.1\" viewBox=\"0 0 393.207813 263.63625\" width=\"393.207813pt\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n <metadata>\n  <rdf:RDF xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n   <cc:Work>\n    <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n    <dc:date>2020-10-01T12:21:59.404485</dc:date>\n    <dc:format>image/svg+xml</dc:format>\n    <dc:creator>\n     <cc:Agent>\n      <dc:title>Matplotlib v3.3.2, https://matplotlib.org/</dc:title>\n     </cc:Agent>\n    </dc:creator>\n   </cc:Work>\n  </rdf:RDF>\n </metadata>\n <defs>\n  <style type=\"text/css\">*{stroke-linecap:butt;stroke-linejoin:round;}</style>\n </defs>\n <g id=\"figure_1\">\n  <g id=\"patch_1\">\n   <path d=\"M 0 263.63625 \nL 393.207813 263.63625 \nL 393.207813 0 \nL 0 0 \nz\n\" style=\"fill:none;\"/>\n  </g>\n  <g id=\"axes_1\">\n   <g id=\"patch_2\">\n    <path d=\"M 51.207813 239.758125 \nL 386.007812 239.758125 \nL 386.007812 22.318125 \nL 51.207813 22.318125 \nz\n\" style=\"fill:#ffffff;\"/>\n   </g>\n   <g id=\"matplotlib.axis_1\">\n    <g id=\"xtick_1\">\n     <g id=\"line2d_1\">\n      <defs>\n       <path d=\"M 0 0 \nL 0 3.5 \n\" id=\"m50774fb6b2\" style=\"stroke:#000000;stroke-width:0.8;\"/>\n      </defs>\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"66.425994\" xlink:href=\"#m50774fb6b2\" y=\"239.758125\"/>\n      </g>\n     </g>\n     <g id=\"text_1\">\n      <!-- 0 -->\n      <g transform=\"translate(63.244744 254.356563)scale(0.1 -0.1)\">\n       <defs>\n        <path d=\"M 31.78125 66.40625 \nQ 24.171875 66.40625 
20.328125 58.90625 \nQ 16.5 51.421875 16.5 36.375 \nQ 16.5 21.390625 20.328125 13.890625 \nQ 24.171875 6.390625 31.78125 6.390625 \nQ 39.453125 6.390625 43.28125 13.890625 \nQ 47.125 21.390625 47.125 36.375 \nQ 47.125 51.421875 43.28125 58.90625 \nQ 39.453125 66.40625 31.78125 66.40625 \nz\nM 31.78125 74.21875 \nQ 44.046875 74.21875 50.515625 64.515625 \nQ 56.984375 54.828125 56.984375 36.375 \nQ 56.984375 17.96875 50.515625 8.265625 \nQ 44.046875 -1.421875 31.78125 -1.421875 \nQ 19.53125 -1.421875 13.0625 8.265625 \nQ 6.59375 17.96875 6.59375 36.375 \nQ 6.59375 54.828125 13.0625 64.515625 \nQ 19.53125 74.21875 31.78125 74.21875 \nz\n\" id=\"DejaVuSans-48\"/>\n       </defs>\n       <use xlink:href=\"#DejaVuSans-48\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"xtick_2\">\n     <g id=\"line2d_2\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"114.357276\" xlink:href=\"#m50774fb6b2\" y=\"239.758125\"/>\n      </g>\n     </g>\n     <g id=\"text_2\">\n      <!-- 20 -->\n      <g transform=\"translate(107.994776 254.356563)scale(0.1 -0.1)\">\n       <defs>\n        <path d=\"M 19.1875 8.296875 \nL 53.609375 8.296875 \nL 53.609375 0 \nL 7.328125 0 \nL 7.328125 8.296875 \nQ 12.9375 14.109375 22.625 23.890625 \nQ 32.328125 33.6875 34.8125 36.53125 \nQ 39.546875 41.84375 41.421875 45.53125 \nQ 43.3125 49.21875 43.3125 52.78125 \nQ 43.3125 58.59375 39.234375 62.25 \nQ 35.15625 65.921875 28.609375 65.921875 \nQ 23.96875 65.921875 18.8125 64.3125 \nQ 13.671875 62.703125 7.8125 59.421875 \nL 7.8125 69.390625 \nQ 13.765625 71.78125 18.9375 73 \nQ 24.125 74.21875 28.421875 74.21875 \nQ 39.75 74.21875 46.484375 68.546875 \nQ 53.21875 62.890625 53.21875 53.421875 \nQ 53.21875 48.921875 51.53125 44.890625 \nQ 49.859375 40.875 45.40625 35.40625 \nQ 44.1875 33.984375 37.640625 27.21875 \nQ 31.109375 20.453125 19.1875 8.296875 \nz\n\" id=\"DejaVuSans-50\"/>\n       </defs>\n       <use xlink:href=\"#DejaVuSans-50\"/>\n       <use x=\"63.623047\" 
xlink:href=\"#DejaVuSans-48\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"xtick_3\">\n     <g id=\"line2d_3\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"162.288557\" xlink:href=\"#m50774fb6b2\" y=\"239.758125\"/>\n      </g>\n     </g>\n     <g id=\"text_3\">\n      <!-- 40 -->\n      <g transform=\"translate(155.926057 254.356563)scale(0.1 -0.1)\">\n       <defs>\n        <path d=\"M 37.796875 64.3125 \nL 12.890625 25.390625 \nL 37.796875 25.390625 \nz\nM 35.203125 72.90625 \nL 47.609375 72.90625 \nL 47.609375 25.390625 \nL 58.015625 25.390625 \nL 58.015625 17.1875 \nL 47.609375 17.1875 \nL 47.609375 0 \nL 37.796875 0 \nL 37.796875 17.1875 \nL 4.890625 17.1875 \nL 4.890625 26.703125 \nz\n\" id=\"DejaVuSans-52\"/>\n       </defs>\n       <use xlink:href=\"#DejaVuSans-52\"/>\n       <use x=\"63.623047\" xlink:href=\"#DejaVuSans-48\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"xtick_4\">\n     <g id=\"line2d_4\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"210.219838\" xlink:href=\"#m50774fb6b2\" y=\"239.758125\"/>\n      </g>\n     </g>\n     <g id=\"text_4\">\n      <!-- 60 -->\n      <g transform=\"translate(203.857338 254.356563)scale(0.1 -0.1)\">\n       <defs>\n        <path d=\"M 33.015625 40.375 \nQ 26.375 40.375 22.484375 35.828125 \nQ 18.609375 31.296875 18.609375 23.390625 \nQ 18.609375 15.53125 22.484375 10.953125 \nQ 26.375 6.390625 33.015625 6.390625 \nQ 39.65625 6.390625 43.53125 10.953125 \nQ 47.40625 15.53125 47.40625 23.390625 \nQ 47.40625 31.296875 43.53125 35.828125 \nQ 39.65625 40.375 33.015625 40.375 \nz\nM 52.59375 71.296875 \nL 52.59375 62.3125 \nQ 48.875 64.0625 45.09375 64.984375 \nQ 41.3125 65.921875 37.59375 65.921875 \nQ 27.828125 65.921875 22.671875 59.328125 \nQ 17.53125 52.734375 16.796875 39.40625 \nQ 19.671875 43.65625 24.015625 45.921875 \nQ 28.375 48.1875 33.59375 48.1875 \nQ 44.578125 48.1875 50.953125 41.515625 \nQ 57.328125 34.859375 57.328125 23.390625 \nQ 
57.328125 12.15625 50.6875 5.359375 \nQ 44.046875 -1.421875 33.015625 -1.421875 \nQ 20.359375 -1.421875 13.671875 8.265625 \nQ 6.984375 17.96875 6.984375 36.375 \nQ 6.984375 53.65625 15.1875 63.9375 \nQ 23.390625 74.21875 37.203125 74.21875 \nQ 40.921875 74.21875 44.703125 73.484375 \nQ 48.484375 72.75 52.59375 71.296875 \nz\n\" id=\"DejaVuSans-54\"/>\n       </defs>\n       <use xlink:href=\"#DejaVuSans-54\"/>\n       <use x=\"63.623047\" xlink:href=\"#DejaVuSans-48\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"xtick_5\">\n     <g id=\"line2d_5\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"258.15112\" xlink:href=\"#m50774fb6b2\" y=\"239.758125\"/>\n      </g>\n     </g>\n     <g id=\"text_5\">\n      <!-- 80 -->\n      <g transform=\"translate(251.78862 254.356563)scale(0.1 -0.1)\">\n       <defs>\n        <path d=\"M 31.78125 34.625 \nQ 24.75 34.625 20.71875 30.859375 \nQ 16.703125 27.09375 16.703125 20.515625 \nQ 16.703125 13.921875 20.71875 10.15625 \nQ 24.75 6.390625 31.78125 6.390625 \nQ 38.8125 6.390625 42.859375 10.171875 \nQ 46.921875 13.96875 46.921875 20.515625 \nQ 46.921875 27.09375 42.890625 30.859375 \nQ 38.875 34.625 31.78125 34.625 \nz\nM 21.921875 38.8125 \nQ 15.578125 40.375 12.03125 44.71875 \nQ 8.5 49.078125 8.5 55.328125 \nQ 8.5 64.0625 14.71875 69.140625 \nQ 20.953125 74.21875 31.78125 74.21875 \nQ 42.671875 74.21875 48.875 69.140625 \nQ 55.078125 64.0625 55.078125 55.328125 \nQ 55.078125 49.078125 51.53125 44.71875 \nQ 48 40.375 41.703125 38.8125 \nQ 48.828125 37.15625 52.796875 32.3125 \nQ 56.78125 27.484375 56.78125 20.515625 \nQ 56.78125 9.90625 50.3125 4.234375 \nQ 43.84375 -1.421875 31.78125 -1.421875 \nQ 19.734375 -1.421875 13.25 4.234375 \nQ 6.78125 9.90625 6.78125 20.515625 \nQ 6.78125 27.484375 10.78125 32.3125 \nQ 14.796875 37.15625 21.921875 38.8125 \nz\nM 18.3125 54.390625 \nQ 18.3125 48.734375 21.84375 45.5625 \nQ 25.390625 42.390625 31.78125 42.390625 \nQ 38.140625 42.390625 41.71875 45.5625 \nQ 
45.3125 48.734375 45.3125 54.390625 \nQ 45.3125 60.0625 41.71875 63.234375 \nQ 38.140625 66.40625 31.78125 66.40625 \nQ 25.390625 66.40625 21.84375 63.234375 \nQ 18.3125 60.0625 18.3125 54.390625 \nz\n\" id=\"DejaVuSans-56\"/>\n       </defs>\n       <use xlink:href=\"#DejaVuSans-56\"/>\n       <use x=\"63.623047\" xlink:href=\"#DejaVuSans-48\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"xtick_6\">\n     <g id=\"line2d_6\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"306.082401\" xlink:href=\"#m50774fb6b2\" y=\"239.758125\"/>\n      </g>\n     </g>\n     <g id=\"text_6\">\n      <!-- 100 -->\n      <g transform=\"translate(296.538651 254.356563)scale(0.1 -0.1)\">\n       <defs>\n        <path d=\"M 12.40625 8.296875 \nL 28.515625 8.296875 \nL 28.515625 63.921875 \nL 10.984375 60.40625 \nL 10.984375 69.390625 \nL 28.421875 72.90625 \nL 38.28125 72.90625 \nL 38.28125 8.296875 \nL 54.390625 8.296875 \nL 54.390625 0 \nL 12.40625 0 \nz\n\" id=\"DejaVuSans-49\"/>\n       </defs>\n       <use xlink:href=\"#DejaVuSans-49\"/>\n       <use x=\"63.623047\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"127.246094\" xlink:href=\"#DejaVuSans-48\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"xtick_7\">\n     <g id=\"line2d_7\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"354.013682\" xlink:href=\"#m50774fb6b2\" y=\"239.758125\"/>\n      </g>\n     </g>\n     <g id=\"text_7\">\n      <!-- 120 -->\n      <g transform=\"translate(344.469932 254.356563)scale(0.1 -0.1)\">\n       <use xlink:href=\"#DejaVuSans-49\"/>\n       <use x=\"63.623047\" xlink:href=\"#DejaVuSans-50\"/>\n       <use x=\"127.246094\" xlink:href=\"#DejaVuSans-48\"/>\n      </g>\n     </g>\n    </g>\n   </g>\n   <g id=\"matplotlib.axis_2\">\n    <g id=\"ytick_1\">\n     <g id=\"line2d_8\">\n      <defs>\n       <path d=\"M 0 0 \nL -3.5 0 \n\" id=\"m214a3e4f21\" style=\"stroke:#000000;stroke-width:0.8;\"/>\n      </defs>\n      <g>\n       <use 
style=\"stroke:#000000;stroke-width:0.8;\" x=\"51.207813\" xlink:href=\"#m214a3e4f21\" y=\"226.023598\"/>\n      </g>\n     </g>\n     <g id=\"text_8\">\n      <!-- −0.004 -->\n      <g transform=\"translate(7.2 229.822816)scale(0.1 -0.1)\">\n       <defs>\n        <path d=\"M 10.59375 35.5 \nL 73.1875 35.5 \nL 73.1875 27.203125 \nL 10.59375 27.203125 \nz\n\" id=\"DejaVuSans-8722\"/>\n        <path d=\"M 10.6875 12.40625 \nL 21 12.40625 \nL 21 0 \nL 10.6875 0 \nz\n\" id=\"DejaVuSans-46\"/>\n       </defs>\n       <use xlink:href=\"#DejaVuSans-8722\"/>\n       <use x=\"83.789062\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"147.412109\" xlink:href=\"#DejaVuSans-46\"/>\n       <use x=\"179.199219\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"242.822266\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"306.445312\" xlink:href=\"#DejaVuSans-52\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"ytick_2\">\n     <g id=\"line2d_9\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"51.207813\" xlink:href=\"#m214a3e4f21\" y=\"199.849585\"/>\n      </g>\n     </g>\n     <g id=\"text_9\">\n      <!-- −0.002 -->\n      <g transform=\"translate(7.2 203.648804)scale(0.1 -0.1)\">\n       <use xlink:href=\"#DejaVuSans-8722\"/>\n       <use x=\"83.789062\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"147.412109\" xlink:href=\"#DejaVuSans-46\"/>\n       <use x=\"179.199219\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"242.822266\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"306.445312\" xlink:href=\"#DejaVuSans-50\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"ytick_3\">\n     <g id=\"line2d_10\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"51.207813\" xlink:href=\"#m214a3e4f21\" y=\"173.675572\"/>\n      </g>\n     </g>\n     <g id=\"text_10\">\n      <!-- 0.000 -->\n      <g transform=\"translate(15.579688 177.474791)scale(0.1 -0.1)\">\n       <use xlink:href=\"#DejaVuSans-48\"/>\n       <use 
x=\"63.623047\" xlink:href=\"#DejaVuSans-46\"/>\n       <use x=\"95.410156\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"159.033203\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"222.65625\" xlink:href=\"#DejaVuSans-48\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"ytick_4\">\n     <g id=\"line2d_11\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"51.207813\" xlink:href=\"#m214a3e4f21\" y=\"147.501559\"/>\n      </g>\n     </g>\n     <g id=\"text_11\">\n      <!-- 0.002 -->\n      <g transform=\"translate(15.579688 151.300778)scale(0.1 -0.1)\">\n       <use xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"63.623047\" xlink:href=\"#DejaVuSans-46\"/>\n       <use x=\"95.410156\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"159.033203\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"222.65625\" xlink:href=\"#DejaVuSans-50\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"ytick_5\">\n     <g id=\"line2d_12\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"51.207813\" xlink:href=\"#m214a3e4f21\" y=\"121.327547\"/>\n      </g>\n     </g>\n     <g id=\"text_12\">\n      <!-- 0.004 -->\n      <g transform=\"translate(15.579688 125.126765)scale(0.1 -0.1)\">\n       <use xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"63.623047\" xlink:href=\"#DejaVuSans-46\"/>\n       <use x=\"95.410156\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"159.033203\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"222.65625\" xlink:href=\"#DejaVuSans-52\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"ytick_6\">\n     <g id=\"line2d_13\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"51.207813\" xlink:href=\"#m214a3e4f21\" y=\"95.153534\"/>\n      </g>\n     </g>\n     <g id=\"text_13\">\n      <!-- 0.006 -->\n      <g transform=\"translate(15.579688 98.952753)scale(0.1 -0.1)\">\n       <use xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"63.623047\" xlink:href=\"#DejaVuSans-46\"/>\n       <use 
x=\"95.410156\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"159.033203\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"222.65625\" xlink:href=\"#DejaVuSans-54\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"ytick_7\">\n     <g id=\"line2d_14\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"51.207813\" xlink:href=\"#m214a3e4f21\" y=\"68.979521\"/>\n      </g>\n     </g>\n     <g id=\"text_14\">\n      <!-- 0.008 -->\n      <g transform=\"translate(15.579688 72.77874)scale(0.1 -0.1)\">\n       <use xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"63.623047\" xlink:href=\"#DejaVuSans-46\"/>\n       <use x=\"95.410156\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"159.033203\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"222.65625\" xlink:href=\"#DejaVuSans-56\"/>\n      </g>\n     </g>\n    </g>\n    <g id=\"ytick_8\">\n     <g id=\"line2d_15\">\n      <g>\n       <use style=\"stroke:#000000;stroke-width:0.8;\" x=\"51.207813\" xlink:href=\"#m214a3e4f21\" y=\"42.805508\"/>\n      </g>\n     </g>\n     <g id=\"text_15\">\n      <!-- 0.010 -->\n      <g transform=\"translate(15.579688 46.604727)scale(0.1 -0.1)\">\n       <use xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"63.623047\" xlink:href=\"#DejaVuSans-46\"/>\n       <use x=\"95.410156\" xlink:href=\"#DejaVuSans-48\"/>\n       <use x=\"159.033203\" xlink:href=\"#DejaVuSans-49\"/>\n       <use x=\"222.65625\" xlink:href=\"#DejaVuSans-48\"/>\n      </g>\n     </g>\n    </g>\n   </g>\n   <g id=\"line2d_16\">\n    <path clip-path=\"url(#pe80a42cc9d)\" d=\"M 66.425994 171.308767 \nL 68.822558 40.992173 \nL 71.219122 52.286076 \nL 73.615687 107.378146 \nL 76.012251 32.201761 \nL 78.408815 120.73731 \nL 80.805379 111.415775 \nL 83.201943 92.357534 \nL 85.598507 130.400192 \nL 87.995071 74.581532 \nL 90.391635 102.912597 \nL 92.788199 92.257693 \nL 95.184763 83.440297 \nL 97.581327 113.34107 \nL 99.977891 77.73037 \nL 102.374455 107.224112 \nL 104.771019 98.260965 \nL 107.167583 
86.087536 \nL 109.564148 102.439882 \nL 111.960712 65.826181 \nL 114.357276 88.525606 \nL 116.75384 88.197489 \nL 119.150404 76.538039 \nL 121.546968 98.140394 \nL 123.943532 92.010205 \nL 126.340096 113.174577 \nL 128.73666 110.426568 \nL 131.133224 101.713448 \nL 133.529788 116.862443 \nL 135.926352 106.957778 \nL 138.322916 117.810518 \nL 140.71948 133.82319 \nL 143.116044 131.742291 \nL 145.512608 114.517069 \nL 147.909173 106.575729 \nL 150.305737 127.219435 \nL 152.702301 118.262138 \nL 155.098865 115.194059 \nL 157.495429 126.919598 \nL 159.891993 106.457069 \nL 162.288557 103.270278 \nL 164.685121 115.981243 \nL 167.081685 144.307819 \nL 169.478249 189.165327 \nL 171.874813 182.568641 \nL 174.271377 143.81902 \nL 176.667941 125.859211 \nL 179.064505 143.410051 \nL 181.461069 174.204574 \nL 183.857634 166.684161 \nL 186.254198 134.89533 \nL 188.650762 116.099706 \nL 191.047326 138.310044 \nL 193.44389 169.157532 \nL 195.840454 135.501939 \nL 198.237018 108.521178 \nL 200.633582 128.566362 \nL 203.030146 116.378682 \nL 205.42671 108.159793 \nL 207.823274 147.543503 \nL 210.219838 179.850802 \nL 212.616402 183.563694 \nL 215.012966 193.674743 \nL 217.40953 204.968607 \nL 222.202659 114.128685 \nL 224.599223 135.995817 \nL 226.995787 144.263468 \nL 229.392351 130.427165 \nL 231.788915 142.058857 \nL 234.185479 152.49348 \nL 236.582043 160.832608 \nL 238.978607 173.55929 \nL 241.375171 178.423076 \nL 243.771735 180.049241 \nL 248.564863 176.674598 \nL 250.961427 175.39193 \nL 253.357991 179.448879 \nL 255.754556 194.15887 \nL 258.15112 185.838673 \nL 260.547684 173.561454 \nL 262.944248 189.854004 \nL 265.340812 185.601064 \nL 267.737376 179.368871 \nL 270.13394 189.076283 \nL 272.530504 171.521273 \nL 274.927068 163.849481 \nL 277.323632 168.58914 \nL 279.720196 156.302349 \nL 282.11676 146.814164 \nL 284.513324 137.849486 \nL 286.909888 137.779314 \nL 289.306452 156.269369 \nL 291.703017 178.478392 \nL 294.099581 183.119901 \nL 296.496145 198.588024 \nL 
298.892709 229.874489 \nL 301.289273 174.003817 \nL 303.685837 104.957769 \nL 306.082401 141.091112 \nL 308.478965 163.066816 \nL 310.875529 158.415769 \nL 313.272093 182.745269 \nL 315.668657 174.686444 \nL 318.065221 167.083819 \nL 320.461785 180.148445 \nL 322.858349 171.630782 \nL 325.254913 172.830247 \nL 327.651477 177.864155 \nL 330.048042 171.249869 \nL 332.444606 178.581945 \nL 334.84117 186.930328 \nL 337.237734 194.763582 \nL 339.634298 196.773353 \nL 342.030862 181.995334 \nL 344.427426 159.967509 \nL 346.82399 134.169905 \nL 349.220554 155.19456 \nL 351.617118 193.425697 \nL 354.013682 172.887132 \nL 356.410246 167.977195 \nL 358.80681 184.387729 \nL 361.203374 170.696243 \nL 363.599938 177.600284 \nL 365.996503 188.692703 \nL 368.393067 176.584362 \nL 370.789631 153.056994 \nL 370.789631 153.056994 \n\" style=\"fill:none;stroke:#1f77b4;stroke-linecap:square;stroke-width:1.5;\"/>\n   </g>\n   <g id=\"patch_3\">\n    <path d=\"M 51.207813 239.758125 \nL 51.207813 22.318125 \n\" style=\"fill:none;stroke:#000000;stroke-linecap:square;stroke-linejoin:miter;stroke-width:0.8;\"/>\n   </g>\n   <g id=\"patch_4\">\n    <path d=\"M 386.007812 239.758125 \nL 386.007812 22.318125 \n\" style=\"fill:none;stroke:#000000;stroke-linecap:square;stroke-linejoin:miter;stroke-width:0.8;\"/>\n   </g>\n   <g id=\"patch_5\">\n    <path d=\"M 51.207813 239.758125 \nL 386.007813 239.758125 \n\" style=\"fill:none;stroke:#000000;stroke-linecap:square;stroke-linejo

SYMBOL INDEX (687 symbols across 51 files)

FILE: tests/benchmark.py
  function create_data (line 13) | def create_data(time_series_length, num_ids, random_seed=42):
  function test_benchmark_small_data (line 32) | def test_benchmark_small_data(benchmark):
  function test_benchmark_large_data (line 45) | def test_benchmark_large_data(benchmark):
  function test_benchmark_with_selection (line 58) | def test_benchmark_with_selection(benchmark):

FILE: tests/fixtures.py
  function warning_free (line 14) | def warning_free():
  class DataTestCase (line 27) | class DataTestCase(TestCase):
    method create_test_data_sample (line 28) | def create_test_data_sample(self):
    method create_test_data_sample_wide (line 200) | def create_test_data_sample_wide(self):
    method create_test_data_sample_with_time_index (line 256) | def create_test_data_sample_with_time_index(self):
    method create_test_data_nearly_numerical_indices (line 431) | def create_test_data_nearly_numerical_indices(self):
    method create_one_valued_time_series (line 603) | def create_one_valued_time_series(self):
    method create_test_data_sample_with_target (line 611) | def create_test_data_sample_with_target(self):
    method create_test_data_sample_with_multiclass_target (line 624) | def create_test_data_sample_with_multiclass_target(self):

FILE: tests/integrations/examples/test_driftbif_simulation.py
  class DriftBifSimlationTestCase (line 16) | class DriftBifSimlationTestCase(unittest.TestCase):
    method setUp (line 17) | def setUp(self):
    method test_intrinsic_velocity_at_default_bifurcation_point (line 20) | def test_intrinsic_velocity_at_default_bifurcation_point(self):
    method test_relaxation_dynamics (line 27) | def test_relaxation_dynamics(self):
    method test_equlibrium_velocity (line 57) | def test_equlibrium_velocity(self):
    method test_dimensionality (line 71) | def test_dimensionality(self):
    method test_relevant_feature_extraction (line 89) | def test_relevant_feature_extraction(self):
  class SampleTauTestCase (line 107) | class SampleTauTestCase(unittest.TestCase):
    method setUp (line 108) | def setUp(self):
    method test_range (line 111) | def test_range(self):
    method test_ratio (line 116) | def test_ratio(self):
  class LoadDriftBifTestCase (line 124) | class LoadDriftBifTestCase(unittest.TestCase):
    method setUp (line 125) | def setUp(self):
    method test_classification_labels (line 128) | def test_classification_labels(self):
    method test_regression_labels (line 132) | def test_regression_labels(self):
    method test_default_dimensionality (line 141) | def test_default_dimensionality(self):
    method test_configured_dimensionality (line 147) | def test_configured_dimensionality(self):

FILE: tests/integrations/examples/test_har_dataset.py
  class HumanActivityTestCase (line 18) | class HumanActivityTestCase(TestCase):
    method setUp (line 19) | def setUp(self):
    method tearDown (line 26) | def tearDown(self):
    method test_characteristics_downloaded_robot_execution_failures (line 29) | def test_characteristics_downloaded_robot_execution_failures(self):
    method test_index (line 36) | def test_index(self):

FILE: tests/integrations/examples/test_robot_execution_failures.py
  class RobotExecutionFailuresTestCase (line 20) | class RobotExecutionFailuresTestCase(TestCase):
    method setUp (line 21) | def setUp(self):
    method tearDown (line 28) | def tearDown(self):
    method test_characteristics_downloaded_robot_execution_failures (line 31) | def test_characteristics_downloaded_robot_execution_failures(self):
    method test_extraction_runs_through (line 40) | def test_extraction_runs_through(self):
    method test_binary_target_is_default (line 46) | def test_binary_target_is_default(self):
    method test_multilabel_target_on_request (line 51) | def test_multilabel_target_on_request(self):

FILE: tests/integrations/test_bindings.py
  class DaskBindingsTestCase (line 10) | class DaskBindingsTestCase(TestCase):
    method test_feature_extraction (line 11) | def test_feature_extraction(self):

FILE: tests/integrations/test_feature_extraction.py
  class FeatureExtractionTestCase (line 14) | class FeatureExtractionTestCase(TestCase):
    method setUp (line 15) | def setUp(self):
    method test_pandas (line 23) | def test_pandas(self):
    method test_pandas_no_pivot (line 73) | def test_pandas_no_pivot(self):
    method test_dask (line 142) | def test_dask(self):
    method test_dask_no_pivot (line 191) | def test_dask_no_pivot(self):

FILE: tests/integrations/test_full_pipeline.py
  class FullPipelineTestCase_robot_failures (line 21) | class FullPipelineTestCase_robot_failures(TestCase):
    method setUp (line 22) | def setUp(self):
    method tearDown (line 35) | def tearDown(self):
    method test_relevant_extraction (line 38) | def test_relevant_extraction(self):

FILE: tests/integrations/test_notebooks.py
  function _notebook_run (line 16) | def _notebook_run(path, timeout=default_timeout):
  class NotebooksTestCase (line 70) | class NotebooksTestCase(TestCase):
    method test_basic_example (line 71) | def test_basic_example(self):
    method test_pipeline_example (line 78) | def test_pipeline_example(self):
    method test_extraction_settings (line 84) | def test_extraction_settings(self):
    method test_multiclass_selection_example (line 90) | def test_multiclass_selection_example(self):
    method test_timeseries_forecasting (line 96) | def test_timeseries_forecasting(self):
    method test_timeseries_forecasting_exprt (line 102) | def test_timeseries_forecasting_exprt(self):
    method test_inspect_dft_features (line 109) | def test_inspect_dft_features(self):
    method test_feature_extraction_with_datetime_index (line 115) | def test_feature_extraction_with_datetime_index(self):
    method test_friedrich_coefficients (line 122) | def test_friedrich_coefficients(self):
    method test_inspect_dft_features (line 128) | def test_inspect_dft_features(self):
    method test_perform_PCA_on_extracted_features (line 134) | def test_perform_PCA_on_extracted_features(self):
    method test_visualize_benjamini_yekutieli_procedure (line 141) | def test_visualize_benjamini_yekutieli_procedure(self):

FILE: tests/integrations/test_relevant_feature_extraction.py
  class RelevantFeatureExtractionDataTestCase (line 17) | class RelevantFeatureExtractionDataTestCase(DataTestCase):
    method test_functional_equality (line 22) | def test_functional_equality(self):
  class RelevantFeatureExtractionTestCase (line 68) | class RelevantFeatureExtractionTestCase(TestCase):
    method setUp (line 69) | def setUp(self):
    method test_extracted_features_contain_X_features (line 86) | def test_extracted_features_contain_X_features(self):
    method test_extraction_null_as_column_name (line 95) | def test_extraction_null_as_column_name(self):
    method test_raises_mismatch_index_df_and_y_df_more (line 127) | def test_raises_mismatch_index_df_and_y_df_more(self):
    method test_raises_mismatch_index_df_and_y_y_more (line 146) | def test_raises_mismatch_index_df_and_y_y_more(self):
    method test_raises_y_not_series (line 162) | def test_raises_y_not_series(self):
    method test_raises_y_not_more_than_one_label (line 181) | def test_raises_y_not_more_than_one_label(self):

FILE: tests/units/feature_extraction/test_data.py
  class DataAdapterTestCase (line 258) | class DataAdapterTestCase(DataTestCase):
    method test_long_tsframe (line 259) | def test_long_tsframe(self):
    method test_long_tsframe_no_value_column (line 265) | def test_long_tsframe_no_value_column(self):
    method test_wide_tsframe (line 271) | def test_wide_tsframe(self):
    method test_wide_tsframe_without_sort (line 277) | def test_wide_tsframe_without_sort(self):
    method test_dict_tsframe (line 284) | def test_dict_tsframe(self):
    method assert_tsdata (line 290) | def assert_tsdata(self, data, expected):
    method assert_data_chunk_object_equal (line 295) | def assert_data_chunk_object_equal(self, result, expected):
    method test_simple_data_sample_two_timeseries (line 303) | def test_simple_data_sample_two_timeseries(self):
    method test_simple_data_sample_four_timeseries (line 317) | def test_simple_data_sample_four_timeseries(self):
    method test_with_dictionaries_two_rows (line 327) | def test_with_dictionaries_two_rows(self):
    method test_with_dictionaries_two_rows (line 345) | def test_with_dictionaries_two_rows(self):
    method test_wide_dataframe_order_preserved_with_sort_column (line 356) | def test_wide_dataframe_order_preserved_with_sort_column(self):
    method test_dask_dataframe_with_kind (line 377) | def test_dask_dataframe_with_kind(self):
    method test_dask_dataframe_without_kind (line 401) | def test_dask_dataframe_without_kind(self):
    method test_with_wrong_input (line 459) | def test_with_wrong_input(self):
  class PivotListTestCase (line 551) | class PivotListTestCase(TestCase):
    method test_empty_list (line 552) | def test_empty_list(self):
    method test_different_input (line 562) | def test_different_input(self):
    method test_long_input (line 585) | def test_long_input(self):

FILE: tests/units/feature_extraction/test_extraction.py
  class ExtractionTestCase (line 21) | class ExtractionTestCase(DataTestCase):
    method setUp (line 24) | def setUp(self):
    method test_extract_features (line 28) | def test_extract_features(self):
    method test_extract_features_uses_only_kind_to_fc_settings (line 78) | def test_extract_features_uses_only_kind_to_fc_settings(self):
    method test_extract_features_for_one_time_series (line 91) | def test_extract_features_for_one_time_series(self):
    method test_extract_features_for_index_based_functions (line 137) | def test_extract_features_for_index_based_functions(self):
    method test_extract_features_custom_function (line 172) | def test_extract_features_custom_function(self):
    method test_extract_features_after_randomisation (line 207) | def test_extract_features_after_randomisation(self):
    method test_profiling_file_written_out (line 239) | def test_profiling_file_written_out(self):
    method test_profiling_cumulative_file_written_out (line 257) | def test_profiling_cumulative_file_written_out(self):
    method test_extract_features_without_settings (line 280) | def test_extract_features_without_settings(self):
    method test_extract_features_with_and_without_parallelization (line 292) | def test_extract_features_with_and_without_parallelization(self):
    method test_extract_index_preservation (line 320) | def test_extract_index_preservation(self):
    method test_extract_features_alphabetically_sorted (line 334) | def test_extract_features_alphabetically_sorted(self):
  class ParallelExtractionTestCase (line 354) | class ParallelExtractionTestCase(DataTestCase):
    method setUp (line 355) | def setUp(self):
    method test_extract_features (line 368) | def test_extract_features(self):
  class DistributorUsageTestCase (line 399) | class DistributorUsageTestCase(DataTestCase):
    method setUp (line 400) | def setUp(self):
    method test_distributor_map_reduce_is_called (line 404) | def test_distributor_map_reduce_is_called(self):
    method test_distributor_close_is_called (line 423) | def test_distributor_close_is_called(self):

FILE: tests/units/feature_extraction/test_feature_calculations.py
  class FeatureCalculationTestCase (line 21) | class FeatureCalculationTestCase(TestCase):
    method setUp (line 22) | def setUp(self):
    method tearDown (line 27) | def tearDown(self):
    method assertIsNaN (line 30) | def assertIsNaN(self, result):
    method assertEqualOnAllArrayTypes (line 33) | def assertEqualOnAllArrayTypes(self, f, input_to_f, result, *args, **k...
    method assertTrueOnAllArrayTypes (line 53) | def assertTrueOnAllArrayTypes(self, f, input_to_f, *args, **kwargs):
    method assertAllTrueOnAllArrayTypes (line 62) | def assertAllTrueOnAllArrayTypes(self, f, input_to_f, *args, **kwargs):
    method assertFalseOnAllArrayTypes (line 75) | def assertFalseOnAllArrayTypes(self, f, input_to_f, *args, **kwargs):
    method assertAllFalseOnAllArrayTypes (line 84) | def assertAllFalseOnAllArrayTypes(self, f, input_to_f, *args, **kwargs):
    method assertAlmostEqualOnAllArrayTypes (line 98) | def assertAlmostEqualOnAllArrayTypes(self, f, input_to_f, result, *arg...
    method assertIsNanOnAllArrayTypes (line 122) | def assertIsNanOnAllArrayTypes(self, f, input_to_f, *args, **kwargs):
    method assertEqualPandasSeriesWrapper (line 135) | def assertEqualPandasSeriesWrapper(self, f, input_to_f, result, *args,...
    method test__roll (line 144) | def test__roll(self):
    method test___get_length_sequences_where (line 150) | def test___get_length_sequences_where(self):
    method test__into_subchunks (line 169) | def test__into_subchunks(self):
    method test_variance_larger_than_standard_deviation (line 177) | def test_variance_larger_than_standard_deviation(self):
    method test_large_standard_deviation (line 185) | def test_large_standard_deviation(self):
    method test_symmetry_looking (line 193) | def test_symmetry_looking(self):
    method test_has_duplicate_max (line 210) | def test_has_duplicate_max(self):
    method test_has_duplicate_min (line 219) | def test_has_duplicate_min(self):
    method test_has_duplicate (line 226) | def test_has_duplicate(self):
    method test_sum (line 233) | def test_sum(self):
    method test_agg_autocorrelation_returns_correct_values (line 238) | def test_agg_autocorrelation_returns_correct_values(self):
    method test_agg_autocorrelation_returns_max_lag_does_not_affect_other_results (line 268) | def test_agg_autocorrelation_returns_max_lag_does_not_affect_other_res...
    method test_partial_autocorrelation (line 282) | def test_partial_autocorrelation(self):
    method test_augmented_dickey_fuller (line 346) | def test_augmented_dickey_fuller(self):
    method test_abs_energy (line 414) | def test_abs_energy(self):
    method test_cid_ce (line 421) | def test_cid_ce(self):
    method test_lempel_ziv_complexity (line 432) | def test_lempel_ziv_complexity(self):
    method test_fourier_entropy (line 463) | def test_fourier_entropy(self):
    method test_permutation_entropy (line 488) | def test_permutation_entropy(self):
    method test_ratio_beyond_r_sigma (line 534) | def test_ratio_beyond_r_sigma(self):
    method test_mean_abs_change (line 542) | def test_mean_abs_change(self):
    method test_mean_change (line 546) | def test_mean_change(self):
    method test_mean_second_derivate_central (line 553) | def test_mean_second_derivate_central(self):
    method test_median (line 562) | def test_median(self):
    method test_mean (line 568) | def test_mean(self):
    method test_length (line 574) | def test_length(self):
    method test_standard_deviation (line 581) | def test_standard_deviation(self):
    method test_variation_coefficient (line 588) | def test_variation_coefficient(self):
    method test_variance (line 601) | def test_variance(self):
    method test_skewness (line 606) | def test_skewness(self):
    method test_kurtosis (line 614) | def test_kurtosis(self):
    method test_root_mean_square (line 621) | def test_root_mean_square(self):
    method test_mean_n_absolute_max (line 630) | def test_mean_n_absolute_max(self):
    method test_absolute_sum_of_changes (line 651) | def test_absolute_sum_of_changes(self):
    method test_longest_strike_below_mean (line 657) | def test_longest_strike_below_mean(self):
    method test_longest_strike_above_mean (line 668) | def test_longest_strike_above_mean(self):
    method test_count_above_mean (line 679) | def test_count_above_mean(self):
    method test_count_below_mean (line 685) | def test_count_below_mean(self):
    method test_last_location_maximum (line 691) | def test_last_location_maximum(self):
    method test_first_location_of_maximum (line 707) | def test_first_location_of_maximum(self):
    method test_last_location_of_minimum (line 723) | def test_last_location_of_minimum(self):
    method test_first_location_of_minimum (line 739) | def test_first_location_of_minimum(self):
    method test_percentage_of_doubled_datapoints (line 755) | def test_percentage_of_doubled_datapoints(self):
    method test_ratio_of_doubled_values (line 774) | def test_ratio_of_doubled_values(self):
    method test_sum_of_reoccurring_values (line 793) | def test_sum_of_reoccurring_values(self):
    method test_sum_of_reoccurring_data_points (line 806) | def test_sum_of_reoccurring_data_points(self):
    method test_uniqueness_factor (line 819) | def test_uniqueness_factor(self):
    method test_fft_coefficient (line 834) | def test_fft_coefficient(self):
    method test_fft_aggregated (line 893) | def test_fft_aggregated(self):
    method test_number_peaks (line 971) | def test_number_peaks(self):
    method test_mass_quantile (line 980) | def test_mass_quantile(self):
    method test_number_cwt_peaks (line 1041) | def test_number_cwt_peaks(self):
    method test_spkt_welch_density (line 1045) | def test_spkt_welch_density(self):
    method test_cwt_coefficients (line 1055) | def test_cwt_coefficients(self):
    method test_ar_coefficient (line 1077) | def test_ar_coefficient(self):
    method test_time_reversal_asymmetry_statistic (line 1129) | def test_time_reversal_asymmetry_statistic(self):
    method test_number_crossing_m (line 1156) | def test_number_crossing_m(self):
    method test_c3 (line 1165) | def test_c3(self):
    method test_binned_entropy (line 1178) | def test_binned_entropy(self):
    method test_sample_entropy (line 1205) | def test_sample_entropy(self):
    method test_autocorrelation (line 1329) | def test_autocorrelation(self):
    method test_quantile (line 1348) | def test_quantile(self):
    method test_mean_abs_change_quantiles (line 1361) | def test_mean_abs_change_quantiles(self):
    method test_value_count (line 1511) | def test_value_count(self):
    method test_range_count (line 1533) | def test_range_count(self):
    method test_approximate_entropy (line 1550) | def test_approximate_entropy(self):
    method test_absolute_maximum (line 1562) | def test_absolute_maximum(self):
    method test_max_langevin_fixed_point (line 1567) | def test_max_langevin_fixed_point(self):
    method test_linear_trend (line 1585) | def test_linear_trend(self):
    method test__aggregate_on_chunks (line 1634) | def test__aggregate_on_chunks(self):
    method test_agg_linear_trend (line 1691) | def test_agg_linear_trend(self):
    method test_energy_ratio_by_chunks (line 1757) | def test_energy_ratio_by_chunks(self):
    method test_linear_trend_timewise_hours (line 1796) | def test_linear_trend_timewise_hours(self):
    method test_linear_trend_timewise_days (line 1836) | def test_linear_trend_timewise_days(self):
    method test_linear_trend_timewise_seconds (line 1867) | def test_linear_trend_timewise_seconds(self):
    method test_linear_trend_timewise_years (line 1898) | def test_linear_trend_timewise_years(self):
    method test_change_quantiles (line 1934) | def test_change_quantiles(self):
    method test_count_above (line 1940) | def test_count_above(self):
    method test_count_below (line 1962) | def test_count_below(self):
    method test_benford_correlation (line 1984) | def test_benford_correlation(self):
    method test_query_similarity_count (line 2017) | def test_query_similarity_count(self):
    method test_matrix_profile_window (line 2043) | def test_matrix_profile_window(self):
    method test_matrix_profile_no_window (line 2066) | def test_matrix_profile_no_window(self):
    method test_matrix_profile_nan (line 2091) | def test_matrix_profile_nan(self):
  class FriedrichTestCase (line 2108) | class FriedrichTestCase(TestCase):
    method test_estimate_friedrich_coefficients (line 2109) | def test_estimate_friedrich_coefficients(self):
    method test_friedrich_coefficients (line 2127) | def test_friedrich_coefficients(self):
    method test_friedrich_number_of_returned_features_is_equal_to_number_of_parameters (line 2142) | def test_friedrich_number_of_returned_features_is_equal_to_number_of_p...
    method test_friedrich_equal_to_snapshot (line 2158) | def test_friedrich_equal_to_snapshot(self):

FILE: tests/units/feature_extraction/test_settings.py
  class TestSettingsObject (line 26) | class TestSettingsObject(TestCase):
    method test_range_count_correctly_configured (line 31) | def test_range_count_correctly_configured(self):
    method test_from_column_raises_on_wrong_column_format (line 37) | def test_from_column_raises_on_wrong_column_format(self):
    method test_from_column_correct_for_selected_columns (line 45) | def test_from_column_correct_for_selected_columns(self):
    method test_from_column_correct_for_comprehensive_fc_parameters (line 98) | def test_from_column_correct_for_comprehensive_fc_parameters(self):
    method test_from_columns_ignores_columns (line 117) | def test_from_columns_ignores_columns(self):
    method test_default_calculates_all_features (line 138) | def test_default_calculates_all_features(self):
    method test_from_columns_correct_for_different_kind_datatypes (line 159) | def test_from_columns_correct_for_different_kind_datatypes(self):
  class TestEfficientFCParameters (line 195) | class TestEfficientFCParameters(TestCase):
    method test_extraction_runs_through (line 200) | def test_extraction_runs_through(self):
    method test_contains_all_non_high_comp_cost_features (line 217) | def test_contains_all_non_high_comp_cost_features(self):
    method test_contains_all_time_based_features (line 238) | def test_contains_all_time_based_features(self):
    method test_contains_all_index_based_features (line 260) | def test_contains_all_index_based_features(self):
  class TestMinimalSettingsObject (line 281) | class TestMinimalSettingsObject(TestCase):
    method test_all_minimal_features_in (line 282) | def test_all_minimal_features_in(self):
    method test_extraction_runs_through (line 295) | def test_extraction_runs_through(self):
  class TestSettingPickability (line 329) | class TestSettingPickability(TestCase):
    method test_settings_pickable (line 330) | def test_settings_pickable(self):
  class TestIncludeFunction (line 354) | class TestIncludeFunction(TestCase):
    method test_no_fctype (line 355) | def test_no_fctype(self):
    method test_fctype (line 361) | def test_fctype(self):
    method test_fctype_default_exclusion (line 384) | def test_fctype_default_exclusion(self):
    method test_fctype_exclude (line 403) | def test_fctype_exclude(self):

FILE: tests/units/feature_selection/test_checks.py
  function binary_series_with_nan (line 20) | def binary_series_with_nan():
  function real_series_with_nan (line 25) | def real_series_with_nan():
  function binary_series (line 30) | def binary_series():
  function real_series (line 35) | def real_series():
  class TestChecksBinaryReal (line 39) | class TestChecksBinaryReal:
    method test_check_target_is_binary (line 44) | def test_check_target_is_binary(self, real_series):
    method test_checks_test_function (line 50) | def test_checks_test_function(self, binary_series, real_series):
    method test_checks_feature_nan (line 56) | def test_checks_feature_nan(self, real_series_with_nan, binary_series):
    method test_checks_target_nan (line 64) | def test_checks_target_nan(self, binary_series_with_nan, real_series):
    method test_check_feature_is_series (line 72) | def test_check_feature_is_series(self, binary_series, real_series):
    method test_check_feature_is_series (line 76) | def test_check_feature_is_series(self, binary_series, real_series):
  class TestChecksBinaryBinary (line 81) | class TestChecksBinaryBinary:
    method test_checks_feature_is_binary (line 86) | def test_checks_feature_is_binary(self, binary_series, real_series):
    method test_checks_target_is_binary (line 90) | def test_checks_target_is_binary(self, binary_series, real_series):
    method test_checks_feature_is_series (line 94) | def test_checks_feature_is_series(self, binary_series):
    method test_checks_target_is_series (line 98) | def test_checks_target_is_series(self, binary_series):
    method test_checks_feature_nan (line 102) | def test_checks_feature_nan(self, binary_series_with_nan, binary_series):
    method test_checks_target_nan (line 106) | def test_checks_target_nan(self, binary_series_with_nan, binary_series):
  class TestChecksRealReal (line 111) | class TestChecksRealReal:
    method test_checks_feature_is_series (line 116) | def test_checks_feature_is_series(self, real_series):
    method test_checks_target_is_series (line 120) | def test_checks_target_is_series(self, real_series):
    method test_checks_feature_nan (line 124) | def test_checks_feature_nan(self, real_series_with_nan, real_series):
    method test_checks_target_nan (line 128) | def test_checks_target_nan(self, real_series_with_nan, real_series):
  class TestChecksRealBinary (line 133) | class TestChecksRealBinary:
    method test_feature_is_binary (line 138) | def test_feature_is_binary(self, real_series):
    method test_feature_is_series (line 142) | def test_feature_is_series(self, real_series, binary_series):
    method test_feature_is_series (line 146) | def test_feature_is_series(self, real_series, binary_series):
    method test_checks_feature_nan (line 150) | def test_checks_feature_nan(self, binary_series_with_nan, real_series):
    method test_checks_target_nan (line 154) | def test_checks_target_nan(self, real_series_with_nan, binary_series):

FILE: tests/units/feature_selection/test_fdr_control.py
  function test_fdr_control (line 35) | def test_fdr_control(p_value, ind, fdr, expected):

FILE: tests/units/feature_selection/test_feature_significance.py
  class FeatureSignificanceTestCase (line 13) | class FeatureSignificanceTestCase(TestCase):
    method setUp (line 16) | def setUp(self):
    method test_binary_target_mixed_case (line 20) | def test_binary_target_mixed_case(self):
    method test_binary_target_binary_features (line 82) | def test_binary_target_binary_features(self):
    method test_binomial_target_realvalued_features (line 140) | def test_binomial_target_realvalued_features(self):
    method test_real_target_mixed_case (line 181) | def test_real_target_mixed_case(self):
    method test_real_target_binary_features (line 233) | def test_real_target_binary_features(self):
    method test_all_features_good (line 264) | def test_all_features_good(self):
    method test_all_features_bad (line 288) | def test_all_features_bad(self):

FILE: tests/units/feature_selection/test_relevance.py
  class TestInferMLTask (line 20) | class TestInferMLTask:
    method test_infers_classification_for_integer_target (line 21) | def test_infers_classification_for_integer_target(self):
    method test_infers_classification_for_boolean_target (line 25) | def test_infers_classification_for_boolean_target(self):
    method test_infers_classification_for_object_target (line 29) | def test_infers_classification_for_object_target(self):
    method test_infers_regression_for_float_target (line 33) | def test_infers_regression_for_float_target(self):
  class TestCalculateRelevanceTable (line 38) | class TestCalculateRelevanceTable:
    method y_binary (line 40) | def y_binary(self):
    method y_real (line 44) | def y_real(self):
    method y_multi (line 48) | def y_multi(self):
    method X (line 55) | def X(self):
    method test_restrict_ml_task_options (line 61) | def test_restrict_ml_task_options(self, X, y_binary):
    method test_constant_feature_irrelevant (line 65) | def test_constant_feature_irrelevant(self, y_binary):
    method test_target_binary_calls_correct_tests (line 76) | def test_target_binary_calls_correct_tests(
    method test_target_real_calls_correct_tests (line 94) | def test_target_real_calls_correct_tests(
    method test_warning_for_no_relevant_feature (line 127) | def test_warning_for_no_relevant_feature(
    method test_multiclass_requires_classification (line 148) | def test_multiclass_requires_classification(self, X, y_real):
    method test_multiclass_n_significant_error (line 152) | def test_multiclass_n_significant_error(self, X, y_binary):
    method test_multiclass_relevance_table_columns (line 158) | def test_multiclass_relevance_table_columns(self, X, y_binary):
    method test_multiclass_correct_features_relevant (line 165) | def test_multiclass_correct_features_relevant(self, y_multi):
  class TestCombineRelevanceTables (line 190) | class TestCombineRelevanceTables:
    method relevance_table (line 192) | def relevance_table(self):
    method test_disjuncts_relevance (line 201) | def test_disjuncts_relevance(self, relevance_table):
    method test_respects_index (line 208) | def test_respects_index(self, relevance_table):
    method test_aggregates_p_value (line 216) | def test_aggregates_p_value(self, relevance_table):
  class TestGetFeatureType (line 224) | class TestGetFeatureType:
    method test_binary (line 225) | def test_binary(self):
    method test_constant (line 229) | def test_constant(self):
    method test_real (line 233) | def test_real(self):

FILE: tests/units/feature_selection/test_selection.py
  class TestSelectFeatures (line 12) | class TestSelectFeatures:
    method test_assert_list (line 13) | def test_assert_list(self):
    method test_assert_one_row_X (line 17) | def test_assert_one_row_X(self):
    method test_assert_one_label_y (line 23) | def test_assert_one_label_y(self):
    method test_assert_different_index (line 29) | def test_assert_different_index(self):
    method test_assert_shorter_y (line 35) | def test_assert_shorter_y(self):
    method test_assert_X_is_DataFrame (line 41) | def test_assert_X_is_DataFrame(self):
    method test_selects_for_each_class (line 47) | def test_selects_for_each_class(self):
    method test_multiclass_selects_correct_n_significant (line 59) | def test_multiclass_selects_correct_n_significant(self):

FILE: tests/units/feature_selection/test_significance_tests.py
  function set_random_seed (line 18) | def set_random_seed():
  function binary_feature (line 23) | def binary_feature(set_random_seed):
  function binary_target_not_related (line 28) | def binary_target_not_related(set_random_seed):
  function real_feature (line 33) | def real_feature(set_random_seed):
  function real_target_not_related (line 38) | def real_target_not_related(set_random_seed):
  class TestUnsignificant (line 42) | class TestUnsignificant:
    method minimal_p_value_for_unsignificant_features (line 44) | def minimal_p_value_for_unsignificant_features(self):
    method test_feature_selection_target_binary_features_binary (line 47) | def test_feature_selection_target_binary_features_binary(
    method test_feature_selection_target_binary_features_realvalued (line 62) | def test_feature_selection_target_binary_features_realvalued(
    method test_feature_selection_target_realvalued_features_binary (line 77) | def test_feature_selection_target_realvalued_features_binary(
    method test_feature_selection_target_realvalued_features_realvalued (line 91) | def test_feature_selection_target_realvalued_features_realvalued(
  class TestSignificant (line 105) | class TestSignificant:
    method maximal_p_value_for_significant_features (line 107) | def maximal_p_value_for_significant_features(self):
    method test_feature_selection_target_binary_features_binary (line 110) | def test_feature_selection_target_binary_features_binary(
    method test_feature_selection_target_binary_features_realvalued_mann (line 127) | def test_feature_selection_target_binary_features_realvalued_mann(
    method test_feature_selection_target_binary_features_realvalued_smir (line 146) | def test_feature_selection_target_binary_features_realvalued_smir(
    method test_feature_selection_target_realvalued_features_binary (line 163) | def test_feature_selection_target_realvalued_features_binary(
    method test_feature_selection_target_realvalued_features_realvalued (line 177) | def test_feature_selection_target_realvalued_features_realvalued(

FILE: tests/units/scripts/test_run_tsfresh.py
  class RunTSFreshTestCase (line 14) | class RunTSFreshTestCase(TestCase):
    method setUp (line 20) | def setUp(self):
    method tearDown (line 39) | def tearDown(self):
    method call_main_function (line 46) | def call_main_function(self, input_csv_string=None, arguments=""):
    method test_invalid_arguments (line 69) | def test_invalid_arguments(self):
    method test_csv_without_headers_wrong_arguments (line 74) | def test_csv_without_headers_wrong_arguments(self):
    method test_csv_without_headers (line 79) | def test_csv_without_headers(self):
    method test_csv_with_header (line 104) | def test_csv_with_header(self):

FILE: tests/units/transformers/test_feature_augmenter.py
  class FeatureAugmenterTestCase (line 13) | class FeatureAugmenterTestCase(DataTestCase):
    method setUp (line 14) | def setUp(self):
    method test_fit_and_transform (line 23) | def test_fit_and_transform(self):
    method test_add_features_to_only_a_part (line 65) | def test_add_features_to_only_a_part(self):
    method test_no_ids_present (line 96) | def test_no_ids_present(self):

FILE: tests/units/transformers/test_feature_selector.py
  class FeatureSelectorTestCase (line 14) | class FeatureSelectorTestCase(TestCase):
    method setUp (line 15) | def setUp(self):
    method test_not_fitted (line 18) | def test_not_fitted(self):
    method test_extract_relevant_features (line 25) | def test_extract_relevant_features(self):
    method test_nothing_relevant (line 70) | def test_nothing_relevant(self):
    method test_with_numpy_array (line 86) | def test_with_numpy_array(self):
    method test_feature_importance (line 108) | def test_feature_importance(self):
    method test_feature_importance (line 122) | def test_feature_importance(self):
    method test_multiclass_relevant_features_selected (line 139) | def test_multiclass_relevant_features_selected(self):
    method test_multiclass_importance_p_values (line 157) | def test_multiclass_importance_p_values(self):
    method test_multiclass_max_avg_p_values (line 189) | def test_multiclass_max_avg_p_values(self):

FILE: tests/units/transformers/test_per_column_imputer.py
  class PerColumnImputerTestCase (line 18) | class PerColumnImputerTestCase(TestCase):
    method setUp (line 19) | def setUp(self):
    method test_not_fitted (line 22) | def test_not_fitted(self):
    method test_only_nans_and_infs (line 29) | def test_only_nans_and_infs(self):
    method test_with_numpy_array (line 50) | def test_with_numpy_array(self):
    method test_standard_replacement_behavior (line 87) | def test_standard_replacement_behavior(self):
    method test_partial_preset_col_to_NINF_given (line 100) | def test_partial_preset_col_to_NINF_given(self):
    method test_partial_preset_col_to_PINF_given (line 114) | def test_partial_preset_col_to_PINF_given(self):
    method test_partial_preset_col_to_NAN_given (line 128) | def test_partial_preset_col_to_NAN_given(self):
    method test_different_shapes_fitted_and_transformed (line 142) | def test_different_shapes_fitted_and_transformed(self):
    method test_preset_has_higher_priority_than_fit (line 153) | def test_preset_has_higher_priority_than_fit(self):
    method test_only_parameters_of_last_fit_count (line 168) | def test_only_parameters_of_last_fit_count(self):
    method test_only_subset_of_columns_given (line 187) | def test_only_subset_of_columns_given(self):
    method test_NINF_preset_contains_more_columns_than_dataframe_to_fit (line 202) | def test_NINF_preset_contains_more_columns_than_dataframe_to_fit(self):
    method test_PINF_preset_contains_more_columns_than_dataframe_to_fit (line 212) | def test_PINF_preset_contains_more_columns_than_dataframe_to_fit(self):
    method test_NAN_preset_contains_more_columns_than_dataframe_to_fit (line 222) | def test_NAN_preset_contains_more_columns_than_dataframe_to_fit(self):

FILE: tests/units/transformers/test_relevant_feature_augmenter.py
  class RelevantFeatureAugmenterTestCase (line 17) | class RelevantFeatureAugmenterTestCase(DataTestCase):
    method setUp (line 18) | def setUp(self):
    method test_not_fitted (line 26) | def test_not_fitted(self):
    method test_no_timeseries (line 33) | def test_no_timeseries(self):
    method test_nothing_relevant (line 42) | def test_nothing_relevant(self):
    method test_filter_only_tsfresh_features_true (line 65) | def test_filter_only_tsfresh_features_true(self):
    method test_filter_only_tsfresh_features_false (line 97) | def test_filter_only_tsfresh_features_false(self):
    method test_does_impute (line 137) | def test_does_impute(self, calculate_relevance_table_mock):
    method test_no_ids_present (line 156) | def test_no_ids_present(self):
    method test_multiclass_selection (line 186) | def test_multiclass_selection(self):
  function test_relevant_augmentor_cross_validated (line 205) | def test_relevant_augmentor_cross_validated():

FILE: tests/units/utilities/test_dataframe_functions.py
  class RollingTestCase (line 17) | class RollingTestCase(TestCase):
    method test_with_wrong_input (line 18) | def test_with_wrong_input(self):
    method test_assert_single_row (line 135) | def test_assert_single_row(self):
    method test_positive_rolling (line 148) | def test_positive_rolling(self):
    method test_negative_rolling (line 318) | def test_negative_rolling(self):
    method test_rolling_with_larger_shift (line 578) | def test_rolling_with_larger_shift(self):
    method test_stacked_rolling (line 652) | def test_stacked_rolling(self):
    method test_dict_rolling (line 742) | def test_dict_rolling(self):
    method test_dict_rolling_maxshift_1 (line 806) | def test_dict_rolling_maxshift_1(self):
    method test_order_rolling (line 868) | def test_order_rolling(self):
    method test_warning_on_non_uniform_time_steps (line 916) | def test_warning_on_non_uniform_time_steps(self):
    method test_multicore_rolling (line 946) | def test_multicore_rolling(self):
  class CheckForNanTestCase (line 1040) | class CheckForNanTestCase(TestCase):
    method test_all_columns (line 1041) | def test_all_columns(self):
    method test_not_all_columns (line 1053) | def test_not_all_columns(self):
  class ImputeTestCase (line 1084) | class ImputeTestCase(TestCase):
    method test_impute_zero (line 1085) | def test_impute_zero(self):
    method test_toplevel_impute (line 1123) | def test_toplevel_impute(self):
    method test_impute_range (line 1166) | def test_impute_range(self):
  class RestrictTestCase (line 1261) | class RestrictTestCase(TestCase):
    method test_restrict_dataframe (line 1262) | def test_restrict_dataframe(self):
    method test_restrict_dict (line 1273) | def test_restrict_dict(self):
    method test_restrict_wrong (line 1291) | def test_restrict_wrong(self):
  class GetRangeValuesPerColumnTestCase (line 1303) | class GetRangeValuesPerColumnTestCase(TestCase):
    method test_ignores_non_finite_values (line 1304) | def test_ignores_non_finite_values(self):
    method test_range_values_correct_with_even_length (line 1317) | def test_range_values_correct_with_even_length(self):
    method test_range_values_correct_with_uneven_length (line 1330) | def test_range_values_correct_with_uneven_length(self):
    method test_no_finite_values_yields_0 (line 1343) | def test_no_finite_values_yields_0(self):
  class MakeForecastingFrameTestCase (line 1364) | class MakeForecastingFrameTestCase(TestCase):
    method test_make_forecasting_frame_list (line 1365) | def test_make_forecasting_frame_list(self):
    method test_make_forecasting_frame_range (line 1386) | def test_make_forecasting_frame_range(self):
    method test_make_forecasting_frame_pdSeries (line 1406) | def test_make_forecasting_frame_pdSeries(self):
    method test_make_forecasting_frame_feature_extraction (line 1453) | def test_make_forecasting_frame_feature_extraction(self):
  class GetIDsTestCase (line 1472) | class GetIDsTestCase(TestCase):
    method test_get_id__correct_DataFrame (line 1473) | def test_get_id__correct_DataFrame(self):
    method test_get_id__correct_dict (line 1477) | def test_get_id__correct_dict(self):
    method test_get_id_wrong (line 1488) | def test_get_id_wrong(self):
  class AddSubIdTestCase (line 1493) | class AddSubIdTestCase(TestCase):
    method test_no_parameters (line 1494) | def test_no_parameters(self):
    method test_id_parameters (line 1501) | def test_id_parameters(self):
    method test_kind_parameters (line 1516) | def test_kind_parameters(self):
    method test_sort_parameters (line 1536) | def test_sort_parameters(self):
    method test_dict_input (line 1558) | def test_dict_input(self):

FILE: tests/units/utilities/test_distribution.py
  class MultiprocessingDistributorTestCase (line 22) | class MultiprocessingDistributorTestCase(TestCase):
    method test_n_jobs (line 23) | def test_n_jobs(self):
    method test_partition (line 36) | def test_partition(self):
    method test__calculate_best_chunk_size (line 50) | def test__calculate_best_chunk_size(self):
  class LocalDaskDistributorTestCase (line 64) | class LocalDaskDistributorTestCase(DataTestCase):
    method test_local_dask_cluster_extraction_one_worker (line 65) | def test_local_dask_cluster_extraction_one_worker(self):
    method test_local_dask_cluster_extraction_two_worker (line 97) | def test_local_dask_cluster_extraction_two_worker(self):
  class ClusterDaskDistributorTestCase (line 130) | class ClusterDaskDistributorTestCase(DataTestCase):
    method test_dask_cluster_extraction_one_worker (line 132) | def test_dask_cluster_extraction_one_worker(self):
    method test_dask_cluster_extraction_two_workers (line 170) | def test_dask_cluster_extraction_two_workers(self):

FILE: tests/units/utilities/test_string_manipilations.py
  class StringUtilities (line 10) | class StringUtilities(TestCase):
    method test_convert_to_output_format (line 11) | def test_convert_to_output_format(self):
    method test_convert_to_output_format_wrong_order (line 24) | def test_convert_to_output_format_wrong_order(self):

FILE: tsfresh/convenience/bindings.py
  function _feature_extraction_on_chunk_helper (line 9) | def _feature_extraction_on_chunk_helper(
  function dask_feature_extraction_on_chunk (line 45) | def dask_feature_extraction_on_chunk(
  function spark_feature_extraction_on_chunk (line 148) | def spark_feature_extraction_on_chunk(

FILE: tsfresh/convenience/relevant_extraction.py
  function extract_relevant_features (line 17) | def extract_relevant_features(

FILE: tsfresh/examples/driftbif_simulation.py
  class velocity (line 15) | class velocity:
    method __init__ (line 43) | def __init__(self, tau=3.8, kappa_3=0.3, Q=1950.0, R=3e-4, delta_t=0.0...
    method __call__ (line 77) | def __call__(self, v):
    method simulate (line 89) | def simulate(self, N, v0=np.zeros(2)):
  function sample_tau (line 111) | def sample_tau(n=10, kappa_3=0.3, ratio=0.5, rel_increase=0.15):
  function load_driftbif (line 136) | def load_driftbif(n, length, m=2, classification=True, kappa_3=0.3, seed...

FILE: tsfresh/examples/har_dataset.py
  function download_har_dataset (line 36) | def download_har_dataset(folder_name=data_file_name):
  function load_har_dataset (line 73) | def load_har_dataset(folder_name=data_file_name):
  function load_har_classes (line 90) | def load_har_classes(folder_name=data_file_name):

FILE: tsfresh/examples/robot_execution_failures.py
  function download_robot_execution_failures (line 44) | def download_robot_execution_failures(file_name=data_file_name):
  function load_robot_execution_failures (line 84) | def load_robot_execution_failures(multiclass=False, file_name=data_file_...

FILE: tsfresh/feature_extraction/data.py
  function _binding_helper (line 13) | def _binding_helper(f, kwargs, column_sort, column_id, column_kind, colu...
  class Timeseries (line 31) | class Timeseries(namedtuple("Timeseries", ["id", "kind", "data"])):
  class TsData (line 40) | class TsData:
  class PartitionedTsData (line 55) | class PartitionedTsData(Iterable[Timeseries], Sized, TsData):
    method __init__ (line 61) | def __init__(self, df, column_id):
    method pivot (line 64) | def pivot(self, results):
  function _check_colname (line 102) | def _check_colname(*columns):
  function _check_nan (line 126) | def _check_nan(df, *columns):
  function _get_value_columns (line 148) | def _get_value_columns(df, *other_columns):
  class WideTsFrameAdapter (line 159) | class WideTsFrameAdapter(PartitionedTsData):
    method __init__ (line 160) | def __init__(self, df, column_id, column_sort=None, value_columns=None):
    method __len__ (line 199) | def __len__(self):
    method __iter__ (line 202) | def __iter__(self):
  class LongTsFrameAdapter (line 211) | class LongTsFrameAdapter(PartitionedTsData):
    method __init__ (line 212) | def __init__(self, df, column_id, column_kind, column_value=None, colu...
    method __len__ (line 262) | def __len__(self):
    method __iter__ (line 265) | def __iter__(self):
  class TsDictAdapter (line 272) | class TsDictAdapter(PartitionedTsData):
    method __init__ (line 273) | def __init__(self, ts_dict, column_id, column_value, column_sort=None):
    method __iter__ (line 310) | def __iter__(self):
    method __len__ (line 315) | def __len__(self):
  class DaskTsAdapter (line 319) | class DaskTsAdapter(TsData):
    method __init__ (line 320) | def __init__(
    method apply (line 388) | def apply(self, f, meta, **kwargs):
    method pivot (line 407) | def pivot(self, results):
  function to_tsdata (line 425) | def to_tsdata(

FILE: tsfresh/feature_extraction/extraction.py
  function extract_features (line 30) | def extract_features(
  function _do_extraction (line 193) | def _do_extraction(
  function _do_extraction_on_chunk (line 308) | def _do_extraction_on_chunk(

FILE: tsfresh/feature_extraction/feature_calculators.py
  function _roll (line 56) | def _roll(a, shift):
  function _get_length_sequences_where (line 102) | def _get_length_sequences_where(x):
  function _estimate_friedrich_coefficients (line 131) | def _estimate_friedrich_coefficients(x, m, r):
  function _aggregate_on_chunks (line 176) | def _aggregate_on_chunks(x, f_agg, chunk_len):
  function _into_subchunks (line 196) | def _into_subchunks(x, subchunk_length, every_n=1):
  function set_property (line 222) | def set_property(key, value):
  function variance_larger_than_standard_deviation (line 239) | def variance_larger_than_standard_deviation(x):
  function ratio_beyond_r_sigma (line 256) | def ratio_beyond_r_sigma(x, r):
  function large_standard_deviation (line 273) | def large_standard_deviation(x, r):
  function symmetry_looking (line 299) | def symmetry_looking(x, param):
  function has_duplicate_max (line 325) | def has_duplicate_max(x):
  function has_duplicate_min (line 340) | def has_duplicate_min(x):
  function has_duplicate (line 355) | def has_duplicate(x):
  function sum_values (line 371) | def sum_values(x):
  function agg_autocorrelation (line 387) | def agg_autocorrelation(x, param):
  function partial_autocorrelation (line 440) | def partial_autocorrelation(x, param):
  function augmented_dickey_fuller (line 499) | def augmented_dickey_fuller(x, param):
  function abs_energy (line 548) | def abs_energy(x):
  function cid_ce (line 567) | def cid_ce(x, normalize):
  function mean_abs_change (line 604) | def mean_abs_change(x):
  function mean_change (line 624) | def mean_change(x):
  function mean_second_derivative_central (line 644) | def mean_second_derivative_central(x):
  function median (line 663) | def median(x):
  function mean (line 677) | def mean(x):
  function length (line 691) | def length(x):
  function standard_deviation (line 705) | def standard_deviation(x):
  function variation_coefficient (line 718) | def variation_coefficient(x):
  function variance (line 735) | def variance(x):
  function skewness (line 749) | def skewness(x):
  function kurtosis (line 766) | def kurtosis(x):
  function root_mean_square (line 783) | def root_mean_square(x):
  function absolute_sum_of_changes (line 796) | def absolute_sum_of_changes(x):
  function longest_strike_below_mean (line 813) | def longest_strike_below_mean(x):
  function longest_strike_above_mean (line 828) | def longest_strike_above_mean(x):
  function count_above_mean (line 843) | def count_above_mean(x):
  function count_below_mean (line 857) | def count_below_mean(x):
  function last_location_of_maximum (line 871) | def last_location_of_maximum(x):
  function first_location_of_maximum (line 886) | def first_location_of_maximum(x):
  function last_location_of_minimum (line 902) | def last_location_of_minimum(x):
  function first_location_of_minimum (line 917) | def first_location_of_minimum(x):
  function percentage_of_reoccurring_values_to_all_values (line 933) | def percentage_of_reoccurring_values_to_all_values(x):
  function percentage_of_reoccurring_datapoints_to_all_datapoints (line 961) | def percentage_of_reoccurring_datapoints_to_all_datapoints(x):
  function sum_of_reoccurring_values (line 992) | def sum_of_reoccurring_values(x):
  function sum_of_reoccurring_data_points (line 1020) | def sum_of_reoccurring_data_points(x):
  function ratio_value_number_to_time_series_length (line 1045) | def ratio_value_number_to_time_series_length(x):
  function fft_coefficient (line 1067) | def fft_coefficient(x, param):
  function fft_aggregated (line 1123) | def fft_aggregated(x, param):
  function number_peaks (line 1235) | def number_peaks(x, n):
  function index_mass_quantile (line 1275) | def index_mass_quantile(x, param):
  function _ricker (line 1307) | def _ricker(points, a):
  function number_cwt_peaks (line 1320) | def number_cwt_peaks(x, n):
  function linear_trend (line 1343) | def linear_trend(x, param):
  function cwt_coefficients (line 1370) | def cwt_coefficients(x, param):
  function spkt_welch_density (line 1418) | def spkt_welch_density(x, param):
  function ar_coefficient (line 1459) | def ar_coefficient(x, param):
  function change_quantiles (line 1511) | def change_quantiles(x, ql, qh, isabs, f_agg):
  function time_reversal_asymmetry_statistic (line 1557) | def time_reversal_asymmetry_statistic(x, lag):
  function c3 (line 1600) | def c3(x, lag):
  function mean_n_absolute_max (line 1643) | def mean_n_absolute_max(x, number_of_maxima):
  function binned_entropy (line 1666) | def binned_entropy(x, max_bins):
  function sample_entropy (line 1701) | def sample_entropy(x):
  function approximate_entropy (line 1759) | def approximate_entropy(x, m, r):
  function fourier_entropy (line 1809) | def fourier_entropy(x, bins):
  function lempel_ziv_complexity (line 1825) | def lempel_ziv_complexity(x, bins):
  function permutation_entropy (line 1866) | def permutation_entropy(x, tau, dimension):
  function autocorrelation (line 1919) | def autocorrelation(x, lag):
  function quantile (line 1963) | def quantile(x, q):
  function number_crossing_m (line 1980) | def number_crossing_m(x, m):
  function maximum (line 2003) | def maximum(x):
  function absolute_maximum (line 2017) | def absolute_maximum(x):
  function minimum (line 2031) | def minimum(x):
  function value_count (line 2044) | def value_count(x, value):
  function range_count (line 2065) | def range_count(x, min, max):
  function friedrich_coefficients (line 2082) | def friedrich_coefficients(x, param):
  function max_langevin_fixed_point (line 2134) | def max_langevin_fixed_point(x, r, m):
  function agg_linear_trend (line 2171) | def agg_linear_trend(x, param):
  function energy_ratio_by_chunks (line 2226) | def energy_ratio_by_chunks(x, param):
  function linear_trend_timewise (line 2274) | def linear_trend_timewise(x, param):
  function count_above (line 2309) | def count_above(x, t):
  function count_below (line 2325) | def count_below(x, t):
  function benford_correlation (line 2341) | def benford_correlation(x):
  function matrix_profile (line 2385) | def matrix_profile(x, param):
  function query_similarity_count (line 2475) | def query_similarity_count(x, param):

FILE: tsfresh/feature_extraction/settings.py
  function from_columns (line 23) | def from_columns(columns, columns_to_ignore=None):
  function include_function (line 86) | def include_function(func, exclusion_attr="input_type"):
  class PickableSettings (line 109) | class PickableSettings(UserDict):
    method __getstate__ (line 120) | def __getstate__(self):
    method __setstate__ (line 125) | def __setstate__(self, state):
  class ComprehensiveFCParameters (line 133) | class ComprehensiveFCParameters(PickableSettings):
    method __init__ (line 154) | def __init__(self):
  class MinimalFCParameters (line 297) | class MinimalFCParameters(ComprehensiveFCParameters):
    method __init__ (line 313) | def __init__(self):
  class EfficientFCParameters (line 323) | class EfficientFCParameters(ComprehensiveFCParameters):
    method __init__ (line 337) | def __init__(self):
  class IndexBasedFCParameters (line 346) | class IndexBasedFCParameters(ComprehensiveFCParameters):
    method __init__ (line 355) | def __init__(self):
  class TimeBasedFCParameters (line 363) | class TimeBasedFCParameters(ComprehensiveFCParameters):
    method __init__ (line 372) | def __init__(self):

FILE: tsfresh/feature_selection/relevance.py
  function calculate_relevance_table (line 31) | def calculate_relevance_table(
  function _calculate_relevance_table_for_implicit_target (line 323) | def _calculate_relevance_table_for_implicit_target(
  function infer_ml_task (line 351) | def infer_ml_task(y):
  function combine_relevance_tables (line 371) | def combine_relevance_tables(relevance_tables):
  function get_feature_type (line 390) | def get_feature_type(feature_column):

FILE: tsfresh/feature_selection/selection.py
  function select_features (line 17) | def select_features(

FILE: tsfresh/feature_selection/significance_tests.py
  function target_binary_feature_binary_test (line 43) | def target_binary_feature_binary_test(x, y):
  function target_binary_feature_real_test (line 84) | def target_binary_feature_real_test(x, y, test):
  function target_real_feature_binary_test (line 135) | def target_real_feature_binary_test(x, y):
  function target_real_feature_real_test (line 170) | def target_real_feature_real_test(x, y):
  function __check_if_pandas_series (line 191) | def __check_if_pandas_series(x, y):
  function __check_for_binary_target (line 214) | def __check_for_binary_target(y):
  function __check_for_binary_feature (line 239) | def __check_for_binary_feature(x):
  function _check_for_nans (line 266) | def _check_for_nans(x, y):

FILE: tsfresh/scripts/measure_execution_time.py
  class DataCreationTask (line 22) | class DataCreationTask(luigi.Task):
    method output (line 29) | def output(self):
    method run (line 32) | def run(self):
  class TimingTask (line 53) | class TimingTask(luigi.Task):
    method output (line 60) | def output(self):
    method run (line 63) | def run(self):
  class FullTimingTask (line 100) | class FullTimingTask(luigi.Task):
    method output (line 105) | def output(self):
    method run (line 108) | def run(self):
  class CombinerTask (line 135) | class CombinerTask(luigi.Task):
    method complete (line 138) | def complete(self):
    method requires (line 141) | def requires(self):
    method output (line 178) | def output(self):
    method run (line 181) | def run(self):

FILE: tsfresh/scripts/run_tsfresh.py
  function _preprocess (line 32) | def _preprocess(df):
  function main (line 47) | def main(console_args=None):

FILE: tsfresh/scripts/test_timing.py
  function simulate_with_length (line 11) | def simulate_with_length(length, df):
  function plot_results (line 36) | def plot_results():
  function measure_temporal_complexity (line 70) | def measure_temporal_complexity():

FILE: tsfresh/transformers/feature_augmenter.py
  class FeatureAugmenter (line 13) | class FeatureAugmenter(BaseEstimator, TransformerMixin):
    method __init__ (line 64) | def __init__(
    method set_timeseries_container (line 156) | def set_timeseries_container(self, timeseries_container):
    method fit (line 172) | def fit(self, X=None, y=None):
    method transform (line 187) | def transform(self, X):

FILE: tsfresh/transformers/feature_selector.py
  class FeatureSelector (line 12) | class FeatureSelector(BaseEstimator, TransformerMixin):
    method __init__ (line 61) | def __init__(
    method fit (line 152) | def fit(self, X, y):
    method transform (line 223) | def transform(self, X):

FILE: tsfresh/transformers/per_column_imputer.py
  class PerColumnImputer (line 15) | class PerColumnImputer(BaseEstimator, TransformerMixin):
    method __init__ (line 33) | def __init__(
    method fit (line 57) | def fit(self, X, y=None):
    method transform (line 104) | def transform(self, X):

FILE: tsfresh/transformers/relevant_feature_augmenter.py
  class RelevantFeatureAugmenter (line 21) | class RelevantFeatureAugmenter(BaseEstimator, TransformerMixin):
    method __init__ (line 92) | def __init__(
    method set_timeseries_container (line 253) | def set_timeseries_container(self, timeseries_container):
    method fit (line 269) | def fit(self, X, y):
    method transform (line 293) | def transform(self, X):
    method fit_transform (line 365) | def fit_transform(self, X, y):
    method _fit_and_augment (line 394) | def _fit_and_augment(self, X, y):

FILE: tsfresh/utilities/dataframe_functions.py
  function check_for_nans_in_columns (line 21) | def check_for_nans_in_columns(df, columns=None):
  function impute (line 49) | def impute(df_impute):
  function impute_dataframe_zero (line 80) | def impute_dataframe_zero(df_impute):
  function impute_dataframe_range (line 102) | def impute_dataframe_range(df_impute, col_to_max, col_to_min, col_to_med...
  function get_range_values_per_column (line 176) | def get_range_values_per_column(df):
  function restrict_input_to_index (line 216) | def restrict_input_to_index(df_or_dict, column_id, index):
  function get_ids (line 252) | def get_ids(df_or_dict, column_id):
  function _roll_out_time_series (line 274) | def _roll_out_time_series(
  function roll_time_series (line 353) | def roll_time_series(
  function make_forecasting_frame (line 582) | def make_forecasting_frame(x, kind, max_timeshift, rolling_direction):
  function add_sub_time_series_index (line 652) | def add_sub_time_series_index(

FILE: tsfresh/utilities/distribution.py
  function _function_with_partly_reduce (line 24) | def _function_with_partly_reduce(chunk_list, map_function, kwargs):
  function initialize_warnings_in_workers (line 47) | def initialize_warnings_in_workers(show_warnings):
  class DistributorBaseClass (line 64) | class DistributorBaseClass:
    method map_reduce (line 74) | def map_reduce(
  class IterableDistributorBaseClass (line 107) | class IterableDistributorBaseClass(DistributorBaseClass):
    method partition (line 119) | def partition(data, chunk_size):
    method __init__ (line 150) | def __init__(self):
    method calculate_best_chunk_size (line 156) | def calculate_best_chunk_size(self, data_length):
    method map_reduce (line 173) | def map_reduce(
    method distribute (line 247) | def distribute(self, func, partitioned_chunks, kwargs):
    method close (line 265) | def close(self):
  class MapDistributor (line 272) | class MapDistributor(IterableDistributorBaseClass):
    method __init__ (line 277) | def __init__(
    method distribute (line 291) | def distribute(self, func, partitioned_chunks, kwargs):
    method calculate_best_chunk_size (line 308) | def calculate_best_chunk_size(self, data_length):
  class LocalDaskDistributor (line 318) | class LocalDaskDistributor(IterableDistributorBaseClass):
    method __init__ (line 323) | def __init__(self, n_workers):
    method distribute (line 345) | def distribute(self, func, partitioned_chunks, kwargs):
    method close (line 370) | def close(self):
  class ClusterDaskDistributor (line 377) | class ClusterDaskDistributor(IterableDistributorBaseClass):
    method __init__ (line 382) | def __init__(self, address):
    method calculate_best_chunk_size (line 394) | def calculate_best_chunk_size(self, data_length):
    method distribute (line 408) | def distribute(self, func, partitioned_chunks, kwargs):
    method close (line 431) | def close(self):
  class MultiprocessingDistributor (line 438) | class MultiprocessingDistributor(IterableDistributorBaseClass):
    method __init__ (line 443) | def __init__(
    method distribute (line 471) | def distribute(self, func, partitioned_chunks, kwargs):
    method close (line 488) | def close(self):
  class ApplyDistributor (line 497) | class ApplyDistributor(DistributorBaseClass):
    method __init__ (line 498) | def __init__(self, meta):
    method map_reduce (line 501) | def map_reduce(

FILE: tsfresh/utilities/profiling.py
  function start_profiling (line 22) | def start_profiling():
  function end_profiling (line 40) | def end_profiling(profiler, filename, sorting=None):
  function get_n_jobs (line 73) | def get_n_jobs():
  function set_n_jobs (line 83) | def set_n_jobs(n_jobs):

FILE: tsfresh/utilities/string_manipulation.py
  function get_config_from_string (line 10) | def get_config_from_string(parts):
  function convert_to_output_format (line 47) | def convert_to_output_format(param):

Condensed preview: 137 files, each showing path, character count, and a content snippet (full structured content: 8,880K chars).

[
  {
    "path": ".coveragerc",
    "chars": 1126,
    "preview": "# .coveragerc to control coverage.py\n[run]\nrelative_files = True\nbranch = True\nsource = ./tsfresh\nomit = tsfresh/utiliti"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug-report.md",
    "chars": 1290,
    "preview": "---\nname: Bug Report\nabout: Report a bug\nlabels:\n  - bug\n---\n\n<!--\nThank you very much for filing a bug report!\nWe, the "
  },
  {
    "path": ".github/ISSUE_TEMPLATE/config.yml",
    "chars": 446,
    "preview": "blank_issues_enabled: false\ncontact_links:\n  - name: \"Q&A and general Questions\"\n    url: https://github.com/blue-yonder"
  },
  {
    "path": ".github/workflows/benchmark_default_branch.yml",
    "chars": 856,
    "preview": "# Store benchmark results as an artifact\nname: Benchmark the default branch\non:\n  # Only run on the default branch\n  pus"
  },
  {
    "path": ".github/workflows/deploy.yml",
    "chars": 702,
    "preview": "name: Upload Python Package\n\non:\n  release:\n    types: [created]\n\njobs:\n  deploy:\n    runs-on: ubuntu-latest\n\n    steps:"
  },
  {
    "path": ".github/workflows/stylecheck.yml",
    "chars": 218,
    "preview": "---\nname: Python Style Check\non: [pull_request]\n\njobs:\n  pre-commit:\n    runs-on: ubuntu-latest\n    steps:\n      - uses:"
  },
  {
    "path": ".github/workflows/test.yml",
    "chars": 1451,
    "preview": "---\nname: Test\non:\n  pull_request:\n  push:\n    branches:\n      - main\n\njobs:\n  test:\n    name: Test\n    runs-on: ubuntu-"
  },
  {
    "path": ".github/workflows/test_all.yml",
    "chars": 1094,
    "preview": "---\nname: Test Default Branch\non:\n  push:\n    branches:\n      - main\n\njobs:\n  test:\n    name: Test\n    runs-on: ubuntu-l"
  },
  {
    "path": ".gitignore",
    "chars": 736,
    "preview": "#others\n*.lock\n\n# Temporary and binary files\n*~\n*.py[cod]\n*.so\n*.cfg\n!setup.cfg\n*.orig\n*.log\n*.pot\n__pycache__/*\n.cache/"
  },
  {
    "path": ".pre-commit-config.yaml",
    "chars": 537,
    "preview": "---\nrepos:\n  - repo: https://github.com/psf/black\n    rev: 22.12.0\n    hooks:\n      - id: black\n        language_version"
  },
  {
    "path": ".readthedocs.yml",
    "chars": 527,
    "preview": "# .readthedocs.yaml\n# Read the Docs configuration file\n# See https://docs.readthedocs.io/en/stable/config-file/v2.html f"
  },
  {
    "path": "AUTHORS.rst",
    "chars": 1888,
    "preview": "\n\nAuthors\n==========\n\n\nCore Development Team\n---------------------\n\n- Maximilian Christ (`maximilianchrist.com <http://m"
  },
  {
    "path": "CHANGES.rst",
    "chars": 16615,
    "preview": "=========\nChangelog\n=========\n\ntsfresh uses `Semantic Versioning <http://semver.org/>`_\n\nVersion 0.21.1\n==============\n-"
  },
  {
    "path": "Dockerfile",
    "chars": 354,
    "preview": "# Define builder and base image\nFROM python:3.8-slim as base\nFROM python:3.8 as builder\n\nLABEL maintainer=\"nilslennartbr"
  },
  {
    "path": "Dockerfile.testing",
    "chars": 1664,
    "preview": "# Bakes the python versions which tsfresh targets into a testing env\nFROM ubuntu:22.04\n\nSHELL [\"/bin/bash\", \"-c\"]\n\n# The"
  },
  {
    "path": "LICENSE.txt",
    "chars": 1092,
    "preview": "MIT LICENCE\n\nCopyright (c) 2016 Maximilian Christ, Blue Yonder GmbH\n\nPermission is hereby granted, free of charge, to an"
  },
  {
    "path": "Makefile",
    "chars": 1184,
    "preview": "WORKDIR := /tsfresh\nTEST_IMAGE := tsfresh-test-image\nTEST_DOCKERFILE := Dockerfile.testing\nTEST_CONTAINER := tsfresh-tes"
  },
  {
    "path": "README.md",
    "chars": 9370,
    "preview": "<div align=\"center\">\n  <img width=\"70%\" src=\"./docs/images/tsfresh_logo.svg\">\n</div>\n\n-----------------\n\n# tsfresh\n\n[![D"
  },
  {
    "path": "binder/requirements.txt",
    "chars": 14,
    "preview": "-e .[testing]\n"
  },
  {
    "path": "docs/Makefile",
    "chars": 634,
    "preview": "# Minimal makefile for Sphinx documentation\n#\n\n# You can set these variables from the command line, and also\n# from the "
  },
  {
    "path": "docs/_static/.gitignore",
    "chars": 18,
    "preview": "# Empty directory\n"
  },
  {
    "path": "docs/_static/theme_override.css",
    "chars": 443,
    "preview": "/* override table width restrictions. Taken from https://rackerlabs.github.io/docs-rackspace/tools/rtd-tables.html */\n@m"
  },
  {
    "path": "docs/_templates/module_functions_template.rst",
    "chars": 147,
    "preview": ".. currentmodule:: {{ fullname }}\n\n{% block functions %}\n\n.. autosummary::\n{% for item in functions %}\n   {{ item }}\n{%-"
  },
  {
    "path": "docs/api/modules.rst",
    "chars": 58,
    "preview": "tsfresh\n=======\n\n.. toctree::\n   :maxdepth: 4\n\n   tsfresh\n"
  },
  {
    "path": "docs/api/tsfresh.convenience.rst",
    "chars": 584,
    "preview": "tsfresh.convenience package\n===========================\n\nSubmodules\n----------\n\ntsfresh.convenience.bindings module\n----"
  },
  {
    "path": "docs/api/tsfresh.examples.rst",
    "chars": 784,
    "preview": "tsfresh.examples package\n========================\n\nSubmodules\n----------\n\ntsfresh.examples.driftbif\\_simulation module\n-"
  },
  {
    "path": "docs/api/tsfresh.feature_extraction.rst",
    "chars": 1039,
    "preview": "tsfresh.feature\\_extraction package\n===================================\n\nSubmodules\n----------\n\ntsfresh.feature\\_extract"
  },
  {
    "path": "docs/api/tsfresh.feature_selection.rst",
    "chars": 840,
    "preview": "tsfresh.feature\\_selection package\n==================================\n\nSubmodules\n----------\n\ntsfresh.feature\\_selection"
  },
  {
    "path": "docs/api/tsfresh.rst",
    "chars": 526,
    "preview": "tsfresh package\n===============\n\nSubpackages\n-----------\n\n.. toctree::\n   :maxdepth: 4\n\n   tsfresh.convenience\n   tsfres"
  },
  {
    "path": "docs/api/tsfresh.scripts.rst",
    "chars": 742,
    "preview": "tsfresh.scripts package\n=======================\n\nSubmodules\n----------\n\ntsfresh.scripts.measure\\_execution\\_time module\n"
  },
  {
    "path": "docs/api/tsfresh.transformers.rst",
    "chars": 1057,
    "preview": "tsfresh.transformers package\n============================\n\nSubmodules\n----------\n\ntsfresh.transformers.feature\\_augmente"
  },
  {
    "path": "docs/api/tsfresh.utilities.rst",
    "chars": 950,
    "preview": "tsfresh.utilities package\n=========================\n\nSubmodules\n----------\n\ntsfresh.utilities.dataframe\\_functions modul"
  },
  {
    "path": "docs/authors.rst",
    "chars": 41,
    "preview": ".. _authors:\n.. include:: ../AUTHORS.rst\n"
  },
  {
    "path": "docs/changes.rst",
    "chars": 41,
    "preview": ".. _changes:\n.. include:: ../CHANGES.rst\n"
  },
  {
    "path": "docs/conf.py",
    "chars": 4619,
    "preview": "# -*- coding: utf-8 -*-\n#\n# This file is execfile()d with the current directory set to its containing dir.\n#\n# Note that"
  },
  {
    "path": "docs/images/rolling_mechanism_drawio_template.xml",
    "chars": 2298,
    "preview": "<mxfile userAgent=\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.30"
  },
  {
    "path": "docs/index.rst",
    "chars": 1354,
    "preview": ".. image:: images/tsfresh_logo.svg\n   :width: 70 %\n   :alt: some characteristics of the time series\n   :align: center\n\n="
  },
  {
    "path": "docs/license.rst",
    "chars": 74,
    "preview": ".. _license:\n\n=======\nLicense\n=======\n\n.. literalinclude:: ../LICENSE.txt\n"
  },
  {
    "path": "docs/text/data_formats.rst",
    "chars": 8527,
    "preview": ".. _data-formats-label:\n\nData Formats\n============\n\ntsfresh offers three different options to specify the format of the "
  },
  {
    "path": "docs/text/faq.rst",
    "chars": 2876,
    "preview": "FAQ\n===\n\n\n    1. **Does tsfresh support different time series lengths?**\n\n       Yes, it supports different time series "
  },
  {
    "path": "docs/text/feature_calculation.rst",
    "chars": 1564,
    "preview": ".. _feature-naming-label:\n\nFeature Calculator Naming\n=========================\n\ntsfresh enforces a strict naming of the "
  },
  {
    "path": "docs/text/feature_extraction_settings.rst",
    "chars": 6576,
    "preview": "Feature extraction settings\n===========================\n\nWhen starting a new data science project involving time series "
  },
  {
    "path": "docs/text/feature_filtering.rst",
    "chars": 3051,
    "preview": "Feature filtering\n=================\n\n\nThe all-relevant problem of feature selection is the identification of all strongl"
  },
  {
    "path": "docs/text/forecasting.rst",
    "chars": 10855,
    "preview": ".. _forecasting-label:\n\nRolling/Time series forecasting\n===============================\n\nFeatures extracted with *tsfres"
  },
  {
    "path": "docs/text/how_to_add_custom_feature.rst",
    "chars": 7086,
    "preview": "How to add a custom feature\n===========================\n\nIf you want to extract custom made features from your time seri"
  },
  {
    "path": "docs/text/how_to_contribute.rst",
    "chars": 4029,
    "preview": "How to contribute\n=================\n\nWe want tsfresh to become the biggest archive of feature extraction methods in pyth"
  },
  {
    "path": "docs/text/introduction.rst",
    "chars": 6201,
    "preview": "Introduction\n============\n\nWhy tsfresh?\n------------\n\ntsfresh is used for systematic feature engineering from time-serie"
  },
  {
    "path": "docs/text/large_data.rst",
    "chars": 3554,
    "preview": ".. _large-data-label:\n\nLarge Input Data\n================\n\nIf you are working with large time series data, you are probab"
  },
  {
    "path": "docs/text/list_of_features.rst",
    "chars": 496,
    "preview": "Overview on extracted features\n==============================\n\n*tsfresh* calculates a comprehensive number of features. "
  },
  {
    "path": "docs/text/quick_start.rst",
    "chars": 5316,
    "preview": ".. _quick-start-label:\n\nQuick Start\n===========\n\n\nInstall tsfresh\n---------------\n\nAs the compiled tsfresh package is ho"
  },
  {
    "path": "docs/text/sklearn_transformers.rst",
    "chars": 3622,
    "preview": ".. _sklearn-transformers-label:\n\nscikit-learn Transformers\n=========================\n\ntsfresh includes three scikit-lear"
  },
  {
    "path": "docs/text/tsfresh_on_a_cluster.rst",
    "chars": 11643,
    "preview": ".. _tsfresh-on-a-cluster-label:\n\n.. role:: python(code)\n    :language: python\n\nParallelization\n===============\n\nThe feat"
  },
  {
    "path": "notebooks/01 Feature Extraction and Selection.ipynb",
    "chars": 9901,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Feature Extraction and Selection\""
  },
  {
    "path": "notebooks/02 sklearn Pipeline.ipynb",
    "chars": 9247,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Feature Selection in a sklearn pi"
  },
  {
    "path": "notebooks/03 Feature Extraction Settings.ipynb",
    "chars": 11045,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Feature Calculator Settings\"\n   ]"
  },
  {
    "path": "notebooks/04 Multiclass Selection Example.ipynb",
    "chars": 87873,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Multiclass Example\"\n   ]\n  },\n  {"
  },
  {
    "path": "notebooks/05 Timeseries Forecasting.ipynb",
    "chars": 13922,
    "preview": "{\n \"cells\": [\n  {\n   \"attachments\": {},\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Timeseries "
  },
  {
    "path": "notebooks/advanced/05 Timeseries Forecasting (multiple ids).ipynb",
    "chars": 8986,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Timeseries Forecasting\"\n   ]\n  },"
  },
  {
    "path": "notebooks/advanced/compare-runtimes-of-feature-calculators.ipynb",
    "chars": 431490,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {\n    \"collapsed\": false,\n    \"deletab"
  },
  {
    "path": "notebooks/advanced/feature_extraction_with_datetime_index.ipynb",
    "chars": 22376,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Example of extracting features fr"
  },
  {
    "path": "notebooks/advanced/friedrich_coefficients.ipynb",
    "chars": 138465,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"<h1><center> Estimating Friedrich's"
  },
  {
    "path": "notebooks/advanced/inspect_dft_features.ipynb",
    "chars": 1203643,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"This notebook illustrates the Discr"
  },
  {
    "path": "notebooks/advanced/perform-PCA-on-extracted-features.ipynb",
    "chars": 43369,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"*tsfresh* returns a great number of"
  },
  {
    "path": "notebooks/advanced/visualize-benjamini-yekutieli-procedure.ipynb",
    "chars": 13923,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {\n    \"deletable\": true,\n    \"editable\": true\n   },\n   \"sou"
  },
  {
    "path": "setup.cfg",
    "chars": 4270,
    "preview": "# This file is used to configure your project.\n# Read more about the various options under:\n# http://setuptools.readthed"
  },
  {
    "path": "setup.py",
    "chars": 571,
    "preview": "# -*- coding: utf-8 -*-\n\"\"\"\n    Setup file for tsfresh.\n    Use setup.cfg to configure your project.\n\n    This file was "
  },
  {
    "path": "tests/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tests/benchmark.py",
    "chars": 1532,
    "preview": "# Testcase for benchmarking tsfresh feature extraction and selection\nimport numpy as np\nimport pandas as pd\nimport pytes"
  },
  {
    "path": "tests/fixtures.py",
    "chars": 12730,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/integrations/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tests/integrations/examples/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tests/integrations/examples/test_driftbif_simulation.py",
    "chars": 4645,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/integrations/examples/test_har_dataset.py",
    "chars": 1176,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/integrations/examples/test_robot_execution_failures.py",
    "chars": 1900,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/integrations/test_bindings.py",
    "chars": 1082,
    "preview": "from unittest import TestCase\n\nimport pandas as pd\nfrom dask import dataframe as dd\n\nfrom tsfresh.convenience.bindings i"
  },
  {
    "path": "tests/integrations/test_feature_extraction.py",
    "chars": 8708,
    "preview": "# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LICENCE.txt)\n# Maximilian C"
  },
  {
    "path": "tests/integrations/test_full_pipeline.py",
    "chars": 2570,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/integrations/test_notebooks.py",
    "chars": 4617,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/integrations/test_relevant_feature_extraction.py",
    "chars": 6215,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tests/units/feature_extraction/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tests/units/feature_extraction/test_data.py",
    "chars": 18301,
    "preview": "import math\nfrom unittest import TestCase\nfrom unittest.mock import Mock\n\nimport numpy as np\nimport pandas as pd\nfrom da"
  },
  {
    "path": "tests/units/feature_extraction/test_extraction.py",
    "chars": 15098,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/feature_extraction/test_feature_calculations.py",
    "chars": 83450,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/feature_extraction/test_settings.py",
    "chars": 14252,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/feature_selection/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tests/units/feature_selection/test_checks.py",
    "chars": 5673,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/feature_selection/test_fdr_control.py",
    "chars": 1567,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/feature_selection/test_feature_significance.py",
    "chars": 10593,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/feature_selection/test_relevance.py",
    "chars": 9321,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/feature_selection/test_selection.py",
    "chars": 3169,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/feature_selection/test_significance_tests.py",
    "chars": 6511,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/scripts/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tests/units/scripts/test_run_tsfresh.py",
    "chars": 4295,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/transformers/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tests/units/transformers/test_feature_augmenter.py",
    "chars": 3864,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/transformers/test_feature_selector.py",
    "chars": 6774,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/transformers/test_per_column_imputer.py",
    "chars": 7465,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/transformers/test_relevant_feature_augmenter.py",
    "chars": 8078,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/utilities/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tests/units/utilities/test_dataframe_functions.py",
    "chars": 46624,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/utilities/test_distribution.py",
    "chars": 7842,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tests/units/utilities/test_string_manipilations.py",
    "chars": 1191,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/__init__.py",
    "chars": 861,
    "preview": "# todo: here should go a top level description (see for example the numpy top level __init__.py)\n\n\"\"\"\nAt the top level w"
  },
  {
    "path": "tsfresh/convenience/__init__.py",
    "chars": 132,
    "preview": "\"\"\"\nThe :mod:`~tsfresh.convenience` submodule contains methods that allow the user to extract and filter features\nconven"
  },
  {
    "path": "tsfresh/convenience/bindings.py",
    "chars": 9597,
    "preview": "from functools import partial\n\nimport pandas as pd\n\nfrom tsfresh.feature_extraction.extraction import _do_extraction_on_"
  },
  {
    "path": "tsfresh/convenience/relevant_extraction.py",
    "chars": 9380,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/defaults.py",
    "chars": 629,
    "preview": "import os\nfrom multiprocessing import cpu_count\n\nn_cores = int(os.getenv(\"NUMBER_OF_CPUS\") or cpu_count())\n\nCHUNKSIZE = "
  },
  {
    "path": "tsfresh/examples/__init__.py",
    "chars": 379,
    "preview": "\"\"\"\nModule with exemplary data sets to play around with.\n\nSee for eample the :ref:`quick-start-label` section on how to "
  },
  {
    "path": "tsfresh/examples/driftbif_simulation.py",
    "chars": 6527,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/examples/har_dataset.py",
    "chars": 3426,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/examples/robot_execution_failures.py",
    "chars": 4610,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/feature_extraction/__init__.py",
    "chars": 318,
    "preview": "\"\"\"\nThe :mod:`tsfresh.feature_extraction` module contains methods to extract the features from the time series\n\"\"\"\n\nfrom"
  },
  {
    "path": "tsfresh/feature_extraction/data.py",
    "chars": 16831,
    "preview": "import itertools\nfrom collections import defaultdict, namedtuple\nfrom typing import Iterable, Sized\n\nimport pandas as pd"
  },
  {
    "path": "tsfresh/feature_extraction/extraction.py",
    "chars": 14940,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/feature_extraction/feature_calculators.py",
    "chars": 82258,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/feature_extraction/settings.py",
    "chars": 15403,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/feature_selection/__init__.py",
    "chars": 586,
    "preview": "\"\"\"\nThe :mod:`~tsfresh.feature_selection` module contains feature selection algorithms.\nThose methods were suited to pic"
  },
  {
    "path": "tsfresh/feature_selection/relevance.py",
    "chars": 15972,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/feature_selection/selection.py",
    "chars": 8174,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/feature_selection/significance_tests.py",
    "chars": 8684,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/scripts/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tsfresh/scripts/data.txt",
    "chars": 6041350,
    "preview": "  1.1653150e-002  1.3109090e-002  1.1268850e-002  2.7830730e-002  2.3183500e-003 -1.8965500e-002 -6.1920230e-002 -9.4248"
  },
  {
    "path": "tsfresh/scripts/measure_execution_time.py",
    "chars": 5804,
    "preview": "# This script extracts the execution time for\n# various different settings of tsfresh\n# using different input data\n# Att"
  },
  {
    "path": "tsfresh/scripts/run_tsfresh.py",
    "chars": 4300,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/scripts/test_timing.py",
    "chars": 2607,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/transformers/__init__.py",
    "chars": 413,
    "preview": "\"\"\"\nThe module :mod:`~tsfresh.transformers` contains several transformers which can be used inside a sklearn pipeline.\n\n"
  },
  {
    "path": "tsfresh/transformers/feature_augmenter.py",
    "chars": 11206,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/transformers/feature_selector.py",
    "chars": 10619,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/transformers/per_column_imputer.py",
    "chars": 5205,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/transformers/relevant_feature_augmenter.py",
    "chars": 23301,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/utilities/__init__.py",
    "chars": 138,
    "preview": "\"\"\"\nThis :mod:`~tsfresh.utilities` submodule contains several utility functions.\nThose should only be used internally in"
  },
  {
    "path": "tsfresh/utilities/dataframe_functions.py",
    "chars": 28650,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/utilities/distribution.py",
    "chars": 18819,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/utilities/profiling.py",
    "chars": 2574,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  },
  {
    "path": "tsfresh/utilities/string_manipulation.py",
    "chars": 2528,
    "preview": "# -*- coding: utf-8 -*-\n# This file as well as the whole tsfresh package are licenced under the MIT licence (see the LIC"
  }
]

// ... and 1 more file (download for full content)

About this extraction

This page contains the full source code of the blue-yonder/tsfresh GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 137 files (8.4 MB), approximately 2.2M tokens, and a symbol index with 687 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
