Repository: pwwang/datar
Branch: master
Commit: e4a9f8860a90
Files: 149
Total size: 2.9 MB
Directory structure:
gitextract_ikaf322n/
├── .codesandbox/
│ ├── Dockerfile
│ └── setup.sh
├── .coveragerc
├── .github/
│ ├── ISSUE_TEMPLATE/
│ │ ├── bug_report.yml
│ │ ├── feature_request.yml
│ │ └── submit_question.yml
│ └── workflows/
│ ├── ci.yml
│ └── docs.yml
├── .gitignore
├── .pre-commit-config.yaml
├── LICENSE
├── README.md
├── datar/
│ ├── __init__.py
│ ├── all.py
│ ├── apis/
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── dplyr.py
│ │ ├── forcats.py
│ │ ├── misc.py
│ │ ├── tibble.py
│ │ └── tidyr.py
│ ├── base.py
│ ├── core/
│ │ ├── __init__.py
│ │ ├── defaults.py
│ │ ├── load_plugins.py
│ │ ├── names.py
│ │ ├── operator.py
│ │ ├── options.py
│ │ ├── plugin.py
│ │ ├── utils.py
│ │ └── verb_env.py
│ ├── data/
│ │ ├── __init__.py
│ │ └── metadata.py
│ ├── datasets.py
│ ├── dplyr.py
│ ├── forcats.py
│ ├── misc.py
│ ├── tibble.py
│ └── tidyr.py
├── docs/
│ ├── CHANGELOG.md
│ ├── ENV_VARS.md
│ ├── backends.md
│ ├── data.md
│ ├── f.md
│ ├── import.md
│ ├── notebooks/
│ │ ├── across.ipynb
│ │ ├── add_column.ipynb
│ │ ├── add_row.ipynb
│ │ ├── arrange.ipynb
│ │ ├── base-arithmetic.ipynb
│ │ ├── base-funs.ipynb
│ │ ├── base.ipynb
│ │ ├── between.ipynb
│ │ ├── bind.ipynb
│ │ ├── case_when.ipynb
│ │ ├── chop.ipynb
│ │ ├── coalesce.ipynb
│ │ ├── complete.ipynb
│ │ ├── context.ipynb
│ │ ├── count.ipynb
│ │ ├── cumall.ipynb
│ │ ├── desc.ipynb
│ │ ├── distinct.ipynb
│ │ ├── drop_na.ipynb
│ │ ├── enframe.ipynb
│ │ ├── expand.ipynb
│ │ ├── expand_grid.ipynb
│ │ ├── extract.ipynb
│ │ ├── fill.ipynb
│ │ ├── filter-joins.ipynb
│ │ ├── filter.ipynb
│ │ ├── forcats_fct_multi.ipynb
│ │ ├── forcats_lvl_addrm.ipynb
│ │ ├── forcats_lvl_order.ipynb
│ │ ├── forcats_lvl_value.ipynb
│ │ ├── forcats_misc.ipynb
│ │ ├── full_seq.ipynb
│ │ ├── group_by.ipynb
│ │ ├── group_map.ipynb
│ │ ├── group_split.ipynb
│ │ ├── group_trim.ipynb
│ │ ├── lead-lag.ipynb
│ │ ├── mutate-joins.ipynb
│ │ ├── mutate.ipynb
│ │ ├── n_distinct.ipynb
│ │ ├── na_if.ipynb
│ │ ├── nb_helpers.py
│ │ ├── near.ipynb
│ │ ├── nest-join.ipynb
│ │ ├── nest.ipynb
│ │ ├── nth.ipynb
│ │ ├── other.ipynb
│ │ ├── pack.ipynb
│ │ ├── pivot_longer.ipynb
│ │ ├── pivot_wider.ipynb
│ │ ├── pull.ipynb
│ │ ├── ranking.ipynb
│ │ ├── readme.ipynb
│ │ ├── recode.ipynb
│ │ ├── reframe.ipynb
│ │ ├── relocate.ipynb
│ │ ├── rename.ipynb
│ │ ├── replace_na.ipynb
│ │ ├── rownames.ipynb
│ │ ├── rows.ipynb
│ │ ├── rowwise.ipynb
│ │ ├── select.ipynb
│ │ ├── separate.ipynb
│ │ ├── setops.ipynb
│ │ ├── slice.ipynb
│ │ ├── summarise.ipynb
│ │ ├── tibble.ipynb
│ │ ├── uncount.ipynb
│ │ ├── unite.ipynb
│ │ └── with_groups.ipynb
│ ├── options.md
│ ├── reference-maps/
│ │ ├── ALL.md
│ │ ├── base.md
│ │ ├── datasets.md
│ │ ├── dplyr.md
│ │ ├── forcats.md
│ │ ├── other.md
│ │ ├── stats.md
│ │ ├── tibble.md
│ │ ├── tidyr.md
│ │ └── utils.md
│ └── style.css
├── mkdocs.yml
├── pyproject.toml
├── setup.py
├── tests/
│ ├── __init__.py
│ ├── conflict_names.py
│ ├── conftest.py
│ ├── test_array_ufunc.py
│ ├── test_base.py
│ ├── test_conflict_names.py
│ ├── test_data.py
│ ├── test_dplyr.py
│ ├── test_forcats.py
│ ├── test_names.py
│ ├── test_options.py
│ ├── test_pipe.py
│ ├── test_plugin.py
│ ├── test_tibble.py
│ ├── test_tidyr.py
│ ├── test_utils.py
│ ├── test_verb_env.py
│ └── test_verb_env_integration.py
└── tox.ini
================================================
FILE CONTENTS
================================================
================================================
FILE: .codesandbox/Dockerfile
================================================
FROM python:3.10.12
RUN apt-get update && apt-get install -y npm fish && \
pip install -U pip && \
pip install poetry && \
poetry config virtualenvs.create false && \
chsh -s /usr/bin/fish
================================================
FILE: .codesandbox/setup.sh
================================================
WORKSPACE="/workspace"
# Install python dependencies
poetry update && poetry install
cd $WORKSPACE
# Install whichpy
WHICHPY="https://gist.githubusercontent.com/pwwang/879966128b0408c2459eb0a0b413fa69/raw/2f2573d191edec1937a2bf0873aa33a646b5ef29/whichpy.fish"
curl -sS $WHICHPY -o ~/.config/fish/functions/whichpy.fish
================================================
FILE: .coveragerc
================================================
[report]
exclude_lines =
pragma: no cover
if TYPE_CHECKING:
omit =
datar/datasets.py
*/site-packages/*
================================================
FILE: .github/ISSUE_TEMPLATE/bug_report.yml
================================================
name: Bug Report
description: Report incorrect behavior in the datar library
title: "[BUG] "
labels: [bug]
body:
- type: checkboxes
id: checks
attributes:
label: datar version checks
options:
- label: >
I have checked that this issue has not already been reported.
required: true
- label: >
I have confirmed this bug exists on the
**latest version** of datar and its backends.
required: true
- type: textarea
id: problem
attributes:
label: Issue Description
description: >
Please provide a description of the issue shown in the reproducible example.
validations:
required: true
- type: textarea
id: expected-behavior
attributes:
label: Expected Behavior
description: >
Please describe or show a code example of the expected behavior.
validations:
required: true
- type: textarea
id: version
attributes:
label: Installed Versions
description: >
Please paste the output of ``datar.get_versions()``
value: >
<details>
Replace this line with the output of datar.get_versions()
</details>
validations:
required: true
================================================
FILE: .github/ISSUE_TEMPLATE/feature_request.yml
================================================
name: Feature Request
description: Suggest an idea for datar
title: "[ENH] "
labels: [enhancement]
body:
- type: checkboxes
id: checks
attributes:
label: Feature Type
description: Please check what type of feature request you would like to propose.
options:
- label: >
Adding new functionality to datar
- label: >
Changing existing functionality in datar
- label: >
Removing existing functionality in datar
- type: textarea
id: description
attributes:
label: Problem Description
description: >
Please describe what problem the feature would solve, e.g. "I wish I could use datar to ..."
placeholder: >
I wish I could use datar to port the purrr package from R.
validations:
required: true
- type: textarea
id: feature
attributes:
label: Feature Description
description: >
Please describe how the new feature would be implemented, using pseudocode if relevant.
placeholder: >
Add a new module `datar.purrr` with functions `map`, `map2`, `map_df`, etc.
validations:
required: true
- type: textarea
id: context
attributes:
label: Additional Context
description: >
Please provide any relevant GitHub issues, code examples or references that help describe and support
the feature request.
================================================
FILE: .github/ISSUE_TEMPLATE/submit_question.yml
================================================
name: Submit Question
description: Ask a general question about datar
title: "[QST] "
labels: [question]
body:
- type: textarea
id: question
attributes:
label: Question about datar
description: >
Try to provide a clear and concise description of your question.
placeholder: |
```python
# Your code here, if applicable
```
================================================
FILE: .github/workflows/ci.yml
================================================
name: CI
on:
push:
pull_request:
release:
types: [published]
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.9, "3.10", "3.11", "3.12"]
steps:
- uses: actions/checkout@v6
- name: Install uv
uses: astral-sh/setup-uv@v8.0.0
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: uv sync
- name: Run flake8
run: uv run flake8 datar
- name: Test with pytest
run: uv run pytest tests/ --junitxml=junit/test-results-${{ matrix.python-version }}.xml
- name: Upload pytest test results
uses: actions/upload-artifact@v7
with:
name: pytest-results-${{ matrix.python-version }}
path: junit/test-results-${{ matrix.python-version }}.xml
# Use always() to always run this step to publish test results when there are test failures
if: ${{ always() }}
- name: Run codacy-coverage-reporter
uses: codacy/codacy-coverage-reporter-action@master
if: matrix.python-version == 3.12
with:
project-token: ${{ secrets.CODACY_PROJECT_TOKEN }}
coverage-reports: cov.xml
deploy:
needs: build
runs-on: ubuntu-latest
if: github.event_name == 'release'
strategy:
matrix:
python-version: ["3.12"]
steps:
- uses: actions/checkout@v6
- name: Install uv
uses: astral-sh/setup-uv@v8.0.0
with:
python-version: ${{ matrix.python-version }}
- name: Build and publish to PyPI
run: |
uv build
uv publish --username ${{ secrets.PYPI_USER }} --password ${{ secrets.PYPI_PASSWORD }}
if: success()
================================================
FILE: .github/workflows/docs.yml
================================================
name: Build Docs
on: [push]
jobs:
docs:
runs-on: ubuntu-latest
# if: github.ref == 'refs/heads/master'
strategy:
matrix:
python-version: ["3.12"]
steps:
- uses: actions/checkout@v4
- name: Install uv
uses: astral-sh/setup-uv@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: uv sync --group docs
- name: Build docs
run: |
# python -m pip install -r docs/requirements.txt
uv run python -m ipykernel install --user --name python --display-name python
uv run python -m ipykernel install --user --name python3 --display-name python3
cd docs
cp ../README.md index.md
cp ../example.png example.png
cp ../example2.png example2.png
# cp ../logo.png logo.png
cd ..
uv run mkdocs build
if: success()
- name: Deploy docs
run: |
uv run mkdocs gh-deploy --clean --force
# if: success() && github.ref == 'refs/heads/master'
fix-index:
needs: docs
runs-on: ubuntu-latest
# if: github.ref == 'refs/heads/master'
strategy:
matrix:
python-version: ["3.12"]
steps:
- uses: actions/checkout@v4
with:
ref: gh-pages
- name: Fix index.html
run: |
echo ':: head of index.html - before ::'
head index.html
sed -i '1,5{/^$/d}' index.html
echo ':: head of index.html - after ::'
head index.html
if: success()
- name: Commit changes
run: |
git config --local user.email "action@github.com"
git config --local user.name "GitHub Action"
git commit -m "Add changes" -a
if: success()
- name: Push changes
uses: ad-m/github-push-action@master
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
branch: gh-pages
if: success()
================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
.coverage.xml
cov.xml
*,cover
.hypothesis/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# IPython Notebook
.ipynb_checkpoints
# pyenv
.python-version
# celery beat schedule file
celerybeat-schedule
# dotenv
.env
# virtualenv
venv/
ENV/
# Spyder project settings
.spyderproject
# Rope project settings
.ropeproject
workdir/
node_modules/
_book/
.vscode
export/
*.svg
*.dot
*.queue.txt
site/
# poetry
# poetry.lock
# backup files
*.bak
docs/index.md
docs/logo.png
docs/example.png
docs/example2.png
docs/api/
docs/*.nbconvert.ipynb
docs/*/*.nbconvert.ipynb
# vscode's local history extension
.history/
# For quick test
/_t.py
/_t.ipynb
================================================
FILE: .pre-commit-config.yaml
================================================
fail_fast: true
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: 5df1a4bf6f04a1ed3a643167b38d502575e29aef
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
exclude: 'mkdocs.yml'
- repo: local
hooks:
- id: flake8
name: Run flake8
files: ^datar/.+$
pass_filenames: false
entry: flake8
args: [datar]
types: [python]
language: system
- id: versionchecker
name: Check version agreement in pyproject and __version__
entry: bash -c
language: system
args:
- get_ver() { echo $(egrep "^__version|^version" $1 | cut -d= -f2 | sed 's/\"\| //g'); };
v1=`get_ver pyproject.toml`;
v2=`get_ver datar/__init__.py`;
if [[ $v1 == $v2 ]]; then exit 0; else exit 1; fi
pass_filenames: false
files: ^pyproject\.toml|datar/__init__\.py$
- id: pytest
name: Run pytest
entry: pytest
language: system
args: [tests/]
pass_filenames: false
files: ^tests/.+$|^datar/.+$
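The `versionchecker` hook's egrep/cut/sed pipeline can be sketched in Python to show what it checks (the file snippets below are hypothetical stand-ins for `pyproject.toml` and `datar/__init__.py`, used only for illustration):

```python
import re

# Mirror the bash hook: take the value after "=" on the first line starting
# with "version" or "__version", stripping quotes and spaces.
def get_ver(text: str) -> str:
    for line in text.splitlines():
        if re.match(r"^(__version|version)", line):
            return line.split("=", 1)[1].replace('"', "").replace(" ", "")
    raise ValueError("no version line found")

# Hypothetical snippets standing in for pyproject.toml and datar/__init__.py
pyproject = 'name = "datar"\nversion = "0.15.17"\n'
init_py = '__version__ = "0.15.17"\n'

# The hook exits non-zero when the two disagree
assert get_ver(pyproject) == get_ver(init_py)
```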
================================================
FILE: LICENSE
================================================
MIT License
Copyright (c) 2020 pwwang
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: README.md
================================================
# datar
A Grammar of Data Manipulation in python
<!-- badges -->
[![Pypi][6]][7] [![Github][8]][9] ![Building][10] [![Docs and API][11]][5] [![Codacy][12]][13] [![Codacy coverage][14]][13] [![Downloads][20]][7]
[Documentation][5] | [Reference Maps][15] | [Notebook Examples][16] | [API][17]
`datar` is a re-imagining of APIs for data manipulation in Python, with support for multiple backends. The APIs are aligned with the tidyverse packages in R as closely as possible.
## Installation
```shell
pip install -U datar
# install with a backend
pip install -U datar[pandas]
# More backend support coming soon
```
<!-- ## Maximum compatibility with R packages
|Package|Version|
|-|-|
|[dplyr][21]|1.0.8| -->
## Backends
|Repo|Badges|
|-|-|
|[datar-numpy][1]|![3] ![18]|
|[datar-pandas][2]|![4] ![19]|
|[datar-arrow][22]|![23] ![24]|
## Example usage
```python
# with pandas backend
from datar import f
from datar.dplyr import mutate, filter_, if_else
from datar.tibble import tibble
# or
# from datar.all import f, mutate, filter_, if_else, tibble
df = tibble(
x=range(4), # or c[:4] (from datar.base import c)
y=['zero', 'one', 'two', 'three']
)
df >> mutate(z=f.x)
"""# output
x y z
<int64> <object> <int64>
0 0 zero 0
1 1 one 1
2 2 two 2
3 3 three 3
"""
df >> mutate(z=if_else(f.x>1, 1, 0))
"""# output:
x y z
<int64> <object> <int64>
0 0 zero 0
1 1 one 0
2 2 two 1
3 3 three 1
"""
df >> filter_(f.x>1)
"""# output:
x y
<int64> <object>
0 2 two
1 3 three
"""
df >> mutate(z=if_else(f.x>1, 1, 0)) >> filter_(f.z==1)
"""# output:
x y z
<int64> <object> <int64>
0 2 two 1
1 3 three 1
"""
```
```python
# works with plotnine
# example grabbed from https://github.com/has2k1/plydata
import numpy
from datar import f
from datar.base import sin, pi
from datar.tibble import tibble
from datar.dplyr import mutate, if_else
from plotnine import ggplot, aes, geom_line, theme_classic
df = tibble(x=numpy.linspace(0, 2 * pi, 500))
(
df
>> mutate(y=sin(f.x), sign=if_else(f.y >= 0, "positive", "negative"))
>> ggplot(aes(x="x", y="y"))
+ theme_classic()
+ geom_line(aes(color="sign"), size=1.2)
)
```

```python
# very easy to integrate with other libraries
# for example: klib
import klib
from pipda import register_verb
from datar import f
from datar.data import iris
from datar.dplyr import pull
dist_plot = register_verb(func=klib.dist_plot)
iris >> pull(f.Sepal_Length) >> dist_plot()
```

## Testimonials
[@coforfe](https://github.com/coforfe):
> Thanks for your excellent package to port R (`dplyr`) flow of processing to Python. I have been using other alternatives, and yours is the one that offers the most extensive and equivalent to what is possible now with `dplyr`.
[1]: https://github.com/pwwang/datar-numpy
[2]: https://github.com/pwwang/datar-pandas
[3]: https://img.shields.io/codacy/coverage/0a7519dad44246b6bab30576895f6766?style=flat-square
[4]: https://img.shields.io/codacy/coverage/45f4ea84ae024f1a8cf84be54dd144f7?style=flat-square
[5]: https://pwwang.github.io/datar/
[6]: https://img.shields.io/pypi/v/datar?style=flat-square
[7]: https://pypi.org/project/datar/
[8]: https://img.shields.io/github/v/tag/pwwang/datar?style=flat-square
[9]: https://github.com/pwwang/datar
[10]: https://img.shields.io/github/actions/workflow/status/pwwang/datar/ci.yml?branch=master&style=flat-square
[11]: https://img.shields.io/github/actions/workflow/status/pwwang/datar/docs.yml?branch=master&style=flat-square
[12]: https://img.shields.io/codacy/grade/3d9bdff4d7a34bdfb9cd9e254184cb35?style=flat-square
[13]: https://app.codacy.com/gh/pwwang/datar
[14]: https://img.shields.io/codacy/coverage/3d9bdff4d7a34bdfb9cd9e254184cb35?style=flat-square
[15]: https://pwwang.github.io/datar/reference-maps/ALL/
[16]: https://pwwang.github.io/datar/notebooks/across/
[17]: https://pwwang.github.io/datar/api/datar/
[18]: https://img.shields.io/pypi/v/datar-numpy?style=flat-square
[19]: https://img.shields.io/pypi/v/datar-pandas?style=flat-square
[20]: https://img.shields.io/pypi/dm/datar?style=flat-square
[21]: https://github.com/tidyverse/dplyr
[22]: https://github.com/pwwang/datar-arrow
[23]: https://img.shields.io/codacy/coverage/5f4ef9dd2503437db18786ff9e841d8b?style=flat-square
[24]: https://img.shields.io/pypi/v/datar-arrow?style=flat-square
================================================
FILE: datar/__init__.py
================================================
from typing import Mapping as _Mapping
from .core import operator as _
from .core.defaults import f
from .core.options import options, get_option, options_context
__version__ = "0.15.17"
__all__ = [
"f",
"options",
"get_option",
"options_context",
"get_versions",
]
def get_versions(prnt: bool = True) -> _Mapping[str, str]:
"""Return/Print the versions of the dependencies.
Args:
prnt: If True, print the versions, otherwise return them.
Returns:
A dict of the versions of the dependencies if `prnt` is False.
"""
import sys
import executing
import pipda
import simplug
from .core.load_plugins import plugin
versions = {
"python": sys.version,
"datar": __version__,
"simplug": simplug.__version__,
"executing": executing.__version__,
"pipda": pipda.__version__,
}
versions_plg = plugin.hooks.get_versions()
versions.update(versions_plg)
if not prnt:
return versions
keylen = max(map(len, versions))
for key in versions:
ver = versions[key]
verlines = ver.splitlines()
print(f"{key.ljust(keylen)}: {verlines.pop(0)}")
for verline in verlines: # pragma: no cover
print(f"{' ' * keylen} {verline}")
return None
================================================
FILE: datar/all.py
================================================
"""Import all constants, verbs and functions"""
from .core import load_plugins as _
from .core.defaults import f
from .base import _conflict_names as _base_conflict_names
from .dplyr import _conflict_names as _dplyr_conflict_names
from .base import *
from .dplyr import *
from .forcats import *
from .tibble import *
from .tidyr import *
from .misc import *
__all__ = [key for key in locals() if not key.startswith("_")]
if get_option("allow_conflict_names"): # noqa: F405
__all__.extend(_base_conflict_names | _dplyr_conflict_names)
for name in _base_conflict_names | _dplyr_conflict_names:
locals()[name] = locals()[name + "_"]
def __getattr__(name):
"""Even when `allow_conflict_names` is False, `datar.all.sum` should still
work when accessed as an attribute or called
"""
if name in _base_conflict_names | _dplyr_conflict_names:
import sys
import ast
from executing import Source
node = Source.executing(sys._getframe(1)).node
if isinstance(node, (ast.Call, ast.Attribute)):
# import datar.all as d
# d.sum(...) or getattr(d, "sum")(...)
return globals()[name + "_"]
raise AttributeError
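The module-level `__getattr__` above relies on PEP 562. A minimal, standalone sketch of the underscore-alias fallback (omitting the call-site AST inspection the real code does with `executing`; `demo` and its `sum_` are hypothetical stand-ins):

```python
import types

# Emulate datar.all's conflict-name trick: the implementation lives under a
# trailing-underscore name (sum_), and a module-level __getattr__ (PEP 562)
# serves the bare name on demand.
demo = types.ModuleType("demo")
demo.sum_ = lambda xs, na_rm=False: sum(xs)  # hypothetical stand-in

_CONFLICT_NAMES = {"sum", "filter"}


def _module_getattr(name):
    if name in _CONFLICT_NAMES:
        return getattr(demo, name + "_")  # delegate to the alias
    raise AttributeError(name)


demo.__getattr__ = _module_getattr

print(demo.sum([1, 2, 3]))  # resolved through __getattr__
```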
================================================
FILE: datar/apis/__init__.py
================================================
================================================
FILE: datar/apis/base.py
================================================
"""APIs ported from r-base"""
# import the variables with _ so that they are not imported by *
import math as _math
from typing import Any
from string import ascii_letters as _ascii_letters
from pipda import register_func as _register_func
from ..core.utils import (
NotImplementedByCurrentBackendError as _NotImplementedByCurrentBackendError,
CollectionFunction as _CollectionFunction,
)
from ..core.options import options, get_option, options_context # noqa: F401
from ..core.names import repair_names as _repair_names
pi = _math.pi
letters = list(_ascii_letters[:26])
LETTERS = list(_ascii_letters[26:])
month_name = [
"January",
"February",
"March",
"April",
"May",
"June",
"July",
"August",
"September",
"October",
"November",
"December",
]
month_abb = [m[:3] for m in month_name]
FALSE = False
TRUE = True
NA = float("nan")
NULL = None
NaN = float("nan")
Inf = float("inf")
@_register_func(pipeable=True, dispatchable=True)
def ceiling(x) -> Any:
"""Round up to the nearest integer
Args:
x: The value to be rounded up
Returns:
The rounded up value
"""
raise _NotImplementedByCurrentBackendError("ceiling", x)
@_register_func(pipeable=True, dispatchable=True)
def cov(x, y=None, na_rm: bool = False, ddof: int = 1) -> Any:
"""Compute pairwise covariance between two variables
Args:
x: a numeric vector, matrix or data frame.
y: None or a vector, matrix or data frame with
compatible dimensions to `x`. The default is equivalent to
`y = x`
na_rm: If `True`, remove missing values before computing
the covariance.
ddof: The denominator degrees of freedom.
Returns:
The covariance matrix
"""
raise _NotImplementedByCurrentBackendError("cov", x)
@_register_func(pipeable=True, dispatchable=True)
def floor(x) -> Any:
"""Round down to the nearest integer
Args:
x: The value to be rounded down
Returns:
The rounded down value
"""
raise _NotImplementedByCurrentBackendError("floor", x)
@_register_func(pipeable=True, dispatchable=True)
def mean(x, na_rm: bool = False) -> Any:
"""Compute the mean of a vector
Args:
x: A numeric vector
na_rm: Whether to remove `NA` values
Returns:
The mean of the vector
"""
raise _NotImplementedByCurrentBackendError("mean", x)
@_register_func(pipeable=True, dispatchable=True)
def median(x, na_rm: bool = False) -> Any:
"""Compute the median of a vector
Args:
x: A numeric vector
na_rm: Whether to remove `NA` values
Returns:
The median of the vector
"""
raise _NotImplementedByCurrentBackendError("median", x)
@_register_func(pipeable=True, dispatchable=True)
def pmax(*args, na_rm: bool = False) -> Any:
"""Returns the parallel maxima of the input values.
Args:
*args: One or more numeric vectors or scalars
na_rm: Whether to remove `NA` values
Returns:
The element-wise maxima of the inputs
"""
raise _NotImplementedByCurrentBackendError("pmax")
@_register_func(pipeable=True, dispatchable=True)
def pmin(*args, na_rm: bool = False) -> Any:
"""Returns the parallel minima of the input values.
Args:
*args: One or more numeric vectors or scalars
na_rm: Whether to remove `NA` values
Returns:
The element-wise minima of the inputs
"""
raise _NotImplementedByCurrentBackendError("pmin")
@_register_func(pipeable=True, dispatchable=True)
def sqrt(x) -> Any:
"""Compute the square root of a vector
Args:
x: A numeric vector
Returns:
The square root of the vector
"""
raise _NotImplementedByCurrentBackendError("sqrt", x)
@_register_func(pipeable=True, dispatchable=True)
def var(x, na_rm: bool = False, ddof: int = 1) -> Any:
"""Compute the variance of a vector
Args:
x: A numeric vector
na_rm: Whether to remove `NA` values
ddof: The degrees of freedom
Returns:
The variance of the vector
"""
raise _NotImplementedByCurrentBackendError("var", x)
@_register_func(pipeable=True, dispatchable=True)
def scale(x, center=True, scale_=True) -> Any:
"""Center and/or scale the data
Args:
x: A numeric vector
center: Whether to center the data
scale_: Whether to scale the data
Returns:
The scaled data
"""
raise _NotImplementedByCurrentBackendError("scale", x)
@_register_func(pipeable=True, dispatchable=True)
def col_sums(x, na_rm: bool = False) -> Any:
"""Compute the column sums of a matrix
Args:
x: A numeric matrix
na_rm: Whether to remove `NA` values
Returns:
The column sums of the matrix
"""
raise _NotImplementedByCurrentBackendError("col_sums", x)
@_register_func(pipeable=True, dispatchable=True)
def col_means(x, na_rm: bool = False) -> Any:
"""Compute the column means of a matrix
Args:
x: A numeric matrix
na_rm: Whether to remove `NA` values
Returns:
The column means of the matrix
"""
raise _NotImplementedByCurrentBackendError("col_means", x)
@_register_func(pipeable=True, dispatchable=True)
def col_sds(x, na_rm: bool = False) -> Any:
"""Compute the column standard deviations of a matrix
Args:
x: A numeric matrix
na_rm: Whether to remove `NA` values
Returns:
The column standard deviations of the matrix
"""
raise _NotImplementedByCurrentBackendError("col_sds", x)
@_register_func(pipeable=True, dispatchable=True)
def col_medians(x, na_rm: bool = False) -> Any:
"""Compute the column medians of a matrix
Args:
x: A numeric matrix
na_rm: Whether to remove `NA` values
Returns:
The column medians of the matrix
"""
raise _NotImplementedByCurrentBackendError("col_medians", x)
@_register_func(pipeable=True, dispatchable=True)
def row_sums(x, na_rm: bool = False) -> Any:
"""Compute the row sums of a matrix
Args:
x: A numeric matrix
na_rm: Whether to remove `NA` values
Returns:
The row sums of the matrix
"""
raise _NotImplementedByCurrentBackendError("row_sums", x)
@_register_func(pipeable=True, dispatchable=True)
def row_means(x, na_rm: bool = False) -> Any:
"""Compute the row means of a matrix
Args:
x: A numeric matrix
na_rm: Whether to remove `NA` values
Returns:
The row means of the matrix
"""
raise _NotImplementedByCurrentBackendError("row_means", x)
@_register_func(pipeable=True, dispatchable=True)
def row_sds(x, na_rm: bool = False) -> Any:
"""Compute the row standard deviations of a matrix
Args:
x: A numeric matrix
na_rm: Whether to remove `NA` values
Returns:
The row standard deviations of the matrix
"""
raise _NotImplementedByCurrentBackendError("row_sds", x)
@_register_func(pipeable=True, dispatchable=True)
def row_medians(x, na_rm: bool = False) -> Any:
"""Compute the row medians of a matrix
Args:
x: A numeric matrix
na_rm: Whether to remove `NA` values
Returns:
The row medians of the matrix
"""
raise _NotImplementedByCurrentBackendError("row_medians", x)
@_register_func(pipeable=True, dispatchable=True)
def min_(x, na_rm: bool = False) -> Any:
"""Compute the minimum of a vector
Args:
x: A numeric vector
na_rm: Whether to remove `NA` values
Returns:
The minimum of the vector
"""
raise _NotImplementedByCurrentBackendError("min", x)
@_register_func(pipeable=True, dispatchable=True)
def max_(x, na_rm: bool = False) -> Any:
"""Compute the maximum of a vector
Args:
x: A numeric vector
na_rm: Whether to remove `NA` values
Returns:
The maximum of the vector
"""
raise _NotImplementedByCurrentBackendError("max", x)
@_register_func(pipeable=True, dispatchable=True)
def round_(x, digits: int = 0) -> Any:
"""Round the values of a vector
Args:
x: A numeric vector
digits: The number of digits to round to
Returns:
The rounded values
"""
raise _NotImplementedByCurrentBackendError("round", x)
@_register_func(pipeable=True, dispatchable=True)
def sum_(x, na_rm: bool = False) -> Any:
"""Compute the sum of a vector
Args:
x: A numeric vector
na_rm: Whether to remove `NA` values
Returns:
The sum of the vector
"""
raise _NotImplementedByCurrentBackendError("sum", x)
@_register_func(pipeable=True, dispatchable=True)
def abs_(x) -> Any:
"""Compute the absolute value of a vector
Args:
x: A numeric vector
Returns:
The absolute values of the vector
"""
raise _NotImplementedByCurrentBackendError("abs", x)
@_register_func(pipeable=True, dispatchable=True)
def prod(x, na_rm: bool = False) -> Any:
"""Compute the product of a vector
Args:
x: A numeric vector
na_rm: Whether to remove `NA` values
Returns:
The product of the vector
"""
raise _NotImplementedByCurrentBackendError("prod", x)
@_register_func(pipeable=True, dispatchable=True)
def sign(x) -> Any:
"""Compute the sign of a vector
Args:
x: A numeric vector
Returns:
The signs of the vector
"""
raise _NotImplementedByCurrentBackendError("sign", x)
@_register_func(pipeable=True, dispatchable=True)
def signif(x, digits: int = 6) -> Any:
"""Round the values of a vector to a given number of significant digits
Args:
x: A numeric vector
digits: The number of significant digits to round to
Returns:
The rounded values
"""
raise _NotImplementedByCurrentBackendError("signif", x)
@_register_func(pipeable=True, dispatchable=True)
def trunc(x) -> Any:
"""Truncate the values of a vector toward zero
Args:
x: A numeric vector
Returns:
The truncated values
"""
raise _NotImplementedByCurrentBackendError("trunc", x)
@_register_func(pipeable=True, dispatchable=True)
def exp(x) -> Any:
"""Compute the exponential of a vector
Args:
x: A numeric vector
Returns:
The exponential values
"""
raise _NotImplementedByCurrentBackendError("exp", x)
@_register_func(pipeable=True, dispatchable=True)
def log(x, base: float = _math.e) -> Any:
"""Compute the logarithm of a vector
Args:
x: A numeric vector
base: The base of the logarithm
Returns:
The logarithm values
"""
raise _NotImplementedByCurrentBackendError("log", x)
@_register_func(pipeable=True, dispatchable=True)
def log2(x) -> Any:
"""Compute the base-2 logarithm of a vector
Args:
x: A numeric vector
Returns:
The logarithm values
"""
raise _NotImplementedByCurrentBackendError("log2", x)
@_register_func(pipeable=True, dispatchable=True)
def log10(x) -> Any:
"""Compute the base 10 logarithm of a vector
Args:
x: A numeric vector
Returns:
The logarithm values
"""
raise _NotImplementedByCurrentBackendError("log10", x)
@_register_func(pipeable=True, dispatchable=True)
def log1p(x) -> Any:
"""Compute the logarithm of one plus a vector
Args:
x: A numeric vector
Returns:
The logarithm values
"""
raise _NotImplementedByCurrentBackendError("log1p", x)
@_register_func(pipeable=True, dispatchable=True)
def sd(x, na_rm: bool = False) -> Any:
"""Compute the standard deviation of a vector
Args:
x: A numeric vector
na_rm: Whether to remove `NA` values
Returns:
The standard deviation of the vector
"""
raise _NotImplementedByCurrentBackendError("sd", x)
@_register_func(pipeable=True, dispatchable=True)
def weighted_mean(x, w=None, na_rm: bool = False) -> Any:
"""Compute the weighted mean of a vector
Args:
x: A numeric vector
w: The weights to use
na_rm: Whether to remove `NA` values
Returns:
The weighted mean of the vector
"""
raise _NotImplementedByCurrentBackendError("weighted_mean", x)
@_register_func(pipeable=True, dispatchable=True)
def quantile(
x,
probs=(0.0, 0.25, 0.5, 0.75, 1.0),
na_rm: bool = False,
names: bool = True,
type_: int = 7,
digits: int = 7,
) -> Any:
"""Compute the quantiles of a vector
Args:
x: A numeric vector
probs: The probabilities to use
na_rm: Whether to remove `NA` values
names: Whether to attach the probabilities to the result as names
type_: An integer in 1..9 selecting the quantile algorithm (as in R)
digits: The number of significant digits used when formatting names
Returns:
The quantiles of the vector
"""
raise _NotImplementedByCurrentBackendError("quantile", x)
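The default `type_=7` corresponds to R's linear-interpolation quantile. A minimal pure-Python sketch of that algorithm (a hypothetical helper for illustration, not the actual backend implementation):

```python
def quantile7(x, probs=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Type-7 (linear interpolation) sample quantiles."""
    xs = sorted(x)
    n = len(xs)
    out = []
    for p in probs:
        # fractional index into the sorted data
        h = (n - 1) * p
        lo = int(h)
        hi = min(lo + 1, n - 1)
        # interpolate between the two neighboring order statistics
        out.append(xs[lo] + (h - lo) * (xs[hi] - xs[lo]))
    return out
```

Real backends additionally handle `na_rm`, `names` and the other quantile types.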
@_register_func(pipeable=True, dispatchable=True)
def bessel_i(x, nu, expon_scaled: bool = False) -> Any:
"""Compute the modified Bessel function of the first kind
Args:
x: A numeric vector
nu: The order of the Bessel function
expon_scaled: Whether to use the scaled version
Returns:
The Bessel function values
"""
raise _NotImplementedByCurrentBackendError("bessel_i", x)
@_register_func(pipeable=True, dispatchable=True)
def bessel_j(x, nu) -> Any:
"""Compute the Bessel function of the first kind
Args:
x: A numeric vector
nu: The order of the Bessel function
Returns:
The Bessel function values
"""
raise _NotImplementedByCurrentBackendError("bessel_j", x)
@_register_func(pipeable=True, dispatchable=True)
def bessel_k(x, nu, expon_scaled: bool = False) -> Any:
"""Compute the modified Bessel function of the second kind
Args:
x: A numeric vector
nu: The order of the Bessel function
expon_scaled: Whether to use the scaled version
Returns:
The Bessel function values
"""
raise _NotImplementedByCurrentBackendError("bessel_k", x)
@_register_func(pipeable=True, dispatchable=True)
def bessel_y(x, nu) -> Any:
"""Compute the Bessel function of the second kind
Args:
x: A numeric vector
nu: The order of the Bessel function
Returns:
The Bessel function values
"""
raise _NotImplementedByCurrentBackendError("bessel_y", x)
@_register_func(pipeable=True, dispatchable=True)
def as_double(x) -> Any:
"""Convert a vector to a double vector
Args:
x: A numeric vector
Returns:
The double vector
"""
raise _NotImplementedByCurrentBackendError("as_double", x)
@_register_func(pipeable=True, dispatchable=True)
def as_integer(x) -> Any:
"""Convert a vector to an integer vector
Args:
x: A numeric vector
Returns:
The integer vector
"""
raise _NotImplementedByCurrentBackendError("as_integer", x)
@_register_func(pipeable=True, dispatchable=True)
def as_logical(x) -> Any:
"""Convert a vector to a logical vector
Args:
x: A numeric vector
Returns:
The logical vector
"""
raise _NotImplementedByCurrentBackendError("as_logical", x)
@_register_func(pipeable=True, dispatchable=True)
def as_character(x) -> Any:
"""Convert a vector to a character vector
Args:
x: A numeric vector
Returns:
The character vector
"""
raise _NotImplementedByCurrentBackendError("as_character", x)
@_register_func(pipeable=True, dispatchable=True)
def as_factor(x) -> Any:
"""Convert a vector to a factor vector
Args:
x: A numeric vector
Returns:
The factor vector
"""
raise _NotImplementedByCurrentBackendError("as_factor", x)
@_register_func(pipeable=True, dispatchable=True)
def as_ordered(x) -> Any:
"""Convert a vector to an ordered vector
Args:
x: A numeric vector
Returns:
The ordered vector
"""
raise _NotImplementedByCurrentBackendError("as_ordered", x)
@_register_func(pipeable=True, dispatchable=True)
def as_date(
x,
*,
format=None,
try_formats=None,
optional=False,
tz=0,
origin=None,
) -> Any:
"""Convert an object to a datetime.date object
See: https://rdrr.io/r/base/as.Date.html
Args:
x: Object that can be converted into a datetime.date object
format: If not specified, try_formats are tried one by one on
the first non-np.nan element, and an error is given if none works.
Otherwise, the processing is done via strptime
try_formats: vector of format strings to try if format is not specified.
Default formats to try:
"%Y-%m-%d"
"%Y/%m/%d"
"%Y-%m-%d %H:%M:%S"
"%Y/%m/%d %H:%M:%S"
optional: Whether to return np.nan (instead of signalling an error)
if the format guessing does not succeed.
origin: a datetime.date/datetime object, or something which can be
coerced by as_date(origin, ...) to such an object.
tz: a time zone offset or a datetime.timedelta object.
Note that time zone name is not supported yet.
Returns:
The datetime.date object
"""
raise _NotImplementedByCurrentBackendError("as_date", x)
@_register_func(pipeable=True, dispatchable=True)
def as_numeric(x) -> Any:
"""Convert a vector to a numeric vector
Args:
x: A numeric vector
Returns:
The numeric vector
"""
raise _NotImplementedByCurrentBackendError("as_numeric", x)
@_register_func(pipeable=True, dispatchable=True)
def arg(x) -> Any:
"""Angles of complex numbers
Args:
x: A numeric vector
Returns:
The angles
"""
raise _NotImplementedByCurrentBackendError("arg", x)
@_register_func(pipeable=True, dispatchable=True)
def conj(x) -> Any:
"""Complex conjugate
Args:
x: A numeric vector
Returns:
The complex conjugates
"""
raise _NotImplementedByCurrentBackendError("conj", x)
@_register_func(pipeable=True, dispatchable=True)
def mod(x) -> Any:
"""Modulus of complex numbers
Args:
x: A numeric vector
Returns:
The modulus
"""
raise _NotImplementedByCurrentBackendError("mod", x)
@_register_func(pipeable=True, dispatchable=True)
def re_(x) -> Any:
"""Real part of complex numbers
Args:
x: A numeric vector
Returns:
The real parts
"""
raise _NotImplementedByCurrentBackendError("re", x)
@_register_func(pipeable=True, dispatchable=True)
def im(x) -> Any:
"""Imaginary part of complex numbers
Args:
x: A numeric vector
Returns:
The imaginary parts
"""
raise _NotImplementedByCurrentBackendError("im", x)
@_register_func(pipeable=True, dispatchable=True)
def as_complex(x) -> Any:
"""Convert a vector to a complex vector
Args:
x: A numeric vector
Returns:
The complex vector
"""
raise _NotImplementedByCurrentBackendError("as_complex", x)
@_register_func(pipeable=True, dispatchable=True)
def is_complex(x) -> Any:
"""Check if a vector is complex
Args:
x: A numeric vector
Returns:
Whether the vector is complex
"""
raise _NotImplementedByCurrentBackendError("is_complex", x)
@_register_func(pipeable=True, dispatchable=True)
def cummax(x) -> Any:
"""Cumulative maxima
Args:
x: A numeric vector
Returns:
The cumulative maxima
"""
raise _NotImplementedByCurrentBackendError("cummax", x)
@_register_func(pipeable=True, dispatchable=True)
def cummin(x) -> Any:
"""Cumulative minima
Args:
x: A numeric vector
Returns:
The cumulative minima
"""
raise _NotImplementedByCurrentBackendError("cummin", x)
@_register_func(pipeable=True, dispatchable=True)
def cumprod(x) -> Any:
"""Cumulative products
Args:
x: A numeric vector
Returns:
The cumulative products
"""
raise _NotImplementedByCurrentBackendError("cumprod", x)
@_register_func(pipeable=True, dispatchable=True)
def cumsum(x) -> Any:
"""Cumulative sums
Args:
x: A numeric vector
Returns:
The cumulative sums
"""
raise _NotImplementedByCurrentBackendError("cumsum", x)
@_register_func(pipeable=True, dispatchable=True)
def droplevels(x) -> Any:
"""Drop unused levels of a factor
Args:
x: A factor
Returns:
The factor with unused levels dropped
"""
raise _NotImplementedByCurrentBackendError("droplevels", x)
@_register_func(pipeable=True, dispatchable=True)
def levels(x) -> Any:
"""Get the levels of a factor
Args:
x: A factor
Returns:
The levels of the factor
"""
raise _NotImplementedByCurrentBackendError("levels", x)
@_register_func(pipeable=True, dispatchable=True)
def set_levels(x, levels) -> Any:
"""Set the levels of a factor
Args:
x: A factor
levels: The new levels
Returns:
The factor with the new levels
"""
raise _NotImplementedByCurrentBackendError("set_levels", x)
@_register_func(pipeable=True, dispatchable=True)
def is_factor(x) -> Any:
"""Check if a vector is a factor
Args:
x: A vector
Returns:
Whether the vector is a factor
"""
raise _NotImplementedByCurrentBackendError("is_factor", x)
@_register_func(pipeable=True, dispatchable=True)
def is_ordered(x) -> Any:
"""Check if a vector is ordered
Args:
x: A vector
Returns:
Whether the vector is ordered
"""
raise _NotImplementedByCurrentBackendError("is_ordered", x)
@_register_func(pipeable=True, dispatchable=True)
def nlevels(x) -> Any:
"""Get the number of levels of a factor
Args:
x: A factor
Returns:
The number of levels
"""
raise _NotImplementedByCurrentBackendError("nlevels", x)
@_register_func(pipeable=True, dispatchable=True)
def factor(
x=None,
*,
levels=None,
labels=None,
exclude=None,
ordered=False,
nmax=None,
) -> Any:
"""Create a factor vector
Args:
x: A numeric vector
levels: The levels
labels: The labels
exclude: The excluded levels
ordered: Whether the factor is ordered
nmax: The maximum number of levels
Returns:
The factor vector
"""
raise _NotImplementedByCurrentBackendError("factor", x)
@_register_func(pipeable=True, dispatchable=True)
def ordered(x, levels=None, labels=None, exclude=None, nmax=None) -> Any:
"""Create an ordered factor vector
Args:
x: A numeric vector
levels: The levels
labels: The labels
exclude: The excluded levels
nmax: The maximum number of levels
Returns:
The ordered factor vector
"""
raise _NotImplementedByCurrentBackendError("ordered", x)
@_register_func(pipeable=True, dispatchable=True)
def cut(
x,
breaks,
labels=None,
include_lowest=False,
right=True,
dig_lab=3,
ordered_result=False,
) -> Any:
"""Cut a numeric vector into bins
Args:
x: A numeric vector
breaks: The breaks
labels: The labels
include_lowest: Whether a value equal to the lowest break should be included
right: Whether the intervals are closed on the right (and open on the left)
dig_lab: The number of digits for labels
ordered_result: Whether to return an ordered factor
Returns:
The factor vector
"""
raise _NotImplementedByCurrentBackendError("cut", x)
@_register_func(pipeable=True, dispatchable=True)
def diff(x, lag: int = 1, differences: int = 1) -> Any:
"""Difference of a numeric vector
Args:
x: A numeric vector
lag: The lag to use. Could be negative.
It always calculates `x[lag:] - x[:-lag]` even when `lag` is
negative
differences: The order of the difference
Returns:
An array of `x[lag:] - x[:-lag]`.
If `differences > 1`, the rule applies `differences` times on `x`
"""
raise _NotImplementedByCurrentBackendError("diff", x)
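The lag/differences rule described in the docstring can be sketched in plain Python for positive lags (a hypothetical illustration, independent of any backend):

```python
def diff_sketch(x, lag=1, differences=1):
    """Apply x[lag:] - x[:-lag], `differences` times (positive lag only)."""
    for _ in range(differences):
        x = [b - a for a, b in zip(x[:-lag], x[lag:])]
    return x
```
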
@_register_func(pipeable=True, dispatchable=True)
def expand_grid(x, *args, **kwargs) -> Any:
"""Expand a grid
Args:
x: A numeric vector
*args: Additional numeric vectors
**kwargs: Additional keyword arguments
Returns:
The expanded grid
"""
raise _NotImplementedByCurrentBackendError("expand_grid", x)
@_register_func(pipeable=True, dispatchable=True)
def outer(x, y, fun="*") -> Any:
"""Outer product of two vectors
Args:
x: A numeric vector
y: A numeric vector
fun: The function used to compute the result from elements of
the first and second vectors.
The function has to be vectorized in its second argument, and
return the same shape as `y`.
Returns:
The outer product
"""
raise _NotImplementedByCurrentBackendError("outer", x)
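For intuition, `outer` applies `fun` to every pairing of elements of the two vectors. A minimal sketch with a callable in place of the `"*"` string default (hypothetical helper, not the backend implementation):

```python
def outer_sketch(x, y, fun=lambda a, b: a * b):
    """result[i][j] = fun(x[i], y[j])"""
    return [[fun(a, b) for b in y] for a in x]
```
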
@_register_func(cls=object, pipeable=True, dispatchable=True)
def make_names(names, unique: bool = True) -> Any:
"""Make names for a vector
Args:
names: character vector to be coerced to syntactically valid names.
This is coerced to character if necessary.
unique: Whether to make the names unique
Returns:
The names
"""
try:
from slugify import slugify
except ImportError as imerr: # pragma: no cover
raise ValueError(
"`make_names()` requires `python-slugify` package.\n"
"Try: pip install -U python-slugify"
) from imerr
if isinstance(names, str):
names = [names]
try:
iter(names)
except TypeError:
names = [names]
names = [
slugify(str(name), separator="_", lowercase=False)
for name in names
]
names = [f"_{name}" if name and name[0].isdigit() else name for name in names]
if unique:
return _repair_names(names, "unique")
return names
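The slugify-based logic above can be approximated with the standard library alone; a rough, dependency-free sketch (hypothetical; behavior differs from python-slugify on unicode input):

```python
import re

def make_names_sketch(names):
    """Coerce names to syntactically valid, identifier-like strings."""
    out = []
    for name in names:
        # collapse runs of non-alphanumeric characters into "_"
        name = re.sub(r"[^0-9a-zA-Z]+", "_", str(name)).strip("_")
        # identifiers cannot start with a digit
        if name and name[0].isdigit():
            name = "_" + name
        out.append(name)
    return out
```
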
@_register_func(cls=object, pipeable=True, dispatchable=True)
def make_unique(names) -> Any:
"""Make a vector unique
Args:
names: a character vector
Returns:
The unique vector
"""
return make_names(names, unique=True, __ast_fallback="normal")
@_register_func(pipeable=True, dispatchable=True)
def rank(x, na_last: bool = True, ties_method: str = "average") -> Any:
"""Rank a numeric vector
Args:
x: A numeric vector
na_last: Whether to put NA at the end
ties_method: The method to handle ties. One of "average", "first",
"last", "random", "max", "min"
Returns:
The ranks
"""
raise _NotImplementedByCurrentBackendError("rank", x)
@_register_func(cls=object, pipeable=True, dispatchable=True)
def identity(x) -> Any:
"""Identity function
Args:
x: A numeric vector
Returns:
The same vector
"""
return x
@_register_func(pipeable=True, dispatchable=True)
def is_logical(x) -> Any:
"""Check if a vector is logical
Args:
x: A numeric vector
Returns:
Whether the vector is logical
"""
raise _NotImplementedByCurrentBackendError("is_logical", x)
@_register_func(pipeable=True, dispatchable=True)
def is_true(x) -> bool:
"""Check if `x` is a single true value (like R's `isTRUE`)
Args:
x: object to be tested
Returns:
Whether `x` is true
"""
raise _NotImplementedByCurrentBackendError("is_true", x)
@_register_func(pipeable=True, dispatchable=True)
def is_false(x) -> bool:
"""Check if `x` is a single false value (like R's `isFALSE`)
Args:
x: object to be tested
Returns:
Whether `x` is false
"""
raise _NotImplementedByCurrentBackendError("is_false", x)
@_register_func(pipeable=True, dispatchable=True)
def is_na(x) -> Any:
"""Check if anything is NA
Args:
x: object to be tested
Returns:
Whether `x` is NA
"""
raise _NotImplementedByCurrentBackendError("is_na", x)
@_register_func(pipeable=True, dispatchable=True)
def is_finite(x) -> Any:
"""Check if anything is finite
Args:
x: object to be tested
Returns:
Whether `x` is finite
"""
raise _NotImplementedByCurrentBackendError("is_finite", x)
@_register_func(pipeable=True, dispatchable=True)
def is_infinite(x) -> Any:
"""Check if anything is infinite
Args:
x: object to be tested
Returns:
Whether `x` is infinite
"""
raise _NotImplementedByCurrentBackendError("is_infinite", x)
@_register_func(pipeable=True, dispatchable=True)
def any_na(x) -> Any:
"""Check if anything in `x` is NA
Args:
x: object to be tested
Returns:
Whether anything in `x` is NA
"""
raise _NotImplementedByCurrentBackendError("any_na", x)
@_register_func(pipeable=True, dispatchable=True)
def as_null(x) -> Any:
"""Convert anything to NULL
Args:
x: object to be converted
Returns:
NULL
"""
raise _NotImplementedByCurrentBackendError("as_null", x)
@_register_func(pipeable=True, dispatchable=True)
def is_null(x) -> Any:
"""Check if anything is NULL
Args:
x: object to be tested
Returns:
Whether `x` is NULL
"""
raise _NotImplementedByCurrentBackendError("is_null", x)
@_register_func(pipeable=True, dispatchable=True)
def set_seed(seed) -> Any:
"""Set the seed of the random number generator
Args:
seed: The seed
"""
raise _NotImplementedByCurrentBackendError("set_seed", seed)
@_register_func(pipeable=True, dispatchable="all")
def rep(x, times=1, length=None, each=1) -> Any:
"""Replicate elements of a vector
Args:
x: a vector or scalar
times: number of times to repeat each element if of length len(x),
or to repeat the whole vector if of length 1
length: non-negative integer. The desired length of the output vector
each: non-negative integer. Each element of x is repeated `each` times.
Returns:
The replicated vector
"""
raise _NotImplementedByCurrentBackendError("rep", x)
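R's `rep` recycling rules, as documented above, can be sketched in plain Python (hypothetical helper; real backends dispatch on the input type):

```python
def rep_sketch(x, times=1, length=None, each=1):
    """Replicate elements: apply `each`, then `times`, then recycle to `length`."""
    out = [v for v in x for _ in range(each)]
    if isinstance(times, int):
        out = out * times
    else:
        # element-wise times; only meaningful when each == 1
        out = [v for v, t in zip(x, times) for _ in range(t)]
    if length is not None:
        # recycle the result to the desired length
        out = (out * (length // len(out) + 1))[:length]
    return out
```
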
@_register_func(pipeable=True, dispatchable=True)
def c_(*args) -> Any:
"""Concatenate vectors
Args:
args: vectors to be concatenated
Returns:
The concatenated vector
"""
raise _NotImplementedByCurrentBackendError("c", *args)
c = _CollectionFunction(c_)
@_register_func(pipeable=True, dispatchable=True)
def length(x) -> Any:
"""Get the length of a vector
Args:
x: a vector or scalar
Returns:
The length of the vector
"""
raise _NotImplementedByCurrentBackendError("length", x)
@_register_func(pipeable=True, dispatchable=True)
def lengths(x) -> Any:
"""Get the lengths of a list
Args:
x: a list
Returns:
The lengths of the list
"""
raise _NotImplementedByCurrentBackendError("lengths", x)
@_register_func(pipeable=True, dispatchable=True)
def order(x, decreasing: bool = False, na_last: bool = True) -> Any:
"""Order a vector
Args:
x: a vector or scalar
decreasing: Whether to order in decreasing order
na_last: Whether to put NA at the end
Returns:
The order
"""
raise _NotImplementedByCurrentBackendError("order", x)
@_register_func(pipeable=True, dispatchable=True)
def sort(x, decreasing: bool = False, na_last: bool = True) -> Any:
"""Sort a vector
Args:
x: a vector or scalar
decreasing: Whether to sort in decreasing order
na_last: Whether to put NA at the end
Returns:
The sorted vector
"""
raise _NotImplementedByCurrentBackendError("sort", x)
@_register_func(pipeable=True, dispatchable=True)
def rev(x) -> Any:
"""Reverse a vector
Args:
x: a vector or scalar
Returns:
The reversed vector
"""
raise _NotImplementedByCurrentBackendError("rev", x)
@_register_func(pipeable=True, dispatchable=True)
def sample(x, size=None, replace: bool = False, prob=None) -> Any:
"""Sample a vector
Args:
x: a vector or scalar
size: the size of the sample
replace: whether to sample with replacement
prob: the probabilities of sampling each element
Returns:
The sampled vector
"""
raise _NotImplementedByCurrentBackendError("sample", x)
@_register_func(pipeable=True, dispatchable=True)
def seq(from_=None, to=None, by=None, length_out=None, along_with=None) -> Any:
"""Generate a sequence
Args:
from_: the start of the sequence
to: the end of the sequence
by: the step of the sequence
length_out: the length of the sequence
along_with: the sequence to be aligned with
Returns:
The sequence
"""
raise _NotImplementedByCurrentBackendError("seq", from_)
@_register_func(pipeable=True, dispatchable=True)
def seq_along(x) -> Any:
"""Generate a sequence along a vector
Args:
x: a vector or scalar
Returns:
The sequence
"""
raise _NotImplementedByCurrentBackendError("seq_along", x)
@_register_func(pipeable=True, dispatchable=True)
def seq_len(x) -> Any:
"""Generate a sequence of length x
Args:
x: a vector or scalar
Returns:
The sequence
"""
raise _NotImplementedByCurrentBackendError("seq_len", x)
@_register_func(pipeable=True, dispatchable=True)
def match(x, table, nomatch=-1) -> Any:
"""Match elements of a vector
Args:
x: a vector or scalar
table: the table to match
nomatch: the value to use for no match
Returns:
The matched vector
"""
raise _NotImplementedByCurrentBackendError("match", x)
@_register_func(pipeable=True, dispatchable=True)
def beta(x, y) -> Any:
"""Compute the beta function
Args:
x: a vector or scalar
y: a vector or scalar
Returns:
The beta function
"""
raise _NotImplementedByCurrentBackendError("beta", x)
@_register_func(pipeable=True, dispatchable=True)
def lgamma(x) -> Any:
"""Compute the log gamma function
Args:
x: a vector or scalar
Returns:
The log gamma function
"""
raise _NotImplementedByCurrentBackendError("lgamma", x)
@_register_func(pipeable=True, dispatchable=True)
def digamma(x) -> Any:
"""Compute the digamma function
Args:
x: a vector or scalar
Returns:
The digamma function
"""
raise _NotImplementedByCurrentBackendError("digamma", x)
@_register_func(pipeable=True, dispatchable=True)
def trigamma(x) -> Any:
"""Compute the trigamma function
Args:
x: a vector or scalar
Returns:
The trigamma function
"""
raise _NotImplementedByCurrentBackendError("trigamma", x)
@_register_func(pipeable=True, dispatchable=True)
def choose(n, k) -> Any:
"""Compute the binomial coefficient
Args:
n: a vector or scalar
k: a vector or scalar
Returns:
The binomial coefficient
"""
raise _NotImplementedByCurrentBackendError("choose", n)
@_register_func(pipeable=True, dispatchable=True)
def factorial(x) -> Any:
"""Compute the factorial
Args:
x: a vector or scalar
Returns:
The factorial
"""
raise _NotImplementedByCurrentBackendError("factorial", x)
@_register_func(pipeable=True, dispatchable=True)
def gamma(x) -> Any:
"""Compute the gamma function
Args:
x: a vector or scalar
Returns:
The gamma function
"""
raise _NotImplementedByCurrentBackendError("gamma", x)
@_register_func(pipeable=True, dispatchable=True)
def lfactorial(x) -> Any:
"""Compute the log factorial
Args:
x: a vector or scalar
Returns:
The log factorial
"""
raise _NotImplementedByCurrentBackendError("lfactorial", x)
@_register_func(pipeable=True, dispatchable=True)
def lchoose(n, k) -> Any:
"""Compute the log binomial coefficient
Args:
n: a vector or scalar
k: a vector or scalar
Returns:
The log binomial coefficient
"""
raise _NotImplementedByCurrentBackendError("lchoose", n)
@_register_func(pipeable=True, dispatchable=True)
def lbeta(x, y) -> Any:
"""Compute the log beta function
Args:
x: a vector or scalar
y: a vector or scalar
Returns:
The log beta function
"""
raise _NotImplementedByCurrentBackendError("lbeta", x)
@_register_func(pipeable=True, dispatchable=True)
def psigamma(x, deriv) -> Any:
"""Compute the psi function
Args:
x: a vector or scalar
deriv: the derivative
Returns:
The psi function
"""
raise _NotImplementedByCurrentBackendError("psigamma", x)
@_register_func(pipeable=True, dispatchable=True)
def rnorm(n, mean=0, sd=1) -> Any:
"""Generate random normal variables
Args:
n: the number of random variables
mean: the mean of the random variables
sd: the standard deviation of the random variables
Returns:
The random normal variables
"""
raise _NotImplementedByCurrentBackendError("rnorm", n)
@_register_func(pipeable=True, dispatchable=True)
def runif(n, min=0, max=1) -> Any:
"""Generate random uniform variables
Args:
n: the number of random variables
min: the minimum of the random variables
max: the maximum of the random variables
Returns:
The random uniform variables
"""
raise _NotImplementedByCurrentBackendError("runif", n)
@_register_func(pipeable=True, dispatchable=True)
def rpois(n, lambda_) -> Any:
"""Generate random Poisson variables
Args:
n: the number of random variables
lambda_: the mean of the Poisson distribution
Returns:
The random Poisson variables
"""
raise _NotImplementedByCurrentBackendError("rpois", n)
@_register_func(pipeable=True, dispatchable=True)
def rbinom(n, size, prob) -> Any:
"""Generate random binomial variables
Args:
n: the number of random variables
size: the number of trials
prob: the probability of success on each trial
Returns:
The random binomial variables
"""
raise _NotImplementedByCurrentBackendError("rbinom", n)
@_register_func(pipeable=True, dispatchable=True)
def rcauchy(n, location=0, scale=1) -> Any:
"""Generate random Cauchy variables
Args:
n: the number of random variables
location: the location of the random variables
scale: the scale of the random variables
Returns:
The random Cauchy variables
"""
raise _NotImplementedByCurrentBackendError("rcauchy", n)
@_register_func(pipeable=True, dispatchable=True)
def rchisq(n, df) -> Any:
"""Generate random chi-squared variables
Args:
n: the number of random variables
df: the degrees of freedom of the random variables
Returns:
The random chi-squared variables
"""
raise _NotImplementedByCurrentBackendError("rchisq", n)
@_register_func(pipeable=True, dispatchable=True)
def rexp(n, rate) -> Any:
"""Generate random exponential variables
Args:
n: the number of random variables
rate: the rate parameter of the exponential distribution
Returns:
The random exponential variables
"""
raise _NotImplementedByCurrentBackendError("rexp", n)
@_register_func(pipeable=True, dispatchable=True)
def is_character(x) -> Any:
"""Is x a character vector
Args:
x: a vector or scalar
Returns:
True if x is a character vector
"""
raise _NotImplementedByCurrentBackendError("is_character", x)
@_register_func(pipeable=True, dispatchable=True)
def grep(
pattern,
x,
ignore_case=False,
value=False,
fixed=False,
invert=False,
) -> Any:
"""Grep for a pattern
Args:
pattern: the pattern to search for
x: the vector to search
ignore_case: ignore case
value: return the value
fixed: use fixed string matching
invert: invert the match
Returns:
The indices of the matches (or the matching values if `value` is True)
"""
raise _NotImplementedByCurrentBackendError("grep", pattern)
@_register_func(pipeable=True, dispatchable=True)
def grepl(pattern, x, ignore_case=False, fixed=False) -> Any:
"""Grep for a pattern
Args:
pattern: the pattern to search for
x: the vector to search
ignore_case: ignore case
fixed: use fixed string matching
Returns:
A logical vector indicating whether each element matches
"""
raise _NotImplementedByCurrentBackendError("grepl", pattern)
@_register_func(pipeable=True, dispatchable=True)
def sub(pattern, replacement, x, ignore_case=False, fixed=False) -> Any:
"""Substitute the first occurrence of a pattern
Args:
pattern: the pattern to search for
replacement: the replacement
x: the vector to search
ignore_case: ignore case
fixed: use fixed string matching
Returns:
The vector with the substitutions
"""
raise _NotImplementedByCurrentBackendError("sub", pattern)
@_register_func(pipeable=True, dispatchable=True)
def gsub(pattern, replacement, x, ignore_case=False, fixed=False) -> Any:
"""Substitute all occurrences of a pattern
Args:
pattern: the pattern to search for
replacement: the replacement
x: the vector to search
ignore_case: ignore case
fixed: use fixed string matching
Returns:
The vector with the substitutions
"""
raise _NotImplementedByCurrentBackendError("gsub", pattern)
@_register_func(pipeable=True, dispatchable=True)
def strsplit(x, split, fixed=False, perl=False, use_bytes=False) -> Any:
"""Split a string
Args:
x: the vector to split
split: the pattern to split on
fixed: use fixed string matching
perl: use perl regular expressions
use_bytes: use bytes
Returns:
The vector with the splits
"""
raise _NotImplementedByCurrentBackendError("strsplit", x)
@_register_func(pipeable=True, dispatchable=True)
def paste(*args, sep=" ", collapse=None) -> Any:
"""Join a vector into a string
Args:
*args: the vectors to join
sep: the separator used when joining element-wise
collapse: if given, the string used to collapse the result into
a single string
Returns:
The vector joined into a string
"""
raise _NotImplementedByCurrentBackendError("paste")
@_register_func(pipeable=True, dispatchable=True)
def paste0(*args, collapse=None) -> Any:
"""Join a vector into a string
Args:
*args: the vector to join
collapse: collapse the vector
Returns:
The vector joined into a string
"""
raise _NotImplementedByCurrentBackendError("paste0")
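Both `paste` and `paste0` follow R's recycling rule: inputs are recycled to the longest length, joined element-wise with `sep`, then optionally collapsed into one string. A minimal sketch (hypothetical; scalars treated as length-1 vectors):

```python
def paste_sketch(*args, sep=" ", collapse=None):
    """Element-wise join with recycling, like R's paste()."""
    args = [a if isinstance(a, (list, tuple)) else [a] for a in args]
    n = max(len(a) for a in args)
    # recycle shorter inputs by indexing modulo their length
    joined = [sep.join(str(a[i % len(a)]) for a in args) for i in range(n)]
    return collapse.join(joined) if collapse is not None else joined
```

`paste0` is then just `paste_sketch(*args, sep="", collapse=collapse)`.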
@_register_func(pipeable=True, dispatchable=True)
def sprintf(fmt, *args) -> Any:
"""Format a string
Args:
fmt: the format string
args: the arguments to the format string
Returns:
The formatted string
"""
raise _NotImplementedByCurrentBackendError("sprintf", fmt)
@_register_func(pipeable=True, dispatchable=True)
def substr(x, start, stop) -> Any:
"""Get a substring
Args:
x: the string to get the substring from
start: the start of the substring
stop: the stop of the substring
Returns:
The substring
"""
raise _NotImplementedByCurrentBackendError("substr", x)
@_register_func(pipeable=True, dispatchable=True)
def substring(x, first, last=None) -> Any:
"""Get a substring
Args:
x: the string to get the substring from
first: the start of the substring
last: the stop of the substring
Returns:
The substring
"""
raise _NotImplementedByCurrentBackendError("substring", x)
@_register_func(pipeable=True, dispatchable=True)
def startswith(x, prefix) -> Any:
"""Does x start with prefix
Args:
x: the string to check
prefix: the prefix to check
Returns:
True if x starts with prefix
"""
raise _NotImplementedByCurrentBackendError("startswith", x)
@_register_func(pipeable=True, dispatchable=True)
def endswith(x, suffix) -> Any:
"""Does x end with suffix
Args:
x: the string to check
suffix: the suffix to check
Returns:
True if x ends with suffix
"""
raise _NotImplementedByCurrentBackendError("endswith", x)
@_register_func(pipeable=True, dispatchable=True)
def strtoi(x, base=0) -> Any:
"""Convert a string to an integer
Args:
x: the string to convert
base: the base of the integer
Returns:
The integer
"""
raise _NotImplementedByCurrentBackendError("strtoi", x)
@_register_func(pipeable=True, dispatchable=True)
def trimws(x, which="both", whitespace=r" \t") -> Any:
"""Trim whitespace from a string
Args:
x: the string to trim
which: which side to trim: "both", "left" or "right"
whitespace: the set of characters treated as whitespace
Returns:
The trimmed string
"""
raise _NotImplementedByCurrentBackendError("trimws", x)
@_register_func(pipeable=True, dispatchable=True)
def toupper(x) -> Any:
"""Convert a string to upper case
Args:
x: the string to convert
Returns:
The upper case string
"""
raise _NotImplementedByCurrentBackendError("toupper", x)
@_register_func(pipeable=True, dispatchable=True)
def tolower(x) -> Any:
"""Convert a string to lower case
Args:
x: the string to convert
Returns:
The lower case string
"""
raise _NotImplementedByCurrentBackendError("tolower", x)
@_register_func(pipeable=True, dispatchable=True)
def chartr(old, new, x) -> Any:
"""Translate characters
Args:
old: the characters to translate
new: the new characters
x: the string to translate
Returns:
The translated string
"""
raise _NotImplementedByCurrentBackendError("chartr", x)
@_register_func(pipeable=True, dispatchable=True)
def nchar(
x,
type_="width",
allow_na: bool = True,
keep_na: bool = False,
_na_len: int = 2,
) -> Any:
"""Get the number of characters in a string
Args:
x: the string to count
type_: the type of counting: "bytes", "chars" or "width"
allow_na: whether NA may be returned for invalid multibyte strings
keep_na: whether NA inputs yield NA instead of a count
Returns:
The number of characters
"""
raise _NotImplementedByCurrentBackendError("nchar", x)
@_register_func(pipeable=True, dispatchable=True)
def nzchar(x, keep_na: bool = False) -> Any:
"""Is the string non-zero length
Args:
x: the string to check
keep_na: keep NA
Returns:
True if the string is non-zero length
"""
raise _NotImplementedByCurrentBackendError("nzchar", x)
@_register_func(pipeable=True, dispatchable=True)
def table(
x,
*more,
exclude=None,
use_na="no",
dnn=None,
deparse_level=1,
) -> Any:
"""Get the table of a vector
Args:
x: the vector to get the table of
more: more vectors
exclude: exclude these values
use_na: whether to include NA counts: "no", "ifany" or "always"
dnn: the names of the vectors
deparse_level: the deparse level
Returns:
The table
"""
raise _NotImplementedByCurrentBackendError("table", x)
@_register_func(pipeable=True, dispatchable=True)
def tabulate(bin, nbins=None) -> Any:
"""Get the table of a vector
Args:
bin: the vector to get the table of
nbins: the number of bins
Returns:
An integer vector (without names) with one bin for each of the
values `1, ..., nbins`, counting the elements of `bin` falling in it
"""
raise _NotImplementedByCurrentBackendError("tabulate", bin)
@_register_func(pipeable=True, dispatchable=True)
def is_atomic(x) -> Any:
"""Is the object atomic
Args:
x: the object to check
Returns:
True if the object is atomic
"""
raise _NotImplementedByCurrentBackendError("is_atomic", x)
@_register_func(pipeable=True, dispatchable=True)
def is_double(x) -> Any:
"""Is the object a double
Args:
x: the object to check
Returns:
True if the object is a double
"""
raise _NotImplementedByCurrentBackendError("is_double", x)
@_register_func(pipeable=True, dispatchable=True)
def is_element(x, y) -> Any:
"""Is the object an element of the table
Args:
x: the object to check
y: the pool to check
Returns:
True if the object is an element of the pool
"""
raise _NotImplementedByCurrentBackendError("is_element", x)
is_in = is_element
@_register_func(pipeable=True, dispatchable=True)
def is_integer(x) -> Any:
"""Is the object an integer
Args:
x: the object to check
Returns:
True if the object is an integer
"""
raise _NotImplementedByCurrentBackendError("is_integer", x)
@_register_func(pipeable=True, dispatchable=True)
def is_numeric(x) -> Any:
"""Is the object numeric
Args:
x: the object to check
Returns:
True if the object is numeric
"""
raise _NotImplementedByCurrentBackendError("is_numeric", x)
@_register_func(pipeable=True, dispatchable=True)
def any_(x, na_rm: bool = False) -> Any:
"""Is any element true
Args:
x: the vector to check
na_rm: remove NA
Returns:
True if any element is true
"""
raise _NotImplementedByCurrentBackendError("any", x)
@_register_func(pipeable=True, dispatchable=True)
def all_(x, na_rm: bool = False) -> Any:
"""Are all elements true
Args:
x: the vector to check
na_rm: remove NA
Returns:
True if all elements are true
"""
raise _NotImplementedByCurrentBackendError("all", x)
@_register_func(pipeable=True, dispatchable=True)
def acos(x) -> Any:
"""Get the inverse cosine
Args:
x: the value to get the inverse cosine of
Returns:
The inverse cosine
"""
raise _NotImplementedByCurrentBackendError("acos", x)
@_register_func(pipeable=True, dispatchable=True)
def acosh(x) -> Any:
"""Get the inverse hyperbolic cosine
Args:
x: the value to get the inverse hyperbolic cosine of
Returns:
The inverse hyperbolic cosine
"""
raise _NotImplementedByCurrentBackendError("acosh", x)
@_register_func(pipeable=True, dispatchable=True)
def asin(x) -> Any:
"""Get the inverse sine
Args:
x: the value to get the inverse sine of
Returns:
The inverse sine
"""
raise _NotImplementedByCurrentBackendError("asin", x)
@_register_func(pipeable=True, dispatchable=True)
def asinh(x) -> Any:
"""Get the inverse hyperbolic sine
Args:
x: the value to get the inverse hyperbolic sine of
Returns:
The inverse hyperbolic sine
"""
raise _NotImplementedByCurrentBackendError("asinh", x)
@_register_func(pipeable=True, dispatchable=True)
def atan(x) -> Any:
"""Get the inverse tangent
Args:
x: the value to get the inverse tangent of
Returns:
The inverse tangent
"""
raise _NotImplementedByCurrentBackendError("atan", x)
@_register_func(pipeable=True, dispatchable=True)
def atanh(x) -> Any:
"""Get the inverse hyperbolic tangent
Args:
x: the value to get the inverse hyperbolic tangent of
Returns:
The inverse hyperbolic tangent
"""
raise _NotImplementedByCurrentBackendError("atanh", x)
@_register_func(pipeable=True, dispatchable=True)
def cos(x) -> Any:
"""Get the cosine
Args:
x: the value to get the cosine of
Returns:
The cosine
"""
raise _NotImplementedByCurrentBackendError("cos", x)
@_register_func(pipeable=True, dispatchable=True)
def cosh(x) -> Any:
"""Get the hyperbolic cosine
Args:
x: the value to get the hyperbolic cosine of
Returns:
The hyperbolic cosine
"""
raise _NotImplementedByCurrentBackendError("cosh", x)
@_register_func(pipeable=True, dispatchable=True)
def cospi(x) -> Any:
"""Get the cosine of pi times x
Args:
x: the value to get the cosine of pi times x of
Returns:
The cosine of pi times x
"""
raise _NotImplementedByCurrentBackendError("cospi", x)
@_register_func(pipeable=True, dispatchable=True)
def sin(x) -> Any:
"""Get the sine
Args:
x: the value to get the sine of
Returns:
The sine
"""
raise _NotImplementedByCurrentBackendError("sin", x)
@_register_func(pipeable=True, dispatchable=True)
def sinh(x) -> Any:
"""Get the hyperbolic sine
Args:
x: the value to get the hyperbolic sine of
Returns:
The hyperbolic sine
"""
raise _NotImplementedByCurrentBackendError("sinh", x)
@_register_func(pipeable=True, dispatchable=True)
def sinpi(x) -> Any:
"""Get the sine of pi times x
Args:
x: the value to get the sine of pi times x of
Returns:
The sine of pi times x
"""
raise _NotImplementedByCurrentBackendError("sinpi", x)
@_register_func(pipeable=True, dispatchable=True)
def tan(x) -> Any:
"""Get the tangent
Args:
x: the value to get the tangent of
Returns:
The tangent
"""
raise _NotImplementedByCurrentBackendError("tan", x)
@_register_func(pipeable=True, dispatchable=True)
def tanh(x) -> Any:
"""Get the hyperbolic tangent
Args:
x: the value to get the hyperbolic tangent of
Returns:
The hyperbolic tangent
"""
raise _NotImplementedByCurrentBackendError("tanh", x)
@_register_func(pipeable=True, dispatchable=True)
def tanpi(x) -> Any:
"""Get the tangent of pi times x
Args:
x: the value to get the tangent of pi times x of
Returns:
The tangent of pi times x
"""
raise _NotImplementedByCurrentBackendError("tanpi", x)
@_register_func(pipeable=True, dispatchable=True)
def atan2(y, x) -> Any:
"""Get the inverse tangent of y/x
Args:
y: the numerator
x: the denominator
Returns:
The inverse tangent of y/x
"""
raise _NotImplementedByCurrentBackendError("atan2", y)
@_register_func(pipeable=True, dispatchable=True)
def append(x, values, after: int = -1) -> Any:
"""Append values to the vector
Args:
x: the vector to append to
values: the values to append
after: the index to append after
Returns:
The vector with the values appended
"""
raise _NotImplementedByCurrentBackendError("append", x)
@_register_func(pipeable=True, dispatchable=True)
def colnames(x, nested: bool = True) -> Any:
"""Get the column names
Args:
x: the data frame to get the column names of
nested: whether x is a nested data frame
Returns:
The column names
"""
raise _NotImplementedByCurrentBackendError("colnames", x)
@_register_func(pipeable=True, dispatchable=True)
def set_colnames(x, names, nested: bool = True) -> Any:
"""Set the column names
Args:
x: the data frame to set the column names of
names: the column names to set
nested: whether x is a nested data frame
Returns:
The data frame with the column names set
"""
raise _NotImplementedByCurrentBackendError("set_colnames", x)
@_register_func(pipeable=True, dispatchable=True)
def rownames(x) -> Any:
"""Get the row names
Args:
x: the data frame to get the row names of
Returns:
The row names
"""
raise _NotImplementedByCurrentBackendError("rownames", x)
@_register_func(pipeable=True, dispatchable=True)
def set_rownames(x, names) -> Any:
"""Set the row names
Args:
x: the data frame to set the row names of
names: the row names to set
Returns:
The data frame with the row names set
"""
raise _NotImplementedByCurrentBackendError("set_rownames", x)
@_register_func(pipeable=True, dispatchable=True)
def dim(x, nested: bool = True) -> Any:
"""Get the dimensions
Args:
x: the data frame to get the dimensions of
nested: whether x is a nested data frame
Returns:
The dimensions
"""
raise _NotImplementedByCurrentBackendError("dim", x)
@_register_func(pipeable=True, dispatchable=True)
def diag(x, nrow=None, ncol=None) -> Any:
"""Get the diagonal of a matrix
Args:
x: the matrix to get the diagonal of
nrow: the number of rows
ncol: the number of columns
Returns:
The diagonal of the matrix
"""
raise _NotImplementedByCurrentBackendError("diag", x)
@_register_func(pipeable=True, dispatchable=True)
def duplicated(x, incomparables=None, from_last: bool = False) -> Any:
"""Get the duplicated values
Args:
x: the vector to get the duplicated values of
incomparables: a vector of values that cannot be compared
from_last: whether to search from the last
Returns:
A logical vector marking which elements of x are duplicates
"""
raise _NotImplementedByCurrentBackendError("duplicated", x)
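A hypothetical pure-Python sketch of the `duplicated` semantics (mirroring R: an element is marked True when an equal value occurred before it, or after it with `from_last=True`); the actual behavior is provided by the backend:

```python
def duplicated_sketch(x, from_last=False):
    """Mark elements that duplicate an earlier (or later) element."""
    seq = list(reversed(x)) if from_last else list(x)
    seen = set()
    out = []
    for v in seq:
        out.append(v in seen)
        seen.add(v)
    return list(reversed(out)) if from_last else out

print(duplicated_sketch([1, 2, 1, 3, 2]))  # [False, False, True, False, True]
```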
@_register_func(pipeable=True, dispatchable=True)
def intersect(x, y) -> Any:
"""Get the intersection of two vectors
Args:
x: the first vector
y: the second vector
Returns:
The intersection of the two vectors
"""
raise _NotImplementedByCurrentBackendError("intersect", x)
@_register_func(pipeable=True, dispatchable=True)
def ncol(x, nested: bool = True) -> Any:
"""Get the number of columns
Args:
x: the data frame to get the number of columns of
nested: whether x is a nested data frame
Returns:
The number of columns
"""
raise _NotImplementedByCurrentBackendError("ncol", x)
@_register_func(pipeable=True, dispatchable=True)
def nrow(x) -> Any:
"""Get the number of rows
Args:
x: the data frame to get the number of rows of
Returns:
The number of rows
"""
raise _NotImplementedByCurrentBackendError("nrow", x)
@_register_func(pipeable=True, dispatchable=True)
def proportions(x, margin: int = 1) -> Any:
"""Get the proportion table
Args:
x: the data frame to get the proportion table of
margin: the margin
Returns:
The proportion table
"""
raise _NotImplementedByCurrentBackendError("proportions", x)
@_register_func(pipeable=True, dispatchable=True)
def setdiff(x, y) -> Any:
"""Get the difference of two vectors
Args:
x: the first vector
y: the second vector
Returns:
The difference of the two vectors
"""
raise _NotImplementedByCurrentBackendError("setdiff", x)
@_register_func(pipeable=True, dispatchable=True)
def setequal(x, y) -> Any:
"""Check if two vectors are equal
Args:
x: the first vector
y: the second vector
Returns:
Whether the two vectors are equal
"""
raise _NotImplementedByCurrentBackendError("setequal", x)
@_register_func(pipeable=True, dispatchable=True)
def unique(x) -> Any:
"""Get the unique values
Args:
x: the vector to get the unique values of
Returns:
The unique values
"""
raise _NotImplementedByCurrentBackendError("unique", x)
@_register_func(pipeable=True, dispatchable=True)
def t(x) -> Any:
"""Get the transpose
Args:
x: the matrix to get the transpose of
Returns:
The transpose
"""
raise _NotImplementedByCurrentBackendError("t", x)
@_register_func(pipeable=True, dispatchable=True)
def union(x, y) -> Any:
"""Get the union of two vectors
Args:
x: the first vector
y: the second vector
Returns:
The union of the two vectors
"""
raise _NotImplementedByCurrentBackendError("union", x)
@_register_func(pipeable=True, dispatchable=True)
def max_col(x, ties_method: str = "random", nested: bool = True) -> Any:
"""Get the index of the column with the maximum value for each row
Args:
x: the data frame to find the maximum column of
ties_method: how ties are broken: "random", "first" or "last"
nested: whether x is a nested data frame
Returns:
The column index of the maximum value for each row
"""
raise _NotImplementedByCurrentBackendError("max_col", x)
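A hypothetical sketch of the `max_col` semantics over plain lists of rows, covering only the `"first"` ties method (the `"random"` and `"last"` methods are omitted here):

```python
def max_col_sketch(rows, ties_method="first"):
    """For each row, return the (0-based) index of its maximum entry.

    Only ties_method="first" is sketched: on ties, the earliest
    column wins (list.index returns the first occurrence).
    """
    return [row.index(max(row)) for row in rows]

print(max_col_sketch([[1, 3, 3], [5, 2, 1]]))  # [1, 0]
```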
@_register_func(pipeable=True, dispatchable=True)
def complete_cases(x) -> Any:
"""Get the complete cases
Args:
x: the data frame to get the complete cases of
Returns:
A logical vector indicating which rows contain no missing values
"""
raise _NotImplementedByCurrentBackendError("complete_cases", x)
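A minimal sketch of the `complete_cases` semantics, representing a frame as a list of rows and missing values as `None` (an assumption for illustration; backends operate on real data frames):

```python
def complete_cases_sketch(rows):
    """True for each row that contains no missing (None) value."""
    return [all(v is not None for v in row) for row in rows]

rows = [[1, "a"], [2, None], [None, "c"]]
print(complete_cases_sketch(rows))  # [True, False, False]
```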
@_register_func(pipeable=True, dispatchable=True)
def head(x, n: int = 6) -> Any:
"""Get the first n rows
Args:
x: the data frame to get the first n rows of
n: the number of rows to get
Returns:
The first n rows
"""
raise _NotImplementedByCurrentBackendError("head", x)
@_register_func(pipeable=True, dispatchable=True)
def tail(x, n: int = 6) -> Any:
"""Get the last n rows
Args:
x: the data frame to get the last n rows of
n: the number of rows to get
Returns:
The last n rows
"""
raise _NotImplementedByCurrentBackendError("tail", x)
@_register_func(pipeable=True, dispatchable=True)
def which(x) -> Any:
"""Get the indices of the non-zero values
Args:
x: the vector to get the indices of the non-zero values of
Returns:
The indices of the non-zero values
"""
raise _NotImplementedByCurrentBackendError("which", x)
@_register_func(pipeable=True, dispatchable=True)
def which_max(x) -> Any:
"""Get the index of the maximum value
Args:
x: the vector to get the index of the maximum value of
Returns:
The index of the maximum value
"""
raise _NotImplementedByCurrentBackendError("which_max", x)
@_register_func(pipeable=True, dispatchable=True)
def which_min(x) -> Any:
"""Get the index of the minimum value
Args:
x: the vector to get the index of the minimum value of
Returns:
The index of the minimum value
"""
raise _NotImplementedByCurrentBackendError("which_min", x)
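The `which` family above can be sketched in pure Python; note this sketch uses 0-based indices, whereas R's `which`, `which.max` and `which.min` are 1-based (an assumption about the Python-side convention):

```python
def which_sketch(x):
    """0-based indices of the truthy elements."""
    return [i for i, v in enumerate(x) if v]

def which_max_sketch(x):
    """Index of the first occurrence of the maximum."""
    return x.index(max(x))

def which_min_sketch(x):
    """Index of the first occurrence of the minimum."""
    return x.index(min(x))

print(which_sketch([True, False, True]))  # [0, 2]
print(which_max_sketch([3, 7, 7, 1]))     # 1
```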
================================================
FILE: datar/apis/dplyr.py
================================================
# import the variables with _ so that they are not imported by *
from __future__ import annotations as _
from typing import (
Any,
Callable as _Callable,
Sequence as _Sequence,
TypeVar as _TypeVar,
)
from pipda import (
register_verb as _register_verb,
register_func as _register_func,
)
from ..core.verb_env import get_verb_ast_fallback as _get_verb_ast_fallback
from ..core.defaults import f as _f_symbolic
from ..core.utils import (
NotImplementedByCurrentBackendError as _NotImplementedByCurrentBackendError,
)
from .base import intersect, setdiff, setequal, union # noqa: F401
T = _TypeVar("T")
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("pick"))
def pick(_data: T, *args) -> T:
"""Pick columns by name
The original API:
https://dplyr.tidyverse.org/reference/pick.html
Args:
_data: The dataframe
*args: The columns to pick
Returns:
The picked dataframe
"""
raise _NotImplementedByCurrentBackendError("pick", _data)
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("across"))
def across(_data: T, *args, _names=None, **kwargs) -> T:
"""Apply the same transformation to multiple columns
The original API:
https://dplyr.tidyverse.org/reference/across.html
Examples:
>>> iris >> mutate(across(c(f.Sepal_Length, f.Sepal_Width), round))
Sepal_Length Sepal_Width Petal_Length Petal_Width Species
<float64> <float64> <float64> <float64> <object>
0 5.0 4.0 1.4 0.2 setosa
1 5.0 3.0 1.4 0.2 setosa
.. ... ... ... ... ...
>>> iris >> group_by(f.Species) >> summarise(
>>> across(starts_with("Sepal"), mean)
>>> )
Species Sepal_Length Sepal_Width
<object> <float64> <float64>
0 setosa 5.006 3.428
1 versicolor 5.936 2.770
2 virginica 6.588 2.974
Args:
_data: The dataframe.
*args: If given, the first 2 elements should be the columns and the
functions to apply to each of the selected columns. The rest
are additional arguments for the functions.
_names: A glue specification that describes how to name
the output columns. This can use `{_col}` to stand for the
selected column name, and `{_fn}` to stand for the name of
the function being applied.
The default (None) is equivalent to `{_col}` for the
single function case and `{_col}_{_fn}` for the case where
a list is used for _fns. In such a case, `{_fn}` is 0-based.
To use a 1-based index, use `{_fn1}`
_fn_context: Defines the context to evaluate the arguments for functions
if they are plain functions.
Note that registered functions will use its own context
**kwargs: Keyword arguments for the functions
Returns:
A dataframe with one column for each column and each function.
"""
raise _NotImplementedByCurrentBackendError("across", _data)
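The `_names` glue specification described above can be emulated with `str.format`; the `resolve_names` helper below is hypothetical (not part of the API), assuming `{_fn}` resolves to the 0-based index and `{_fn1}` to the 1-based index when a list of functions is given:

```python
def resolve_names(spec, col, fn_index):
    """Hypothetical helper: fill the glue spec with the column name
    and the 0-based / 1-based function indices."""
    return spec.format(_col=col, _fn=fn_index, _fn1=fn_index + 1)

print(resolve_names("{_col}_{_fn}", "Sepal_Length", 0))   # Sepal_Length_0
print(resolve_names("{_col}_{_fn1}", "Sepal_Length", 0))  # Sepal_Length_1
```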
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("c_across"))
def c_across(_data: T, _cols=None) -> T:
"""Apply the same transformation to multiple columns row-wise
Args:
_data: The dataframe
_cols: The columns
Returns:
A rowwise tibble
"""
raise _NotImplementedByCurrentBackendError("c_across", _data)
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("if_any"))
def if_any(_data, *args, _names=None, **kwargs) -> Any:
"""Apply the same predicate function to a selection of columns and
combine the results, returning True if any element is True.
See Also:
[`across()`](datar.dplyr.across.across)
"""
raise _NotImplementedByCurrentBackendError("if_any", _data)
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("if_all"))
def if_all(_data, *args, _names=None, **kwargs) -> Any:
"""Apply the same predicate function to a selection of columns and
combine the results, returning True only if all elements are True.
See Also:
[`across()`](datar.dplyr.across.across)
"""
raise _NotImplementedByCurrentBackendError("if_all", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("symdiff"))
def symdiff(x: T, y: T) -> T:
"""Get the symmetric difference of two dataframes
It computes the symmetric difference, i.e. all rows in x that aren't in y
and all rows in y that aren't in x.
The original API:
https://dplyr.tidyverse.org/reference/setops.html
Args:
x: A dataframe
y: A dataframe
Returns:
The symmetric difference of x and y
"""
raise _NotImplementedByCurrentBackendError("symdiff", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("arrange"))
def arrange(_data, *args, _by_group=False, **kwargs) -> Any:
"""Order the rows of a data frame by the values of selected columns.
The original API:
https://dplyr.tidyverse.org/reference/arrange.html
Args:
_data: A data frame
*args: Variables, or functions of variables.
Use desc() to sort a variable in descending order.
_by_group: If TRUE, will sort first by grouping variable.
Applies to grouped data frames only.
**kwargs: Name-value pairs, handled the same way as in `mutate()`
Returns:
An object of the same type as _data.
The output has the following properties:
All rows appear in the output, but (usually) in a different place.
Columns are not modified.
Groups are not modified.
Data frame attributes are preserved.
"""
raise _NotImplementedByCurrentBackendError("arrange", _data)
@_register_func(pipeable=True, dispatchable=True)
def bind_rows(*data, _id=None, _copy: bool = True, **kwargs) -> Any:
"""Bind rows of given dataframes
Original APIs https://dplyr.tidyverse.org/reference/bind.html
Args:
*data: Dataframes to combine
_id: The name of the id columns
_copy: If `False`, do not copy data unnecessarily.
Original API does not support this. This argument will be
passed to `pandas.concat()` as its `copy` argument.
**kwargs: A mapping of dataframe, keys will be used as _id col.
Returns:
The combined dataframe
"""
raise _NotImplementedByCurrentBackendError("bind_rows")
@_register_func(pipeable=True, dispatchable=True)
def bind_cols(*data, _name_repair="unique", _copy: bool = True) -> Any:
"""Bind columns of given dataframes
Note that unlike `dplyr`, mismatched dimensions are allowed and
missing rows will be filled with `NA`s
Args:
*data: Dataframes to bind
_name_repair: treatment of problematic column names:
- "minimal": No name repair or checks, beyond basic existence,
- "unique": Make sure names are unique and not empty,
- "check_unique": no name repair, but check they are unique,
- "universal": Make the names unique and syntactic
- a function: apply custom name repair
_copy: If `False`, do not copy data unnecessarily.
Original API does not support this. This argument will be
passed to `pandas.concat()` as its `copy` argument.
Returns:
The combined dataframe
"""
raise _NotImplementedByCurrentBackendError("bind_cols")
# context
@_register_func(plain=True)
def cur_column(_data, _name) -> Any:
"""Get the current column
Args:
_data: The dataframe
_name: The column name
Returns:
The current column
"""
raise _NotImplementedByCurrentBackendError("cur_column")
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("cur_data"))
def cur_data(_data) -> Any:
"""Get the current dataframe
Args:
_data: The dataframe
Returns:
The current dataframe
"""
raise _NotImplementedByCurrentBackendError("cur_data", _data)
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("n"))
def n(_data) -> Any:
"""Get the current group size
Args:
_data: The dataframe
Returns:
The number of rows
"""
raise _NotImplementedByCurrentBackendError("n", _data)
@_register_verb(
dependent=True, ast_fallback=_get_verb_ast_fallback("cur_data_all")
)
def cur_data_all(_data) -> Any:
"""Get the current data for the current group including
the grouping variables
Args:
_data: The dataframe
Returns:
The current dataframe
"""
raise _NotImplementedByCurrentBackendError("cur_data_all", _data)
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("cur_group"))
def cur_group(_data) -> Any:
"""Get the current group
Args:
_data: The dataframe
Returns:
The current group
"""
raise _NotImplementedByCurrentBackendError("cur_group", _data)
@_register_verb(
dependent=True, ast_fallback=_get_verb_ast_fallback("cur_group_id")
)
def cur_group_id(_data) -> Any:
"""Get the current group id
Args:
_data: The dataframe
Returns:
The current group id
"""
raise _NotImplementedByCurrentBackendError("cur_group_id", _data)
@_register_verb(
dependent=True, ast_fallback=_get_verb_ast_fallback("cur_group_rows")
)
def cur_group_rows(_data) -> Any:
"""Get the current group row indices
Args:
_data: The dataframe
Returns:
The current group rows
"""
raise _NotImplementedByCurrentBackendError("cur_group_rows", _data)
# count_tally
@_register_verb(ast_fallback=_get_verb_ast_fallback("count"))
def count(
_data,
*args,
wt=None,
sort=False,
name=None,
_drop=None,
**kwargs,
) -> Any:
"""Count the number of rows in each group
Original API:
https://dplyr.tidyverse.org/reference/count.html
Args:
_data: A data frame
*args: Variables, or functions of variables.
Use desc() to sort a variable in descending order.
wt: A variable or function of variables to weight by.
sort: If TRUE, the result will be sorted by the count.
name: The name of the count column.
_drop: Handling of factor levels that don't appear in the data.
Passed on to `group_by()`.
**kwargs: Name-value pairs, handled the same way as in `mutate()`
Returns:
A data frame with one row for each group, containing the
grouping variables and a column of group sizes.
"""
raise _NotImplementedByCurrentBackendError("count", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("tally"))
def tally(_data, wt=None, sort=False, name=None) -> Any:
"""Count the number of rows in each group
Original API:
https://dplyr.tidyverse.org/reference/count.html
Args:
_data: A data frame
wt: A variable or function of variables to weight by.
sort: If TRUE, the result will be sorted by the count.
name: The name of the count column.
Returns:
A data frame with one row for each group, containing the
grouping variables and a column of group sizes.
"""
raise _NotImplementedByCurrentBackendError("tally", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("add_count"))
def add_count(_data, *args, wt=None, sort=False, name="n", **kwargs) -> Any:
"""Add a count column to a data frame
Original API:
https://dplyr.tidyverse.org/reference/count.html
Args:
_data: A data frame
*args: Variables, or functions of variables.
Use desc() to sort a variable in descending order.
wt: A variable or function of variables to weight by.
sort: If TRUE, the result will be sorted by the count.
name: The name of the count column.
**kwargs: Name-value pairs, handled the same way as in `mutate()`
Returns:
A data frame with the same rows as `_data`, with an
additional column of group sizes.
"""
raise _NotImplementedByCurrentBackendError("add_count", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("add_tally"))
def add_tally(_data, wt=None, sort=False, name="n") -> Any:
"""Add a count column to a data frame
Original API:
https://dplyr.tidyverse.org/reference/count.html
Args:
_data: A data frame
wt: A variable or function of variables to weight by.
sort: If TRUE, the result will be sorted by the count.
name: The name of the count column.
Returns:
A data frame with the same rows as `_data`, with an
additional column of group sizes.
"""
raise _NotImplementedByCurrentBackendError("add_tally", _data)
# desc
@_register_func(pipeable=True, dispatchable=True)
def desc(x) -> Any:
"""Transform a vector into a format that will be sorted in descending order
This is useful within arrange().
The original API:
https://dplyr.tidyverse.org/reference/desc.html
Args:
x: vector to transform
Returns:
The descending order of x
"""
raise _NotImplementedByCurrentBackendError("desc", x)
# filter
@_register_verb(ast_fallback=_get_verb_ast_fallback("filter_"))
def filter_(_data, *conditions, _preserve: bool = False) -> Any:
"""Filter a data frame based on conditions
The original API:
https://dplyr.tidyverse.org/reference/filter.html
Args:
_data: A data frame
*conditions: Conditions to filter by.
_preserve: If `True`, preserve the grouping structure
even if some groups become empty.
Returns:
The subset dataframe
"""
raise _NotImplementedByCurrentBackendError("filter", _data)
# distinct
@_register_verb(ast_fallback=_get_verb_ast_fallback("distinct"))
def distinct(
_data,
*args,
keep_all: bool = False,
_preserve: bool = False,
) -> Any:
"""Keep only distinct/unique rows of a data frame
The original API:
https://dplyr.tidyverse.org/reference/distinct.html
Args:
_data: A data frame
*args: Variables to use when determining uniqueness.
If empty, all variables are used.
keep_all: If `True`, keep all columns in `_data`,
not just the ones used to determine uniqueness.
_preserve: If `True`, preserve the grouping structure
even if some groups become empty.
Returns:
A data frame with duplicate rows removed
"""
raise _NotImplementedByCurrentBackendError("distinct", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("n_distinct"))
def n_distinct(_data, na_rm: bool = True) -> Any:
"""Count the number of distinct values
The original API:
https://dplyr.tidyverse.org/reference/distinct.html
Args:
_data: A data frame
na_rm: If `True`, remove missing values before counting.
Returns:
The number of distinct values
"""
raise _NotImplementedByCurrentBackendError("n_distinct", _data)
# glimpse
@_register_verb(ast_fallback=_get_verb_ast_fallback("glimpse"))
def glimpse(_data, width: int | None = None, formatter=None) -> Any:
"""Display a summary of a data frame
The original API:
https://dplyr.tidyverse.org/reference/glimpse.html
Args:
_data: A data frame
width: Width of output, defaults to the width of the console.
formatter: A single-dispatch function to format a single element.
"""
raise _NotImplementedByCurrentBackendError("glimpse", _data)
# slice
@_register_verb(ast_fallback=_get_verb_ast_fallback("slice_"))
def slice_(_data, *args, _preserve: bool = False) -> Any:
"""Extract rows by their position
The original API:
https://dplyr.tidyverse.org/reference/slice.html
Args:
_data: A data frame
*args: Positions to extract.
_preserve: If `True`, preserve the grouping structure
even if some groups become empty.
Returns:
The subset dataframe
"""
raise _NotImplementedByCurrentBackendError("slice", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("slice_head"))
def slice_head(_data, n: int | None = None, prop: float | None = None) -> Any:
"""Extract the first rows
The original API:
https://dplyr.tidyverse.org/reference/slice.html
Args:
_data: A data frame
n: Number of rows to extract.
prop: Proportion of rows to extract.
Returns:
The subset dataframe
"""
raise _NotImplementedByCurrentBackendError("slice_head", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("slice_tail"))
def slice_tail(_data, n: int | None = None, prop: float | None = None) -> Any:
"""Extract the last rows
The original API:
https://dplyr.tidyverse.org/reference/slice.html
Args:
_data: A data frame
n: Number of rows to extract.
prop: Proportion of rows to extract.
Returns:
The subset dataframe
"""
raise _NotImplementedByCurrentBackendError("slice_tail", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("slice_sample"))
def slice_sample(
_data,
n: int = 1,
prop: float | None = None,
weight_by=None,
replace: bool = False,
) -> Any:
"""Extract rows by sampling
The original API:
https://dplyr.tidyverse.org/reference/slice.html
Args:
_data: A data frame
n: Number of rows to extract.
prop: Proportion of rows to extract.
weight_by: A variable or function of variables to weight by.
replace: If `True`, sample with replacement.
Returns:
The subset dataframe
"""
raise _NotImplementedByCurrentBackendError("slice_sample", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("slice_min"))
def slice_min(
_data,
order_by,
n: int = 1,
prop: float | None = None,
with_ties: bool | str | None = None,
) -> Any:
"""Extract rows with the minimum value
The original API:
https://dplyr.tidyverse.org/reference/slice.html
Args:
_data: A data frame
order_by: A variable or function of variables to order by.
n: Number of rows to extract.
prop: Proportion of rows to extract.
with_ties: If `True`, extract all rows with the minimum value.
If "first", extract the first row with the minimum value.
If "last", extract the last row with the minimum value.
Returns:
The subset dataframe
"""
raise _NotImplementedByCurrentBackendError("slice_min", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("slice_max"))
def slice_max(
_data,
order_by,
n: int = 1,
prop: float | None = None,
with_ties: bool | str | None = None,
) -> Any:
"""Extract rows with the maximum value
The original API:
https://dplyr.tidyverse.org/reference/slice.html
Args:
_data: A data frame
order_by: A variable or function of variables to order by.
n: Number of rows to extract.
prop: Proportion of rows to extract.
with_ties: If `True`, extract all rows with the maximum value.
If "first", extract the first row with the maximum value.
If "last", extract the last row with the maximum value.
Returns:
The subset dataframe
"""
raise _NotImplementedByCurrentBackendError("slice_max", _data)
# misc funs
@_register_func(pipeable=True, dispatchable=True)
def between(x, left, right, inclusive: str = "both") -> Any:
"""Check if a value is between two other values
The original API:
https://dplyr.tidyverse.org/reference/between.html
Args:
x: A value
left: The left bound
right: The right bound
inclusive: Either `both`, `neither`, `left` or `right`.
Include boundaries. Whether to set each bound as closed or open.
Returns:
A bool value if `x` is scalar, otherwise an array of boolean values
Note that it will always be False when NA appears in `x`, `left` or `right`.
"""
raise _NotImplementedByCurrentBackendError("between", x)
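A scalar sketch of the `between` semantics, including the NA behavior noted above (NaN is used here to stand in for NA, which is an assumption for illustration):

```python
import math

def between_sketch(x, left, right, inclusive="both"):
    """Scalar sketch of `between`; any NaN input yields False."""
    if any(isinstance(v, float) and math.isnan(v) for v in (x, left, right)):
        return False
    lo = x >= left if inclusive in ("both", "left") else x > left
    hi = x <= right if inclusive in ("both", "right") else x < right
    return lo and hi

print(between_sketch(5, 1, 5))                    # True
print(between_sketch(5, 1, 5, inclusive="left"))  # False
print(between_sketch(float("nan"), 1, 5))         # False
```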
@_register_func(pipeable=True, dispatchable=True)
def cummean(x, na_rm: bool = False) -> Any:
"""Cumulative mean
The original API:
https://dplyr.tidyverse.org/reference/cumall.html
Args:
x: A numeric vector
na_rm: If `True`, remove missing values before computing.
Returns:
An array of cumulative means
"""
raise _NotImplementedByCurrentBackendError("cummean", x)
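A minimal sketch of the cumulative mean over a plain list (the `na_rm` handling is omitted here):

```python
def cummean_sketch(x):
    """Running mean: mean of x[0..i] at each position i."""
    out, total = [], 0.0
    for i, v in enumerate(x, start=1):
        total += v
        out.append(total / i)
    return out

print(cummean_sketch([1, 2, 3, 4]))  # [1.0, 1.5, 2.0, 2.5]
```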
@_register_func(pipeable=True, dispatchable=True)
def cumall(x) -> Any:
"""Cumulative all: elements are True up to the first False, and False after it
The original API:
https://dplyr.tidyverse.org/reference/cumall.html
Args:
x: A logical vector
Returns:
An array of cumulative conjunctions
"""
raise _NotImplementedByCurrentBackendError("cumall", x)
@_register_func(pipeable=True, dispatchable=True)
def cumany(x) -> Any:
"""Cumulative any: elements are False up to the first True, and True after it
The original API:
https://dplyr.tidyverse.org/reference/cumany.html
Args:
x: A logical vector
Returns:
An array of cumulative disjunctions
"""
raise _NotImplementedByCurrentBackendError("cumany", x)
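The cumulative conjunction/disjunction pair can be sketched with `itertools.accumulate`:

```python
from itertools import accumulate

def cumall_sketch(x):
    """True until the first False, then False for all later positions."""
    return list(accumulate(x, lambda a, b: a and b))

def cumany_sketch(x):
    """False until the first True, then True for all later positions."""
    return list(accumulate(x, lambda a, b: a or b))

print(cumall_sketch([True, True, False, True]))  # [True, True, False, False]
print(cumany_sketch([False, True, False]))       # [False, True, True]
```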
@_register_func(pipeable=True, dispatchable=True)
def coalesce(x, *replace) -> Any:
"""Replace missing values with the first non-missing value
The original API:
https://dplyr.tidyverse.org/reference/coalesce.html
Args:
x: A vector
*replace: Values to replace missing values with.
Returns:
An array of values
"""
raise _NotImplementedByCurrentBackendError("coalesce", x)
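A sketch of the element-wise coalescing semantics over plain lists, using `None` to stand in for a missing value (an illustrative assumption):

```python
def coalesce_sketch(x, *replace):
    """Element-wise: take the first non-None value across x and replacements."""
    out = []
    for values in zip(x, *replace):
        out.append(next((v for v in values if v is not None), None))
    return out

print(coalesce_sketch([1, None, None], [9, 9, None], [0, 0, 0]))  # [1, 9, 0]
```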
@_register_func(pipeable=True, dispatchable=True)
def consecutive_id(x, *args) -> _Sequence[int]:
"""Generate consecutive ids
The original API:
https://dplyr.tidyverse.org/reference/consecutive_id.html
Args:
x: A vector
*args: Other vectors
Returns:
A sequence of consecutive ids
"""
raise _NotImplementedByCurrentBackendError("consecutive_id", x)
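A single-vector sketch of the `consecutive_id` semantics (dplyr returns 1-based ids that increment whenever the value changes; the multi-vector case is omitted here):

```python
def consecutive_id_sketch(x):
    """1-based id that increments each time the value changes."""
    out, current = [], 0
    prev = object()  # sentinel that equals nothing in x
    for v in x:
        if v != prev:
            current += 1
            prev = v
        out.append(current)
    return out

print(consecutive_id_sketch(["a", "a", "b", "a", "a"]))  # [1, 1, 2, 3, 3]
```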
@_register_func(pipeable=True, dispatchable=True)
def na_if(x, value) -> Any:
"""Replace values with missing values
The original API:
https://dplyr.tidyverse.org/reference/na_if.html
Args:
x: A vector
value: Value(s) to compare `x` against; matching elements
of `x` are replaced with missing values.
Returns:
An array of values
"""
raise _NotImplementedByCurrentBackendError("na_if", x)
@_register_func(pipeable=True, dispatchable=True)
def near(x, y, tol: float = 1e-8) -> Any:
"""Check if values are approximately equal
The original API:
https://dplyr.tidyverse.org/reference/near.html
Args:
x: A numeric vector
y: A numeric vector
tol: Tolerance
Returns:
An array of boolean values
"""
raise _NotImplementedByCurrentBackendError("near", x)
@_register_func(pipeable=True, dispatchable=True)
def nth(x, n, order_by=None, default=None) -> Any:
"""Extract the nth element of a vector
The original API:
https://dplyr.tidyverse.org/reference/nth.html
Args:
x: A vector
n: The index of the element to extract.
order_by: A variable or function of variables to order by.
default: A default value to return if `n` is out of bounds.
Returns:
A value
"""
raise _NotImplementedByCurrentBackendError("nth", x)
@_register_func(pipeable=True, dispatchable=True)
def first(x, order_by=None, default=None) -> Any:
"""Extract the first element of a vector
The original API:
https://dplyr.tidyverse.org/reference/nth.html
Args:
x: A vector
order_by: A variable or function of variables to order by.
default: A default value to return if `x` is empty.
Returns:
A value
"""
raise _NotImplementedByCurrentBackendError("first", x)
@_register_func(pipeable=True, dispatchable=True)
def last(x, order_by=None, default=None) -> Any:
"""Extract the last element of a vector
The original API:
https://dplyr.tidyverse.org/reference/nth.html
Args:
x: A vector
order_by: A variable or function of variables to order by.
default: A default value to return if `x` is empty.
Returns:
A value
"""
raise _NotImplementedByCurrentBackendError("last", x)
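A reference sketch of the `nth()`/`first()`/`last()` semantics. Using a 0-based `n` is a Python-side assumption here; the key behaviors are that `order_by` reorders before extraction and out-of-bounds indices fall back to `default`.

```python
def nth_sketch(x, n, order_by=None, default=None):
    xs = list(x)
    if order_by is not None:
        xs = [v for _, v in sorted(zip(order_by, xs))]
    if -len(xs) <= n < len(xs):
        return xs[n]
    return default

def first_sketch(x, **kwargs):
    return nth_sketch(x, 0, **kwargs)

def last_sketch(x, **kwargs):
    return nth_sketch(x, -1, **kwargs)

# nth_sketch([10, 20, 30], 9, default=-1)        -> -1
# first_sketch([10, 20, 30], order_by=[3, 2, 1]) -> 30
```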
# group_by
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_by"))
def group_by(_data, *args, _add: bool = False, _drop: bool = None) -> Any:
"""Create a grouped frame
The original API:
https://dplyr.tidyverse.org/reference/group_by.html
Args:
_data: A data frame
*args: A variable or function of variables to group by.
_add: If `True`, add to the existing grouping variables
instead of overriding them.
_drop: If `True`, drop groups formed by factor levels that
don't appear in the data.
Returns:
A grouped frame
"""
raise _NotImplementedByCurrentBackendError("group_by", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("ungroup"))
def ungroup(_data, *cols: str | int) -> Any:
"""Remove grouping variables
The original API:
https://dplyr.tidyverse.org/reference/ungroup.html
Args:
_data: A grouped frame
*cols: Grouping variables to remove. If not given, all grouping
is removed.
Returns:
A data frame
"""
raise _NotImplementedByCurrentBackendError("ungroup", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("rowwise"))
def rowwise(_data, *cols: str | int) -> Any:
"""Create a rowwise frame
The original API:
https://dplyr.tidyverse.org/reference/rowwise.html
Args:
_data: A data frame
*cols: Columns to make rowwise.
Returns:
A rowwise frame
"""
raise _NotImplementedByCurrentBackendError("rowwise", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_by_drop_default"))
def group_by_drop_default(_data) -> Any:
"""Get the default value of `_drop` of a frame
The original API:
https://dplyr.tidyverse.org/reference/group_by.html
Args:
_data: A data frame
Returns:
A bool value
"""
raise _NotImplementedByCurrentBackendError("group_by_drop_default", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_vars"))
def group_vars(_data) -> Any:
"""Get the grouping variables of a frame
The original API:
https://dplyr.tidyverse.org/reference/group_vars.html
Args:
_data: A grouped frame
Returns:
A list of grouping variables
"""
raise _NotImplementedByCurrentBackendError("group_vars", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_indices"))
def group_indices(_data) -> Any:
"""Get the group indices of a frame
The original API:
https://dplyr.tidyverse.org/reference/group_indices.html
Args:
_data: A grouped frame
Returns:
A list of group indices
"""
raise _NotImplementedByCurrentBackendError("group_indices", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_keys"))
def group_keys(_data) -> Any:
"""Get the group keys of a frame
The original API:
https://dplyr.tidyverse.org/reference/group_keys.html
Args:
_data: A grouped frame
Returns:
A list of group keys
"""
raise _NotImplementedByCurrentBackendError("group_keys", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_size"))
def group_size(_data) -> Any:
"""Get the group sizes of a frame
The original API:
https://dplyr.tidyverse.org/reference/group_size.html
Args:
_data: A grouped frame
Returns:
A list of group sizes
"""
raise _NotImplementedByCurrentBackendError("group_size", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_rows"))
def group_rows(_data) -> Any:
"""Get the group rows of a frame
The original API:
https://dplyr.tidyverse.org/reference/group_rows.html
Args:
_data: A grouped frame
Returns:
A list of group rows
"""
raise _NotImplementedByCurrentBackendError("group_rows", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_cols"))
def group_cols(_data) -> Any:
"""Get the group columns of a frame
The original API:
https://dplyr.tidyverse.org/reference/group_cols.html
Args:
_data: A grouped frame
Returns:
A list of group columns
"""
raise _NotImplementedByCurrentBackendError("group_cols", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_data"))
def group_data(_data) -> Any:
"""Get the group data of a frame
The original API:
https://dplyr.tidyverse.org/reference/group_data.html
Args:
_data: A grouped frame
Returns:
A list of group data
"""
raise _NotImplementedByCurrentBackendError("group_data", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("n_groups"))
def n_groups(_data) -> int:
"""Get the number of groups of a frame
The original API:
https://dplyr.tidyverse.org/reference/n_groups.html
Args:
_data: A grouped frame
Returns:
An int value
"""
raise _NotImplementedByCurrentBackendError("n_groups", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_map"))
def group_map(_data, _f, *args, _keep: bool = False, **kwargs) -> Any:
"""Apply a function to each group
The original API:
https://dplyr.tidyverse.org/reference/group_map.html
Args:
_data: A grouped frame
_f: A function to apply to each group.
*args: Additional arguments to pass to `_f`.
_keep: If `True`, keep the grouping variables in the output.
**kwargs: Additional keyword arguments to pass to `_f`.
Returns:
A list of results
"""
raise _NotImplementedByCurrentBackendError("group_map", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_modify"))
def group_modify(_data, _f, *args, _keep: bool = False, **kwargs) -> Any:
"""Apply a function to each group
The original API:
https://dplyr.tidyverse.org/reference/group_modify.html
Args:
_data: A grouped frame
_f: A function to apply to each group.
*args: Additional arguments to pass to `_f`.
_keep: If `True`, keep the grouping variables in the output.
**kwargs: Additional keyword arguments to pass to `_f`.
Returns:
A data frame
"""
raise _NotImplementedByCurrentBackendError("group_modify", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_split"))
def group_split(_data, *args, _keep: bool = False, **kwargs) -> Any:
"""Split a grouped frame into a list of data frames
The original API:
https://dplyr.tidyverse.org/reference/group_split.html
Args:
_data: A grouped frame
*args: Grouping specification, passed on to `group_by()`.
_keep: If `True`, keep the grouping variables in the output.
**kwargs: Additional keyword arguments for the grouping
specification.
Returns:
A list of data frames
"""
raise _NotImplementedByCurrentBackendError("group_split", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_trim"))
def group_trim(_data, _drop=None) -> Any:
"""Remove empty groups
The original API:
https://dplyr.tidyverse.org/reference/group_trim.html
Args:
_data: A grouped frame
_drop: See `group_by`.
Returns:
A grouped frame
"""
raise _NotImplementedByCurrentBackendError("group_trim", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("group_walk"))
def group_walk(_data, _f, *args, _keep: bool = False, **kwargs) -> Any:
"""Apply a function to each group
The original API:
https://dplyr.tidyverse.org/reference/group_walk.html
Args:
_data: A grouped frame
_f: A function to apply to each group.
*args: Additional arguments to pass to `_f`.
_keep: If `True`, keep the grouping variables in the output.
**kwargs: Additional keyword arguments to pass to `_f`.
Returns:
The input `_data`, unchanged; `_f` is applied to each group
for its side effects.
"""
raise _NotImplementedByCurrentBackendError("group_walk", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("with_groups"))
def with_groups(_data, _groups, _func, *args, **kwargs) -> Any:
"""Modify the grouping variables for a single operation.
Args:
_data: A data frame
_groups: columns passed by group_by
Use None to temporarily ungroup.
_func: Function to apply to regrouped data.
Returns:
The new data frame with operations applied.
"""
raise _NotImplementedByCurrentBackendError("with_groups", _data)
@_register_func(pipeable=True, dispatchable=True)
def if_else(condition, true, false, missing=None) -> Any:
"""Where condition is TRUE, the matching value from true, where it's FALSE,
the matching value from false, otherwise missing.
Note that NA values in `condition` are treated as `False`
if `missing` is not specified.
Args:
condition: the conditions
true: and
false: Values to use for TRUE and FALSE values of condition.
They must be either the same length as condition, or length 1.
missing: If not None, will be used to replace missing values
Returns:
A series with values replaced.
"""
raise _NotImplementedByCurrentBackendError("if_else")
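A pure-Python sketch of `if_else()`'s per-element semantics, using `None` as the NA stand-in: NA conditions yield `missing` when given, otherwise they fall through to `false`.

```python
def if_else_sketch(condition, true, false, missing=None):
    out = []
    for cond in condition:
        if cond is None:
            # NA condition: use `missing` if supplied, else treat as False
            out.append(missing if missing is not None else false)
        else:
            out.append(true if cond else false)
    return out

# if_else_sketch([True, False, None], "yes", "no", missing="NA")
# -> ['yes', 'no', 'NA']
```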
@_register_func(pipeable=True, dispatchable=True)
def case_match(_x: T, *args, _default=None, _dtypes=None) -> T:
"""This function allows you to vectorise multiple `switch()` statements.
Each case is evaluated sequentially and the first match for each element
determines the corresponding value in the output vector.
If no cases match, the `_default` is used.
The original API:
https://dplyr.tidyverse.org/reference/case_match.html
Args:
_x: A vector
*args: A sequence of match-value pairs: the value(s) of `_x`
to match, followed by the replacement.
_default: The default value when no cases match
_dtypes: The data types of the output
Returns:
A vector of the same length as `_x`, with values replaced
"""
raise _NotImplementedByCurrentBackendError("case_match", _x)
@_register_func(pipeable=True, dispatchable=True)
def case_when(cond, value, *more_cases) -> Any:
"""Vectorise multiple `if_else()` statements.
Args:
cond: A boolean vector
value: A vector with values to replace
*more_cases: A list of tuples (cond, value)
Returns:
A vector with values replaced.
"""
raise _NotImplementedByCurrentBackendError("case_when")
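A sketch of `case_when()` built on `numpy.select`: conditions are evaluated in order and the first match wins. The `default` parameter is added here for illustration; the stub above leaves unmatched handling to the backend.

```python
import numpy as np

def case_when_sketch(cond, value, *more_cases, default=-1):
    cases = [(cond, value)] + list(more_cases)
    conds = [np.asarray(c, dtype=bool) for c, _ in cases]
    values = [v for _, v in cases]
    # np.select picks the value of the first condition that is True
    return np.select(conds, values, default=default)

x = np.array([1, 2, 3, 4])
# First match wins: 1 -> 0, 2 and 3 -> 1, 4 -> 2
res = case_when_sketch(x < 2, 0, (x < 4, 1), (x >= 4, 2))
```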
# join
@_register_verb(ast_fallback=_get_verb_ast_fallback("inner_join"))
def inner_join(
x,
y,
by=None,
copy: bool = False,
suffix: _Sequence[str] = ("_x", "_y"),
keep: bool = False,
na_matches: str = "na",
multiple: str = "all",
unmatched: str = "drop",
relationship: str = None,
) -> Any:
"""Inner join two data frames by matching rows.
The original API:
https://dplyr.tidyverse.org/reference/join.html
Args:
x: A data frame
y: A data frame
by: A list of column names to join by.
If None, use the intersection of the columns of x and y.
copy: If True, always copy the data.
suffix: A tuple of suffixes to apply to overlapping columns.
keep: If True, keep the join keys from both x and y in the
output; by default only the keys from x are kept.
na_matches: How should NA values be matched?
"na": NA values are equal.
"never": NA values are never matched.
multiple: How should multiple matches be handled?
"all": All matches are returned.
"first": The first match is returned.
"last": The last match is returned.
"any": Any of the matched rows in y
unmatched: How should unmatched keys that would result in dropped rows
be handled?
"drop": Drop unmatched keys.
"error": Raise an error.
relationship: The expected relationship between the keys of x
and y; a violation raises an error.
None: No expected relationship (no checks).
"one_to_one": Each row in x matches at most one row in y,
and each row in y matches at most one row in x.
"one_to_many": Each row in y matches at most one row in x.
"many_to_one": Each row in x matches at most one row in y.
"many_to_many": No restrictions on matches.
Returns:
A data frame
"""
raise _NotImplementedByCurrentBackendError("inner_join", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("left_join"))
def left_join(
x,
y,
by=None,
copy: bool = False,
suffix: _Sequence[str] = ("_x", "_y"),
keep: bool = False,
na_matches: str = "na",
multiple: str = "all",
unmatched: str = "drop",
relationship: str = None,
) -> Any:
"""Left join two data frames by matching rows.
The original API:
https://dplyr.tidyverse.org/reference/join.html
Args:
x: A data frame
y: A data frame
by: A list of column names to join by.
If None, use the intersection of the columns of x and y.
copy: If True, always copy the data.
suffix: A tuple of suffixes to apply to overlapping columns.
keep: If True, keep the join keys from both x and y in the
output; by default only the keys from x are kept.
na_matches: How should NA values be matched?
"na": NA values are equal.
"never": NA values are never matched.
multiple: How should multiple matches be handled?
"all": All matches are returned.
"first": The first match is returned.
"last": The last match is returned.
"any": Any of the matched rows in y
unmatched: How should unmatched keys that would result in dropped rows
be handled?
"drop": Drop unmatched keys.
"error": Raise an error.
relationship: The expected relationship between the keys of x
and y; a violation raises an error.
None: No expected relationship (no checks).
"one_to_one": Each row in x matches at most one row in y,
and each row in y matches at most one row in x.
"one_to_many": Each row in y matches at most one row in x.
"many_to_one": Each row in x matches at most one row in y.
"many_to_many": No restrictions on matches.
Returns:
A data frame
"""
raise _NotImplementedByCurrentBackendError("left_join", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("right_join"))
def right_join(
x,
y,
by=None,
copy: bool = False,
suffix: _Sequence[str] = ("_x", "_y"),
keep: bool = False,
na_matches: str = "na",
multiple: str = "all",
unmatched: str = "drop",
relationship: str = None,
) -> Any:
"""Right join two data frames by matching rows.
The original API:
https://dplyr.tidyverse.org/reference/join.html
Args:
x: A data frame
y: A data frame
by: A list of column names to join by.
If None, use the intersection of the columns of x and y.
copy: If True, always copy the data.
suffix: A tuple of suffixes to apply to overlapping columns.
keep: If True, keep the join keys from both x and y in the
output; by default only the keys from x are kept.
na_matches: How should NA values be matched?
"na": NA values are equal.
"never": NA values are never matched.
multiple: How should multiple matches be handled?
"all": All matches are returned.
"first": The first match is returned.
"last": The last match is returned.
"any": Any of the matched rows in y
unmatched: How should unmatched keys that would result in dropped rows
be handled?
"drop": Drop unmatched keys.
"error": Raise an error.
relationship: The expected relationship between the keys of x
and y; a violation raises an error.
None: No expected relationship (no checks).
"one_to_one": Each row in x matches at most one row in y,
and each row in y matches at most one row in x.
"one_to_many": Each row in y matches at most one row in x.
"many_to_one": Each row in x matches at most one row in y.
"many_to_many": No restrictions on matches.
Returns:
A data frame
"""
raise _NotImplementedByCurrentBackendError("right_join", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("full_join"))
def full_join(
x,
y,
by=None,
copy: bool = False,
suffix: _Sequence[str] = ("_x", "_y"),
keep: bool = False,
na_matches: str = "na",
multiple: str = "all",
unmatched: str = "drop",
relationship: str = None,
) -> Any:
"""Full join two data frames by matching rows.
The original API:
https://dplyr.tidyverse.org/reference/join.html
Args:
x: A data frame
y: A data frame
by: A list of column names to join by.
If None, use the intersection of the columns of x and y.
copy: If True, always copy the data.
suffix: A tuple of suffixes to apply to overlapping columns.
keep: If True, keep the join keys from both x and y in the
output; by default only the keys from x are kept.
na_matches: How should NA values be matched?
"na": NA values are equal.
"never": NA values are never matched.
multiple: How should multiple matches be handled?
"all": All matches are returned.
"first": The first match is returned.
"last": The last match is returned.
"any": Any of the matched rows in y
unmatched: How should unmatched keys that would result in dropped rows
be handled?
"drop": Drop unmatched keys.
"error": Raise an error.
relationship: The expected relationship between the keys of x
and y; a violation raises an error.
None: No expected relationship (no checks).
"one_to_one": Each row in x matches at most one row in y,
and each row in y matches at most one row in x.
"one_to_many": Each row in y matches at most one row in x.
"many_to_one": Each row in x matches at most one row in y.
"many_to_many": No restrictions on matches.
Returns:
A data frame
"""
raise _NotImplementedByCurrentBackendError("full_join", x)
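One plausible way a pandas backend could honour the `relationship` argument above: map it onto the `validate` argument of `DataFrame.merge`, which raises `MergeError` when the check fails. The mapping below is illustrative, not the actual backend code.

```python
import pandas as pd

# Hypothetical mapping from dplyr-style `relationship` values to
# pandas' merge `validate` strings
RELATIONSHIP_TO_VALIDATE = {
    "one_to_one": "1:1",
    "one_to_many": "1:m",
    "many_to_one": "m:1",
    "many_to_many": "m:m",
}

x = pd.DataFrame({"id": [1, 2], "a": ["p", "q"]})
y = pd.DataFrame({"id": [1, 2], "b": [10, 20]})
out = x.merge(y, on="id", how="inner",
              validate=RELATIONSHIP_TO_VALIDATE["one_to_one"])
# out has columns id, a, b and one row per matched key
```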
@_register_verb(ast_fallback=_get_verb_ast_fallback("semi_join"))
def semi_join(
x,
y,
by=None,
copy: bool = False,
na_matches: str = "na",
) -> Any:
"""Semi join two data frames by matching rows.
The original API:
https://dplyr.tidyverse.org/reference/join.html
Args:
x: A data frame
y: A data frame
by: A list of column names to join by.
If None, use the intersection of the columns of x and y.
copy: If True, always copy the data.
na_matches: How should NA values be matched?
"na": NA values are equal.
"never": NA values are never matched.
Returns:
A data frame
"""
raise _NotImplementedByCurrentBackendError("semi_join", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("anti_join"))
def anti_join(
x,
y,
by=None,
copy: bool = False,
na_matches: str = "na",
) -> Any:
"""Anti join two data frames by matching rows.
The original API:
https://dplyr.tidyverse.org/reference/join.html
Args:
x: A data frame
y: A data frame
by: A list of column names to join by.
If None, use the intersection of the columns of x and y.
copy: If True, always copy the data.
na_matches: How should NA values be matched?
"na": NA values are equal.
"never": NA values are never matched.
Returns:
A data frame
"""
raise _NotImplementedByCurrentBackendError("anti_join", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("nest_join"))
def nest_join(
x,
y,
by=None,
copy: bool = False,
keep: bool = False,
name=None,
na_matches: str = "na",
unmatched: str = "drop",
) -> Any:
"""Nest join two data frames by matching rows.
The original API:
https://dplyr.tidyverse.org/reference/join.html
Args:
x: A data frame
y: A data frame
by: A list of column names to join by.
If None, use the intersection of the columns of x and y.
copy: If True, always copy the data.
keep: If True, keep the join keys from both x and y in the
output; by default only the keys from x are kept.
name: The name of the column to store the nested data frame.
na_matches: How should NA values be matched?
"na": NA values are equal.
"never": NA values are never matched.
unmatched: How should unmatched keys that would result in dropped rows
be handled?
"drop": Drop unmatched keys.
"error": Raise an error.
Returns:
A data frame
"""
raise _NotImplementedByCurrentBackendError("nest_join", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("cross_join"))
def cross_join(
x: T,
y: T,
copy: bool = False,
suffix: _Sequence[str] = ("_x", "_y"),
) -> T:
"""Cross joins match each row in x to every row in y, resulting in a
data frame with nrow(x) * nrow(y) rows.
The original API:
https://dplyr.tidyverse.org/reference/cross_join.html
Args:
x: A data frame
y: A data frame
copy: If True, always copy the data.
suffix: A tuple of suffixes to apply to overlapping columns.
Returns:
An object of the same type as x (including the same groups).
"""
raise _NotImplementedByCurrentBackendError("cross_join", x)
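In pandas terms, a cross join corresponds to `merge(how="cross")` (pandas >= 1.2): every row of x is paired with every row of y.

```python
import pandas as pd

x = pd.DataFrame({"a": [1, 2]})
y = pd.DataFrame({"b": ["u", "v", "w"]})
out = x.merge(y, how="cross")
# out has nrow(x) * nrow(y) = 6 rows and the columns of both frames
```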
# lead/lag
@_register_func(pipeable=True, dispatchable=True)
def lead(x, n=1, default=None, order_by=None) -> Any:
"""Shift a vector by `n` positions.
The original API:
https://dplyr.tidyverse.org/reference/lead.html
Args:
x: A vector
n: The number of positions to shift.
default: The default value to use for positions that don't exist.
order_by: An optional vector to order the values by before
shifting.
Returns:
A vector
"""
raise _NotImplementedByCurrentBackendError("lead", x)
@_register_func(pipeable=True, dispatchable=True)
def lag(x, n=1, default=None, order_by=None) -> Any:
"""Shift a vector by `n` positions.
The original API:
https://dplyr.tidyverse.org/reference/lag.html
Args:
x: A vector
n: The number of positions to shift.
default: The default value to use for positions that don't exist.
order_by: An optional vector to order the values by before
shifting.
Returns:
A vector
"""
raise _NotImplementedByCurrentBackendError("lag", x)
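A pure-Python sketch of the shifting semantics: `lag()` looks back (the first `n` slots get `default`), `lead()` looks ahead (the last `n` slots get `default`).

```python
def lag_sketch(x, n=1, default=None):
    xs = list(x)
    # pad the front, drop the tail
    return [default] * n + xs[:len(xs) - n]

def lead_sketch(x, n=1, default=None):
    xs = list(x)
    # drop the front, pad the tail
    return xs[n:] + [default] * n

# lag_sketch([1, 2, 3], default=0)  -> [0, 1, 2]
# lead_sketch([1, 2, 3], default=0) -> [2, 3, 0]
```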
# mutate
@_register_verb(ast_fallback=_get_verb_ast_fallback("mutate"))
def mutate(
_data, *args, _keep: str = "all", _before=None, _after=None, **kwargs
) -> Any:
"""Add new columns to a data frame.
The original API:
https://dplyr.tidyverse.org/reference/mutate.html
Args:
_data: A data frame
_keep: allows you to control which columns from _data are retained
in the output:
- "all", the default, retains all variables.
- "used" keeps any variables used to make new variables;
it's useful for checking your work as it displays inputs and
outputs side-by-side.
- "unused" keeps only existing variables not used to make new
variables.
- "none", only keeps grouping keys (like transmute()).
_before: A list of column names to put the new columns before.
_after: A list of column names to put the new columns after.
*args: and
**kwargs: Name-value pairs. The name gives the name of the column
in the output. The value can be:
- A vector of length 1, which will be recycled to the correct
length.
- A vector the same length as the current group (or the whole
data frame if ungrouped).
- None to remove the column
Returns:
An object of the same type as _data. The output has the following
properties:
- Rows are not affected.
- Existing columns will be preserved according to the _keep
argument. New columns will be placed according to the
_before and _after arguments. If _keep = "none"
(as in transmute()), the output order is determined only
by ..., not the order of existing columns.
- Columns given value None will be removed
- Groups will be recomputed if a grouping variable is mutated.
- Data frame attributes are preserved.
"""
raise _NotImplementedByCurrentBackendError("mutate", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("transmute"))
def transmute(_data, *args, _before=None, _after=None, **kwargs) -> Any:
"""Add new columns to a data frame and remove existing columns
using mutate with `_keep="none"`.
The original API:
https://dplyr.tidyverse.org/reference/mutate.html
Args:
_data: A data frame
_before: A list of column names to put the new columns before.
_after: A list of column names to put the new columns after.
*args: and
**kwargs: Name-value pairs. The name gives the name of the column
in the output. The value can be:
- A vector of length 1, which will be recycled to the correct
length.
- A vector the same length as the current group (or the whole
data frame if ungrouped).
- None to remove the column
Returns:
An object of the same type as _data. The output has the following
properties:
- Rows are not affected.
- Only the newly created columns are kept; existing columns
are dropped (equivalent to mutate with _keep="none"). New
columns are placed according to the _before and _after
arguments, and the output column order is determined by the
arguments, not the order of existing columns.
- Columns given value None will be removed
- Groups will be recomputed if a grouping variable is mutated.
- Data frame attributes are preserved.
"""
raise _NotImplementedByCurrentBackendError("transmute", _data)
# order_by
@_register_func(plain=True)
def order_by(order, call) -> Any:
"""Order the data by the given order
Note:
This function should be called as an argument
of a verb. If you want to call it regularly, try `with_order()`
Examples:
>>> df = tibble(x=c[1:6])
>>> df >> mutate(y=order_by(c[5:], cumsum(f.x)))
>>> # df.y:
>>> # 15, 14, 12, 9, 5
Args:
order: An iterable to control the data order
call: The call (e.g. `cumsum(f.x)`) to be evaluated with the
given order
Returns:
A Function expression for verb to evaluate.
"""
raise _NotImplementedByCurrentBackendError("order_by")
@_register_func(pipeable=True, dispatchable=True)
def with_order(order, func, x, *args, **kwargs) -> Any:
"""Control argument and result of a window function
Examples:
>>> with_order([5,4,3,2,1], cumsum, [1,2,3,4,5])
>>> # 15, 14, 12, 9, 5
Args:
order: An iterable to order the argument and result
func: The window function
x: The first argument for the function
*args: and
**kwargs: Other arguments for the function
Returns:
The ordered result or an expression if there is expression in arguments
"""
raise _NotImplementedByCurrentBackendError("with_order", order)
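A pure-Python sketch of `with_order()`: apply a window function as if `x` were sorted by `order`, then restore the original row order. It reproduces the docstring example above.

```python
def with_order_sketch(order, func, x):
    # indices that would sort `order`
    idx = sorted(range(len(order)), key=lambda i: order[i])
    result = func([x[i] for i in idx])
    # scatter the result back into the original positions
    out = [None] * len(x)
    for pos, i in enumerate(idx):
        out[i] = result[pos]
    return out

def cumsum_list(values):
    total, out = 0, []
    for v in values:
        total += v
        out.append(total)
    return out

# with_order_sketch([5, 4, 3, 2, 1], cumsum_list, [1, 2, 3, 4, 5])
# -> [15, 14, 12, 9, 5]
```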
# pull
@_register_verb(ast_fallback=_get_verb_ast_fallback("pull"))
def pull(_data, var: str | int = -1, name=None, to=None) -> Any:
"""Pull a series or a dataframe from a dataframe
Args:
_data: The dataframe
var: The column to pull, either the name or the index
name: The name of the pulled value
- If `to` is frame, or the value pulled is data frame, it will be
the column names
- If `to` is series, it will be the series name. If multiple names
are given, only the first name will be used.
- If `to` is series, but value pulled is a data frame, then a
dictionary of series with the series names as keys or given `name`
as keys.
to: Type of data to return.
Only works when pulling `a` for name `a$b`
- series: Return a pandas Series object
Group information will be lost
If pulled value is a dataframe, it will return a dict of series,
with the series names or the `name` provided.
- array: Return a numpy.ndarray object
- frame: Return a DataFrame with that column
- list: Return a python list
- dict: Return a dict with `name` as keys and pulled value as values
Only a single column is allowed to pull
- If not provided: `series` when the pulled data has only one
column; `dict` if `name` is provided and has the same length
as the pulled single column; otherwise `frame`.
Returns:
The data according to `to`
"""
raise _NotImplementedByCurrentBackendError("pull", _data)
def row_number(x=_f_symbolic) -> Any:
"""Get the row number of x
Note that this function doesn't support piping.
Args:
x: The data to get row number
Defaults to `Symbolic()` so the whole data is used by default
when called `row_number()`
Returns:
The row number
"""
return row_number_(x, __ast_fallback="normal")
@_register_func(pipeable=True, dispatchable=True)
def row_number_(x) -> Any:
raise _NotImplementedByCurrentBackendError("row_number", x)
def ntile(x=_f_symbolic, *, n: int = None) -> Any:
"""a rough rank, which breaks the input vector into n buckets.
The size of the buckets may differ by up to one, larger buckets
have lower rank.
Note that this function doesn't support piping.
Args:
x: The data to break into buckets
Defaults to `Symbolic()` so the whole data is used by default
when called `ntile(n=...)`
n: The number of groups to divide the data into
Returns:
The bucket number for each element
"""
return ntile_(x, n=n, __ast_fallback="normal")
@_register_func(pipeable=True, dispatchable=True)
def ntile_(x, *, n: int = None) -> Any:
raise _NotImplementedByCurrentBackendError("ntile", x)
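A sketch of the bucketing rule: rank positions are split into `n` buckets whose sizes differ by at most one, with the larger buckets first.

```python
def ntile_sketch(x, n):
    m = len(x)
    # indices in ascending order of value
    order = sorted(range(m), key=lambda i: x[i])
    out = [0] * m
    for pos, i in enumerate(order):
        # integer division assigns near-equal, front-loaded buckets
        out[i] = pos * n // m + 1
    return out

# ntile_sketch([1, 2, 3, 4, 5], 2) -> [1, 1, 1, 2, 2]
```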
def min_rank(x=_f_symbolic, *, na_last: str = "keep") -> Any:
"""Get the min rank of x
Note that this function doesn't support piping.
Args:
x: The data to rank
Defaults to `Symbolic()` so the whole data is used by default
when called `min_rank()`
na_last: How NA values are handled
- "keep": keep NA values as NA in the result
- "top": NA values are assigned the smallest ranks
- "bottom": NA values are assigned the largest ranks
Returns:
The min ranks (ties get the minimum of the ranks they occupy)
"""
return min_rank_(x, na_last=na_last, __ast_fallback="normal")
@_register_func(pipeable=True, dispatchable=True)
def min_rank_(x, *, na_last: str = "keep") -> Any:
raise _NotImplementedByCurrentBackendError("min_rank", x)
def dense_rank(x=_f_symbolic, *, na_last: str = "keep") -> Any:
"""Get the dense rank of x
Note that this function doesn't support piping.
Args:
x: The data to rank
Defaults to `Symbolic()` so the whole data is used by default
when called `dense_rank()`
na_last: How NA values are handled
- "keep": keep NA values as NA in the result
- "top": NA values are assigned the smallest ranks
- "bottom": NA values are assigned the largest ranks
Returns:
The dense ranks (ties share a rank, with no gaps between ranks)
"""
return dense_rank_(x, na_last=na_last, __ast_fallback="normal")
@_register_func(pipeable=True, dispatchable=True)
def dense_rank_(x, *, na_last: str = "keep") -> Any:
raise _NotImplementedByCurrentBackendError("dense_rank", x)
def percent_rank(x=_f_symbolic, *, na_last: str = "keep") -> Any:
"""Get the percent rank of x
Note that this function doesn't support piping.
Args:
x: The data to rank
Defaults to `Symbolic()` so the whole data is used by default
when called `percent_rank()`
na_last: How NA values are handled
- "keep": keep NA values as NA in the result
- "top": NA values are assigned the smallest ranks
- "bottom": NA values are assigned the largest ranks
Returns:
The percent ranks, rescaled between 0 and 1
"""
return percent_rank_(x, na_last=na_last, __ast_fallback="normal")
@_register_func(pipeable=True, dispatchable=True)
def percent_rank_(x, *, na_last: str = "keep") -> Any:
raise _NotImplementedByCurrentBackendError("percent_rank", x)
def cume_dist(x=_f_symbolic, *, na_last: str = "keep") -> Any:
"""Get the cume_dist of x
Note that this function doesn't support piping.
Args:
x: The data to rank
Defaults to `Symbolic()` so the whole data is used by default
when called `cume_dist()`
na_last: How NA values are handled
- "keep": keep NA values as NA in the result
- "top": NA values are assigned the smallest ranks
- "bottom": NA values are assigned the largest ranks
Returns:
The cumulative distribution: the proportion of values less
than or equal to each value
"""
return cume_dist_(x, na_last=na_last, __ast_fallback="normal")
@_register_func(pipeable=True, dispatchable=True)
def cume_dist_(x, *, na_last: str = "keep") -> Any:
raise _NotImplementedByCurrentBackendError("cume_dist", x)
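Pure-Python reference sketches of the rank family (without NA handling), to pin down the formulas the stubs above describe.

```python
def min_rank_sketch(x):
    # ties get the minimum of the ranks they occupy
    return [sorted(x).index(v) + 1 for v in x]

def dense_rank_sketch(x):
    # like min_rank, but with no gaps between ranks
    levels = sorted(set(x))
    return [levels.index(v) + 1 for v in x]

def percent_rank_sketch(x):
    # min_rank rescaled to [0, 1]
    n = len(x)
    return [(r - 1) / (n - 1) for r in min_rank_sketch(x)]

def cume_dist_sketch(x):
    # proportion of values <= each value
    n = len(x)
    return [sum(v <= u for v in x) / n for u in x]

# For x = [1, 2, 2, 4]:
# min_rank   -> [1, 2, 2, 4]
# dense_rank -> [1, 2, 2, 3]
# cume_dist  -> [0.25, 0.75, 0.75, 1.0]
```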
# recode
@_register_func(pipeable=True, dispatchable=True)
def recode(_x, *args, _default=None, _missing=None, **kwargs) -> Any:
"""Recode a vector, replacing elements in it
Args:
_x: A vector to modify
*args: and
**kwargs: replacements
_default: If supplied, all values not otherwise matched will be
given this value. If not supplied and if the replacements are
the same type as the original values in series, unmatched values
are not changed. If not supplied and if the replacements are
not compatible, unmatched values are replaced with np.nan.
_missing: If supplied, any missing values in .x will be replaced
by this value.
Returns:
The vector with values replaced
"""
raise _NotImplementedByCurrentBackendError("recode")
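A hypothetical element-wise sketch of the recoding rule, using a dict for the keyword replacements and `None` as the NA stand-in; the real implementation also handles positional replacements and dtype compatibility.

```python
def recode_sketch(x, mapping, default=None, missing=None):
    out = []
    for value in x:
        if value is None:
            # NA values are replaced by `missing` when supplied
            out.append(missing if missing is not None else value)
        elif value in mapping:
            out.append(mapping[value])
        else:
            # unmatched values stay unchanged unless a default is given
            out.append(value if default is None else default)
    return out

# recode_sketch(["a", "b", None], {"a": "apple"}, default="other", missing="?")
# -> ['apple', 'other', '?']
```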
@_register_func(pipeable=True, dispatchable=True)
def recode_factor(
_x,
*args,
_default=None,
_missing=None,
_ordered: bool = False,
**kwargs,
) -> Any:
"""Recode a factor, replacing levels in it
Args:
_x: A factor to modify
*args: and
**kwargs: replacements
_default: If supplied, all values not otherwise matched will be
given this value. If not supplied and if the replacements are
the same type as the original values in series, unmatched values
are not changed. If not supplied and if the replacements are
not compatible, unmatched values are replaced with np.nan.
_missing: If supplied, any missing values in .x will be replaced
by this value.
_ordered: If True, the factor will be ordered
Returns:
The factor with levels replaced
"""
raise _NotImplementedByCurrentBackendError("recode_factor")
@_register_verb(ast_fallback=_get_verb_ast_fallback("relocate"))
def relocate(
_data,
*args,
_before: int | str = None,
_after: int | str = None,
**kwargs,
) -> Any:
"""change column positions
See original API
https://dplyr.tidyverse.org/reference/relocate.html
Args:
_data: A data frame
*args: and
**kwargs: Columns to rename and move
_before: and
_after: Destination. Supplying neither will move columns to
the left-hand side; specifying both is an error.
Returns:
An object of the same type as .data. The output has the following
properties:
- Rows are not affected.
- The same columns appear in the output, but (usually) in a
different place.
- Data frame attributes are preserved.
- Groups are not affected
"""
raise _NotImplementedByCurrentBackendError("relocate", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("rename"))
def rename(_data, **kwargs) -> Any:
"""Rename columns
See original API
https://dplyr.tidyverse.org/reference/rename.html
Args:
_data: A data frame
**kwargs: Columns to rename
Returns:
The dataframe with new names
"""
raise _NotImplementedByCurrentBackendError("rename", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("rename_with"))
def rename_with(_data, _fn, *args, **kwargs) -> Any:
"""Rename columns with a function
See original API
https://dplyr.tidyverse.org/reference/rename.html
Args:
_data: A data frame
_fn: A function to apply to column names
*args: the columns to rename and non-keyword arguments for the `_fn`.
If `*args` is not provided, then assuming all columns, and
no non-keyword arguments are allowed to pass to the function, use
keyword arguments instead.
**kwargs: keyword arguments for `_fn`
Returns:
The dataframe with new names
"""
raise _NotImplementedByCurrentBackendError("rename_with", _data)
# rows
@_register_verb(ast_fallback=_get_verb_ast_fallback("rows_insert"))
def rows_insert(
x,
y,
by=None,
conflict: str = "error",
**kwargs,
) -> Any:
"""Insert rows from y into x
See original API
https://dplyr.tidyverse.org/reference/rows.html
Args:
x: A data frame
y: A data frame
by: An unnamed character vector giving the key columns.
The key columns must exist in both x and y.
Keys typically uniquely identify each row, but this is only
enforced for the key values of y
By default, we use the first column in y, since the first column is
a reasonable place to put an identifier variable.
conflict: How to handle conflicts
- "error": Throw an error
- "ignore": Ignore conflicts
**kwargs: Additional arguments to pass to the backend, such as
`copy` and `in_place`. Depends on the backend implementation.
Returns:
A data frame with all existing rows and potentially new rows
"""
raise _NotImplementedByCurrentBackendError("rows_insert", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("rows_update"))
def rows_update(
x,
y,
by=None,
unmatched: str = "error",
**kwargs,
) -> Any:
"""Update rows in x with values from y
See original API
https://dplyr.tidyverse.org/reference/rows.html
Args:
x: A data frame
y: A data frame
by: An unnamed character vector giving the key columns.
The key columns must exist in both x and y.
Keys typically uniquely identify each row, but this is only
enforced for the key values of y
By default, we use the first column in y, since the first column is
a reasonable place to put an identifier variable.
unmatched: how should keys in y that are unmatched by the keys
in x be handled?
One of -
"error", the default, will error if there are any keys in y that
are unmatched by the keys in x.
"ignore" will ignore rows in y with keys that are unmatched
by the keys in x.
**kwargs: Additional arguments to pass to the backend, such as
`copy` and `in_place`. Depends on the backend implementation.
Returns:
A data frame with matching rows updated from y; the number of
rows is preserved
"""
raise _NotImplementedByCurrentBackendError("rows_update", x)
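The update semantics can be sketched with plain pandas (illustrative only, not datar's backend): rows of `x` whose key matches a key in `y` take `y`'s values, while row count and order of `x` are preserved.

```python
import pandas as pd

# Hedged sketch of rows_update() semantics using pandas alignment.
x = pd.DataFrame({"id": [1, 2, 3], "val": ["a", "b", "c"]})
y = pd.DataFrame({"id": [2, 3], "val": ["B", "C"]})

updated = x.set_index("id")
updated.update(y.set_index("id"))  # overwrite values at matching keys
updated = updated.reset_index()
```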
@_register_verb(ast_fallback=_get_verb_ast_fallback("rows_patch"))
def rows_patch(
x,
y,
by=None,
unmatched: str = "error",
**kwargs,
) -> Any:
"""Patch rows in x with values from y
See original API
https://dplyr.tidyverse.org/reference/rows.html
Args:
x: A data frame
y: A data frame
by: An unnamed character vector giving the key columns.
The key columns must exist in both x and y.
Keys typically uniquely identify each row, but this is only
enforced for the key values of y
By default, we use the first column in y, since the first column is
a reasonable place to put an identifier variable.
unmatched: how should keys in y that are unmatched by the keys
in x be handled?
One of -
"error", the default, will error if there are any keys in y that
are unmatched by the keys in x.
"ignore" will ignore rows in y with keys that are unmatched
by the keys in x.
**kwargs: Additional arguments to pass to the backend, such as
`copy` and `in_place`. Depends on the backend implementation.
Returns:
A data frame with NA values overwritten and the number of rows preserved
"""
raise _NotImplementedByCurrentBackendError("rows_patch", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("rows_upsert"))
def rows_upsert(x, y, by=None, **kwargs) -> Any:
"""Upsert rows in x with values from y
See original API
https://dplyr.tidyverse.org/reference/rows.html
Args:
x: A data frame
y: A data frame
by: An unnamed character vector giving the key columns.
The key columns must exist in both x and y.
Keys typically uniquely identify each row, but this is only
enforced for the key values of y
By default, we use the first column in y, since the first column is
a reasonable place to put an identifier variable.
**kwargs: Additional arguments to pass to the backend, such as
`copy` and `in_place`. Depends on the backend implementation.
Returns:
A data frame with inserted or updated depending on whether or not
the key value in y already exists in x. Key values in y must be unique.
"""
raise _NotImplementedByCurrentBackendError("rows_upsert", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("rows_delete"))
def rows_delete(
x,
y,
by=None,
unmatched: str = "error",
**kwargs,
) -> Any:
"""Delete rows in x that match keys in y
See original API
https://dplyr.tidyverse.org/reference/rows.html
Args:
x: A data frame
y: A data frame
by: An unnamed character vector giving the key columns.
The key columns must exist in both x and y.
Keys typically uniquely identify each row, but this is only
enforced for the key values of y
By default, we use the first column in y, since the first column is
a reasonable place to put an identifier variable.
unmatched: how should keys in y that are unmatched by the keys
in x be handled?
One of -
"error", the default, will error if there are any keys in y that
are unmatched by the keys in x.
"ignore" will ignore rows in y with keys that are unmatched
by the keys in x.
**kwargs: Additional arguments to pass to the backend, such as
`copy` and `in_place`. Depends on the backend implementation.
Returns:
A data frame with rows deleted
"""
raise _NotImplementedByCurrentBackendError("rows_delete", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("rows_append"))
def rows_append(x, y, **kwargs) -> Any:
"""Append rows in y to x
See original API
https://dplyr.tidyverse.org/reference/rows.html
Args:
x: A data frame
y: A data frame
**kwargs: Additional arguments to pass to the backend, such as
`copy` and `in_place`. Depends on the backend implementation.
Returns:
A data frame with rows appended
"""
raise _NotImplementedByCurrentBackendError("rows_append", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("select"))
def select(_data, *args, **kwargs) -> Any:
"""Select columns from a data frame.
See original API
https://dplyr.tidyverse.org/reference/select.html
Args:
_data: A data frame
*args: A list of columns to select
**kwargs: A list of columns to select
Returns:
A data frame with only the selected columns
"""
raise _NotImplementedByCurrentBackendError("select", _data)
@_register_func(pipeable=True, dispatchable=True)
def union_all(x, y) -> Any:
"""Combine two data frames together, keeping duplicate rows.
See original API
https://dplyr.tidyverse.org/reference/setops.html
Args:
x: A data frame
y: A data frame
Returns:
A data frame with rows from x and y
"""
raise _NotImplementedByCurrentBackendError("union_all", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("summarise"))
def summarise(_data, *args, _groups: str = None, **kwargs) -> Any:
"""Summarise a data frame.
See original API
https://dplyr.tidyverse.org/reference/summarise.html
Args:
_data: A data frame
_groups: Grouping structure of the result.
- "drop_last": dropping the last level of grouping.
- "drop": All levels of grouping are dropped.
- "keep": Same grouping structure as _data.
- "rowwise": Each row is its own group.
*args: and
**kwargs: Name-value pairs, where value is the summarized
data for each group
Returns:
A data frame with the summarised columns
"""
raise _NotImplementedByCurrentBackendError("summarise", _data)
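What a summarise() call computes per group can be sketched with a plain pandas groupby (column names here are hypothetical, for illustration):

```python
import pandas as pd

# One row per group, with the summarized column.
df = pd.DataFrame({"g": ["x", "x", "y"], "v": [1, 2, 10]})
out = df.groupby("g", as_index=False).agg(mean_v=("v", "mean"))
```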
summarize = summarise
@_register_verb(ast_fallback=_get_verb_ast_fallback("reframe"))
def reframe(_data, *args, **kwargs) -> Any:
"""Reframe a data frame.
See original API
https://dplyr.tidyverse.org/reference/reframe.html
Args:
_data: A data frame
*args: and
**kwargs: Name-value pairs, where value is the reframed
data for each group
Returns:
A data frame with the reframed columns
"""
raise _NotImplementedByCurrentBackendError("reframe", _data)
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("where"))
def where(_data, fn: _Callable) -> Any:
"""Selects the variables for which a function returns True.
See original API
https://tidyselect.r-lib.org/reference/where.html
Args:
_data: A data frame
fn: A function that returns True or False.
Currently it has to be a `register_func`/`func_factory`
registered function; purrr-like formulas are not supported yet.
Returns:
The matched columns
"""
raise _NotImplementedByCurrentBackendError("where", _data)
@_register_verb(
dependent=True, ast_fallback=_get_verb_ast_fallback("everything")
)
def everything(_data) -> Any:
"""Select all variables.
See original API
https://dplyr.tidyverse.org/reference/select.html
Args:
_data: A data frame
Returns:
All columns
"""
raise _NotImplementedByCurrentBackendError("everything", _data)
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("last_col"))
def last_col(_data, offset: int = 0, vars=None) -> Any:
"""Select the last column.
See original API
https://dplyr.tidyverse.org/reference/select.html
Args:
_data: A data frame
offset: The offset of the last column
vars: A list of columns to select
Returns:
The last column
"""
raise _NotImplementedByCurrentBackendError("last_col", _data)
@_register_verb(
dependent=True, ast_fallback=_get_verb_ast_fallback("starts_with")
)
def starts_with(_data, match, ignore_case: bool = True, vars=None) -> Any:
"""Select columns that start with a string.
See original API
https://dplyr.tidyverse.org/reference/select.html
Args:
_data: A data frame
match: The string to match
ignore_case: Ignore case when matching
vars: A list of columns to select
Returns:
The matched columns
"""
raise _NotImplementedByCurrentBackendError("starts_with", _data)
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("ends_with"))
def ends_with(_data, match, ignore_case: bool = True, vars=None) -> Any:
"""Select columns that end with a string.
See original API
https://dplyr.tidyverse.org/reference/select.html
Args:
_data: A data frame
match: The string to match
ignore_case: Ignore case when matching
vars: A list of columns to select
Returns:
The matched columns
"""
raise _NotImplementedByCurrentBackendError("ends_with", _data)
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("contains"))
def contains(_data, match, ignore_case: bool = True, vars=None) -> Any:
"""Select columns that contain a string.
See original API
https://dplyr.tidyverse.org/reference/select.html
Args:
_data: A data frame
match: The string to match
ignore_case: Ignore case when matching
vars: A list of columns to select
Returns:
The matched columns
"""
raise _NotImplementedByCurrentBackendError("contains", _data)
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("matches"))
def matches(_data, match, ignore_case: bool = True, vars=None) -> Any:
"""Select columns that match a regular expression.
See original API
https://dplyr.tidyverse.org/reference/select.html
Args:
_data: A data frame
match: The regular expression to match
ignore_case: Ignore case when matching
vars: A list of columns to select
Returns:
The matched columns
"""
raise _NotImplementedByCurrentBackendError("matches", _data)
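The four helpers above all reduce to a regex search over column names; a hedged sketch of that reduction (the function and `cols` are hypothetical, not datar internals):

```python
import re

def select_matching(cols, pattern, ignore_case=True):
    """Return the column names matching a regex pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    return [c for c in cols if re.search(pattern, c, flags)]

cols = ["Sepal_Length", "Sepal_Width", "Petal_Length", "species"]
# starts_with("sepal") ~ select_matching(cols, "^sepal")
# ends_with("width")   ~ select_matching(cols, "width$")
# contains("tal")      ~ select_matching(cols, re.escape("tal"))
```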
@_register_func(pipeable=True, dispatchable=True)
def num_range(prefix: str, range_, width: int = None) -> Any:
"""Matches a numerical range like x01, x02, x03.
Args:
prefix: A prefix that starts the numeric range.
range_: A sequence of integers, like `range(3)` (produces `0,1,2`).
width: Optionally, the "width" of the numeric range.
For example, a width of 2 gives "01", a width of 3 gives
"001", etc.
Returns:
A list of names in the range, with the prefix.
"""
raise _NotImplementedByCurrentBackendError("num_range")
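The naming rule is simple enough to sketch directly (the helper name is hypothetical):

```python
# Zero-pad each integer in range_ to `width` digits, then prepend prefix.
def num_range_names(prefix, range_, width=None):
    if width is None:
        return [f"{prefix}{i}" for i in range_]
    return [f"{prefix}{i:0{width}d}" for i in range_]

# num_range_names("x", range(1, 4), width=2) -> ["x01", "x02", "x03"]
```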
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("all_of"))
def all_of(_data, x) -> Any:
"""For strict selection.
If any of the variables in the character vector is missing,
an error is thrown.
Args:
_data: The data piped in
x: A set of variables to match the columns
Returns:
The matched column names
Raises:
ColumnNotExistingError: When any of the elements in `x` does not exist
in `_data` columns
"""
raise _NotImplementedByCurrentBackendError("all_of", _data)
@_register_verb(dependent=True, ast_fallback=_get_verb_ast_fallback("any_of"))
def any_of(_data, x, vars=None) -> Any:
"""For flexible selection.
Like `all_of()`, but variables missing from the data are
silently ignored instead of raising an error.
Args:
_data: The data piped in
x: A set of variables to match the columns
vars: A list of columns to select
Returns:
The matched column names
"""
raise _NotImplementedByCurrentBackendError("any_of", _data)
================================================
FILE: datar/apis/forcats.py
================================================
from typing import Any
from pipda import register_func as _register_func
from ..core.utils import (
NotImplementedByCurrentBackendError as _NotImplementedByCurrentBackendError,
)
from .base import as_factor # noqa: F401
@_register_func(pipeable=True, dispatchable=True)
def fct_relevel(_f, *lvls, after: int = None) -> Any:
"""Reorder factor levels by hand
Args:
_f: A factor (categorical), or a string vector
*lvls: Either a function (then `len(lvls)` should equal to `1`) or
the new levels.
A function will be called with the current levels as input, and the
return value (which must be a character vector) will be used to
relevel the factor.
Any levels not mentioned will be left in their existing order,
by default after the explicitly mentioned levels.
after: Where should the new values be placed?
Returns:
The factor with levels replaced
"""
raise _NotImplementedByCurrentBackendError("fct_relevel", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_inorder(_f, ordered: bool = None) -> Any:
"""Reorder factor levels by first appearance
Args:
_f: A factor
ordered: A logical which determines the "ordered" status of the
output factor.
Returns:
The factor with levels reordered
"""
raise _NotImplementedByCurrentBackendError("fct_inorder", _f)
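The level order fct_inorder() produces can be sketched without any factor machinery (helper name hypothetical):

```python
# Levels ordered by first appearance in the data, duplicates dropped.
def levels_in_order_of_appearance(values):
    seen = []
    for v in values:
        if v not in seen:
            seen.append(v)
    return seen

# levels_in_order_of_appearance(["b", "a", "b", "c"]) -> ["b", "a", "c"]
```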
@_register_func(pipeable=True, dispatchable=True)
def fct_infreq(_f, ordered: bool = None) -> Any:
"""Reorder factor levels by frequency
Args:
_f: A factor
ordered: A logical which determines the "ordered" status of the
output factor.
Returns:
The factor with levels reordered
"""
raise _NotImplementedByCurrentBackendError("fct_infreq", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_inseq(_f, ordered: bool = None) -> Any:
"""Reorder factor levels by sequence
Args:
_f: A factor
ordered: A logical which determines the "ordered" status of the
output factor.
Returns:
The factor with levels reordered
"""
raise _NotImplementedByCurrentBackendError("fct_inseq", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_reorder(_f, _x, *args, _fun=None, _desc: bool = False, **kwargs) -> Any:
"""Reorder factor levels by a function (default: median)
Args:
_f: A factor
_x: The data to be used to reorder the factor
_fun: A function to be used to reorder the factor
_desc: If `True`, the factor will be reordered in descending order
*args: Extra arguments to be passed to `_fun`
**kwargs: Extra keyword arguments to be passed to `_fun`
Returns:
The factor with levels reordered
"""
raise _NotImplementedByCurrentBackendError("fct_reorder", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_reorder2(
_f,
_x,
*args,
_fun=None,
_desc: bool = False,
**kwargs,
) -> Any:
"""Reorder factor levels by a function (default: `last2`)
Args:
_f: A factor
_x: The data to be used to reorder the factor
_fun: A function to be used to reorder the factor
_desc: If `True`, the factor will be reordered in descending order
*args: Extra arguments to be passed to `_fun`
**kwargs: Extra keyword arguments to be passed to `_fun`
Returns:
The factor with levels reordered
"""
raise _NotImplementedByCurrentBackendError("fct_reorder2", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_shuffle(_f) -> Any:
"""Shuffle the levels of a factor
Args:
_f: A factor
Returns:
The factor with levels shuffled
"""
raise _NotImplementedByCurrentBackendError("fct_shuffle", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_rev(_f) -> Any:
"""Reverse the order of the levels of a factor
Args:
_f: A factor
Returns:
The factor with levels reversed
"""
raise _NotImplementedByCurrentBackendError("fct_rev", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_shift(_f, n: int = 1) -> Any:
"""Shift the levels of a factor
Args:
_f: A factor
n: The number of levels to shift
Returns:
The factor with levels shifted
"""
raise _NotImplementedByCurrentBackendError("fct_shift", _f)
@_register_func(pipeable=True, dispatchable=True)
def first2(_x, _y) -> Any:
"""Find the first element of `_y` ordered by `_x`
Args:
_x: The vector used to order `_y`
_y: The vector to get the first element of
Returns:
First element of `_y` ordered by `_x`
"""
raise _NotImplementedByCurrentBackendError("first2", _x)
@_register_func(pipeable=True, dispatchable=True)
def last2(_x, _y) -> Any:
"""Find the last element of `_y` ordered by `_x`
Args:
_x: The vector used to order `_y`
_y: The vector to get the last element of
Returns:
Last element of `_y` ordered by `_x`
"""
raise _NotImplementedByCurrentBackendError("last2", _x)
@_register_func(pipeable=True, dispatchable=True)
def fct_anon(_f, prefix: str = "") -> Any:
"""Anonymise factor levels
Args:
_f: A factor.
prefix: A character prefix to insert in front of the random labels.
Returns:
The factor with levels anonymised
"""
raise _NotImplementedByCurrentBackendError("fct_anon", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_recode(_f, *args, **kwargs) -> Any:
"""Change factor levels by hand
Args:
_f: A factor
*args: and
**kwargs: A sequence of named character vectors where the name
gives the new level, and the value gives the old level.
Levels not otherwise mentioned will be left as is. Levels can
be removed by naming them `NULL`.
As `NULL/None` cannot be a name of keyword arguments, replacement
has to be specified as a dict
(i.e. `fct_recode(x, {NULL: "apple"})`)
If you want to replace multiple values with the same old value,
use a `set`/`list`/`numpy.ndarray`
(i.e. `fct_recode(x, fruit=["apple", "banana"])`).
This is a safe way, since `set`/`list`/`numpy.ndarray` is
not hashable to be a level of a factor.
Do NOT use a `tuple`, as it's hashable!
Note that the order of the name-value is in the reverse way as
`dplyr.recode()` and `dplyr.recode_factor()`
Returns:
The factor recoded with given recodings
"""
raise _NotImplementedByCurrentBackendError("fct_recode", _f)
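The name-value direction described above (new level as the keyword, old level or levels as the value) can be sketched on plain values; the helper is hypothetical and ignores the `NULL`-removal case:

```python
# Invert {new: old_or_list_of_olds} into an old->new lookup, then map.
def recode_values(values, **recodings):
    old_to_new = {}
    for new, old in recodings.items():
        olds = old if isinstance(old, (list, set)) else [old]
        for o in olds:
            old_to_new[o] = new
    return [old_to_new.get(v, v) for v in values]

# recode_values(["apple", "bear", "banana"], fruit=["apple", "banana"])
# -> ["fruit", "bear", "fruit"]
```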
@_register_func(pipeable=True, dispatchable=True)
def fct_collapse(_f, other_level=None, **kwargs) -> Any:
"""Collapse factor levels into manually defined groups
Args:
_f: A factor
**kwargs: The levels to collapse.
Like `name=[old_level, old_level1, ...]`. The old levels will
be replaced with `name`
other_level: If given, replace all levels not named in `kwargs`
with this value; otherwise, leave them uncollapsed.
Returns:
The factor with levels collapsed.
"""
raise _NotImplementedByCurrentBackendError("fct_collapse", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_lump(
_f,
n=None,
prop=None,
w=None,
other_level="Other",
ties_method: str = "min",
) -> Any:
"""Lump together factor levels into "other"
Args:
_f: A factor
n: Positive `n` preserves the most common `n` values.
Negative `n` preserves the least common `-n` values.
If there are ties, you will get at least `abs(n)` values.
prop: Positive `prop` lumps values which do not appear at least
`prop` of the time. Negative `prop` lumps values that
do not appear at most `-prop` of the time.
w: An optional numeric vector giving weights for frequency of
each value (not level) in f.
other_level: Value of level used for "other" values. Always
placed at end of levels.
ties_method: A character string specifying how ties are treated.
One of: `average`, `first`, `dense`, `max`, and `min`.
Returns:
The factor with levels lumped.
"""
raise _NotImplementedByCurrentBackendError("fct_lump", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_lump_min(_f, min_, w=None, other_level="Other") -> Any:
"""Lumps levels that appear fewer than `min_` times.
Args:
_f: A factor
min_: Preserve levels that appear at least `min_` number of times.
w: An optional numeric vector giving weights for frequency of
each value (not level) in f.
other_level: Value of level used for "other" values. Always
placed at end of levels.
Returns:
The factor with levels lumped.
"""
raise _NotImplementedByCurrentBackendError("fct_lump_min", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_lump_prop(_f, prop, w=None, other_level="Other") -> Any:
"""Lumps levels that appear fewer than `prop * n` times.
Args:
_f: A factor
prop: Positive `prop` lumps values which do not appear at least
`prop` of the time. Negative `prop` lumps values that
do not appear at most `-prop` of the time.
w: An optional numeric vector giving weights for frequency of
each value (not level) in f.
other_level: Value of level used for "other" values. Always
placed at end of levels.
Returns:
The factor with levels lumped.
"""
raise _NotImplementedByCurrentBackendError("fct_lump_prop", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_lump_n(_f, n, w=None, other_level="Other") -> Any:
"""Lumps all levels except for the `n` most frequent.
Args:
_f: A factor
n: Positive `n` preserves the most common `n` values.
Negative `n` preserves the least common `-n` values.
If there are ties, you will get at least `abs(n)` values.
w: An optional numeric vector giving weights for frequency of
each value (not level) in f.
other_level: Value of level used for "other" values. Always
placed at end of levels.
Returns:
The factor with levels lumped.
"""
raise _NotImplementedByCurrentBackendError("fct_lump_n", _f)
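The core lumping rule can be sketched in a few lines (illustrative only; ties and weights are ignored here, unlike the real API):

```python
from collections import Counter

# Keep the n most frequent values, replace everything else with
# other_level.
def lump_n(values, n, other_level="Other"):
    keep = {v for v, _ in Counter(values).most_common(n)}
    return [v if v in keep else other_level for v in values]

# lump_n(["a", "a", "a", "b", "b", "c"], 1)
# -> ["a", "a", "a", "Other", "Other", "Other"]
```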
@_register_func(pipeable=True, dispatchable=True)
def fct_lump_lowfreq(_f, other_level="Other") -> Any:
"""Lumps together the least frequent levels, ensuring
that "other" is still the smallest level.
Args:
_f: A factor
other_level: Value of level used for "other" values. Always
placed at end of levels.
Returns:
The factor with levels lumped.
"""
raise _NotImplementedByCurrentBackendError("fct_lump_lowfreq", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_other(_f, keep=None, drop=None, other_level="Other") -> Any:
"""Replace levels with "other"
Args:
_f: A factor
keep: and
drop: Pick one of `keep` and `drop`:
- `keep` will preserve listed levels, replacing all others with
`other_level`.
- `drop` will replace listed levels with `other_level`, keeping
all others as is.
other_level: Value of level used for "other" values. Always
placed at end of levels.
Returns:
The factor with levels replaced.
"""
raise _NotImplementedByCurrentBackendError("fct_other", _f)
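The keep/drop rule above can be sketched on plain values (helper name hypothetical):

```python
# `keep` preserves listed values; `drop` replaces listed values.
def replace_with_other(values, keep=None, drop=None, other_level="Other"):
    if keep is not None:
        return [v if v in keep else other_level for v in values]
    return [other_level if v in drop else v for v in values]

# replace_with_other(["a", "b", "c"], keep=["a"]) -> ["a", "Other", "Other"]
# replace_with_other(["a", "b", "c"], drop=["a"]) -> ["Other", "b", "c"]
```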
@_register_func(pipeable=True, dispatchable=True)
def fct_relabel(_f, _fun, *args, **kwargs) -> Any:
"""Automatically relabel factor levels, collapse as necessary
Args:
_f: A factor
_fun: A function to be applied to each level. Must accept the old
levels and return a character vector of the same length
as its input.
*args: and
**kwargs: Additional arguments to `_fun`
Returns:
The factor with levels relabeled
"""
raise _NotImplementedByCurrentBackendError("fct_relabel", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_expand(_f, *additional_levels) -> Any:
"""Add additional levels to a factor
Args:
_f: A factor
*additional_levels: Additional levels to add to the factor.
Levels that already exist will be silently ignored.
Returns:
The factor with levels expanded
"""
raise _NotImplementedByCurrentBackendError("fct_expand", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_explicit_na(_f, na_level="(Missing)") -> Any:
"""Make missing values explicit
This gives missing values an explicit factor level, ensuring that they
appear in summaries and on plots.
Args:
_f: A factor
na_level: Level to use for missing values.
This is what NAs will be changed to.
Returns:
The factor with the explicit NA level
"""
raise _NotImplementedByCurrentBackendError("fct_explicit_na", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_drop(_f, only=None) -> Any:
"""Drop unused levels
Args:
_f: A factor
only: A character vector restricting the set of levels to be dropped.
If supplied, only levels that have no entries and appear in
this vector will be removed.
Returns:
The factor with unused levels dropped
"""
raise _NotImplementedByCurrentBackendError("fct_drop", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_unify(
fs,
levels=None,
) -> Any:
"""Unify the levels in a list of factors
Args:
fs: A list of factors
levels: Set of levels to apply to every factor. Default to union
of all factor levels
Returns:
A list of factors with the levels expanded
"""
raise _NotImplementedByCurrentBackendError("fct_unify", fs)
@_register_func(pipeable=True, dispatchable=True)
def fct_c(*fs) -> Any:
"""Concatenate factors, combining levels
This is a useful way of patching together factors from multiple sources
that really should have the same levels but don't.
Args:
*fs: factors to concatenate
Returns:
The concatenated factor
"""
raise _NotImplementedByCurrentBackendError("fct_c")
@_register_func(pipeable=True, dispatchable=True)
def fct_cross(
*fs,
sep: str = ":",
keep_empty: bool = False,
) -> Any:
"""Combine levels from two or more factors to create a new factor
Computes a factor whose levels are all the combinations of
the levels of the input factors.
Args:
*fs: factors to cross
sep: A string to separate levels
keep_empty: If True, keep combinations with no observations as levels
Returns:
The new factor
"""
raise _NotImplementedByCurrentBackendError("fct_cross")
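The value construction can be sketched element-wise (the level set would then be the observed combinations, or all combinations with `keep_empty`); the helper is hypothetical:

```python
# Paste parallel factors element-wise with `sep`.
def cross_values(f1, f2, sep=":"):
    return [f"{a}{sep}{b}" for a, b in zip(f1, f2)]

# cross_values(["a", "b"], ["x", "y"]) -> ["a:x", "b:y"]
```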
@_register_func(pipeable=True, dispatchable=True)
def fct_count(_f, sort: bool = False, prop=False) -> Any:
"""Count entries in a factor
Args:
_f: A factor
sort: If True, sort the result so that the most common values float to
the top
prop: If True, compute the fraction of marginal table.
Returns:
A data frame with columns `f`, `n` and `p`, if prop is True
"""
raise _NotImplementedByCurrentBackendError("fct_count", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_match(_f, lvls) -> Any:
"""Test for presence of levels in a factor
Do any of `lvls` occur in `_f`?
Args:
_f: A factor
lvls: A vector specifying levels to look for.
Returns:
A logical factor
"""
raise _NotImplementedByCurrentBackendError("fct_match", _f)
@_register_func(pipeable=True, dispatchable=True)
def fct_unique(_f) -> Any:
"""Unique values of a factor
Args:
_f: A factor
Returns:
The factor with the unique values in `_f`
"""
raise _NotImplementedByCurrentBackendError("fct_unique", _f)
@_register_func(pipeable=True, dispatchable=True)
def lvls_reorder(
_f,
idx,
ordered: bool = None,
) -> Any:
"""Leaves values of a factor as they are, but changes the order by
given indices
Args:
_f: A factor (or character vector).
idx: An integer index, with one integer for each existing level.
ordered: A logical which determines the "ordered" status of the
output factor. `None` preserves the existing status of the factor.
Returns:
The factor with levels reordered
"""
raise _NotImplementedByCurrentBackendError("lvls_reorder", _f)
@_register_func(pipeable=True, dispatchable=True)
def lvls_revalue(
_f,
new_levels,
) -> Any:
"""Changes the values of existing levels; there must
be one new level for each old level
Args:
_f: A factor
new_levels: A character vector of new levels.
Returns:
The factor with the new levels
"""
raise _NotImplementedByCurrentBackendError("lvls_revalue", _f)
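The one-new-level-per-old-level mapping can be sketched on plain values (helper hypothetical):

```python
# Map each old level to its positional counterpart in new_levels.
def revalue(values, old_levels, new_levels):
    assert len(old_levels) == len(new_levels)
    mapping = dict(zip(old_levels, new_levels))
    return [mapping[v] for v in values]

# revalue(["a", "b", "a"], ["a", "b"], ["x", "y"]) -> ["x", "y", "x"]
```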
@_register_func(pipeable=True, dispatchable=True)
def lvls_expand(
_f,
new_levels,
) -> Any:
"""Expands the set of levels; the new levels must
include the old levels.
Args:
_f: A factor
new_levels: The new levels. Must include the old ones
Returns:
The factor with the new levels
"""
raise _NotImplementedByCurrentBackendError("lvls_expand", _f)
@_register_func(pipeable=True, dispatchable=True)
def lvls_union(fs) -> Any:
"""Find all levels in a list of factors
Args:
fs: A list of factors
Returns:
A list of all levels
"""
raise _NotImplementedByCurrentBackendError("lvls_union", fs)
================================================
FILE: datar/apis/misc.py
================================================
from contextlib import contextmanager
from pipda import register_func
@contextmanager
def _array_ufunc_with_backend(backend: str):
"""Use a backend for the operator"""
old_backend = array_ufunc.backend
array_ufunc.backend = backend
yield
array_ufunc.backend = old_backend
@register_func(cls=object, dispatchable="first")
def array_ufunc(x, ufunc, *args, kind, **kwargs):
"""Implement the array ufunc
Allow other backends to override the behavior of the ufunc on
different types of data.
"""
return ufunc(x, *args, **kwargs)
array_ufunc.backend = None
array_ufunc.with_backend = _array_ufunc_with_backend
================================================
FILE: datar/apis/tibble.py
================================================
from __future__ import annotations as _
from typing import Any, Callable as _Callable
from pipda import (
register_verb as _register_verb,
register_func as _register_func,
)
from ..core.verb_env import get_verb_ast_fallback as _get_verb_ast_fallback
from ..core.utils import (
NotImplementedByCurrentBackendError as _NotImplementedByCurrentBackendError,
)
@_register_func(plain=True)
def tibble(
*args,
_name_repair: str | _Callable = "check_unique",
_rows: int = None,
_dtypes=None,
_drop_index: bool = False,
_index=None,
**kwargs,
) -> Any:
"""Constructs a data frame
Args:
*args: and
**kwargs: A set of name-value pairs.
_name_repair: treatment of problematic column names:
- "minimal": No name repair or checks, beyond basic existence,
- "unique": Make sure names are unique and not empty,
- "check_unique": (default value), no name repair,
but check they are unique,
- "universal": Make the names unique and syntactic
- a function: apply custom name repair
_rows: Number of rows of a 0-col dataframe when args and kwargs are
not provided. When args or kwargs are provided, this is ignored.
_dtypes: The dtypes for each column to convert to.
_drop_index: Whether drop the index for the final data frame
_index: The new index of the output frame
Returns:
A constructed tibble
"""
raise _NotImplementedByCurrentBackendError("tibble")
@_register_func(pipeable=True, dispatchable=True)
def tibble_(
*args,
_name_repair: str | _Callable = "check_unique",
_rows: int = None,
_dtypes=None,
_drop_index: bool = False,
_index=None,
**kwargs,
) -> Any:
raise _NotImplementedByCurrentBackendError("tibble_")
@_register_func(plain=True)
def tribble(
*dummies,
_name_repair: str | _Callable = "minimal",
_dtypes=None,
) -> Any:
"""Create dataframe using an easier to read row-by-row layout
Unlike the original API, which uses formulas (`~col`) to indicate
the column names, we use `f.col` to indicate them.
Args:
*dummies: Arguments specifying the structure of a dataframe
Variable names should be specified with `f.name`
_dtypes: The dtypes for each column to convert to.
Examples:
>>> tribble(
>>> f.colA, f.colB,
>>> "a", 1,
>>> "b", 2,
>>> "c", 3,
>>> )
Returns:
A dataframe
"""
raise _NotImplementedByCurrentBackendError("tribble")
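The row-by-row layout can be sketched with plain pandas, using strings in place of the `f.col` references (this is an illustration of the chunking, not the real `f`-expression parser):

```python
import pandas as pd

# Split the flat argument list into leading column names and row chunks.
args = ["colA", "colB", "a", 1, "b", 2, "c", 3]
ncol = 2  # the two leading arguments are column names
names, cells = args[:ncol], args[ncol:]
rows = [cells[i:i + ncol] for i in range(0, len(cells), ncol)]
df = pd.DataFrame(rows, columns=names)
```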
@_register_func(plain=True)
def tibble_row(
*args,
_name_repair: str | _Callable = "check_unique",
_dtypes=None,
**kwargs,
) -> Any:
"""Constructs a data frame that is guaranteed to occupy one row.
Scalar values will be wrapped with `[]`
Args:
*args: and
**kwargs: A set of name-value pairs.
_name_repair: treatment of problematic column names:
- "minimal": No name repair or checks, beyond basic existence,
- "unique": Make sure names are unique and not empty,
- "check_unique": (default value), no name repair,
but check they are unique,
- "universal": Make the names unique and syntactic
- a function: apply custom name repair
Returns:
A constructed dataframe
"""
raise _NotImplementedByCurrentBackendError("tibble_row")
@_register_verb(ast_fallback=_get_verb_ast_fallback("as_tibble"))
def as_tibble(df) -> Any:
"""Convert a DataFrame object to Tibble object"""
raise _NotImplementedByCurrentBackendError("as_tibble", df)
@_register_verb(ast_fallback=_get_verb_ast_fallback("enframe"))
def enframe(x, name="name", value="value") -> Any:
"""Converts mappings or lists to one- or two-column data frames.
Args:
x: a list, a dictionary or a dataframe with one or two columns
name: and
value: Names of the columns that store the names and values.
If `name` is `None`, a one-column dataframe is returned.
`value` cannot be `None`.
Returns:
A data frame with two columns if `name` is not None (default) or
one-column otherwise.
"""
raise _NotImplementedByCurrentBackendError("enframe", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("deframe"))
def deframe(x) -> Any:
"""Converts two-column data frames to a dictionary
using the first column as name and the second column as value.
If the input has only one column, a list.
Args:
x: A data frame.
Returns:
A dictionary or a list if only one column in the data frame.
"""
raise _NotImplementedByCurrentBackendError("deframe", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("add_row"))
def add_row(
_data,
*args,
_before=None,
_after=None,
**kwargs,
) -> Any:
"""Add one or more rows of data to an existing data frame.
Aliases `add_case`
Args:
_data: Data frame to append to.
*args: and
**kwargs: Name-value pairs to add to the data frame.
_before: and
_after: row index where to add the new rows.
(default to add after the last row)
Returns:
The dataframe with the added rows
"""
raise _NotImplementedByCurrentBackendError("add_row", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("add_column"))
def add_column(
_data,
*args,
_before=None,
_after=None,
_name_repair="check_unique",
_dtypes=None,
**kwargs,
) -> Any:
"""Add one or more columns to an existing data frame.
Args:
_data: Data frame to append to
*args: and
**kwargs: Name-value pairs to add to the data frame
_before: and
_after: Column index or name where to add the new columns
(default to add after the last column)
_dtypes: The dtypes for the new columns, either a uniform dtype or a
dict of dtypes with keys the column names
Returns:
The dataframe with the added columns
"""
raise _NotImplementedByCurrentBackendError("add_column", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("has_rownames"))
def has_rownames(_data) -> bool:
"""Detect if a data frame has row names
Aliases `has_index`
Args:
_data: The data frame to check
Returns:
True if the data frame has index otherwise False.
"""
raise _NotImplementedByCurrentBackendError("has_rownames", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("remove_rownames"))
def remove_rownames(_data) -> Any:
"""Remove the index/rownames of a data frame
Aliases `remove_index`, `drop_index`
Args:
_data: The data frame
Returns:
The data frame with index removed
"""
raise _NotImplementedByCurrentBackendError("remove_rownames", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("rownames_to_column"))
def rownames_to_column(_data, var="rowname") -> Any:
"""Add rownames as a column
Aliases `index_to_column`
Args:
_data: The data frame
var: The name of the column
Returns:
The data frame with rownames added as one column. Note that the
original index is removed.
"""
raise _NotImplementedByCurrentBackendError("rownames_to_column", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("rowid_to_column"))
def rowid_to_column(_data, var="rowid") -> Any:
"""Add rownames as a column
Args:
_data: The data frame
var: The name of the column
Returns:
The data frame with row ids added as one column.
"""
raise _NotImplementedByCurrentBackendError("rowid_to_column", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("column_to_rownames"))
def column_to_rownames(_data, var="rowname") -> Any:
"""Set rownames/index with one column, and remove it
Aliases `column_to_index`
Args:
_data: The data frame
var: The column to convert to the rownames
Returns:
The data frame with the column converted to rownames
"""
raise _NotImplementedByCurrentBackendError("column_to_rownames", _data)
# aliases
add_case = add_row
has_index = has_rownames
remove_index = drop_index = remove_rownames
index_to_column = rownames_to_column
column_to_index = column_to_rownames
================================================
FILE: datar/apis/tidyr.py
================================================
from __future__ import annotations as _
from typing import Any, Callable as _Callable, Mapping as _Mapping
from pipda import (
register_verb as _register_verb,
register_func as _register_func,
)
from ..core.verb_env import get_verb_ast_fallback as _get_verb_ast_fallback
from ..core.utils import (
NotImplementedByCurrentBackendError as _NotImplementedByCurrentBackendError,
)
from .base import expand_grid # noqa: F401
@_register_func(pipeable=True, dispatchable=True)
def full_seq(x, period, tol=1e-6) -> Any:
"""Create the full sequence of values in a vector
Args:
x: A numeric vector.
period: Gap between each observation. The existing data will be
checked to ensure that it is actually of this periodicity.
tol: Numerical tolerance for checking periodicity.
Returns:
The full sequence
"""
raise _NotImplementedByCurrentBackendError("full_seq", x)
@_register_verb(ast_fallback=_get_verb_ast_fallback("chop"))
def chop(
data,
cols=None,
) -> Any:
"""Makes data frame shorter by converting rows within each group
into list-columns.
Args:
data: A data frame
cols: Columns to chop
Returns:
Data frame with selected columns chopped
"""
raise _NotImplementedByCurrentBackendError("chop", data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("unchop"))
def unchop(
data,
cols=None,
keep_empty: bool = False,
dtypes=None,
) -> Any:
"""Makes df longer by expanding list-columns so that each element
of the list-column gets its own row in the output.
See https://tidyr.tidyverse.org/reference/chop.html
Recycling size-1 elements might be different from `tidyr`
>>> df = tibble(x=[1, [2,3]], y=[[2,3], 1])
>>> df >> unchop([f.x, f.y])
>>> # tibble(x=[1,2,3], y=[2,3,1])
>>> # instead of following in tidyr
>>> # tibble(x=[1,1,2,3], y=[2,3,1,1])
Args:
data: A data frame.
cols: Columns to unchop.
keep_empty: By default, you get one row of output for each element
of the list you're unchopping/unnesting.
This means that if there's a size-0 element
(like NULL or an empty data frame), that entire row will be
dropped from the output.
If you want to preserve all rows, use `keep_empty` = `True` to
replace size-0 elements with a single row of missing values.
dtypes: Providing the dtypes for the output columns.
Could be a single dtype, which will be applied to all columns, or
a dictionary of dtypes with keys for the columns and values the
dtypes.
For nested data frames, we need to specify `col$a` as key. If `col`
is used as key, all columns of the nested data frames will be cast
to that dtype.
Returns:
A data frame with selected columns unchopped.
"""
raise _NotImplementedByCurrentBackendError("unchop", data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("nest"))
def nest(
_data,
_names_sep: str = None,
**cols: str | int,
) -> Any:
"""Nesting creates a list-column of data frames
Args:
_data: A data frame
**cols: Columns to nest
_names_sep: If `None`, the default, the names will be left as is.
Inner names will come from the former outer names
If a string, the inner and outer names will be used together.
The names of the new outer columns will be formed by pasting
together the outer and the inner column names, separated by
`_names_sep`.
Returns:
Nested data frame.
"""
raise _NotImplementedByCurrentBackendError("nest", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("unnest"))
def unnest(
data,
*cols: str | int,
keep_empty: bool = False,
dtypes=None,
names_sep: str = None,
names_repair: str | _Callable = "check_unique",
) -> Any:
"""Flattens list-column of data frames back out into regular columns.
Args:
data: A data frame to flatten.
*cols: Columns to unnest.
keep_empty: By default, you get one row of output for each element
of the list you're unchopping/unnesting.
This means that if there's a size-0 element
(like NULL or an empty data frame), that entire row will be
dropped from the output.
If you want to preserve all rows, use `keep_empty` = `True` to
replace size-0 elements with a single row of missing values.
dtypes: Providing the dtypes for the output columns.
Could be a single dtype, which will be applied to all columns, or
a dictionary of dtypes with keys for the columns and values the
dtypes.
names_sep: If `None`, the default, the names will be left as is.
Inner names will come from the former outer names
If a string, the inner and outer names will be used together.
The names of the new outer columns will be formed by pasting
together the outer and the inner column names, separated by
`names_sep`.
names_repair: treatment of problematic column names:
- "minimal": No name repair or checks, beyond basic existence,
- "unique": Make sure names are unique and not empty,
- "check_unique": (default value), no name repair,
but check they are unique,
- "universal": Make the names unique and syntactic
- a function: apply custom name repair
Returns:
Data frame with selected columns unnested.
"""
raise _NotImplementedByCurrentBackendError("unnest", data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("pack"))
def pack(
_data,
_names_sep: str = None,
**cols: str | int,
) -> Any:
"""Makes df narrow by collapsing a set of columns into a single df-column.
Args:
_data: A data frame
**cols: Columns to pack
_names_sep: If `None`, the default, the names will be left as is.
Inner names will come from the former outer names
If a string, the inner and outer names will be used together.
The names of the new outer columns will be formed by pasting
together the outer and the inner column names, separated by
`_names_sep`.
"""
raise _NotImplementedByCurrentBackendError("pack", _data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("unpack"))
def unpack(
data,
cols,
names_sep: str = None,
names_repair: str | _Callable = "check_unique",
) -> Any:
"""Makes df wider by expanding df-columns back out into individual columns.
For empty columns, the column is kept as is instead of being removed.
Args:
data: A data frame
cols: Columns to unpack
names_sep: If `None`, the default, the names will be left as is.
Inner names will come from the former outer names
If a string, the inner and outer names will be used together.
The names of the new outer columns will be formed by pasting
together the outer and the inner column names, separated by
`names_sep`.
names_repair: treatment of problematic column names:
- "minimal": No name repair or checks, beyond basic existence,
- "unique": Make sure names are unique and not empty,
- "check_unique": (default value), no name repair,
but check they are unique,
- "universal": Make the names unique and syntactic
- a function: apply custom name repair
Returns:
Data frame with given columns unpacked.
"""
raise _NotImplementedByCurrentBackendError("unpack", data)
@_register_verb(ast_fallback=_get_verb_ast_fallback("expand"))
def expand(
data,
*args,
_name_repair: str | _Callable = "check_unique",
**kwargs,
) -> Any:
"""Generates all combination of variables found in a dataset.
Args:
data: A data frame
*args: and,
**kwargs: columns to expand. Columns can be atomic lists.
- To find all unique combinations of x, y and z, including
those not present in the data, supply each variable as a
separate argument: `expand(df, x, y, z)`.
- To find only the combinations that occur in the data, use
nesting: `expand(df, nesting(x, y, z))`.
- You can combine the two forms. For example,
`expand(df, nesting(school_id, student_id), date)` would
produce a row for each present school-student combination
for all possible dates.
_name_repair: treatment of problematic column names:
- "minimal": No name repair or checks, beyond basic existence,
- "unique": Make sure names are unique and not empty,
- "check_unique": (default value), no name repair,
but check they are unique,
- "universal": Make the names unique and syntactic
- a function: apply custom name repair
Returns:
A data frame with all combination of variables.
"""
raise _NotImplementedByCurrentBackendError("expand", data)
@_register_func(dispatchable=True)
def nesting(
*args,
_name_repair: str | _Callable = "check_unique",
**kwargs,
) -> Any:
"""A helper that only finds combinations already present in the data.
Args:
*args: and,
**kwargs: columns to expand. Columns can be atomic lists.
- To find all unique combinations of x, y and z, including
those not present in the data, supply each variable as a
separate argument: `expand(df, x, y, z)`.
- To find only the combinations that occur in the data, use
nesting: `expand(df, nesting(x, y, z))`.
- You can combine the two forms. For example,
`expand(df, nesting(school_id, student_id), date)` would
produce a row for each present school-student combination
for all possible dates.
_name_repair: treatment of problematic column names:
- "minimal": No name repair or checks, beyond basic existence,
- "unique": Make sure names are unique and not empty,
- "check_unique": (default value), no name repair,
but check they are unique,
- "universal": Make the names unique and syntactic
- a function: apply custom name repair
Returns:
A data frame with all combinations in data.
"""
raise _NotImplementedByCurrentBackendError("nesting")
@_register_func(dispatchable=True)
def crossing(
*args,
_name_repair: str | _Callable = "check_unique",
**kwargs,
) -> Any:
"""A wrapper around `expand_grid()` that de-duplicates and sorts its inputs
When values are not specified by literal `list`, they will be sorted.
Args:
*args: and,
**kwargs: columns to expand. Columns can be atomic lists.
- To find all unique combinations of x, y and z, including
those not present in the data, supply each variable as a
separate argument: `expand(df, x, y, z)`.
- To find only the combinations that occur in the data, use
nesting: `expand(df, nesting(x, y, z))`.
- You can combine the two forms. For example,
`expand(df, nesting(school_id, student_id), date)` would
produce a row for each present school-student combination
for all possible dates.
_name_repair: treatment of problematic column names:
- "minimal": No name repair or checks, beyond basic existence,
- "unique": Make sure names are unique and not empty,
- "check_unique": (default value), no name repair,
but check they are unique,
- "universal": Make the names unique and syntactic
- a function: apply custom name repair
Returns:
A data frame with all combinations of the values, de-duplicated and
sorted.
"""
raise _NotImplementedByCurrentBackendError("crossing")
SYMBOL INDEX (509 symbols across 38 files)
FILE: datar/__init__.py
function get_versions (line 18) | def get_versions(prnt: bool = True) -> _Mapping[str, str]:
FILE: datar/all.py
function __getattr__ (line 24) | def __getattr__(name):
FILE: datar/apis/base.py
function ceiling (line 44) | def ceiling(x) -> Any:
function cov (line 57) | def cov(x, y=None, na_rm: bool = False, ddof: int = 1) -> Any:
function floor (line 76) | def floor(x) -> Any:
function mean (line 89) | def mean(x, na_rm: bool = False) -> Any:
function median (line 103) | def median(x, na_rm: bool = False) -> Any:
function pmax (line 117) | def pmax(*args, na_rm: bool = False) -> Any:
function pmin (line 133) | def pmin(*args, na_rm: bool = False) -> Any:
function sqrt (line 149) | def sqrt(x) -> Any:
function var (line 162) | def var(x, na_rm: bool = False, ddof: int = 1) -> Any:
function scale (line 180) | def scale(x, center=True, scale_=True) -> Any:
function col_sums (line 195) | def col_sums(x, na_rm: bool = False) -> Any:
function col_means (line 209) | def col_means(x, na_rm: bool = False) -> Any:
function col_sds (line 223) | def col_sds(x, na_rm: bool = False) -> Any:
function col_medians (line 237) | def col_medians(x, na_rm: bool = False) -> Any:
function row_sums (line 251) | def row_sums(x, na_rm: bool = False) -> Any:
function row_means (line 265) | def row_means(x, na_rm: bool = False) -> Any:
function row_sds (line 279) | def row_sds(x, na_rm: bool = False) -> Any:
function row_medians (line 293) | def row_medians(x, na_rm: bool = False) -> Any:
function min_ (line 307) | def min_(x, na_rm: bool = False) -> Any:
function max_ (line 321) | def max_(x, na_rm: bool = False) -> Any:
function round_ (line 335) | def round_(x, digits: int = 0) -> Any:
function sum_ (line 349) | def sum_(x, na_rm: bool = False) -> Any:
function abs_ (line 363) | def abs_(x) -> Any:
function prod (line 376) | def prod(x, na_rm: bool = False) -> Any:
function sign (line 390) | def sign(x) -> Any:
function signif (line 403) | def signif(x, digits: int = 6) -> Any:
function trunc (line 417) | def trunc(x) -> Any:
function exp (line 430) | def exp(x) -> Any:
function log (line 443) | def log(x, base: float = _math.e) -> Any:
function log2 (line 457) | def log2(x) -> Any:
function log10 (line 470) | def log10(x) -> Any:
function log1p (line 483) | def log1p(x) -> Any:
function sd (line 496) | def sd(x, na_rm: bool = False) -> Any:
function weighted_mean (line 510) | def weighted_mean(x, w=None, na_rm: bool = False) -> Any:
function quantile (line 525) | def quantile(
function bessel_i (line 546) | def bessel_i(x, nu, expon_scaled: bool = False) -> Any:
function bessel_j (line 561) | def bessel_j(x, nu) -> Any:
function bessel_k (line 575) | def bessel_k(x, nu, expon_scaled: bool = False) -> Any:
function bessel_y (line 590) | def bessel_y(x, nu) -> Any:
function as_double (line 604) | def as_double(x) -> Any:
function as_integer (line 617) | def as_integer(x) -> Any:
function as_logical (line 630) | def as_logical(x) -> Any:
function as_character (line 643) | def as_character(x) -> Any:
function as_factor (line 656) | def as_factor(x) -> Any:
function as_ordered (line 669) | def as_ordered(x) -> Any:
function as_date (line 682) | def as_date(
function as_numeric (line 720) | def as_numeric(x) -> Any:
function arg (line 733) | def arg(x) -> Any:
function conj (line 746) | def conj(x) -> Any:
function mod (line 759) | def mod(x) -> Any:
function re_ (line 772) | def re_(x) -> Any:
function im (line 785) | def im(x) -> Any:
function as_complex (line 798) | def as_complex(x) -> Any:
function is_complex (line 811) | def is_complex(x) -> Any:
function cummax (line 824) | def cummax(x) -> Any:
function cummin (line 837) | def cummin(x) -> Any:
function cumprod (line 850) | def cumprod(x) -> Any:
function cumsum (line 863) | def cumsum(x) -> Any:
function droplevels (line 876) | def droplevels(x) -> Any:
function levels (line 889) | def levels(x) -> Any:
function set_levels (line 902) | def set_levels(x, levels) -> Any:
function is_factor (line 916) | def is_factor(x) -> Any:
function is_ordered (line 929) | def is_ordered(x) -> Any:
function nlevels (line 942) | def nlevels(x) -> Any:
function factor (line 955) | def factor(
function ordered (line 981) | def ordered(x, levels=None, labels=None, exclude=None, nmax=None) -> Any:
function cut (line 998) | def cut(
function diff (line 1025) | def diff(x, lag: int = 1, differences: int = 1) -> Any:
function expand_grid (line 1043) | def expand_grid(x, *args, **kwargs) -> Any:
function outer (line 1058) | def outer(x, y, fun="*") -> Any:
function make_names (line 1076) | def make_names(names, unique: bool = True) -> Any:
function make_unique (line 1113) | def make_unique(names) -> Any:
function rank (line 1126) | def rank(x, na_last: bool = True, ties_method: str = "average") -> Any:
function identity (line 1142) | def identity(x) -> Any:
function is_logical (line 1155) | def is_logical(x) -> Any:
function is_true (line 1168) | def is_true(x) -> bool:
function is_false (line 1181) | def is_false(x) -> bool:
function is_na (line 1194) | def is_na(x) -> Any:
function is_finite (line 1207) | def is_finite(x) -> Any:
function is_infinite (line 1220) | def is_infinite(x) -> Any:
function any_na (line 1233) | def any_na(x) -> Any:
function as_null (line 1246) | def as_null(x) -> Any:
function is_null (line 1259) | def is_null(x) -> Any:
function set_seed (line 1272) | def set_seed(seed) -> Any:
function rep (line 1282) | def rep(x, times=1, length=None, each=1) -> Any:
function c_ (line 1299) | def c_(*args) -> Any:
function length (line 1315) | def length(x) -> Any:
function lengths (line 1328) | def lengths(x) -> Any:
function order (line 1341) | def order(x, decreasing: bool = False, na_last: bool = True) -> Any:
function sort (line 1356) | def sort(x, decreasing: bool = False, na_last: bool = True) -> Any:
function rev (line 1371) | def rev(x) -> Any:
function sample (line 1384) | def sample(x, size=None, replace: bool = False, prob=None) -> Any:
function seq (line 1400) | def seq(from_=None, to=None, by=None, length_out=None, along_with=None) ...
function seq_along (line 1417) | def seq_along(x) -> Any:
function seq_len (line 1430) | def seq_len(x) -> Any:
function match (line 1443) | def match(x, table, nomatch=-1) -> Any:
function beta (line 1458) | def beta(x, y) -> Any:
function lgamma (line 1472) | def lgamma(x) -> Any:
function digamma (line 1485) | def digamma(x) -> Any:
function trigamma (line 1498) | def trigamma(x) -> Any:
function choose (line 1511) | def choose(n, k) -> Any:
function factorial (line 1525) | def factorial(x) -> Any:
function gamma (line 1538) | def gamma(x) -> Any:
function lfactorial (line 1551) | def lfactorial(x) -> Any:
function lchoose (line 1564) | def lchoose(n, k) -> Any:
function lbeta (line 1578) | def lbeta(x, y) -> Any:
function psigamma (line 1592) | def psigamma(x, deriv) -> Any:
function rnorm (line 1606) | def rnorm(n, mean=0, sd=1) -> Any:
function runif (line 1621) | def runif(n, min=0, max=1) -> Any:
function rpois (line 1636) | def rpois(n, lambda_) -> Any:
function rbinom (line 1650) | def rbinom(n, size, prob) -> Any:
function rcauchy (line 1665) | def rcauchy(n, location=0, scale=1) -> Any:
function rchisq (line 1680) | def rchisq(n, df) -> Any:
function rexp (line 1694) | def rexp(n, rate) -> Any:
function is_character (line 1708) | def is_character(x) -> Any:
function grep (line 1721) | def grep(
function grepl (line 1746) | def grepl(pattern, x, ignore_case=False, fixed=False) -> Any:
function sub (line 1762) | def sub(pattern, replacement, x, ignore_case=False, fixed=False) -> Any:
function gsub (line 1779) | def gsub(pattern, replacement, x, ignore_case=False, fixed=False) -> Any:
function strsplit (line 1796) | def strsplit(x, split, fixed=False, perl=False, use_bytes=False) -> Any:
function paste (line 1813) | def paste(*args, sep=" ", collapse=None) -> Any:
function paste0 (line 1828) | def paste0(*args, collapse=None) -> Any:
function sprintf (line 1842) | def sprintf(fmt, *args) -> Any:
function substr (line 1856) | def substr(x, start, stop) -> Any:
function substring (line 1871) | def substring(x, first, last=None) -> Any:
function startswith (line 1886) | def startswith(x, prefix) -> Any:
function endswith (line 1900) | def endswith(x, suffix) -> Any:
function strtoi (line 1914) | def strtoi(x, base=0) -> Any:
function trimws (line 1928) | def trimws(x, which="both", whitespace=r" \t") -> Any:
function toupper (line 1943) | def toupper(x) -> Any:
function tolower (line 1956) | def tolower(x) -> Any:
function chartr (line 1969) | def chartr(old, new, x) -> Any:
function nchar (line 1984) | def nchar(
function nzchar (line 2006) | def nzchar(x, keep_na: bool = False) -> Any:
function table (line 2020) | def table(
function tabulate (line 2045) | def tabulate(bin, nbins=None) -> Any:
function is_atomic (line 2060) | def is_atomic(x) -> Any:
function is_double (line 2073) | def is_double(x) -> Any:
function is_element (line 2086) | def is_element(x, y) -> Any:
function is_integer (line 2103) | def is_integer(x) -> Any:
function is_numeric (line 2116) | def is_numeric(x) -> Any:
function any_ (line 2129) | def any_(x, na_rm: bool = False) -> Any:
function all_ (line 2143) | def all_(x, na_rm: bool = False) -> Any:
function acos (line 2157) | def acos(x) -> Any:
function acosh (line 2170) | def acosh(x) -> Any:
function asin (line 2183) | def asin(x) -> Any:
function asinh (line 2196) | def asinh(x) -> Any:
function atan (line 2209) | def atan(x) -> Any:
function atanh (line 2222) | def atanh(x) -> Any:
function cos (line 2235) | def cos(x) -> Any:
function cosh (line 2248) | def cosh(x) -> Any:
function cospi (line 2261) | def cospi(x) -> Any:
function sin (line 2274) | def sin(x) -> Any:
function sinh (line 2287) | def sinh(x) -> Any:
function sinpi (line 2300) | def sinpi(x) -> Any:
function tan (line 2313) | def tan(x) -> Any:
function tanh (line 2326) | def tanh(x) -> Any:
function tanpi (line 2339) | def tanpi(x) -> Any:
function atan2 (line 2352) | def atan2(y, x) -> Any:
function append (line 2366) | def append(x, values, after: int = -1) -> Any:
function colnames (line 2381) | def colnames(x, nested: bool = True) -> Any:
function set_colnames (line 2395) | def set_colnames(x, names, nested: bool = True) -> Any:
function rownames (line 2410) | def rownames(x) -> Any:
function set_rownames (line 2423) | def set_rownames(x, names) -> Any:
function dim (line 2437) | def dim(x, nested: bool = True) -> Any:
function diag (line 2451) | def diag(x, nrow=None, ncol=None) -> Any:
function duplicated (line 2466) | def duplicated(x, incomparables=None, from_last: bool = False) -> Any:
function intersect (line 2481) | def intersect(x, y) -> Any:
function ncol (line 2495) | def ncol(x, nested: bool = True) -> Any:
function nrow (line 2509) | def nrow(x) -> Any:
function proportions (line 2522) | def proportions(x, margin: int = 1) -> Any:
function setdiff (line 2536) | def setdiff(x, y) -> Any:
function setequal (line 2550) | def setequal(x, y) -> Any:
function unique (line 2564) | def unique(x) -> Any:
function t (line 2577) | def t(x) -> Any:
function union (line 2590) | def union(x, y) -> Any:
function max_col (line 2604) | def max_col(x, ties_method: str = "random", nested: bool = True) -> Any:
function complete_cases (line 2619) | def complete_cases(x) -> Any:
function head (line 2632) | def head(x, n: int = 6) -> Any:
function tail (line 2646) | def tail(x, n: int = 6) -> Any:
function which (line 2660) | def which(x) -> Any:
function which_max (line 2673) | def which_max(x) -> Any:
function which_min (line 2686) | def which_min(x) -> Any:
FILE: datar/apis/dplyr.py
function pick (line 26) | def pick(_data: T, *args) -> T:
function across (line 43) | def across(_data: T, *args, _names=None, **kwargs) -> T:
function c_across (line 92) | def c_across(_data: T, _cols=None) -> T:
function if_any (line 106) | def if_any(_data, *args, _names=None, **kwargs) -> Any:
function if_all (line 117) | def if_all(_data, *args, _names=None, **kwargs) -> Any:
function symdiff (line 128) | def symdiff(x: T, y: T) -> T:
function arrange (line 148) | def arrange(_data, *args, _by_group=False, **kwargs) -> Any:
function bind_rows (line 174) | def bind_rows(*data, _id=None, _copy: bool = True, **kwargs) -> Any:
function bind_cols (line 194) | def bind_cols(*data, _name_repair="unique", _copy: bool = True) -> Any:
function cur_column (line 221) | def cur_column(_data, _name) -> Any:
function cur_data (line 235) | def cur_data(_data) -> Any:
function n (line 248) | def n(_data) -> Any:
function cur_data_all (line 263) | def cur_data_all(_data) -> Any:
function cur_group (line 277) | def cur_group(_data) -> Any:
function cur_group_id (line 292) | def cur_group_id(_data) -> Any:
function cur_group_rows (line 307) | def cur_group_rows(_data) -> Any:
function count (line 321) | def count(
function tally (line 358) | def tally(_data, wt=None, sort=False, name=None) -> Any:
function add_count (line 382) | def add_count(_data, *args, wt=None, sort=False, name="n", **kwargs) -> ...
function add_tally (line 409) | def add_tally(_data, wt=None, sort=False, name="n") -> Any:
function desc (line 434) | def desc(x) -> Any:
function filter_ (line 453) | def filter_(_data, *conditions, _preserve: bool = False) -> Any:
function distinct (line 472) | def distinct(
function n_distinct (line 496) | def n_distinct(_data, na_rm: bool = True) -> Any:
function glimpse (line 514) | def glimpse(_data, width: int = None, formatter=None) -> Any:
function slice_ (line 530) | def slice_(_data, *args, _preserve: bool = False) -> Any:
function slice_head (line 548) | def slice_head(_data, n: int = None, prop: float = None) -> Any:
function slice_tail (line 566) | def slice_tail(_data, n: int = None, prop: float = None) -> Any:
function slice_sample (line 584) | def slice_sample(
function slice_min (line 610) | def slice_min(
function slice_max (line 638) | def slice_max(
function between (line 667) | def between(x, left, right, inclusive: str = "both") -> Any:
function cummean (line 688) | def cummean(x, na_rm: bool = False) -> Any:
function cumall (line 705) | def cumall(x) -> Any:
function cumany (line 721) | def cumany(x) -> Any:
function coalesce (line 737) | def coalesce(x, *replace) -> Any:
function consecutive_id (line 754) | def consecutive_id(x, *args) -> _Sequence[int]:
function na_if (line 771) | def na_if(x, value) -> Any:
function near (line 788) | def near(x, y, tol: float = 1e-8) -> Any:
function nth (line 806) | def nth(x, n, order_by=None, default=None) -> Any:
function first (line 825) | def first(x, order_by=None, default=None) -> Any:
function last (line 843) | def last(x, order_by=None, default=None) -> Any:
function group_by (line 862) | def group_by(_data, *args, _add: bool = False, _drop: bool = None) -> Any:
function ungroup (line 881) | def ungroup(_data, *cols: str | int) -> Any:
function rowwise (line 898) | def rowwise(_data, *cols: str | int) -> Any:
function group_by_drop_default (line 915) | def group_by_drop_default(_data) -> Any:
function group_vars (line 931) | def group_vars(_data) -> Any:
function group_indices (line 947) | def group_indices(_data) -> Any:
function group_keys (line 963) | def group_keys(_data) -> Any:
function group_size (line 979) | def group_size(_data) -> Any:
function group_rows (line 995) | def group_rows(_data) -> Any:
function group_cols (line 1011) | def group_cols(_data) -> Any:
function group_data (line 1027) | def group_data(_data) -> Any:
function n_groups (line 1043) | def n_groups(_data) -> int:
function group_map (line 1059) | def group_map(_data, _f, *args, _keep: bool = False, **kwargs) -> Any:
function group_modify (line 1079) | def group_modify(_data, _f, *args, _keep: bool = False, **kwargs) -> Any:
function group_split (line 1099) | def group_split(_data, *args, _keep: bool = False, **kwargs) -> Any:
function group_trim (line 1118) | def group_trim(_data, _drop=None) -> Any:
function group_walk (line 1135) | def group_walk(_data, _f, *args, _keep: bool = False, **kwargs) -> Any:
function with_groups (line 1154) | def with_groups(_data, _groups, _func, *args, **kwargs) -> Any:
function if_else (line 1170) | def if_else(condition, true, false, missing=None) -> Any:
function case_match (line 1190) | def case_match(_x: T, *args, _default=None, _dtypes=None) -> T:
function case_when (line 1209) | def case_when(cond, value, *more_cases) -> Any:
function inner_join (line 1225) | def inner_join(
function left_join (line 1276) | def left_join(
function right_join (line 1327) | def right_join(
function full_join (line 1378) | def full_join(
function semi_join (line 1429) | def semi_join(
function anti_join (line 1458) | def anti_join(
function nest_join (line 1487) | def nest_join(
function cross_join (line 1525) | def cross_join(
function lead (line 1551) | def lead(x, n=1, default=None, order_by=None) -> Any:
function lag (line 1570) | def lag(x, n=1, default=None, order_by=None) -> Any:
function mutate (line 1590) | def mutate(
function transmute (line 1637) | def transmute(_data, *args, _before=None, _after=None, **kwargs) -> Any:
function order_by (line 1675) | def order_by(order, call) -> Any:
function with_order (line 1699) | def with_order(order, func, x, *args, **kwargs) -> Any:
function pull (line 1721) | def pull(_data, var: str | int = -1, name=None, to=None) -> Any:
function row_number (line 1756) | def row_number(x=_f_symbolic) -> Any:
function row_number_ (line 1773) | def row_number_(x) -> Any:
function ntile (line 1777) | def ntile(x=_f_symbolic, *, n: int = None) -> Any:
function ntile_ (line 1797) | def ntile_(x, *, n: int = None) -> Any:
function min_rank (line 1801) | def min_rank(x=_f_symbolic, *, na_last: str = "keep") -> Any:
function min_rank_ (line 1822) | def min_rank_(x, *, na_last: str = "keep") -> Any:
function dense_rank (line 1826) | def dense_rank(x=_f_symbolic, *, na_last: str = "keep") -> Any:
function dense_rank_ (line 1847) | def dense_rank_(x, *, na_last: str = "keep") -> Any:
function percent_rank (line 1851) | def percent_rank(x=_f_symbolic, *, na_last: str = "keep") -> Any:
function percent_rank_ (line 1872) | def percent_rank_(x, *, na_last: str = "keep") -> Any:
function cume_dist (line 1876) | def cume_dist(x=_f_symbolic, *, na_last: str = "keep") -> Any:
function cume_dist_ (line 1897) | def cume_dist_(x, *, na_last: str = "keep") -> Any:
function recode (line 1903) | def recode(_x, *args, _default=None, _missing=None, **kwargs) -> Any:
function recode_factor (line 1925) | def recode_factor(
function relocate (line 1955) | def relocate(
function rename (line 1988) | def rename(_data, **kwargs) -> Any:
function rename_with (line 2005) | def rename_with(_data, _fn, *args, **kwargs) -> Any:
function rows_insert (line 2028) | def rows_insert(
function rows_update (line 2062) | def rows_update(
function rows_patch (line 2100) | def rows_patch(
function rows_upsert (line 2138) | def rows_upsert(x, y, by=None, **kwargs) -> Any:
function rows_delete (line 2164) | def rows_delete(
function rows_append (line 2202) | def rows_append(x, y, **kwargs) -> Any:
function select (line 2221) | def select(_data, *args, **kwargs) -> Any:
function union_all (line 2239) | def union_all(x, y) -> Any:
function summarise (line 2256) | def summarise(_data, *args, _groups: str = None, **kwargs) -> Any:
function reframe (line 2283) | def reframe(_data, *args, **kwargs) -> Any:
function where (line 2302) | def where(_data, fn: _Callable) -> Any:
function everything (line 2323) | def everything(_data) -> Any:
function last_col (line 2339) | def last_col(_data, offset: int = 0, vars=None) -> Any:
function starts_with (line 2359) | def starts_with(_data, match, ignore_case: bool = True, vars=None) -> Any:
function ends_with (line 2378) | def ends_with(_data, match, ignore_case: bool = True, vars=None) -> Any:
function contains (line 2397) | def contains(_data, match, ignore_case: bool = True, vars=None) -> Any:
function matches (line 2416) | def matches(_data, match, ignore_case: bool = True, vars=None) -> Any:
function num_range (line 2435) | def num_range(prefix: str, range_, width: int = None) -> Any:
function all_of (line 2452) | def all_of(_data, x) -> Any:
function any_of (line 2473) | def any_of(_data, x, vars=None) -> Any:
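The dplyr API functions listed above are backend-agnostic stubs: in datar they are registered through pipda's `register_verb` and raise `NotImplementedByCurrentBackendError` until a backend (e.g. a pandas backend plugin) registers an implementation for its data type. A rough stand-in for that dispatch idea, using only `functools.singledispatch` (the names below are illustrative, not datar's actual machinery):

```python
from functools import singledispatch


@singledispatch
def mutate(_data, **kwargs):
    # Default: no backend has registered an implementation for this type.
    raise NotImplementedError(
        f"mutate: not implemented for {type(_data).__name__}"
    )


# A "backend" registers an implementation for the types it supports;
# here a plain dict stands in for a data frame.
@mutate.register(dict)
def _mutate_dict(_data, **kwargs):
    out = dict(_data)
    out.update({k: fn(out) for k, fn in kwargs.items()})
    return out


row = {"x": 3}
print(mutate(row, y=lambda d: d["x"] * 2))  # {'x': 3, 'y': 6}
```

The real verbs additionally resolve `f`-expressions and grouping context via pipda; single dispatch only captures the "default raises, backend registers" half of the design.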
FILE: datar/apis/forcats.py
function fct_relevel (line 13) | def fct_relevel(_f, *lvls, after: int = None) -> Any:
function fct_inorder (line 34) | def fct_inorder(_f, ordered: bool = None) -> Any:
function fct_infreq (line 49) | def fct_infreq(_f, ordered: bool = None) -> Any:
function fct_inseq (line 64) | def fct_inseq(_f, ordered: bool = None) -> Any:
function fct_reorder (line 79) | def fct_reorder(_f, _x, *args, _fun=None, _desc: bool = False, **kwargs)...
function fct_reorder2 (line 97) | def fct_reorder2(
function fct_shuffle (line 122) | def fct_shuffle(_f) -> Any:
function fct_rev (line 135) | def fct_rev(_f) -> Any:
function fct_shift (line 148) | def fct_shift(_f, n: int = 1) -> Any:
function first2 (line 162) | def first2(_x, _y) -> Any:
function last2 (line 176) | def last2(_x, _y) -> Any:
function fct_anon (line 190) | def fct_anon(_f, prefix: str = "") -> Any:
function fct_recode (line 204) | def fct_recode(_f, *args, **kwargs) -> Any:
function fct_collapse (line 234) | def fct_collapse(_f, other_level=None, **kwargs) -> Any:
function fct_lump (line 252) | def fct_lump(
function fct_lump_min (line 284) | def fct_lump_min(_f, min_, w=None, other_level="Other") -> Any:
function fct_lump_prop (line 302) | def fct_lump_prop(_f, prop, w=None, other_level="Other") -> Any:
function fct_lump_n (line 322) | def fct_lump_n(_f, n, w=None, other_level="Other") -> Any:
function fct_lump_lowfreq (line 344) | def fct_lump_lowfreq(_f, other_level="Other") -> Any:
function fct_other (line 360) | def fct_other(_f, keep=None, drop=None, other_level="Other") -> Any:
function fct_relabel (line 381) | def fct_relabel(_f, _fun, *args, **kwargs) -> Any:
function fct_expand (line 399) | def fct_expand(_f, *additional_levels) -> Any:
function fct_explicit_na (line 414) | def fct_explicit_na(_f, na_level="(Missing)") -> Any:
function fct_drop (line 432) | def fct_drop(_f, only=None) -> Any:
function fct_unify (line 448) | def fct_unify(
function fct_c (line 466) | def fct_c(*fs) -> Any:
function fct_cross (line 482) | def fct_cross(
function fct_count (line 504) | def fct_count(_f, sort: bool = False, prop=False) -> Any:
function fct_match (line 520) | def fct_match(_f, lvls) -> Any:
function fct_unique (line 536) | def fct_unique(_f) -> Any:
function lvls_reorder (line 549) | def lvls_reorder(
function lvls_revalue (line 571) | def lvls_revalue(
function lvls_expand (line 589) | def lvls_expand(
function lvls_union (line 607) | def lvls_union(fs) -> Any:
FILE: datar/apis/misc.py
function _array_ufunc_with_backend (line 7) | def _array_ufunc_with_backend(backend: str):
function array_ufunc (line 16) | def array_ufunc(x, ufunc, *args, kind, **kwargs):
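`_array_ufunc_with_backend` in datar/apis/misc.py is a context manager that routes NumPy ufunc calls to a chosen backend for the duration of a block. The temporary-override pattern it relies on can be sketched with a plain context manager (illustrative names, not datar's API):

```python
from contextlib import contextmanager

_current_backend = "default"


@contextmanager
def use_backend(backend):
    """Temporarily switch the active backend, restoring it on exit."""
    global _current_backend
    old, _current_backend = _current_backend, backend
    try:
        yield
    finally:
        # Restore even if the body raised.
        _current_backend = old


def active_backend():
    return _current_backend


with use_backend("pandas"):
    assert active_backend() == "pandas"
assert active_backend() == "default"
```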
FILE: datar/apis/tibble.py
function tibble (line 16) | def tibble(
function tibble_ (line 50) | def tibble_(
function tribble (line 63) | def tribble(
function tibble_row (line 92) | def tibble_row(
function as_tibble (line 117) | def as_tibble(df) -> Any:
function enframe (line 123) | def enframe(x, name="name", value="value") -> Any:
function deframe (line 141) | def deframe(x) -> Any:
function add_row (line 156) | def add_row(
function add_column (line 183) | def add_column(
function has_rownames (line 211) | def has_rownames(_data) -> bool:
function remove_rownames (line 227) | def remove_rownames(_data) -> Any:
function rownames_to_column (line 243) | def rownames_to_column(_data, var="rowname") -> Any:
function rowid_to_column (line 260) | def rowid_to_column(_data, var="rowid") -> Any:
function column_to_rownames (line 275) | def column_to_rownames(_data, var="rowname") -> Any:
FILE: datar/apis/tidyr.py
function full_seq (line 17) | def full_seq(x, period, tol=1e-6) -> Any:
function chop (line 33) | def chop(
function unchop (line 51) | def unchop(
function nest (line 94) | def nest(
function unnest (line 118) | def unnest(
function pack (line 163) | def pack(
function unpack (line 184) | def unpack(
function expand (line 218) | def expand(
function nesting (line 254) | def nesting(
function crossing (line 288) | def crossing(
function complete (line 324) | def complete(
function drop_na (line 358) | def drop_na(
function extract (line 382) | def extract(
function fill (line 414) | def fill(
function pivot_longer (line 439) | def pivot_longer(
function pivot_wider (line 545) | def pivot_wider(
function separate (line 598) | def separate(
function separate_rows (line 645) | def separate_rows(
function uncount (line 667) | def uncount(
function unite (line 690) | def unite(
function replace_na (line 716) | def replace_na(
FILE: datar/base.py
function __getattr__ (line 15) | def __getattr__(name):
FILE: datar/core/load_plugins.py
function _array_ufunc_to_register (line 7) | def _array_ufunc_to_register(ufunc, x, *args, kind, **kwargs):
FILE: datar/core/names.py
class NameNonUniqueError (line 12) | class NameNonUniqueError(ValueError):
function _isnan (line 16) | def _isnan(x: Any) -> bool:
function _is_scalar (line 21) | def _is_scalar(x: Any) -> bool:
function _log_changed_names (line 32) | def _log_changed_names(changed_names: List[Tuple[str, str]]) -> None:
function _repair_names_minimal (line 42) | def _repair_names_minimal(names: Iterable[str]) -> List[str]:
function _repair_names_unique (line 47) | def _repair_names_unique(
function _repair_names_universal (line 75) | def _repair_names_universal(
function _repair_names_check_unique (line 101) | def _repair_names_check_unique(names: Iterable[str]) -> Iterable[str]:
function repair_names (line 123) | def repair_names(
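datar/core/names.py implements tibble-style name repair with the strategies listed above (minimal, unique, universal, check_unique). A toy version of the "unique" strategy — suffix duplicated or empty names with an index — is sketched below; the exact suffix scheme (`__<n>`) is an assumption for illustration, not necessarily datar's:

```python
from collections import Counter


def repair_unique(names):
    """Make names unique by appending __<occurrence> to duplicated
    or empty names; names that are already unique pass through."""
    counts = Counter(names)
    seen = Counter()
    out = []
    for name in names:
        if counts[name] > 1 or name == "":
            seen[name] += 1
            out.append(f"{name}__{seen[name]}")
        else:
            out.append(name)
    return out


print(repair_unique(["x", "y", "x", ""]))  # ['x__1', 'y', 'x__2', '__1']
```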
FILE: datar/core/operator.py
class DatarOperator (line 9) | class DatarOperator(Operator):
method with_backend (line 16) | def with_backend(cls, backend: str):
method __getattr__ (line 23) | def __getattr__(self, name: str) -> Callable:
FILE: datar/core/options.py
function options (line 33) | def options(
function options_context (line 84) | def options_context(**kwargs: Any) -> Generator:
function get_option (line 95) | def get_option(x: str, default: Any = None) -> Any:
function add_option (line 106) | def add_option(x: str, default: Any = None) -> None:
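datar/core/options.py exposes an R-style `options()` plus `options_context` for temporary overrides. A minimal stand-in backed by a module-level dict (illustrative option names; datar's real store and defaults may differ):

```python
from contextlib import contextmanager

_OPTIONS = {"backends": [], "allow_conflict_names": False}


def get_option(name, default=None):
    """Look up an option, falling back to a default."""
    return _OPTIONS.get(name, default)


def options(**kwargs):
    """Update options; return the previous values of the changed keys."""
    old = {k: _OPTIONS.get(k) for k in kwargs}
    _OPTIONS.update(kwargs)
    return old


@contextmanager
def options_context(**kwargs):
    """Apply options temporarily, restoring old values on exit."""
    old = options(**kwargs)
    try:
        yield
    finally:
        _OPTIONS.update(old)


with options_context(allow_conflict_names=True):
    assert get_option("allow_conflict_names") is True
assert get_option("allow_conflict_names") is False
```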
FILE: datar/core/plugin.py
function _collect (line 9) | def _collect(calls: List[Tuple[Callable, Tuple, Mapping]]) -> Mapping[st...
function setup (line 20) | def setup():
function get_versions (line 25) | def get_versions():
function load_dataset (line 30) | def load_dataset(name: str, metadata: Mapping):
function base_api (line 35) | def base_api():
function dplyr_api (line 40) | def dplyr_api():
function tibble_api (line 45) | def tibble_api():
function forcats_api (line 50) | def forcats_api():
function tidyr_api (line 55) | def tidyr_api():
function misc_api (line 60) | def misc_api():
function c_getitem (line 65) | def c_getitem(item):
function operate (line 70) | def operate(op: str, x: Any, y: Any = None):
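The hooks in datar/core/plugin.py (base_api, dplyr_api, …) each let every loaded plugin contribute a mapping of names to implementations; judging from its signature, `_collect` calls a list of `(func, args, kwargs)` triples and merges the returned dicts. A hedged sketch of that merge (illustrative, not a copy of datar's `_collect`; the "later plugins win" rule is an assumption):

```python
def collect(calls):
    """Call each (func, args, kwargs) triple and merge the dicts they
    return into one mapping; later entries overwrite earlier keys."""
    merged = {}
    for func, args, kwargs in calls:
        result = func(*args, **kwargs)
        if result:
            merged.update(result)
    return merged


plugin_a = lambda: {"mutate": "a.mutate"}
plugin_b = lambda: {"select": "b.select", "mutate": "b.mutate"}
print(collect([(plugin_a, (), {}), (plugin_b, (), {})]))
# {'mutate': 'b.mutate', 'select': 'b.select'}
```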
FILE: datar/core/utils.py
class NotImplementedByCurrentBackendError (line 22) | class NotImplementedByCurrentBackendError(NotImplementedError):
method __init__ (line 25) | def __init__(self, func: str, data: Any = None) -> None:
class CollectionFunction (line 37) | class CollectionFunction:
method __init__ (line 40) | def __init__(self, c_func: Callable) -> None:
method __call__ (line 44) | def __call__(self, *args, **kwargs):
method with_backend (line 49) | def with_backend(self, backend: str):
method __getitem__ (line 56) | def __getitem__(self, item):
function arg_match (line 61) | def arg_match(arg, argname, values, errmsg=None):
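`arg_match` plays the role of R's `match.arg`: validate that an argument is one of the allowed values and fail with a readable message otherwise. A likely shape (sketch under that assumption; datar's exact message wording may differ):

```python
def arg_match(arg, argname, values, errmsg=None):
    """Return arg unchanged if it is one of values, else raise ValueError."""
    if arg in values:
        return arg
    msg = errmsg or f"`{argname}` must be one of {list(values)}, got {arg!r}."
    raise ValueError(msg)


# e.g. validating the `inclusive` argument of between()
assert arg_match("both", "inclusive", ["both", "left", "right", "neither"]) == "both"
```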
FILE: datar/core/verb_env.py
function get_verb_ast_fallback (line 7) | def get_verb_ast_fallback(verb: str) -> str | None:
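Per the tests (tests/test_verb_env.py), `get_verb_ast_fallback` consults a per-verb environment variable first and falls back to a global one. The variable names below are hypothetical placeholders to show the precedence logic only; see docs/ENV_VARS.md for the real names:

```python
import os


def get_ast_fallback(verb):
    """Per-verb env var wins over the global one.
    Variable names here are hypothetical, not datar's."""
    per_verb = os.environ.get(f"DATAR_{verb.upper()}_AST_FALLBACK")
    if per_verb is not None:
        return per_verb
    return os.environ.get("DATAR_AST_FALLBACK")


os.environ["DATAR_AST_FALLBACK"] = "piping"
os.environ["DATAR_MUTATE_AST_FALLBACK"] = "normal"
assert get_ast_fallback("mutate") == "normal"   # per-verb takes precedence
assert get_ast_fallback("select") == "piping"   # global fallback
```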
FILE: datar/data/__init__.py
function descr_datasets (line 13) | def descr_datasets(*names: str):
function add_dataset (line 26) | def add_dataset(name: str, meta: Metadata):
function load_dataset (line 37) | def load_dataset(name: str, __backend: str = None) -> Any:
function __getattr__ (line 47) | def __getattr__(name: str):
FILE: datar/datasets.py
class DatasetsDeprecatedWarning (line 5) | class DatasetsDeprecatedWarning(DeprecationWarning):
function __getattr__ (line 18) | def __getattr__(name: str):
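datar/datasets.py keeps old imports working after the datasets moved elsewhere by warning through a module-level `__getattr__` (PEP 562). The mechanism can be demonstrated on a dynamically built module; the forwarding target below is a placeholder, not datar's actual redirection:

```python
import types
import warnings


def _deprecated_getattr(name):
    """Warn, then forward the lookup to the new location (stubbed here)."""
    warnings.warn(
        f"Importing {name} from here is deprecated",
        DeprecationWarning,
        stacklevel=2,
    )
    return f"<{name} from new location>"


# PEP 562: a module-level __getattr__ handles failed attribute lookups.
legacy = types.ModuleType("legacy")
legacy.__getattr__ = _deprecated_getattr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = legacy.iris

assert value == "<iris from new location>"
assert caught and issubclass(caught[0].category, DeprecationWarning)
```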
FILE: datar/dplyr.py
function __getattr__ (line 16) | def __getattr__(name):
FILE: datar/misc.py
function pipe (line 12) | def pipe(data: _Any, func: _Callable, *args, **kwargs) -> _Any:
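Judging from its signature and the tests in tests/test_pipe.py (lists, dicts, extra args, chaining), `pipe` applies a callable to the data, passing extra arguments through. A minimal sketch consistent with that behavior:

```python
def pipe(data, func, *args, **kwargs):
    """Apply func to data: pipe(x, f, *a, **kw) == f(x, *a, **kw)."""
    return func(data, *args, **kwargs)


# Chaining reads left-to-right, like a fluent pipeline.
result = pipe(pipe([3, 1, 2], sorted), sum)
print(result)  # 6
```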
FILE: docs/notebooks/nb_helpers.py
function nb_header (line 20) | def nb_header(*funcs, book=None):
function try_catch (line 60) | def try_catch():
FILE: tests/conflict_names.py
function test_getattr (line 4) | def test_getattr(module, allow_conflict_names, fun, error):
function _import (line 27) | def _import(module, fun):
function test_import (line 38) | def test_import(module, allow_conflict_names, fun, error):
function make_test (line 54) | def make_test(module, allow_conflict_names, getattr, fun, error):
function main (line 64) | def main():
FILE: tests/conftest.py
function pytest_sessionstart (line 4) | def pytest_sessionstart(session):
FILE: tests/test_array_ufunc.py
function test_default (line 10) | def test_default():
function test_misc_obj (line 15) | def test_misc_obj():
FILE: tests/test_base.py
function test_default_implementation (line 376) | def test_default_implementation(fun, args):
function test_make_names (line 386) | def test_make_names(x, uniq, y):
function test_make_unique (line 395) | def test_make_unique(x, y):
function test_identify (line 400) | def test_identify():
FILE: tests/test_conflict_names.py
function _run_conflict_names (line 8) | def _run_conflict_names(module, allow_conflict_names, getat, error):
function test_from_all_import_allow_conflict_names_true (line 28) | def test_from_all_import_allow_conflict_names_true():
function test_from_all_import_allow_conflict_names_false (line 33) | def test_from_all_import_allow_conflict_names_false():
function test_all_getattr_allow_conflict_names_true (line 38) | def test_all_getattr_allow_conflict_names_true():
function test_all_getattr_allow_conflict_names_false (line 43) | def test_all_getattr_allow_conflict_names_false():
function test_from_base_import_allow_conflict_names_true (line 48) | def test_from_base_import_allow_conflict_names_true():
function test_from_base_import_allow_conflict_names_false (line 53) | def test_from_base_import_allow_conflict_names_false():
function test_base_getattr_allow_conflict_names_true (line 58) | def test_base_getattr_allow_conflict_names_true():
function test_base_getattr_allow_conflict_names_false (line 63) | def test_base_getattr_allow_conflict_names_false():
function test_from_dplyr_import_allow_conflict_names_true (line 68) | def test_from_dplyr_import_allow_conflict_names_true():
function test_from_dplyr_import_allow_conflict_names_false (line 73) | def test_from_dplyr_import_allow_conflict_names_false():
function test_dplyr_getattr_allow_conflict_names_true (line 78) | def test_dplyr_getattr_allow_conflict_names_true():
function test_dplyr_getattr_allow_conflict_names_false (line 83) | def test_dplyr_getattr_allow_conflict_names_false():
FILE: tests/test_data.py
function test_descr_datasets (line 6) | def test_descr_datasets():
function test_add_dataset (line 14) | def test_add_dataset():
function test_load_dataset (line 20) | def test_load_dataset():
function test_no_such (line 26) | def test_no_such():
FILE: tests/test_dplyr.py
function test_verb_not_implemented (line 208) | def test_verb_not_implemented(verb, data, args, kwargs):
function test_dep_verbs (line 236) | def test_dep_verbs(verb, data, args, kwargs):
FILE: tests/test_forcats.py
function test_default_impl (line 78) | def test_default_impl(verb, data, args, kwargs):
FILE: tests/test_names.py
function test_minimal (line 24) | def test_minimal(names, expect):
function test_unique (line 60) | def test_unique(names, expect):
function test_unique_algebraic_y (line 64) | def test_unique_algebraic_y():
function test_universal (line 140) | def test_universal(names, expect):
function test_check_unique (line 144) | def test_check_unique():
function test_custom_repair (line 158) | def test_custom_repair():
FILE: tests/test_options.py
function reset_options (line 11) | def reset_options():
function test_options_empty_args_returns_full_options (line 18) | def test_options_empty_args_returns_full_options():
function test_options_with_names_only_selects_options (line 24) | def test_options_with_names_only_selects_options():
function test_opts_with_names_nameval_pairs_mixed_rets_sel_opts_and_changes_option (line 30) | def test_opts_with_names_nameval_pairs_mixed_rets_sel_opts_and_changes_o...
function test_options_with_dict_updates_options (line 36) | def test_options_with_dict_updates_options():
function test_options_context (line 42) | def test_options_context():
FILE: tests/test_pipe.py
function test_pipe_with_list (line 5) | def test_pipe_with_list():
function test_pipe_with_dict (line 13) | def test_pipe_with_dict():
function test_pipe_with_args (line 21) | def test_pipe_with_args():
function test_pipe_with_kwargs (line 33) | def test_pipe_with_kwargs():
function test_pipe_with_string (line 45) | def test_pipe_with_string():
function test_pipe_with_tuple (line 52) | def test_pipe_with_tuple():
function test_pipe_returns_different_type (line 60) | def test_pipe_returns_different_type():
function test_pipe_chain_multiple (line 67) | def test_pipe_chain_multiple():
function test_pipe_with_custom_class (line 80) | def test_pipe_with_custom_class():
function test_pipe_with_multiple_args_and_kwargs (line 98) | def test_pipe_with_multiple_args_and_kwargs():
FILE: tests/test_plugin.py
class TestPlugin1 (line 12) | class TestPlugin1:
method get_versions (line 15) | def get_versions():
method load_dataset (line 19) | def load_dataset(name, metadata):
method misc_api (line 23) | def misc_api():
method operate (line 33) | def operate(op, x, y=None):
method c_getitem (line 39) | def c_getitem(item):
class TestPlugin2 (line 43) | class TestPlugin2:
method load_dataset (line 46) | def load_dataset(name, metadata):
method c_getitem (line 50) | def c_getitem(item):
method operate (line 54) | def operate(op, x, y=None):
function setup_function (line 60) | def setup_function(function):
function with_test_plugin1 (line 68) | def with_test_plugin1():
function with_test_plugin2 (line 75) | def with_test_plugin2():
function test_get_versions (line 81) | def test_get_versions(with_test_plugin1, capsys):
function test_misc_api (line 89) | def test_misc_api(with_test_plugin1):
function test_misc_api_array_ufunc (line 101) | def test_misc_api_array_ufunc(with_test_plugin1):
function test_load_dataset (line 118) | def test_load_dataset(with_test_plugin1, with_test_plugin2):
function test_operate (line 128) | def test_operate(with_test_plugin1):
function test_operate2 (line 134) | def test_operate2(with_test_plugin1, with_test_plugin2):
function test_c_getitem (line 146) | def test_c_getitem(with_test_plugin1):
function test_c_getitem2 (line 151) | def test_c_getitem2(with_test_plugin1, with_test_plugin2):
FILE: tests/test_tibble.py
function test_default_impl (line 38) | def test_default_impl(verb, data, args, kwargs):
FILE: tests/test_tidyr.py
function test_default_impl (line 52) | def test_default_impl(verb, data, args, kwargs):
FILE: tests/test_utils.py
function test_arg_match (line 5) | def test_arg_match():
FILE: tests/test_verb_env.py
function test_env_var_global (line 6) | def test_env_var_global():
function test_env_var_per_verb (line 23) | def test_env_var_per_verb():
function test_env_var_per_verb_with_trailing_underscore (line 40) | def test_env_var_per_verb_with_trailing_underscore():
function test_env_var_precedence (line 58) | def test_env_var_precedence():
function test_env_var_not_set (line 80) | def test_env_var_not_set():
function test_verb_with_env_var (line 94) | def test_verb_with_env_var():
function test_explicit_ast_fallback_with_env_var (line 116) | def test_explicit_ast_fallback_with_env_var():
FILE: tests/test_verb_env_integration.py
function test_verb_ast_fallback_piping (line 6) | def test_verb_ast_fallback_piping():
function test_verb_ast_fallback_normal (line 36) | def test_verb_ast_fallback_normal():
function test_verb_ast_fallback_global (line 65) | def test_verb_ast_fallback_global():
function test_verb_ast_fallback_precedence (line 95) | def test_verb_ast_fallback_precedence():
Condensed preview — 149 files, each showing path, character count, and a content snippet; the full structured content is not included here.
[
{
"path": ".codesandbox/Dockerfile",
"chars": 211,
"preview": "FROM python:3.10.12\r\n\r\nRUN apt-get update && apt-get install -y npm fish && \\\r\n pip install -U pip && \\\r\n pip inst"
},
{
"path": ".codesandbox/setup.sh",
"chars": 322,
"preview": "WORKSPACE=\"/workspace\"\n\n# Install python dependencies\npoetry update && poetry install\n\ncd $WORKSPACE\n\n# Install whichpy\n"
},
{
"path": ".coveragerc",
"chars": 119,
"preview": "[report]\nexclude_lines =\n pragma: no cover\n if TYPE_CHECKING:\nomit =\n datar/datasets.py\n */site-packages/*\n"
},
{
"path": ".github/ISSUE_TEMPLATE/bug_report.yml",
"chars": 1261,
"preview": "name: Bug Report\ndescription: Report incorrect behavior in the datar library\ntitle: \"[BUG] \"\nlabels: [bug]\n\nbody:\n - ty"
},
{
"path": ".github/ISSUE_TEMPLATE/feature_request.yml",
"chars": 1418,
"preview": "name: Feature Request\ndescription: Suggest an idea for datar\ntitle: \"[ENH] \"\nlabels: [enhancement]\n\nbody:\n - type: chec"
},
{
"path": ".github/ISSUE_TEMPLATE/submit_question.yml",
"chars": 384,
"preview": "name: Submit Question\ndescription: Ask a general question about datar\ntitle: \"[QST] \"\nlabels: [question]\n\nbody:\n - type"
},
{
"path": ".github/workflows/ci.yml",
"chars": 1754,
"preview": "name: CI\n\non:\n push:\n pull_request:\n release:\n types: [published]\n\njobs:\n build:\n runs-on: ubuntu-latest\n s"
},
{
"path": ".github/workflows/docs.yml",
"chars": 2003,
"preview": "name: Build Docs\n\non: [push]\n\njobs:\n docs:\n runs-on: ubuntu-latest\n # if: github.ref == 'refs/heads/master'\n s"
},
{
"path": ".gitignore",
"chars": 1393,
"preview": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packagi"
},
{
"path": ".pre-commit-config.yaml",
"chars": 1150,
"preview": "fail_fast: true\nrepos:\n- repo: https://github.com/pre-commit/pre-commit-hooks\n rev: 5df1a4bf6f04a1ed3a643167b38d502"
},
{
"path": "LICENSE",
"chars": 1063,
"preview": "MIT License\n\nCopyright (c) 2020 pwwang\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof "
},
{
"path": "README.md",
"chars": 4616,
"preview": "# datar\n\nA Grammar of Data Manipulation in python\n\n<!-- badges -->\n[![Pypi][6]][7] [![Github][8]][9] ![Building][10] [!["
},
{
"path": "datar/__init__.py",
"chars": 1319,
"preview": "from typing import Mapping as _Mapping\n\nfrom .core import operator as _\nfrom .core.defaults import f\nfrom .core.options "
},
{
"path": "datar/all.py",
"chars": 1164,
"preview": "\"\"\"Import all constants, verbs and functions\"\"\"\n\nfrom .core import load_plugins as _\nfrom .core.defaults import f\n\nfrom "
},
{
"path": "datar/apis/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "datar/apis/base.py",
"chars": 62568,
"preview": "\"\"\"APIs ported from r-base\"\"\"\n# import the variables with _ so that they are not imported by *\nimport math as _math\nfrom"
},
{
"path": "datar/apis/dplyr.py",
"chars": 77057,
"preview": "# import the variables with _ so that they are not imported by *\nfrom __future__ import annotations as _\nfrom typing imp"
},
{
"path": "datar/apis/forcats.py",
"chars": 18267,
"preview": "\nfrom typing import Any\n\nfrom pipda import register_func as _register_func\n\nfrom ..core.utils import (\n NotImplemente"
},
{
"path": "datar/apis/misc.py",
"chars": 652,
"preview": "from contextlib import contextmanager\n\nfrom pipda import register_func\n\n\n@contextmanager\ndef _array_ufunc_with_backend(b"
},
{
"path": "datar/apis/tibble.py",
"chars": 8520,
"preview": "from __future__ import annotations as _\nfrom typing import Any, Callable as _Callable\n\nfrom pipda import (\n register_"
},
{
"path": "datar/apis/tidyr.py",
"chars": 28796,
"preview": "from __future__ import annotations as _\nfrom typing import Any, Callable as _Callable, Mapping as _Mapping\n\nfrom pipda i"
},
{
"path": "datar/base.py",
"chars": 884,
"preview": "\nfrom .core.load_plugins import plugin as _plugin\nfrom .apis.base import *\n\nlocals().update(_plugin.hooks.base_api())\n__"
},
{
"path": "datar/core/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "datar/core/defaults.py",
"chars": 174,
"preview": "from pathlib import Path\n\nfrom pipda import Symbolic\n\nf = Symbolic()\n\nOPTION_FILE_HOME = Path(\"~/.datar.toml\").expanduse"
},
{
"path": "datar/core/load_plugins.py",
"chars": 520,
"preview": "from pipda import register_array_ufunc\n\nfrom .options import get_option\nfrom .plugin import plugin\n\n\ndef _array_ufunc_to"
},
{
"path": "datar/core/names.py",
"chars": 5726,
"preview": "\"\"\"Name repairing\"\"\"\nimport inspect\nimport re\nimport keyword\nimport math\nfrom numbers import Number\nfrom typing import A"
},
{
"path": "datar/core/operator.py",
"chars": 726,
"preview": "\"\"\"Operators for datar\"\"\"\nfrom typing import Callable\nfrom contextlib import contextmanager\n\nfrom pipda import register_"
},
{
"path": "datar/core/options.py",
"chars": 2927,
"preview": "\"\"\"Provide options\"\"\"\nfrom __future__ import annotations\n\nfrom typing import Any, Generator, Mapping\nfrom contextlib imp"
},
{
"path": "datar/core/plugin.py",
"chars": 1565,
"preview": "\"\"\"Plugin system to support different backends\"\"\"\nfrom typing import Any, List, Mapping, Tuple, Callable\n\nfrom simplug i"
},
{
"path": "datar/core/utils.py",
"chars": 1955,
"preview": "\"\"\"Utilities for datar\"\"\"\nimport sys\nimport logging\nfrom typing import Any, Callable\nfrom contextlib import contextmanag"
},
{
"path": "datar/core/verb_env.py",
"chars": 1256,
"preview": "\"\"\"Utilities for getting verb AST fallback from environment variables\"\"\"\nfrom __future__ import annotations\n\nimport os\n\n"
},
{
"path": "datar/data/__init__.py",
"chars": 1395,
"preview": "\"\"\"Collects datasets from R-datasets, dplyr and tidyr packages\"\"\"\nimport functools\nfrom typing import Any, List\n\nfrom .."
},
{
"path": "datar/data/metadata.py",
"chars": 12632,
"preview": "from collections import namedtuple\nfrom pathlib import Path\n\nHERE = Path(__file__).parent\n\nMetadata = namedtuple('Metada"
},
{
"path": "datar/datasets.py",
"chars": 415,
"preview": "# pragma: no cover\nimport warnings\n\n\nclass DatasetsDeprecatedWarning(DeprecationWarning):\n ...\n\n\nwarnings.simplefilte"
},
{
"path": "datar/dplyr.py",
"chars": 888,
"preview": "\nfrom .core.load_plugins import plugin as _plugin\nfrom .core.options import get_option as _get_option\nfrom .apis.dplyr i"
},
{
"path": "datar/forcats.py",
"chars": 124,
"preview": "\nfrom .core.load_plugins import plugin as _plugin\nfrom .apis.forcats import *\n\nlocals().update(_plugin.hooks.forcats_api"
},
{
"path": "datar/misc.py",
"chars": 1629,
"preview": "from typing import Any as _Any, Callable as _Callable\n\nfrom pipda import register_verb as _register_verb\nfrom .core.verb"
},
{
"path": "datar/tibble.py",
"chars": 122,
"preview": "\nfrom .core.load_plugins import plugin as _plugin\nfrom .apis.tibble import *\n\nlocals().update(_plugin.hooks.tibble_api()"
},
{
"path": "datar/tidyr.py",
"chars": 120,
"preview": "\nfrom .core.load_plugins import plugin as _plugin\nfrom .apis.tidyr import *\n\nlocals().update(_plugin.hooks.tidyr_api())\n"
},
{
"path": "docs/CHANGELOG.md",
"chars": 18360,
"preview": "# Change Log\n\n## 0.15.17\n\n- feat: update pandas dependency version to ^0.7 with pandas v3 support\n\n## 0.15.16\n\n- refacto"
},
{
"path": "docs/ENV_VARS.md",
"chars": 2867,
"preview": "# Environment Variable Support for Verb AST Fallback\n\nThis document explains how to use environment variables to control"
},
{
"path": "docs/backends.md",
"chars": 2682,
"preview": "# Backends\n\nThe `datar` package is a collection of APIs that are ported from a bunch of R packages. The APIs are impleme"
},
{
"path": "docs/data.md",
"chars": 805,
"preview": "\nSee full reference of datasets at: [reference-maps/data][1]\n\nDatasets have to be imported individually by:\n\n```python\nf"
},
{
"path": "docs/f.md",
"chars": 1443,
"preview": "## Why `f`?\n\nIt is just fast for you to type, since usually, it is `.` right after `f`. Then you have your left hand and"
},
{
"path": "docs/import.md",
"chars": 2005,
"preview": "## Import submodule, verbs and functions from datar\n\nYou can import everything (all verbs and functions) from datar by:\n"
},
{
"path": "docs/notebooks/across.ipynb",
"chars": 86038,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/add_column.ipynb",
"chars": 9244,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/add_row.ipynb",
"chars": 15017,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/arrange.ipynb",
"chars": 120441,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 3,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/base-arithmetic.ipynb",
"chars": 37120,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"data\":"
},
{
"path": "docs/notebooks/base-funs.ipynb",
"chars": 11195,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"data\":"
},
{
"path": "docs/notebooks/base.ipynb",
"chars": 33457,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"453624fc\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/between.ipynb",
"chars": 21803,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"5fcd666d\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/bind.ipynb",
"chars": 78794,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"bbd58535\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/case_when.ipynb",
"chars": 19529,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"registered-ghost\",\n \"metadata\": {\n \"ex"
},
{
"path": "docs/notebooks/chop.ipynb",
"chars": 22864,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/coalesce.ipynb",
"chars": 4183,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"applicable-fault\",\n \"metadata\": {\n \"ex"
},
{
"path": "docs/notebooks/complete.ipynb",
"chars": 10367,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/context.ipynb",
"chars": 32930,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"3d8dbd18\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/count.ipynb",
"chars": 54712,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"floral-roberts\",\n \"metadata\": {\n \"exec"
},
{
"path": "docs/notebooks/cumall.ipynb",
"chars": 13429,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"minute-millennium\",\n \"metadata\": {\n \"e"
},
{
"path": "docs/notebooks/desc.ipynb",
"chars": 3913,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"important-empty\",\n \"metadata\": {\n \"exe"
},
{
"path": "docs/notebooks/distinct.ipynb",
"chars": 27045,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"incoming-criminal\",\n \"metadata\": {\n \"e"
},
{
"path": "docs/notebooks/drop_na.ipynb",
"chars": 12607,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"ca9d87cf\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/enframe.ipynb",
"chars": 11726,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/expand.ipynb",
"chars": 66337,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"imposed-davis\",\n \"metadata\": {\n \"execu"
},
{
"path": "docs/notebooks/expand_grid.ipynb",
"chars": 14535,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"respiratory-oriental\",\n \"metadata\": {\n "
},
{
"path": "docs/notebooks/extract.ipynb",
"chars": 12442,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"processed-allah\",\n \"metadata\": {\n \"exe"
},
{
"path": "docs/notebooks/fill.ipynb",
"chars": 25361,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"armed-shield\",\n \"metadata\": {\n \"execut"
},
{
"path": "docs/notebooks/filter-joins.ipynb",
"chars": 11510,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/filter.ipynb",
"chars": 113875,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/forcats_fct_multi.ipynb",
"chars": 6110,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"data\":"
},
{
"path": "docs/notebooks/forcats_lvl_addrm.ipynb",
"chars": 13286,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"data\":"
},
{
"path": "docs/notebooks/forcats_lvl_order.ipynb",
"chars": 230030,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 32,\n \"metadata\": {"
},
{
"path": "docs/notebooks/forcats_lvl_value.ipynb",
"chars": 79935,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"data\":"
},
{
"path": "docs/notebooks/forcats_misc.ipynb",
"chars": 36973,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {},\n \"outputs\": [\n {\n \"data\":"
},
{
"path": "docs/notebooks/full_seq.ipynb",
"chars": 3375,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"occasional-onion\",\n \"metadata\": {\n \"ex"
},
{
"path": "docs/notebooks/group_by.ipynb",
"chars": 63607,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/group_map.ipynb",
"chars": 15987,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 2,\n \"id\": \"57a3cb89\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/group_split.ipynb",
"chars": 43533,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"47292892\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/group_trim.ipynb",
"chars": 6745,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"9941c94b\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/lead-lag.ipynb",
"chars": 18944,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"framed-grill\",\n \"metadata\": {\n \"execut"
},
{
"path": "docs/notebooks/mutate-joins.ipynb",
"chars": 29360,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"particular-aurora\",\n \"metadata\": {\n \"e"
},
{
"path": "docs/notebooks/mutate.ipynb",
"chars": 63231,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/n_distinct.ipynb",
"chars": 4680,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"several-cowboy\",\n \"metadata\": {\n \"exec"
},
{
"path": "docs/notebooks/na_if.ipynb",
"chars": 18980,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"ddb0828f\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/nb_helpers.py",
"chars": 1720,
"preview": "\"\"\"helpers for notebooks\"\"\"\nfrom contextlib import contextmanager\n\nfrom IPython.display import display, Markdown, HTML\nf"
},
{
"path": "docs/notebooks/near.ipynb",
"chars": 3690,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"permanent-waters\",\n \"metadata\": {\n \"ex"
},
{
"path": "docs/notebooks/nest-join.ipynb",
"chars": 6218,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"adverse-thesis\",\n \"metadata\": {\n \"exec"
},
{
"path": "docs/notebooks/nest.ipynb",
"chars": 54241,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/nth.ipynb",
"chars": 9942,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"respiratory-velvet\",\n \"metadata\": {\n \""
},
{
"path": "docs/notebooks/other.ipynb",
"chars": 20887,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 2,\n \"id\": \"5ddd5613\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/pack.ipynb",
"chars": 36605,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/pivot_longer.ipynb",
"chars": 74282,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"6401a1db\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/pivot_wider.ipynb",
"chars": 125758,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"weekly-pavilion\",\n \"metadata\": {\n \"exe"
},
{
"path": "docs/notebooks/pull.ipynb",
"chars": 15421,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"radio-madonna\",\n \"metadata\": {\n \"execu"
},
{
"path": "docs/notebooks/ranking.ipynb",
"chars": 48440,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"interstate-header\",\n \"metadata\": {\n \"e"
},
{
"path": "docs/notebooks/readme.ipynb",
"chars": 67535,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"0bf6a031\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/recode.ipynb",
"chars": 18384,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"material-amount\",\n \"metadata\": {\n \"exe"
},
{
"path": "docs/notebooks/reframe.ipynb",
"chars": 24269,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"4ba9dd17\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/relocate.ipynb",
"chars": 33594,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/rename.ipynb",
"chars": 32205,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"1dec8f90\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/replace_na.ipynb",
"chars": 11021,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"82eadf3d\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/rownames.ipynb",
"chars": 40847,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/rows.ipynb",
"chars": 34757,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/rowwise.ipynb",
"chars": 24096,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/select.ipynb",
"chars": 65201,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"metadata\": {\n \"execution\": {\n \"iopub.execu"
},
{
"path": "docs/notebooks/separate.ipynb",
"chars": 32312,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"005da05e\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/setops.ipynb",
"chars": 89999,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"f197471c\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/slice.ipynb",
"chars": 196895,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"687000ba\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/summarise.ipynb",
"chars": 21826,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"4ba9dd17\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/tibble.ipynb",
"chars": 48102,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 2,\n \"id\": \"8b02806d\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/uncount.ipynb",
"chars": 11195,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"c822a641\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/unite.ipynb",
"chars": 12943,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"35dd20d5\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/notebooks/with_groups.ipynb",
"chars": 13750,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": 1,\n \"id\": \"86647a1e\",\n \"metadata\": {\n \"execution\""
},
{
"path": "docs/options.md",
"chars": 1702,
"preview": "Options are used to change some behaviors in `datar`.\n\nFor environment variable configuration (such as controlling verb "
},
{
"path": "docs/reference-maps/ALL.md",
"chars": 766,
"preview": "\n|Module|Description|Reference|\n|-|-|-|\n|`base`|APIs ported from `r-base/r-stats/r-utils`|[:octicons-cross-reference-16:"
},
{
"path": "docs/reference-maps/base.md",
"chars": 24606,
"preview": "<style>\n.md-typeset__table {\n min-width: 100%;\n}\n\n.md-typeset table:not([class]) {\n display: table;\n max-width: "
},
{
"path": "docs/reference-maps/datasets.md",
"chars": 6674,
"preview": "<style>\n.md-typeset__table {\n min-width: 100%;\n}\n\n.md-typeset table:not([class]) {\n display: table;\n max-width: "
},
{
"path": "docs/reference-maps/dplyr.md",
"chars": 12465,
"preview": "<style>\n.md-typeset__table {\n min-width: 100%;\n}\n\n.md-typeset table:not([class]) {\n display: table;\n max-width: "
},
{
"path": "docs/reference-maps/forcats.md",
"chars": 5674,
"preview": "<style>\n.md-typeset__table {\n min-width: 100%;\n}\n\n.md-typeset table:not([class]) {\n display: table;\n max-width: "
},
{
"path": "docs/reference-maps/other.md",
"chars": 1727,
"preview": "<style>\n.md-typeset__table {\n min-width: 100%;\n}\n\n.md-typeset table:not([class]) {\n display: table;\n max-width: "
},
{
"path": "docs/reference-maps/stats.md",
"chars": 856,
"preview": "<style>\n.md-typeset__table {\n min-width: 100%;\n}\n\n.md-typeset table:not([class]) {\n display: table;\n max-width: "
},
{
"path": "docs/reference-maps/tibble.md",
"chars": 3350,
"preview": "<style>\n.md-typeset__table {\n min-width: 100%;\n}\n\n.md-typeset table:not([class]) {\n display: table;\n max-width: "
},
{
"path": "docs/reference-maps/tidyr.md",
"chars": 4460,
"preview": "<style>\n.md-typeset__table {\n min-width: 100%;\n}\n\n.md-typeset table:not([class]) {\n display: table;\n max-width: "
},
{
"path": "docs/reference-maps/utils.md",
"chars": 669,
"preview": "<style>\n.md-typeset__table {\n min-width: 100%;\n}\n\n.md-typeset table:not([class]) {\n display: table;\n max-width: "
},
{
"path": "docs/style.css",
"chars": 2539,
"preview": "\n.md-main__inner.md-grid {\n max-width: 80%;\n margin-left: 32px;\n}\n\n.md-typeset .admonition, .md-typeset details {\n"
},
{
"path": "mkdocs.yml",
"chars": 5028,
"preview": "site_name: datar\nrepo_url: https://github.com/pwwang/datar\nrepo_name: pwwang/datar\ntheme:\n favicon: favicon.png\n l"
},
{
"path": "pyproject.toml",
"chars": 1785,
"preview": "[project]\nname = \"datar\"\nversion = \"0.15.17\"\ndescription = \"A Grammar of Data Manipulation in python\"\nauthors = [\n {n"
},
{
"path": "setup.py",
"chars": 231,
"preview": "\"\"\"\n# This will not be included in the distribution.\n# The distribution is managed by uv\n# This file is kept only for\n# "
},
{
"path": "tests/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "tests/conflict_names.py",
"chars": 2891,
"preview": "import argparse\n\n\ndef test_getattr(module, allow_conflict_names, fun, error):\n from datar import options\n options("
},
{
"path": "tests/conftest.py",
"chars": 113,
"preview": "from datar import options\n\n\ndef pytest_sessionstart(session):\n # Load no plugins\n options(backends=[None])\n"
},
{
"path": "tests/test_array_ufunc.py",
"chars": 604,
"preview": "import pytest # noqa: F401\n\nimport numpy as np\nfrom pipda import Context\nfrom datar import f\nfrom datar.core import plu"
},
{
"path": "tests/test_base.py",
"chars": 6666,
"preview": "import pytest\n\nfrom datar.base import (\n ceiling,\n cov,\n floor,\n mean,\n median,\n pmax,\n pmin,\n s"
},
{
"path": "tests/test_conflict_names.py",
"chars": 2384,
"preview": "import sys\r\nimport subprocess\r\nfrom pathlib import Path\r\n\r\nimport pytest\r\n\r\n\r\ndef _run_conflict_names(module, allow_conf"
},
{
"path": "tests/test_data.py",
"chars": 654,
"preview": "import pytest\nfrom datar.data import descr_datasets, add_dataset\nfrom datar.core.utils import NotImplementedByCurrentBac"
},
{
"path": "tests/test_dplyr.py",
"chars": 5769,
"preview": "import pytest\n\nfrom datar.core.utils import NotImplementedByCurrentBackendError\nfrom datar.dplyr import (\n across,\n "
},
{
"path": "tests/test_forcats.py",
"chars": 2068,
"preview": "import pytest # noqa: F401\n\nfrom datar.core.utils import NotImplementedByCurrentBackendError\nfrom datar.forcats import "
},
{
"path": "tests/test_names.py",
"chars": 5450,
"preview": "# https://github.com/r-lib/vctrs/blob/master/tests/testthat/test-names.R\nimport pytest\nfrom typing import Iterable\n\nimpo"
},
{
"path": "tests/test_options.py",
"chars": 1037,
"preview": "import pytest\nfrom datar.core.options import (\n options,\n options_context,\n add_option,\n get_option,\n)\n\n\n@py"
},
{
"path": "tests/test_pipe.py",
"chars": 2730,
"preview": "import pytest\nfrom datar.all import pipe\n\n\ndef test_pipe_with_list():\n \"\"\"Test pipe with a list\"\"\"\n data = [1, 2, "
},
{
"path": "tests/test_plugin.py",
"chars": 3955,
"preview": "import pytest\n\nimport numpy as np\nfrom simplug import MultipleImplsForSingleResultHookWarning\nfrom pipda import Context\n"
},
{
"path": "tests/test_tibble.py",
"chars": 1079,
"preview": "import pytest # noqa: F401\n\nfrom datar.core.utils import NotImplementedByCurrentBackendError\nfrom datar.tibble import ("
},
{
"path": "tests/test_tidyr.py",
"chars": 1285,
"preview": "import pytest\n\nfrom datar.core.utils import NotImplementedByCurrentBackendError\nfrom datar.tidyr import (\n chop,\n "
},
{
"path": "tests/test_utils.py",
"chars": 336,
"preview": "import pytest\nfrom datar.core.utils import arg_match\n\n\ndef test_arg_match():\n with pytest.raises(ValueError, match='a"
},
{
"path": "tests/test_verb_env.py",
"chars": 4190,
"preview": "\"\"\"Tests for verb environment variable support\"\"\"\nimport os\nimport pytest\n\n\ndef test_env_var_global():\n \"\"\"Test globa"
},
{
"path": "tests/test_verb_env_integration.py",
"chars": 4488,
"preview": "\"\"\"Integration test to demonstrate the environment variable feature\"\"\"\nimport os\nimport pytest\n\n\ndef test_verb_ast_fallb"
},
{
"path": "tox.ini",
"chars": 544,
"preview": "[flake8]\nignore = E203, W503, E731\nper-file-ignores =\n # imported but unused\n __init__.py: F401, E402\n datar/al"
}
]
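The manifest above is a JSON array of `{path, chars, preview}` objects. A minimal sketch of consuming a manifest in that shape (the `manifest_json` sample below is a hypothetical three-entry subset, not the full listing):

```python
import json

# Hypothetical subset of the manifest, in the same {path, chars, preview} shape.
manifest_json = """
[
  {"path": "docs/import.md", "chars": 2005, "preview": "## Import submodule..."},
  {"path": "docs/notebooks/across.ipynb", "chars": 86038, "preview": "{..."},
  {"path": "tests/test_base.py", "chars": 6666, "preview": "import pytest..."}
]
"""

entries = json.loads(manifest_json)

# Total size of the listed files, in characters.
total_chars = sum(e["chars"] for e in entries)

# Restrict to the notebook documentation files.
notebooks = [e["path"] for e in entries if e["path"].endswith(".ipynb")]

print(total_chars)  # 94709
print(notebooks)    # ['docs/notebooks/across.ipynb']
```

The same two-line pattern (`json.loads` plus a comprehension) works on the full 149-entry manifest.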
About this extraction
This page contains the full source code of the pwwang/datar GitHub repository, extracted and formatted as plain text. The extraction includes 149 files (2.9 MB), approximately 775.5k tokens, and a symbol index with 509 extracted functions, classes, methods, constants, and types.