Repository: srstevenson/nb-clean
Branch: main
Commit: 1e21d56623ba
Files: 35
Total size: 86.7 KB

Directory structure:
gitextract_9zcbq9pw/

├── .github/
│   ├── CODEOWNERS
│   ├── CONTRIBUTING.md
│   ├── dependabot.yml
│   └── workflows/
│       └── ci.yml
├── .gitignore
├── .pre-commit-hooks.yaml
├── .prettierrc.toml
├── .python-version
├── LICENSE
├── README.md
├── justfile
├── pyproject.toml
├── src/
│   └── nb_clean/
│       ├── __init__.py
│       ├── __main__.py
│       ├── cli.py
│       └── py.typed
└── tests/
    ├── conftest.py
    ├── notebooks/
    │   ├── clean.ipynb
    │   ├── clean_with_cell_metadata.ipynb
    │   ├── clean_with_counts.ipynb
    │   ├── clean_with_empty_cells.ipynb
    │   ├── clean_with_notebook_metadata.ipynb
    │   ├── clean_with_outputs.ipynb
    │   ├── clean_with_outputs_with_counts.ipynb
    │   ├── clean_with_tags_metadata.ipynb
    │   ├── clean_with_tags_special_metadata.ipynb
    │   ├── clean_without_empty_cells.ipynb
    │   ├── clean_without_notebook_metadata.ipynb
    │   ├── dirty.ipynb
    │   ├── dirty_empty_octave.ipynb
    │   └── dirty_with_version.ipynb
    ├── test_check_notebook.py
    ├── test_clean_notebook.py
    ├── test_cli.py
    └── test_git_integration.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/CODEOWNERS
================================================
* @srstevenson


================================================
FILE: .github/CONTRIBUTING.md
================================================
# Contributing

Thanks for considering contributing! The following is a set of guidelines for
doing so. They're guidelines rather than rules, so follow your best judgement,
but reading them will help make the contribution process easier and more
effective for both you and the maintainers.

## Reporting issues

GitHub issues are used for managing bug reports and feature requests, except
security vulnerabilities: these should be emailed to the maintainers instead.

Search for existing issues before creating a new one, to ensure your problem
hasn't already been reported. If it has, you're welcome to comment on the
existing issue with extra information that might help reproduce and fix the
problem, or to share why a feature would be useful, but refrain from "+1" type
comments. Duplicate issues will be closed with a reference to the existing
issue.

In your report, describe what you did, what you expected to happen, and what
happened instead. Provide a [minimal reproducible example][mre] that the
maintainers can run. Provide as much detail as you can in your description of
the problem, including the version of the project you're using, and details of
your operating system and environment, and other information which might help
diagnose the problem, such as what you've already tried to fix it.

## Contributing changes

### Planning

When you contribute a new change, the responsibility for maintenance is (by
default) transferred to the existing project maintainers. The benefit of the
contribution must be weighed against the cost of maintaining it.

If you're considering contributing a non-trivial bugfix or feature, discuss the
changes you plan to make before you start coding by opening an issue. This
helps ensure your proposed change will be accepted, and gives the maintainers
the opportunity to help you.

### Implementation

Changes are managed using GitHub pull requests. If you're new to pull requests,
read the [documentation][pr docs] to learn how they work.

[uv] is used for managing dependencies and packaging, and you will need it
installed. If you're not familiar with uv, we suggest reading its documentation
before you begin.

After cloning the repository, you can implement your changes as follows:

1. Install the project and its dependencies into an isolated virtual environment
   with `uv sync`.
2. Before making your changes, run the tests with `just test`, and ensure they
   pass. This checks your development environment is correctly configured, and
   there aren't outstanding issues before you start coding. If they don't pass,
   you can open a GitHub issue for help debugging.
3. Check out a new branch for your changes, branching from `main`, with a
   sensible name for your changes.
4. Implement your changes.
5. If you introduced new functionality or fixed a bug, add appropriate automated
   tests to prevent future regressions.
6. Ensure you've updated any docstrings or documentation files (including
   `README.md`) which are affected by your change.
7. Run the formatter, linter and type checker with `just fmt lint`, and tests
   with `just test`, and fix any problems.
8. Commit your changes, following [these guidelines][commit guidelines] for your
   commit messages.
9. Fork the base repository on GitHub, push your branch to your fork, and open a
   pull request against the base repository. Make sure your pull request has a
   clear title and description. The easier your changes are to understand, the
   easier it is for the maintainers to approve and merge them.
10. Your pull request will be reviewed by the maintainers and either merged, or
    feedback will be provided on changes that are required.

[commit guidelines]:
  https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
[mre]: https://stackoverflow.com/help/minimal-reproducible-example
[pr docs]: https://docs.github.com/en/github/collaborating-with-pull-requests
[uv]: https://docs.astral.sh/uv/


================================================
FILE: .github/dependabot.yml
================================================
version: 2
updates:
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "monthly"
    cooldown:
      default-days: 7
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "monthly"
    cooldown:
      default-days: 7


================================================
FILE: .github/workflows/ci.yml
================================================
name: CI

on:
  push:
    branches: [main]
  pull_request:

jobs:
  checks:
    name: Run checks
    runs-on: ubuntu-slim
    strategy:
      matrix:
        python:
          - "3.10"
          - "3.11"
          - "3.12"
          - "3.13"
          - "3.14"
    env:
      UV_PYTHON: ${{ matrix.python }}
    steps:
      - uses: actions/checkout@v6

      - name: Setup uv
        uses: astral-sh/setup-uv@v7

      - name: Setup Python
        uses: actions/setup-python@v6
        with:
          python-version: ${{ matrix.python }}

      - name: Install dependencies
        run: uv sync --dev

      - name: Run formatter
        run: uv run ruff format --check .

      - name: Run linter
        run: uv run ruff check .

      - name: Run type checker
        run: uv run ty check .

      - name: Run tests
        run: uv run coverage run -m pytest

      - name: Print test coverage report
        run: uv run coverage report


================================================
FILE: .gitignore
================================================
*.egg-info/
.ipynb_checkpoints/
/.coverage
/build/
/coverage.xml
/dist/
__pycache__/


================================================
FILE: .pre-commit-hooks.yaml
================================================
- id: nb-clean
  name: nb-clean
  entry: nb-clean clean
  language: python
  types_or: [jupyter]
  minimum_pre_commit_version: 2.9.2


================================================
FILE: .prettierrc.toml
================================================
proseWrap = "always"


================================================
FILE: .python-version
================================================
3.10


================================================
FILE: LICENSE
================================================
Copyright © Scott Stevenson <scott@stevenson.io>

Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH
REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,
INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR
OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THIS SOFTWARE.


================================================
FILE: README.md
================================================
<p align="center"><img src="images/nb-clean.png" width=300></p>

[![License](https://img.shields.io/github/license/srstevenson/nb-clean?label=License&color=blue)](https://github.com/srstevenson/nb-clean/blob/main/LICENSE)
[![GitHub release](https://img.shields.io/github/v/release/srstevenson/nb-clean?label=GitHub)](https://github.com/srstevenson/nb-clean)
[![PyPI version](https://img.shields.io/pypi/v/nb-clean?label=PyPI)](https://pypi.org/project/nb-clean/)
[![Python versions](https://img.shields.io/pypi/pyversions/nb-clean?label=Python)](https://pypi.org/project/nb-clean/)
[![CI status](https://github.com/srstevenson/nb-clean/workflows/CI/badge.svg)](https://github.com/srstevenson/nb-clean/actions)

nb-clean cleans Jupyter notebooks of cell execution counts, metadata, outputs,
and (optionally) empty cells, preparing them for committing to version control.
It provides both a Git filter and pre-commit hook to automatically clean
notebooks before they're staged, and can also be used with other version control
systems, as a command line tool, and as a Python library. It can determine if a
notebook is clean or not, which can be used as a check in your continuous
integration pipelines.

Jupyter notebooks contain execution metadata that changes every time you run a
cell, including execution counts, timestamps, and output data. When committed to
version control, these elements create unnecessary diff noise, make meaningful
code review difficult, and can accidentally expose sensitive information in cell
outputs. By cleaning notebooks before committing, you preserve only the
essential code and markdown content, leading to cleaner diffs, more focused
reviews, and better collaboration.

For a detailed discussion of the challenges notebooks present for version
control and collaborative development, see my [PyCon UK 2017 talk][pycon talk]
and accompanying [blog post][blog post].

> [!NOTE]
>
> nb-clean 2.0.0 introduced a new command line interface to make cleaning
> notebooks in place easier. If you upgrade from a previous release, you'll need
> to migrate to the new interface as described under
> [Migrating to nb-clean 2](#migrating-to-nb-clean-2).

## Installation

nb-clean requires Python 3.10 or later. To run the latest release of nb-clean in
an ephemeral virtual environment, use [uv]:

```bash
uvx nb-clean
```

To add nb-clean as a dependency to a Python project managed with uv, use:

```bash
uv add --dev nb-clean
```

## Command line usage

### Understanding notebook metadata

Jupyter notebooks contain several types of metadata that nb-clean can handle:

**Cell metadata** includes information attached to individual cells, such as
tags, slideshow settings, and execution timing. Cell metadata fields like
`collapsed`, `scrolled`, `deletable`, and `editable` control notebook interface
behaviour, whilst `tags` and custom fields support workflow automation.

**Notebook metadata** contains document-level information including the kernel
specification, language version, and notebook format version. The language
version information (`metadata.language_info.version`) frequently changes
between Python versions and creates unnecessary version control noise.

**Execution metadata** encompasses execution counts for code cells and their
outputs, along with execution timestamps and output data. This metadata changes
every time you run cells, regardless of whether the actual code has changed.
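
To make these three categories concrete, the following is a minimal sketch
using only the standard library. The notebook dictionary is a hand-written,
hypothetical example, and `strip_volatile_fields` is an illustrative helper
rather than part of nb-clean's API. It only approximates the default cleaning
behaviour described above.

```python
import copy
import json

# A hand-written, hypothetical notebook in the nbformat 4 JSON layout.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {
        "kernelspec": {"name": "python3", "display_name": "Python 3"},
        "language_info": {"name": "python", "version": "3.12.1"},
    },
    "cells": [
        {
            "cell_type": "code",
            "source": "1 + 1",
            "metadata": {"collapsed": False, "tags": ["demo"]},
            "execution_count": 7,
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "data": {"text/plain": "2"},
                    "metadata": {},
                }
            ],
        },
        {"cell_type": "markdown", "source": "# Notes", "metadata": {}},
    ],
}


def strip_volatile_fields(nb: dict) -> dict:
    """Return a copy of `nb` without the fields that change between runs.

    Illustrative helper only; this is not nb-clean's implementation.
    """
    cleaned = copy.deepcopy(nb)
    for cell in cleaned["cells"]:
        # Cell metadata: tags, collapsed, scrolled, and so on.
        cell["metadata"] = {}
        if cell["cell_type"] == "code":
            # Execution metadata: counts and outputs.
            cell["execution_count"] = None
            cell["outputs"] = []
    # Notebook metadata: drop only the language version, keep the rest.
    cleaned["metadata"].get("language_info", {}).pop("version", None)
    return cleaned


cleaned = strip_volatile_fields(notebook)
print(json.dumps(cleaned["cells"][0], indent=2))
```

The original dictionary is left untouched, so the two can be compared to see
which fields would otherwise show up as diff noise.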

### Checking

You can check if a notebook is clean with:

```bash
nb-clean check notebook.ipynb
```

You can also process notebooks through standard input and output streams, which
is useful for integrating with shell pipelines or processing notebooks without
writing to disk:

```bash
nb-clean check < notebook.ipynb
```

When reading from standard input, nb-clean processes the notebook content
directly without accessing the filesystem. This approach is particularly useful
for automated workflows, continuous integration pipelines, or when you want to
check notebooks without creating temporary files.

The check can be run with the following flags:

- To check for empty cells use `--remove-empty-cells` or the short form `-e`.
- To ignore cell metadata use `--preserve-cell-metadata` or the short form `-m`.
  This will ignore all metadata fields. You can also pass a list of fields to
  ignore with `--preserve-cell-metadata field1 field2` or `-m field1 field2`.
  Note that when _not_ passing a list of fields, either the `-m` or
  `--preserve-cell-metadata` flag must be passed _after_ the notebook paths to
  process, or the notebook paths should be preceded with `--` so they are not
  interpreted as metadata fields.
- To ignore cell outputs use `--preserve-cell-outputs` or the short form `-o`.
- To ignore cell execution counts use `--preserve-execution-counts` or the short
  form `-c`.
- To ignore language version notebook metadata use
  `--preserve-notebook-metadata` or the short form `-n`.
- To check the notebook does not contain any notebook metadata use
  `--remove-all-notebook-metadata` or the short form `-M`.

For example, to check if a notebook is clean whilst ignoring notebook metadata:

```bash
nb-clean check --preserve-notebook-metadata notebook.ipynb
```

To check if a notebook is clean whilst ignoring all cell metadata:

```bash
nb-clean check --preserve-cell-metadata -- notebook.ipynb
```

To check if a notebook is clean whilst ignoring only the `tags` cell metadata
field:

```bash
nb-clean check --preserve-cell-metadata tags -- notebook.ipynb
```
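
The need for `--` follows from how an option that accepts a variable number of
values interacts with positional arguments. The sketch below reproduces the
effect with a small stand-in parser built on Python's `argparse`. The parser,
its `prog` name, and the `paths` argument name are assumptions for
illustration, and nb-clean's real parser may be defined differently.

```python
import argparse

# Hypothetical stand-in parser: one option taking zero or more values,
# followed by zero or more positional notebook paths.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("-m", "--preserve-cell-metadata", nargs="*", default=None)
parser.add_argument("paths", nargs="*")

# Without a separator, the path is swallowed as a metadata field name.
swallowed = parser.parse_args(["--preserve-cell-metadata", "notebook.ipynb"])

# With `--`, the option receives no fields and the path stays positional.
separated = parser.parse_args(["--preserve-cell-metadata", "--", "notebook.ipynb"])

# Field names given before `--` are collected by the option.
with_fields = parser.parse_args(
    ["--preserve-cell-metadata", "tags", "--", "notebook.ipynb"]
)

# Passing the bare flag after the paths also keeps the path positional.
trailing = parser.parse_args(["notebook.ipynb", "-m"])

for label, ns in [
    ("swallowed", swallowed),
    ("separated", separated),
    ("with_fields", with_fields),
    ("trailing", trailing),
]:
    print(f"{label}: fields={ns.preserve_cell_metadata} paths={ns.paths}")
```

In the first case the notebook path ends up in the list of fields and no paths
remain, which is why the separator (or placing the flag last) matters.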

nb-clean will exit with status code 0 if the notebook is clean, and status code
1 if it is not. nb-clean will also print details of cell execution counts,
metadata, outputs, and empty cells it finds.

Note that the conflicting options `--preserve-notebook-metadata` and
`--remove-all-notebook-metadata` cannot be used together, as they represent
contradictory instructions.
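
Because the result is reported through the exit status, a script can gate on it
without parsing any output. The following sketch shows one way to do that from
Python. `passes_check` is a hypothetical helper, and the two stand-in commands
only emulate the documented exit statuses so that the example runs without
nb-clean installed.

```python
import subprocess
import sys


def passes_check(command: list[str]) -> bool:
    """Run a check command and report whether it exited with status 0."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.returncode == 0


# In real use the command would be something like
# ["nb-clean", "check", "notebook.ipynb"]. These stand-ins exit with the
# documented statuses: 0 for a clean notebook, 1 for a dirty one.
clean_stub = [sys.executable, "-c", "raise SystemExit(0)"]
dirty_stub = [sys.executable, "-c", "raise SystemExit(1)"]

clean_result = passes_check(clean_stub)
dirty_result = passes_check(dirty_stub)
print(clean_result)  # True
print(dirty_result)  # False
```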

### Cleaning (interactive)

You can clean a Jupyter notebook with:

```bash
nb-clean clean notebook.ipynb
```

This cleans the notebook in place. You can also pass the notebook content on
standard input, in which case the cleaned notebook is written to standard
output:

```bash
nb-clean clean < original.ipynb > cleaned.ipynb
```

The cleaning can be run with the following flags:

- To remove empty cells use `--remove-empty-cells` or the short form `-e`.
- To preserve cell metadata use `--preserve-cell-metadata` or the short form
  `-m`. This will preserve all metadata fields. You can also pass a list of
  fields to preserve with `--preserve-cell-metadata field1 field2` or
  `-m field1 field2`. Note that when _not_ passing a list of fields, either the
  `-m` or `--preserve-cell-metadata` flag must be passed _after_ the notebook
  paths to process, or the notebook paths should be preceded with `--` so they
  are not interpreted as metadata fields.
- To preserve cell outputs use `--preserve-cell-outputs` or the short form `-o`.
- To preserve cell execution counts use `--preserve-execution-counts` or the
  short form `-c`.
- To preserve notebook metadata (such as language version) use
  `--preserve-notebook-metadata` or the short form `-n`.
- To remove all notebook metadata use `--remove-all-notebook-metadata` or the
  short form `-M`.

For example, to clean a notebook whilst preserving notebook metadata:

```bash
nb-clean clean --preserve-notebook-metadata notebook.ipynb
```

To clean a notebook whilst preserving all cell metadata:

```bash
nb-clean clean --preserve-cell-metadata -- notebook.ipynb
```

To clean a notebook whilst preserving only the `tags` cell metadata field:

```bash
nb-clean clean --preserve-cell-metadata tags -- notebook.ipynb
```

#### Directory processing

Both the `check` and `clean` commands can operate on directories as well as
individual notebook files. When you provide a directory path, nb-clean will
recursively find all `.ipynb` files within that directory and process them. For
example:

```bash
nb-clean check notebooks/
```

or

```bash
nb-clean clean experiments/
```

This is particularly useful for batch processing entire project directories or
ensuring all notebooks in a repository are clean.
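
As a rough illustration of recursive discovery, the sketch below collects every
`.ipynb` file beneath a directory using `pathlib`. `find_notebooks` is a
hypothetical helper written for this example, and nb-clean's own traversal may
differ in details such as ordering or which directories it skips.

```python
import tempfile
from pathlib import Path


def find_notebooks(root: Path) -> list[Path]:
    """Return every `.ipynb` file below `root`, sorted for stable output."""
    return sorted(path for path in root.rglob("*.ipynb") if path.is_file())


# Build a throwaway directory tree so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested" / "deeper").mkdir(parents=True)
    (root / "top.ipynb").write_text("{}", encoding="UTF-8")
    (root / "nested" / "deeper" / "inner.ipynb").write_text("{}", encoding="UTF-8")
    (root / "nested" / "notes.txt").write_text("not a notebook", encoding="UTF-8")

    found = [path.relative_to(root).as_posix() for path in find_notebooks(root)]

print(found)
```

Only the two notebook files are returned; the text file in the same tree is
ignored.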

### Cleaning (Git filter)

To add a filter to an existing Git repository to automatically clean notebooks
when they're staged, run the following from the working tree:

```bash
nb-clean add-filter
```

This will configure a filter to remove cell execution counts, metadata, and
outputs. The same flags as described above for
[interactive cleaning](#cleaning-interactive) can be passed to customise the
behaviour.

The Git filter operates by configuring the `filter.nb-clean.clean` setting in
your repository's local Git configuration and adding the line
`*.ipynb filter=nb-clean` to `.git/info/attributes`. This ensures that all
notebook files are automatically processed through nb-clean when staged for
commit. The filter configuration is local to the repository and won't affect
your global or system Git settings.

To remove the filter, run:

```bash
nb-clean remove-filter
```

### Cleaning (Jujutsu)

nb-clean can be used to clean notebooks tracked with [Jujutsu] rather than Git.
Configure Jujutsu to use nb-clean as a fix tool by adding the following snippet
to `~/.config/jj/config.toml`:

```toml
[fix.tools.nb-clean]
command = ["nb-clean", "clean"]
patterns = ["glob:'**/*.ipynb'"]
```

The same flags as described above for
[interactive cleaning](#cleaning-interactive) can be appended to the `command`
array to customise the behaviour.

Tracked notebooks can then be cleaned by running `jj fix`. See the [Jujutsu
documentation][jujutsu docs] for further details of how to invoke and configure
fix tools.

### Cleaning (pre-commit hook)

nb-clean can also be used as a [pre-commit] hook. You may prefer this to the Git
filter if your project already uses the pre-commit framework.

Note that the Git filter and pre-commit hook work differently, with different
effects on your working directory. The pre-commit hook operates on the notebook
on disk, cleaning the copy in your working directory. The Git filter cleans
notebooks as they are added to the index, leaving the copy in your working
directory dirty. This means cell outputs are still visible to you in your local
Jupyter instance when using the Git filter, but not when using the pre-commit
hook.

After installing [pre-commit], add the nb-clean hook by adding the following
snippet to `.pre-commit-config.yaml` in the root of your repository:

```yaml
repos:
  - repo: https://github.com/srstevenson/nb-clean
    rev: 4.0.1
    hooks:
      - id: nb-clean
```

You can pass additional arguments to nb-clean with an `args` array. The
following example shows how to preserve only two specific metadata fields. Note
that, in the example, the final item `--` in the arg list is mandatory. The
option `--preserve-cell-metadata` may take an arbitrary number of field
arguments, and the `--` argument is needed to separate them from notebook
filenames, which `pre-commit` will append to the list of arguments.

```yaml
repos:
  - repo: https://github.com/srstevenson/nb-clean
    rev: 4.0.1
    hooks:
      - id: nb-clean
        args:
          - --remove-empty-cells
          - --preserve-cell-metadata
          - tags
          - slideshow
          - --
```

Run `pre-commit install` to ensure the hook is installed, and
`pre-commit autoupdate` to update the hook to the latest release of nb-clean.

### Preserving all nbformat metadata

To ignore or preserve specifically the metadata defined in the
[`nbformat` documentation](https://nbformat.readthedocs.io/en/latest/format_description.html#cell-metadata),
use the following options:
`--preserve-cell-metadata collapsed scrolled deletable editable format name tags jupyter execution`.

## Python library usage

nb-clean can be used programmatically as a Python library, allowing integration
into other tools.

```python
import nbformat

import nb_clean

# Load a notebook
with open("notebook.ipynb") as f:
    notebook = nbformat.read(f, as_version=nbformat.NO_CONVERT)

# Check if the notebook is clean
is_clean = nb_clean.check_notebook(
    notebook, preserve_cell_outputs=True, filename="notebook.ipynb"
)

# Clean the notebook
cleaned_notebook = nb_clean.clean_notebook(
    notebook, remove_empty_cells=True, preserve_cell_metadata=["tags", "slideshow"]
)
```

The library functions accept the same configuration options as the command-line
interface. The `check_notebook()` function returns a boolean indicating whether
the notebook is clean, whilst `clean_notebook()` returns a cleaned copy of the
notebook.

## Migrating to nb-clean 2

The following table maps from the command line interface of nb-clean 1.6.0 to
that of nb-clean >=2.0.0.

The examples in the table use long flags, but short flags can also be used
instead.

| Description                                 | nb-clean 1.6.0                                                   | nb-clean >=2.0.0                                            |
| ------------------------------------------- | ---------------------------------------------------------------- | ----------------------------------------------------------- |
| Clean notebook                              | `nb-clean clean --input notebook.ipynb \| sponge notebook.ipynb` | `nb-clean clean notebook.ipynb`                             |
| Clean notebook (remove empty cells)         | `nb-clean clean --input notebook.ipynb --remove-empty`           | `nb-clean clean --remove-empty-cells notebook.ipynb`        |
| Clean notebook (preserve all cell metadata) | `nb-clean clean --input notebook.ipynb --preserve-metadata`      | `nb-clean clean --preserve-cell-metadata -- notebook.ipynb` |
| Check notebook                              | `nb-clean check --input notebook.ipynb`                          | `nb-clean check notebook.ipynb`                             |
| Check notebook (also flag empty cells)      | `nb-clean check --input notebook.ipynb --remove-empty`           | `nb-clean check --remove-empty-cells notebook.ipynb`        |
| Check notebook (ignore all cell metadata)   | `nb-clean check --input notebook.ipynb --preserve-metadata`      | `nb-clean check --preserve-cell-metadata -- notebook.ipynb` |
| Add Git filter to clean notebooks           | `nb-clean configure-git`                                         | `nb-clean add-filter`                                       |
| Remove Git filter                           | `nb-clean unconfigure-git`                                       | `nb-clean remove-filter`                                    |

## Copyright

Copyright © Scott Stevenson.

nb-clean is distributed under the terms of the [ISC license].

[blog post]: https://srstevenson.com/posts/jupyter-notebooks-and-collaboration/
[isc license]: https://opensource.org/licenses/ISC
[jujutsu docs]: https://jj-vcs.github.io/jj/latest/cli-reference/#jj-fix
[jujutsu]: https://jj-vcs.github.io/jj/
[pre-commit]: https://pre-commit.com/
[pycon talk]: https://www.youtube.com/watch?v=J3k3HkVnd2c
[uv]: https://docs.astral.sh/uv/


================================================
FILE: justfile
================================================
# show this help message (default)
help:
    @just -l

# format with ruff
fmt:
    uv run ruff check --fix
    uv run ruff format

# lint with ruff and type-check with ty
lint:
    uv run ruff check
    uv run ruff format --check
    uv run ty check

# run tests with pytest and report coverage
test:
    uv run coverage run -m pytest
    uv run coverage report


================================================
FILE: pyproject.toml
================================================
[project]
name = "nb-clean"
version = "4.0.1"
description = "Clean Jupyter notebooks for versioning"
authors = [{ name = "Scott Stevenson", email = "scott@stevenson.io" }]
readme = "README.md"
license = "ISC"
license-files = ["LICENSE"]
requires-python = ">=3.10"
keywords = ["jupyter", "notebook", "clean", "filter", "git"]
classifiers = [
  "Development Status :: 5 - Production/Stable",
  "Intended Audience :: Science/Research",
  "Natural Language :: English",
]
dependencies = ["nbformat>=5.9.2"]

[project.urls]
Homepage = "https://github.com/srstevenson/nb-clean"
Repository = "https://github.com/srstevenson/nb-clean"
Issues = "https://github.com/srstevenson/nb-clean/issues"

[project.scripts]
nb-clean = "nb_clean.cli:main"

[dependency-groups]
dev = [
  "coverage>=7.6.10",
  "pytest>=7.2.1",
  "pytest-mock>=3.11.1",
  "ruff>=0.1.6",
  "ty>=0.0.19",
  "typing-extensions>=4.14.1",
]

[build-system]
requires = ["uv_build>=0.7.19,<0.12"]
build-backend = "uv_build"

[tool.coverage.report]
exclude_also = ["if __name__ == .__main__.:", "if TYPE_CHECKING:"]

[tool.ruff]
target-version = "py310"

[tool.ruff.format]
docstring-code-format = true
skip-magic-trailing-comma = true

[tool.ruff.lint]
select = ["ALL"]
ignore = [
  "COM812",  # Trailing comma missing
  "C901",    # Function is too complex
  "E501",    # Line too long
  "PLR0912", # Too many branches
  "PLR0913", # Too many arguments in function definition
  "PLR2004", # Magic value used in comparison
  "S603",    # subprocess call: check for execution of untrusted input
  "S607",    # Starting a process with a partial executable path
  "T201",    # print found
]

[tool.ruff.lint.flake8-tidy-imports]
ban-relative-imports = "all"

[tool.ruff.lint.isort]
split-on-trailing-comma = false

[tool.ruff.lint.per-file-ignores]
"tests/**.py" = [
  "D",      # pydocstyle
  "INP001", # Implicit namespace package
  "S101",   # Use of assert detected
]

[tool.ruff.lint.pydocstyle]
convention = "google"

[tool.ty.rules]
all = "error"


================================================
FILE: src/nb_clean/__init__.py
================================================
"""Clean Jupyter notebooks of execution counts, metadata, and outputs."""

from __future__ import annotations

import contextlib
import subprocess
from pathlib import Path
from typing import TYPE_CHECKING, Any, Final, cast

if TYPE_CHECKING:
    from collections.abc import Collection

    import nbformat
    from typing_extensions import Self

VERSION: Final = "4.0.1"
GIT_ATTRIBUTES_LINE: Final = "*.ipynb filter=nb-clean"


class GitProcessError(Exception):
    """Exception for errors executing Git."""

    def __init__(self: Self, message: str, return_code: int) -> None:
        """Exception for errors executing Git.

        Args:
            message: Error message.
            return_code: Return code.
        """
        super().__init__(message)
        self.message: str = message
        self.return_code: int = return_code


def git(*args: str) -> str:
    """Execute a Git subcommand with the provided arguments.

    Args:
        *args: Git subcommand and arguments to execute.

    Returns:
        Standard output from the Git command, stripped of whitespace.

    Raises:
        GitProcessError: If the Git command fails with a non-zero exit code.

    Examples:
        >>> git("rev-parse", "--git-dir")
        '.git'
    """
    try:
        process = subprocess.run(["git", *args], capture_output=True, check=True)
    except subprocess.CalledProcessError as exc:
        raise GitProcessError(exc.stderr.decode(), exc.returncode) from exc

    return process.stdout.decode().strip()


def git_attributes_path() -> Path:
    """Get path to the attributes file in the current Git repository.

    Returns:
        Path to the attributes file.

    Examples:
        >>> git_attributes_path()
        PosixPath('.git/info/attributes')
    """
    git_dir = git("rev-parse", "--git-dir")
    return Path(git_dir, "info", "attributes")


def add_git_filter(
    *,
    remove_empty_cells: bool = False,
    remove_all_notebook_metadata: bool = False,
    preserve_cell_metadata: Collection[str] | None = None,
    preserve_cell_outputs: bool = False,
    preserve_execution_counts: bool = False,
    preserve_notebook_metadata: bool = False,
) -> None:
    """Configure and add a Git filter to automatically clean Jupyter notebooks.

    This function sets up a Git filter that will automatically clean notebooks
    when they are staged for commit, removing execution counts, outputs, and
    metadata according to the specified options.

    Args:
        remove_empty_cells: If True, remove empty cells. Defaults to False.
        remove_all_notebook_metadata: If True, remove all notebook metadata. Defaults to False.
        preserve_cell_metadata: Controls cell metadata handling. If None, clean all cell metadata.
            If [], preserve all cell metadata.
            (This corresponds to the `-m` CLI option without specifying any fields.)
            If list of str, these are the cell metadata fields to preserve.
            Defaults to None.
        preserve_cell_outputs: If True, preserve cell outputs. Defaults to False.
        preserve_execution_counts: If True, preserve cell execution counts. Defaults to False.
        preserve_notebook_metadata: If True, preserve notebook metadata such as language version.
            Defaults to False.

    Raises:
        ValueError: If both preserve_notebook_metadata and remove_all_notebook_metadata are True.
        GitProcessError: If Git command execution fails.
    """
    if preserve_notebook_metadata and remove_all_notebook_metadata:
        msg = "`preserve_notebook_metadata` and `remove_all_notebook_metadata` cannot both be `True`"
        raise ValueError(msg)

    command = ["nb-clean", "clean"]

    if remove_empty_cells:
        command.append("--remove-empty-cells")

    if preserve_cell_metadata is not None:
        if len(preserve_cell_metadata) > 0:
            command.append(
                f"--preserve-cell-metadata {' '.join(preserve_cell_metadata)}"
            )
        else:
            command.append("--preserve-cell-metadata")

    if preserve_cell_outputs:
        command.append("--preserve-cell-outputs")

    if preserve_execution_counts:
        command.append("--preserve-execution-counts")

    if preserve_notebook_metadata:
        command.append("--preserve-notebook-metadata")

    if remove_all_notebook_metadata:
        command.append("--remove-all-notebook-metadata")

    git("config", "filter.nb-clean.clean", " ".join(command))

    attributes_path = git_attributes_path()

    if attributes_path.is_file() and GIT_ATTRIBUTES_LINE in attributes_path.read_text(
        encoding="UTF-8"
    ):
        return

    with attributes_path.open("a", encoding="UTF-8") as file:
        file.write(f"\n{GIT_ATTRIBUTES_LINE}\n")


def remove_git_filter() -> None:
    """Remove the nb-clean filter from the current Git repository.

    This function removes the nb-clean filter configuration from the Git repository
    and cleans up the attributes file by removing the filter directive.

    Raises:
        GitProcessError: If Git command execution fails.
    """
    attributes_path = git_attributes_path()

    if attributes_path.is_file():
        original_contents = attributes_path.read_text(encoding="UTF-8").split("\n")
        revised_contents = [
            line for line in original_contents if line != GIT_ATTRIBUTES_LINE
        ]
        attributes_path.write_text("\n".join(revised_contents), encoding="UTF-8")

    git("config", "--remove-section", "filter.nb-clean")


def check_notebook(
    notebook: nbformat.NotebookNode,
    *,
    remove_empty_cells: bool = False,
    remove_all_notebook_metadata: bool = False,
    preserve_cell_metadata: Collection[str] | None = None,
    preserve_cell_outputs: bool = False,
    preserve_execution_counts: bool = False,
    preserve_notebook_metadata: bool = False,
    filename: str = "notebook",
) -> bool:
    """Check notebook is clean of execution counts, metadata, and outputs.

    Args:
        notebook: The notebook to check.
        remove_empty_cells: If True, also check for the presence of empty cells. Defaults to False.
        remove_all_notebook_metadata: If True, also check for the presence of any notebook metadata.
            Defaults to False.
        preserve_cell_metadata: If None, check for all cell metadata.
            If [], don't check for any cell metadata.
            (This corresponds to the `-m` CLI option without specifying any fields.)
            If list of str, these are the cell metadata fields to ignore.
            Defaults to None.
        preserve_cell_outputs: If True, don't check for cell outputs. Defaults to False.
        preserve_execution_counts: If True, don't check for cell execution counts. Defaults to False.
        preserve_notebook_metadata: If True, don't check for notebook metadata such as language version.
            Defaults to False.
        filename: Notebook filename to use in log messages. Defaults to "notebook".

    Returns:
        True if the notebook is clean, False otherwise.

    Raises:
        ValueError: If both preserve_notebook_metadata and remove_all_notebook_metadata are True.
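
    A minimal standalone sketch of the code-cell checks, using a plain dict in
    place of a notebook cell. The helper `code_cell_issues` is hypothetical and
    not part of nb-clean; it collects findings in a list instead of printing
    them:

    ```python
    from typing import Any


    def code_cell_issues(
        cell: dict[str, Any],
        *,
        preserve_cell_outputs: bool = False,
        preserve_execution_counts: bool = False,
    ) -> list[str]:
        # Hypothetical helper mirroring the branching for code cells.
        issues: list[str] = []
        if not preserve_execution_counts and cell["execution_count"]:
            issues.append("execution count")
        if preserve_cell_outputs:
            # Outputs are allowed, but their own execution counts may not be.
            if not preserve_execution_counts:
                for output in cell["outputs"]:
                    if output.get("execution_count") is not None:
                        issues.append("output execution count")
        elif cell["outputs"]:
            issues.append("outputs")
        return issues


    cell = {
        "cell_type": "code",
        "execution_count": 3,
        "outputs": [{"output_type": "execute_result", "execution_count": 3}],
    }
    default_issues = code_cell_issues(cell)
    with_outputs = code_cell_issues(cell, preserve_cell_outputs=True)
    with_both = code_cell_issues(
        cell, preserve_cell_outputs=True, preserve_execution_counts=True
    )
    ```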
    """
    if preserve_notebook_metadata and remove_all_notebook_metadata:
        msg = "`preserve_notebook_metadata` and `remove_all_notebook_metadata` cannot both be `True`"
        raise ValueError(msg)

    is_clean = True

    for index, cell in enumerate(notebook.cells):
        prefix = f"{filename} cell {index}"

        if remove_empty_cells and not cell["source"]:
            print(f"{prefix}: empty cell")
            is_clean = False

        if preserve_cell_metadata is None:
            if cell["metadata"]:
                print(f"{prefix}: metadata")
                is_clean = False
        elif len(preserve_cell_metadata) > 0:
            for field in cell["metadata"]:
                if field not in preserve_cell_metadata:
                    print(f"{prefix}: metadata {field}")
                    is_clean = False

        if cell["cell_type"] == "code":
            if not preserve_execution_counts and cell["execution_count"]:
                print(f"{prefix}: execution count")
                is_clean = False

            if preserve_cell_outputs:
                if not preserve_execution_counts:
                    for output in cell["outputs"]:
                        if output.get("execution_count") is not None:
                            print(f"{prefix}: output execution count")
                            is_clean = False
            elif cell["outputs"]:
                print(f"{prefix}: outputs")
                is_clean = False

    if remove_all_notebook_metadata and cast("dict[str, Any]", notebook.metadata):
        print(f"{filename}: metadata")
        is_clean = False

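    if not preserve_notebook_metadata:
        # The lookup raises KeyError when no language version is recorded, in
        # which case the suppressed exception skips the report below.
        with contextlib.suppress(KeyError):
            notebook["metadata"]["language_info"]["version"]
            print(f"{filename} metadata: language_info.version")
            is_clean = False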

    return is_clean


def clean_notebook(
    notebook: nbformat.NotebookNode,
    *,
    remove_empty_cells: bool = False,
    remove_all_notebook_metadata: bool = False,
    preserve_cell_metadata: Collection[str] | None = None,
    preserve_cell_outputs: bool = False,
    preserve_execution_counts: bool = False,
    preserve_notebook_metadata: bool = False,
) -> nbformat.NotebookNode:
    """Clean notebook of execution counts, metadata, and outputs.

    Args:
        notebook: The notebook to clean.
        remove_empty_cells: If True, remove empty cells. Defaults to False.
        remove_all_notebook_metadata: If True, remove all notebook metadata. Defaults to False.
        preserve_cell_metadata: If None, clean all cell metadata.
            If [], preserve all cell metadata.
            (This corresponds to the `-m` CLI option without specifying any fields.)
            If list of str, these are the cell metadata fields to preserve.
            Defaults to None.
        preserve_cell_outputs: If True, preserve cell outputs. Defaults to False.
        preserve_execution_counts: If True, preserve cell execution counts. Defaults to False.
        preserve_notebook_metadata: If True, preserve notebook metadata such as language version.
            Defaults to False.

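    Returns:
        The cleaned notebook. This is the same object as `notebook`, which is modified in place.

    Raises:
        ValueError: If both preserve_notebook_metadata and remove_all_notebook_metadata are True.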
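
    A minimal standalone sketch of the three `preserve_cell_metadata` modes,
    using a plain dict in place of a cell's metadata. The helper
    `filter_cell_metadata` is hypothetical and not part of nb-clean; it only
    mirrors the branching performed in this function:

    ```python
    from collections.abc import Collection
    from typing import Any, Optional


    def filter_cell_metadata(
        metadata: dict[str, Any], preserve: Optional[Collection[str]]
    ) -> dict[str, Any]:
        # Hypothetical helper, for illustration only.
        if preserve is None:
            return {}  # option not given: drop all cell metadata
        if len(preserve) == 0:
            return dict(metadata)  # option given without fields: keep everything
        # Option given with field names: keep only the named fields.
        return {key: value for key, value in metadata.items() if key in preserve}


    metadata = {"tags": ["answer"], "special": "x", "nbclean": "test"}
    kept_none = filter_cell_metadata(metadata, None)
    kept_all = filter_cell_metadata(metadata, [])
    kept_tags = filter_cell_metadata(metadata, ["tags"])
    ```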
    """
    if preserve_notebook_metadata and remove_all_notebook_metadata:
        msg = "`preserve_notebook_metadata` and `remove_all_notebook_metadata` cannot both be `True`"
        raise ValueError(msg)

    if remove_empty_cells:
        notebook.cells = [cell for cell in notebook.cells if cell["source"]]

    for cell in notebook.cells:
        if preserve_cell_metadata is None:
            cell["metadata"] = {}
        elif len(preserve_cell_metadata) > 0:
            cell["metadata"] = {
                field: value
                for field, value in cell["metadata"].items()
                if field in preserve_cell_metadata
            }
        if cell["cell_type"] == "code":
            if not preserve_execution_counts:
                cell["execution_count"] = None
            if preserve_cell_outputs:
                if not preserve_execution_counts:
                    for output in cell["outputs"]:
                        if "execution_count" in output:
                            output["execution_count"] = None
            else:
                cell["outputs"] = []

    if remove_all_notebook_metadata:
        notebook.metadata = {}
    elif not preserve_notebook_metadata:
        with contextlib.suppress(KeyError):
            del notebook["metadata"]["language_info"]["version"]

    return notebook


================================================
FILE: src/nb_clean/__main__.py
================================================
"""Top-level script to run nb-clean."""

from nb_clean.cli import main

if __name__ == "__main__":
    main()


================================================
FILE: src/nb_clean/cli.py
================================================
"""Command line interface to nb-clean."""

from __future__ import annotations

import argparse
import os
import sys
from dataclasses import dataclass, field
from pathlib import Path
from typing import TYPE_CHECKING, NoReturn, TextIO, cast

import nbformat

import nb_clean

if TYPE_CHECKING:
    from collections.abc import Collection, Iterable, Sequence


@dataclass
class Args(argparse.Namespace):
    """Arguments parsed from the command line."""

    subcommand: str = ""
    inputs: list[Path] = field(default_factory=list)
    remove_empty_cells: bool = False
    remove_all_notebook_metadata: bool = False
    preserve_cell_metadata: list[str] | None = None
    preserve_cell_outputs: bool = False
    preserve_execution_counts: bool = False
    preserve_notebook_metadata: bool = False


def expand_directories(paths: Iterable[Path]) -> list[Path]:
    """Expand paths to directories into paths to notebooks contained within.

    Args:
        paths: Paths to expand, including directories.

    Returns:
        Paths with directories expanded into notebooks contained within.
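
    A minimal standalone sketch of the recursive notebook search, using a
    temporary directory with made-up file names. It calls `Path.rglob`
    directly rather than this function:

    ```python
    import tempfile
    from pathlib import Path

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "sub").mkdir()
        (root / "a.ipynb").touch()
        (root / "sub" / "b.ipynb").touch()
        (root / "sub" / "notes.txt").touch()

        # Only files matching *.ipynb are picked up, at any depth.
        found = sorted(path.name for path in root.rglob("*.ipynb"))
    ```

    Because this function gathers results in a set, the order of the returned
    list is not guaranteed; callers that need a stable order can sort it.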
    """
    expanded: set[Path] = set()
    for path in paths:
        if path.is_dir():
            expanded.update(path.rglob("*.ipynb"))
        else:
            expanded.add(path)
    return list(expanded)


def exit_with_error(message: str, return_code: int) -> NoReturn:
    """Print an error message to standard error and exit.

    Args:
        message: Error message to print to standard error.
        return_code: Return code with which to exit.
    """
    print(f"nb-clean: error: {message}", file=sys.stderr)
    sys.exit(return_code)


def add_filter(
    *,
    remove_empty_cells: bool,
    remove_all_notebook_metadata: bool,
    preserve_cell_metadata: Collection[str] | None,
    preserve_cell_outputs: bool,
    preserve_execution_counts: bool,
    preserve_notebook_metadata: bool,
) -> None:
    """Add the nb-clean filter to the current Git repository.

    Args:
        remove_empty_cells: Configure the filter to remove empty cells.
        remove_all_notebook_metadata: Configure the filter to remove all notebook metadata.
        preserve_cell_metadata: Configure the filter to preserve cell metadata.
        preserve_cell_outputs: Configure the filter to preserve cell outputs.
        preserve_execution_counts: Configure the filter to preserve cell execution counts.
        preserve_notebook_metadata: Configure the filter to preserve notebook metadata such as language version.
    """
    try:
        nb_clean.add_git_filter(
            remove_empty_cells=remove_empty_cells,
            remove_all_notebook_metadata=remove_all_notebook_metadata,
            preserve_cell_metadata=preserve_cell_metadata,
            preserve_cell_outputs=preserve_cell_outputs,
            preserve_execution_counts=preserve_execution_counts,
            preserve_notebook_metadata=preserve_notebook_metadata,
        )
    except nb_clean.GitProcessError as exc:
        exit_with_error(exc.message, exc.return_code)


def remove_filter() -> None:
    """Remove the nb-clean filter from the current Git repository.

    This function removes the nb-clean filter configuration and cleans up
    the Git attributes file. If Git command execution fails, the program
    will exit with an appropriate error code.
    """
    try:
        nb_clean.remove_git_filter()
    except nb_clean.GitProcessError as exc:
        exit_with_error(exc.message, exc.return_code)


def check(
    inputs: Iterable[Path],
    *,
    remove_empty_cells: bool,
    remove_all_notebook_metadata: bool,
    preserve_cell_metadata: Collection[str] | None,
    preserve_cell_outputs: bool,
    preserve_execution_counts: bool,
    preserve_notebook_metadata: bool,
) -> None:
    """Check notebooks are clean of execution counts, metadata, and outputs.

    Args:
        inputs: Input notebook paths to check, empty list for stdin.
        remove_empty_cells: Check for the presence of empty cells.
        remove_all_notebook_metadata: Check for any notebook metadata.
        preserve_cell_metadata: Don't check for cell metadata.
        preserve_cell_outputs: Don't check for cell outputs.
        preserve_execution_counts: Don't check for cell execution counts.
        preserve_notebook_metadata: Don't check for notebook metadata such as language version.
    """
    if inputs:
        processed_inputs: list[Path] | list[TextIO] = expand_directories(inputs)
    else:
        processed_inputs = [sys.stdin]

    all_clean = True
    for input_ in processed_inputs:
        name = "stdin" if input_ is sys.stdin else os.fspath(cast("Path", input_))

        notebook = cast(
            "nbformat.NotebookNode",
            nbformat.read(input_, as_version=nbformat.NO_CONVERT),
        )
        is_clean = nb_clean.check_notebook(
            notebook,
            remove_empty_cells=remove_empty_cells,
            remove_all_notebook_metadata=remove_all_notebook_metadata,
            preserve_cell_metadata=preserve_cell_metadata,
            preserve_cell_outputs=preserve_cell_outputs,
            preserve_execution_counts=preserve_execution_counts,
            preserve_notebook_metadata=preserve_notebook_metadata,
            filename=name,
        )
        all_clean &= is_clean

    if not all_clean:
        sys.exit(1)


def clean(
    inputs: Iterable[Path],
    *,
    remove_empty_cells: bool,
    remove_all_notebook_metadata: bool,
    preserve_cell_metadata: Collection[str] | None,
    preserve_cell_outputs: bool,
    preserve_execution_counts: bool,
    preserve_notebook_metadata: bool,
) -> None:
    """Clean notebooks of execution counts, metadata, and outputs.

    Args:
        inputs: Input notebook paths to clean, empty list for stdin.
        remove_empty_cells: Remove empty cells.
        remove_all_notebook_metadata: Remove all notebook metadata.
        preserve_cell_metadata: Don't clean cell metadata.
        preserve_cell_outputs: Don't clean cell outputs.
        preserve_execution_counts: Don't clean cell execution counts.
        preserve_notebook_metadata: Don't clean notebook metadata such as language version.
    """
    if inputs:
        processed_inputs: list[Path] | list[TextIO] = expand_directories(inputs)
        outputs = processed_inputs
    else:
        processed_inputs = [sys.stdin]
        outputs = [sys.stdout]

    for input_, output in zip(processed_inputs, outputs, strict=True):
        notebook = cast(
            "nbformat.NotebookNode",
            nbformat.read(input_, as_version=nbformat.NO_CONVERT),
        )

        notebook = nb_clean.clean_notebook(
            notebook,
            remove_empty_cells=remove_empty_cells,
            remove_all_notebook_metadata=remove_all_notebook_metadata,
            preserve_cell_metadata=preserve_cell_metadata,
            preserve_cell_outputs=preserve_cell_outputs,
            preserve_execution_counts=preserve_execution_counts,
            preserve_notebook_metadata=preserve_notebook_metadata,
        )
        nbformat.write(notebook, output)


def parse_args(args: Sequence[str]) -> Args:
    """Parse command line arguments.

    Args:
        args: Command line arguments to parse.

    Returns:
        Parsed command line arguments.
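
    The `-m`/`--preserve-cell-metadata` options combine `nargs="*"` with
    `default=None`, which yields three distinguishable states. A minimal
    standalone sketch with a throwaway parser (not the one built by this
    function):

    ```python
    import argparse

    # Throwaway parser holding only the -m option, configured as in this module.
    parser = argparse.ArgumentParser()
    parser.add_argument("-m", "--preserve-cell-metadata", default=None, nargs="*")

    absent = parser.parse_args([]).preserve_cell_metadata
    bare = parser.parse_args(["-m"]).preserve_cell_metadata
    named = parser.parse_args(["-m", "tags", "special"]).preserve_cell_metadata
    ```

    Here `absent` is None (clean all cell metadata), `bare` is an empty list
    (preserve all of it), and `named` lists the fields to preserve.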
    """
    parser = argparse.ArgumentParser(description=__doc__)
    subparsers = parser.add_subparsers(dest="subcommand", required=True)

    subparsers.add_parser("version", help="print version number")

    add_filter_parser = subparsers.add_parser(
        "add-filter", help="add Git filter to clean notebooks before staging"
    )
    add_filter_parser.add_argument(
        "-e", "--remove-empty-cells", action="store_true", help="remove empty cells"
    )
    add_filter_parser.add_argument(
        "-M",
        "--remove-all-notebook-metadata",
        action="store_true",
        help="remove all notebook metadata",
    )
    add_filter_parser.add_argument(
        "-m",
        "--preserve-cell-metadata",
        default=None,
        nargs="*",
        help="preserve cell metadata, all unless fields are specified",
    )
    add_filter_parser.add_argument(
        "-o",
        "--preserve-cell-outputs",
        action="store_true",
        help="preserve cell outputs",
    )
    add_filter_parser.add_argument(
        "-c",
        "--preserve-execution-counts",
        action="store_true",
        help="preserve cell execution counts",
    )
    add_filter_parser.add_argument(
        "-n",
        "--preserve-notebook-metadata",
        action="store_true",
        help="preserve notebook metadata",
    )

    subparsers.add_parser(
        "remove-filter", help="remove Git filter that cleans notebooks before staging"
    )

    check_parser = subparsers.add_parser(
        "check",
        help=(
            "check a notebook is clean of cell execution counts, metadata, and outputs"
        ),
    )
    check_parser.add_argument(
        "inputs", nargs="*", metavar="PATH", type=Path, help="input path"
    )
    check_parser.add_argument(
        "-e", "--remove-empty-cells", action="store_true", help="check for empty cells"
    )
    check_parser.add_argument(
        "-M",
        "--remove-all-notebook-metadata",
        action="store_true",
        help="check for any notebook metadata",
    )
    check_parser.add_argument(
        "-m",
        "--preserve-cell-metadata",
        default=None,
        nargs="*",
        help="don't check for cell metadata, all unless fields are specified",
    )
    check_parser.add_argument(
        "-o",
        "--preserve-cell-outputs",
        action="store_true",
        help="don't check for cell outputs",
    )
    check_parser.add_argument(
        "-c",
        "--preserve-execution-counts",
        action="store_true",
        help="don't check for cell execution counts",
    )
    check_parser.add_argument(
        "-n",
        "--preserve-notebook-metadata",
        action="store_true",
        help="don't check for notebook metadata such as language version",
    )

    clean_parser = subparsers.add_parser(
        "clean", help="clean notebook of cell execution counts, metadata, and outputs"
    )
    clean_parser.add_argument(
        "inputs", nargs="*", metavar="PATH", type=Path, help="input path"
    )
    clean_parser.add_argument(
        "-e", "--remove-empty-cells", action="store_true", help="remove empty cells"
    )
    clean_parser.add_argument(
        "-M",
        "--remove-all-notebook-metadata",
        action="store_true",
        help="remove all notebook metadata",
    )
    clean_parser.add_argument(
        "-m",
        "--preserve-cell-metadata",
        default=None,
        nargs="*",
        help="preserve cell metadata, all unless fields are specified",
    )
    clean_parser.add_argument(
        "-o",
        "--preserve-cell-outputs",
        action="store_true",
        help="preserve cell outputs",
    )
    clean_parser.add_argument(
        "-c",
        "--preserve-execution-counts",
        action="store_true",
        help="preserve cell execution counts",
    )
    clean_parser.add_argument(
        "-n",
        "--preserve-notebook-metadata",
        action="store_true",
        help="preserve notebook metadata",
    )

    return parser.parse_args(args, namespace=Args())


def main() -> None:  # pragma: no cover
    """Command line entrypoint for nb-clean.

    Parses command line arguments and dispatches to the appropriate
    subcommand handler (version, add-filter, remove-filter, check, or clean).
    """
    args = parse_args(sys.argv[1:])

    if args.subcommand == "version":
        print(f"nb-clean {nb_clean.VERSION}")
    elif args.subcommand == "add-filter":
        add_filter(
            remove_empty_cells=args.remove_empty_cells,
            remove_all_notebook_metadata=args.remove_all_notebook_metadata,
            preserve_cell_metadata=args.preserve_cell_metadata,
            preserve_cell_outputs=args.preserve_cell_outputs,
            preserve_execution_counts=args.preserve_execution_counts,
            preserve_notebook_metadata=args.preserve_notebook_metadata,
        )
    elif args.subcommand == "remove-filter":
        remove_filter()
    elif args.subcommand == "check":
        check(
            args.inputs,
            remove_empty_cells=args.remove_empty_cells,
            remove_all_notebook_metadata=args.remove_all_notebook_metadata,
            preserve_cell_metadata=args.preserve_cell_metadata,
            preserve_cell_outputs=args.preserve_cell_outputs,
            preserve_execution_counts=args.preserve_execution_counts,
            preserve_notebook_metadata=args.preserve_notebook_metadata,
        )
    elif args.subcommand == "clean":
        clean(
            args.inputs,
            remove_empty_cells=args.remove_empty_cells,
            remove_all_notebook_metadata=args.remove_all_notebook_metadata,
            preserve_cell_metadata=args.preserve_cell_metadata,
            preserve_cell_outputs=args.preserve_cell_outputs,
            preserve_execution_counts=args.preserve_execution_counts,
            preserve_notebook_metadata=args.preserve_notebook_metadata,
        )
    else:
        # This should never happen due to argparse validation, but be defensive
        exit_with_error(f"Unknown subcommand: {args.subcommand}", 1)


================================================
FILE: src/nb_clean/py.typed
================================================


================================================
FILE: tests/conftest.py
================================================
from pathlib import Path
from typing import Final, cast

import nbformat
import pytest

NOTEBOOKS_DIR: Final = Path(__file__).parent / "notebooks"


def _read_notebook(filename: str) -> nbformat.NotebookNode:
    return cast(
        "nbformat.NotebookNode",
        nbformat.read(NOTEBOOKS_DIR / filename, as_version=nbformat.NO_CONVERT),
    )


@pytest.fixture
def dirty_notebook() -> nbformat.NotebookNode:
    return _read_notebook("dirty.ipynb")


@pytest.fixture
def dirty_notebook_with_version() -> nbformat.NotebookNode:
    return _read_notebook("dirty_with_version.ipynb")


@pytest.fixture
def clean_notebook() -> nbformat.NotebookNode:
    return _read_notebook("clean.ipynb")


@pytest.fixture
def clean_notebook_with_notebook_metadata() -> nbformat.NotebookNode:
    return _read_notebook("clean_with_notebook_metadata.ipynb")


@pytest.fixture
def clean_notebook_without_empty_cells() -> nbformat.NotebookNode:
    return _read_notebook("clean_without_empty_cells.ipynb")


@pytest.fixture
def clean_notebook_with_empty_cells() -> nbformat.NotebookNode:
    return _read_notebook("clean_with_empty_cells.ipynb")


@pytest.fixture
def clean_notebook_with_counts() -> nbformat.NotebookNode:
    return _read_notebook("clean_with_counts.ipynb")


@pytest.fixture
def clean_notebook_with_cell_metadata() -> nbformat.NotebookNode:
    return _read_notebook("clean_with_cell_metadata.ipynb")


@pytest.fixture
def clean_notebook_with_tags_metadata() -> nbformat.NotebookNode:
    return _read_notebook("clean_with_tags_metadata.ipynb")


@pytest.fixture
def clean_notebook_with_tags_special_metadata() -> nbformat.NotebookNode:
    return _read_notebook("clean_with_tags_special_metadata.ipynb")


@pytest.fixture
def clean_notebook_with_outputs() -> nbformat.NotebookNode:
    return _read_notebook("clean_with_outputs.ipynb")


@pytest.fixture
def clean_notebook_with_outputs_with_counts() -> nbformat.NotebookNode:
    return _read_notebook("clean_with_outputs_with_counts.ipynb")


@pytest.fixture
def clean_notebook_without_notebook_metadata() -> nbformat.NotebookNode:
    return _read_notebook("clean_without_notebook_metadata.ipynb")


================================================
FILE: tests/notebooks/clean.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/clean_with_cell_metadata.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "nbclean": "test",
    "special": "my special metadata",
    "tags": [
     "before-import",
     "answer"
    ]
   },
   "outputs": [],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/clean_with_counts.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/clean_with_empty_cells.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/clean_with_notebook_metadata.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/clean_with_outputs.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Hello, world'"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello, world\n"
     ]
    }
   ],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/clean_with_outputs_with_counts.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Hello, world'"
      ]
     },
     "execution_count": 0,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello, world\n"
     ]
    }
   ],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/clean_with_tags_metadata.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "before-import",
     "answer"
    ]
   },
   "outputs": [],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/clean_with_tags_special_metadata.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "special": "my special metadata",
    "tags": [
     "before-import",
     "answer"
    ]
   },
   "outputs": [],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/clean_without_empty_cells.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(text)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/clean_without_notebook_metadata.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/dirty.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "nbclean": "test",
    "tags": [
     "before-import",
     "answer"
    ],
    "special": "my special metadata"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Hello, world'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello, world\n"
     ]
    }
   ],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/notebooks/dirty_empty_octave.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "10cfba24-bab5-47a0-9ab8-5d1fc01f1f58",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Octave",
   "language": "octave",
   "name": "octave"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}


================================================
FILE: tests/notebooks/dirty_with_version.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "nbclean": "test"
   },
   "outputs": [],
   "source": [
    "text = \"Hello, world\"\n",
    "text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "nbclean": "test",
    "tags": [
     "example-tag",
     "another-tag"
    ],
    "special": "my special metadata"
   },
   "outputs": [],
   "source": [
    "print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:Python3] *",
   "language": "python",
   "name": "conda-env-Python3-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: tests/test_check_notebook.py
================================================
from __future__ import annotations

from typing import TYPE_CHECKING, cast

import pytest

import nb_clean

if TYPE_CHECKING:
    from collections.abc import Collection

    import nbformat


@pytest.mark.parametrize(
    ("notebook_name", "is_clean"),
    [
        ("clean_notebook", True),
        ("dirty_notebook", False),
        ("dirty_notebook_with_version", False),
    ],
)
def test_check_notebook(
    notebook_name: str, *, is_clean: bool, request: pytest.FixtureRequest
) -> None:
    notebook = cast("nbformat.NotebookNode", request.getfixturevalue(notebook_name))
    assert nb_clean.check_notebook(notebook) is is_clean


@pytest.mark.parametrize("preserve_notebook_metadata", [True, False])
def test_check_notebook_preserve_notebook_metadata(
    clean_notebook_with_notebook_metadata: nbformat.NotebookNode,
    *,
    preserve_notebook_metadata: bool,
) -> None:
    assert (
        nb_clean.check_notebook(
            clean_notebook_with_notebook_metadata,
            preserve_notebook_metadata=preserve_notebook_metadata,
        )
        is preserve_notebook_metadata
    )


@pytest.mark.parametrize("remove_empty_cells", [True, False])
def test_check_notebook_remove_empty_cells(
    clean_notebook_with_empty_cells: nbformat.NotebookNode, *, remove_empty_cells: bool
) -> None:
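    # The fixture contains an empty cell, so it only counts as clean when
    # empty cells are not being removed.
    output = nb_clean.check_notebook(
        clean_notebook_with_empty_cells, remove_empty_cells=remove_empty_cells
    )
    assert output is not remove_empty_cells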


@pytest.mark.parametrize(
    "preserve_cell_metadata",
    [
        [],
        ["tags"],
        ["other"],
        ["tags", "special"],
        ["nbformat", "tags", "special"],
        None,
    ],
)
def test_check_notebook_preserve_cell_metadata(
    clean_notebook_with_cell_metadata: nbformat.NotebookNode,
    preserve_cell_metadata: Collection[str] | None,
) -> None:
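    # `None` preserves no cell metadata, `[]` preserves every field, and a
    # non-empty collection preserves only the named fields. The fixture is
    # clean only if every field it carries is preserved.
    expected = (preserve_cell_metadata is not None) and (
        preserve_cell_metadata == []
        or {"tags", "special", "nbclean"}.issubset(preserve_cell_metadata)
    )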
    output = nb_clean.check_notebook(
        clean_notebook_with_cell_metadata, preserve_cell_metadata=preserve_cell_metadata
    )
    assert output is expected


@pytest.mark.parametrize(
    "preserve_cell_metadata",
    [
        [],
        ["tags"],
        ["other"],
        ["tags", "special"],
        ["nbformat", "tags", "special"],
        None,
    ],
)
def test_check_notebook_preserve_cell_metadata_tags(
    clean_notebook_with_tags_metadata: nbformat.NotebookNode,
    preserve_cell_metadata: Collection[str] | None,
) -> None:
    expected = (preserve_cell_metadata is not None) and (
        preserve_cell_metadata == [] or {"tags"}.issubset(preserve_cell_metadata)
    )
    output = nb_clean.check_notebook(
        clean_notebook_with_tags_metadata, preserve_cell_metadata=preserve_cell_metadata
    )
    assert output is expected


@pytest.mark.parametrize(
    "preserve_cell_metadata",
    [
        [],
        ["tags"],
        ["other"],
        ["tags", "special"],
        ["nbformat", "tags", "special"],
        None,
    ],
)
def test_check_notebook_preserve_cell_metadata_tags_special(
    clean_notebook_with_tags_special_metadata: nbformat.NotebookNode,
    preserve_cell_metadata: Collection[str] | None,
) -> None:
    expected = (preserve_cell_metadata is not None) and (
        preserve_cell_metadata == []
        or {"tags", "special"}.issubset(preserve_cell_metadata)
    )
    output = nb_clean.check_notebook(
        clean_notebook_with_tags_special_metadata,
        preserve_cell_metadata=preserve_cell_metadata,
    )
    assert output is expected


@pytest.mark.parametrize(
    ("notebook_name", "preserve_cell_outputs", "is_clean"),
    [
        ("clean_notebook_with_outputs", True, True),
        ("clean_notebook_with_outputs", False, False),
        ("clean_notebook_with_outputs_with_counts", True, False),
    ],
)
def test_check_notebook_preserve_outputs(
    notebook_name: str,
    *,
    preserve_cell_outputs: bool,
    is_clean: bool,
    request: pytest.FixtureRequest,
) -> None:
    notebook = cast("nbformat.NotebookNode", request.getfixturevalue(notebook_name))
    output = nb_clean.check_notebook(
        notebook, preserve_cell_outputs=preserve_cell_outputs
    )
    assert output is is_clean


@pytest.mark.parametrize(
    ("notebook_name", "preserve_execution_counts", "is_clean"),
    [
        ("clean_notebook_with_counts", True, True),
        ("clean_notebook_with_counts", False, False),
    ],
)
def test_check_notebook_preserve_execution_counts(
    notebook_name: str,
    *,
    preserve_execution_counts: bool,
    is_clean: bool,
    request: pytest.FixtureRequest,
) -> None:
    notebook = cast("nbformat.NotebookNode", request.getfixturevalue(notebook_name))
    output = nb_clean.check_notebook(
        notebook, preserve_execution_counts=preserve_execution_counts
    )
    assert output is is_clean


@pytest.mark.parametrize(
    ("notebook_name", "remove_all_notebook_metadata", "is_clean"),
    [
        ("clean_notebook_with_notebook_metadata", True, False),
        ("clean_notebook_with_notebook_metadata", False, False),
        ("clean_notebook_without_notebook_metadata", True, True),
        ("clean_notebook_without_notebook_metadata", False, True),
        ("clean_notebook", True, False),
        ("clean_notebook", False, True),
    ],
)
def test_check_notebook_remove_all_notebook_metadata(
    notebook_name: str,
    *,
    remove_all_notebook_metadata: bool,
    is_clean: bool,
    request: pytest.FixtureRequest,
) -> None:
    # The `("clean_notebook_with_notebook_metadata", False)` case expects
    # `False` rather than `True`: the fixture contains `language_info.version`,
    # which is flagged as unclean when `preserve_notebook_metadata=False`.
    notebook = cast("nbformat.NotebookNode", request.getfixturevalue(notebook_name))
    assert (
        nb_clean.check_notebook(
            notebook, remove_all_notebook_metadata=remove_all_notebook_metadata
        )
        is is_clean
    )


def test_check_notebook_exclusive_arguments(
    dirty_notebook: nbformat.NotebookNode,
) -> None:
    with pytest.raises(
        ValueError,
        match="`preserve_notebook_metadata` and `remove_all_notebook_metadata` cannot both be `True`",
    ):
        nb_clean.check_notebook(
            dirty_notebook,
            remove_all_notebook_metadata=True,
            preserve_notebook_metadata=True,
        )


================================================
FILE: tests/test_clean_notebook.py
================================================
from collections.abc import Collection
from typing import cast

import nbformat
import pytest

import nb_clean


def test_clean_notebook(
    dirty_notebook: nbformat.NotebookNode, clean_notebook: nbformat.NotebookNode
) -> None:
    assert nb_clean.clean_notebook(dirty_notebook) == clean_notebook


@pytest.mark.parametrize(
    ("preserve_notebook_metadata", "expected_output_name"),
    [(True, "clean_notebook_with_notebook_metadata"), (False, "clean_notebook")],
)
def test_clean_notebook_with_notebook_metadata(
    clean_notebook_with_notebook_metadata: nbformat.NotebookNode,
    *,
    preserve_notebook_metadata: bool,
    expected_output_name: str,
    request: pytest.FixtureRequest,
) -> None:
    expected_output = cast(
        "nbformat.NotebookNode", request.getfixturevalue(expected_output_name)
    )
    assert (
        nb_clean.clean_notebook(
            clean_notebook_with_notebook_metadata,
            preserve_notebook_metadata=preserve_notebook_metadata,
        )
        == expected_output
    )


def test_clean_notebook_remove_empty_cells(
    clean_notebook_with_empty_cells: nbformat.NotebookNode,
    clean_notebook_without_empty_cells: nbformat.NotebookNode,
) -> None:
    assert (
        nb_clean.clean_notebook(
            clean_notebook_with_empty_cells, remove_empty_cells=True
        )
        == clean_notebook_without_empty_cells
    )


@pytest.mark.parametrize(
    "preserve_cell_metadata",
    [[], ["nbclean", "tags", "special"], ["nbclean", "tags", "special", "toomany"]],
)
def test_clean_notebook_preserve_cell_metadata(
    dirty_notebook: nbformat.NotebookNode,
    clean_notebook_with_cell_metadata: nbformat.NotebookNode,
    preserve_cell_metadata: Collection[str],
) -> None:
    assert (
        nb_clean.clean_notebook(
            dirty_notebook, preserve_cell_metadata=preserve_cell_metadata
        )
        == clean_notebook_with_cell_metadata
    )


@pytest.mark.parametrize("preserve_cell_metadata", [["tags"], ["tags", "toomany"]])
def test_clean_notebook_preserve_cell_metadata_tags(
    dirty_notebook: nbformat.NotebookNode,
    clean_notebook_with_tags_metadata: nbformat.NotebookNode,
    preserve_cell_metadata: Collection[str],
) -> None:
    assert (
        nb_clean.clean_notebook(
            dirty_notebook, preserve_cell_metadata=preserve_cell_metadata
        )
        == clean_notebook_with_tags_metadata
    )


@pytest.mark.parametrize(
    "preserve_cell_metadata", [["tags", "special"], ["tags", "special", "toomany"]]
)
def test_clean_notebook_preserve_cell_metadata_tags_special(
    dirty_notebook: nbformat.NotebookNode,
    clean_notebook_with_tags_special_metadata: nbformat.NotebookNode,
    preserve_cell_metadata: Collection[str],
) -> None:
    assert (
        nb_clean.clean_notebook(
            dirty_notebook, preserve_cell_metadata=preserve_cell_metadata
        )
        == clean_notebook_with_tags_special_metadata
    )


def test_clean_notebook_preserve_outputs(
    dirty_notebook: nbformat.NotebookNode,
    clean_notebook_with_outputs: nbformat.NotebookNode,
) -> None:
    assert (
        nb_clean.clean_notebook(dirty_notebook, preserve_cell_outputs=True)
        == clean_notebook_with_outputs
    )


def test_clean_notebook_preserve_execution_counts(
    dirty_notebook: nbformat.NotebookNode,
    clean_notebook_with_counts: nbformat.NotebookNode,
) -> None:
    assert (
        nb_clean.clean_notebook(dirty_notebook, preserve_execution_counts=True)
        == clean_notebook_with_counts
    )


def test_clean_notebook_remove_all_notebook_metadata(
    dirty_notebook: nbformat.NotebookNode,
    clean_notebook_without_notebook_metadata: nbformat.NotebookNode,
) -> None:
    assert (
        nb_clean.clean_notebook(dirty_notebook, remove_all_notebook_metadata=True)
        == clean_notebook_without_notebook_metadata
    )


def test_clean_notebook_exclusive_arguments(
    dirty_notebook: nbformat.NotebookNode,
) -> None:
    with pytest.raises(
        ValueError,
        match="`preserve_notebook_metadata` and `remove_all_notebook_metadata` cannot both be `True`",
    ):
        nb_clean.clean_notebook(
            dirty_notebook,
            remove_all_notebook_metadata=True,
            preserve_notebook_metadata=True,
        )


================================================
FILE: tests/test_cli.py
================================================
from __future__ import annotations

import io
import os
import sys
from pathlib import Path
from typing import TYPE_CHECKING, cast

import nbformat
import pytest

import nb_clean
import nb_clean.cli

if TYPE_CHECKING:
    from collections.abc import Collection, Iterable

    from pytest import CaptureFixture  # noqa: PT013


def test_expand_directories_with_files() -> None:
    paths = [Path("tests/notebooks/dirty.ipynb")]
    assert nb_clean.cli.expand_directories(paths) == paths


def test_expand_directories_recursively() -> None:
    input_paths = [Path("tests")]
    expanded_paths = nb_clean.cli.expand_directories(input_paths)
    assert len(expanded_paths) > len(input_paths)
    assert all(path.is_file() and path.suffix == ".ipynb" for path in expanded_paths)


def test_exit_with_error(capsys: CaptureFixture[str]) -> None:
    with pytest.raises(SystemExit) as exc:
        nb_clean.cli.exit_with_error("error message", 42)
    assert exc.value.code == 42
    assert capsys.readouterr().err == "nb-clean: error: error message\n"


def test_add_filter_dispatch(monkeypatch: pytest.MonkeyPatch) -> None:
    captured: dict[str, object] = {}

    def fake_add_git_filter(**kwargs: object) -> None:
        captured.update(kwargs)

    monkeypatch.setattr(nb_clean, "add_git_filter", fake_add_git_filter)

    argv = ["nb-clean", "add-filter", "-e", "-n"]
    monkeypatch.setattr(sys, "argv", argv)
    nb_clean.cli.main()

    assert captured == {
        "remove_empty_cells": True,
        "remove_all_notebook_metadata": False,
        "preserve_cell_metadata": None,
        "preserve_cell_outputs": False,
        "preserve_execution_counts": False,
        "preserve_notebook_metadata": True,
    }


def test_add_filter_remove_all_notebook_metadata_dispatch(
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    captured: dict[str, object] = {}

    def fake_add_git_filter(**kwargs: object) -> None:
        captured.update(kwargs)

    monkeypatch.setattr(nb_clean, "add_git_filter", fake_add_git_filter)

    argv = ["nb-clean", "add-filter", "-e", "-M"]
    monkeypatch.setattr(sys, "argv", argv)
    nb_clean.cli.main()

    assert captured == {
        "remove_empty_cells": True,
        "remove_all_notebook_metadata": True,
        "preserve_cell_metadata": None,
        "preserve_cell_outputs": False,
        "preserve_execution_counts": False,
        "preserve_notebook_metadata": False,
    }


def test_add_filter_failure_dispatch(
    capsys: CaptureFixture[str], monkeypatch: pytest.MonkeyPatch
) -> None:
    def fake_add_git_filter(**_kwargs: object) -> None:
        raise nb_clean.GitProcessError(message="error message", return_code=42)

    monkeypatch.setattr(nb_clean, "add_git_filter", fake_add_git_filter)
    monkeypatch.setattr(sys, "argv", ["nb-clean", "add-filter", "-e", "-M"])

    with pytest.raises(SystemExit) as exc:
        nb_clean.cli.main()
    assert exc.value.code == 42
    assert capsys.readouterr().err == "nb-clean: error: error message\n"


def test_remove_filter_dispatch(monkeypatch: pytest.MonkeyPatch) -> None:
    called = {"value": False}

    def fake_remove_git_filter() -> None:
        called["value"] = True

    monkeypatch.setattr(nb_clean, "remove_git_filter", fake_remove_git_filter)
    monkeypatch.setattr(sys, "argv", ["nb-clean", "remove-filter"])
    nb_clean.cli.main()
    assert called["value"]


def test_remove_filter_failure_dispatch(
    capsys: CaptureFixture[str], monkeypatch: pytest.MonkeyPatch
) -> None:
    def fake_remove_git_filter() -> None:
        raise nb_clean.GitProcessError(message="error message", return_code=42)

    monkeypatch.setattr(nb_clean, "remove_git_filter", fake_remove_git_filter)
    monkeypatch.setattr(sys, "argv", ["nb-clean", "remove-filter"])

    with pytest.raises(SystemExit) as exc:
        nb_clean.cli.main()
    assert exc.value.code == 42
    assert capsys.readouterr().err == "nb-clean: error: error message\n"


@pytest.mark.parametrize(
    ("name", "expect_exit"), [("clean.ipynb", False), ("dirty.ipynb", True)]
)
def test_check_file(
    tmp_path: Path, monkeypatch: pytest.MonkeyPatch, name: str, *, expect_exit: bool
) -> None:
    src = Path("tests/notebooks") / name
    dst = tmp_path / name
    dst.write_bytes(src.read_bytes())

    monkeypatch.setattr(sys, "argv", ["nb-clean", "check", os.fspath(dst)])

    if expect_exit:
        with pytest.raises(SystemExit) as exc:
            nb_clean.cli.main()
        assert exc.value.code == 1
    else:
        nb_clean.cli.main()


@pytest.mark.parametrize(
    ("notebook_name", "expect_exit"),
    [("clean_notebook", False), ("dirty_notebook", True)],
)
def test_check_stdin(
    monkeypatch: pytest.MonkeyPatch,
    notebook_name: str,
    *,
    expect_exit: bool,
    request: pytest.FixtureRequest,
) -> None:
    notebook = cast("nbformat.NotebookNode", request.getfixturevalue(notebook_name))
    monkeypatch.setattr(sys, "argv", ["nb-clean", "check"])
    content = cast("str", nbformat.writes(notebook))
    monkeypatch.setattr(sys, "stdin", io.StringIO(content))
    if expect_exit:
        with pytest.raises(SystemExit) as exc:
            nb_clean.cli.main()
        assert exc.value.code == 1
    else:
        nb_clean.cli.main()


def test_clean_file(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
    src_dirty = Path("tests/notebooks/dirty.ipynb")
    dst_dirty = tmp_path / "dirty.ipynb"
    dst_dirty.write_bytes(src_dirty.read_bytes())

    monkeypatch.setattr(sys, "argv", ["nb-clean", "clean", str(dst_dirty)])
    nb_clean.cli.main()

    cleaned = cast(
        "nbformat.NotebookNode",
        nbformat.read(dst_dirty, as_version=nbformat.NO_CONVERT),
    )
    expected = cast(
        "nbformat.NotebookNode",
        nbformat.read(
            Path("tests/notebooks/clean.ipynb"), as_version=nbformat.NO_CONVERT
        ),
    )
    assert cleaned == expected


def test_clean_stdin(
    capsys: CaptureFixture[str], monkeypatch: pytest.MonkeyPatch
) -> None:
    dirty = cast(
        "nbformat.NotebookNode",
        nbformat.read(
            Path("tests/notebooks/dirty.ipynb"), as_version=nbformat.NO_CONVERT
        ),
    )
    expected = cast(
        "nbformat.NotebookNode",
        nbformat.read(
            Path("tests/notebooks/clean.ipynb"), as_version=nbformat.NO_CONVERT
        ),
    )

    monkeypatch.setattr(sys, "argv", ["nb-clean", "clean"])
    dirty_content = cast("str", nbformat.writes(dirty))
    monkeypatch.setattr(sys, "stdin", io.StringIO(dirty_content))

    nb_clean.cli.main()

    out = capsys.readouterr().out
    expected_text = cast("str", nbformat.writes(expected))
    assert out.strip() == expected_text.strip()


@pytest.mark.parametrize(
    (
        "argv",
        "inputs",
        "remove_empty_cells",
        "remove_all_notebook_metadata",
        "preserve_cell_metadata",
        "preserve_cell_outputs",
        "preserve_execution_counts",
        "preserve_notebook_metadata",
    ),
    [
        ("add-filter -e", [], True, False, None, False, False, False),
        (
            "check -m -o a.ipynb b.ipynb",
            ["a.ipynb", "b.ipynb"],
            False,
            False,
            [],
            True,
            False,
            False,
        ),
        (
            "check -m tags -o a.ipynb b.ipynb",
            ["a.ipynb", "b.ipynb"],
            False,
            False,
            ["tags"],
            True,
            False,
            False,
        ),
        (
            "check -m tags special -o a.ipynb b.ipynb",
            ["a.ipynb", "b.ipynb"],
            False,
            False,
            ["tags", "special"],
            True,
            False,
            False,
        ),
        ("clean -e -o a.ipynb", ["a.ipynb"], True, False, None, True, False, False),
        ("clean -e -c -o a.ipynb", ["a.ipynb"], True, False, None, True, True, False),
    ],
)
def test_parse_args(
    argv: str,
    inputs: Iterable[str],
    *,
    remove_empty_cells: bool,
    remove_all_notebook_metadata: bool,
    preserve_cell_metadata: Collection[str] | None,
    preserve_cell_outputs: bool,
    preserve_execution_counts: bool,
    preserve_notebook_metadata: bool,
) -> None:
    args = nb_clean.cli.parse_args(argv.split())
    if inputs:
        assert args.inputs == [Path(path) for path in inputs]
    assert args.remove_empty_cells is remove_empty_cells
    assert args.remove_all_notebook_metadata is remove_all_notebook_metadata
    assert args.preserve_cell_metadata == preserve_cell_metadata
    assert args.preserve_cell_outputs is preserve_cell_outputs
    assert args.preserve_execution_counts is preserve_execution_counts
    assert args.preserve_notebook_metadata is preserve_notebook_metadata


================================================
FILE: tests/test_git_integration.py
================================================
from __future__ import annotations

import subprocess
from pathlib import Path
from typing import TYPE_CHECKING
from unittest.mock import Mock

import pytest

import nb_clean

if TYPE_CHECKING:
    from collections.abc import Collection

    from pytest_mock import MockerFixture


def test_git(mocker: MockerFixture) -> None:
    mock_process = Mock()
    # Padded bytes: `git` is expected to decode stdout and strip whitespace.
    mock_process.stdout = b" output string "
    mock_run = mocker.patch("nb_clean.subprocess.run", return_value=mock_process)
    output = nb_clean.git("command", "--flag")
    mock_run.assert_called_once_with(
        ["git", "command", "--flag"], capture_output=True, check=True
    )
    assert output == "output string"


def test_git_failure(mocker: MockerFixture) -> None:
    mocker.patch(
        "nb_clean.subprocess.run",
        side_effect=subprocess.CalledProcessError(
            returncode=42, cmd="command", stderr=b"standard error"
        ),
    )
    with pytest.raises(nb_clean.GitProcessError) as exc:
        nb_clean.git("command", "--flag")
    assert exc.value.message == "standard error"
    assert exc.value.return_code == 42


def test_git_attributes_path(mocker: MockerFixture) -> None:
    mocker.patch("nb_clean.git", return_value="dir/.git")
    assert nb_clean.git_attributes_path() == Path("dir", ".git", "info", "attributes")


@pytest.mark.parametrize(
    (
        "remove_empty_cells",
        "remove_all_notebook_metadata",
        "preserve_cell_metadata",
        "preserve_cell_outputs",
        "preserve_execution_counts",
        "preserve_notebook_metadata",
        "filter_command",
    ),
    [
        (False, False, None, False, False, False, "nb-clean clean"),
        (True, False, None, False, False, False, "nb-clean clean --remove-empty-cells"),
        (
            False,
            False,
            [],
            False,
            False,
            False,
            "nb-clean clean --preserve-cell-metadata",
        ),
        (
            False,
            False,
            ["tags"],
            False,
            False,
            False,
            "nb-clean clean --preserve-cell-metadata tags",
        ),
        (
            False,
            False,
            ["tags", "special"],
            False,
            False,
            False,
            "nb-clean clean --preserve-cell-metadata tags special",
        ),
        (
            False,
            False,
            None,
            True,
            False,
            False,
            "nb-clean clean --preserve-cell-outputs",
        ),
        (
            True,
            False,
            [],
            True,
            False,
            False,
            "nb-clean clean --remove-empty-cells --preserve-cell-metadata --preserve-cell-outputs",
        ),
        (
            False,
            False,
            None,
            False,
            True,
            True,
            "nb-clean clean --preserve-execution-counts --preserve-notebook-metadata",
        ),
        (
            False,
            True,
            None,
            False,
            False,
            False,
            "nb-clean clean --remove-all-notebook-metadata",
        ),
    ],
)
def test_add_git_filter(
    mocker: MockerFixture,
    tmp_path: Path,
    *,
    remove_empty_cells: bool,
    remove_all_notebook_metadata: bool,
    preserve_cell_metadata: Collection[str] | None,
    preserve_cell_outputs: bool,
    preserve_execution_counts: bool,
    preserve_notebook_metadata: bool,
    filter_command: str,
) -> None:
    mock_git = mocker.patch("nb_clean.git")
    mock_git_attributes_path = mocker.patch(
        "nb_clean.git_attributes_path", return_value=tmp_path / "attributes"
    )
    nb_clean.add_git_filter(
        remove_empty_cells=remove_empty_cells,
        remove_all_notebook_metadata=remove_all_notebook_metadata,
        preserve_cell_metadata=preserve_cell_metadata,
        preserve_cell_outputs=preserve_cell_outputs,
        preserve_execution_counts=preserve_execution_counts,
        preserve_notebook_metadata=preserve_notebook_metadata,
    )
    mock_git.assert_called_once_with("config", "filter.nb-clean.clean", filter_command)
    mock_git_attributes_path.assert_called_once()
    assert nb_clean.GIT_ATTRIBUTES_LINE in (tmp_path / "attributes").read_text()


def test_add_git_filter_exclusive_arguments() -> None:
    with pytest.raises(
        ValueError,
        match="`preserve_notebook_metadata` and `remove_all_notebook_metadata` cannot both be `True`",
    ):
        nb_clean.add_git_filter(
            remove_all_notebook_metadata=True, preserve_notebook_metadata=True
        )


def test_add_git_filter_idempotent(mocker: MockerFixture, tmp_path: Path) -> None:
    mocker.patch("nb_clean.git")
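    # The attributes file already holds the filter line; adding the filter
    # again must not duplicate it.
    (tmp_path / "attributes").write_text(nb_clean.GIT_ATTRIBUTES_LINE)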
    mock_git_attributes_path = mocker.patch(
        "nb_clean.git_attributes_path", return_value=tmp_path / "attributes"
    )
    nb_clean.add_git_filter()
    mock_git_attributes_path.assert_called_once()
    assert (tmp_path / "attributes").read_text() == nb_clean.GIT_ATTRIBUTES_LINE


@pytest.mark.parametrize("filter_exists", [True, False])
def test_remove_git_filter(
    mocker: MockerFixture, tmp_path: Path, *, filter_exists: bool
) -> None:
    mock_git = mocker.patch("nb_clean.git")
    mock_git_attributes_path = mocker.patch(
        "nb_clean.git_attributes_path", return_value=tmp_path / "attributes"
    )
    (tmp_path / "attributes").touch()
    if filter_exists:
        (tmp_path / "attributes").write_text(nb_clean.GIT_ATTRIBUTES_LINE)
    nb_clean.remove_git_filter()
    mock_git_attributes_path.assert_called_once()
    mock_git.assert_called_once_with("config", "--remove-section", "filter.nb-clean")
    if filter_exists:
        assert nb_clean.GIT_ATTRIBUTES_LINE not in (tmp_path / "attributes").read_text()

SYMBOL INDEX (71 symbols across 7 files)

FILE: src/nb_clean/__init__.py
  class GitProcessError (line 20) | class GitProcessError(Exception):
    method __init__ (line 23) | def __init__(self: Self, message: str, return_code: int) -> None:
  function git (line 35) | def git(*args: str) -> str:
  function git_attributes_path (line 59) | def git_attributes_path() -> Path:
  function add_git_filter (line 73) | def add_git_filter(
  function remove_git_filter (line 146) | def remove_git_filter() -> None:
  function check_notebook (line 167) | def check_notebook(
  function clean_notebook (line 250) | def clean_notebook(

FILE: src/nb_clean/cli.py
  class Args (line 21) | class Args(argparse.Namespace):
  function expand_directories (line 34) | def expand_directories(paths: Iterable[Path]) -> list[Path]:
  function exit_with_error (line 52) | def exit_with_error(message: str, return_code: int) -> NoReturn:
  function add_filter (line 63) | def add_filter(
  function remove_filter (line 95) | def remove_filter() -> None:
  function check (line 108) | def check(
  function clean (line 158) | def clean(
  function parse_args (line 204) | def parse_args(args: Sequence[str]) -> Args:
  function main (line 348) | def main() -> None:  # pragma: no cover

FILE: tests/conftest.py
  function _read_notebook (line 10) | def _read_notebook(filename: str) -> nbformat.NotebookNode:
  function dirty_notebook (line 18) | def dirty_notebook() -> nbformat.NotebookNode:
  function dirty_notebook_with_version (line 23) | def dirty_notebook_with_version() -> nbformat.NotebookNode:
  function clean_notebook (line 28) | def clean_notebook() -> nbformat.NotebookNode:
  function clean_notebook_with_notebook_metadata (line 33) | def clean_notebook_with_notebook_metadata() -> nbformat.NotebookNode:
  function clean_notebook_without_empty_cells (line 38) | def clean_notebook_without_empty_cells() -> nbformat.NotebookNode:
  function clean_notebook_with_empty_cells (line 43) | def clean_notebook_with_empty_cells() -> nbformat.NotebookNode:
  function clean_notebook_with_counts (line 48) | def clean_notebook_with_counts() -> nbformat.NotebookNode:
  function clean_notebook_with_cell_metadata (line 53) | def clean_notebook_with_cell_metadata() -> nbformat.NotebookNode:
  function clean_notebook_with_tags_metadata (line 58) | def clean_notebook_with_tags_metadata() -> nbformat.NotebookNode:
  function clean_notebook_with_tags_special_metadata (line 63) | def clean_notebook_with_tags_special_metadata() -> nbformat.NotebookNode:
  function clean_notebook_with_outputs (line 68) | def clean_notebook_with_outputs() -> nbformat.NotebookNode:
  function clean_notebook_with_outputs_with_counts (line 73) | def clean_notebook_with_outputs_with_counts() -> nbformat.NotebookNode:
  function clean_notebook_without_notebook_metadata (line 78) | def clean_notebook_without_notebook_metadata() -> nbformat.NotebookNode:

FILE: tests/test_check_notebook.py
  function test_check_notebook (line 23) | def test_check_notebook(
  function test_check_notebook_preserve_notebook_metadata (line 31) | def test_check_notebook_preserve_notebook_metadata(
  function test_check_notebook_remove_empty_cells (line 46) | def test_check_notebook_remove_empty_cells(
  function test_check_notebook_preserve_cell_metadata (line 66) | def test_check_notebook_preserve_cell_metadata(
  function test_check_notebook_preserve_cell_metadata_tags (line 91) | def test_check_notebook_preserve_cell_metadata_tags(
  function test_check_notebook_preserve_cell_metadata_tags_special (line 115) | def test_check_notebook_preserve_cell_metadata_tags_special(
  function test_check_notebook_preserve_outputs (line 138) | def test_check_notebook_preserve_outputs(
  function test_check_notebook_preserve_execution_counts (line 159) | def test_check_notebook_preserve_execution_counts(
  function test_check_notebook_remove_all_notebook_metadata (line 184) | def test_check_notebook_remove_all_notebook_metadata(
  function test_check_notebook_exclusive_arguments (line 203) | def test_check_notebook_exclusive_arguments(

FILE: tests/test_clean_notebook.py
  function test_clean_notebook (line 10) | def test_clean_notebook(
  function test_clean_notebook_with_notebook_metadata (line 20) | def test_clean_notebook_with_notebook_metadata(
  function test_clean_notebook_remove_empty_cells (line 39) | def test_clean_notebook_remove_empty_cells(
  function test_clean_notebook_preserve_cell_metadata (line 55) | def test_clean_notebook_preserve_cell_metadata(
  function test_clean_notebook_preserve_cell_metadata_tags (line 69) | def test_clean_notebook_preserve_cell_metadata_tags(
  function test_clean_notebook_preserve_cell_metadata_tags_special (line 85) | def test_clean_notebook_preserve_cell_metadata_tags_special(
  function test_clean_notebook_preserve_outputs (line 98) | def test_clean_notebook_preserve_outputs(
  function test_clean_notebook_preserve_execution_counts (line 108) | def test_clean_notebook_preserve_execution_counts(
  function test_clean_notebook_remove_all_notebook_metadata (line 118) | def test_clean_notebook_remove_all_notebook_metadata(
  function test_clean_notebook_exclusive_arguments (line 128) | def test_clean_notebook_exclusive_arguments(

FILE: tests/test_cli.py
  function test_expand_directories_with_files (line 21) | def test_expand_directories_with_files() -> None:
  function test_expand_directories_recursively (line 26) | def test_expand_directories_recursively() -> None:
  function test_exit_with_error (line 33) | def test_exit_with_error(capsys: CaptureFixture[str]) -> None:
  function test_add_filter_dispatch (line 40) | def test_add_filter_dispatch(monkeypatch: pytest.MonkeyPatch) -> None:
  function test_add_filter_remove_all_notebook_metadata_dispatch (line 62) | def test_add_filter_remove_all_notebook_metadata_dispatch(
  function test_add_filter_failure_dispatch (line 86) | def test_add_filter_failure_dispatch(
  function test_remove_filter_dispatch (line 101) | def test_remove_filter_dispatch(monkeypatch: pytest.MonkeyPatch) -> None:
  function test_remove_filter_failure_dispatch (line 113) | def test_remove_filter_failure_dispatch(
  function test_check_file (line 131) | def test_check_file(
  function test_check_stdin (line 152) | def test_check_stdin(
  function test_clean_file (line 171) | def test_clean_file(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> ...
  function test_clean_stdin (line 192) | def test_clean_stdin(
  function test_parse_args (line 266) | def test_parse_args(

FILE: tests/test_git_integration.py
  function test_git (line 18) | def test_git(mocker: MockerFixture) -> None:
  function test_git_failure (line 29) | def test_git_failure(mocker: MockerFixture) -> None:
  function test_git_attributes_path (line 42) | def test_git_attributes_path(mocker: MockerFixture) -> None:
  function test_add_git_filter (line 125) | def test_add_git_filter(
  function test_add_git_filter_exclusive_arguments (line 154) | def test_add_git_filter_exclusive_arguments() -> None:
  function test_add_git_filter_idempotent (line 164) | def test_add_git_filter_idempotent(mocker: MockerFixture, tmp_path: Path...
  function test_remove_git_filter (line 176) | def test_remove_git_filter(
