Repository: tensorwerk/hangar-py
Branch: master
Commit: a6deb22854a6
Files: 190
Total size: 2.1 MB
Directory structure:
gitextract_qj3h30ym/
├── .bumpversion.cfg
├── .coveragerc
├── .editorconfig
├── .gitattributes
├── .github/
│ ├── ISSUE_TEMPLATE/
│ │ ├── bug_report.md
│ │ ├── feature_request.md
│ │ └── questions_and_documentation.md
│ ├── PULL_REQUEST_TEMPLATE.md
│ └── workflows/
│ ├── asvbench.yml
│ ├── release.yml
│ ├── testsphinx.yml
│ └── testsuite.yml
├── .gitignore
├── .readthedocs.yml
├── AUTHORS.rst
├── CHANGELOG.rst
├── CODE_OF_CONDUCT.rst
├── CONTRIBUTING.rst
├── LICENSE
├── MANIFEST.in
├── README.rst
├── asv_bench/
│ ├── README.rst
│ ├── asv.conf.json
│ └── benchmarks/
│ ├── __init__.py
│ ├── backend_comparisons.py
│ ├── backends/
│ │ ├── __init__.py
│ │ ├── hdf5_00.py
│ │ ├── hdf5_01.py
│ │ └── numpy_10.py
│ ├── commit_and_checkout.py
│ └── package.py
├── codecov.yml
├── docs/
│ ├── Tutorial-001.ipynb
│ ├── Tutorial-002.ipynb
│ ├── Tutorial-003.ipynb
│ ├── Tutorial-Dataset.ipynb
│ ├── Tutorial-QuickStart.ipynb
│ ├── Tutorial-RealQuickStart.ipynb
│ ├── api.rst
│ ├── authors.rst
│ ├── backends/
│ │ ├── hdf5_00.rst
│ │ ├── hdf5_01.rst
│ │ ├── lmdb_30.rst
│ │ ├── numpy_10.rst
│ │ └── remote_50.rst
│ ├── backends.rst
│ ├── benchmarking.rst
│ ├── changelog.rst
│ ├── cli.rst
│ ├── codeofconduct.rst
│ ├── concepts.rst
│ ├── conf.py
│ ├── contributing.rst
│ ├── contributingindex.rst
│ ├── design.rst
│ ├── externals.rst
│ ├── faq.rst
│ ├── index.rst
│ ├── installation.rst
│ ├── noindexapi/
│ │ ├── apiinit.rst
│ │ └── apiremotefetchdata.rst
│ ├── quickstart.rst
│ ├── readme.rst
│ ├── requirements.txt
│ ├── requirements_rtd.txt
│ ├── spelling_wordlist.txt
│ └── tutorial.rst
├── hangar.yml
├── mypy.ini
├── scripts/
│ └── run_proto_codegen.py
├── setup.cfg
├── setup.py
├── src/
│ └── hangar/
│ ├── __init__.py
│ ├── __main__.py
│ ├── _version.py
│ ├── backends/
│ │ ├── __init__.py
│ │ ├── chunk.py
│ │ ├── hdf5_00.py
│ │ ├── hdf5_01.py
│ │ ├── lmdb_30.py
│ │ ├── lmdb_31.py
│ │ ├── numpy_10.py
│ │ ├── remote_50.py
│ │ ├── specparse.pyx
│ │ ├── specs.pxd
│ │ └── specs.pyx
│ ├── bulk_importer.py
│ ├── checkout.py
│ ├── cli/
│ │ ├── __init__.py
│ │ ├── cli.py
│ │ └── utils.py
│ ├── columns/
│ │ ├── __init__.py
│ │ ├── column.py
│ │ ├── common.py
│ │ ├── constructors.py
│ │ ├── introspection.py
│ │ ├── layout_flat.py
│ │ └── layout_nested.py
│ ├── constants.py
│ ├── context.py
│ ├── dataset/
│ │ ├── __init__.py
│ │ ├── common.py
│ │ ├── numpy_dset.py
│ │ ├── tensorflow_dset.py
│ │ └── torch_dset.py
│ ├── diagnostics/
│ │ ├── __init__.py
│ │ ├── ecosystem.py
│ │ ├── graphing.py
│ │ └── integrity.py
│ ├── diff.py
│ ├── external/
│ │ ├── __init__.py
│ │ ├── _external.py
│ │ ├── base_plugin.py
│ │ └── plugin_manager.py
│ ├── external_cpython.pxd
│ ├── merger.py
│ ├── mixins/
│ │ ├── __init__.py
│ │ ├── checkout_iteration.py
│ │ ├── datasetget.py
│ │ └── recorditer.py
│ ├── op_state.py
│ ├── optimized_utils.pxd
│ ├── optimized_utils.pyx
│ ├── records/
│ │ ├── __init__.py
│ │ ├── column_parsers.pyx
│ │ ├── commiting.py
│ │ ├── hashmachine.pyx
│ │ ├── hashs.py
│ │ ├── heads.py
│ │ ├── parsing.py
│ │ ├── queries.py
│ │ ├── recordstructs.pxd
│ │ ├── recordstructs.pyx
│ │ ├── summarize.py
│ │ └── vcompat.py
│ ├── remote/
│ │ ├── __init__.py
│ │ ├── chunks.py
│ │ ├── client.py
│ │ ├── config_server.ini
│ │ ├── content.py
│ │ ├── hangar_service.proto
│ │ ├── hangar_service_pb2.py
│ │ ├── hangar_service_pb2.pyi
│ │ ├── hangar_service_pb2_grpc.py
│ │ ├── header_manipulator_client_interceptor.py
│ │ ├── request_header_validator_interceptor.py
│ │ └── server.py
│ ├── remotes.py
│ ├── repository.py
│ ├── txnctx.py
│ ├── typesystem/
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── descriptors.py
│ │ ├── ndarray.py
│ │ ├── pybytes.py
│ │ └── pystring.py
│ └── utils.py
├── tests/
│ ├── bulk_importer/
│ │ └── test_bulk_importer.py
│ ├── conftest.py
│ ├── ml_datasets/
│ │ └── test_dataset.py
│ ├── property_based/
│ │ ├── conftest.py
│ │ ├── test_pbt_column_flat.py
│ │ └── test_pbt_column_nested.py
│ ├── test_backend_hdf5_00_hdf5_01.py
│ ├── test_branching.py
│ ├── test_checkout.py
│ ├── test_checkout_arrayset_access.py
│ ├── test_cli.py
│ ├── test_column.py
│ ├── test_column_backends.py
│ ├── test_column_definition_permutations.py
│ ├── test_column_nested.py
│ ├── test_column_pickle.py
│ ├── test_commit_ref_verification.py
│ ├── test_context_management.py
│ ├── test_diff.py
│ ├── test_diff_staged_summary.py
│ ├── test_initiate.py
│ ├── test_merging.py
│ ├── test_optimized_utils.py
│ ├── test_remote_serialize.py
│ ├── test_remotes.py
│ ├── test_repo_integrity_verification.py
│ ├── test_utils.py
│ ├── test_version.py
│ ├── test_visualizations.py
│ └── typesystem/
│ ├── test_ndarray_typesysem.py
│ ├── test_pybytes_typesystem.py
│ └── test_pystr_typesystem.py
└── tox.ini
================================================
FILE CONTENTS
================================================
================================================
FILE: .bumpversion.cfg
================================================
[bumpversion]
current_version = 0.5.2
commit = True
tag = False
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\.(?P<release>[a-z]+)(?P<build>\d+))?
serialize =
{major}.{minor}.{patch}.{release}{build}
{major}.{minor}.{patch}
[bumpversion:part:release]
optional_value = rc
first_value = dev
values =
dev
rc
[bumpversion:part:build]
[bumpversion:file:setup.py]
search = version='{current_version}'
replace = version='{new_version}'
[bumpversion:file:docs/conf.py]
search = version = release = '{current_version}'
replace = version = release = '{new_version}'
[bumpversion:file:src/hangar/__init__.py]
search = __version__ = '{current_version}'
replace = __version__ = '{new_version}'
[bumpversion:file:src/hangar/diagnostics/__init__.py]
search = __version__ = '{current_version}'
replace = __version__ = '{new_version}'
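As a quick sanity check, the ``parse`` pattern and the two ``serialize`` forms above can be exercised directly with Python's ``re`` module (a minimal sketch; the version strings are illustrative):

```python
import re

# the parse pattern from .bumpversion.cfg, copied verbatim
PARSE = re.compile(
    r"(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(\.(?P<release>[a-z]+)(?P<build>\d+))?"
)

# a plain release matches the short serialize form
m = PARSE.fullmatch("0.5.2")
print(m.group("major", "minor", "patch"))  # ('0', '5', '2')

# a dev/rc build matches the long form with release and build parts
m = PARSE.fullmatch("0.6.0.dev1")
print(m.group("release", "build"))  # ('dev', '1')
```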
================================================
FILE: .coveragerc
================================================
[paths]
source =
src
[run]
branch = True
parallel = True
source =
hangar
tests
omit =
*/hangar/__main__.py
*/hangar_service_pb2.py
*/hangar_service_pb2_grpc.py
*/hangar_service_pb2.pyi
[report]
exclude_lines =
pragma: no cover
def __repr__
def _repr_pretty_
def _ipython_key_completions_
show_missing = True
precision = 2
omit = *migrations*
================================================
FILE: .editorconfig
================================================
# see http://editorconfig.org
root = true
[*]
end_of_line = lf
trim_trailing_whitespace = true
insert_final_newline = true
indent_style = space
indent_size = 4
charset = utf-8
[*.{bat,cmd,ps1}]
end_of_line = crlf
================================================
FILE: .gitattributes
================================================
* text=auto
*.bat eol=crlf
*.cmd eol=crlf
*.ps1 eol=lf
*.sh eol=lf
*.rtf -text
================================================
FILE: .github/ISSUE_TEMPLATE/bug_report.md
================================================
---
name: Bug report
about: Create a report to help us improve
title: "[BUG REPORT]"
labels: 'Bug: Awaiting Priority Assignment'
assignees: ''
---
**Describe the bug**
A clear and concise description of what the bug is.
**Severity**
<!--- fill in the space between `[ ]` with an `x` (i.e. `[x]`) --->
Select an option:
- [ ] Data Corruption / Loss of Any Kind
- [ ] Unexpected Behavior, Exceptions or Error Thrown
- [ ] Performance Bottleneck
**To Reproduce**
Steps to reproduce the behavior, minimal example code preferred:
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Desktop (please complete the following information):**
- OS:
- Python:
- Hangar:
**Additional context**
Add any other context about the problem here.
================================================
FILE: .github/ISSUE_TEMPLATE/feature_request.md
================================================
---
name: Feature request
about: Suggest an idea for this project
title: "[FEATURE REQUEST]"
labels: enhancement
assignees: ''
---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the solution you'd like**
A clear and concise description of what you want to happen.
**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.
**Additional context**
Add any other context or screenshots about the feature request here.
================================================
FILE: .github/ISSUE_TEMPLATE/questions_and_documentation.md
================================================
---
name: Questions and Documentation
about: Is something confusing? The documentation not clear? We can help
title: "[QUESTION & DOCS]: "
labels: documentation, question
assignees: ''
---
**Executive Summary**
In one to two sentences, describe your question or issue with the documentation:
**Additional Context / Explanation**
(if applicable) provide more info about the question/problem (we love example code & screenshots!)
**Desktop (If applicable, please complete the following version information):**
- OS:
- Python:
- Hangar Version:
- _Install Type_
<!--- fill in the space between `[ ]` with an `x` (i.e. `[x]`) --->
<!--- For Source Build, include commit hash if possible --->
- [ ] Source Build
- [ ] Pip install
- [ ] Conda (conda-forge) install
**External Links**
(If applicable) reference other issues, Read the Docs pages, or code docstrings.
-
<!--- insert more `bullets` as needed --->
================================================
FILE: .github/PULL_REQUEST_TEMPLATE.md
================================================
## Motivation and Context
#### _Why is this change required? What problem does it solve?:_
#### _If it fixes an open issue, please link to the issue here:_
## Description
#### _Describe your changes in detail:_
## Screenshots (if appropriate):
## Types of changes
What types of changes does your code introduce? Put an `x` in all the boxes that apply:
- [ ] Documentation update
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Is this PR ready for review, or a work in progress?
- [ ] Ready for review
- [ ] Work in progress
## How Has This Been Tested?
Put an `x` in the boxes that apply:
- [ ] Current tests cover modifications made
- [ ] New tests have been added to the test suite
- [ ] Modifications were made to existing tests to support these changes
- [ ] Tests may be needed, but they are not included when the PR was proposed
- [ ] I don't know. Help!
## Checklist:
- [ ] My code follows the code style of this project.
- [ ] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly.
- [ ] I have read the **[CONTRIBUTING](../CONTRIBUTING.rst)** document.
- [ ] I have signed (or will sign when prompted) the tensorwerk CLA.
- [ ] I have added tests to cover my changes.
- [ ] All new and existing tests passed.
================================================
FILE: .github/workflows/asvbench.yml
================================================
name: ASV Benchmarking
on:
pull_request:
branches:
- master
jobs:
run_benchmarks:
runs-on: ${{ matrix.os }}
strategy:
max-parallel: 4
fail-fast: false
matrix:
os: [ubuntu-18.04, macOS-10.14]
python-version: [3.6, 3.7]
steps:
- uses: actions/checkout@v1
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v1
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install --upgrade setuptools
pip install virtualenv==16.7.9
pip install git+https://github.com/airspeed-velocity/asv
- name: Run Benchmarks
run: |
cd asv_bench/
asv machine --yes
asv continuous --split origin/master HEAD | tee -a asv_continuous.log
shell: bash
continue-on-error: true
- name: Show Comparison
run: |
cd asv_bench/
asv compare --split origin/master HEAD | tee -a asv_compare.log
if [[ $(cat asv_continuous.log | grep "PERFORMANCE DECREASED") ]]; then
echo "Benchmark Performance Decreased"
exit 1
elif [[ $(cat asv_continuous.log | grep "PERFORMANCE INCREASED") ]]; then
echo "Benchmark Performance Increased"
else
echo "Benchmarks Run Without Errors, No Significant Change."
fi
shell: bash
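The shell conditional above only greps the captured log for asv's summary markers and fails the job when a regression is reported. The same decision logic can be sketched in Python (the helper name is hypothetical; the marker strings are the ones the workflow greps for):

```python
def classify_asv_log(log_text: str) -> str:
    """Mirror the grep-based check in the workflow step above."""
    if "PERFORMANCE DECREASED" in log_text:
        return "decreased"  # the workflow exits 1 in this case
    if "PERFORMANCE INCREASED" in log_text:
        return "increased"
    return "no significant change"

print(classify_asv_log("SOME BENCHMARKS CHANGED.\nPERFORMANCE DECREASED."))
```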
================================================
FILE: .github/workflows/release.yml
================================================
name: release
on:
release:
types: [published, prereleased]
jobs:
build-linux-cp36:
runs-on: ubuntu-latest
container: quay.io/pypa/manylinux2014_x86_64
steps:
- uses: actions/checkout@v2
- name: Install Python package dependencies
run: /opt/python/cp36-cp36m/bin/python -m pip install cython wheel setuptools
- name: Build binary wheel
run: /opt/python/cp36-cp36m/bin/python setup.py bdist_wheel
- name: Apply auditwheel
run: auditwheel repair -w dist dist/*
- name: Remove linux wheel
run: rm dist/*-linux_x86_64.whl
- name: Archive dist artifacts
uses: actions/upload-artifact@v1
with:
name: dist-linux-3.6
path: dist
build-linux-cp37:
runs-on: ubuntu-latest
container: quay.io/pypa/manylinux2014_x86_64
steps:
- uses: actions/checkout@v2
- name: Install Python package dependencies
run: /opt/python/cp37-cp37m/bin/python -m pip install cython wheel setuptools
- name: Build binary wheel
run: /opt/python/cp37-cp37m/bin/python setup.py bdist_wheel
- name: Apply auditwheel
run: auditwheel repair -w dist dist/*
- name: Remove linux wheel
run: rm dist/*-linux_x86_64.whl
- name: Archive dist artifacts
uses: actions/upload-artifact@v1
with:
name: dist-linux-3.7
path: dist
build-linux-cp38:
runs-on: ubuntu-latest
container: quay.io/pypa/manylinux2014_x86_64
steps:
- uses: actions/checkout@v2
- name: Install Python package dependencies
run: /opt/python/cp38-cp38/bin/python -m pip install cython wheel setuptools
- name: Build binary wheel
run: /opt/python/cp38-cp38/bin/python setup.py bdist_wheel
- name: Apply auditwheel for manylinux wheel
run: auditwheel repair -w dist dist/*
- name: Remove linux wheel
run: rm dist/*-linux_x86_64.whl
- name: Archive dist artifacts
uses: actions/upload-artifact@v1
with:
name: dist-linux-3.8
path: dist
build-macos:
runs-on: macos-latest
strategy:
max-parallel: 4
matrix:
python-version: [3.6, 3.7, 3.8]
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }} x64
uses: actions/setup-python@v1
with:
python-version: ${{ matrix.python-version }}
architecture: x64
- name: Install Python package dependencies
run: pip install cython wheel setuptools
- name: Build binary wheel
run: python setup.py bdist_wheel
- name: Archive dist artifacts
uses: actions/upload-artifact@v1
with:
name: dist-macos-${{ matrix.python-version }}
path: dist
build-windows:
runs-on: windows-latest
strategy:
max-parallel: 3
matrix:
python-version: [3.6, 3.7, 3.8]
steps:
- uses: actions/checkout@v2
- name: Download Build Tools for Visual Studio 2019
run: Invoke-WebRequest -Uri https://aka.ms/vs/16/release/vs_buildtools.exe -OutFile vs_buildtools.exe
- name: Run vs_buildtools.exe install
run: ./vs_buildtools.exe --quiet --wait --norestart --nocache --add Microsoft.VisualStudio.Component.VC.Tools.x86.x64 --add Microsoft.VisualStudio.Component.VC.v141.x86.x64 --add Microsoft.VisualStudio.Component.VC.140 --includeRecommended
- name: Set up Python ${{ matrix.python-version }} x64
uses: actions/setup-python@v1
with:
python-version: ${{ matrix.python-version }}
architecture: x64
- name: Install Python package dependencies
run: pip install cython wheel setuptools
- name: Build binary wheel
run: python setup.py bdist_wheel
- name: Archive dist artifacts
uses: actions/upload-artifact@v1
with:
name: dist-windows-${{ matrix.python-version }}
path: dist
upload:
needs: [build-linux-cp36, build-linux-cp37, build-linux-cp38, build-macos, build-windows]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v1
- name: Set up Python
uses: actions/setup-python@v1
with:
python-version: 3.8
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install cython wheel setuptools
- name: Create source dist
run: python setup.py sdist
# Linux
- name: Stage linux 3.6
uses: actions/download-artifact@v1
with:
name: dist-linux-3.6
- run: mv -v dist-linux-3.6/* dist/
- name: Stage linux 3.7
uses: actions/download-artifact@v1
with:
name: dist-linux-3.7
- run: mv -v dist-linux-3.7/* dist/
- name: Stage linux 3.8
uses: actions/download-artifact@v1
with:
name: dist-linux-3.8
- run: mv -v dist-linux-3.8/* dist/
# MacOS
- name: Stage macos 3.6
uses: actions/download-artifact@v1
with:
name: dist-macos-3.6
- run: mv -v dist-macos-3.6/* dist/
- name: Stage macos 3.7
uses: actions/download-artifact@v1
with:
name: dist-macos-3.7
- run: mv -v dist-macos-3.7/* dist/
- name: Stage macos 3.8
uses: actions/download-artifact@v1
with:
name: dist-macos-3.8
- run: mv -v dist-macos-3.8/* dist/
# Windows
- name: Stage windows 3.6
uses: actions/download-artifact@v1
with:
name: dist-windows-3.6
- run: mv -v dist-windows-3.6/* dist/
- name: Stage windows 3.7
uses: actions/download-artifact@v1
with:
name: dist-windows-3.7
- run: mv -v dist-windows-3.7/* dist/
- name: Stage windows 3.8
uses: actions/download-artifact@v1
with:
name: dist-windows-3.8
- run: mv -v dist-windows-3.8/* dist/
- name: Upload PreRelease to Test PyPi with Twine
if: "github.event.release.prerelease"
env:
TWINE_USERNAME: ${{ secrets.TEST_PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.TEST_PYPI_PASSWORD }}
run: |
ls -l dist/*
pip install twine
twine upload --repository-url https://test.pypi.org/legacy/ dist/*
- name: Upload Release to PyPi with Twine
if: "!github.event.release.prerelease"
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: |
ls -l dist/*
pip install twine
twine upload dist/*
================================================
FILE: .github/workflows/testsphinx.yml
================================================
name: Build Sphinx Docs
on:
pull_request:
branches:
- master
push:
branches:
- master
jobs:
build_docs:
runs-on: ubuntu-latest
strategy:
fail-fast: false
steps:
- uses: actions/checkout@v2
- name: Set up Python 3.7
uses: actions/setup-python@v1
with:
python-version: 3.7
- name: Install dependencies
run: |
python -m pip install --upgrade setuptools pip wheel tox
sudo apt-get update
sudo apt-get install pandoc
- name: Run Documentation Generator
run: tox -e docs
env:
GH_ACTIONS_PROC_NR: 1
================================================
FILE: .github/workflows/testsuite.yml
================================================
name: Run Test Suite
on:
pull_request:
branches:
- master
push:
branches:
- master
jobs:
run_test_suite:
runs-on: ${{ matrix.platform }}
strategy:
fail-fast: false
matrix:
# https://help.github.com/articles/virtual-environments-for-github-actions
testcover: [yes, no]
testml: [no, yes]
platform:
- windows-latest
- macos-latest
- ubuntu-latest
python-version: [3.6, 3.7, 3.8]
exclude:
# tensorflow-cpu:latest (2.1.0) is not available for python 3.8 yet.
- python-version: 3.8
testml: yes
# build time with limited macos jobs
- platform: macos-latest
python-version: 3.7
- platform: windows-latest
python-version: 3.7
testml: yes
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade setuptools wheel
# Use the latest published version for myself :)
python -m pip install tox-gh-actions
- name: Run Tests Without Coverage Report
if: matrix.testcover == 'no'
run: tox
env:
PYTEST_XDIST_PROC_NR: 2
TESTCOVER: ${{ matrix.testcover }}
TESTML: ${{ matrix.testml }}
- name: Run Tests With Coverage Report
if: matrix.testcover == 'yes'
run: tox -- --cov-report xml
env:
PYTEST_XDIST_PROC_NR: 2
TESTCOVER: ${{ matrix.testcover }}
TESTML: ${{ matrix.testml }}
- name: Upload Coverage Report to Codecov
if: matrix.testcover == 'yes'
run: bash <(curl -s https://codecov.io/bash) -n "${CC_PLAT}-py${CC_PY}-cov${CC_COV}-ml${CC_ML}"
shell: bash
env:
CC_PLAT: ${{ matrix.platform }}
CC_PY: ${{ matrix.python-version }}
CC_COV: ${{ matrix.testcover }}
CC_ML: ${{ matrix.testml }}
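GitHub Actions drops a matrix combination when it matches *all* keys of any ``exclude`` entry. The effective job list for the matrix above can be sketched by expanding the cross product and filtering (values copied from the workflow; the expansion helper itself is illustrative):

```python
from itertools import product

matrix = {
    "testcover": ["yes", "no"],
    "testml": ["no", "yes"],
    "platform": ["windows-latest", "macos-latest", "ubuntu-latest"],
    "python-version": ["3.6", "3.7", "3.8"],
}
excludes = [
    {"python-version": "3.8", "testml": "yes"},
    {"platform": "macos-latest", "python-version": "3.7"},
    {"platform": "windows-latest", "python-version": "3.7", "testml": "yes"},
]

def expand(matrix, excludes):
    """Cross product of all axes, minus combos matching any exclude entry."""
    keys = list(matrix)
    combos = []
    for values in product(*matrix.values()):
        combo = dict(zip(keys, values))
        # an exclude applies when every key it specifies matches the combo
        if not any(all(combo[k] == v for k, v in ex.items()) for ex in excludes):
            combos.append(combo)
    return combos

jobs = expand(matrix, excludes)
print(len(jobs))  # 24 concrete jobs out of the 36 raw combinations
```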
================================================
FILE: .gitignore
================================================
*.py[cod]
# C extensions
*.c
*.so
cython_debug/
# cython annotation files
src/hangar/backends/*.html
docs/_build
# Packages
*.egg
*.egg-info
dist
build
eggs
.eggs
parts
bin
var
sdist
wheelhouse
develop-eggs
.installed.cfg
lib
lib64
venv*/
pyvenv*/
MANIFEST
# Installer logs
pip-log.txt
# Unit test / coverage reports
.coverage
.tox
.coverage.*
.pytest_cache/
nosetests.xml
coverage.xml
htmlcov
.hypothesis
# Performance Testing
asv_bench/html
asv_bench/env
asv_bench/results
# Translations
*.mo
# Mr Developer
.mr.developer.cfg
.project
.pydevproject
.idea
*.iml
*.komodoproject
# Complexity
output/*.html
output/*/index.html
# Sphinx
docs/_build
.DS_Store
*~
.*.sw[po]
.build
.ve
.env
.cache
.pytest
.bootstrap
.appveyor.token
*.bak
# Mypy Cache
.mypy_cache/
.dmypy.json
monkeytype.sqlite3
# IDE Settings
.vscode/
.ipynb_checkpoints/
# Testing data
*.pkl.gz
*.sqlite3
*.dmypy.json
================================================
FILE: .readthedocs.yml
================================================
# .readthedocs.yml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
# Required
version: 2
# Build documentation in the docs/ directory with Sphinx
sphinx:
configuration: docs/conf.py
# Optionally build your docs in additional formats such as PDF and ePub
formats: all
# Optionally set the version of Python and requirements required to build your docs
python:
version: 3.7
install:
- requirements: docs/requirements.txt
- method: pip
path: .
- method: setuptools
path: .
- requirements: docs/requirements_rtd.txt
system_packages: true
================================================
FILE: AUTHORS.rst
================================================
Authors
=======
* Richard Izzo - rick@tensorwerk.com
* Luca Antiga - luca@tensorwerk.com
* Sherin Thomas - sherin@tensorwerk.com
* Alessia Marcolini - alessia@tensorwerk.com
================================================
FILE: CHANGELOG.rst
================================================
==========
Change Log
==========
_`In-Progress`
==============
Improvements
------------
* New API design for datasets (previously dataloaders) for machine learning libraries.
(`#187 <https://github.com/tensorwerk/hangar-py/pull/187>`__) `@hhsecond <https://github.com/hhsecond>`__
`v0.5.2`_ (2020-05-08)
======================
New Features
------------
* New column data type supporting arbitrary ``bytes`` data.
(`#198 <https://github.com/tensorwerk/hangar-py/pull/198>`__) `@rlizzo <https://github.com/rlizzo>`__
Improvements
------------
* ``str`` typed columns can now accept data containing any unicode code-point. In prior releases
data containing any ``non-ascii`` character could not be written to this column type.
(`#198 <https://github.com/tensorwerk/hangar-py/pull/198>`__) `@rlizzo <https://github.com/rlizzo>`__
Bug Fixes
---------
* Fixed issue where ``str`` and (newly added) ``bytes`` column data could not be fetched / pushed
between a local client repository and remote server.
(`#198 <https://github.com/tensorwerk/hangar-py/pull/198>`__) `@rlizzo <https://github.com/rlizzo>`__
`v0.5.1`_ (2020-04-05)
======================
BugFixes
--------
* Fixed issue where importing ``make_torch_dataloader`` or ``make_tf_dataloader`` under Python 3.6
would raise a ``NameError`` regardless of whether the package is installed.
(`#196 <https://github.com/tensorwerk/hangar-py/pull/196>`__) `@rlizzo <https://github.com/rlizzo>`__
`v0.5.0`_ (2020-04-04)
======================
Improvements
------------
* Python 3.8 is now fully supported.
(`#193 <https://github.com/tensorwerk/hangar-py/pull/193>`__) `@rlizzo <https://github.com/rlizzo>`__
* Major backend overhaul which defines column layouts and data types in the same interchangeable
/ extensible manner as storage backends. This will allow rapid development of new layouts and
data type support as new use cases are discovered by the community.
(`#184 <https://github.com/tensorwerk/hangar-py/pull/184>`__) `@rlizzo <https://github.com/rlizzo>`__
* Column and backend classes are now fully serializable (pickleable) for ``read-only`` checkouts.
(`#180 <https://github.com/tensorwerk/hangar-py/pull/180>`__) `@rlizzo <https://github.com/rlizzo>`__
* Modularized internal structure of API classes to easily allow new column layouts / data types
to be added in the future.
(`#180 <https://github.com/tensorwerk/hangar-py/pull/180>`__) `@rlizzo <https://github.com/rlizzo>`__
* Improved type / value checking of manual specification for column ``backend`` and ``backend_options``.
(`#180 <https://github.com/tensorwerk/hangar-py/pull/180>`__) `@rlizzo <https://github.com/rlizzo>`__
* Standardized column data access API to follow python standard library ``dict`` methods API.
(`#180 <https://github.com/tensorwerk/hangar-py/pull/180>`__) `@rlizzo <https://github.com/rlizzo>`__
* Memory usage of arrayset checkouts has been reduced by ~70% by using C-structs for allocating
sample record locating info.
(`#179 <https://github.com/tensorwerk/hangar-py/pull/179>`__) `@rlizzo <https://github.com/rlizzo>`__
* Read times from the ``HDF5_00`` and ``HDF5_01`` backend have been reduced by 33-38% (or more for
arraysets with many samples) by eliminating redundant computation of chunked storage B-Tree.
(`#179 <https://github.com/tensorwerk/hangar-py/pull/179>`__) `@rlizzo <https://github.com/rlizzo>`__
* Commit times and checkout times have been reduced by 11-18% by optimizing record parsing and
memory allocation.
(`#179 <https://github.com/tensorwerk/hangar-py/pull/179>`__) `@rlizzo <https://github.com/rlizzo>`__
New Features
------------
* Added ``str`` type column with same behavior as ``ndarray`` column (supporting both
single-level and nested layouts) added to replace functionality of removed ``metadata`` container.
(`#184 <https://github.com/tensorwerk/hangar-py/pull/184>`__) `@rlizzo <https://github.com/rlizzo>`__
* New backend based on ``LMDB`` has been added (specifier of ``lmdb_30``).
(`#184 <https://github.com/tensorwerk/hangar-py/pull/184>`__) `@rlizzo <https://github.com/rlizzo>`__
* Added ``.diff()`` method to ``Repository`` class to enable diffing changes between any pair of
commits / branches without needing to open the diff base in a checkout.
(`#183 <https://github.com/tensorwerk/hangar-py/pull/183>`__) `@rlizzo <https://github.com/rlizzo>`__
* New CLI command ``hangar diff`` which reports a summary view of changes made between any pair of
commits / branches.
(`#183 <https://github.com/tensorwerk/hangar-py/pull/183>`__) `@rlizzo <https://github.com/rlizzo>`__
* Added ``.log()`` method to ``Checkout`` objects so graphical commit graph or machine readable
commit details / DAG can be queried when operating on a particular commit.
(`#183 <https://github.com/tensorwerk/hangar-py/pull/183>`__) `@rlizzo <https://github.com/rlizzo>`__
* "string" type columns now supported alongside "ndarray" column type.
(`#180 <https://github.com/tensorwerk/hangar-py/pull/180>`__) `@rlizzo <https://github.com/rlizzo>`__
* New "column" API, which replaces "arrayset" name.
(`#180 <https://github.com/tensorwerk/hangar-py/pull/180>`__) `@rlizzo <https://github.com/rlizzo>`__
* Arraysets can now contain "nested subsamples" under a common sample key.
(`#179 <https://github.com/tensorwerk/hangar-py/pull/179>`__) `@rlizzo <https://github.com/rlizzo>`__
* New API to add and remove samples from an arrayset.
(`#179 <https://github.com/tensorwerk/hangar-py/pull/179>`__) `@rlizzo <https://github.com/rlizzo>`__
* Added ``repo.size_nbytes`` and ``repo.size_human`` to report disk usage of a repository on disk.
(`#174 <https://github.com/tensorwerk/hangar-py/pull/174>`__) `@rlizzo <https://github.com/rlizzo>`__
* Added method to traverse the entire repository history and cryptographically verify integrity.
(`#173 <https://github.com/tensorwerk/hangar-py/pull/173>`__) `@rlizzo <https://github.com/rlizzo>`__
Changes
-------
* Argument syntax of ``__getitem__()`` and ``get()`` methods of ``ReaderCheckout`` and
``WriterCheckout`` classes. The new format supports handling arbitrary arguments specific
to retrieval of data from any column type.
(`#183 <https://github.com/tensorwerk/hangar-py/pull/183>`__) `@rlizzo <https://github.com/rlizzo>`__
Removed
-------
* ``metadata`` container for ``str`` typed data has been completely removed. It is replaced by a highly
extensible and much more user-friendly ``str`` typed column.
(`#184 <https://github.com/tensorwerk/hangar-py/pull/184>`__) `@rlizzo <https://github.com/rlizzo>`__
* ``__setitem__()`` method in ``WriterCheckout`` objects. Writing data to columns via a checkout object
is no longer supported.
(`#183 <https://github.com/tensorwerk/hangar-py/pull/183>`__) `@rlizzo <https://github.com/rlizzo>`__
Bug Fixes
---------
* Backend data stores no longer use file symlinks, improving compatibility with some types of file systems.
(`#171 <https://github.com/tensorwerk/hangar-py/pull/171>`__) `@rlizzo <https://github.com/rlizzo>`__
* All arrayset types ("flat" and "nested subsamples") and backend readers can now be pickled -- for parallel
processing -- in a read-only checkout.
(`#179 <https://github.com/tensorwerk/hangar-py/pull/179>`__) `@rlizzo <https://github.com/rlizzo>`__
Breaking changes
----------------
* New backend record serialization format is incompatible with repositories written in version 0.4 or earlier.
* New arrayset API is incompatible with Hangar API in version 0.4 or earlier.
`v0.4.0`_ (2019-11-21)
======================
New Features
------------
* Added ability to delete branch names/pointers from a local repository via both API and CLI.
(`#128 <https://github.com/tensorwerk/hangar-py/pull/128>`__) `@rlizzo <https://github.com/rlizzo>`__
* Added ``local`` keyword arg to arrayset key/value iterators to return only locally available samples
(`#131 <https://github.com/tensorwerk/hangar-py/pull/131>`__) `@rlizzo <https://github.com/rlizzo>`__
* Ability to change the backend storage format and options applied to an ``arrayset`` after initialization.
(`#133 <https://github.com/tensorwerk/hangar-py/pull/133>`__) `@rlizzo <https://github.com/rlizzo>`__
* Added blosc compression to HDF5 backend by default on PyPi installations.
(`#146 <https://github.com/tensorwerk/hangar-py/pull/146>`__) `@rlizzo <https://github.com/rlizzo>`__
* Added Benchmarking Suite to Test for Performance Regressions in PRs.
(`#155 <https://github.com/tensorwerk/hangar-py/pull/155>`__) `@rlizzo <https://github.com/rlizzo>`__
* Added new backend optimized to increase speeds for fixed size arrayset access.
(`#160 <https://github.com/tensorwerk/hangar-py/pull/160>`__) `@rlizzo <https://github.com/rlizzo>`__
Improvements
------------
* Removed ``msgpack`` and ``pyyaml`` dependencies. Cleaned up and improved remote client/server code.
(`#130 <https://github.com/tensorwerk/hangar-py/pull/130>`__) `@rlizzo <https://github.com/rlizzo>`__
* Multiprocess Torch DataLoaders allowed on Linux and MacOS.
(`#144 <https://github.com/tensorwerk/hangar-py/pull/144>`__) `@rlizzo <https://github.com/rlizzo>`__
* Added CLI options ``commit``, ``checkout``, ``arrayset create``, & ``arrayset remove``.
(`#150 <https://github.com/tensorwerk/hangar-py/pull/150>`__) `@rlizzo <https://github.com/rlizzo>`__
* Plugin system revamp.
(`#134 <https://github.com/tensorwerk/hangar-py/pull/134>`__) `@hhsecond <https://github.com/hhsecond>`__
* Documentation Improvements and Typo-Fixes.
(`#156 <https://github.com/tensorwerk/hangar-py/pull/156>`__) `@alessiamarcolini <https://github.com/alessiamarcolini>`__
* An arrayset's schema is no longer implicitly removed from a checkout when every sample
  is removed from the arrayset. The previous behavior could result in dangling accessors
  which may or may not self-destruct (as expected) in certain edge cases.
(`#159 <https://github.com/tensorwerk/hangar-py/pull/159>`__) `@rlizzo <https://github.com/rlizzo>`__
* Added type codes to hash digests so that calculation function can be updated in the future without
breaking repos written in previous Hangar versions.
(`#165 <https://github.com/tensorwerk/hangar-py/pull/165>`__) `@rlizzo <https://github.com/rlizzo>`__
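The idea behind these type codes can be sketched in a few lines. This is an illustrative sketch only, not Hangar's actual digest format: prefixing each digest with a short code lets the hash function be swapped in a future release (under a new code) while digests written by older versions remain identifiable and verifiable.

```python
import hashlib

def typed_digest(data: bytes, type_code: str = "0") -> str:
    # Compute the digest under the scheme identified by ``type_code``.
    # A future release could register "1" for a different algorithm
    # without invalidating existing "0"-prefixed digests.
    digest = hashlib.blake2b(data, digest_size=20).hexdigest()
    return f"{type_code}={digest}"

def parse_digest(value: str):
    # Split a stored digest back into its type code and raw hex value.
    type_code, _, digest = value.partition("=")
    return type_code, digest

code, raw = parse_digest(typed_digest(b"some array bytes"))
assert code == "0" and len(raw) == 40
```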
Bug Fixes
---------
* Programmatic access to repository log contents now returns branch heads alongside other log info.
(`#125 <https://github.com/tensorwerk/hangar-py/pull/125>`__) `@rlizzo <https://github.com/rlizzo>`__
* Fixed minor bug in types of values allowed for ``Arrayset`` names vs ``Sample`` names.
(`#151 <https://github.com/tensorwerk/hangar-py/pull/151>`__) `@rlizzo <https://github.com/rlizzo>`__
* Fixed issue where using checkout object to access a sample in multiple arraysets would try to create
a ``namedtuple`` instance with invalid field names. Now incompatible field names are automatically
renamed with their positional index.
(`#161 <https://github.com/tensorwerk/hangar-py/pull/161>`__) `@rlizzo <https://github.com/rlizzo>`__
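The automatic renaming relies on behavior available in the Python standard library; a minimal reproduction of the approach described in the fix:

```python
from collections import namedtuple

# Sample keys like "1" or "class" are not valid namedtuple field names.
# With rename=True, each invalid name is replaced by an underscore plus
# its positional index.
fields = ["image", "1", "class"]
Row = namedtuple("Row", fields, rename=True)
print(Row._fields)   # ('image', '_1', '_2')
```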
* Explicitly raise error if ``commit`` argument is set while checking out a repository with ``write=True``.
(`#166 <https://github.com/tensorwerk/hangar-py/pull/166>`__) `@rlizzo <https://github.com/rlizzo>`__
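A minimal sketch of the guard this change adds (the function below is a hypothetical stand-in, not Hangar's actual implementation): a specific ``commit`` only makes sense for a read-only checkout, so combining it with ``write=True`` now raises an explicit error.

```python
def checkout(*, write=False, commit=None):
    # Write-enabled checkouts always operate from a branch HEAD, so a
    # fixed commit reference cannot be honored; fail loudly rather than
    # silently ignoring the argument.
    if write and commit is not None:
        raise ValueError(
            "cannot checkout a specific commit with write=True; "
            "write-enabled checkouts always operate from a branch HEAD")
    return {"write": write, "commit": commit}

co = checkout(commit="baddba")   # fine: read-only view at a fixed commit
```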
Breaking changes
----------------
* New commit reference serialization format is incompatible with repositories written in version 0.3.0 or earlier.
`v0.3.0`_ (2019-09-10)
======================
New Features
------------
* API addition allowing reading and writing arrayset data from a checkout object directly.
(`#115 <https://github.com/tensorwerk/hangar-py/pull/115>`__) `@rlizzo <https://github.com/rlizzo>`__
* Data importer, exporters, and viewers via CLI for common file formats. Includes plugin system
for easy extensibility in the future.
(`#103 <https://github.com/tensorwerk/hangar-py/pull/103>`__)
(`@rlizzo <https://github.com/rlizzo>`__, `@hhsecond <https://github.com/hhsecond>`__)
Improvements
------------
* Added tutorial on working with remote data.
(`#113 <https://github.com/tensorwerk/hangar-py/pull/113>`__) `@rlizzo <https://github.com/rlizzo>`__
* Added Tutorial on Tensorflow and PyTorch Dataloaders.
(`#117 <https://github.com/tensorwerk/hangar-py/pull/117>`__) `@hhsecond <https://github.com/hhsecond>`__
* Large performance improvement to the diff/merge algorithm (~30x faster than the previous implementation).
(`#112 <https://github.com/tensorwerk/hangar-py/pull/112>`__) `@rlizzo <https://github.com/rlizzo>`__
* New commit hash algorithm which is much more reproducible in the long term.
(`#120 <https://github.com/tensorwerk/hangar-py/pull/120>`__) `@rlizzo <https://github.com/rlizzo>`__
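One ingredient of reproducible content hashing can be sketched in a few lines. Hangar's real record format is a binary specification, not JSON; this only illustrates the principle that hashing a canonical serialization gives the same digest regardless of how the record was assembled.

```python
import hashlib
import json

def commit_digest(record: dict) -> str:
    # Sorted keys and fixed separators make the serialization canonical,
    # so logically equal records always hash identically.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = {"parent": "baddba", "message": "init", "tree": "abc123"}
b = {"tree": "abc123", "parent": "baddba", "message": "init"}
assert commit_digest(a) == commit_digest(b)
```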
* HDF5 backend updated to increase speed of reading/writing variable sized dataset compressed chunks
(`#120 <https://github.com/tensorwerk/hangar-py/pull/120>`__) `@rlizzo <https://github.com/rlizzo>`__
Bug Fixes
---------
* Fixed ML Dataloaders errors for a number of edge cases surrounding partial-remote data and non-common keys.
(`#110 <https://github.com/tensorwerk/hangar-py/pull/110>`__)
( `@hhsecond <https://github.com/hhsecond>`__, `@rlizzo <https://github.com/rlizzo>`__)
Breaking changes
----------------
* New commit hash algorithm is incompatible with repositories written in version 0.2.0 or earlier
`v0.2.0`_ (2019-08-09)
======================
New Features
------------
* Numpy memory-mapped array file backend added.
(`#70 <https://github.com/tensorwerk/hangar-py/pull/70>`__) `@rlizzo <https://github.com/rlizzo>`__
* Remote server data backend added.
(`#70 <https://github.com/tensorwerk/hangar-py/pull/70>`__) `@rlizzo <https://github.com/rlizzo>`__
* Selection heuristics to determine appropriate backend from arrayset schema.
(`#70 <https://github.com/tensorwerk/hangar-py/pull/70>`__) `@rlizzo <https://github.com/rlizzo>`__
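As a hypothetical illustration of what such a heuristic looks like (the backend codes below mirror names in the docs, but the thresholds and rules are invented for this sketch):

```python
def select_backend(shape, itemsize, variable_shape):
    # Choose a storage backend from schema properties alone:
    # variable-shape samples go to chunked HDF5 storage, small fixed
    # arrays to the LMDB key-value store, and large fixed arrays to
    # memory-mapped numpy files.
    nbytes = itemsize
    for dim in shape:
        nbytes *= dim
    if variable_shape:
        return "hdf5_00"
    if nbytes < 4_000:
        return "lmdb_30"
    return "numpy_10"

print(select_backend((28, 28), 1, False))        # lmdb_30
print(select_backend((256, 256, 3), 4, False))   # numpy_10
```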
* Partial remote clones and fetch operations now fully supported.
(`#85 <https://github.com/tensorwerk/hangar-py/pull/85>`__) `@rlizzo <https://github.com/rlizzo>`__
* The CLI has been placed under test coverage, and interface usage has been added to the docs.
(`#85 <https://github.com/tensorwerk/hangar-py/pull/85>`__) `@rlizzo <https://github.com/rlizzo>`__
* TensorFlow and PyTorch Machine Learning Dataloader Methods (*Experimental Release*).
(`#91 <https://github.com/tensorwerk/hangar-py/pull/91>`__)
lead: `@hhsecond <https://github.com/hhsecond>`__, co-author: `@rlizzo <https://github.com/rlizzo>`__,
reviewed by: `@elistevens <https://github.com/elistevens>`__
Improvements
------------
* Record format versioning and standardization so as not to break backwards compatibility in the future.
(`#70 <https://github.com/tensorwerk/hangar-py/pull/70>`__) `@rlizzo <https://github.com/rlizzo>`__
* Backend addition and update developer protocols and documentation.
(`#70 <https://github.com/tensorwerk/hangar-py/pull/70>`__) `@rlizzo <https://github.com/rlizzo>`__
* Read-only checkout arrayset sample ``get`` methods now are multithread and multiprocess safe.
(`#84 <https://github.com/tensorwerk/hangar-py/pull/84>`__) `@rlizzo <https://github.com/rlizzo>`__
* Read-only checkout metadata sample ``get`` methods are thread safe if used within a context manager.
(`#101 <https://github.com/tensorwerk/hangar-py/pull/101>`__) `@rlizzo <https://github.com/rlizzo>`__
* Samples can be assigned integer names in addition to ``string`` names.
(`#89 <https://github.com/tensorwerk/hangar-py/pull/89>`__) `@rlizzo <https://github.com/rlizzo>`__
* Forgetting to close a ``write-enabled`` checkout before terminating the python process will
  now automatically close the checkout in many situations.
(`#101 <https://github.com/tensorwerk/hangar-py/pull/101>`__) `@rlizzo <https://github.com/rlizzo>`__
* Repository software version compatibility methods added to ensure upgrade paths in the future.
(`#101 <https://github.com/tensorwerk/hangar-py/pull/101>`__) `@rlizzo <https://github.com/rlizzo>`__
* Many tests added (including support for Mac OSX on Travis-CI).
lead: `@rlizzo <https://github.com/rlizzo>`__, co-author: `@hhsecond <https://github.com/hhsecond>`__
Bug Fixes
---------
* Diff results for fast-forward merges now return sensible results.
(`#77 <https://github.com/tensorwerk/hangar-py/pull/77>`__) `@rlizzo <https://github.com/rlizzo>`__
* Many type annotations added, and developer documentation improved.
`@hhsecond <https://github.com/hhsecond>`__ & `@rlizzo <https://github.com/rlizzo>`__
Breaking changes
----------------
* Renamed all references to ``datasets`` in the API / world-view to ``arraysets``.
* These are backwards incompatible changes. For all versions > 0.2, repository upgrade utilities will
be provided if breaking changes occur.
`v0.1.1`_ (2019-05-24)
======================
Bug Fixes
---------
* Fixed typo in the README which was uploaded to PyPI.
`v0.1.0`_ (2019-05-24)
======================
New Features
------------
* Remote client-server config negotiation and administrator permissions.
(`#10 <https://github.com/tensorwerk/hangar-py/pull/10>`__) `@rlizzo <https://github.com/rlizzo>`__
* Allow single python process to access multiple repositories simultaneously.
(`#20 <https://github.com/tensorwerk/hangar-py/pull/20>`__) `@rlizzo <https://github.com/rlizzo>`__
* Fast-Forward and 3-Way Merge and Diff methods now fully supported and behaving as expected.
(`#32 <https://github.com/tensorwerk/hangar-py/pull/32>`__) `@rlizzo <https://github.com/rlizzo>`__
Improvements
------------
* Initial test-case specification.
(`#14 <https://github.com/tensorwerk/hangar-py/pull/14>`__) `@hhsecond <https://github.com/hhsecond>`__
* Checkout test-case work.
(`#25 <https://github.com/tensorwerk/hangar-py/pull/25>`__) `@hhsecond <https://github.com/hhsecond>`__
* Metadata test-case work.
(`#27 <https://github.com/tensorwerk/hangar-py/pull/27>`__) `@hhsecond <https://github.com/hhsecond>`__
* Any potential failure cases raise exceptions instead of silently returning.
(`#16 <https://github.com/tensorwerk/hangar-py/pull/16>`__) `@rlizzo <https://github.com/rlizzo>`__
* Many usability improvements in a variety of commits.
Bug Fixes
---------
* Ensure references to checkout arrayset or metadata objects cannot operate after the checkout is closed.
(`#41 <https://github.com/tensorwerk/hangar-py/pull/41>`__) `@rlizzo <https://github.com/rlizzo>`__
* Sensible exception classes and error messages raised on a variety of situations (Many commits).
`@hhsecond <https://github.com/hhsecond>`__ & `@rlizzo <https://github.com/rlizzo>`__
* Many minor issues addressed.
API Additions
-------------
* Refer to API documentation (`#23 <https://github.com/tensorwerk/hangar-py/pull/23>`__)
Breaking changes
----------------
* All repositories written with previous versions of Hangar are liable to break when using this version. Please upgrade versions immediately.
`v0.0.0`_ (2019-04-15)
======================
* First Public Release of Hangar!
.. _v0.0.0: https://github.com/tensorwerk/hangar-py/commit/2aff3805c66083a7fbb2ebf701ceaf38ac5165c7
.. _v0.1.0: https://github.com/tensorwerk/hangar-py/compare/v0.0.0...v0.1.0
.. _v0.1.1: https://github.com/tensorwerk/hangar-py/compare/v0.1.0...v0.1.1
.. _v0.2.0: https://github.com/tensorwerk/hangar-py/compare/v0.1.1...v0.2.0
.. _v0.3.0: https://github.com/tensorwerk/hangar-py/compare/v0.2.0...v0.3.0
.. _v0.4.0: https://github.com/tensorwerk/hangar-py/compare/v0.3.0...v0.4.0
.. _v0.5.0: https://github.com/tensorwerk/hangar-py/compare/v0.4.0...v0.5.0
.. _v0.5.1: https://github.com/tensorwerk/hangar-py/compare/v0.5.0...v0.5.1
.. _v0.5.2: https://github.com/tensorwerk/hangar-py/compare/v0.5.1...v0.5.2
.. _In-Progress: https://github.com/tensorwerk/hangar-py/compare/v0.5.2...master
================================================
FILE: CODE_OF_CONDUCT.rst
================================================
===========================
Contributor Code of Conduct
===========================
Our Pledge
----------
In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to making participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, sex characteristics, gender identity and expression,
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.
Our Standards
-------------
Examples of behavior that contributes to creating a positive environment
include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
Our Responsibilities
--------------------
Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.
Scope
-----
This Code of Conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community. Examples of
representing a project or community include using an official project e-mail
address, posting via an official social media account, or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.
Enforcement
-----------
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at
`hangar.info@tensorwerk.com <mailto:hangar.info@tensorwerk.com>`__. All complaints will
be reviewed and investigated and will result in a response that is deemed
necessary and appropriate to the circumstances. The project team is obligated to
maintain confidentiality with regard to the reporter of an incident. Further
details of specific enforcement policies may be posted separately.
Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.
Attribution
-----------
This Code of Conduct is adapted from the `Contributor Covenant`_ homepage, version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
.. _Contributor Covenant: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq
================================================
FILE: CONTRIBUTING.rst
================================================
============
Contributing
============
Contributions are welcome, and they are greatly appreciated! Every
little bit helps, and credit will always be given.
All community members should read and abide by our :ref:`ref-code-of-conduct`.
Bug reports
===========
When `reporting a bug <https://github.com/tensorwerk/hangar-py/issues>`_ please include:
* Your operating system name and version.
* Any details about your local setup that might be helpful in
troubleshooting.
* Detailed steps to reproduce the bug.
Documentation improvements
==========================
Hangar could always use more documentation, whether as part of the
official Hangar docs, in docstrings, or even on the web in blog posts,
articles, and such.
Feature requests and feedback
=============================
The best way to send feedback is to file an issue at https://github.com/tensorwerk/hangar-py/issues.
If you are proposing a feature:
* Explain in detail how it would work.
* Keep the scope as narrow as possible, to make it easier to implement.
* Remember that this is a volunteer-driven project, and that code contributions
are welcome :)
Development
===========
To set up `hangar-py` for local development:
1. Fork `hangar-py <https://github.com/tensorwerk/hangar-py>`_
(look for the "Fork" button).
2. Clone your fork locally::
git clone git@github.com:your_name_here/hangar-py.git
3. Create a branch for local development::
git checkout -b name-of-your-bugfix-or-feature
Now you can make your changes locally.
4. When you're done making changes, run all the checks, doc builder, and spell
   checker with `tox <http://tox.readthedocs.io/en/latest/install.html>`_ in one
   command::
tox
5. Commit your changes and push your branch to GitHub::
git add .
git commit -m "Your detailed description of your changes."
git push origin name-of-your-bugfix-or-feature
6. Submit a pull request through the GitHub website.
Pull Request Guidelines
-----------------------
If you need some code review or feedback while you're developing the code, just
make the pull request.
For merging, you should:
1. Include passing tests (run ``tox``) [1]_.
2. Update documentation when there's new API, functionality etc.
3. Add a note to ``CHANGELOG.rst`` about the changes.
4. Add yourself to ``AUTHORS.rst``.
.. [1] If you don't have all the necessary python versions available
locally you can rely on Travis - it will `run the tests
<https://travis-ci.org/tensorwerk/hangar-py/pull_requests>`_ for each change
you add in the pull request.
   It will be slower, though.
Tips
----
To run a subset of tests::
tox -e envname -- pytest -k test_myfeature
To run all the test environments in *parallel* (you need to ``pip install detox``)::
detox
================================================
FILE: LICENSE
================================================
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
Copyright 2019 Richard Izzo
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
================================================
FILE: MANIFEST.in
================================================
graft docs
graft src
graft tests
include .bumpversion.cfg
include .coveragerc
include .editorconfig
include AUTHORS.rst
include CHANGELOG.rst
include CONTRIBUTING.rst
include CODE_OF_CONDUCT.rst
include LICENSE
include README.rst
include tox.ini
include mypy.ini
include setup.py
global-exclude *.py[cod] *.so *.DS_Store
global-exclude __pycache__ .mypy_cache .pytest_cache .hypothesis
================================================
FILE: README.rst
================================================
========
Overview
========
.. start-badges
.. list-table::
:stub-columns: 1
* - docs
- |docs|
* - tests
- | |gh-build-status| |codecov|
| |lgtm|
* - package
- | |version| |wheel| |conda-forge|
| |supported-versions| |supported-implementations|
| |license|
.. |docs| image:: https://readthedocs.org/projects/hangar-py/badge/?style=flat
:target: https://readthedocs.org/projects/hangar-py
:alt: Documentation Status
.. |gh-build-status| image:: https://github.com/tensorwerk/hangar-py/workflows/Run%20Test%20Suite/badge.svg?branch=master
:alt: Build Status
:target: https://github.com/tensorwerk/hangar-py/actions?query=workflow%3A%22Run+Test+Suite%22+branch%3Amaster+event%3Apush+is%3Acompleted
.. |codecov| image:: https://codecov.io/gh/tensorwerk/hangar-py/branch/master/graph/badge.svg
:alt: Code Coverage
:target: https://codecov.io/gh/tensorwerk/hangar-py
.. |lgtm| image:: https://img.shields.io/lgtm/grade/python/g/tensorwerk/hangar-py.svg?logo=lgtm&logoWidth=18
:alt: Language grade: Python
:target: https://lgtm.com/projects/g/tensorwerk/hangar-py/context:python
.. |version| image:: https://img.shields.io/pypi/v/hangar.svg
:alt: PyPI Package latest release
:target: https://pypi.org/project/hangar
.. |license| image:: https://img.shields.io/github/license/tensorwerk/hangar-py
:alt: GitHub license
:target: https://github.com/tensorwerk/hangar-py/blob/master/LICENSE
.. |conda-forge| image:: https://img.shields.io/conda/vn/conda-forge/hangar.svg
:alt: Conda-Forge Latest Version
:target: https://anaconda.org/conda-forge/hangar
.. |wheel| image:: https://img.shields.io/pypi/wheel/hangar.svg
:alt: PyPI Wheel
:target: https://pypi.org/project/hangar
.. |supported-versions| image:: https://img.shields.io/pypi/pyversions/hangar.svg
:alt: Supported versions
:target: https://pypi.org/project/hangar
.. |supported-implementations| image:: https://img.shields.io/pypi/implementation/hangar.svg
:alt: Supported implementations
:target: https://pypi.org/project/hangar
.. end-badges
Hangar is version control for tensor data. Commit, branch, merge, revert, and
collaborate in the data-defined software era.
* Free software: Apache 2.0 license
What is Hangar?
===============
Hangar is based on the belief that too much time is spent collecting, managing,
and creating home-brewed version control systems for data. At its core, Hangar
is designed to solve many of the same problems faced by traditional code version
control systems (e.g. ``Git``), just adapted for numerical data:
* Time travel through the historical evolution of a dataset
* Zero-cost Branching to enable exploratory analysis and collaboration
* Cheap Merging to build datasets over time (with multiple collaborators)
* Completely abstracted organization and management of data files on disk
* Ability to only retrieve a small portion of the data (as needed) while still
maintaining complete historical record
* Ability to push and pull changes directly to collaborators or a central server
(i.e. a truly distributed version control system)
The ability of version control systems to perform these tasks for codebases is
largely taken for granted by almost every developer today; however, we are
in fact standing on the shoulders of giants, with decades of engineering which
has resulted in these phenomenally useful tools. Now that a new era of
"Data-Defined software" is taking hold, we find there is a strong need for
analogous version control systems which are designed to handle numerical data at
large scale... Welcome to Hangar!
The Hangar Workflow:
::
Checkout Branch
|
▼
Create/Access Data
|
▼
Add/Remove/Update Samples
|
▼
Commit
Log Style Output:
.. code-block:: text
* 5254ec (master) : merge commit combining training updates and new validation samples
|\
| * 650361 (add-validation-data) : Add validation labels and image data in isolated branch
* | 5f15b4 : Add some metadata for later reference and add new training samples received after initial import
|/
* baddba : Initial commit adding training images and labels
Learn more about what Hangar is all about at https://hangar-py.readthedocs.io/
Installation
============
Hangar is in early alpha development!
::
pip install hangar
Documentation
=============
https://hangar-py.readthedocs.io/
Development
===========
To run all the tests run::
tox
Note, to combine the coverage data from all the tox environments run:

.. list-table::
    :widths: 10 90
    :stub-columns: 1

    - - Windows
      - ::

            set PYTEST_ADDOPTS=--cov-append
            tox

    - - Other
      - ::

            PYTEST_ADDOPTS=--cov-append tox
================================================
FILE: asv_bench/README.rst
================================================
Hangar Performance Benchmarking Suite
=====================================
A set of benchmarking tools are included in order to track the performance of
common hangar operations over the course of time. The benchmark suite is run
via the phenomenal `Airspeed Velocity (ASV) <https://asv.readthedocs.io/>`_
project.
Benchmarks can be viewed at the following web link, or by examining the raw
data files in the separate benchmark results repo.
- `Benchmark Web View <https://tensorwerk.com/hangar-benchmarks>`_
- `Benchmark Results Repo <https://github.com/tensorwerk/hangar-benchmarks>`_
.. figure:: ../docs/img/asv-detailed.png
:align: center
Purpose
*******
In addition to providing historical metrics and insight into application
performance over many releases of Hangar, *the benchmark suite is used as a
canary to identify potentially problematic pull requests.* All PRs to the
Hangar repository are automatically benchmarked by our CI system to compare the
performance of proposed changes to that of the current ``master`` branch.
*The results of this canary are explicitly NOT to be used as the
"be-all-end-all" decider of whether a PR is suitable to be merged or not.*
Instead, it is meant to serve the following purposes:
1. **Help contributors understand the consequences of some set of changes on the
greater system early in the PR process.** Simple code is best; if there's no
obvious performance degradation or significant improvement to be had, then
there's no need (or rationale) for using more complex algorithms or
data structures; they just mean more work for the author and project
maintainers, and a burden on the long-term health of the codebase.
2. **Not everything can be caught by the capabilities of a traditional test
suite.** Hangar is fairly flat/modular in structure, but there are certain
hotspots in the codebase where a simple change could drastically degrade
performance. It's not always obvious where these hotspots are, and even a
change which is functionally identical (introducing no issues/bugs to the
end user) can unknowingly cross a line and introduce a large regression
that goes completely unnoticed by the authors/reviewers.
3. Sometimes tradeoffs need to be made when introducing something new to a
system. Whether this be due to fundamental CS problems (space vs. time) or
simple matters of practicality vs. purity, it's always easier to act in
environments where relevant information is available before a decision is
made. **Identifying and quantifying tradeoffs/regressions/benefits during
development is the only way we can make informed decisions.** The only time
to accept a regression is when we know about it in advance and decide it is
the right choice at the time; if we don't measure, we will never know.
Important Notes on Using/Modifying the Benchmark Suite
******************************************************
1. **Do not commit any of the benchmark results, environment files, or generated
visualizations to the repository**. We store benchmark results in a `separate
repository <https://github.com/tensorwerk/hangar-benchmarks>`_ so as not to
clutter the main repo with unnecessary data. The default directories these are
generated in are excluded by our ``.gitignore`` config, so barring some unusual
git usage patterns, this should not be a day-to-day concern.
2. Proposed changes to the benchmark suite should be made to the code in this
repository first. The benchmark results repository mirror will be
synchronized upon approval/merge of changes to the main Hangar repo.
Introduction to Running Benchmarks
**********************************
As ASV sets up and manages its own virtual environments and source
installations, benchmark execution is not run via ``tox``. While a brief
tutorial is included below, please refer to the `ASV Docs
<https://asv.readthedocs.io/>`_ for detailed information on how to run,
understand, and write ASV benchmarks.
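For orientation, an ASV benchmark is just a Python class following ASV's naming conventions: ``time_*`` methods are timed, ``track_*`` methods record arbitrary values, and ``setup`` runs before (and is excluded from) each measurement. A minimal, hypothetical example, not part of this suite:

```python
# Hypothetical minimal ASV-style benchmark class (not part of this suite).
# ASV discovers ``time_*`` methods by name and times them; ``setup`` runs
# beforehand and is excluded from the measurement.
class TimeListCopy:
    number = 100            # invocations per timing sample
    repeat = (2, 4, 10.0)   # (min_repeat, max_repeat, max_time_seconds)
    warmup_time = 0

    def setup(self):
        # build the fixture data once per timing sample
        self.data = list(range(1_000))

    def time_copy(self):
        # the operation being benchmarked
        list(self.data)
```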
First Time Setup
----------------
1. Ensure that ``virtualenv``, ``setuptools``, and ``pip`` are updated to
the latest versions.
2. Install ASV ``$ pip install asv``.
3. Open a terminal and navigate to the ``hangar-py/asv_bench`` directory.
4. Run ``$ asv machine`` to record details of your machine; it is OK to
just use the defaults.
Running Benchmarks
------------------
Refer to the `using ASV
<https://asv.readthedocs.io/en/stable/using.html#running-benchmarks>`_ page for
a full tutorial, paying close attention to the `asv run
<https://asv.readthedocs.io/en/stable/commands.html#asv-run>`_ command.
Generally ``asv run`` requires a range of commits to benchmark across
(specified via branch names, tags, or commit digests).
To benchmark every commit between the current master ``HEAD`` and ``v0.2.0``,
you would execute::
$ asv run v0.2.0..master
However, this may result in a larger workload than you are willing to wait
around for. To limit the number of commits, you can specify the ``--steps=N``
option to benchmark at most ``N`` commits between ``HEAD`` and ``v0.2.0``.
The most useful tool during development is the `asv continuous
<https://asv.readthedocs.io/en/stable/commands.html#asv-continuous>`_ command.
Using the following syntax will benchmark any changes in a local development
branch against the base ``master`` commit::
$ asv continuous origin/master HEAD
Running `asv compare
<https://asv.readthedocs.io/en/stable/commands.html#asv-compare>`_ will
generate a quick summary of any performance differences::
$ asv compare origin/master HEAD
Visualizing Results
-------------------
After generating benchmark data for a number of commits through history, the
results can be reviewed in (an automatically generated) local web interface by
running the following commands::
$ asv publish
$ asv preview
Navigating to ``http://127.0.0.1:8080/`` will pull up an interactive webpage
where the full set of benchmark graphs and exploration utilities can be viewed.
This will look something like the image below.
.. figure:: ../docs/img/asv-main.png
:align: center
================================================
FILE: asv_bench/asv.conf.json
================================================
{
// The version of the config file format. Do not change, unless
// you know what you are doing.
"version": 1,
// The name of the project being benchmarked
"project": "hangar",
// The project's homepage
"project_url": "https://hangar-py.readthedocs.io",
// The URL or local path of the source code repository for the
// project being benchmarked
"repo": "..",
// The Python project's subdirectory in your repo. If missing or
// the empty string, the project is assumed to be located at the root
// of the repository.
// "repo_subdir": "",
// Customizable commands for building, installing, and
// uninstalling the project. See asv.conf.json documentation.
//
// "install_command": ["in-dir={env_dir} python -mpip install {wheel_file}"],
// "uninstall_command": ["return-code=any python -mpip uninstall -y {project}"],
// "build_command": [
// "python setup.py build",
// "PIP_NO_BUILD_ISOLATION=false python -mpip wheel --no-deps --no-index -w {build_cache_dir} {build_dir}"
// ],
// List of branches to benchmark. If not provided, defaults to "master"
// (for git) or "default" (for mercurial).
"branches": ["master"], // for git
// "branches": ["default"], // for mercurial
// The DVCS being used. If not set, it will be automatically
// determined from "repo" by looking at the protocol in the URL
// (if remote), or by looking for special directories, such as
// ".git" (if local).
"dvcs": "git",
// The tool to use to create environments. May be "conda",
// "virtualenv" or other value depending on the plugins in use.
// If missing or the empty string, the tool will be automatically
// determined by looking for tools on the PATH environment
// variable.
"environment_type": "virtualenv",
// timeout in seconds for installing any dependencies in environment
// defaults to 10 min
//"install_timeout": 600,
// the base URL to show a commit for the project.
"show_commit_url": "http://github.com/tensorwerk/hangar-py/commit/",
// The Pythons you'd like to test against. If not provided, defaults
// to the current version of Python used to run `asv`.
// "pythons": ["3.7"],
// The list of conda channel names to be searched for benchmark
// dependency packages in the specified order
// "conda_channels": ["conda-forge", "defaults"],
// The matrix of dependencies to test. Each key is the name of a
// package (in PyPI) and the values are version numbers. An empty
// list or empty string indicates to just test against the default
// (latest) version. null indicates that the package is to not be
// installed. If the package to be tested is only available from
// PyPi, and the 'environment_type' is conda, then you can preface
// the package name by 'pip+', and the package will be installed via
// pip (with all the conda available packages installed first,
// followed by the pip installed packages).
//
// "matrix": {
// "numpy": ["1.6", "1.7"],
// "six": ["", null], // test with and without six installed
// "pip+emcee": [""], // emcee is only available for install with pip.
// },
"matrix": {
"req": {
"Cython": [], // latest version of Cython
},
},
// Combinations of libraries/python versions can be excluded/included
// from the set to test. Each entry is a dictionary containing additional
// key-value pairs to include/exclude.
//
// An exclude entry excludes entries where all values match. The
// values are regexps that should match the whole string.
//
// An include entry adds an environment. Only the packages listed
// are installed. The 'python' key is required. The exclude rules
// do not apply to includes.
//
// In addition to package names, the following keys are available:
//
// - python
// Python version, as in the *pythons* variable above.
// - environment_type
// Environment type, as above.
// - sys_platform
// Platform, as in sys.platform. Possible values for the common
// cases: 'linux2', 'win32', 'cygwin', 'darwin'.
//
// "exclude": [
// {"python": "3.2", "sys_platform": "win32"}, // skip py3.2 on windows
// {"environment_type": "conda", "six": null}, // don't run without six on conda
// ],
//
// "include": [
// // additional env for python2.7
// {"python": "2.7", "numpy": "1.8"},
// // additional env if run on windows+conda
// {"platform": "win32", "environment_type": "conda", "python": "2.7", "libpython": ""},
// ],
// The directory (relative to the current directory) that benchmarks are
// stored in. If not provided, defaults to "benchmarks"
"benchmark_dir": "benchmarks",
// The directory (relative to the current directory) to cache the Python
// environments in. If not provided, defaults to "env"
"env_dir": "env",
// The directory (relative to the current directory) that raw benchmark
// results are stored in. If not provided, defaults to "results".
"results_dir": "results",
// The directory (relative to the current directory) that the html tree
// should be written to. If not provided, defaults to "html".
"html_dir": "html",
// The number of characters to retain in the commit hashes.
"hash_length": 8,
// `asv` will cache results of the recent builds in each
// environment, making them faster to install next time. This is
// the number of builds to keep, per environment.
"build_cache_size": 2
// The commits after which the regression search in `asv publish`
// should start looking for regressions. Dictionary whose keys are
// regexps matching to benchmark names, and values corresponding to
// the commit (exclusive) after which to start looking for
// regressions. The default is to start from the first commit
// with results. If the commit is `null`, regression detection is
// skipped for the matching benchmark.
//
// "regressions_first_commits": {
// "some_benchmark": "352cdf", // Consider regressions only after this commit
// "another_benchmark": null, // Skip regression detection altogether
// },
// The thresholds for relative change in results, after which `asv
// publish` starts reporting regressions. Dictionary of the same
// form as in ``regressions_first_commits``, with values
// indicating the thresholds. If multiple entries match, the
// maximum is taken. If no entry matches, the default is 5%.
//
// "regressions_thresholds": {
// "some_benchmark": 0.01, // Threshold of 1%
// "another_benchmark": 0.5, // Threshold of 50%
// },
}
================================================
FILE: asv_bench/benchmarks/__init__.py
================================================
================================================
FILE: asv_bench/benchmarks/backend_comparisons.py
================================================
# Write the benchmarking functions here.
# See "Writing benchmarks" in the asv docs for more information.
import numpy as np
import os
from hangar import Repository
from tempfile import mkdtemp
from shutil import rmtree
from hangar.utils import folder_size
# ------------------------- fixture functions ----------------------------------
class _WriterSuite:
params = ['hdf5_00', 'hdf5_01', 'numpy_10']
param_names = ['backend']
processes = 2
repeat = (2, 4, 30.0)
# repeat == tuple (min_repeat, max_repeat, max_time)
number = 2
warmup_time = 0
def setup(self, backend):
# subclass defines: self.method
self.current_iter_number = 0
self.backend_code = {
'numpy_10': '10',
'hdf5_00': '00',
'hdf5_01': '01',
}
# subclass defines: self.num_samples
self.sample_shape = (50, 50, 20)
self.tmpdir = mkdtemp()
self.repo = Repository(path=self.tmpdir, exists=False)
self.repo.init('tester', 'foo@test.bar', remove_old=True)
self.co = self.repo.checkout(write=True)
component_arrays = []
ndims = len(self.sample_shape)
for idx, shape in enumerate(self.sample_shape):
layout = [1 for i in range(ndims)]
layout[idx] = shape
component = np.hamming(shape).reshape(*layout) * 100
component_arrays.append(component.astype(np.float32))
self.arr = np.prod(component_arrays).astype(np.float32)
try:
self.aset = self.co.arraysets.init_arrayset(
'aset', prototype=self.arr, backend_opts=self.backend_code[backend])
except TypeError:
try:
self.aset = self.co.arraysets.init_arrayset(
'aset', prototype=self.arr, backend=self.backend_code[backend])
except ValueError:
raise NotImplementedError
except ValueError:
raise NotImplementedError
except AttributeError:
self.aset = self.co.add_ndarray_column(
'aset', prototype=self.arr, backend=self.backend_code[backend])
def teardown(self, backend):
self.co.close()
self.repo._env._close_environments()
rmtree(self.tmpdir)
def write(self, backend):
arr = self.arr
iter_number = self.current_iter_number
with self.aset as cm_aset:
for i in range(self.num_samples):
arr[iter_number, iter_number, iter_number] += 1
cm_aset[i] = arr
self.current_iter_number += 1
# ----------------------------- Writes ----------------------------------------
class Write_50by50by20_300_samples(_WriterSuite):
method = 'write'
num_samples = 300
time_write = _WriterSuite.write
# ----------------------------- Reads -----------------------------------------
class _ReaderSuite:
params = ['hdf5_00', 'hdf5_01', 'numpy_10']
param_names = ['backend']
processes = 2
repeat = (2, 4, 30.0)
# repeat == tuple (min_repeat, max_repeat, max_time)
number = 3
warmup_time = 0
timeout = 60
def setup_cache(self):
backend_code = {
'numpy_10': '10',
'hdf5_00': '00',
'hdf5_01': '01',
}
sample_shape = (50, 50, 10)
num_samples = 3_000
repo = Repository(path=os.getcwd(), exists=False)
repo.init('tester', 'foo@test.bar', remove_old=True)
co = repo.checkout(write=True)
component_arrays = []
ndims = len(sample_shape)
for idx, shape in enumerate(sample_shape):
layout = [1 for i in range(ndims)]
layout[idx] = shape
component = np.hamming(shape).reshape(*layout) * 100
component_arrays.append(component.astype(np.float32))
arr = np.prod(component_arrays).astype(np.float32)
for backend, code in backend_code.items():
try:
co.arraysets.init_arrayset(
backend, prototype=arr, backend_opts=code)
except TypeError:
try:
co.arraysets.init_arrayset(
backend, prototype=arr, backend=code)
except ValueError:
pass
except ValueError:
pass
except AttributeError:
co.add_ndarray_column(backend, prototype=arr, backend=code)
try:
col = co.columns
except AttributeError:
col = co.arraysets
with col as asets_cm:
for aset in asets_cm.values():
changer = 0
for i in range(num_samples):
arr[changer, changer, changer] += 1
aset[i] = arr
changer += 1
co.commit('first commit')
co.close()
repo._env._close_environments()
def setup(self, backend):
self.repo = Repository(path=os.getcwd(), exists=True)
self.co = self.repo.checkout(write=False)
try:
try:
self.aset = self.co.columns[backend]
except AttributeError:
self.aset = self.co.arraysets[backend]
except KeyError:
raise NotImplementedError
def teardown(self, backend):
self.co.close()
self.repo._env._close_environments()
def read(self, backend):
with self.aset as cm_aset:
for i in cm_aset.keys():
arr = cm_aset[i]
class Read_50by50by10_3000_samples(_ReaderSuite):
method = 'read'
num_samples = 3000
time_read = _ReaderSuite.read
================================================
FILE: asv_bench/benchmarks/backends/__init__.py
================================================
================================================
FILE: asv_bench/benchmarks/backends/hdf5_00.py
================================================
# Write the benchmarking functions here.
# See "Writing benchmarks" in the asv docs for more information.
import numpy as np
from hangar import Repository
from tempfile import mkdtemp
from shutil import rmtree
from hangar.utils import folder_size
class _WriterSuite_HDF5_00:
processes = 2
repeat = (2, 4, 20.0)
# repeat == tuple (min_repeat, max_repeat, max_time)
number = 2
warmup_time = 0
def setup(self):
# subclasses define: self.method, self.num_samples, self.sample_shape
self.current_iter_number = 0
self.tmpdir = mkdtemp()
self.repo = Repository(path=self.tmpdir, exists=False)
self.repo.init('tester', 'foo@test.bar', remove_old=True)
self.co = self.repo.checkout(write=True)
component_arrays = []
ndims = len(self.sample_shape)
for idx, shape in enumerate(self.sample_shape):
layout = [1 for i in range(ndims)]
layout[idx] = shape
component = np.hamming(shape).reshape(*layout) * 100
component_arrays.append(component.astype(np.float32))
arr = np.prod(component_arrays).astype(np.float32)
try:
self.aset = self.co.arraysets.init_arrayset('aset', prototype=arr, backend_opts='00')
except TypeError:
self.aset = self.co.arraysets.init_arrayset('aset', prototype=arr, backend='00')
except ValueError:
# marks as skipped benchmark for commits which do not have this backend.
raise NotImplementedError
except AttributeError:
self.aset = self.co.add_ndarray_column('aset', prototype=arr, backend='00')
if self.method == 'read':
with self.aset as cm_aset:
for i in range(self.num_samples):
arr[0, 0, 0] += 1
cm_aset[i] = arr
self.co.commit('first commit')
self.co.close()
self.co = self.repo.checkout(write=False)
try:
self.aset = self.co.columns['aset']
except AttributeError:
self.aset = self.co.arraysets['aset']
else:
self.arr = arr
def teardown(self):
self.co.close()
self.repo._env._close_environments()
rmtree(self.tmpdir)
def read(self):
with self.aset as cm_aset:
for k in cm_aset.keys():
arr = cm_aset[k]
def write(self):
arr = self.arr
iter_num = self.current_iter_number
with self.aset as cm_aset:
for i in range(self.num_samples):
arr[iter_num, iter_num, iter_num] += 1
cm_aset[i] = arr
self.current_iter_number += 1
def size(self):
return folder_size(self.repo._env.repo_path, recurse=True)
class Write_50by50by10_1_samples(_WriterSuite_HDF5_00):
method = 'write'
sample_shape = (50, 50, 10)
num_samples = 1
time_write = _WriterSuite_HDF5_00.write
class Write_50by50by10_100_samples(_WriterSuite_HDF5_00):
method = 'write'
sample_shape = (50, 50, 10)
num_samples = 100
time_write = _WriterSuite_HDF5_00.write
# ----------------------------- Reads -----------------------------------------
class Read_50by50by10_1_samples(_WriterSuite_HDF5_00):
method = 'read'
sample_shape = (50, 50, 10)
num_samples = 1
time_read = _WriterSuite_HDF5_00.read
class Read_50by50by10_100_samples(_WriterSuite_HDF5_00):
method = 'read'
sample_shape = (50, 50, 10)
num_samples = 100
time_read = _WriterSuite_HDF5_00.read
class Read_50by50by10_300_samples(_WriterSuite_HDF5_00):
method = 'read'
sample_shape = (50, 50, 10)
num_samples = 300
time_read = _WriterSuite_HDF5_00.read
track_repo_size = _WriterSuite_HDF5_00.size
track_repo_size.unit = 'bytes'
================================================
FILE: asv_bench/benchmarks/backends/hdf5_01.py
================================================
# Write the benchmarking functions here.
# See "Writing benchmarks" in the asv docs for more information.
import numpy as np
from hangar import Repository
from tempfile import mkdtemp
from shutil import rmtree
from hangar.utils import folder_size
class _WriterSuite_HDF5_01:
processes = 2
repeat = (2, 4, 20.0)
# repeat == tuple (min_repeat, max_repeat, max_time)
number = 2
warmup_time = 0
def setup(self):
# subclasses define: self.method, self.num_samples, self.sample_shape
self.current_iter_number = 0
self.tmpdir = mkdtemp()
self.repo = Repository(path=self.tmpdir, exists=False)
self.repo.init('tester', 'foo@test.bar', remove_old=True)
self.co = self.repo.checkout(write=True)
component_arrays = []
ndims = len(self.sample_shape)
for idx, shape in enumerate(self.sample_shape):
layout = [1 for i in range(ndims)]
layout[idx] = shape
component = np.hamming(shape).reshape(*layout) * 100
component_arrays.append(component.astype(np.float32))
arr = np.prod(component_arrays).astype(np.float32)
try:
self.aset = self.co.arraysets.init_arrayset('aset', prototype=arr, backend_opts='01')
except TypeError:
try:
self.aset = self.co.arraysets.init_arrayset('aset', prototype=arr, backend='01')
except ValueError:
raise NotImplementedError
except ValueError:
# marks as skipped benchmark for commits which do not have this backend.
raise NotImplementedError
except AttributeError:
self.aset = self.co.add_ndarray_column('aset', prototype=arr, backend='01')
if self.method == 'read':
with self.aset as cm_aset:
for i in range(self.num_samples):
arr[0, 0, 0] += 1
cm_aset[i] = arr
self.co.commit('first commit')
self.co.close()
self.co = self.repo.checkout(write=False)
try:
self.aset = self.co.columns['aset']
except AttributeError:
self.aset = self.co.arraysets['aset']
else:
self.arr = arr
def teardown(self):
self.co.close()
self.repo._env._close_environments()
rmtree(self.tmpdir)
def read(self):
with self.aset as cm_aset:
for k in cm_aset.keys():
arr = cm_aset[k]
def write(self):
arr = self.arr
iter_num = self.current_iter_number
with self.aset as cm_aset:
for i in range(self.num_samples):
arr[iter_num, iter_num, iter_num] += 1
cm_aset[i] = arr
self.current_iter_number += 1
def size(self):
return folder_size(self.repo._env.repo_path, recurse=True)
class Write_50by50by10_1_samples(_WriterSuite_HDF5_01):
method = 'write'
sample_shape = (50, 50, 10)
num_samples = 1
time_write = _WriterSuite_HDF5_01.write
class Write_50by50by10_100_samples(_WriterSuite_HDF5_01):
method = 'write'
sample_shape = (50, 50, 10)
num_samples = 100
time_write = _WriterSuite_HDF5_01.write
# ----------------------------- Reads -----------------------------------------
class Read_50by50by10_1_samples(_WriterSuite_HDF5_01):
method = 'read'
sample_shape = (50, 50, 10)
num_samples = 1
time_read = _WriterSuite_HDF5_01.read
class Read_50by50by10_100_samples(_WriterSuite_HDF5_01):
method = 'read'
sample_shape = (50, 50, 10)
num_samples = 100
time_read = _WriterSuite_HDF5_01.read
class Read_50by50by10_300_samples(_WriterSuite_HDF5_01):
method = 'read'
sample_shape = (50, 50, 10)
num_samples = 300
time_read = _WriterSuite_HDF5_01.read
track_repo_size = _WriterSuite_HDF5_01.size
track_repo_size.unit = 'bytes'
================================================
FILE: asv_bench/benchmarks/backends/numpy_10.py
================================================
# Write the benchmarking functions here.
# See "Writing benchmarks" in the asv docs for more information.
import numpy as np
from hangar import Repository
from tempfile import mkdtemp
from shutil import rmtree
from hangar.utils import folder_size
class _WriterSuite_NUMPY_10:
processes = 2
repeat = (2, 4, 20.0)
# repeat == tuple (min_repeat, max_repeat, max_time)
number = 2
warmup_time = 0
def setup(self):
# subclasses define: self.method, self.num_samples, self.sample_shape
self.current_iter_number = 0
self.tmpdir = mkdtemp()
self.repo = Repository(path=self.tmpdir, exists=False)
self.repo.init('tester', 'foo@test.bar', remove_old=True)
self.co = self.repo.checkout(write=True)
component_arrays = []
ndims = len(self.sample_shape)
for idx, shape in enumerate(self.sample_shape):
layout = [1 for i in range(ndims)]
layout[idx] = shape
component = np.hamming(shape).reshape(*layout) * 100
component_arrays.append(component.astype(np.float32))
arr = np.prod(component_arrays).astype(np.float32)
try:
self.aset = self.co.arraysets.init_arrayset('aset', prototype=arr, backend_opts='10')
except TypeError:
self.aset = self.co.arraysets.init_arrayset('aset', prototype=arr, backend='10')
except ValueError:
# marks as skipped benchmark for commits which do not have this backend.
raise NotImplementedError
except AttributeError:
self.aset = self.co.add_ndarray_column('aset', prototype=arr, backend='10')
if self.method == 'read':
with self.aset as cm_aset:
for i in range(self.num_samples):
arr[0, 0, 0] += 1
cm_aset[i] = arr
self.co.commit('first commit')
self.co.close()
self.co = self.repo.checkout(write=False)
try:
self.aset = self.co.columns['aset']
except AttributeError:
self.aset = self.co.arraysets['aset']
else:
self.arr = arr
def teardown(self):
self.co.close()
self.repo._env._close_environments()
rmtree(self.tmpdir)
def read(self):
with self.aset as cm_aset:
for k in cm_aset.keys():
arr = cm_aset[k]
def write(self):
arr = self.arr
iter_num = self.current_iter_number
with self.aset as cm_aset:
for i in range(self.num_samples):
arr[iter_num, iter_num, iter_num] += 1
cm_aset[i] = arr
self.current_iter_number += 1
def size(self):
return folder_size(self.repo._env.repo_path, recurse=True)
class Write_50by50by10_1_samples(_WriterSuite_NUMPY_10):
method = 'write'
sample_shape = (50, 50, 10)
num_samples = 1
time_write = _WriterSuite_NUMPY_10.write
class Write_50by50by10_100_samples(_WriterSuite_NUMPY_10):
method = 'write'
sample_shape = (50, 50, 10)
num_samples = 100
time_write = _WriterSuite_NUMPY_10.write
# ----------------------------- Reads -----------------------------------------
class Read_50by50by10_1_samples(_WriterSuite_NUMPY_10):
method = 'read'
sample_shape = (50, 50, 10)
num_samples = 1
time_read = _WriterSuite_NUMPY_10.read
class Read_50by50by10_100_samples(_WriterSuite_NUMPY_10):
method = 'read'
sample_shape = (50, 50, 10)
num_samples = 100
time_read = _WriterSuite_NUMPY_10.read
class Read_50by50by10_300_samples(_WriterSuite_NUMPY_10):
method = 'read'
sample_shape = (50, 50, 10)
num_samples = 300
time_read = _WriterSuite_NUMPY_10.read
track_repo_size = _WriterSuite_NUMPY_10.size
track_repo_size.unit = 'bytes'
================================================
FILE: asv_bench/benchmarks/commit_and_checkout.py
================================================
from tempfile import mkdtemp
from shutil import rmtree
import numpy as np
from hangar import Repository
class MakeCommit(object):
params = (5_000, 20_000, 50_000)
param_names = ['num_samples']
processes = 2
repeat = (2, 4, 20)
number = 1
warmup_time = 0
def setup(self, num_samples):
self.tmpdir = mkdtemp()
self.repo = Repository(path=self.tmpdir, exists=False)
self.repo.init('tester', 'foo@test.bar', remove_old=True)
self.co = self.repo.checkout(write=True)
arr = np.array([0,], dtype=np.uint8)
try:
aset = self.co.arraysets.init_arrayset('aset', prototype=arr, backend_opts='10')
except TypeError:
aset = self.co.arraysets.init_arrayset('aset', prototype=arr, backend='10')
except AttributeError:
aset = self.co.add_ndarray_column('aset', prototype=arr, backend='10')
with aset as cm_aset:
for i in range(num_samples):
arr[:] = i % 255
cm_aset[i] = arr
def teardown(self, num_samples):
self.co.close()
self.repo._env._close_environments()
rmtree(self.tmpdir)
def time_commit(self, num_samples):
self.co.commit('hello')
class CheckoutCommit(object):
params = (5_000, 20_000, 50_000)
param_names = ['num_samples']
processes = 2
number = 1
repeat = (2, 4, 20)
warmup_time = 0
def setup(self, num_samples):
self.tmpdir = mkdtemp()
self.repo = Repository(path=self.tmpdir, exists=False)
self.repo.init('tester', 'foo@test.bar', remove_old=True)
self.co = self.repo.checkout(write=True)
arr = np.array([0,], dtype=np.uint8)
try:
aset = self.co.arraysets.init_arrayset('aset', prototype=arr, backend_opts='10')
except TypeError:
aset = self.co.arraysets.init_arrayset('aset', prototype=arr, backend='10')
except AttributeError:
aset = self.co.add_ndarray_column('aset', prototype=arr, backend='10')
with aset as cm_aset:
for i in range(num_samples):
arr[:] = i % 255
cm_aset[i] = arr
self.co.commit('first')
self.co.close()
self.co = None
def teardown(self, num_samples):
try:
self.co.close()
except PermissionError:
pass
self.repo._env._close_environments()
rmtree(self.tmpdir)
def time_checkout_read_only(self, num_samples):
self.co = self.repo.checkout(write=False)
def time_checkout_write_enabled(self, num_samples):
self.co = self.repo.checkout(write=True)
self.co.close()
================================================
FILE: asv_bench/benchmarks/package.py
================================================
class TimeImport(object):
processes = 2
repeat = (5, 10, 10.0)
def timeraw_import(self):
return """
from hangar import Repository
"""
================================================
FILE: codecov.yml
================================================
comment:
layout: "diff, files"
behavior: default
require_changes: false # if true: only post the comment if coverage changes
coverage:
range: 60..100
round: nearest
precision: 2
================================================
FILE: docs/Tutorial-001.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Part 1: Creating A Repository And Working With Data\n",
"\n",
"This tutorial will review the first steps of working with a hangar repository.\n",
"\n",
"To fit with the beginner's theme, we will use the MNIST dataset. Later examples will show off how to work with much more complex data."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from hangar import Repository\n",
"\n",
"import numpy as np\n",
"import pickle\n",
"import gzip\n",
"import matplotlib.pyplot as plt\n",
"\n",
"from tqdm import tqdm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating & Interacting with a Hangar Repository\n",
"\n",
"Hangar is designed to “just make sense” in every operation you have to perform.\n",
"As such, there is a single interface which all interaction begins with: the\n",
"[Repository](api.rst#hangar.repository.Repository) object.\n",
"\n",
"Whether a hangar repository exists at the path you specify or not, just tell\n",
"hangar where it should live!\n",
"\n",
"#### Initializing a repository\n",
"\n",
"The first time you want to work with a new repository, the repository\n",
"[init()](api.rst#hangar.repository.Repository.init) method\n",
"must be called. This is where you provide Hangar with your name and email\n",
"address (to be used in the commit log), as well as implicitly confirming that\n",
"you do want to create the underlying data files hangar uses on disk."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hangar Repo initialized at: /Users/rick/projects/tensorwerk/hangar/dev/mnist/.hangar\n"
]
},
{
"data": {
"text/plain": [
"'/Users/rick/projects/tensorwerk/hangar/dev/mnist/.hangar'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"repo = Repository(path='/Users/rick/projects/tensorwerk/hangar/dev/mnist/')\n",
"\n",
"# Only needed the first time a repository is accessed!\n",
"# Note: if you feed a path to the `Repository` which does not contain a pre-initialized hangar repo,\n",
"# when the Repository object is initialized it will let you know that you need to run `init()`\n",
"\n",
"repo.init(user_name='Rick Izzo', user_email='rick@tensorwerk.com', remove_old=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Checking out the repo for writing\n",
"\n",
"A repository can be checked out in two modes:\n",
"\n",
"1. [write-enabled](api.rst#hangar.checkout.WriterCheckout): applies all operations to the staging area’s current\n",
" state. Only one write-enabled checkout can be active at a different time,\n",
" must be closed upon last use, or manual intervention will be needed to remove\n",
" the writer lock.\n",
"\n",
"2. [read-only](api.rst#read-only-checkout): checkout a commit or branch to view repository state as it\n",
" existed at that point in time.\n",
"\n",
"#### Lots of useful information is in the iPython `__repr__`\n",
"\n",
"If you're ever in doubt about what the state of the object your working\n",
"on is, just call its reps, and the most relevant information will be\n",
"sent to your screen!"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar WriterCheckout \n",
" Writer : True \n",
" Base Branch : master \n",
" Num Columns : 0\n"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co = repo.checkout(write=True)\n",
"co"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### A checkout allows access to [columns](api.rst#hangar.columns.column.Columns)\n",
"\n",
"The [columns](api.rst#hangar.checkout.WriterCheckout.columns) attributes\n",
"of a checkout provide the interface to working with all of the data on disk!"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar Columns \n",
" Writeable : True \n",
" Number of Columns : 0 \n",
" Column Names / Partial Remote References: \n",
" - "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Before data can be added to a repository, a column must be initialized.\n",
"\n",
"We're going to first load up a the MNIST pickled dataset so it can be added to\n",
"the repo!"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# Load the dataset\n",
"with gzip.open('/Users/rick/projects/tensorwerk/hangar/dev/data/mnist.pkl.gz', 'rb') as f:\n",
" train_set, valid_set, test_set = pickle.load(f, encoding='bytes')\n",
"\n",
"def rescale(array):\n",
" array = array * 256\n",
" rounded = np.round(array)\n",
" return rounded.astype(np.uint8())\n",
"\n",
"sample_trimg = rescale(train_set[0][0])\n",
"sample_trlabel = np.array([train_set[1][0]])\n",
"trimgs = rescale(train_set[0])\n",
"trlabels = train_set[1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Before data can be added to a repository, a column must be initialized.\n",
"\n",
"An \"Column\" is a named grouping of data samples where each sample shares a\n",
"number of similar attributes and array properties.\n",
"\n",
"See the docstrings below or in [add_ndarray_column()](api.rst#hangar.checkout.WriterCheckout.add_ndarray_column)\n",
"\n",
".. include:: ./noindexapi/apiinit.rst"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"col = co.add_ndarray_column(name='mnist_training_images', prototype=trimgs[0])"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar FlatSampleWriter \n",
" Column Name : mnist_training_images \n",
" Writeable : True \n",
" Column Type : ndarray \n",
" Column Layout : flat \n",
" Schema Type : fixed_shape \n",
" DType : uint8 \n",
" Shape : (784,) \n",
" Number of Samples : 0 \n",
" Partial Remote Data Refs : False\n"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"col"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Interaction\n",
"\n",
"#### Through columns attribute\n",
"\n",
"When a column is initialized, a column accessor object will be returned,\n",
"however, depending on your use case, this may or may not be the most convenient\n",
"way to access a arrayset.\n",
"\n",
"In general, we have implemented a full `dict` mapping interface on top of all\n",
"objects. To access the `'mnist_training_images'` arrayset you can just use a\n",
"dict style access like the following (note: if operating in iPython/Jupyter, the\n",
"arrayset keys will autocomplete for you).\n",
"\n",
"The column objects returned here contain many useful instrospecion methods which\n",
"we will review over the rest of the tutorial."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar FlatSampleWriter \n",
" Column Name : mnist_training_images \n",
" Writeable : True \n",
" Column Type : ndarray \n",
" Column Layout : flat \n",
" Schema Type : fixed_shape \n",
" DType : uint8 \n",
" Shape : (784,) \n",
" Number of Samples : 0 \n",
" Partial Remote Data Refs : False\n"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.columns['mnist_training_images']"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar FlatSampleWriter \n",
" Column Name : mnist_training_images \n",
" Writeable : True \n",
" Column Type : ndarray \n",
" Column Layout : flat \n",
" Schema Type : fixed_shape \n",
" DType : uint8 \n",
" Shape : (784,) \n",
" Number of Samples : 0 \n",
" Partial Remote Data Refs : False\n"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"train_aset = co.columns['mnist_training_images']\n",
"\n",
"# OR an equivalent way using the `.get()` method\n",
"\n",
"train_aset = co.columns.get('mnist_training_images')\n",
"train_aset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Through the checkout object (arrayset and sample access)\n",
"\n",
"In addition to the standard `co.columns` access methods, we have implemented a convenience mapping to [columns](api.rst#hangar.columns.column.Columns) and [flat samples](api.rst#hangar.columns.layout_flat.FlatSampleWriter) or [nested samples](api.rst#hangar.columns.layout_nested.NestedSampleWriter) / [nested subsamples](api.rst#hangar.columns.layout_nested.FlatSubsampleWriter) (ie. data) for both reading and writing from the [checkout](api.rst#hangar.checkout.WriterCheckout) object itself.\n",
"\n",
"To get the same arrayset object from the checkout, simply use:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar FlatSampleWriter \n",
" Column Name : mnist_training_images \n",
" Writeable : True \n",
" Column Type : ndarray \n",
" Column Layout : flat \n",
" Schema Type : fixed_shape \n",
" DType : uint8 \n",
" Shape : (784,) \n",
" Number of Samples : 0 \n",
" Partial Remote Data Refs : False\n"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"train_asets = co['mnist_training_images']\n",
"train_asets"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Though that works as expected, most use cases will take advantage of adding and reading data from multiple columns / samples at a time. This is shown in the next section."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Adding Data\n",
"\n",
"To add data to a named arrayset, we can use dict-style setting\n",
"(refer to the `__setitem__`, `__getitem__`, and `__delitem__` methods),\n",
"or the `update()` method. Sample keys can be either `str` or `int` type."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"train_aset['0'] = trimgs[0]\n",
"\n",
"data = {\n",
" '1': trimgs[1],\n",
" '2': trimgs[2],\n",
"}\n",
"train_aset.update(data)\n",
"\n",
"train_aset[51] = trimgs[51]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using the checkout method"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"co['mnist_training_images', 60] = trimgs[60]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### How many samples are in the arrayset?"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(train_aset)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Containment Testing"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"'hi' in train_aset"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"'0' in train_aset"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"60 in train_aset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Dictionary Style Retrieval for known keys"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n"
]
},
{
"data": {
"text/plain": [
"<matplotlib.image.AxesImage at 0x3703cc7f0>"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAPsAAAD4CAYAAAAq5pAIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8li6FKAAAOYElEQVR4nO3dbYxc5XnG8euKbUwxJvHGseMQFxzjFAg0Jl0ZkBFQoVCCIgGKCLGiiFBapwlOQutKUFoVWtHKrRIiSimSKS6m4iWQgPAHmsSyECRqcFmoAROHN+MS4+0aswIDIfZ6fffDjqsFdp5dZs68eO//T1rNzLnnzLk1cPmcmeeceRwRAjD5faDTDQBoD8IOJEHYgSQIO5AEYQeSmNrOjR3i6XGoZrRzk0Aqv9Fb2ht7PFatqbDbPkfS9ZKmSPrXiFhVev6hmqGTfVYzmwRQsDE21K01fBhve4qkGyV9TtLxkpbZPr7R1wPQWs18Zl8i6fmI2BoReyXdJem8atoCULVmwn6kpF+Nery9tuwdbC+33We7b0h7mtgcgGY0E/axvgR4z7m3EbE6InojoneapjexOQDNaCbs2yXNH/X445J2NNcOgFZpJuyPSlpke4HtQyR9SdK6atoCULWGh94iYp/tFZJ+rJGhtzUR8XRlnQGoVFPj7BHxgKQHKuoFQAtxuiyQBGEHkiDsQBKEHUiCsANJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kQdiBJAg7kARhB5Ig7EAShB1IgrADSRB2IAnCDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJNDWLK7qfp5b/E0/5yOyWbv+ZPz+6bm34sP3FdY9auLNYP+wbLtb/97pD6tYe7/1+cd1dw28V6yffs7JYP+bPHinWO6GpsNveJukNScOS9kVEbxVNAaheFXv234+IXRW8DoAW4jM7kESzYQ9JP7H9mO3lYz3B9nLbfbb7hrSnyc0BaFSzh/FLI2KH7TmS1tv+ZUQ8PPoJEbFa0mpJOsI90eT2ADSoqT17ROyo3e6UdJ+kJVU0BaB6DYfd9gzbMw/cl3S2pM1VNQagWs0cxs+VdJ/tA69zR0T8qJKuJpkpxy0q1mP6tGJ9xxkfKtbfPqX+mHDPB8vjxT/9dHm8uZP+49czi/V/+OdzivWNJ95Rt/bi0NvFdVcNfLZY/9hPD75PpA2HPSK2Svp0hb0AaCGG3oAkCDuQBGEHkiDsQBKEHUiCS1wrMHzmZ4r16269sVj/5LT6l2JOZkMxXKz/9Q1fLdanvlUe/jr1nhV1azNf3ldcd/qu8tDcYX0bi/VuxJ4dSIKwA0kQdiAJwg4kQdiBJAg7kARhB5JgnL0C05/ZUaw/9pv5xfonpw1U2U6lVvafUqxvfbP8U9S3LvxB3drr+8vj5HP/6T+L9VY6+C5gHR97diAJwg4kQdiBJAg7kARhB5Ig7EAShB1IwhHtG1E8wj1xss9q2/a6xeAlpxbru88p/9zzlCcPL9af+MYN77unA67d9bvF+qNnlMfRh197vViPU+v/APG2bxVX1YJlT5SfgPfYGBu0OwbHnMuaPTuQBGEHkiDsQBKEHUiCsANJEHYgCcIOJME4exeYMvvDxfrwq4PF+ot31B8rf/r0NcV1l/z9N4v1OTd27ppyvH9NjbPbXmN7p+3No5b12F5v+7na7awqGwZQvYkcxt8q6d2z3l8paUNELJK0ofYYQBcbN+wR8bCkdx9Hnidpbe3+WknnV9wXgIo1+gXd3Ijol6Ta7Zx6T7S93Haf7b4h7WlwcwCa1fJv4yNidUT0RkTvNE1v9eYA1NFo2Adsz5Ok2u3O6loC0AqNhn2dpItr9y+WdH817QBolXF/N972nZLOlDTb9nZJV0taJelu25dKeknSha1scrIb3vVqU+sP7W58fvdPffkXxforN00pv8D+8hzr6B7jhj0ilt
UpcXYMcBDhdFkgCcIOJEHYgSQIO5AEYQeSYMrmSeC4K56tW7vkxPKgyb8dtaFYP+PCy4r1md9/pFhH92DPDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJMM4+CZSmTX7168cV131p3dvF+pXX3las/8UXLyjW478/WLc2/+9+XlxXbfyZ8wzYswNJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEkzZnNzgH55arN9+9XeK9QVTD21425+6bUWxvujm/mJ939ZtDW97smpqymYAkwNhB5Ig7EAShB1IgrADSRB2IAnCDiTBODuKYuniYv2IVduL9Ts/8eOGt33sg39UrP/O39S/jl+Shp/b2vC2D1ZNjbPbXmN7p+3No5ZdY/tl25tqf+dW2TCA6k3kMP5WSeeMsfx7EbG49vdAtW0BqNq4YY+IhyUNtqEXAC3UzBd0K2w/WTvMn1XvSbaX2+6z3TekPU1sDkAzGg37TZIWSlosqV/Sd+s9MSJWR0RvRPRO0/QGNwegWQ2FPSIGImI4IvZLulnSkmrbAlC1hsJue96ohxdI2lzvuQC6w7jj7LbvlHSmpNmSBiRdXXu8WFJI2ibpaxFRvvhYjLNPRlPmzinWd1x0TN3axiuuL677gXH2RV9+8exi/fXTXi3WJ6PSOPu4k0RExLIxFt/SdFcA2orTZYEkCDuQBGEHkiDsQBKEHUiCS1zRMXdvL0/ZfJgPKdZ/HXuL9c9/8/L6r33fxuK6Byt+ShoAYQeyIOxAEoQdSIKwA0kQdiAJwg4kMe5Vb8ht/2nln5J+4cLylM0nLN5WtzbeOPp4bhg8qVg/7P6+pl5/smHPDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJMM4+ybn3hGL92W+Vx7pvXrq2WD/90PI15c3YE0PF+iODC8ovsH/cXzdPhT07kARhB5Ig7EAShB1IgrADSRB2IAnCDiTBOPtBYOqCo4r1Fy75WN3aNRfdVVz3C4fvaqinKlw10FusP3T9KcX6rLXl353HO427Z7c93/aDtrfYftr2t2vLe2yvt/1c7XZW69sF0KiJHMbvk7QyIo6TdIqky2wfL+lKSRsiYpGkDbXHALrUuGGPiP6IeLx2/w1JWyQdKek8SQfOpVwr6fxWNQmgee/rCzrbR0s6SdJGSXMjol8a+QdB0pw66yy33We7b0h7musWQMMmHHbbh0v6oaTLI2L3RNeLiNUR0RsRvdM0vZEeAVRgQmG3PU0jQb89Iu6tLR6wPa9WnydpZ2taBFCFcYfebFvSLZK2RMR1o0rrJF0saVXt9v6WdDgJTD36t4v1139vXrF+0d/+qFj/kw/dW6y30sr+8vDYz/+l/vBaz63/VVx31n6G1qo0kXH2pZK+Iukp25tqy67SSMjvtn2ppJckXdiaFgFUYdywR8TPJI05ubuks6ptB0CrcLoskARhB5Ig7EAShB1IgrADSXCJ6wRNnffRurXBNTOK6359wUPF+rKZAw31VIUVL59WrD9+U3nK5tk/2Fys97zBWHm3YM8OJEHYgSQIO5AEYQeSIOxAEoQdSIKwA0mkGWff+wflny3e+6eDxfpVxzxQt3b2b73VUE9VGRh+u27t9HUri+se+1e/LNZ7XiuPk+8vVtFN2LMDSRB2IAnCDiRB2IEkCDuQBGEHkiDsQBJpxtm3nV/+d+3ZE+9p2bZvfG1hsX79Q2cX6x6u9+O+I4699sW6tUUDG4vrDhermEzYswNJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEo6I8hPs+ZJuk/RRjVy+vDoirrd9jaQ/lvRK7alXRUT9i74lHeGeONlM/Aq0ysbYoN0xOOaJGRM5qWafpJUR8bjtmZIes72+VvteRHynqkYBtM5E5mfvl9Rfu/+G7S2Sjmx1YwCq9b4+s9s+WtJJkg6cg7nC9pO219ieVWed5bb7bPcNaU9TzQJo3ITDbvtwST+UdHlE7JZ0k6SFkhZrZM//3bHWi4jVEdEbEb3TNL
2ClgE0YkJhtz1NI0G/PSLulaSIGIiI4YjYL+lmSUta1yaAZo0bdtuWdIukLRFx3ajl80Y97QJJ5ek8AXTURL6NXyrpK5Kesr2ptuwqSctsL5YUkrZJ+lpLOgRQiYl8G/8zSWON2xXH1AF0F86gA5Ig7EAShB1IgrADSRB2IAnCDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJEHYgCcIOJDHuT0lXujH7FUn/M2rRbEm72tbA+9OtvXVrXxK9NarK3o6KiI+MVWhr2N+zcbsvIno71kBBt/bWrX1J9NaodvXGYTyQBGEHkuh02Fd3ePsl3dpbt/Yl0Vuj2tJbRz+zA2ifTu/ZAbQJYQeS6EjYbZ9j+xnbz9u+shM91GN7m+2nbG+y3dfhXtbY3ml786hlPbbX236udjvmHHsd6u0a2y/X3rtNts/tUG/zbT9oe4vtp21/u7a8o+9doa+2vG9t/8xue4qkZyV9VtJ2SY9KWhYRv2hrI3XY3iapNyI6fgKG7dMlvSnptog4obbsHyUNRsSq2j+UsyLiii7p7RpJb3Z6Gu/abEXzRk8zLul8SV9VB9+7Ql9fVBvet07s2ZdIej4itkbEXkl3STqvA310vYh4WNLguxafJ2lt7f5ajfzP0nZ1eusKEdEfEY/X7r8h6cA04x197wp9tUUnwn6kpF+Nerxd3TXfe0j6ie3HbC/vdDNjmBsR/dLI/zyS5nS4n3cbdxrvdnrXNONd8941Mv15szoR9rGmkuqm8b+lEfEZSZ+TdFntcBUTM6FpvNtljGnGu0Kj0583qxNh3y5p/qjHH5e0owN9jCkidtRud0q6T903FfXAgRl0a7c7O9zP/+umabzHmmZcXfDedXL6806E/VFJi2wvsH2IpC9JWteBPt7D9ozaFyeyPUPS2eq+qajXSbq4dv9iSfd3sJd36JZpvOtNM64Ov3cdn/48Itr+J+lcjXwj/4Kkv+xED3X6+oSkJ2p/T3e6N0l3auSwbkgjR0SXSvqwpA2Snqvd9nRRb/8u6SlJT2okWPM61NtpGvlo+KSkTbW/czv93hX6asv7xumyQBKcQQckQdiBJAg7kARhB5Ig7EAShB1IgrADSfwfs4RxaLJFjqkAAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"out1 = train_aset['0']\n",
"# OR\n",
"out2 = co['mnist_training_images', '0']\n",
"\n",
"print(np.allclose(out1, out2))\n",
"\n",
"plt.imshow(out1.reshape(28, 28))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Dict style iteration supported out of the box"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n",
"1\n",
"2\n",
"51\n",
"60\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAlAAAACBCAYAAAAPH4TmAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8li6FKAAAZWUlEQVR4nO3deZgV1ZkG8Pf0AnSzN9BsIg1Cs4kBaRSQJQYRNa4ji8QIIThmNC4oKkicSaIYMZNHRgVUVECNwd2IjqLCdAwisquADYIsgiCbIMja3ffMH7Tn1HfTRd+6a93q9/c8Pv2d/ureOvbXdftQdeqU0lqDiIiIiCKXkeoOEBEREaUbDqCIiIiIPOIAioiIiMgjDqCIiIiIPOIAioiIiMgjDqCIiIiIPIppAKWUukgptV4ptVEpNSFenaLUYD2Dg7UMFtYzOFjL4FDRrgOllMoE8CWAQQC2A1gGYITW+ov4dY+ShfUMDtYyWFjP4GAtgyUrhteeA2Cj1noTACilXgRwBQDXX4Qaqqauhdox7JJicQyHcUIfVy5pT/VkLVPvEPbv1Vo3qSTFYzPN8NgMFh6bwXGqYzOWAVRLANsc7e0Azg3fSCl1A4AbAKAWcnGuGhjDLikWS/SCU6WrrCdr6S/z9atbXVI8NtMMj81g4bEZHKc6NmOZA1XZiOxfrgdqrWdorYu01kXZqBnD7ijBqqwna5k2eGwGC4/N4OCxGSCxDKC2A2jlaJ8GYEds3aEUYj2Dg7UMFtYzOFjLAIllALUMQHulVBulVA0A1wCYG59uUQqwnsHBWgYL6xkcrGWARD0HSmtdppS6GcB7ADIBzNRar41bzyipWM/gYC2DhfUMDtYyWGKZRA6t9TsA3olTXyjFWM/gYC2DhfUMDtYyOLgSOREREZFHHEARERERecQBFBEREZFHHEARERERecQBFBEREZFHHEAREREReRTTMgZEQVH2sx4m3nnTcZH7rPezJv7J4lEi12JaDRNnFq9MUO+IiMhveAaKiIiIyCMOoIiIiIg84iW8Sqgs+2PJbNI4otesv7NAtMtzQyZufcZukcu9yT6Q+9uHa4jcyqKXTLy3/LDInfvKOBO3u+OTiPpFlQsN6C7aj86cauJ22fKwCDniVb1nidz6onIT31XQK34dpJQ7PORc0X7oz4+b+P5hI0VOL1+TlD4FVahvNxPv6Jcrcp/fPDV884hkKnt+oOTEEZEbO+w3trF0dVTvXx2UXtBDtLPnr0hRT4B9/97bxE3nbRO5sm3bk90dADwDRUREROQZB1BEREREHnEARURERORRoOdAZXZqb2JdM1vkdgxoYOKjveRco7z6tr3wJy8hVu8eqSvaD029yMRLuv5N5DaXHjXx5F2DRK7FQh1zX6qz0guLTHz39OdFrjDbzkULiVlPwKbSUhN/H6opct0dzeMX9xS5nGI7tyJ07Jj3DqeBo1ecY+NGmSKXN3NxsrsTV7uL5L8v799yWYp6EgyqexfR3jS0noknDbGfg1fX3i+2CyG6z72QtvMT22XL43bwzEUmfvuOn4lc9vvLo9pfUBy6xs7lnP7gIyK36Gg7E7/VvYXI6eNy+ZdY7bmxt2gX/+5hEw++6jqRq39JXHcdMZ6BIiIiIvKIAygiIiIijwJ1Ca/8p2eL9sOzp5nYeYkmGUodp4//67FfiVzWYXtKuvcrN4tc3W/KTFxz71GRy12+JI49DKbMevVE+3D/jia+fYq9THB+zg9hr3T/t8Ts/X1MvGC6PK286A+PmviDp58Quc5/tbVtOz69L2e52dHf/txyzzggkzOT3Jl4yLCXIfXp8vgbmL/OxAtUH1DVMhs3MnGnp0tE7q1my1xepVy+Hz+3NNxg4if7Dxa5gvcTvntfOThCLr8y9U/2M61rDTn1pWuNrSZ+W7UUuXhPMMkMuyJYqu3UilmdnxO5YW9cb+IWV30R55644xkoIiIiIo
84gCIiIiLyiAMoIiIiIo8CNQeq5vodor3iWCsTF2bvivn9x+2U14o3/WAf8zL7jFdF7vuQvSLc9NGPo9ofFy3wbvtz8rr8sp7TXLaM3H35dq7GvDpy7svoLRea+NmC+SJXr/O+mPftd3+89BUTP1Ry4Sm2TA+ZZ7Q28boBchJXt6W/NHGLZXz8x48ym+abeOv0JiL3wtn2Z9ilRvz/3OwP2eVBVp+Q8x/71zoR9/0FRWaD+ibuf7d8LFg3R53KUC5ynRbYR+C0P/FZgnp3UvgyKPPHn2bioXXkZ+u4Tvaz9+VGZ4pc+b7vEtC7k3gGioiIiMgjDqCIiIiIPArUJbyynd+K9mMPDTXxAxfJ1cYzP69j4s9uesz1PSftPcvEGy+QTwkvP7DTxL/ofZPIbbnVxm2Q2FOd1V3Zz+wTw+d0k09uz0Dly1eM3jpQtJfP72Ti1WPkexQfrWXi/OXy1vaN++0yCdl/Kpb7Tvzd2CmXrcqq3iiNZD19xDV39Kt6rrnqbNt1dnXqT3uFf5Ym9k/MSwc7m3jGrJ+L3Mrb3T/Xq7stT9vpLW/lF7tu1+3jX4t2+5ErE9anWFxX1/7tnzJyiMg1mxLdFJpI8AwUERERkUccQBERERF5VOUASik1Uym1Wym1xvG9PKXUB0qpDRVfGya2mxQvrGegFLCWwcFjM1B4bFYDkVygng1gKgDn2ukTACzQWk9WSk2oaI+Pf/dikzfL3gbZ5K1GIue8tbHLmfI679r+9tbbuTMGmDj/gPu1VLVYznNq498nd8xGmtbzR6EB3UX70Zl2zlK7bPkrHYJd/v/ydVeZOHOInBPX4Od20YjOz8vH6xRO22bijG2rRK7hQhuXPiBv+X3tLPt79OvzbxW5zOK4zCXYC+AXSGItQ327iXa/Wh/F6619oaC2+9ITreaXu+biZDbS4NjMalsg2r2Gxj7Hs8Nrdg5pnS2ZIldr4B4TL+r2osg986Rj3lNyn9ZVlaQfm6dSfr58zNnzPZyPnZKfmatPlJq45XT5KBeSqjwDpbX+J4DwhRSuAPBsRfwsgCvj3C9KENYzUH4AaxkYPDYDhcdmNRDtHKimWuudAFDxNd9tQ6XUDUqp5Uqp5aU47rYZpVZE9WQt0wKPzWDhsRkcPDYDJuHLGGitZwCYAQD1VF7KFtcu3+t+ar70oPu53y7X2ic773lcnlpGKOGn9H0llbVUPbqYeO8dcimBwmxbvxVhnzX/94O9zXnfi/bW3Ub75TXW+n+1q/HWhxTtjfpNM2vafY+Vt8ef4s7hpImmnlsvzRHt/Mxcly3TQ1bB6aI9JG+u67Y5m/eb2G9HfjKPzXPe+FK0JzZ2X5W9VNuf1Ocn5OfntX//rYk7/H6tiUOHDontsuY0M/FlLUaKXLPPlpo4o6GcUtR/4DAT//Osl1376EfxqKdzeZfHZz0qcmdk5YRvbox5cKyJGxf7Zy7KpLWXmHjouc+7bvfIzU+I9oNTznLZMnbRnoHapZRqDgAVX3fHr0uUAqxncLCWwcJ6BgdrGTDRDqDmAhhVEY8C8GZ8ukMpwnoGB2sZLKxncLCWARPJMgZzACwG0EEptV0pNQbAZACDlFIbAAyqaFMaYD0DpQ1Yy8DgsRkoPDargSrnQGmtR7ikBrp8P+10Gi+v6Y/uav/XZrVeYOIBQ38rtqv7knyKdTpIl3pm5Mq5NWV/PmjiTzq+LnKby+xT1++YOE7kGi782sT5te0Z82TPYTmn+VbR3hKft92stS6q5PsJq2VWu0OuuWPrGiRqtwmz7X9qi/Z5Ne2yF88cPE1ufOAgEsnPx+aJwfbXbESDR8KyteDGOe/p9217iFw72M/PENyJR3SFPa5LaCLnQLWsE34TXFIl/dgMt+0COzf0VHOe7tvbVbTz55ilq05Zl2RrNcp+lk/66EyRu7ex7XMtVYpk4UrkRERERB5xAEVERETkUcKXMUgH5Qe+F+19N3Yy8ddz7S3zEyY9J7a7Z5hd2Vqvkj
e/t3rAcfunTtnqDWnr6IAuov1ex+mu215/2+0mrvt3eVk12iUIyLv85f454Z/Z2D55YNfVhSKXN2y7iT8sfCbslfZy1OPT5DqH+bsS91R337vTrgbeJsv9kl0451IFzkt2ibDtksaivbLtnITuz+9mDH/SNbfihJ3E8MGf+olc3UP+nJriXN7iYJn772D9DLmWTWaXDiYuX7s+rn3iGSgiIiIijziAIiIiIvKIl/AqEfqsxMTX/PEuE7/w+7+I7T7t5bik10u+R5fa9oG07Z/aKXJlm7bE3smAO+v+T0U7wzHWH71V3siS8/el8INsJVdaLnVcuc1Uwb+MezRP/nustst24UL95MOhdaYy8bYLaorciRb2DpuMGvYyxPv9HhPbZdu3wLfl8j3+c5O99P5dSF52zM2w79l0ibzjMPgVjJ3zocBA2Arjye5MNffTHPsTLw/75f3dpn8zsZ/uJs9q09rEx1s3ct2uZc1/uOYKs+Xlvatf/dDEL3dqFr55THgGioiIiMgjDqCIiIiIPOIAioiIiMgjzoGqQt5MuxzBzevlSuT1Jtvboee0fU/k1o6cauKOra4XuQ5/tOPW8g2b4tLPIDhwXW8T39tUzjcLwa6qu+L9ziJ3Ovxxe7nzyfMAEHLM+phXIvvcHiuT0qd4O34sW7RDjplBsyZOEbm5N3eL6D3HN3patDNgJzAd1SdEbke5/RlP3fNTE18wf6zYrsEq+/vS/P1dIqe22uN2T4lcoblppp1jpZetrqrrgbX5wd6iXdJ5mqOlRO4Tx13j+UtlznnreaK1+Iv8HDjr3F+ZeE2fZ91fqNxTQXVX63km/s3jo0Wu05R9Ub3nvl75Ji4dEt0q8MPb2M/FO/Pis+RAnxz7N/ZlcA4UERERUUpxAEVERETkES/heaAWyVvrjwyxpyx7Dr9F5JaMtw/cXHe+vERxbcGFJv6+bzx7mN7KHFdT6mfUELnFx+yt6G2f2yFfl9BeSeEPOV73F+dDLVeI3LWbLjZxx9s2i1yyH2YcL+1+uUq0uzxol+to1fObqN6zeLdcKXzPu/Yhvo3WygeD1pi3zNGyuUIsd33/8J/1N+P7mLhnzcUi9+IPLavobTURdtt76BSLOIxe8msTt/mrf26JD4XstblT9T+o61P0X22X6yg+8zWRG5hjr7tuvPwJ+cLLE9qthPu67Iho3zL6VhNnxnnqBM9AEREREXnEARQRERGRRxxAEREREXnEOVAxKN+128RNH90tcsfutjNzcpWcz/NUwdsmvvQqeft17htL4tnFwNhXXsfEyX4UjnPe0/rJXUVu3RV2uYp3j9QXuR3T2pm47n7/zA2Jpzb3LK56I4+a4+u4v6dTbv89rrl7i682cSH88YggomjkDLbzLs956xqRW3r2i8nuTkRu2NbfxMVLznTd7omfPyPazjldV382RuSaFCduyRiegSIiIiLyiAMoIiIiIo94Cc+DUF+5svJXQ+1Tn8/stkXkwi/bOT32nX36fO6b7rdfk3XnoqEmLgxbLiDeQgO6i/buO46auKRoqsgNXD3cxLUvkqvK10UwL9sFWes3A3pPewScn2+ThvwthT2JnPPy+qaJPxG5hX2cTzOoJXKXrbf36rd9aI3IhRA8TYZsFe3L6lxg4o3jOohcqPWxiN6z7mK5in/dbXbayu6z7dCi7aPrIu6nPmr33f6I++fne33lVIqBOfbvaHlxo4j3FyuegSIiIiLyiAMoIiIiIo84gCIiIiLyiHOgKqGK7O2TX95q5zI9dZ58onf/WvJJ8W6Oa/k4ik++a2MboZ1R9DCgHE9Fzwgb2z/Sd46Jp0E++iMett5nnz7/2siHRa4w2/4OnL10lMi1uOqLuPeFKBUyPrKPqrr31V+I3FWjpoZv7gvOeU9rRof3sRbcHCm1x3TOoUPx7pbv6OPHRbvc0W4zMf5LkbR+07GvOL2nPs/O0buswXNxetfY8AwUERERkUdVDqCUUq2UUsVKqRKl1F
ql1G0V389TSn2glNpQ8bVh4rtLsQghBNYyULJZz2DgsRk4PDargUgu4ZUBGKe1XqmUqgtghVLqAwC/ArBAaz1ZKTUBwAQA4xPX1fjKatPaxF+NbiFyfxhuV2m9us7eqN5/4q4iE3/4SC+Ra/hs/E+ZeuDfWjruIA+F3Uw8IGeficfO7iFyZ8yy22Z/K0/H7xrQxMR5w7eb+JbTF4jtLs61SyPMPdxU5EauvsjEjZ+s7dr9FPFvPdNAppL/htxfmG3iZu8muzesZVW+/q8+or1w1H87Wu6X7DaXyVvzQ0/lO7Nx6FmlWM84UovsJea3Dsglhfo1S81yQFWegdJa79Rar6yIDwEoAdASwBUAfpwU9CyAKxPVSYqPDGSAtQyUUtYzGHhsBg6PzWrA0xwopVQBgO4AlgBoqrXeCZwcZAHId3nNDUqp5Uqp5aU4XtkmlAKsZbCwnsHBWgYL6xlcEQ+glFJ1ALwGYKzW+mCkr9Naz9BaF2mti7JRM5o+UpyxlsHCegYHaxksrGewRbSMgVIqGyd/CV7QWr9e8e1dSqnmWuudSqnmAHYnqpPRyio43cTf92gucsPvm2fi/2jwOqIxbqed27R4epHI5c22T3JvGErpnCchXWtZS9lf1ZJBT4jcR/3s3IcNx5uJ3Oj6WyJ6/9t29DPxvI/l9fX2t/n3kSzpWk+/KNdhD+5I4X3J6VrLN3rb43HhF+1E7tWbBpu45sZdEb3f9+eeJtq/vP9tEw+q/WeRa5hhHyeyt/yoyG0ts7m7xt0hcrXfWBJRX2KRrvVMB58faCnaEzPscdzyf+XvWbyWUahMJHfhKQDPACjRWjsXyJkL4MdFcUYBeDP8teQv+uQsbdYyWFjPAOCxGUisZ8BFcgbqPADXAVitlPpxGvxEAJMBvKyUGgPgawBDXV5PPlF+cizOWgZHHbCegcBjM3B4bFYDVQ6gtNYfQawRLQyMb3e8y2puL9l8N1PeYn5jmw9NPKJuZKePw938TV8Tr3xcXtpp/Kp9infeIf9cpnOThSxorX1by6b/sGezx/+mt8g91Mz95+tcEb5vrS2u2606bk+4jvjwBpErHG2XMWgP/16yC/ODn+uZjo70PJKS/frt2Kz3lWz/85hduTv8CQzOlfoL638tcmNeeMrzvjPC/tyEnOubIEfknMsTXDnjbpFr9cDHJs5F4i/ZheGxmUA1rpcXz1Znn2Hi8i+/Ct88YbgSOREREZFHHEARERERecQBFBEREZFHES1jkGonBtslAk7c/p3ITWz3jokvzDkc1fvvCrv9tf/ccSbueO86E+cdkPNwwm6Aphg5r11vGFogcp1vucXEXwx7LOL37PjOTSbuMN3ObylctaKyzamaCX+UC53U6Bn5WTfpmktN/H7n6JZ9SYQrn7rLxM45TxRsZZu3proLAHgGioiIiMgzDqCIiIiIPEqLS3hbrrTjvC+7vhLx66YdsLc2PvLhhSKnyu0dph0nyadxt99lb3lN5Cqm5K5s0xbRbne7bV9+e8+I36cQy0ysT7EdVR/H5zcxcXk3XoiPRM2J9Uy88RX5bLZ22Yl91Ej3JSNN3PipXJE7ff5yE/P4pmTjGSgiIiIijziAIiIiIvKIAygiIiIij9JiDlThjUtNfOmNPaJ7Dyx1zXGeE1H10WyKvd39kilni1xbfBq+OQHQy1abeGxBn6TuuyXWuuY474lSiWegiIiIiDziAIqIiIjIIw6giIiIiDziAIqIiIjIIw6giIiIiDziAIqIiIjIIw6giIiIiDziAIqIiIjIIw6giIiIiDxSWidvLVel1B4AWwE0BrA3aTt2V9360Vpr3aTqzarGWp4S6xm76tYP1jI50rWeh1H9foZVSXktkzqAMjtVarnWuijpO2Y/4s4vffdLPwB/9cUrv/Sd/YidX/rul34A/uqLF37qt1/64od+8BIeERERkUccQBERERF5lKoB1IwU7Tcc+xE7v/TdL/0A/NUXr/zSd/Yjdn7pu1
/6AfirL174qd9+6UvK+5GSOVBERERE6YyX8IiIiIg84gCKiIiIyKOkDqCUUhcppdYrpTYqpSYked8zlVK7lVJrHN/LU0p9oJTaUPG1YRL60UopVayUKlFKrVVK3ZaqvsQqVfVkLeOPx2Zw6slaBqeWAOtZsU9f1jNpAyilVCaAaQAuBtAZwAilVOdk7R/AbAAXhX1vAoAFWuv2ABZUtBOtDMA4rXUnAL0A/Lbi55CKvkQtxfWcDdYybnhsGmlfT9bSSPtaAqyngz/rqbVOyn8AegN4z9G+B8A9ydp/xT4LAKxxtNcDaF4RNwewPpn9qdjvmwAG+aEv6VRP1jI4tWQ9WUvWkvVMx3om8xJeSwDbHO3tFd9LpaZa650AUPE1P5k7V0oVAOgOYEmq+xIFv9WTtYye32oJsJ7RYi3DpHEtAdbzX/ipnskcQKlKvldt11BQStUB8BqAsVrrg6nuTxRYzwqsZbCkeT1ZS4c0ryXAegp+q2cyB1DbAbRytE8DsCOJ+6/MLqVUcwCo+Lo7GTtVSmXj5C/BC1rr11PZlxj4rZ6sZfT8VkuA9YwWa1khALUEWE/Dj/VM5gBqGYD2Sqk2SqkaAK4BMDeJ+6/MXACjKuJROHldNaGUUgrAMwBKtNYPp7IvMfJbPVnL6PmtlgDrGS3WEoGpJcB6AvBxPZM88esSAF8C+ArA75K87zkAdgIoxclR/RgAjXBy5v6Giq95SehHX5w8Bfs5gE8r/rskFX1J13qylsGpJevJWrKWrGe61pOPciEiIiLyiCuRExEREXnEARQRERGRRxxAEREREXnEARQRERGRRxxAEREREXnEARQRERGRRxxAEREREXn0/6qK5FZQqcBNAAAAAElFTkSuQmCC\n",
"text/plain": [
"<Figure size 720x720 with 5 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# iterate normally over keys\n",
"\n",
"for k in train_aset:\n",
" # equivalent method: for k in train_aset.keys():\n",
" print(k)\n",
"\n",
"# iterate over items (plot results)\n",
"\n",
"fig, axs = plt.subplots(nrows=1, ncols=5, figsize=(10, 10))\n",
"\n",
"for idx, v in enumerate(train_aset.values()):\n",
" axs[idx].imshow(v.reshape(28, 28))\n",
"plt.show()\n",
"\n",
"# iterate over items, store k, v in dict\n",
"\n",
"myDict = {}\n",
"for k, v in train_aset.items():\n",
" myDict[k] = v"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Performance\n",
"\n",
"Once you’ve completed an interactive exploration, be sure to use the context\n",
"manager form of the `update()` and `get()` methods!\n",
"\n",
"In order to make sure that all your data is always safe in Hangar, the backend\n",
"diligently ensures that all contexts (operations which can somehow interact\n",
"with the record structures) are opened and closed appropriately. When you use the\n",
"context manager form of a arrayset object, we can offload a significant amount of\n",
"work to the python runtime, and dramatically increase read and write speeds.\n",
"\n",
"Most columns we’ve tested see an increased throughput differential of 250% -\n",
"500% for writes and 300% - 600% for reads when comparing using the context\n",
"manager form vs the naked form!"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Beginning non-context manager form\n",
"----------------------------------\n",
"Finished non-context manager form in: 78.54769086837769 seconds\n",
"Hard reset requested with writer_lock: 8910b50e-1f9d-4cb1-986c-b99ea84c8a54\n",
"\n",
"Beginning context manager form\n",
"--------------------------------\n",
"Finished context manager form in: 11.608536720275879 seconds\n",
"Hard reset requested with writer_lock: ad4a2ef9-8494-49f8-84ef-40c3990b1e9b\n"
]
}
],
"source": [
"import time\n",
"\n",
"# ----------------- Non Context Manager Form ----------------------\n",
"\n",
"co = repo.checkout(write=True)\n",
"aset_trimgs = co.add_ndarray_column(name='train_images', prototype=sample_trimg)\n",
"aset_trlabels = co.add_ndarray_column(name='train_labels', prototype=sample_trlabel)\n",
"\n",
"print(f'Beginning non-context manager form')\n",
"print('----------------------------------')\n",
"start_time = time.time()\n",
"\n",
"for idx, img in enumerate(trimgs):\n",
" aset_trimgs[idx] = img\n",
" aset_trlabels[idx] = np.array([trlabels[idx]])\n",
"\n",
"print(f'Finished non-context manager form in: {time.time() - start_time} seconds')\n",
"\n",
"co.reset_staging_area()\n",
"co.close()\n",
"\n",
"# ----------------- Context Manager Form --------------------------\n",
"\n",
"co = repo.checkout(write=True)\n",
"aset_trimgs = co.add_ndarray_column(name='train_images', prototype=sample_trimg)\n",
"aset_trlabels = co.add_ndarray_column(name='train_labels', prototype=sample_trlabel)\n",
"\n",
"print(f'\\nBeginning context manager form')\n",
"print('--------------------------------')\n",
"start_time = time.time()\n",
"\n",
"with aset_trimgs, aset_trlabels:\n",
" for idx, img in enumerate(trimgs):\n",
" aset_trimgs[idx] = img\n",
" aset_trlabels[idx] = np.array([trlabels[idx]])\n",
"\n",
"print(f'Finished context manager form in: {time.time() - start_time} seconds')\n",
"\n",
"co.reset_staging_area()\n",
"co.close()\n",
"\n",
"print(f'Finished context manager with checkout form in: {time.time() - start_time} seconds')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Clearly, the context manager form is far and away superior, however we fell that\n",
"for the purposes of interactive use that the \"Naked\" form is valubal to the\n",
"average user!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Commiting Changes\n",
"\n",
"Once you have made a set of changes you want to commit, just simply call the [commit()](api.rst#hangar.checkout.WriterCheckout.commit) method (and pass in a message)!"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'a=8eb01eaf0c657f8526dbf9a8ffab0a4606ebfd3b'"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.commit('hello world, this is my first hangar commit')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The returned value (`'e11d061dc457b361842801e24cbd119a745089d6'`) is the commit hash of this commit. It\n",
"may be useful to assign this to a variable and follow this up by creating a\n",
"branch from this commit!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Don't Forget to Close the Write-Enabled Checkout to Release the Lock!\n",
"\n",
"We mentioned in `Checking out the repo for writing` that when a\n",
"`write-enabled` checkout is created, it places a lock on writers until it is\n",
"closed. If for whatever reason the program terminates via a non python `SIGKILL` or fatal\n",
"interpreter error without closing the\n",
"write-enabled checkout, this lock will persist (forever technically, but\n",
"realistically until it is manually freed).\n",
"\n",
"Luckily, preventing this issue from occurring is as simple as calling\n",
"[close()](api.rst#hangar.checkout.WriterCheckout.close)!\n",
"\n",
"If you forget, normal interperter shutdown should trigger an `atexit` hook automatically,\n",
"however this behavior should not be relied upon. Is better to just call\n",
"[close()](api.rst#hangar.checkout.WriterCheckout.close)."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"co.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### But if you did forget, and you recieve a `PermissionError` next time you open a checkout\n",
"\n",
"```\n",
"PermissionError: Cannot acquire the writer lock. Only one instance of\n",
"a writer checkout can be active at a time. If the last checkout of this\n",
"repository did not properly close, or a crash occured, the lock must be\n",
"manually freed before another writer can be instantiated.\n",
"```\n",
"\n",
"You can manually free the lock with the following method. However!\n",
"\n",
"This is a dangerous operation, and it's one of the only ways where a user can put\n",
"data in their repository at risk! If another python process is still holding the\n",
"lock, do NOT force the release. Kill the process (that's totally fine to do at\n",
"any time, then force the lock release)."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"repo.force_release_writer_lock()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Reading Data\n",
"\n",
"Two different styles of access are considered below, In general, the contex manager form\n",
"if recomended (though marginal performance improvements are expected to be seen at best)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
" Neither BRANCH or COMMIT specified.\n",
" * Checking out writing HEAD BRANCH: master\n",
"\n",
"Begining Key Iteration\n",
"-----------------------\n",
"completed in 5.838773965835571 sec\n",
"\n",
"Begining Items Iteration with Context Manager\n",
"---------------------------------------------\n",
"completed in 5.516948938369751 sec\n"
]
}
],
"source": [
"co = repo.checkout()\n",
"\n",
"trlabel_col = co['train_labels']\n",
"trimg_col = co['train_images']\n",
"\n",
"print(f'\\nBeginning Key Iteration')\n",
"print('-----------------------')\n",
"start = time.time()\n",
"\n",
"for idx in trimg_col.keys():\n",
" image_data = trimg_col[idx]\n",
" label_data = trlabel_col[idx]\n",
"\n",
"print(f'completed in {time.time() - start} sec')\n",
"\n",
"print(f'\\nBeginning Items Iteration with Context Manager')\n",
"print('---------------------------------------------')\n",
"start = time.time()\n",
"\n",
"with trlabel_col, trimg_col:\n",
" for index, image_data in trimg_col.items():\n",
" label_data = trlabel_col[index]\n",
"\n",
"print(f'completed in {time.time() - start} sec')\n",
"\n",
"co.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Inspecting state from the top!\n",
"\n",
"After your first commit, the summary and log methods will begin to work, and you can either print the stream to the console (as shown below), or you can\n",
"dig deep into the internals of how hangar thinks about your data! (To be covered in an advanced tutorial later on.)\n",
"\n",
"The point is, regardless of your level of interaction with a live hangar repository, every level of state is accessible from the top, and in general the top-level API has been built to be the only way to directly access it!"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Summary of Contents Contained in Data Repository \n",
" \n",
"================== \n",
"| Repository Info \n",
"|----------------- \n",
"| Base Directory: /Users/rick/projects/tensorwerk/hangar/dev/mnist \n",
"| Disk Usage: 57.29 MB \n",
" \n",
"=================== \n",
"| Commit Details \n",
"------------------- \n",
"| Commit: a=8eb01eaf0c657f8526dbf9a8ffab0a4606ebfd3b \n",
"| Created: Tue Feb 25 19:03:06 2020 \n",
"| By: Rick Izzo \n",
"| Email: rick@tensorwerk.com \n",
"| Message: hello world, this is my first hangar commit \n",
" \n",
"================== \n",
"| DataSets \n",
"|----------------- \n",
"| Number of Named Columns: 2 \n",
"|\n",
"| * Column Name: ColumnSchemaKey(column=\"train_images\", layout=\"flat\") \n",
"| Num Data Pieces: 50000 \n",
"| Details: \n",
"| - column_layout: flat \n",
"| - column_type: ndarray \n",
"| - schema_type: fixed_shape \n",
"| - shape: (784,) \n",
"| - dtype: uint8 \n",
"| - backend: 00 \n",
"| - backend_options: {'complib': 'blosc:lz4hc', 'complevel': 5, 'shuffle': 'byte'} \n",
"|\n",
"| * Column Name: ColumnSchemaKey(column=\"train_labels\", layout=\"flat\") \n",
"| Num Data Pieces: 50000 \n",
"| Details: \n",
"| - column_layout: flat \n",
"| - column_type: ndarray \n",
"| - schema_type: fixed_shape \n",
"| - shape: (1,) \n",
"| - dtype: int64 \n",
"| - backend: 10 \n",
"| - backend_options: {} \n",
" \n",
"================== \n",
"| Metadata: \n",
"|----------------- \n",
"| Number of Keys: 0 \n",
"\n"
]
}
],
"source": [
"repo.summary()"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=8eb01eaf0c657f8526dbf9a8ffab0a4606ebfd3b (\u001B[1;31mmaster\u001B[m) : hello world, this is my first hangar commit\n"
]
}
],
"source": [
"repo.log()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
================================================
FILE: docs/Tutorial-002.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Part 2: Checkouts, Branching, & Merging\n",
"\n",
"This section deals with navigating repository history, creating & merging\n",
"branches, and understanding conflicts."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### The Hangar Workflow\n",
"\n",
"The hangar workflow is intended to mimic common ``git`` workflows in which small\n",
"incremental changes are made and committed on dedicated ``topic`` branches.\n",
"After the ``topic`` has been adequately developed, the ``topic`` branch is merged into\n",
"a separate branch (commonly referred to as ``master``, though it need not be the\n",
"actual branch named ``\"master\"``), where well-vetted and more permanent changes\n",
"are kept.\n",
"\n",
" Create Branch -> Checkout Branch -> Make Changes -> Commit\n",
"\n",
"#### Making the Initial Commit\n",
"\n",
"Let's initialize a new repository and see how branching works in Hangar:\n",
"\n",
"<!-- However, unlike GIT, remember that it is not possible to make changes in a DETACHED HEAD state. Hangar enforces the requirement that all work is performed at the tip of a branch. -->"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from hangar import Repository\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"repo = Repository(path='/Users/rick/projects/tensorwerk/hangar/dev/mnist/')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hangar Repo initialized at: /Users/rick/projects/tensorwerk/hangar/dev/mnist/.hangar\n"
]
}
],
"source": [
"repo_pth = repo.init(user_name='Test User', user_email='test@foo.com', remove_old=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When a repository is first initialized, it has no history, no commits."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"repo.log() # -> returns None"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Though the repository is essentially empty at this point in time, there is one\n",
"thing which is present: a branch with the name: ``\"master\"``."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['master']"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"repo.list_branches()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This ``\"master\"`` is the branch we make our first commit on; until we do, the\n",
"repository is in a semi-unstable state; with no history or contents, most of the\n",
"functionality of a repository (to store, retrieve, and work with versions of\n",
"data across time) just isn't possible. A significant portion of otherwise\n",
"standard operations will generally flat-out refuse to execute (i.e. read-only\n",
"checkouts, log, push, etc.) until the first commit is made.\n",
"\n",
"One of the only options available at this point is to create a\n",
"write-enabled checkout on the ``\"master\"`` branch and to begin to add data so we\n",
"can make a commit. Let’s do that now:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"co = repo.checkout(write=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As expected, there are no columns or metadata samples recorded in the checkout."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"number of metadata keys: 0\n",
"number of columns: 0\n"
]
}
],
"source": [
"print(f'number of metadata keys: {len(co.metadata)}')\n",
"print(f'number of columns: {len(co.columns)}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let’s add a dummy array just to put something in the repository history to\n",
"commit. We'll then close the checkout so we can explore some useful tools which\n",
"depend on having at least one historical record (commit) in the repo."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"dummy = np.arange(10, dtype=np.uint16)\n",
"col = co.add_ndarray_column('dummy_column', prototype=dummy)\n",
"col['0'] = dummy\n",
"initialCommitHash = co.commit('first commit with a single sample added to a dummy column')\n",
"co.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we check the history now, we can see our first commit hash, and that it is labeled with the branch name `\"master\"`"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=eaee002ed9c6e949c3657bd50e3949d6a459d50e (\u001B[1;31mmaster\u001B[m) : first commit with a single sample added to a dummy column\n"
]
}
],
"source": [
"repo.log()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So now our repository contains:\n",
"- [A commit](api.rst#hangar.checkout.WriterCheckout.commit_hash): a fully\n",
" independent description of the entire repository state as\n",
" it existed at some point in time. A commit is identified by a `commit_hash`.\n",
"- [A branch](api.rst#hangar.checkout.WriterCheckout.branch_name): a label\n",
" pointing to a particular `commit` / `commit_hash`.\n",
"\n",
"Once committed, it is not possible to remove, modify, or otherwise tamper with\n",
"the contents of a commit in any way. It is a permanent record, which Hangar has\n",
"no method to change once written to disk.\n",
"\n",
"In addition, as a `commit_hash` is not only calculated from the `commit` ’s\n",
"contents, but from the `commit_hash` of its parents (more on this to follow),\n",
"knowing a single top-level `commit_hash` allows us to verify the integrity of\n",
"the entire repository history. This fundamental behavior holds even in cases of\n",
"disk-corruption or malicious use.\n",
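"\n",
"This hash-chaining behavior can be sketched with a toy model (a simplified\n",
"illustration for intuition only; it is not Hangar's actual hashing scheme):\n",
"\n",
"```python\n",
"import hashlib\n",
"\n",
"def commit_digest(content, parent_digests):\n",
"    # The digest covers the commit's contents *and* every parent digest,\n",
"    # so tampering with any ancestor changes all descendant digests.\n",
"    h = hashlib.sha1(content.encode())\n",
"    for parent in parent_digests:\n",
"        h.update(parent.encode())\n",
"    return h.hexdigest()\n",
"\n",
"root = commit_digest('first commit', [])\n",
"child = commit_digest('second commit', [root])\n",
"bad_root = commit_digest('first commit (tampered)', [])\n",
"assert commit_digest('second commit', [bad_root]) != child\n",
"```\n",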
"\n",
"### Working with Checkouts & Branches\n",
"\n",
"As mentioned in the first tutorial, we work with the data in a repository through\n",
"a [checkout](api.rst#hangar.repository.Repository.checkout). There are two types\n",
"of checkouts (each of which have different uses and abilities):\n",
"\n",
"**[Checking out a branch / commit for reading:](api.rst#read-only-checkout)** is\n",
"the process of retrieving records describing repository state at some point in\n",
"time, and setting up access to the referenced data.\n",
"\n",
"- Any number of read checkout processes can operate on a repository (on\n",
" any number of commits) at the same time.\n",
"\n",
"**[Checking out a branch for writing:](api.rst#write-enabled-checkout)** is the\n",
"process of setting up a (mutable) ``staging area`` to temporarily gather\n",
"record references / data before all changes have been made and staging area\n",
"contents are committed in a new permanent record of history (a `commit`).\n",
"\n",
"- Only one write-enabled checkout can ever be operating in a repository\n",
" at a time.\n",
"- When initially creating the checkout, the `staging area` is not\n",
" actually “empty”. Instead, it has the full contents of the last `commit`\n",
" referenced by a branch’s `HEAD`. These records can be removed / mutated / added\n",
" to in any way to form the next `commit`. The new `commit` retains a\n",
" permanent reference identifying that the previous ``HEAD`` ``commit`` was used\n",
" as the base of its `staging area`.\n",
"- On commit, the branch which was checked out has its ``HEAD`` pointer\n",
" value updated to the new `commit`’s `commit_hash`. A write-enabled\n",
" checkout starting from the same branch will now use that `commit`’s\n",
" record content as the base for its `staging area`.\n",
"\n",
"#### Creating a branch\n",
"\n",
"A branch is an individual series of changes / commits which diverge from the main\n",
"history of the repository at some point in time. All changes made along a branch\n",
"are completely isolated from those on other branches. After some point in time,\n",
"changes made in disparate branches can be unified through an automatic\n",
"`merge` process (described in detail later in this tutorial). In general, the\n",
"`Hangar` branching model is semantically identical to the `Git` one; the one exception\n",
"is that in Hangar, a branch must always have a `name` and a `base_commit`. (No\n",
"\"Detached HEAD state\" is possible for a `write-enabled` checkout). If no `base_commit` is\n",
"specified, the current writer branch `HEAD` `commit` is used as the `base_commit`\n",
"hash for the branch automatically.\n",
"\n",
"Hangar branches have the same lightweight and performant properties which\n",
"make working with `Git` branches so appealing - they are cheap and easy to use,\n",
"create, and discard (if necessary).\n",
"\n",
"To create a branch, use the [create_branch()](api.rst#hangar.repository.Repository.create_branch)\n",
"method."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"branch_1 = repo.create_branch(name='testbranch')"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BranchHead(name='testbranch', digest='a=eaee002ed9c6e949c3657bd50e3949d6a459d50e')"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"branch_1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We use the [list_branches()](api.rst#hangar.repository.Repository.list_branches) and [log()](api.rst#hangar.repository.Repository.log) methods to see that a new branch named `testbranch` has been created and is indeed pointing to our initial commit."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"branch names: ['master', 'testbranch'] \n",
"\n",
"* a=eaee002ed9c6e949c3657bd50e3949d6a459d50e (\u001B[1;31mmaster\u001B[m) (\u001B[1;31mtestbranch\u001B[m) : first commit with a single sample added to a dummy column\n"
]
}
],
"source": [
"print(f'branch names: {repo.list_branches()} \\n')\n",
"repo.log()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If instead, we actually specify the base commit (with a different branch\n",
"name), we see we do indeed get a third branch, pointing to the same commit as\n",
"`master` and `testbranch`."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"branch_2 = repo.create_branch(name='new', base_commit=initialCommitHash)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BranchHead(name='new', digest='a=eaee002ed9c6e949c3657bd50e3949d6a459d50e')"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"branch_2"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=eaee002ed9c6e949c3657bd50e3949d6a459d50e (\u001B[1;31mmaster\u001B[m) (\u001B[1;31mnew\u001B[m) (\u001B[1;31mtestbranch\u001B[m) : first commit with a single sample added to a dummy column\n"
]
}
],
"source": [
"repo.log()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Making changes on a branch\n",
"\n",
"Let’s make some changes on the `new` branch to see how things work.\n",
"\n",
"We can see that the data we added previously is still here (the `dummy_column` column containing\n",
"one sample labeled `0`)."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"co = repo.checkout(write=True, branch='new')"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar Columns \n",
" Writeable : True \n",
" Number of Columns : 1 \n",
" Column Names / Partial Remote References: \n",
" - dummy_column / False"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.columns"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar FlatSampleWriter \n",
" Column Name : dummy_column \n",
" Writeable : True \n",
" Column Type : ndarray \n",
" Column Layout : flat \n",
" Schema Type : fixed_shape \n",
" DType : uint16 \n",
" Shape : (10,) \n",
" Number of Samples : 1 \n",
" Partial Remote Data Refs : False\n"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.columns['dummy_column']"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint16)"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.columns['dummy_column']['0']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's add another sample to `dummy_column`, called `1`"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"arr = np.arange(10, dtype=np.uint16)\n",
"# let's increment values so that `0` and `1` aren't set to the same thing\n",
"arr += 1\n",
"\n",
"co['dummy_column', '1'] = arr"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see that in this checkout, there are indeed two samples in `dummy_column`:"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(co.columns['dummy_column'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That's all, let's commit this and be done with this branch."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"co.commit('commit on `new` branch adding a sample to dummy_arrayset')\n",
"co.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### How do changes appear when made on a branch?\n",
"\n",
"If we look at the log, we see that the branch we were on (`new`) is a commit ahead of `master` and `testbranch`"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=c1cf1bd6863ed0b95239d2c9e1a6c6cc65569e94 (\u001B[1;31mnew\u001B[m) : commit on `new` branch adding a sample to dummy_arrayset\n",
"* a=eaee002ed9c6e949c3657bd50e3949d6a459d50e (\u001B[1;31mmaster\u001B[m) (\u001B[1;31mtestbranch\u001B[m) : first commit with a single sample added to a dummy column\n"
]
}
],
"source": [
"repo.log()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The meaning is exactly what one would intuit. We made some changes, they were\n",
"reflected on the `new` branch, but the `master` and `testbranch` branches\n",
"were not impacted at all, nor were any of the commits!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Merging (Part 1) Fast-Forward Merges\n",
"\n",
"Say we like the changes we made on the ``new`` branch so much that we want them\n",
"to be included into our ``master`` branch! How do we make this happen in this\n",
"scenario?\n",
"\n",
"Well, the history between the ``HEAD`` of the ``new`` and the ``HEAD`` of the\n",
"``master`` branch is perfectly linear. In fact, when we began making changes\n",
"on ``new``, our staging area was *identical* to what the ``master`` ``HEAD``\n",
"commit references right now!\n",
"\n",
"If you’ll remember that a branch is just a pointer which assigns some ``name``\n",
"to a ``commit_hash``, it becomes apparent that a merge in this case really\n",
"doesn’t involve any work at all. With a linear history between ``master`` and\n",
"``new``, any ``commits`` existing along the path between the ``HEAD`` of\n",
"``new`` and ``master`` are the only changes which are introduced, and we can\n",
"be sure that this is the only view of the data records which can exist!\n",
"\n",
"What this means in practice is that for this type of merge, we can just update\n",
"the ``HEAD`` of ``master`` to point to the ``HEAD`` of ``\"new\"``, and the\n",
"merge is complete.\n",
"\n",
"This situation is referred to as a **Fast Forward (FF) Merge**. A FF merge is\n",
"safe to perform any time a linear history lies between the ``HEAD`` of some\n",
"``topic`` and ``base`` branch, regardless of how many commits or changes\n",
"were introduced.\n",
"\n",
"For other situations, a more complicated **Three Way Merge** is required. This\n",
"merge method will be explained a bit more later in this tutorial."
]
},
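{
"cell_type": "markdown",
"metadata": {},
"source": [
"The pointer update described above can be modeled with a plain dictionary (a\n",
"toy sketch of the bookkeeping, not Hangar's internals; the truncated hashes\n",
"are taken from the log output above):\n",
"\n",
"```python\n",
"# branches map a branch name to the commit_hash of its HEAD\n",
"branches = {'master': 'eaee002', 'new': 'c1cf1bd'}\n",
"\n",
"def fast_forward(branches, base, topic):\n",
"    # Safe only when base's HEAD is an ancestor of topic's HEAD; the\n",
"    # merge is then just moving the base branch pointer forward.\n",
"    branches[base] = branches[topic]\n",
"\n",
"fast_forward(branches, 'master', 'new')\n",
"assert branches['master'] == branches['new'] == 'c1cf1bd'\n",
"```"
]
},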
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"co = repo.checkout(write=True, branch='master')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Performing the Merge\n",
"\n",
"In practice, you’ll never need to know the details of the merge theory explained\n",
"above (or even remember it exists). Hangar automatically figures out which merge\n",
"algorithm should be used and then performs whatever calculations are needed to\n",
"compute the result.\n",
"\n",
"As a user, merging in Hangar is a one-liner! Just use the [merge()](api.rst#hangar.checkout.WriterCheckout.merge)\n",
"method from a `write-enabled` checkout (shown below), or the analogous method\n",
"from the Repository object [repo.merge()](api.rst#hangar.repository.Repository.merge)\n",
"(if not already working with a `write-enabled` checkout object)."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Selected Fast-Forward Merge Strategy\n"
]
},
{
"data": {
"text/plain": [
"'a=c1cf1bd6863ed0b95239d2c9e1a6c6cc65569e94'"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.merge(message='message for commit (not used for FF merge)', dev_branch='new')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's check the log!"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=c1cf1bd6863ed0b95239d2c9e1a6c6cc65569e94 (\u001B[1;31mmaster\u001B[m) (\u001B[1;31mnew\u001B[m) : commit on `new` branch adding a sample to dummy_arrayset\n",
"* a=eaee002ed9c6e949c3657bd50e3949d6a459d50e (\u001B[1;31mtestbranch\u001B[m) : first commit with a single sample added to a dummy column\n"
]
}
],
"source": [
"repo.log()"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'master'"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.branch_name"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'a=c1cf1bd6863ed0b95239d2c9e1a6c6cc65569e94'"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.commit_hash"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar FlatSampleWriter \n",
" Column Name : dummy_column \n",
" Writeable : True \n",
" Column Type : ndarray \n",
" Column Layout : flat \n",
" Schema Type : fixed_shape \n",
" DType : uint16 \n",
" Shape : (10,) \n",
" Number of Samples : 2 \n",
" Partial Remote Data Refs : False\n"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.columns['dummy_column']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see, everything is as it should be!"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"co.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Making changes to introduce diverged histories\n",
"\n",
"Let’s now go back to our `testbranch` branch and make some changes there so\n",
"we can see what happens when changes don’t follow a linear history."
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"co = repo.checkout(write=True, branch='testbranch')"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar Columns \n",
" Writeable : True \n",
" Number of Columns : 1 \n",
" Column Names / Partial Remote References: \n",
" - dummy_column / False"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.columns"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar FlatSampleWriter \n",
" Column Name : dummy_column \n",
" Writeable : True \n",
" Column Type : ndarray \n",
" Column Layout : flat \n",
" Schema Type : fixed_shape \n",
" DType : uint16 \n",
" Shape : (10,) \n",
" Number of Samples : 1 \n",
" Partial Remote Data Refs : False\n"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.columns['dummy_column']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will start by mutating sample `0` in `dummy_column` to a different value"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([50, 51, 52, 53, 54, 55, 56, 57, 58, 59], dtype=uint16)"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"old_arr = co['dummy_column', '0']\n",
"new_arr = old_arr + 50\n",
"new_arr"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"co['dummy_column', '0'] = new_arr"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let’s make a commit here, then add some metadata and make a new commit (all on\n",
"the `testbranch` branch)."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'a=fcd82f86e39b19c3e5351dda063884b5d2fda67b'"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.commit('mutated sample `0` of `dummy_column` to new value')"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=fcd82f86e39b19c3e5351dda063884b5d2fda67b (\u001B[1;31mtestbranch\u001B[m) : mutated sample `0` of `dummy_column` to new value\n",
"* a=eaee002ed9c6e949c3657bd50e3949d6a459d50e : first commit with a single sample added to a dummy column\n"
]
}
],
"source": [
"repo.log()"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [],
"source": [
"co.metadata['hello'] = 'world'"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'a=69a08ca41ca1f5577fb0ffcf59d4d1585f614c4d'"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.commit('added hellow world metadata')"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"co.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Looking at our history now, we see that none of the original branches reference\n",
"our first commit anymore."
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=69a08ca41ca1f5577fb0ffcf59d4d1585f614c4d (\u001B[1;31mtestbranch\u001B[m) : added hellow world metadata\n",
"* a=fcd82f86e39b19c3e5351dda063884b5d2fda67b : mutated sample `0` of `dummy_column` to new value\n",
"* a=eaee002ed9c6e949c3657bd50e3949d6a459d50e : first commit with a single sample added to a dummy column\n"
]
}
],
"source": [
"repo.log()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can check the history of the `master` branch by specifying it as an argument to the `log()` method."
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=c1cf1bd6863ed0b95239d2c9e1a6c6cc65569e94 (\u001B[1;31mmaster\u001B[m) (\u001B[1;31mnew\u001B[m) : commit on `new` branch adding a sample to dummy_arrayset\n",
"* a=eaee002ed9c6e949c3657bd50e3949d6a459d50e : first commit with a single sample added to a dummy column\n"
]
}
],
"source": [
"repo.log('master')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Merging (Part 2) Three Way Merge\n",
"\n",
"If we now want to merge the changes on `testbranch` into `master`, we can't just follow a simple linear history; **the branches have diverged**.\n",
"\n",
"For this case, Hangar implements a **Three Way Merge** algorithm which does the following:\n",
"- Find the most recent common ancestor `commit` present in both the `testbranch` and `master` branches\n",
"- Compute what changed between the common ancestor and each branch's `HEAD` commit\n",
"- Check if any of the changes conflict with each other (more on this in a later tutorial)\n",
"- If no conflicts are present, compute the results of the merge between the two sets of changes\n",
"- Create a new `commit` containing the merge results, reference both branch `HEAD`s as parents of the new `commit`, and update the `base` branch `HEAD` to that new `commit`'s `commit_hash`"
]
},
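{
"cell_type": "markdown",
"metadata": {},
"source": [
"The record-level logic of the steps above can be sketched in plain Python (a\n",
"toy model for intuition; Hangar's real implementation operates on its internal\n",
"record structures):\n",
"\n",
"```python\n",
"def three_way_merge(ancestor, ours, theirs):\n",
"    # Each argument maps sample keys to values; a missing key means the\n",
"    # sample does not exist at that commit.\n",
"    merged, conflicts = {}, []\n",
"    for key in set(ancestor) | set(ours) | set(theirs):\n",
"        a, o, t = ancestor.get(key), ours.get(key), theirs.get(key)\n",
"        if o == t:            # both sides agree (including both deleted)\n",
"            result = o\n",
"        elif o == a:          # only theirs changed\n",
"            result = t\n",
"        elif t == a:          # only ours changed\n",
"            result = o\n",
"        else:                 # both changed differently -> conflict\n",
"            conflicts.append(key)\n",
"            continue\n",
"        if result is not None:\n",
"            merged[key] = result\n",
"    return merged, conflicts\n",
"\n",
"# mirrors this tutorial: `master` added key '1', `testbranch` mutated key '0'\n",
"merged, conflicts = three_way_merge(\n",
"    ancestor={'0': 1}, ours={'0': 1, '1': 2}, theirs={'0': 5})\n",
"assert merged == {'0': 5, '1': 2} and conflicts == []\n",
"```"
]
},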
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
"co = repo.checkout(write=True, branch='master')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once again, as a user, the details are completely irrelevant, and the operation\n",
"occurs from the same one-liner call we used before for the FF Merge."
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Selected 3-Way Merge Strategy\n"
]
},
{
"data": {
"text/plain": [
"'a=002041fe8d8846b06f33842964904b627de55214'"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.merge(message='merge of testbranch into master', dev_branch='testbranch')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we now look at the log, we see that this has a much different look than\n",
"before. The three way merge results in a history which references changes made\n",
"in both diverged branches, and unifies them in a single ``commit``"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=002041fe8d8846b06f33842964904b627de55214 (\u001B[1;31mmaster\u001B[m) : merge of testbranch into master\n",
"\u001B[1;31m|\u001B[m\u001B[1;32m\\\u001B[m \n",
"\u001B[1;31m|\u001B[m * a=69a08ca41ca1f5577fb0ffcf59d4d1585f614c4d (\u001B[1;31mtestbranch\u001B[m) : added hellow world metadata\n",
"\u001B[1;31m|\u001B[m * a=fcd82f86e39b19c3e5351dda063884b5d2fda67b : mutated sample `0` of `dummy_column` to new value\n",
"* \u001B[1;32m|\u001B[m a=c1cf1bd6863ed0b95239d2c9e1a6c6cc65569e94 (\u001B[1;31mnew\u001B[m) : commit on `new` branch adding a sample to dummy_arrayset\n",
"\u001B[1;32m|\u001B[m\u001B[1;32m/\u001B[m \n",
"* a=eaee002ed9c6e949c3657bd50e3949d6a459d50e : first commit with a single sample added to a dummy column\n"
]
}
],
"source": [
"repo.log()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Manually inspecting the merge result to verify it matches our expectations\n",
"\n",
"`dummy_column` should contain two arrays. Key `1` was set in the previous\n",
"commit originally made on `new` and merged into `master`. Key `0` was\n",
"mutated in `testbranch` and unchanged in `master`, so the update from\n",
"`testbranch` is kept.\n",
"\n",
"There should be one metadata sample with the key `hello` and the value\n",
"``\"world\"``."
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar Columns \n",
" Writeable : True \n",
" Number of Columns : 1 \n",
" Column Names / Partial Remote References: \n",
" - dummy_column / False"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.columns"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar FlatSampleWriter \n",
" Column Name : dummy_column \n",
" Writeable : True \n",
" Column Type : ndarray \n",
" Column Layout : flat \n",
" Schema Type : fixed_shape \n",
" DType : uint16 \n",
" Shape : (10,) \n",
" Number of Samples : 2 \n",
" Partial Remote Data Refs : False\n"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.columns['dummy_column']"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[array([50, 51, 52, 53, 54, 55, 56, 57, 58, 59], dtype=uint16),\n",
" array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=uint16)]"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co['dummy_column', ['0', '1']]"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hangar Metadata \n",
" Writeable: True \n",
" Number of Keys: 1\n"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.metadata"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'world'"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.metadata['hello']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**The Merge was a success!**"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [],
"source": [
"co.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Conflicts\n",
"\n",
"Now that we've seen merging in action, the next step is to talk about conflicts.\n",
"\n",
"#### How Are Conflicts Detected?\n",
"\n",
"Any merge conflicts can be identified and addressed ahead of running a `merge`\n",
"command by using the built in [diff](api.rst#hangar.diff.WriterUserDiff) tools.\n",
"When diffing commits, Hangar will provide a list of conflicts which it identifies.\n",
"In general these fall into 4 categories:\n",
"\n",
"1. **Additions** in both branches which created new keys (samples /\n",
" columns / metadata) with non-compatible values. For samples &\n",
" metadata, the hash of the data is compared, for columns, the schema\n",
" specification is checked for compatibility in a method custom to the\n",
" internal workings of Hangar.\n",
"2. **Removal** in `Master Commit/Branch` **& Mutation** in `Dev Commit / Branch`. Applies for samples, columns, and metadata identically.\n",
"3. **Mutation** in `Dev Commit/Branch` **& Removal** in `Master Commit / Branch`. Applies for samples, columns, and metadata identically.\n",
"4. **Mutations** on keys of both branches to non-compatible values. For\n",
" samples & metadata, the hash of the data is compared; for columns, the\n",
" schema specification is checked for compatibility in a method custom to the\n",
" internal workings of Hangar.\n",
"\n",
"#### Let's make a merge conflict\n",
"\n",
"To force a conflict, we are going to checkout the `new` branch and set the\n",
"metadata key `hello` to the value `foo conflict... BOO!`. Then if we try\n",
"to merge this into the `testbranch` branch (which set `hello` to a value\n",
"of `world`) we see how hangar will identify the conflict and halt without\n",
"making any changes.\n",
"\n",
"Automated conflict resolution will be introduced in a future version of Hangar,\n",
"for now it is up to the user to manually resolve conflicts by making any\n",
"necessary changes in each branch before reattempting a merge operation."
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"co = repo.checkout(write=True, branch='new')"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [],
"source": [
"co.metadata['hello'] = 'foo conflict... BOO!'"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'a=95896880b33fc06a3c2359a03408f07c87bcc8c0'"
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.commit ('commit on new branch to hello metadata key so we can demonstrate a conflict')"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=95896880b33fc06a3c2359a03408f07c87bcc8c0 (\u001B[1;31mnew\u001B[m) : commit on new branch to hello metadata key so we can demonstrate a conflict\n",
"* a=c1cf1bd6863ed0b95239d2c9e1a6c6cc65569e94 : commit on `new` branch adding a sample to dummy_arrayset\n",
"* a=eaee002ed9c6e949c3657bd50e3949d6a459d50e : first commit with a single sample added to a dummy column\n"
]
}
],
"source": [
"repo.log()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**When we attempt the merge, an exception is thrown telling us there is a conflict!**"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Selected 3-Way Merge Strategy\n"
]
},
{
"ename": "ValueError",
"evalue": "HANGAR VALUE ERROR:: Merge ABORTED with conflict: Conflicts(t1=[(b'l:hello', b'2=d8fa6800caf496e637d965faac1a033e4636c2e6')], t21=[], t22=[], t3=[], conflict=True)",
"output_type": "error",
"traceback": [
"\u001B[0;31m---------------------------------------------------------------------------\u001B[0m",
"\u001B[0;31mValueError\u001B[0m Traceback (most recent call last)",
"\u001B[0;32m<ipython-input-57-1a98dce1852b>\u001B[0m in \u001B[0;36m<module>\u001B[0;34m\u001B[0m\n\u001B[0;32m----> 1\u001B[0;31m \u001B[0mco\u001B[0m\u001B[0;34m.\u001B[0m\u001B[0mmerge\u001B[0m\u001B[0;34m(\u001B[0m\u001B[0mmessage\u001B[0m\u001B[0;34m=\u001B[0m\u001B[0;34m'this merge should not happen'\u001B[0m\u001B[0;34m,\u001B[0m \u001B[0mdev_branch\u001B[0m\u001B[0;34m=\u001B[0m\u001B[0;34m'testbranch'\u001B[0m\u001B[0;34m)\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n\u001B[0m",
"\u001B[0;32m~/projects/tensorwerk/hangar/hangar-py/src/hangar/checkout.py\u001B[0m in \u001B[0;36mmerge\u001B[0;34m(self, message, dev_branch)\u001B[0m\n\u001B[1;32m 1027\u001B[0m \u001B[0mdev_branch\u001B[0m\u001B[0;34m=\u001B[0m\u001B[0mdev_branch\u001B[0m\u001B[0;34m,\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n\u001B[1;32m 1028\u001B[0m \u001B[0mrepo_path\u001B[0m\u001B[0;34m=\u001B[0m\u001B[0mself\u001B[0m\u001B[0;34m.\u001B[0m\u001B[0m_repo_path\u001B[0m\u001B[0;34m,\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n\u001B[0;32m-> 1029\u001B[0;31m writer_uuid=self._writer_lock)\n\u001B[0m\u001B[1;32m 1030\u001B[0m \u001B[0;34m\u001B[0m\u001B[0m\n\u001B[1;32m 1031\u001B[0m \u001B[0;32mfor\u001B[0m \u001B[0masetHandle\u001B[0m \u001B[0;32min\u001B[0m \u001B[0mself\u001B[0m\u001B[0;34m.\u001B[0m\u001B[0m_columns\u001B[0m\u001B[0;34m.\u001B[0m\u001B[0mvalues\u001B[0m\u001B[0;34m(\u001B[0m\u001B[0;34m)\u001B[0m\u001B[0;34m:\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n",
"\u001B[0;32m~/projects/tensorwerk/hangar/hangar-py/src/hangar/merger.py\u001B[0m in \u001B[0;36mselect_merge_algorithm\u001B[0;34m(message, branchenv, stageenv, refenv, stagehashenv, master_branch, dev_branch, repo_path, writer_uuid)\u001B[0m\n\u001B[1;32m 136\u001B[0m \u001B[0;34m\u001B[0m\u001B[0m\n\u001B[1;32m 137\u001B[0m \u001B[0;32mexcept\u001B[0m \u001B[0mValueError\u001B[0m \u001B[0;32mas\u001B[0m \u001B[0me\u001B[0m\u001B[0;34m:\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n\u001B[0;32m--> 138\u001B[0;31m \u001B[0;32mraise\u001B[0m \u001B[0me\u001B[0m \u001B[0;32mfrom\u001B[0m \u001B[0;32mNone\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n\u001B[0m\u001B[1;32m 139\u001B[0m \u001B[0;34m\u001B[0m\u001B[0m\n\u001B[1;32m 140\u001B[0m \u001B[0;32mfinally\u001B[0m\u001B[0;34m:\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n",
"\u001B[0;32m~/projects/tensorwerk/hangar/hangar-py/src/hangar/merger.py\u001B[0m in \u001B[0;36mselect_merge_algorithm\u001B[0;34m(message, branchenv, stageenv, refenv, stagehashenv, master_branch, dev_branch, repo_path, writer_uuid)\u001B[0m\n\u001B[1;32m 133\u001B[0m \u001B[0mrefenv\u001B[0m\u001B[0;34m=\u001B[0m\u001B[0mrefenv\u001B[0m\u001B[0;34m,\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n\u001B[1;32m 134\u001B[0m \u001B[0mstagehashenv\u001B[0m\u001B[0;34m=\u001B[0m\u001B[0mstagehashenv\u001B[0m\u001B[0;34m,\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n\u001B[0;32m--> 135\u001B[0;31m repo_path=repo_path)\n\u001B[0m\u001B[1;32m 136\u001B[0m \u001B[0;34m\u001B[0m\u001B[0m\n\u001B[1;32m 137\u001B[0m \u001B[0;32mexcept\u001B[0m \u001B[0mValueError\u001B[0m \u001B[0;32mas\u001B[0m \u001B[0me\u001B[0m\u001B[0;34m:\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n",
"\u001B[0;32m~/projects/tensorwerk/hangar/hangar-py/src/hangar/merger.py\u001B[0m in \u001B[0;36m_three_way_merge\u001B[0;34m(message, master_branch, masterHEAD, dev_branch, devHEAD, ancestorHEAD, branchenv, stageenv, refenv, stagehashenv, repo_path)\u001B[0m\n\u001B[1;32m 260\u001B[0m \u001B[0;32mif\u001B[0m \u001B[0mconflict\u001B[0m\u001B[0;34m.\u001B[0m\u001B[0mconflict\u001B[0m \u001B[0;32mis\u001B[0m \u001B[0;32mTrue\u001B[0m\u001B[0;34m:\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n\u001B[1;32m 261\u001B[0m \u001B[0mmsg\u001B[0m \u001B[0;34m=\u001B[0m \u001B[0;34mf'HANGAR VALUE ERROR:: Merge ABORTED with conflict: {conflict}'\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n\u001B[0;32m--> 262\u001B[0;31m \u001B[0;32mraise\u001B[0m \u001B[0mValueError\u001B[0m\u001B[0;34m(\u001B[0m\u001B[0mmsg\u001B[0m\u001B[0;34m)\u001B[0m \u001B[0;32mfrom\u001B[0m \u001B[0;32mNone\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n\u001B[0m\u001B[1;32m 263\u001B[0m \u001B[0;34m\u001B[0m\u001B[0m\n\u001B[1;32m 264\u001B[0m \u001B[0;32mwith\u001B[0m \u001B[0mmEnv\u001B[0m\u001B[0;34m.\u001B[0m\u001B[0mbegin\u001B[0m\u001B[0;34m(\u001B[0m\u001B[0mwrite\u001B[0m\u001B[0;34m=\u001B[0m\u001B[0;32mTrue\u001B[0m\u001B[0;34m)\u001B[0m \u001B[0;32mas\u001B[0m \u001B[0mtxn\u001B[0m\u001B[0;34m:\u001B[0m\u001B[0;34m\u001B[0m\u001B[0;34m\u001B[0m\u001B[0m\n",
"\u001B[0;31mValueError\u001B[0m: HANGAR VALUE ERROR:: Merge ABORTED with conflict: Conflicts(t1=[(b'l:hello', b'2=d8fa6800caf496e637d965faac1a033e4636c2e6')], t21=[], t22=[], t3=[], conflict=True)"
]
}
],
"source": [
"co.merge(message='this merge should not happen', dev_branch='testbranch')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Checking for Conflicts\n",
"\n",
"Alternatively, use the diff methods on a checkout to test for conflicts before attempting a merge.\n",
"\n",
"It is possible to diff between a checkout object and:\n",
"\n",
"1. Another branch ([diff.branch()](api.rst#hangar.diff.WriterUserDiff.branch))\n",
"2. A specified commit ([diff.commit()](api.rst#hangar.diff.WriterUserDiff.commit))\n",
"3. Changes made in the staging area before a commit is made\n",
" ([diff.staged()](api.rst#hangar.diff.WriterUserDiff.staged))\n",
" (for `write-enabled` checkouts only.)\n",
"\n",
"Or via the [CLI status tool](cli.rst#hangar-status) between the staging area and any branch/commit\n",
"(only a human readable summary is produced)."
]
},
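  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sketch, the other diff targets follow the same pattern as the branch diff shown below (the commit digest here is the first commit from this tutorial; exact return shapes are assumed to match the branch diff):\n",
    "\n",
    "```python\n",
    "# diff against a specific commit digest\n",
    "diffs, conflicts = co.diff.commit('a=eaee002ed9c6e949c3657bd50e3949d6a459d50e')\n",
    "\n",
    "# diff the uncommitted changes in the staging area (write checkouts only)\n",
    "staged_diffs, staged_conflicts = co.diff.staged()\n",
    "```"
   ]
  },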
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [],
"source": [
"merge_results, conflicts_found = co.diff.branch('testbranch')"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Conflicts(t1=Changes(schema={}, samples=(), metadata=(MetadataRecordKey(key='hello'),)), t21=Changes(schema={}, samples=(), metadata=()), t22=Changes(schema={}, samples=(), metadata=()), t3=Changes(schema={}, samples=(), metadata=()), conflict=True)"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conflicts_found"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(MetadataRecordKey(key='hello'),)"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conflicts_found.t1.metadata"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The type codes for a `Conflicts` `namedtuple` such as the one we saw:\n",
"\n",
" Conflicts(t1=('hello',), t21=(), t22=(), t3=(), conflict=True)\n",
"\n",
"are as follow:\n",
"\n",
"- ``t1``: Addition of key in master AND dev with different values.\n",
"- ``t21``: Removed key in master, mutated value in dev.\n",
"- ``t22``: Removed key in dev, mutated value in master.\n",
"- ``t3``: Mutated key in both master AND dev to different values.\n",
"- ``conflict``: Bool indicating if any type of conflict is present."
]
},
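  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sketch (using the checkout and branch names from this tutorial), these type codes make it easy to guard a merge programmatically before attempting it:\n",
    "\n",
    "```python\n",
    "# inspect the diff for conflicts before merging; only proceed when clean\n",
    "merge_results, conflicts = co.diff.branch('testbranch')\n",
    "if conflicts.conflict:\n",
    "    # e.g. report which keys were added on both sides (type t1)\n",
    "    print('conflicting additions:', conflicts.t1)\n",
    "else:\n",
    "    co.merge(message='safe merge', dev_branch='testbranch')\n",
    "```"
   ]
  },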
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### To resolve, remove the conflict"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'a=e69ba8aeffc130c57d2ae0a8131c8ea59083cb62'"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"del co.metadata['hello']\n",
"# resolved conflict by removing hello key\n",
"co.commit('commit which removes conflicting metadata key')"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Selected 3-Way Merge Strategy\n"
]
},
{
"data": {
"text/plain": [
"'a=ef7ddf4a4a216315d929bd905e78866e3ad6e4fd'"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"co.merge(message='this merge succeeds as it no longer has a conflict', dev_branch='testbranch')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can verify that history looks as we would expect via the log!"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=ef7ddf4a4a216315d929bd905e78866e3ad6e4fd (\u001B[1;31mnew\u001B[m) : this merge succeeds as it no longer has a conflict\n",
"\u001B[1;31m|\u001B[m\u001B[1;32m\\\u001B[m \n",
"* \u001B[1;32m|\u001B[m a=e69ba8aeffc130c57d2ae0a8131c8ea59083cb62 : commit which removes conflicting metadata key\n",
"* \u001B[1;32m|\u001B[m a=95896880b33fc06a3c2359a03408f07c87bcc8c0 : commit on new branch to hello metadata key so we can demonstrate a conflict\n",
"\u001B[1;32m|\u001B[m * a=69a08ca41ca1f5577fb0ffcf59d4d1585f614c4d (\u001B[1;31mtestbranch\u001B[m) : added hellow world metadata\n",
"\u001B[1;32m|\u001B[m * a=fcd82f86e39b19c3e5351dda063884b5d2fda67b : mutated sample `0` of `dummy_column` to new value\n",
"* \u001B[1;32m|\u001B[m a=c1cf1bd6863ed0b95239d2c9e1a6c6cc65569e94 : commit on `new` branch adding a sample to dummy_arrayset\n",
"\u001B[1;32m|\u001B[m\u001B[1;32m/\u001B[m \n",
"* a=eaee002ed9c6e949c3657bd50e3949d6a459d50e : first commit with a single sample added to a dummy column\n"
]
}
],
"source": [
"repo.log()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
================================================
FILE: docs/Tutorial-003.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Part 3: Working With Remote Servers\n",
"\n",
"This tutorial will introduce how to start a remote Hangar server, and how to work with [remotes](api.rst#hangar.repository.Remotes) from the client side.\n",
"\n",
"Particular attention is paid to the concept of a ***partially fetch* / *partial clone*** operations. This is a key component of the Hangar design which provides the ability to quickly and efficiently work with data contained in remote repositories whose full size would be significatly prohibitive to local use under most circumstances.\n",
"\n",
"*Note:*\n",
"\n",
"> At the time of writing, the API, user-facing functionality, client-server negotiation protocols, and test coverage of the remotes implementation is generally adqequate for this to serve as an \"alpha\" quality preview. However, please be warned that significantly less time has been spent in this module to optimize speed, refactor for simplicity, and assure stability under heavy loads than the rest of the Hangar core. While we can guarantee that your data is secure on disk, you may experience crashes from time to time when working with remotes. In addition, sending data over the wire should NOT be considered secure in ANY way. No in-transit encryption, user authentication, or secure access limitations are implemented at this moment. We realize the importance of these types of protections, and they are on our radar for the next release cycle. If you are interested in making a contribution to Hangar, this module contains a lot of low hanging fruit which would would provide drastic improvements and act as a good intro the the internal Hangar data model. Please get in touch with us to discuss!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Starting a Hangar Server\n",
"\n",
"To start a Hangar server, navigate to the command line and simply execute:\n",
"\n",
"```\n",
"$ hangar server\n",
"```\n",
"\n",
"This will get a local server instance running at `localhost:50051`. The IP and port can be configured by setting the `--ip` and `--port` flags to the desired values in the command line.\n",
"\n",
"A blocking process will begin in that terminal session. Leave it running while you experiment with connecting from a client repo."
]
},
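  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For example, to bind the server to all interfaces on a non-default port (the address values here are illustrative):\n",
    "\n",
    "```\n",
    "$ hangar server --ip 0.0.0.0 --port 50052\n",
    "```"
   ]
  },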
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using Remotes with a Local Repository\n",
"\n",
"The [CLI](cli.rst#hangar-cli-documentation) is the easiest way to interact with the remote server from a local repository (though all functioanlity is mirrorred via the [repository API](api.rst#hangar.repository.Remotes) (more on that later).\n",
"\n",
"Before we begin we will set up a repository with some data, a few commits, two branches, and a merge."
]
},
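  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sketch of the mirrored repository API (the remote name and address below are placeholders for whatever server you are running):\n",
    "\n",
    "```python\n",
    "# register a remote server under a short name, then push a branch to it\n",
    "repo.remote.add('origin', 'localhost:50051')\n",
    "repo.remote.push('origin', 'master')\n",
    "```"
   ]
  },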
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Setup a Test Repo\n",
"\n",
"As normal, we shall begin with creating a repository and adding some data. This should be familiar to you from previous tutorials."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from hangar import Repository\n",
"import numpy as np\n",
"from tqdm import tqdm\n",
"\n",
"testData = np.loadtxt('/Users/rick/projects/tensorwerk/hangar/dev/data/dota2Dataset/dota2Test.csv', delimiter=',', dtype=np.uint8)\n",
"trainData = np.loadtxt('/Users/rick/projects/tensorwerk/hangar/dev/data/dota2Dataset/dota2Train.csv', delimiter=',', dtype=np.uint16)\n",
"\n",
"testName = 'test'\n",
"testPrototype = testData[0]\n",
"trainName = 'train'\n",
"trainPrototype = trainData[0]"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hangar Repo initialized at: /Users/rick/projects/tensorwerk/hangar/dev/intro/.hangar\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/rick/projects/tensorwerk/hangar/hangar-py/src/hangar/context.py:94: UserWarning: No repository exists at /Users/rick/projects/tensorwerk/hangar/dev/intro/.hangar, please use `repo.init()` method\n",
" warnings.warn(msg, UserWarning)\n"
]
}
],
"source": [
"repo = Repository('/Users/rick/projects/tensorwerk/hangar/dev/intro/')\n",
"repo.init(user_name='Rick Izzo', user_email='rick@tensorwerk.com', remove_old=True)\n",
"co = repo.checkout(write=True)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"10500it [00:02, 4286.17it/s] \n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=b98f6b65c0036489e53ddaf2b30bf797ddc40da0 (\u001B[1;31madd-train\u001B[m) (\u001B[1;31mmaster\u001B[m) : initial commit on master with test data\n"
]
}
],
"source": [
"co.add_ndarray_column(testName, prototype=testPrototype)\n",
"testcol = co.columns[testName]\n",
"\n",
"pbar = tqdm(total=testData.shape[0])\n",
"with testcol as tcol:\n",
" for gameIdx, gameData in enumerate(testData):\n",
" if (gameIdx % 500 == 0):\n",
" pbar.update(500)\n",
" tcol.append(gameData)\n",
"pbar.close()\n",
"\n",
"co.commit('initial commit on master with test data')\n",
"\n",
"repo.create_branch('add-train')\n",
"co.close()\n",
"repo.log()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"93000it [00:22, 4078.73it/s] \n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=957d20e4b921f41975591cc8ee51a4a6912cb919 (\u001B[1;31madd-train\u001B[m) : added training data on another branch\n",
"* a=b98f6b65c0036489e53ddaf2b30bf797ddc40da0 (\u001B[1;31mmaster\u001B[m) : initial commit on master with test data\n"
]
}
],
"source": [
"co = repo.checkout(write=True, branch='add-train')\n",
"\n",
"co.add_ndarray_column(trainName, prototype=trainPrototype)\n",
"traincol = co.columns[trainName]\n",
"\n",
"pbar = tqdm(total=trainData.shape[0])\n",
"with traincol as trcol:\n",
" for gameIdx, gameData in enumerate(trainData):\n",
" if (gameIdx % 500 == 0):\n",
" pbar.update(500)\n",
" trcol.append(gameData)\n",
"pbar.close()\n",
"\n",
"co.commit('added training data on another branch')\n",
"co.close()\n",
"repo.log()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* a=bb1b108ef17b7d7667a2
SYMBOL INDEX (2012 symbols across 98 files)
FILE: asv_bench/benchmarks/backend_comparisons.py
class _WriterSuite (line 14) | class _WriterSuite:
method setup (line 24) | def setup(self, backend):
method teardown (line 66) | def teardown(self, backend):
method write (line 71) | def write(self, backend):
class Write_50by50by20_300_samples (line 84) | class Write_50by50by20_300_samples(_WriterSuite):
class _ReaderSuite (line 93) | class _ReaderSuite:
method setup_cache (line 104) | def setup_cache(self):
method setup (line 159) | def setup(self, backend):
method teardown (line 170) | def teardown(self, backend):
method read (line 174) | def read(self, backend):
class Read_50by50by10_3000_samples (line 180) | class Read_50by50by10_3000_samples(_ReaderSuite):
FILE: asv_bench/benchmarks/backends/hdf5_00.py
class _WriterSuite_HDF5_00 (line 10) | class _WriterSuite_HDF5_00:
method setup (line 18) | def setup(self):
method teardown (line 63) | def teardown(self):
method read (line 68) | def read(self):
method write (line 73) | def write(self):
method size (line 82) | def size(self):
class Write_50by50by10_1_samples (line 86) | class Write_50by50by10_1_samples(_WriterSuite_HDF5_00):
class Write_50by50by10_100_samples (line 94) | class Write_50by50by10_100_samples(_WriterSuite_HDF5_00):
class Read_50by50by10_1_samples (line 104) | class Read_50by50by10_1_samples(_WriterSuite_HDF5_00):
class Read_50by50by10_100_samples (line 111) | class Read_50by50by10_100_samples(_WriterSuite_HDF5_00):
class Read_50by50by10_300_samples (line 118) | class Read_50by50by10_300_samples(_WriterSuite_HDF5_00):
FILE: asv_bench/benchmarks/backends/hdf5_01.py
class _WriterSuite_HDF5_01 (line 10) | class _WriterSuite_HDF5_01:
method setup (line 18) | def setup(self):
method teardown (line 66) | def teardown(self):
method read (line 71) | def read(self):
method write (line 76) | def write(self):
method size (line 85) | def size(self):
class Write_50by50by10_1_samples (line 89) | class Write_50by50by10_1_samples(_WriterSuite_HDF5_01):
class Write_50by50by10_100_samples (line 97) | class Write_50by50by10_100_samples(_WriterSuite_HDF5_01):
class Read_50by50by10_1_samples (line 107) | class Read_50by50by10_1_samples(_WriterSuite_HDF5_01):
class Read_50by50by10_100_samples (line 114) | class Read_50by50by10_100_samples(_WriterSuite_HDF5_01):
class Read_50by50by10_300_samples (line 121) | class Read_50by50by10_300_samples(_WriterSuite_HDF5_01):
FILE: asv_bench/benchmarks/backends/numpy_10.py
class _WriterSuite_NUMPY_10 (line 10) | class _WriterSuite_NUMPY_10:
method setup (line 18) | def setup(self):
method teardown (line 63) | def teardown(self):
method read (line 68) | def read(self):
method write (line 73) | def write(self):
method size (line 82) | def size(self):
class Write_50by50by10_1_samples (line 86) | class Write_50by50by10_1_samples(_WriterSuite_NUMPY_10):
class Write_50by50by10_100_samples (line 94) | class Write_50by50by10_100_samples(_WriterSuite_NUMPY_10):
class Read_50by50by10_1_samples (line 104) | class Read_50by50by10_1_samples(_WriterSuite_NUMPY_10):
class Read_50by50by10_100_samples (line 111) | class Read_50by50by10_100_samples(_WriterSuite_NUMPY_10):
class Read_50by50by10_300_samples (line 118) | class Read_50by50by10_300_samples(_WriterSuite_NUMPY_10):
FILE: asv_bench/benchmarks/commit_and_checkout.py
class MakeCommit (line 7) | class MakeCommit(object):
method setup (line 16) | def setup(self, num_samples):
method teardown (line 34) | def teardown(self, num_samples):
method time_commit (line 39) | def time_commit(self, num_samples):
class CheckoutCommit (line 43) | class CheckoutCommit(object):
method setup (line 52) | def setup(self, num_samples):
method teardown (line 73) | def teardown(self, num_samples):
method time_checkout_read_only (line 81) | def time_checkout_read_only(self, num_samples):
method time_checkout_write_enabled (line 84) | def time_checkout_write_enabled(self, num_samples):
FILE: asv_bench/benchmarks/package.py
class TimeImport (line 3) | class TimeImport(object):
method timeraw_import (line 8) | def timeraw_import(self):
FILE: setup.py
class LazyCommandClass (line 35) | class LazyCommandClass(dict):
method __contains__ (line 41) | def __contains__(self, key):
method __setitem__ (line 44) | def __setitem__(self, key, value):
method __getitem__ (line 49) | def __getitem__(self, key):
method make_build_ext_cmd (line 59) | def make_build_ext_cmd(self):
method make_bdist_wheel_cmd (line 79) | def make_bdist_wheel_cmd(self):
method make_sdist_cmd (line 91) | def make_sdist_cmd(self):
FILE: src/hangar/_version.py
class InfinityType (line 34) | class InfinityType(object):
method __repr__ (line 37) | def __repr__(self) -> str:
method __hash__ (line 40) | def __hash__(self) -> int:
method __lt__ (line 43) | def __lt__(self, other: object) -> bool:
method __le__ (line 46) | def __le__(self, other: object) -> bool:
method __eq__ (line 49) | def __eq__(self, other: object) -> bool:
method __ne__ (line 52) | def __ne__(self, other: object) -> bool:
method __gt__ (line 55) | def __gt__(self, other: object) -> bool:
method __ge__ (line 58) | def __ge__(self, other: object) -> bool:
method __neg__ (line 61) | def __neg__(self) -> 'NegativeInfinityType':
class NegativeInfinityType (line 68) | class NegativeInfinityType(object):
method __repr__ (line 71) | def __repr__(self) -> str:
method __hash__ (line 74) | def __hash__(self) -> int:
method __lt__ (line 77) | def __lt__(self, other: object) -> bool:
method __le__ (line 80) | def __le__(self, other: object) -> bool:
method __eq__ (line 83) | def __eq__(self, other: object) -> bool:
method __ne__ (line 86) | def __ne__(self, other: object) -> bool:
method __gt__ (line 89) | def __gt__(self, other: object) -> bool:
method __ge__ (line 92) | def __ge__(self, other: object) -> bool:
method __neg__ (line 95) | def __neg__(self) -> InfinityType:
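`InfinityType` and `NegativeInfinityType` are the ordering sentinels vendored from the `packaging` library: singletons that compare greater (respectively less) than every other object, used to pad version tuples of unequal length before comparison. A condensed sketch of the positive sentinel's idea (names abbreviated; not the vendored code verbatim):

```python
class _Infinity:
    """Sentinel that compares greater than any other object."""

    def __repr__(self):
        return 'Infinity'

    def __hash__(self):
        return hash(repr(self))

    def __lt__(self, other):
        return False

    def __le__(self, other):
        return isinstance(other, _Infinity)

    def __eq__(self, other):
        return isinstance(other, _Infinity)

    def __gt__(self, other):
        return not isinstance(other, _Infinity)

    def __ge__(self, other):
        return True


Infinity = _Infinity()
```

Because Python falls back to the reflected operator, `5 < Infinity` also works: `int.__lt__` returns `NotImplemented` and `_Infinity.__gt__(5)` decides the comparison.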
function parse (line 127) | def parse(version: str) -> Union['Version']:
class InvalidVersion (line 135) | class InvalidVersion(ValueError):
class _BaseVersion (line 142) | class _BaseVersion(object):
method __init__ (line 146) | def __init__(self):
method __hash__ (line 149) | def __hash__(self) -> int:
method __lt__ (line 152) | def __lt__(self, other: '_BaseVersion') -> bool:
method __le__ (line 155) | def __le__(self, other: '_BaseVersion') -> bool:
method __eq__ (line 158) | def __eq__(self, other: object) -> bool:
method __ge__ (line 161) | def __ge__(self, other: '_BaseVersion') -> bool:
method __gt__ (line 164) | def __gt__(self, other: '_BaseVersion') -> bool:
method __ne__ (line 167) | def __ne__(self, other: object) -> bool:
method _compare (line 170) | def _compare(self, other: object, method: 'VersionComparisonMethod'
class Version (line 213) | class Version(_BaseVersion): # lgtm [py/missing-equals]
method __init__ (line 217) | def __init__(self, version: str) -> None:
method __repr__ (line 247) | def __repr__(self) -> str:
method __str__ (line 250) | def __str__(self) -> str:
method epoch (line 279) | def epoch(self) -> int:
method release (line 284) | def release(self) -> Tuple[int, ...]:
method pre (line 289) | def pre(self) -> Optional[Tuple[str, int]]:
method post (line 294) | def post(self) -> Optional[Tuple[str, int]]:
method dev (line 298) | def dev(self) -> Optional[Tuple[str, int]]:
method local (line 302) | def local(self) -> Optional[str]:
method public (line 309) | def public(self) -> str:
method base_version (line 313) | def base_version(self) -> str:
method is_prerelease (line 326) | def is_prerelease(self) -> bool:
method is_postrelease (line 330) | def is_postrelease(self) -> bool:
method is_devrelease (line 334) | def is_devrelease(self) -> bool:
method major (line 338) | def major(self) -> int:
method minor (line 342) | def minor(self) -> int:
method micro (line 346) | def micro(self) -> int:
function _parse_letter_version (line 350) | def _parse_letter_version(
function _parse_local_version (line 390) | def _parse_local_version(local: str) -> Optional['LocalType']:
function _cmpkey (line 402) | def _cmpkey(
FILE: src/hangar/backends/chunk.py
function _csformula (line 21) | def _csformula(expected_mb):
function _limit_es (line 31) | def _limit_es(expected_mb):
function _calc_chunksize (line 41) | def _calc_chunksize(expected_mb):
function _rowsize (line 63) | def _rowsize(shape, maindim, itemsize):
function calc_chunkshape (line 90) | def calc_chunkshape(shape, expectedrows, itemsize, maindim):
FILE: src/hangar/backends/hdf5_00.py
function hdf5_00_encode (line 212) | def hdf5_00_encode(uid: str, cksum: str, dset: int, dset_idx: int, shape...
class BloscCompressionOptions (line 248) | class BloscCompressionOptions(Descriptor):
class GzipCompressionOptions (line 256) | class GzipCompressionOptions(Descriptor):
class LzfCompressionOptions (line 264) | class LzfCompressionOptions(Descriptor):
class AllowedDtypes (line 271) | class AllowedDtypes(Descriptor):
class HDF5_00_Options (line 279) | class HDF5_00_Options(metaclass=checkedmeta):
method __init__ (line 287) | def __init__(self, backend_options, dtype, shape, *args, **kwargs):
method _verify_data_nbytes_larger_than_clib_min (line 303) | def _verify_data_nbytes_larger_than_clib_min(self):
method default_options (line 319) | def default_options(self):
method backend_options (line 329) | def backend_options(self):
method init_requires (line 333) | def init_requires(self):
class HDF5_00_FileHandles (line 340) | class HDF5_00_FileHandles(object):
method __init__ (line 348) | def __init__(self, repo_path: Path, schema_shape: tuple, schema_dtype:...
method __enter__ (line 373) | def __enter__(self):
method __exit__ (line 376) | def __exit__(self, *exc):
method __getstate__ (line 383) | def __getstate__(self) -> dict:
method __setstate__ (line 395) | def __setstate__(self, state: dict) -> None: # pragma: no cover
method backend_opts (line 407) | def backend_opts(self):
method _backend_opts_set (line 411) | def _backend_opts_set(self, val):
method backend_opts (line 418) | def backend_opts(self, value):
method open (line 433) | def open(self, mode: str, *, remote_operation: bool = False):
method close (line 466) | def close(self):
method delete_in_process_data (line 499) | def delete_in_process_data(repo_path: Path, *, remote_operation=False)...
method _dataset_opts (line 526) | def _dataset_opts(complib: str, complevel: int, shuffle: Union[bool, s...
method _chunk_opts (line 596) | def _chunk_opts(sample_array: np.ndarray, max_chunk_nbytes: int) -> Tu...
method _create_schema (line 626) | def _create_schema(self, *, remote_operation: bool = False):
method read_data (line 732) | def read_data(self, hashVal: HDF5_00_DataHashSpec) -> np.ndarray:
method write_data (line 803) | def write_data(self, array: np.ndarray, *, remote_operation: bool = Fa...
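Every storage backend listed here exposes the same handler surface: context-manager entry/exit, `open`/`close` with a mode, picklable state via `__getstate__`/`__setstate__`, and `read_data`/`write_data` keyed by a decoded hash spec. A toy dict-backed handler showing the shape of that contract (the in-memory store and `uuid` spec are assumptions for illustration; the real backends persist to HDF5/LMDB/`.npy` files and encode `uid`/checksum/index into the spec):

```python
import uuid

class ToyFileHandles:
    """Minimal stand-in for a Hangar backend handler: same open/close,
    context-manager, and read_data/write_data surface, dict-backed."""

    def __init__(self):
        self._store = {}
        self._mode = None

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False

    def open(self, mode: str):
        self._mode = mode  # 'r' or 'a' in the real backends

    def close(self):
        self._mode = None

    def write_data(self, data: bytes) -> str:
        if self._mode != 'a':
            raise PermissionError('handler not opened for writing')
        spec = uuid.uuid4().hex  # stand-in for the encoded hash spec
        self._store[spec] = data
        return spec

    def read_data(self, spec: str) -> bytes:
        if self._mode is None:
            raise PermissionError('handler not opened')
        return self._store[spec]
```

Keeping this surface identical across `hdf5_00`, `hdf5_01`, `lmdb_30/31`, `numpy_10`, and `remote_50` is what lets columns swap backends via a format-code prefix without touching caller code.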
FILE: src/hangar/backends/hdf5_01.py
function hdf5_01_encode (line 250) | def hdf5_01_encode(uid: str, cksum: str, dset: int, dset_idx: int,
class BloscCompressionOptions (line 287) | class BloscCompressionOptions(Descriptor):
class GzipCompressionOptions (line 295) | class GzipCompressionOptions(Descriptor):
class LzfCompressionOptions (line 303) | class LzfCompressionOptions(Descriptor):
class AllowedDtypes (line 310) | class AllowedDtypes(Descriptor):
class HDF5_01_Options (line 316) | class HDF5_01_Options(metaclass=checkedmeta):
method __init__ (line 324) | def __init__(self, backend_options, dtype, shape, *args, **kwargs):
method _verify_data_nbytes_larger_than_clib_min (line 340) | def _verify_data_nbytes_larger_than_clib_min(self):
method default_options (line 356) | def default_options(self):
method backend_options (line 366) | def backend_options(self):
method init_requires (line 370) | def init_requires(self):
class HDF5_01_FileHandles (line 377) | class HDF5_01_FileHandles(object):
method __init__ (line 385) | def __init__(self, repo_path: Path, schema_shape: tuple, schema_dtype:...
method __enter__ (line 410) | def __enter__(self):
method __exit__ (line 413) | def __exit__(self, *exc):
method __getstate__ (line 420) | def __getstate__(self) -> dict:
method __setstate__ (line 432) | def __setstate__(self, state: dict) -> None: # pragma: no cover
method backend_opts (line 444) | def backend_opts(self):
method _backend_opts_set (line 448) | def _backend_opts_set(self, val):
method backend_opts (line 455) | def backend_opts(self, value):
method open (line 470) | def open(self, mode: str, *, remote_operation: bool = False):
method close (line 503) | def close(self):
method delete_in_process_data (line 536) | def delete_in_process_data(repo_path: Path, *, remote_operation=False)...
method _dataset_opts (line 563) | def _dataset_opts(complib: str, complevel: int, shuffle: Union[bool, s...
method _create_schema (line 632) | def _create_schema(self, *, remote_operation: bool = False):
method read_data (line 743) | def read_data(self, hashVal: HDF5_01_DataHashSpec) -> np.ndarray:
method write_data (line 810) | def write_data(self, array: np.ndarray, *, remote_operation: bool = Fa...
FILE: src/hangar/backends/lmdb_30.py
function _lexicographic_keys (line 114) | def _lexicographic_keys():
function lmdb_30_encode (line 131) | def lmdb_30_encode(uid: str, row_idx: int, checksum: str) -> bytes:
class AllowedDtypes (line 137) | class AllowedDtypes(Descriptor):
class LMDB_30_Options (line 141) | class LMDB_30_Options(metaclass=checkedmeta):
method __init__ (line 145) | def __init__(self, backend_options, dtype, *args, **kwargs):
method default_options (line 152) | def default_options(self):
method backend_options (line 156) | def backend_options(self):
method init_requires (line 160) | def init_requires(self):
class LMDB_30_FileHandles (line 164) | class LMDB_30_FileHandles:
method __init__ (line 166) | def __init__(self, repo_path: Path, *args, **kwargs):
method __enter__ (line 185) | def __enter__(self):
method __exit__ (line 189) | def __exit__(self, *exc):
method __getstate__ (line 193) | def __getstate__(self) -> dict:
method __setstate__ (line 203) | def __setstate__(self, state: dict) -> None: # pragma: no cover
method backend_opts (line 212) | def backend_opts(self):
method _backend_opts_set (line 216) | def _backend_opts_set(self, val):
method backend_opts (line 223) | def backend_opts(self, value):
method open (line 238) | def open(self, mode: str, *, remote_operation: bool = False):
method close (line 271) | def close(self):
method delete_in_process_data (line 295) | def delete_in_process_data(repo_path: Path, *, remote_operation=False)...
method _create_schema (line 322) | def _create_schema(self, *, remote_operation: bool = False):
method read_data (line 334) | def read_data(self, hashVal: LMDB_30_DataHashSpec) -> str:
method write_data (line 370) | def write_data(self, data: str, *, remote_operation: bool = False) -> ...
FILE: src/hangar/backends/lmdb_31.py
function _lexicographic_keys (line 114) | def _lexicographic_keys():
function lmdb_31_encode (line 131) | def lmdb_31_encode(uid: str, row_idx: int, checksum: str) -> bytes:
class AllowedDtypes (line 137) | class AllowedDtypes(Descriptor):
class LMDB_31_Options (line 141) | class LMDB_31_Options(metaclass=checkedmeta):
method __init__ (line 145) | def __init__(self, backend_options, dtype, *args, **kwargs):
method default_options (line 152) | def default_options(self):
method backend_options (line 156) | def backend_options(self):
method init_requires (line 160) | def init_requires(self):
class LMDB_31_FileHandles (line 164) | class LMDB_31_FileHandles:
method __init__ (line 166) | def __init__(self, repo_path: Path, *args, **kwargs):
method __enter__ (line 185) | def __enter__(self):
method __exit__ (line 189) | def __exit__(self, *exc):
method __getstate__ (line 193) | def __getstate__(self) -> dict:
method __setstate__ (line 203) | def __setstate__(self, state: dict) -> None: # pragma: no cover
method backend_opts (line 212) | def backend_opts(self):
method _backend_opts_set (line 216) | def _backend_opts_set(self, val):
method backend_opts (line 223) | def backend_opts(self, value):
method open (line 238) | def open(self, mode: str, *, remote_operation: bool = False):
method close (line 271) | def close(self):
method delete_in_process_data (line 295) | def delete_in_process_data(repo_path: Path, *, remote_operation=False)...
method _create_schema (line 322) | def _create_schema(self, *, remote_operation: bool = False):
method read_data (line 334) | def read_data(self, hashVal: LMDB_31_DataHashSpec) -> str:
method write_data (line 369) | def write_data(self, data: bytes, *, remote_operation: bool = False) -...
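Both LMDB backends pair `_lexicographic_keys` with a `uid`/`row_idx`/`checksum` byte encoding. LMDB iterates keys in raw byte order, so numeric indices must be rendered such that lexicographic order equals numeric order; the standard trick is zero-padding to fixed width. A sketch (the field layout and width here are assumptions, not Hangar's exact encoding):

```python
def encode_key(uid: str, row_idx: int, width: int = 8) -> bytes:
    """Fixed-width zero-padded index so byte order == numeric order."""
    return f'{uid}:{row_idx:0{width}d}'.encode()
```

Without the padding, `b'uid:10'` would sort before `b'uid:2'`; with it, `b'uid:00000002'` < `b'uid:00000010'` as intended.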
FILE: src/hangar/backends/numpy_10.py
function numpy_10_encode (line 106) | def numpy_10_encode(uid: str, cksum: str, collection_idx: int, shape: tu...
class AllowedDtypes (line 137) | class AllowedDtypes(Descriptor):
class NUMPY_10_Options (line 143) | class NUMPY_10_Options(metaclass=checkedmeta):
method __init__ (line 147) | def __init__(self, backend_options, dtype, *args, **kwargs):
method default_options (line 155) | def default_options(self):
method backend_options (line 159) | def backend_options(self):
method init_requires (line 163) | def init_requires(self):
class NUMPY_10_FileHandles (line 167) | class NUMPY_10_FileHandles(object):
method __init__ (line 169) | def __init__(self, repo_path: Path, schema_shape: tuple, schema_dtype:...
method __getstate__ (line 190) | def __getstate__(self) -> dict:
method __setstate__ (line 200) | def __setstate__(self, state: dict) -> None: # pragma: no cover
method __enter__ (line 209) | def __enter__(self):
method __exit__ (line 212) | def __exit__(self, *exc):
method backend_opts (line 217) | def backend_opts(self):
method _backend_opts_set (line 221) | def _backend_opts_set(self, val):
method backend_opts (line 228) | def backend_opts(self, value):
method open (line 243) | def open(self, mode: str, *, remote_operation: bool = False):
method close (line 271) | def close(self, *args, **kwargs):
method delete_in_process_data (line 286) | def delete_in_process_data(repo_path: Path, *, remote_operation: bool ...
method _create_schema (line 312) | def _create_schema(self, *, remote_operation: bool = False):
method read_data (line 336) | def read_data(self, hashVal: NUMPY_10_DataHashSpec) -> np.ndarray:
method write_data (line 394) | def write_data(self, array: np.ndarray, *, remote_operation: bool = Fa...
FILE: src/hangar/backends/remote_50.py
function remote_50_encode (line 70) | def remote_50_encode(schema_hash: str = '') -> bytes:
class REMOTE_50_Options (line 84) | class REMOTE_50_Options(metaclass=checkedmeta):
method __init__ (line 87) | def __init__(self, backend_options, *args, **kwargs):
method default_options (line 93) | def default_options(self):
method backend_options (line 97) | def backend_options(self):
method init_requires (line 101) | def init_requires(self):
class REMOTE_50_Handler (line 105) | class REMOTE_50_Handler(object):
method __init__ (line 107) | def __init__(self, repo_path: Path, *args, **kwargs):
method __enter__ (line 112) | def __enter__(self):
method __exit__ (line 115) | def __exit__(self, *exc):
method __getstate__ (line 119) | def __getstate__(self) -> dict: # pragma: no cover
method __setstate__ (line 126) | def __setstate__(self, state: dict) -> None: # pragma: no cover
method backend_opts (line 133) | def backend_opts(self):
method _backend_opts_set (line 137) | def _backend_opts_set(self, val):
method backend_opts (line 144) | def backend_opts(self, value):
method open (line 159) | def open(self, mode, *args, **kwargs):
method close (line 163) | def close(self, *args, **kwargs):
method delete_in_process_data (line 167) | def delete_in_process_data(*args, **kwargs) -> None:
method read_data (line 172) | def read_data(self, hashVal: REMOTE_50_DataHashSpec) -> None:
method write_data (line 177) | def write_data(self, schema_hash: str, *args, **kwargs) -> bytes:
FILE: src/hangar/bulk_importer.py
class UDF_Return (line 114) | class UDF_Return(NamedTuple):
method __eq__ (line 130) | def __eq__(self, other):
function run_bulk_import (line 147) | def run_bulk_import(
class _ContentDescriptionPrep (line 401) | class _ContentDescriptionPrep(NamedTuple):
method db_record_key (line 408) | def db_record_key(self):
method db_record_val (line 417) | def db_record_val(self):
class _Task (line 421) | class _Task(NamedTuple):
method num_steps (line 426) | def num_steps(self):
class _WrittenContentDescription (line 430) | class _WrittenContentDescription(NamedTuple):
function _num_steps_in_task_list (line 444) | def _num_steps_in_task_list(task_list: List[_Task]) -> int:
function _serialize_udf (line 449) | def _serialize_udf(udf: UDF_T) -> bytes:
function _deserialize_udf (line 454) | def _deserialize_udf(raw: bytes) -> UDF_T:
function _process_num_cpus (line 459) | def _process_num_cpus(ncpus: int) -> int:
function _check_user_input_func (line 483) | def _check_user_input_func(
class _MPQueue (line 562) | class _MPQueue(mpq.Queue):
method __init__ (line 566) | def __init__(self, *args, **kwargs):
method safe_get (line 570) | def safe_get(self, timeout=0.5):
method safe_put (line 579) | def safe_put(self, item, timeout=0.5) -> bool:
method drain (line 586) | def drain(self):
method safe_close (line 592) | def safe_close(self) -> int:
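`_MPQueue` wraps a multiprocessing queue so timeout-bounded gets and puts return sentinels instead of raising `queue.Empty`/`queue.Full` into the worker loop, and `drain` empties whatever remains at shutdown. The same idea sketched around a plain `queue.Queue` for determinism (composition rather than the `mpq.Queue` subclassing used above is an assumption for illustration):

```python
import queue

class SafeQueue:
    """Timeout-bounded get/put that return sentinels instead of raising."""

    def __init__(self):
        self._q = queue.Queue()

    def safe_put(self, item, timeout=0.5) -> bool:
        try:
            self._q.put(item, timeout=timeout)
            return True
        except queue.Full:
            return False

    def safe_get(self, timeout=0.5):
        try:
            return self._q.get(timeout=timeout)
        except queue.Empty:
            return None

    def drain(self):
        """Pop everything currently queued without blocking."""
        items = []
        while True:
            try:
                items.append(self._q.get_nowait())
            except queue.Empty:
                return items
```

Swallowing the timeout exceptions keeps the producer/consumer processes polling instead of dying mid-import when the pipeline momentarily stalls.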
class _BatchProcessPrepare (line 599) | class _BatchProcessPrepare(mp.Process):
method __init__ (line 601) | def __init__(
method _setup (line 639) | def _setup(self):
method _input_tasks (line 642) | def _input_tasks(self) -> Iterator[List[dict]]:
method run (line 648) | def run(self):
function _run_prepare_recipe (line 677) | def _run_prepare_recipe(
class _BatchProcessWriter (line 743) | class _BatchProcessWriter(mp.Process):
method __init__ (line 745) | def __init__(
method _setup (line 785) | def _setup(self):
method _input_tasks (line 802) | def _input_tasks(self) -> Iterator[List[_Task]]:
method _enter_backends (line 809) | def _enter_backends(self):
method run (line 818) | def run(self):
function _run_write_recipe_data (line 846) | def _run_write_recipe_data(
function _unify_recipe_contents (line 917) | def _unify_recipe_contents(recipe: List[Tuple[dict, List[_ContentDescrip...
function _reduce_recipe_on_required_digests (line 936) | def _reduce_recipe_on_required_digests(recipe: List[Tuple[dict, List[_Co...
function _write_digest_to_bespec_mapping (line 997) | def _write_digest_to_bespec_mapping(
function _write_full_recipe_sample_key_to_digest_mapping (line 1021) | def _write_full_recipe_sample_key_to_digest_mapping(
function _mock_hangar_directory_structure (line 1041) | def _mock_hangar_directory_structure(dir_name: str) -> Path:
function _move_tmpdir_data_files_to_repodir (line 1063) | def _move_tmpdir_data_files_to_repodir(repodir: Path, tmpdir: Path):
FILE: src/hangar/checkout.py
class ReaderCheckout (line 36) | class ReaderCheckout(GetMixin, CheckoutDictIteration):
method __init__ (line 80) | def __init__(self,
method _repr_pretty_ (line 123) | def _repr_pretty_(self, p, cycle):
method __repr__ (line 133) | def __repr__(self):
method __enter__ (line 142) | def __enter__(self):
method __exit__ (line 151) | def __exit__(self, *exc):
method _verify_alive (line 155) | def _verify_alive(self):
method _is_conman (line 170) | def _is_conman(self) -> bool:
method columns (line 175) | def columns(self) -> Columns:
method diff (line 213) | def diff(self) -> ReaderUserDiff:
method commit_hash (line 233) | def commit_hash(self) -> str:
method log (line 248) | def log(self,
method close (line 299) | def close(self) -> None:
class WriterCheckout (line 321) | class WriterCheckout(GetMixin, CheckoutDictIteration):
method __init__ (line 359) | def __init__(self,
method _repr_pretty_ (line 407) | def _repr_pretty_(self, p, cycle):
method __repr__ (line 417) | def __repr__(self):
method __enter__ (line 428) | def __enter__(self):
method __exit__ (line 437) | def __exit__(self, *exc):
method _is_conman (line 442) | def _is_conman(self):
method _verify_alive (line 446) | def _verify_alive(self):
method _setup (line 476) | def _setup(self):
method columns (line 533) | def columns(self) -> Columns:
method diff (line 579) | def diff(self) -> WriterUserDiff:
method branch_name (line 599) | def branch_name(self) -> str:
method commit_hash (line 611) | def commit_hash(self) -> str:
method log (line 624) | def log(self,
method add_str_column (line 675) | def add_str_column(self,
method add_bytes_column (line 756) | def add_bytes_column(self,
method add_ndarray_column (line 837) | def add_ndarray_column(self,
method _initialize_new_column (line 959) | def _initialize_new_column(self,
method merge (line 1009) | def merge(self, message: str, dev_branch: str) -> str:
method commit (line 1058) | def commit(self, commit_message: str) -> str:
method reset_staging_area (line 1110) | def reset_staging_area(self, *, force=False) -> str:
method close (line 1175) | def close(self) -> None:
FILE: src/hangar/cli/cli.py
function main (line 34) | def main(ctx): # pragma: no cover
function init (line 48) | def init(repo: Repository, name, email, overwrite):
function writer_lock_held (line 64) | def writer_lock_held(repo: Repository, force_release_):
function checkout (line 87) | def checkout(repo: Repository, branchname):
function commit (line 107) | def commit(repo: Repository, message):
function column (line 152) | def column(ctx): # pragma: no cover
function create_column (line 170) | def create_column(repo: Repository, name, dtype, shape, variable_, subsa...
function remove_column (line 237) | def remove_column(repo: Repository, name):
function clone (line 266) | def clone(repo: Repository, remote, name, email, overwrite):
function fetch_records (line 282) | def fetch_records(repo: Repository, remote, branch):
function fetch_data (line 300) | def fetch_data(repo: Repository, remote, startpoint, column, all_):
function push (line 333) | def push(repo: Repository, remote, branch):
function remote (line 345) | def remote(ctx): # pragma: no cover
function list_remotes (line 353) | def list_remotes(repo: Repository):
function add_remote (line 363) | def add_remote(repo: Repository, name, address):
function remove_remote (line 375) | def remove_remote(repo: Repository, name):
function diff (line 390) | def diff(repo: Repository, dev, master):
function summary (line 419) | def summary(repo: Repository, startpoint):
function log (line 441) | def log(repo: Repository, startpoint):
function status (line 460) | def status(repo: Repository):
function branch (line 477) | def branch(ctx): # pragma: no cover
function branch_list (line 485) | def branch_list(repo: Repository):
function branch_create (line 497) | def branch_create(repo: Repository, name, startpoint):
function branch_remove (line 533) | def branch_remove(repo: Repository, name, force):
function server (line 558) | def server(overwrite, ip, port, timeout):
function import_data (line 613) | def import_data(ctx, repo: Repository, column, path, branch, plugin, ove...
function export_data (line 683) | def export_data(ctx, repo: Repository, column, outdir, startpoint, sampl...
function view_data (line 742) | def view_data(ctx, repo: Repository, column, sample, startpoint, format_...
function lmdb_record_details (line 785) | def lmdb_record_details(repo: Repository, a, b, r, d, s, z, limit):
FILE: src/hangar/cli/utils.py
class StrOrIntType (line 4) | class StrOrIntType(click.ParamType):
method convert (line 9) | def convert(self, value, param, ctx):
function parse_custom_arguments (line 25) | def parse_custom_arguments(click_args: list) -> dict:
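`parse_custom_arguments` collects leftover `--key=value` click tokens into a dict so they can be forwarded to import/export plugins. A hedged guess at the behavior (`parse_kv_args` is an illustrative name; Hangar's exact validation and error messages may differ):

```python
def parse_kv_args(tokens: list) -> dict:
    """Collect ``--key=value`` tokens into a dict; reject anything else."""
    out = {}
    for tok in tokens:
        if not tok.startswith('--') or '=' not in tok:
            raise ValueError(f'cannot parse custom argument: {tok!r}')
        key, _, value = tok[2:].partition('=')
        out[key] = value
    return out
```

Values stay as strings here; anything type-sensitive is left for the receiving plugin to coerce.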
FILE: src/hangar/columns/column.py
class Columns (line 33) | class Columns:
method __init__ (line 43) | def __init__(self,
method _open (line 90) | def _open(self):
method _close (line 94) | def _close(self):
method _destruct (line 98) | def _destruct(self):
method __getattr__ (line 107) | def __getattr__(self, name):
method _repr_pretty_ (line 123) | def _repr_pretty_(self, p, cycle):
method __repr__ (line 133) | def __repr__(self):
method _ipython_key_completions_ (line 140) | def _ipython_key_completions_(self):
method __getitem__ (line 154) | def __getitem__(self, key: str) -> ModifierTypes:
method __contains__ (line 174) | def __contains__(self, key: str) -> bool:
method __len__ (line 190) | def __len__(self) -> int:
method __iter__ (line 195) | def __iter__(self) -> Iterable[str]:
method _is_conman (line 199) | def _is_conman(self):
method _any_is_conman (line 202) | def _any_is_conman(self) -> bool:
method __enter__ (line 213) | def __enter__(self):
method __exit__ (line 221) | def __exit__(self, *exc):
method iswriteable (line 226) | def iswriteable(self) -> bool:
method contains_remote_references (line 232) | def contains_remote_references(self) -> Mapping[str, bool]:
method remote_sample_keys (line 248) | def remote_sample_keys(self) -> Mapping[str, Iterable[Union[int, str]]]:
method keys (line 262) | def keys(self) -> List[str]:
method values (line 272) | def values(self) -> Iterable[ModifierTypes]:
method items (line 285) | def items(self) -> Iterable[Tuple[str, ModifierTypes]]:
method get (line 297) | def get(self, name: str) -> ModifierTypes:
method __delitem__ (line 318) | def __delitem__(self, key: str) -> str:
method delete (line 344) | def delete(self, column: str) -> str:
method _from_staging_area (line 397) | def _from_staging_area(cls, repo_pth, hashenv, stageenv, stagehashenv):
method _from_commit (line 458) | def _from_commit(cls, repo_pth, hashenv, cmtrefenv):
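The `Columns` container implements the read-only mapping protocol (`__getitem__`, `__contains__`, `__len__`, `__iter__`, `keys`/`values`/`items`) plus `_ipython_key_completions_` for tab completion on `columns['<TAB>`. A minimal dict-backed wrapper showing that surface (an illustration of the interface shape, not Hangar's accessor or transaction logic):

```python
class ToyColumns:
    """Read-only, dict-backed mapping with IPython key completion."""

    def __init__(self, columns: dict):
        self._columns = dict(columns)

    def __getitem__(self, key):
        return self._columns[key]

    def __contains__(self, key):
        return key in self._columns

    def __len__(self):
        return len(self._columns)

    def __iter__(self):
        return iter(self._columns)

    def keys(self):
        return list(self._columns)

    def values(self):
        return list(self._columns.values())

    def items(self):
        return list(self._columns.items())

    def _ipython_key_completions_(self):
        # IPython calls this to offer key completions inside [...]
        return self.keys()
```

The writer-enabled variant layers `__delitem__`/`delete` on top of this read surface, which is why deletion appears only in the listing above and not in the reader checkout.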
FILE: src/hangar/columns/common.py
class ColumnTxn (line 9) | class ColumnTxn(object):
method __init__ (line 23) | def __init__(self, dataenv, hashenv, stagehashenv):
method _debug_ (line 35) | def _debug_(self): # pragma: no cover
method open_read (line 46) | def open_read(self):
method close_read (line 53) | def close_read(self):
method open_write (line 59) | def open_write(self):
method close_write (line 67) | def close_write(self):
method read (line 75) | def read(self):
method write (line 87) | def write(self):
function open_file_handles (line 99) | def open_file_handles(backends, path, mode, schema, *, remote_operation=...
FILE: src/hangar/columns/constructors.py
function column_type_object_from_schema (line 40) | def column_type_object_from_schema(schema: dict):
function _warn_remote (line 51) | def _warn_remote(aset_name):
function _flat_load_sample_keys_and_specs (line 61) | def _flat_load_sample_keys_and_specs(column_name, txnctx):
function generate_flat_column (line 92) | def generate_flat_column(txnctx, column_name, path, schema, mode):
function _nested_load_sample_keys_and_specs (line 147) | def _nested_load_sample_keys_and_specs(column_name, txnctx):
function generate_nested_column (line 179) | def generate_nested_column(txnctx, column_name, path, schema, mode):
FILE: src/hangar/columns/introspection.py
function is_column (line 10) | def is_column(obj) -> bool:
function is_writer_column (line 20) | def is_writer_column(obj) -> bool:
FILE: src/hangar/columns/layout_flat.py
class FlatSampleReader (line 34) | class FlatSampleReader:
method __init__ (line 63) | def __init__(self,
method _debug_ (line 82) | def _debug_(self): # pragma: no cover
method __repr__ (line 94) | def __repr__(self):
method _repr_pretty_ (line 103) | def _repr_pretty_(self, p, cycle):
method _ipython_key_completions_ (line 116) | def _ipython_key_completions_(self): # pragma: no cover
method __getstate__ (line 131) | def __getstate__(self) -> dict:
method __setstate__ (line 136) | def __setstate__(self, state: dict) -> None:
method __enter__ (line 146) | def __enter__(self):
method __exit__ (line 149) | def __exit__(self, *exc):
method _destruct (line 152) | def _destruct(self):
method __getattr__ (line 159) | def __getattr__(self, name):
method _is_conman (line 174) | def _is_conman(self) -> bool:
method __iter__ (line 177) | def __iter__(self) -> Iterable[KeyType]:
method __len__ (line 187) | def __len__(self) -> int:
method __contains__ (line 192) | def __contains__(self, key: KeyType) -> bool:
method _open (line 197) | def _open(self):
method _close (line 201) | def _close(self):
method __getitem__ (line 205) | def __getitem__(self, key: KeyType):
method get (line 228) | def get(self, key: KeyType, default=None):
method column (line 252) | def column(self) -> str:
method column_type (line 258) | def column_type(self):
method column_layout (line 264) | def column_layout(self):
method schema_type (line 270) | def schema_type(self):
method dtype (line 276) | def dtype(self):
method shape (line 282) | def shape(self):
method backend (line 291) | def backend(self) -> str:
method backend_options (line 297) | def backend_options(self):
method iswriteable (line 303) | def iswriteable(self) -> bool:
method contains_subsamples (line 309) | def contains_subsamples(self) -> bool:
method contains_remote_references (line 315) | def contains_remote_references(self) -> bool:
method remote_reference_keys (line 333) | def remote_reference_keys(self) -> Tuple[KeyType]:
method _mode_local_aware_key_looper (line 345) | def _mode_local_aware_key_looper(self, local: bool) -> Iterable[KeyType]:
method keys (line 371) | def keys(self, local: bool = False) -> Iterable[KeyType]:
method values (line 387) | def values(self, local: bool = False) -> Iterable[Any]:
method items (line 405) | def items(self, local: bool = False) -> Iterable[Tuple[KeyType, Any]]:
class FlatSampleWriter (line 426) | class FlatSampleWriter(FlatSampleReader):
method __init__ (line 431) | def __init__(self, aset_ctx, *args, **kwargs):
method __enter__ (line 435) | def __enter__(self):
method __exit__ (line 446) | def __exit__(self, *exc):
method _set_arg_validate (line 450) | def _set_arg_validate(self, key, value):
method _perform_set (line 473) | def _perform_set(self, key, value):
method __setitem__ (line 527) | def __setitem__(self, key, value):
method append (line 552) | def append(self, value) -> KeyType:
method update (line 585) | def update(self, other=None, **kwargs):
method __delitem__ (line 626) | def __delitem__(self, key: KeyType) -> None:
method pop (line 656) | def pop(self, key: KeyType):
method change_backend (line 678) | def change_backend(self, backend: str, backend_options: Optional[dict]...
FILE: src/hangar/columns/layout_nested.py
class FlatSubsampleReader (line 37) | class FlatSubsampleReader(object):
method __init__ (line 43) | def __init__(self,
method _debug_ (line 59) | def _debug_(self): # pragma: no cover
method __repr__ (line 70) | def __repr__(self):
method _repr_pretty_ (line 76) | def _repr_pretty_(self, p, cycle):
method _ipython_key_completions_ (line 84) | def _ipython_key_completions_(self):
method __enter__ (line 98) | def __enter__(self):
method __exit__ (line 102) | def __exit__(self, *exc):
method _destruct (line 105) | def _destruct(self):
method __getattr__ (line 111) | def __getattr__(self, name):
method __getstate__ (line 126) | def __getstate__(self) -> dict:
method __setstate__ (line 131) | def __setstate__(self, state: dict) -> None:
method __len__ (line 141) | def __len__(self) -> int:
method __contains__ (line 144) | def __contains__(self, key: KeyType) -> bool:
method __iter__ (line 147) | def __iter__(self) -> Iterable[KeyType]:
method __getitem__ (line 150) | def __getitem__(self, key: SubsampleGetKeysType) -> Union[Any, Dict[Ke...
method _enter_count (line 201) | def _enter_count(self):
method _enter_count (line 205) | def _enter_count(self, value):
method _is_conman (line 209) | def _is_conman(self):
method sample (line 213) | def sample(self) -> KeyType:
method column (line 219) | def column(self) -> str:
method iswriteable (line 225) | def iswriteable(self) -> bool:
method data (line 231) | def data(self) -> Dict[KeyType, Any]:
method _mode_local_aware_key_looper (line 242) | def _mode_local_aware_key_looper(self, local: bool) -> Iterable[KeyType]:
method contains_remote_references (line 269) | def contains_remote_references(self) -> bool:
method remote_reference_keys (line 287) | def remote_reference_keys(self) -> Tuple[KeyType]:
method keys (line 299) | def keys(self, local: bool = False) -> Iterable[KeyType]:
method values (line 315) | def values(self, local: bool = False) -> Iterable[Any]:
method items (line 333) | def items(self, local: bool = False) -> Iterable[Tuple[KeyType, Any]]:
method get (line 351) | def get(self, key: KeyType, default=None):
class FlatSubsampleWriter (line 378) | class FlatSubsampleWriter(FlatSubsampleReader):
method __init__ (line 383) | def __init__(self,
method __enter__ (line 394) | def __enter__(self):
method __exit__ (line 407) | def __exit__(self, *exc):
method _set_arg_validate (line 413) | def _set_arg_validate(self, key, value):
method _perform_set (line 420) | def _perform_set(self, key, value):
method __setitem__ (line 476) | def __setitem__(self, key, value):
method append (line 498) | def append(self, value) -> KeyType:
method update (line 533) | def update(self, other=None, **kwargs):
method __delitem__ (line 573) | def __delitem__(self, key: KeyType):
method pop (line 603) | def pop(self, key: KeyType):
class NestedSampleReader (line 621) | class NestedSampleReader:
method __init__ (line 627) | def __init__(self,
method __repr__ (line 644) | def __repr__(self):
method _repr_pretty_ (line 653) | def _repr_pretty_(self, p, cycle):
method _ipython_key_completions_ (line 667) | def _ipython_key_completions_(self):
method __enter__ (line 681) | def __enter__(self):
method __exit__ (line 685) | def __exit__(self, *exc):
method _destruct (line 688) | def _destruct(self):
method __getattr__ (line 697) | def __getattr__(self, name):
method __getstate__ (line 712) | def __getstate__(self) -> dict:
method __setstate__ (line 717) | def __setstate__(self, state: dict) -> None:
method __getitem__ (line 728) | def __getitem__(
method __iter__ (line 754) | def __iter__(self) -> Iterable[KeyType]:
method __len__ (line 764) | def __len__(self) -> int:
method __contains__ (line 769) | def __contains__(self, key: KeyType) -> bool:
method _open (line 774) | def _open(self):
method _close (line 784) | def _close(self):
method _enter_count (line 795) | def _enter_count(self):
method _enter_count (line 799) | def _enter_count(self, value):
method _is_conman (line 803) | def _is_conman(self):
method column (line 807) | def column(self) -> str:
method column_type (line 813) | def column_type(self):
method column_layout (line 819) | def column_layout(self):
method schema_type (line 825) | def schema_type(self):
method dtype (line 831) | def dtype(self):
method shape (line 837) | def shape(self):
method backend (line 846) | def backend(self) -> str:
method backend_options (line 852) | def backend_options(self):
method iswriteable (line 858) | def iswriteable(self) -> bool:
method _mode_local_aware_key_looper (line 863) | def _mode_local_aware_key_looper(self, local: bool) -> Iterable[KeyType]:
method contains_remote_references (line 890) | def contains_remote_references(self) -> bool:
method remote_reference_keys (line 908) | def remote_reference_keys(self) -> Tuple[KeyType]:
method contains_subsamples (line 921) | def contains_subsamples(self) -> bool:
method num_subsamples (line 927) | def num_subsamples(self) -> int:
method keys (line 935) | def keys(self, local: bool = False) -> Iterable[KeyType]:
method values (line 951) | def values(self, local: bool = False) -> Iterable[Any]:
method items (line 969) | def items(self, local: bool = False) -> Iterable[Tuple[KeyType, Any]]:
method get (line 987) | def get(
class NestedSampleWriter (line 1015) | class NestedSampleWriter(NestedSampleReader):
method __init__ (line 1020) | def __init__(self, aset_ctx=None, *args, **kwargs):
method __enter__ (line 1025) | def __enter__(self):
method __exit__ (line 1038) | def __exit__(self, *exc):
method _set_arg_validate (line 1042) | def _set_arg_validate(self, sample_key, subsample_map):
method _perform_set (line 1053) | def _perform_set(self, key, value) -> None:
method __setitem__ (line 1072) | def __setitem__(self, key, value) -> None:
method update (line 1086) | def update(self, other=None, **kwargs) -> None:
method __delitem__ (line 1126) | def __delitem__(self, key: KeyType):
method pop (line 1146) | def pop(self, key: KeyType) -> Dict[KeyType, Any]:
method change_backend (line 1165) | def change_backend(self, backend: str, backend_options: Optional[dict]...
FILE: src/hangar/context.py
class Environments (line 43) | class Environments(object):
method __init__ (line 45) | def __init__(self, pth: Path):
method repo_is_initialized (line 57) | def repo_is_initialized(self) -> bool:
method _startup (line 68) | def _startup(self) -> bool:
method init_repo (line 108) | def init_repo(self,
method checkout_commit (line 164) | def checkout_commit(self, branch_name: str = '', commit: str = '') -> ...
method _open_environments (line 213) | def _open_environments(self):
method _close_environments (line 231) | def _close_environments(self):
FILE: src/hangar/dataset/__init__.py
function make_numpy_dataset (line 13) | def make_numpy_dataset(
function make_torch_dataset (line 90) | def make_torch_dataset(
function make_tensorflow_dataset (line 149) | def make_tensorflow_dataset(
FILE: src/hangar/dataset/common.py
class HangarDataset (line 13) | class HangarDataset:
method __init__ (line 33) | def __init__(self,
method columns (line 91) | def columns(self):
method __len__ (line 94) | def __len__(self):
method index_get (line 97) | def index_get(self, index: int):
FILE: src/hangar/dataset/numpy_dset.py
function default_collate_fn (line 14) | def default_collate_fn(batch):
class NumpyDataset (line 28) | class NumpyDataset:
method __init__ (line 66) | def __init__(self, dataset: HangarDataset, batch_size: int, drop_last:...
method dataset (line 84) | def dataset(self):
method num_batches (line 88) | def num_batches(self):
method batch_size (line 92) | def batch_size(self):
method batch_size (line 96) | def batch_size(self, value: int):
method shuffle (line 104) | def shuffle(self):
method shuffle (line 108) | def shuffle(self, value: bool):
method __len__ (line 113) | def __len__(self):
method _batch (line 116) | def _batch(self, batch_size, drop_last=True) -> None:
method __iter__ (line 138) | def __iter__(self):
function _make_numpy_dataset (line 155) | def _make_numpy_dataset(columns: Sequence['Columns'],
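The `NumpyDataset._batch(batch_size, drop_last=True)` entry above splits sample indices into fixed-size batches, discarding a trailing partial batch when `drop_last` is set. A minimal stdlib sketch of that batching logic (the helper name `batch_indices` is illustrative, not Hangar's API):

```python
def batch_indices(indices, batch_size, drop_last=True):
    """Split a list of sample indices into consecutive batches.

    When drop_last is True, a trailing batch shorter than
    batch_size is discarded, mirroring the drop_last semantics
    listed for NumpyDataset._batch.
    """
    batches = [indices[i:i + batch_size]
               for i in range(0, len(indices), batch_size)]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches
```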
FILE: src/hangar/dataset/tensorflow_dset.py
function yield_data (line 24) | def yield_data(dataset: HangarDataset, indices: list,
function _make_tensorflow_dataset (line 33) | def _make_tensorflow_dataset(columns: Sequence['Columns'],
FILE: src/hangar/dataset/torch_dset.py
class TorchDataset (line 18) | class TorchDataset(torch.utils.data.Dataset):
method __init__ (line 26) | def __init__(self, hangar_dataset: HangarDataset, as_dict: bool = False):
method __len__ (line 31) | def __len__(self) -> int:
method __getitem__ (line 34) | def __getitem__(self, index: int):
function _make_torch_dataset (line 44) | def _make_torch_dataset(columns: Sequence['Columns'],
FILE: src/hangar/diagnostics/ecosystem.py
function get_versions (line 19) | def get_versions() -> dict:
function get_system_info (line 31) | def get_system_info() -> List[Tuple[str, str]]:
function get_optional_info (line 70) | def get_optional_info() -> Dict[str, Union[str, bool]]:
function get_package_info (line 103) | def get_package_info(pkgs):
FILE: src/hangar/diagnostics/graphing.py
class Column (line 61) | class Column(object): # pylint: disable=too-few-public-methods
method __init__ (line 71) | def __init__(self, commit, color):
class GraphState (line 76) | class GraphState(Enum): # pylint: disable=too-few-public-methods
class Graph (line 85) | class Graph(object): # pragma: no cover
method __init__ (line 164) | def __init__(self,
method show_nodes (line 209) | def show_nodes(self, dag, spec, branch, start, order, stop='',
method _write_column (line 269) | def _write_column(self, col, col_char):
method _update_state (line 276) | def _update_state(self, state):
method _interesting_parents (line 280) | def _interesting_parents(self):
method _get_current_column_color (line 286) | def _get_current_column_color(self):
method _increment_column_color (line 291) | def _increment_column_color(self):
method _find_commit_color (line 295) | def _find_commit_color(self, commit):
method _insert_into_new_columns (line 301) | def _insert_into_new_columns(self, commit, mapping_index):
method _update_width (line 318) | def _update_width(self, is_commit_in_existing_columns):
method _update_columns (line 343) | def _update_columns(self):
method _update (line 416) | def _update(self, commit, parents):
method _is_mapping_correct (line 452) | def _is_mapping_correct(self):
method _pad_horizontally (line 467) | def _pad_horizontally(self, chars_written):
method _output_padding_line (line 479) | def _output_padding_line(self):
method _output_skip_line (line 488) | def _output_skip_line(self):
method _output_pre_commit_line (line 499) | def _output_pre_commit_line(self):
method _draw_octopus_merge (line 555) | def _draw_octopus_merge(self):
method _output_commit_line (line 569) | def _output_commit_line(self): # noqa: C901, E501 pylint: disable=too...
method _find_new_column_by_commit (line 627) | def _find_new_column_by_commit(self, commit):
method _output_post_merge_line (line 633) | def _output_post_merge_line(self):
method _output_collapsing_line (line 679) | def _output_collapsing_line(self): # noqa: C901, E501 pylint: disable...
method _next_line (line 775) | def _next_line(self): # pylint: disable=too-many-return-statements
method _padding_line (line 797) | def _padding_line(self):
method _is_commit_finished (line 827) | def _is_commit_finished(self):
method _show_commit (line 830) | def _show_commit(self):
method _show_padding (line 847) | def _show_padding(self):
method _show_remainder (line 852) | def _show_remainder(self):
FILE: src/hangar/diagnostics/integrity.py
function _verify_column_integrity (line 18) | def _verify_column_integrity(hashenv: lmdb.Environment, repo_path: Path):
function _verify_schema_integrity (line 52) | def _verify_schema_integrity(hashenv: lmdb.Environment):
function _verify_commit_tree_integrity (line 68) | def _verify_commit_tree_integrity(refenv: lmdb.Environment):
function _verify_commit_ref_digests_exist (line 108) | def _verify_commit_ref_digests_exist(hashenv: lmdb.Environment, refenv: ...
function _verify_branch_integrity (line 143) | def _verify_branch_integrity(branchenv: lmdb.Environment, refenv: lmdb.E...
function run_verification (line 166) | def run_verification(branchenv: lmdb.Environment,
FILE: src/hangar/diff.py
class HistoryDiffStruct (line 28) | class HistoryDiffStruct(NamedTuple):
class Changes (line 35) | class Changes(NamedTuple):
class DiffOutDB (line 40) | class DiffOutDB(NamedTuple):
class DiffOut (line 46) | class DiffOut(NamedTuple):
class Conflicts (line 55) | class Conflicts(NamedTuple):
class DiffAndConflictsDB (line 78) | class DiffAndConflictsDB(NamedTuple):
class DiffAndConflicts (line 83) | class DiffAndConflicts(NamedTuple):
function diff_envs (line 91) | def diff_envs(base_env: lmdb.Environment, head_env: lmdb.Environment, ) ...
function _raw_from_db_change (line 164) | def _raw_from_db_change(changes: Set[Tuple[bytes, bytes]]) -> Changes:
function _all_raw_from_db_changes (line 200) | def _all_raw_from_db_changes(outDb: DiffAndConflictsDB) -> DiffAndConfli...
function _symmetric_difference_keys (line 228) | def _symmetric_difference_keys(pair1: Set[Tuple[bytes, bytes]],
function find_conflicts (line 259) | def find_conflicts(master_diff: DiffOutDB, dev_diff: DiffOutDB) -> Confl...
class BaseUserDiff (line 292) | class BaseUserDiff(object):
method __init__ (line 294) | def __init__(self, branchenv: lmdb.Environment, refenv: lmdb.Environme...
method _determine_ancestors (line 299) | def _determine_ancestors(self, mHEAD: str, dHEAD: str) -> HistoryDiffS...
method _diff3 (line 336) | def _diff3(a_env: lmdb.Environment,
method _diff (line 362) | def _diff(a_env: lmdb.Environment, m_env: lmdb.Environment) -> DiffAnd...
class ReaderUserDiff (line 390) | class ReaderUserDiff(BaseUserDiff):
method __init__ (line 425) | def __init__(self, commit_hash, *args, **kwargs):
method _run_diff (line 430) | def _run_diff(self, dev_commit_hash: str) -> DiffAndConflictsDB:
method commit (line 454) | def commit(self, dev_commit_hash: str) -> DiffAndConflicts:
method branch (line 481) | def branch(self, dev_branch: str) -> DiffAndConflicts:
class WriterUserDiff (line 515) | class WriterUserDiff(BaseUserDiff):
method __init__ (line 553) | def __init__(self, stageenv: lmdb.Environment, branch_name: str, *args...
method _run_diff (line 559) | def _run_diff(self, dev_commit_hash: str) -> DiffAndConflictsDB:
method commit (line 583) | def commit(self, dev_commit_hash: str) -> DiffAndConflicts:
method branch (line 610) | def branch(self, dev_branch: str) -> DiffAndConflicts:
method staged (line 640) | def staged(self) -> DiffAndConflicts:
method status (line 655) | def status(self) -> str:
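`diff_envs` compares the base and head environments' (key, value) record pairs, and `_symmetric_difference_keys` isolates keys present on only one side, yielding additions, deletions, and mutations. A stdlib sketch of that set-based diff over plain dicts (the function name and return shape here are illustrative, not Hangar's exact interface):

```python
def diff_records(base: dict, head: dict):
    """Classify changes between two key->value mappings using
    set algebra over (key, value) pairs, mirroring the approach
    listed for diff_envs in src/hangar/diff.py."""
    base_kv, head_kv = set(base.items()), set(head.items())
    changed = base_kv ^ head_kv          # pairs differing on either side
    changed_keys = {k for k, _ in changed}
    additions = {k for k in changed_keys if k not in base}
    deletions = {k for k in changed_keys if k not in head}
    mutations = changed_keys - additions - deletions
    return additions, deletions, mutations
```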
FILE: src/hangar/external/_external.py
function load (line 26) | def load(fpath: str,
function save (line 66) | def save(arr: np.ndarray, outdir: str, sample_det: str, extension: str,
function show (line 111) | def show(arr: np.ndarray, plugin: str = None,
function board_show (line 141) | def board_show(arr: np.ndarray, plugin: str = None,
FILE: src/hangar/external/base_plugin.py
class BasePlugin (line 16) | class BasePlugin(object):
method __init__ (line 30) | def __init__(self, provides, accepts):
method provides (line 39) | def provides(self):
method accepts (line 43) | def accepts(self):
method load (line 46) | def load(self, fpath, *args, **kwargs):
method save (line 79) | def save(self, arr, outdir, sample_detail, extension, *args, **kwargs):
method show (line 111) | def show(self, arr, *args, **kwargs):
method board_show (line 122) | def board_show(self, arr, *args, **kwargs):
method sample_name (line 133) | def sample_name(fpath: os.PathLike) -> str:
FILE: src/hangar/external/plugin_manager.py
class PluginManager (line 5) | class PluginManager(object):
method __init__ (line 13) | def __init__(self):
method reset_plugins (line 18) | def reset_plugins(self):
method _clear_plugins (line 29) | def _clear_plugins(self):
method _scan_plugins (line 37) | def _scan_plugins(self):
method _read_defaults (line 47) | def _read_defaults(self):
method get_plugin (line 58) | def get_plugin(self, method: str, plugin: str = None, extension: str =...
FILE: src/hangar/merger.py
function select_merge_algorithm (line 36) | def select_merge_algorithm(message: str,
function _fast_forward_merge (line 150) | def _fast_forward_merge(branchenv: lmdb.Environment,
function _three_way_merge (line 205) | def _three_way_merge(message: str,
FILE: src/hangar/mixins/checkout_iteration.py
class CheckoutDictIteration (line 3) | class CheckoutDictIteration:
method __len__ (line 16) | def __len__(self):
method __contains__ (line 22) | def __contains__(self, key):
method __iter__ (line 28) | def __iter__(self):
method keys (line 33) | def keys(self):
method values (line 39) | def values(self):
method items (line 45) | def items(self):
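`CheckoutDictIteration` gives checkout objects dict-style iteration over their columns. A minimal sketch of such a mixin, assuming the host object exposes a private mapping (the attribute name `_columns` is hypothetical):

```python
class DictIterationMixin:
    """Dict-style iteration delegated to a _columns mapping
    (attribute name is illustrative, not Hangar's)."""

    def __len__(self):
        return len(self._columns)

    def __contains__(self, key):
        return key in self._columns

    def __iter__(self):
        return iter(self._columns)

    def keys(self):
        return self._columns.keys()

    def values(self):
        return self._columns.values()

    def items(self):
        return self._columns.items()


class Checkout(DictIterationMixin):
    """Toy host class showing how the mixin is consumed."""

    def __init__(self, columns):
        self._columns = columns
```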
FILE: src/hangar/mixins/datasetget.py
class GetMixin (line 6) | class GetMixin:
method __getitem__ (line 13) | def __getitem__(self, index):
method get (line 112) | def get(self, keys, default=None, except_missing=False):
method _get_in (line 152) | def _get_in(self, keys, default=None, except_missing=False,
FILE: src/hangar/mixins/recorditer.py
class CursorRangeIterator (line 5) | class CursorRangeIterator:
method cursor_range_iterator (line 8) | def cursor_range_iterator(datatxn: lmdb.Transaction, startRangeKey: by...
FILE: src/hangar/op_state.py
function writer_checkout_only (line 8) | def writer_checkout_only(wrapped, instance, args, kwargs) -> types.Metho...
function reader_checkout_only (line 54) | def reader_checkout_only(wrapped, instance, args, kwargs) -> types.Metho...
function tb_params_last_called (line 99) | def tb_params_last_called(tb: types.TracebackType) -> dict:
function report_corruption_risk_on_parsing_error (line 130) | def report_corruption_risk_on_parsing_error(func):
FILE: src/hangar/records/commiting.py
function expand_short_commit_digest (line 47) | def expand_short_commit_digest(refenv: lmdb.Environment, commit_hash: st...
function check_commit_hash_in_history (line 91) | def check_commit_hash_in_history(refenv, commit_hash):
function get_commit_spec (line 116) | def get_commit_spec(refenv, commit_hash):
function get_commit_ancestors (line 150) | def get_commit_ancestors(refenv, commit_hash):
function get_commit_ancestors_graph (line 186) | def get_commit_ancestors_graph(refenv, starting_commit):
function get_commit_ref (line 248) | def get_commit_ref(refenv, commit_hash):
function unpack_commit_ref (line 308) | def unpack_commit_ref(refenv, cmtrefenv, commit_hash):
function tmp_cmt_env (line 343) | def tmp_cmt_env(refenv: lmdb.Environment, commit_hash: str):
function _commit_ancestors (line 383) | def _commit_ancestors(branchenv: lmdb.Environment,
function _commit_spec (line 429) | def _commit_spec(message: str, user: str, email: str) -> DigestAndBytes:
function _commit_ref (line 457) | def _commit_ref(stageenv: lmdb.Environment) -> DigestAndBytes:
function commit_records (line 482) | def commit_records(message, branchenv, stageenv, refenv, repo_path: Path,
function replace_staging_area_with_commit (line 563) | def replace_staging_area_with_commit(refenv, stageenv, commit_hash):
function replace_staging_area_with_refs (line 592) | def replace_staging_area_with_refs(stageenv, sorted_content):
function move_process_data_to_store (line 628) | def move_process_data_to_store(repo_path: Path, *, remote_operation: boo...
function list_all_commits (line 672) | def list_all_commits(refenv):
function number_commits_recorded (line 700) | def number_commits_recorded(refenv) -> int:
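`get_commit_ancestors_graph` walks parent references from a starting commit to collect the full ancestor DAG. A sketch of that traversal over a plain commit-to-parents mapping (illustrative only; the real function reads an lmdb `refenv`):

```python
from collections import deque

def ancestors_graph(parents: dict, start: str) -> dict:
    """Breadth-first walk collecting the parent list of every
    commit reachable from `start`; `parents` maps a commit hash
    to its list of parent commit hashes."""
    graph, queue, seen = {}, deque([start]), set()
    while queue:
        cmt = queue.popleft()
        if cmt in seen:
            continue
        seen.add(cmt)
        graph[cmt] = list(parents.get(cmt, []))
        queue.extend(graph[cmt])
    return graph
```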
FILE: src/hangar/records/hashs.py
class HashQuery (line 20) | class HashQuery(CursorRangeIterator):
method __init__ (line 36) | def __init__(self, hashenv: lmdb.Environment):
method _traverse_all_hash_records (line 41) | def _traverse_all_hash_records(self, keys: bool = True, values: bool =...
method _traverse_all_schema_records (line 64) | def _traverse_all_schema_records(self, keys: bool = True, values: bool...
method list_all_hash_keys_raw (line 87) | def list_all_hash_keys_raw(self) -> List[str]:
method gen_all_hash_keys_db (line 91) | def gen_all_hash_keys_db(self) -> Iterable[bytes]:
method intersect_keys_db (line 94) | def intersect_keys_db(self, other: Set[bytes]):
method list_all_schema_digests (line 119) | def list_all_schema_digests(self) -> List[str]:
method gen_all_schema_keys_db (line 123) | def gen_all_schema_keys_db(self) -> Iterable[bytes]:
method num_data_records (line 126) | def num_data_records(self) -> int:
method num_schema_records (line 133) | def num_schema_records(self) -> int:
method gen_all_data_digests_and_parsed_backend_specs (line 138) | def gen_all_data_digests_and_parsed_backend_specs(self):
method gen_all_schema_digests_and_parsed_specs (line 144) | def gen_all_schema_digests_and_parsed_specs(self) -> Iterable[Tuple[st...
method get_schema_digest_spec (line 150) | def get_schema_digest_spec(self, digest) -> dict:
function backends_remove_in_process_data (line 162) | def backends_remove_in_process_data(repo_path: Path, *, remote_operation...
function clear_stage_hash_records (line 183) | def clear_stage_hash_records(stagehashenv):
function remove_stage_hash_records_from_hashenv (line 203) | def remove_stage_hash_records_from_hashenv(hashenv, stagehashenv):
FILE: src/hangar/records/heads.py
class BranchHead (line 28) | class BranchHead(NamedTuple):
function writer_lock_held (line 43) | def writer_lock_held(branchenv):
function acquire_writer_lock (line 73) | def acquire_writer_lock(branchenv, writer_uuid):
function release_writer_lock (line 130) | def release_writer_lock(branchenv, writer_uuid):
function create_branch (line 194) | def create_branch(branchenv, name, base_commit) -> BranchHead:
function remove_branch (line 245) | def remove_branch(branchenv: lmdb.Environment,
function get_staging_branch_head (line 337) | def get_staging_branch_head(branchenv):
function set_staging_branch_head (line 360) | def set_staging_branch_head(branchenv, branch_name):
function get_branch_head_commit (line 407) | def get_branch_head_commit(branchenv, branch_name):
function set_branch_head_commit (line 441) | def set_branch_head_commit(branchenv, branch_name, commit_hash):
function get_branch_names (line 484) | def get_branch_names(branchenv):
function commit_hash_to_branch_name_map (line 519) | def commit_hash_to_branch_name_map(branchenv: lmdb.Environment) -> dict:
function add_remote (line 545) | def add_remote(branchenv: lmdb.Environment, name: str, address: str) -> ...
function get_remote_address (line 578) | def get_remote_address(branchenv: lmdb.Environment, name: str) -> str:
function remove_remote (line 613) | def remove_remote(branchenv: lmdb.Environment, name: str) -> str:
function get_remote_names (line 648) | def get_remote_names(branchenv):
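`acquire_writer_lock` / `release_writer_lock` implement a single-writer lock keyed by a UUID stored in the branch environment: re-acquiring with the same UUID succeeds, and release succeeds only for the current holder or a force-release sentinel. A dict-backed sketch of that protocol (the key, sentinel value, and boolean returns are all illustrative, not Hangar's actual values):

```python
FORCE_RELEASE = '__FORCE__'   # illustrative sentinel, not Hangar's value
LOCK_KEY = 'writer_lock'      # illustrative db key

def acquire_writer_lock(db: dict, writer_uuid: str) -> bool:
    """Take the lock if free, or re-enter if already the holder."""
    holder = db.get(LOCK_KEY)
    if holder is None or holder == writer_uuid:
        db[LOCK_KEY] = writer_uuid
        return True
    return False  # a different writer holds the lock

def release_writer_lock(db: dict, writer_uuid: str) -> bool:
    """Release only for the holder or the force-release sentinel."""
    holder = db.get(LOCK_KEY)
    if holder is None:
        return True
    if writer_uuid in (holder, FORCE_RELEASE):
        del db[LOCK_KEY]
        return True
    return False
```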
FILE: src/hangar/records/parsing.py
function generate_sample_name (line 33) | def generate_sample_name() -> str:
function repo_version_raw_spec_from_raw_string (line 52) | def repo_version_raw_spec_from_raw_string(v_str: str) -> Version:
function repo_version_db_key (line 61) | def repo_version_db_key() -> bytes:
function repo_version_db_val_from_raw_val (line 75) | def repo_version_db_val_from_raw_val(v_spec: Version) -> bytes:
function repo_version_raw_val_from_db_val (line 96) | def repo_version_raw_val_from_db_val(db_val: bytes) -> Version:
function repo_head_db_key (line 123) | def repo_head_db_key() -> bytes:
function repo_head_db_val_from_raw_val (line 137) | def repo_head_db_val_from_raw_val(branch_name: str) -> bytes:
function repo_head_raw_val_from_db_val (line 143) | def repo_head_raw_val_from_db_val(db_val: bytes) -> str:
function repo_branch_head_db_key_from_raw_key (line 155) | def repo_branch_head_db_key_from_raw_key(branch_name: str) -> bytes:
function repo_branch_head_db_val_from_raw_val (line 159) | def repo_branch_head_db_val_from_raw_val(commit_hash: str) -> bytes:
function repo_branch_head_raw_key_from_db_key (line 166) | def repo_branch_head_raw_key_from_db_key(db_key: bytes) -> str:
function repo_branch_head_raw_val_from_db_val (line 170) | def repo_branch_head_raw_val_from_db_val(db_val: bytes) -> str:
function repo_writer_lock_db_key (line 186) | def repo_writer_lock_db_key() -> bytes:
function repo_writer_lock_sentinal_db_val (line 190) | def repo_writer_lock_sentinal_db_val() -> bytes:
function repo_writer_lock_force_release_sentinal (line 194) | def repo_writer_lock_force_release_sentinal() -> str:
function repo_writer_lock_db_val_from_raw_val (line 201) | def repo_writer_lock_db_val_from_raw_val(lock_uuid: str) -> bytes:
function repo_writer_lock_raw_val_from_db_val (line 208) | def repo_writer_lock_raw_val_from_db_val(db_val: bytes) -> str:
function remote_db_key_from_raw_key (line 215) | def remote_db_key_from_raw_key(remote_name: str) -> bytes:
function remote_raw_key_from_db_key (line 231) | def remote_raw_key_from_db_key(db_key: bytes, *, _SPLT=len(K_REMOTES)) -...
function remote_db_val_from_raw_val (line 247) | def remote_db_val_from_raw_val(grpc_address: str) -> bytes:
function remote_raw_val_from_db_val (line 263) | def remote_raw_val_from_db_val(db_val: bytes) -> str:
class CommitAncestorSpec (line 288) | class CommitAncestorSpec(NamedTuple):
class CommitUserSpec (line 294) | class CommitUserSpec(NamedTuple):
class DigestAndUserSpec (line 301) | class DigestAndUserSpec(NamedTuple):
class DigestAndAncestorSpec (line 306) | class DigestAndAncestorSpec(NamedTuple):
class DigestAndBytes (line 311) | class DigestAndBytes(NamedTuple):
class DigestAndDbRefs (line 316) | class DigestAndDbRefs(NamedTuple):
function _hash_func (line 321) | def _hash_func(recs: bytes) -> str:
function cmt_final_digest (line 337) | def cmt_final_digest(parent_digest: str, spec_digest: str, refs_digest: ...
function commit_parent_db_key_from_raw_key (line 378) | def commit_parent_db_key_from_raw_key(commit_hash: str) -> bytes:
function commit_parent_db_val_from_raw_val (line 382) | def commit_parent_db_val_from_raw_val(master_ancestor: str,
function commit_parent_raw_key_from_db_key (line 397) | def commit_parent_raw_key_from_db_key(db_key: bytes) -> str:
function commit_parent_raw_val_from_db_val (line 401) | def commit_parent_raw_val_from_db_val(db_val: bytes) -> DigestAndAncesto...
function commit_ref_db_key_from_raw_key (line 439) | def commit_ref_db_key_from_raw_key(commit_hash: str) -> bytes:
function _commit_ref_joined_kv_digest (line 443) | def _commit_ref_joined_kv_digest(joined_db_kvs: Iterable[bytes]) -> str:
function commit_ref_db_val_from_raw_val (line 471) | def commit_ref_db_val_from_raw_val(db_kvs: Iterable[Tuple[bytes, bytes]]...
function commit_ref_raw_val_from_db_val (line 492) | def commit_ref_raw_val_from_db_val(commit_db_val: bytes) -> DigestAndDbR...
function commit_spec_db_key_from_raw_key (line 528) | def commit_spec_db_key_from_raw_key(commit_hash: str) -> bytes:
function commit_spec_db_val_from_raw_val (line 532) | def commit_spec_db_val_from_raw_val(commit_time: float, commit_message: ...
function commit_spec_raw_val_from_db_val (line 568) | def commit_spec_raw_val_from_db_val(db_val: bytes) -> DigestAndUserSpec:
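`cmt_final_digest` combines the parent-record, spec, and refs digests into the final commit hash via `_hash_func`. A hedged sketch of that chaining with `hashlib.blake2b` (the digest size, join order, and separator here are assumptions, not Hangar's exact scheme):

```python
import hashlib

def final_commit_digest(parent_digest: str, spec_digest: str,
                        refs_digest: str) -> str:
    """Derive one commit digest from three component digests.
    Illustrative only: the real cmt_final_digest in
    src/hangar/records/parsing.py may encode/order differently."""
    joined = '|'.join((parent_digest, spec_digest, refs_digest)).encode()
    return hashlib.blake2b(joined, digest_size=20).hexdigest()
```

Chaining component digests this way means any change to the commit's parents, metadata, or referenced records changes the final hash, which is what makes the commit history tamper-evident.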
FILE: src/hangar/records/queries.py
class RecordQuery (line 26) | class RecordQuery(CursorRangeIterator):
method __init__ (line 28) | def __init__(self, dataenv: lmdb.Environment):
method _traverse_all_records (line 33) | def _traverse_all_records(self) -> Iterator[Tuple[bytes, bytes]]:
method _traverse_column_schema_records (line 50) | def _traverse_column_schema_records(self, keys: bool = True, values: b...
method _traverse_column_data_records (line 75) | def _traverse_column_data_records(self,
method column_names (line 124) | def column_names(self) -> List[str]:
method column_count (line 136) | def column_count(self) -> int:
method data_hashes (line 146) | def data_hashes(self) -> List[str]:
method column_data_records (line 167) | def column_data_records(self, column_name: str) -> Iterable[RawDataTup...
method column_data_hashes (line 185) | def column_data_hashes(self, column_name: str) -> Set[DataRecordVal]:
method column_data_count (line 204) | def column_data_count(self, column_name: str) -> int:
method schema_specs (line 222) | def schema_specs(self):
method schema_hashes (line 237) | def schema_hashes(self) -> List[str]:
method data_hash_to_schema_hash (line 251) | def data_hash_to_schema_hash(self) -> Dict[str, str]:
method column_schema_layout (line 270) | def column_schema_layout(self, column: str) -> str:
FILE: src/hangar/records/summarize.py
function log (line 25) | def log(branchenv: lmdb.Environment,
function list_history (line 90) | def list_history(refenv, branchenv, branch_name=None, commit_hash=None):
function details (line 145) | def details(env: lmdb.Environment, line_limit=100, line_length=100) -> S...
function summary (line 192) | def summary(env, *, branch='', commit='') -> StringIO:
function status (line 270) | def status(hashenv: lmdb.Environment, branch_name: str, diff: DiffOut) -...
FILE: src/hangar/records/vcompat.py
function set_repository_software_version (line 17) | def set_repository_software_version(branchenv: lmdb.Environment,
function get_repository_software_version_spec (line 50) | def get_repository_software_version_spec(branchenv: lmdb.Environment) ->...
function startup_check_repo_version (line 90) | def startup_check_repo_version(repo_path: Path) -> Version:
function is_repo_software_version_compatible (line 133) | def is_repo_software_version_compatible(repo_v: Version, curr_v: Version...
FILE: src/hangar/remote/chunks.py
function chunk_bytes (line 15) | def chunk_bytes(bytesData, *, chunkSize: int = 32_000) -> Iterable[bytes]:
function clientCommitChunkedIterator (line 39) | def clientCommitChunkedIterator(commit: str, parentVal: bytes, specVal: ...
function tensorChunkedIterator (line 73) | def tensorChunkedIterator(buf, uncomp_nbytes, pb2_request,
function missingHashIterator (line 91) | def missingHashIterator(commit, hash_bytes, err, pb2_func):
function missingHashRequestIterator (line 106) | def missingHashRequestIterator(commit, hash_bytes, pb2_func):
class DataIdent (line 123) | class DataIdent(NamedTuple):
class DataRecord (line 128) | class DataRecord(NamedTuple):
function _serialize_arr (line 134) | def _serialize_arr(arr: np.ndarray) -> bytes:
function _deserialize_arr (line 144) | def _deserialize_arr(raw: bytes) -> np.ndarray:
function _serialize_str (line 151) | def _serialize_str(data: str) -> bytes:
function _deserialize_str (line 158) | def _deserialize_str(raw: bytes) -> str:
function _serialize_bytes (line 162) | def _serialize_bytes(data: bytes) -> bytes:
function _deserialize_bytes (line 169) | def _deserialize_bytes(data: bytes) -> bytes:
function serialize_ident (line 173) | def serialize_ident(digest: str, schema: str) -> bytes:
function deserialize_ident (line 184) | def deserialize_ident(raw: bytes) -> DataIdent:
function serialize_data (line 192) | def serialize_data(data: Union[np.ndarray, str, bytes]) -> Tuple[int, by...
function deserialize_data (line 203) | def deserialize_data(dtype_code: int, raw_data: bytes) -> Union[np.ndarr...
function serialize_record (line 214) | def serialize_record(data: Union[np.ndarray, str, bytes], digest: str, s...
function deserialize_record (line 227) | def deserialize_record(raw: bytes) -> DataRecord:
function serialize_record_pack (line 237) | def serialize_record_pack(records: List[bytes]) -> bytes:
function deserialize_record_pack (line 246) | def deserialize_record_pack(raw: bytes) -> List[bytes]:
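`chunk_bytes(bytesData, *, chunkSize=32_000)` above yields fixed-size slices of a payload so records can be streamed over gRPC in bounded messages. A stdlib-only sketch of that slicing (the 32,000-byte default comes straight from the listed signature; the final chunk may be shorter):

```python
from typing import Iterable

def chunk_bytes(data: bytes, *, chunk_size: int = 32_000) -> Iterable[bytes]:
    """Yield consecutive chunk_size slices of data; the last
    chunk may be shorter. Illustrative sketch of the streaming
    helper listed in src/hangar/remote/chunks.py."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]
```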
FILE: src/hangar/remote/client.py
class HangarClient (line 33) | class HangarClient(object):
method __init__ (line 56) | def __init__(self,
method _setup_client_channel_config (line 88) | def _setup_client_channel_config(self):
method close (line 142) | def close(self):
method ping_pong (line 149) | def ping_pong(self) -> str:
method push_branch_record (line 161) | def push_branch_record(self, name: str, head: str
method fetch_branch_record (line 182) | def fetch_branch_record(self, name: str
method push_commit_record (line 202) | def push_commit_record(self, commit: str, parentVal: bytes, specVal: b...
method fetch_commit_record (line 230) | def fetch_commit_record(self, commit: str) -> Tuple[str, bytes, bytes,...
method fetch_schema (line 260) | def fetch_schema(self, schema_hash: str) -> Tuple[str, bytes]:
method push_schema (line 283) | def push_schema(self, schema_hash: str,
method fetch_data (line 305) | def fetch_data(
method fetch_data_origin (line 391) | def fetch_data_origin(self, digests: Sequence[str]) -> List[hangar_ser...
method push_find_data_origin (line 405) | def push_find_data_origin(self, digests):
method push_data_begin_context (line 419) | def push_data_begin_context(self):
method push_data_end_context (line 424) | def push_data_end_context(self):
method push_data (line 429) | def push_data(self, schema_hash: str, digests: Sequence[str],
method fetch_find_missing_commits (line 546) | def fetch_find_missing_commits(self, branch_name):
method push_find_missing_commits (line 556) | def push_find_missing_commits(self, branch_name):
method fetch_find_missing_hash_records (line 570) | def fetch_find_missing_hash_records(self, commit):
method push_find_missing_hash_records (line 590) | def push_find_missing_hash_records(self, commit, tmpDB: lmdb.Environme...
method fetch_find_missing_schemas (line 623) | def fetch_find_missing_schemas(self, commit):
method push_find_missing_schemas (line 634) | def push_find_missing_schemas(self, commit, tmpDB: lmdb.Environment = ...
FILE: src/hangar/remote/content.py
class ContentWriter (line 16) | class ContentWriter(object):
method __init__ (line 29) | def __init__(self, envs: Environments):
method commit (line 34) | def commit(self, commit: str, parentVal: bytes, specVal: bytes,
method schema (line 71) | def schema(self, schema_hash: str, schemaVal: bytes) -> Union[str, bool]:
class DataWriter (line 99) | class DataWriter:
method __init__ (line 101) | def __init__(self, envs):
method __enter__ (line 110) | def __enter__(self):
method __exit__ (line 115) | def __exit__(self, *exc):
method is_cm (line 125) | def is_cm(self):
method _open_new_backend (line 128) | def _open_new_backend(self, schema):
method _get_schema_object (line 136) | def _get_schema_object(self, schema_hash):
method _get_changed_schema_object (line 149) | def _get_changed_schema_object(self, schema_hash, backend, backend_opt...
method data (line 160) | def data(self,
class ContentReader (line 212) | class ContentReader(object):
method __init__ (line 224) | def __init__(self, envs):
method commit (line 229) | def commit(self, commit: str) -> Union[RawCommitContent, bool]:
method schema (line 264) | def schema(self, schema_hash: str) -> Union[bytes, bool]:
FILE: src/hangar/remote/hangar_service_pb2.pyi
class _DataLocation (line 47) | class _DataLocation(google___protobuf___internal___enum_type_wrapper____...
class _DataType (line 64) | class _DataType(google___protobuf___internal___enum_type_wrapper____Enum...
class PushBeginContextRequest (line 76) | class PushBeginContextRequest(google___protobuf___message___Message):
method __init__ (line 80) | def __init__(self,
method ClearField (line 84) | def ClearField(self, field_name: typing_extensions___Literal[u"client_...
class PushBeginContextReply (line 87) | class PushBeginContextReply(google___protobuf___message___Message):
method err (line 91) | def err(self) -> type___ErrorProto: ...
method __init__ (line 93) | def __init__(self,
method HasField (line 97) | def HasField(self, field_name: typing_extensions___Literal[u"err",b"er...
method ClearField (line 98) | def ClearField(self, field_name: typing_extensions___Literal[u"err",b"...
class PushEndContextRequest (line 101) | class PushEndContextRequest(google___protobuf___message___Message):
method __init__ (line 105) | def __init__(self,
method ClearField (line 109) | def ClearField(self, field_name: typing_extensions___Literal[u"client_...
class PushEndContextReply (line 112) | class PushEndContextReply(google___protobuf___message___Message):
method err (line 116) | def err(self) -> type___ErrorProto: ...
method __init__ (line 118) | def __init__(self,
method HasField (line 122) | def HasField(self, field_name: typing_extensions___Literal[u"err",b"er...
method ClearField (line 123) | def ClearField(self, field_name: typing_extensions___Literal[u"err",b"...
class ErrorProto (line 126) | class ErrorProto(google___protobuf___message___Message):
method __init__ (line 131) | def __init__(self,
method ClearField (line 136) | def ClearField(self, field_name: typing_extensions___Literal[u"code",b...
class BranchRecord (line 139) | class BranchRecord(google___protobuf___message___Message):
method __init__ (line 144) | def __init__(self,
method ClearField (line 149) | def ClearField(self, field_name: typing_extensions___Literal[u"commit"...
class HashRecord (line 152) | class HashRecord(google___protobuf___message___Message):
method __init__ (line 157) | def __init__(self,
method ClearField (line 162) | def ClearField(self, field_name: typing_extensions___Literal[u"digest"...
class CommitRecord (line 165) | class CommitRecord(google___protobuf___message___Message):
method __init__ (line 171) | def __init__(self,
method ClearField (line 177) | def ClearField(self, field_name: typing_extensions___Literal[u"parent"...
class SchemaRecord (line 180) | class SchemaRecord(google___protobuf___message___Message):
method __init__ (line 185) | def __init__(self,
method ClearField (line 190) | def ClearField(self, field_name: typing_extensions___Literal[u"blob",b...
class DataOriginRequest (line 193) | class DataOriginRequest(google___protobuf___message___Message):
method __init__ (line 197) | def __init__(self,
method ClearField (line 201) | def ClearField(self, field_name: typing_extensions___Literal[u"digest"...
class DataOriginReply (line 204) | class DataOriginReply(google___protobuf___message___Message):
class CompressionOptsEntry (line 206) | class CompressionOptsEntry(google___protobuf___message___Message):
method __init__ (line 211) | def __init__(self,
method ClearField (line 216) | def ClearField(self, field_name: typing_extensions___Literal[u"key",...
method compression_opts (line 226) | def compression_opts(self) -> typing___MutableMapping[typing___Text, t...
method __init__ (line 228) | def __init__(self,
method ClearField (line 237) | def ClearField(self, field_name: typing_extensions___Literal[u"compres...
class PushFindDataOriginRequest (line 240) | class PushFindDataOriginRequest(google___protobuf___message___Message):
method __init__ (line 246) | def __init__(self,
method ClearField (line 252) | def ClearField(self, field_name: typing_extensions___Literal[u"compres...
class PushFindDataOriginReply (line 255) | class PushFindDataOriginReply(google___protobuf___message___Message):
class CompressionOptsExpectedEntry (line 257) | class CompressionOptsExpectedEntry(google___protobuf___message___Messa...
method __init__ (line 262) | def __init__(self,
method ClearField (line 267) | def ClearField(self, field_name: typing_extensions___Literal[u"key",...
method compression_opts_expected (line 276) | def compression_opts_expected(self) -> typing___MutableMapping[typing_...
method __init__ (line 278) | def __init__(self,
method ClearField (line 286) | def ClearField(self, field_name: typing_extensions___Literal[u"compres...
class PingRequest (line 289) | class PingRequest(google___protobuf___message___Message):
method __init__ (line 292) | def __init__(self,
class PingReply (line 296) | class PingReply(google___protobuf___message___Message):
method __init__ (line 300) | def __init__(self,
method ClearField (line 304) | def ClearField(self, field_name: typing_extensions___Literal[u"result"...
class GetClientConfigRequest (line 307) | class GetClientConfigRequest(google___protobuf___message___Message):
method __init__ (line 310) | def __init__(self,
class GetClientConfigReply (line 314) | class GetClientConfigReply(google___protobuf___message___Message):
class ConfigEntry (line 316) | class ConfigEntry(google___protobuf___message___Message):
method __init__ (line 321) | def __init__(self,
method ClearField (line 326) | def ClearField(self, field_name: typing_extensions___Literal[u"key",...
method config (line 331) | def config(self) -> typing___MutableMapping[typing___Text, typing___Te...
method error (line 334) | def error(self) -> type___ErrorProto: ...
method __init__ (line 336) | def __init__(self,
method HasField (line 341) | def HasField(self, field_name: typing_extensions___Literal[u"error",b"...
method ClearField (line 342) | def ClearField(self, field_name: typing_extensions___Literal[u"config"...
class FetchBranchRecordRequest (line 345) | class FetchBranchRecordRequest(google___protobuf___message___Message):
method rec (line 349) | def rec(self) -> type___BranchRecord: ...
method __init__ (line 351) | def __init__(self,
method HasField (line 355) | def HasField(self, field_name: typing_extensions___Literal[u"rec",b"re...
method ClearField (line 356) | def ClearField(self, field_name: typing_extensions___Literal[u"rec",b"...
class FetchBranchRecordReply (line 359) | class FetchBranchRecordReply(google___protobuf___message___Message):
method rec (line 363) | def rec(self) -> type___BranchRecord: ...
method error (line 366) | def error(self) -> type___ErrorProto: ...
method __init__ (line 368) | def __init__(self,
method HasField (line 373) | def HasField(self, field_name: typing_extensions___Literal[u"error",b"...
method ClearField (line 374) | def ClearField(self, field_name: typing_extensions___Literal[u"error",...
class FetchDataRequest (line 377) | class FetchDataRequest(google___protobuf___message___Message):
method __init__ (line 381) | def __init__(self,
method ClearField (line 385) | def ClearField(self, field_name: typing_extensions___Literal[u"uri",b"...
class FetchDataReply (line 388) | class FetchDataReply(google___protobuf___message___Message):
method error (line 395) | def error(self) -> type___ErrorProto: ...
method __init__ (line 397) | def __init__(self,
method HasField (line 404) | def HasField(self, field_name: typing_extensions___Literal[u"error",b"...
method ClearField (line 405) | def ClearField(self, field_name: typing_extensions___Literal[u"error",...
class FetchCommitRequest (line 408) | class FetchCommitRequest(google___protobuf___message___Message):
method __init__ (line 412) | def __init__(self,
method ClearField (line 416) | def ClearField(self, field_name: typing_extensions___Literal[u"commit"...
class FetchCommitReply (line 419) | class FetchCommitReply(google___protobuf___message___Message):
method record (line 425) | def record(self) -> type___CommitRecord: ...
method error (line 428) | def error(self) -> type___ErrorProto: ...
method __init__ (line 430) | def __init__(self,
method HasField (line 437) | def HasField(self, field_name: typing_extensions___Literal[u"error",b"...
method ClearField (line 438) | def ClearField(self, field_name: typing_extensions___Literal[u"commit"...
class FetchSchemaRequest (line 441) | class FetchSchemaRequest(google___protobuf___message___Message):
method rec (line 445) | def rec(self) -> type___SchemaRecord: ...
method __init__ (line 447) | def __init__(self,
method HasField (line 451) | def HasField(self, field_name: typing_extensions___Literal[u"rec",b"re...
method ClearField (line 452) | def ClearField(self, field_name: typing_extensions___Literal[u"rec",b"...
class FetchSchemaReply (line 455) | class FetchSchemaReply(google___protobuf___message___Message):
method rec (line 459) | def rec(self) -> type___SchemaRecord: ...
method error (line 462) | def error(self) -> type___ErrorProto: ...
method __init__ (line 464) | def __init__(self,
method HasField (line 469) | def HasField(self, field_name: typing_extensions___Literal[u"error",b"...
method ClearField (line 470) | def ClearField(self, field_name: typing_extensions___Literal[u"error",...
class PushBranchRecordRequest (line 473) | class PushBranchRecordRequest(google___protobuf___message___Message):
method rec (line 477) | def rec(self) -> type___BranchRecord: ...
method __init__ (line 479) | def __init__(self,
method HasField (line 483) | def HasField(self, field_name: typing_extensions___Literal[u"rec",b"re...
method ClearField (line 484) | def ClearField(self, field_name: typing_extensions___Literal[u"rec",b"...
class PushBranchRecordReply (line 487) | class PushBranchRecordReply(google___protobuf___message___Message):
method error (line 491) | def error(self) -> type___ErrorProto: ...
method __init__ (line 493) | def __init__(self,
method HasField (line 497) | def HasField(self, field_name: typing_extensions___Literal[u"error",b"...
method ClearField (line 498) | def ClearField(self, field_name: typing_extensions___Literal[u"error",...
class PushDataRequest (line 501) | class PushDataRequest(google___protobuf___message___Message):
method __init__ (line 509) | def __init__(self,
method ClearField (line 517) | def ClearField(self, field_name: typing_extensions___Literal[u"data_ty...
class PushDataReply (line 520) | class PushDataReply(google___protobuf___message___Message):
method error (line 524) | def error(self) -> type___ErrorProto: ...
method __init__ (line 526) | def __init__(self,
method HasField (line 530) | def HasField(self, field_name: typing_extensions___Literal[u"error",b"...
method ClearField (line 531) | def ClearField(self, field_name: typing_extensions___Literal[u"error",...
class PushCommitRequest (line 534) | class PushCommitRequest(google___protobuf___message___Message):
method record (line 540) | def record(self) -> type___CommitRecord: ...
method __init__ (line 542) | def __init__(self,
method HasField (line 548) | def HasField(self, field_name: typing_extensions___Literal[u"record",b...
method ClearField (line 549) | def ClearField(self, field_name: typing_extensions___Literal[u"commit"...
class PushCommitReply (line 552) | class PushCommitReply(google___protobuf___message___Message):
method error (line 556) | def error(self) -> type___ErrorProto: ...
method __init__ (line 558) | def __init__(self,
method HasField (line 562) | def HasField(self, field_name: typing_extensions___Literal[u"error",b"...
method ClearField (line 563) | def ClearField(self, field_name: typing_extensions___Literal[u"error",...
class PushSchemaRequest (line 566) | class PushSchemaRequest(google___protobuf___message___Message):
method rec (line 570) | def rec(self) -> type___SchemaRecord: ...
method __init__ (line 572) | def __init__(self,
method HasField (line 576) | def HasField(self, field_name: typing_extensions___Literal[u"rec",b"re...
method ClearField (line 577) | def ClearField(self, field_name: typing_extensions___Literal[u"rec",b"...
class PushSchemaReply (line 580) | class PushSchemaReply(google___protobuf___message___Message):
method error (line 584) | def error(self) -> type___ErrorProto: ...
method __init__ (line 586) | def __init__(self,
method HasField (line 590) | def HasField(self, field_name: typing_extensions___Literal[u"error",b"...
method ClearField (line 591) | def ClearField(self, field_name: typing_extensions___Literal[u"error",...
class FindMissingCommitsRequest (line 594) | class FindMissingCommitsRequest(google___protobuf___message___Message):
method branch (line 599) | def branch(self) -> type___BranchRecord: ...
method __init__ (line 601) | def __init__(self,
method HasField (line 606) | def HasField(self, field_name: typing_extensions___Literal[u"branch",b...
method ClearField (line 607) | def ClearField(self, field_name: typing_extensions___Literal[u"branch"...
class FindMissingCommitsReply (line 610) | class FindMissingCommitsReply(google___protobuf___message___Message):
method branch (line 615) | def branch(self) -> type___BranchRecord: ...
method error (line 618) | def error(self) -> type___ErrorProto: ...
method __init__ (line 620) | def __init__(self,
method HasField (line 626) | def HasField(self, field_name: typing_extensions___Literal[u"branch",b...
method ClearField (line 627) | def ClearField(self, field_name: typing_extensions___Literal[u"branch"...
class FindMissingHashRecordsRequest (line 630) | class FindMissingHashRecordsRequest(google___protobuf___message___Message):
method __init__ (line 636) | def __init__(self,
method ClearField (line 642) | def ClearField(self, field_name: typing_extensions___Literal[u"commit"...
class FindMissingHashRecordsReply (line 645) | class FindMissingHashRecordsReply(google___protobuf___message___Message):
method error (line 652) | def error(self) -> type___ErrorProto: ...
method __init__ (line 654) | def __init__(self,
method HasField (line 661) | def HasField(self, field_name: typing_extensions___Literal[u"error",b"...
method ClearField (line 662) | def ClearField(self, field_name: typing_extensions___Literal[u"commit"...
class FindMissingSchemasRequest (line 665) | class FindMissingSchemasRequest(google___protobuf___message___Message):
method __init__ (line 670) | def __init__(self,
method ClearField (line 675) | def ClearField(self, field_name: typing_extensions___Literal[u"commit"...
class FindMissingSchemasReply (line 678) | class FindMissingSchemasReply(google___protobuf___message___Message):
method error (line 684) | def error(self) -> type___ErrorProto: ...
method __init__ (line 686) | def __init__(self,
method HasField (line 692) | def HasField(self, field_name: typing_extensions___Literal[u"error",b"...
method ClearField (line 693) | def ClearField(self, field_name: typing_extensions___Literal[u"commit"...
FILE: src/hangar/remote/hangar_service_pb2_grpc.py
class HangarServiceStub (line 7) | class HangarServiceStub(object):
method __init__ (line 10) | def __init__(self, channel):
class HangarServiceServicer (line 118) | class HangarServiceServicer(object):
method PING (line 121) | def PING(self, request, context):
method GetClientConfig (line 127) | def GetClientConfig(self, request, context):
method FetchBranchRecord (line 133) | def FetchBranchRecord(self, request, context):
method FetchData (line 139) | def FetchData(self, request, context):
method FetchCommit (line 145) | def FetchCommit(self, request, context):
method FetchSchema (line 151) | def FetchSchema(self, request, context):
method PushBranchRecord (line 157) | def PushBranchRecord(self, request, context):
method PushData (line 163) | def PushData(self, request_iterator, context):
method PushCommit (line 169) | def PushCommit(self, request_iterator, context):
method PushSchema (line 175) | def PushSchema(self, request, context):
method FetchFindMissingCommits (line 181) | def FetchFindMissingCommits(self, request, context):
method FetchFindMissingHashRecords (line 187) | def FetchFindMissingHashRecords(self, request_iterator, context):
method FetchFindMissingSchemas (line 193) | def FetchFindMissingSchemas(self, request, context):
method PushFindMissingCommits (line 199) | def PushFindMissingCommits(self, request, context):
method PushFindMissingHashRecords (line 205) | def PushFindMissingHashRecords(self, request_iterator, context):
method PushFindMissingSchemas (line 211) | def PushFindMissingSchemas(self, request, context):
method FetchFindDataOrigin (line 217) | def FetchFindDataOrigin(self, request_iterator, context):
method PushFindDataOrigin (line 223) | def PushFindDataOrigin(self, request_iterator, context):
method PushBeginContext (line 229) | def PushBeginContext(self, request, context):
method PushEndContext (line 235) | def PushEndContext(self, request, context):
function add_HangarServiceServicer_to_server (line 242) | def add_HangarServiceServicer_to_server(servicer, server):
class HangarService (line 351) | class HangarService(object):
method PING (line 355) | def PING(request,
method GetClientConfig (line 371) | def GetClientConfig(request,
method FetchBranchRecord (line 387) | def FetchBranchRecord(request,
method FetchData (line 403) | def FetchData(request,
method FetchCommit (line 419) | def FetchCommit(request,
method FetchSchema (line 435) | def FetchSchema(request,
method PushBranchRecord (line 451) | def PushBranchRecord(request,
method PushData (line 467) | def PushData(request_iterator,
method PushCommit (line 483) | def PushCommit(request_iterator,
method PushSchema (line 499) | def PushSchema(request,
method FetchFindMissingCommits (line 515) | def FetchFindMissingCommits(request,
method FetchFindMissingHashRecords (line 531) | def FetchFindMissingHashRecords(request_iterator,
method FetchFindMissingSchemas (line 547) | def FetchFindMissingSchemas(request,
method PushFindMissingCommits (line 563) | def PushFindMissingCommits(request,
method PushFindMissingHashRecords (line 579) | def PushFindMissingHashRecords(request_iterator,
method PushFindMissingSchemas (line 595) | def PushFindMissingSchemas(request,
method FetchFindDataOrigin (line 611) | def FetchFindDataOrigin(request_iterator,
method PushFindDataOrigin (line 627) | def PushFindDataOrigin(request_iterator,
method PushBeginContext (line 643) | def PushBeginContext(request,
method PushEndContext (line 659) | def PushEndContext(request,
FILE: src/hangar/remote/header_manipulator_client_interceptor.py
class _GenericClientInterceptor (line 30) | class _GenericClientInterceptor(
method __init__ (line 35) | def __init__(self, interceptor_function):
method intercept_unary_unary (line 38) | def intercept_unary_unary(self, continuation, client_call_details, req...
method intercept_unary_stream (line 44) | def intercept_unary_stream(self, continuation, client_call_details,
method intercept_stream_unary (line 51) | def intercept_stream_unary(self, continuation, client_call_details,
method intercept_stream_stream (line 58) | def intercept_stream_stream(self, continuation, client_call_details,
function create_client_interceptor (line 66) | def create_client_interceptor(intercept_call):
class _ClientCallDetails (line 70) | class _ClientCallDetails(
function header_adder_interceptor (line 78) | def header_adder_interceptor(header, value):
FILE: src/hangar/remote/request_header_validator_interceptor.py
function _unary_unary_rpc_terminator (line 54) | def _unary_unary_rpc_terminator(code, details):
function _unary_stream_rpc_terminator (line 60) | def _unary_stream_rpc_terminator(code, details): # pragma: no cover
function _stream_unary_rpc_terminator (line 66) | def _stream_unary_rpc_terminator(code, details): # pragma: no cover
function _stream_stream_rpc_terminator (line 72) | def _stream_stream_rpc_terminator(code, details): # pragma: no cover
function _select_rpc_terminator (line 78) | def _select_rpc_terminator(intercepted_method):
class RequestHeaderValidatorInterceptor (line 93) | class RequestHeaderValidatorInterceptor(grpc.ServerInterceptor):
method __init__ (line 95) | def __init__(self, push_restricted, header, value, code, details):
method intercept_service (line 102) | def intercept_service(self, continuation, handler_call_details):
FILE: src/hangar/remote/server.py
function server_config (line 45) | def server_config(server_dir, *, create: bool = True) -> configparser.Co...
function context_abort_with_exception_traceback (line 65) | def context_abort_with_exception_traceback(
function context_abort_with_handled_error (line 77) | def context_abort_with_handled_error(
class HangarServer (line 85) | class HangarServer(hangar_service_pb2_grpc.HangarServiceServicer):
method __init__ (line 87) | def __init__(self, repo_path: Union[str, bytes, Path], overwrite=False):
method close (line 125) | def close(self):
method PING (line 132) | def PING(self, request, context):
method GetClientConfig (line 138) | def GetClientConfig(self, request, context):
method FetchBranchRecord (line 155) | def FetchBranchRecord(self, request, context):
method PushBranchRecord (line 171) | def PushBranchRecord(self, request, context):
method FetchCommit (line 196) | def FetchCommit(self, request, context):
method PushCommit (line 232) | def PushCommit(self, request_iterator, context):
method FetchSchema (line 262) | def FetchSchema(self, request, context):
method PushSchema (line 287) | def PushSchema(self, request, context):
method FetchFindDataOrigin (line 310) | def FetchFindDataOrigin(self, request_iterator, context):
method FetchData (line 354) | def FetchData(self, request, context):
method PushFindDataOrigin (line 408) | def PushFindDataOrigin(
method PushBeginContext (line 458) | def PushBeginContext(self, request, context):
method PushEndContext (line 472) | def PushEndContext(self, request, context):
method PushData (line 486) | def PushData(
method FetchFindMissingCommits (line 544) | def FetchFindMissingCommits(self, request, context):
method PushFindMissingCommits (line 578) | def PushFindMissingCommits(self, request, context):
method FetchFindMissingHashRecords (line 601) | def FetchFindMissingHashRecords(self, request_iterator, context):
method PushFindMissingHashRecords (line 632) | def PushFindMissingHashRecords(self, request_iterator, context):
method FetchFindMissingSchemas (line 656) | def FetchFindMissingSchemas(self, request, context):
method PushFindMissingSchemas (line 675) | def PushFindMissingSchemas(self, request, context):
function serve (line 689) | def serve(hangar_path: str,
FILE: src/hangar/remotes.py
class Remotes (line 38) | class Remotes(object):
method __init__ (line 51) | def __init__(self, env: Environments):
method __verify_repo_initialized (line 57) | def __verify_repo_initialized(self):
method add (line 70) | def add(self, name: str, address: str) -> RemoteInfo:
method remove (line 109) | def remove(self, name: str) -> RemoteInfo:
method list_all (line 134) | def list_all(self) -> List[RemoteInfo]:
method ping (line 151) | def ping(self, name: str) -> float:
method fetch (line 182) | def fetch(self, remote: str, branch: str) -> str:
method fetch_data_sample (line 287) | def fetch_data_sample(self,
method _select_digests_fetch_data_sample (line 400) | def _select_digests_fetch_data_sample(
method fetch_data (line 491) | def fetch_data(self,
method _form_missing_schema_digest_map (line 605) | def _form_missing_schema_digest_map(
method _select_digest_fetch_data (line 637) | def _select_digest_fetch_data(
method push (line 667) | def push(self, remote: str, branch: str,
FILE: src/hangar/repository.py
class Repository (line 24) | class Repository(object):
method __init__ (line 54) | def __init__(self, path: Union[str, Path], exists: bool = True):
method _repr_pretty_ (line 76) | def _repr_pretty_(self, p, cycle):
method __repr__ (line 95) | def __repr__(self):
method __verify_repo_initialized (line 113) | def __verify_repo_initialized(self):
method remote (line 128) | def remote(self) -> Remotes:
method path (line 144) | def path(self) -> str:
method writer_lock_held (line 156) | def writer_lock_held(self) -> bool:
method version (line 168) | def version(self) -> str:
method initialized (line 181) | def initialized(self) -> bool:
method size_nbytes (line 193) | def size_nbytes(self) -> int:
method size_human (line 210) | def size_human(self) -> str:
method checkout (line 227) | def checkout(self,
method clone (line 301) | def clone(self, user_name: str, user_email: str, remote_address: str,
method init (line 348) | def init(self,
method log (line 378) | def log(self,
method summary (line 424) | def summary(self, *, branch: str = '', commit: str = '') -> None:
method _details (line 449) | def _details(self, *, line_limit=100, line_length=100) -> None: # pra...
method _ecosystem_details (line 467) | def _ecosystem_details(self) -> dict:
method diff (line 473) | def diff(self, master: str, dev: str) -> DiffAndConflicts:
method merge (line 525) | def merge(self, message: str, master_branch: str, dev_branch: str) -> ...
method create_branch (line 555) | def create_branch(self, name: str, base_commit: str = None) -> heads.B...
method remove_branch (line 619) | def remove_branch(self, name: str, *, force_delete: bool = False) -> h...
method list_branches (line 739) | def list_branches(self) -> List[str]:
method verify_repo_integrity (line 751) | def verify_repo_integrity(self) -> bool:
method force_release_writer_lock (line 816) | def force_release_writer_lock(self) -> bool:
FILE: src/hangar/txnctx.py
class TxnRegisterSingleton (line 7) | class TxnRegisterSingleton(type):
method __call__ (line 9) | def __call__(cls, *args, **kwargs):
class TxnRegister (line 15) | class TxnRegister(metaclass=TxnRegisterSingleton):
method __init__ (line 22) | def __init__(self):
method _debug_ (line 29) | def _debug_(self): # pragma: no cover
method begin_writer_txn (line 38) | def begin_writer_txn(self, lmdbenv: lmdb.Environment,
method begin_reader_txn (line 64) | def begin_reader_txn(self, lmdbenv: lmdb.Environment,
method commit_writer_txn (line 90) | def commit_writer_txn(self, lmdbenv: lmdb.Environment) -> bool:
method abort_reader_txn (line 126) | def abort_reader_txn(self, lmdbenv: lmdb.Environment) -> bool:
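The `TxnRegisterSingleton` / `TxnRegister` pair listed above suggests a refcounted, process-wide registry of LMDB transaction handles. A minimal sketch of that pattern, written from the method names alone (the bodies here are illustrative assumptions, not Hangar's actual code, and a plain object stands in for the `lmdb.Environment` handle):

```python
class SingletonMeta(type):
    """Metaclass that hands every caller the same shared instance."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Create the instance on first call, reuse it afterwards.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class TxnRegister(metaclass=SingletonMeta):
    """Refcounts writer handles so nested callers share one transaction."""

    def __init__(self):
        self.write_handles = {}  # env -> (txn, refcount)

    def begin_writer_txn(self, env):
        # Reuse the open handle if one exists; otherwise open a new one.
        txn, count = self.write_handles.get(env, (object(), 0))
        self.write_handles[env] = (txn, count + 1)
        return txn

    def commit_writer_txn(self, env):
        # Only the last user actually commits; earlier exits just decrement.
        txn, count = self.write_handles[env]
        if count == 1:
            del self.write_handles[env]
            return True
        self.write_handles[env] = (txn, count - 1)
        return False
```

The singleton guarantees that two components opening a writer on the same environment see the same transaction object rather than deadlocking on LMDB's single-writer lock.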
FILE: src/hangar/typesystem/base.py
class ColumnLayout (line 6) | class ColumnLayout(String):
class ColumnDType (line 11) | class ColumnDType(String):
class SchemaHasherTcode (line 16) | class SchemaHasherTcode(String):
class ColumnBase (line 20) | class ColumnBase(metaclass=checkedmeta):
method __init__ (line 25) | def __init__(
method _beopts (line 51) | def _beopts(self):
method _beopts (line 61) | def _beopts(self):
method _beopts (line 65) | def _beopts(self, backend_options):
method column_layout (line 73) | def column_layout(self):
method column_type (line 77) | def column_type(self):
method schema_hasher_tcode (line 81) | def schema_hasher_tcode(self):
method schema (line 85) | def schema(self):
method schema_hash_digest (line 92) | def schema_hash_digest(self):
method backend_from_heuristics (line 95) | def backend_from_heuristics(self, *args, **kwargs):
method verify_data_compatible (line 98) | def verify_data_compatible(self, *args, **kwargs):
method data_hasher_tcode (line 102) | def data_hasher_tcode(self):
method data_hash_digest (line 105) | def data_hash_digest(self, *args, **kwargs):
FILE: src/hangar/typesystem/descriptors.py
class Descriptor (line 75) | class Descriptor:
method __init__ (line 77) | def __init__(self, name=None, **opts):
method __set__ (line 81) | def __set__(self, instance, value):
function Typed (line 85) | def Typed(expected_type, cls=None):
function TypedSequence (line 101) | def TypedSequence(expected_element_types, cls=None):
function OneOf (line 119) | def OneOf(expected_values, cls=None):
function MaxSized (line 133) | def MaxSized(cls):
function DictItems (line 151) | def DictItems(expected_keys_required, expected_values, cls=None):
class String (line 178) | class String(Descriptor):
class EmptyDict (line 184) | class EmptyDict(Descriptor):
class OptionalDict (line 189) | class OptionalDict(Descriptor):
class OptionalString (line 194) | class OptionalString(Descriptor):
class Tuple (line 199) | class Tuple(Descriptor):
class SizedIntegerTuple (line 205) | class SizedIntegerTuple(Tuple):
class checkedmeta (line 209) | class checkedmeta(type):
method __new__ (line 211) | def __new__(cls, clsname, bases, methods):
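The `Descriptor` / `Typed` / `checkedmeta` trio listed above outlines a validated-attribute scheme: descriptor factories wrap `__set__` with checks, and a metaclass wires each descriptor to its attribute name. A self-contained sketch of how such a scheme typically fits together (the behavior is inferred from the names; details of the real implementation may differ):

```python
class Descriptor:
    """Base data descriptor: stores the value in the instance dict."""

    def __init__(self, name=None, **opts):
        self.name = name
        for key, value in opts.items():
            setattr(self, key, value)

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value


def Typed(expected_type, cls=None):
    """Decorator factory: subclass a Descriptor so __set__ enforces a type."""
    if cls is None:
        return lambda c: Typed(expected_type, c)
    base_set = cls.__set__

    class _Typed(cls):
        def __set__(self, instance, value):
            if not isinstance(value, expected_type):
                raise TypeError(f'{self.name!r} expects {expected_type}')
            base_set(self, instance, value)

    return _Typed


@Typed(str)
class String(Descriptor):
    pass


class checkedmeta(type):
    """Metaclass that fills in each descriptor's attribute name."""

    def __new__(mcs, clsname, bases, methods):
        for key, value in methods.items():
            if isinstance(value, Descriptor):
                value.name = key
        return super().__new__(mcs, clsname, bases, methods)


class Column(metaclass=checkedmeta):
    layout = String()
```

With this wiring, `Column().layout = 'flat'` succeeds while assigning a non-string raises `TypeError`, which is how schema fields such as `ColumnLayout` can stay declarative one-liners.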
FILE: src/hangar/typesystem/ndarray.py
class NdarraySchemaType (line 9) | class NdarraySchemaType(String):
class NdarrayColumnType (line 14) | class NdarrayColumnType(String):
class DataHasherTcode (line 19) | class DataHasherTcode(String):
class NdarraySchemaBase (line 23) | class NdarraySchemaBase(ColumnBase):
method __init__ (line 28) | def __init__(
method backend_from_heuristics (line 53) | def backend_from_heuristics(self):
method schema_type (line 69) | def schema_type(self):
method shape (line 73) | def shape(self):
method dtype (line 77) | def dtype(self):
method backend (line 81) | def backend(self):
method backend_options (line 85) | def backend_options(self):
method data_hash_digest (line 88) | def data_hash_digest(self, data: np.ndarray) -> str:
method change_backend (line 91) | def change_backend(self, backend, backend_options=None):
method data_nbytes (line 106) | def data_nbytes(self, obj: np.ndarray):
class NdarrayFixedShapeBackends (line 111) | class NdarrayFixedShapeBackends(OptionalString):
class FixedShapeSchemaType (line 116) | class FixedShapeSchemaType(String):
class NdarrayFixedShape (line 120) | class NdarrayFixedShape(NdarraySchemaBase):
method __init__ (line 127) | def __init__(self, *args, **kwargs):
method verify_data_compatible (line 142) | def verify_data_compatible(self, data):
class NdarrayVariableShapeBackends (line 164) | class NdarrayVariableShapeBackends(OptionalString):
class VariableShapeSchemaType (line 169) | class VariableShapeSchemaType(String):
class NdarrayVariableShape (line 173) | class NdarrayVariableShape(NdarraySchemaBase):
method __init__ (line 180) | def __init__(self, *args, **kwargs):
method verify_data_compatible (line 195) | def verify_data_compatible(self, data):
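`NdarrayFixedShape` and `NdarrayVariableShape` both expose `verify_data_compatible`, implying two validation policies: exact shape/dtype match versus a per-dimension upper bound. A dependency-free sketch of those two checks, with `(shape, dtype)` tuples standing in for `numpy.ndarray` samples (the class and parameter names here are hypothetical simplifications of the listed ones):

```python
class FixedShapeSchema:
    """Accepts only samples whose shape and dtype match the schema exactly."""

    def __init__(self, shape, dtype):
        self.shape = tuple(shape)
        self.dtype = dtype

    def verify_data_compatible(self, shape, dtype):
        if dtype != self.dtype:
            raise ValueError(f'dtype {dtype!r} != schema dtype {self.dtype!r}')
        if tuple(shape) != self.shape:
            raise ValueError(f'shape {shape} != schema shape {self.shape}')
        return True


class VariableShapeSchema:
    """Accepts samples of matching rank whose dims stay within a maximum."""

    def __init__(self, max_shape, dtype):
        self.max_shape = tuple(max_shape)
        self.dtype = dtype

    def verify_data_compatible(self, shape, dtype):
        if dtype != self.dtype:
            raise ValueError(f'dtype {dtype!r} != schema dtype {self.dtype!r}')
        shape = tuple(shape)
        if len(shape) != len(self.max_shape) or any(
                dim > limit for dim, limit in zip(shape, self.max_shape)):
            raise ValueError(f'shape {shape} exceeds bounds {self.max_shape}')
        return True
```

The split lets fixed-shape columns pick dense backends that assume uniform samples, while variable-shape columns only promise an envelope.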
FILE: src/hangar/typesystem/pybytes.py
class BytesDType (line 8) | class BytesDType(Descriptor):
class BytesSchemaType (line 18) | class BytesSchemaType(String):
class BytesColumnType (line 23) | class BytesColumnType(String):
class DataHasherTcode (line 28) | class DataHasherTcode(String):
class BytesSchemaBase (line 32) | class BytesSchemaBase(ColumnBase):
method __init__ (line 37) | def __init__(
method backend_from_heuristics (line 62) | def backend_from_heuristics(self):
method schema_type (line 66) | def schema_type(self):
method dtype (line 70) | def dtype(self):
method backend (line 74) | def backend(self):
method backend_options (line 78) | def backend_options(self):
method data_hash_digest (line 81) | def data_hash_digest(self, data: str) -> str:
method change_backend (line 84) | def change_backend(self, backend, backend_options=None):
class BytesVariableShapeBackends (line 102) | class BytesVariableShapeBackends(OptionalString):
class VariableShapeSchemaType (line 107) | class VariableShapeSchemaType(String):
class BytesVariableShape (line 111) | class BytesVariableShape(BytesSchemaBase):
method __init__ (line 117) | def __init__(self, *args, **kwargs):
method verify_data_compatible (line 132) | def verify_data_compatible(self, data):
FILE: src/hangar/typesystem/pystring.py
class StringDType (line 8) | class StringDType(Descriptor):
class StringSchemaType (line 18) | class StringSchemaType(String):
class StrColumnType (line 23) | class StrColumnType(String):
class DataHasherTcode (line 28) | class DataHasherTcode(String):
class StringSchemaBase (line 32) | class StringSchemaBase(ColumnBase):
method __init__ (line 37) | def __init__(
method backend_from_heuristics (line 62) | def backend_from_heuristics(self):
method schema_type (line 66) | def schema_type(self):
method dtype (line 70) | def dtype(self):
method backend (line 74) | def backend(self):
method backend_options (line 78) | def backend_options(self):
method data_hash_digest (line 81) | def data_hash_digest(self, data: str) -> str:
method change_backend (line 84) | def change_backend(self, backend, backend_options=None):
method data_nbytes (line 100) | def data_nbytes(self, obj: str):
class StringVariableShapeBackends (line 105) | class StringVariableShapeBackends(OptionalString):
class VariableShapeSchemaType (line 110) | class VariableShapeSchemaType(String):
class StringVariableShape (line 114) | class StringVariableShape(StringSchemaBase):
method __init__ (line 120) | def __init__(self, *args, **kwargs):
method verify_data_compatible (line 135) | def verify_data_compatible(self, data):
FILE: src/hangar/utils.py
function bound (line 19) | def bound(low: NumType, high: NumType, value: NumType) -> NumType:
function calc_num_threadpool_workers (line 38) | def calc_num_threadpool_workers() -> int:
function is_64bits (line 43) | def is_64bits():
function set_blosc_nthreads (line 49) | def set_blosc_nthreads() -> int: # pragma: no cover
function random_string (line 72) | def random_string(
function is_suitable_user_key (line 90) | def is_suitable_user_key(key: Union[str, int]) -> bool:
function is_ascii (line 120) | def is_ascii(str_data: str) -> bool:
function pairwise (line 142) | def pairwise(iterable):
function unique_everseen (line 149) | def unique_everseen(iterable, key=None):
function ilen (line 171) | def ilen(iterable):
function grouper (line 187) | def grouper(iterable, n, fillvalue=None):
function file_size (line 208) | def file_size(p: Path) -> int: # pragma: no cover
function folder_size (line 232) | def folder_size(p: Path, *, recurse: bool = False) -> int:
function is_valid_directory_path (line 259) | def is_valid_directory_path(p: Path) -> Path:
function format_bytes (line 301) | def format_bytes(n: int) -> str:
function parse_bytes (line 348) | def parse_bytes(s: str) -> int:
function readme_contents (line 380) | def readme_contents(user_name: str, user_email: str) -> StringIO:
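Several of the helpers listed above (`pairwise`, `grouper`, `bound`) have well-known semantics. The following is a hedged sketch of what they conventionally do, based on the standard itertools recipes of the same names and the `bound(low, high, value)` signature shown; the actual hangar implementations may differ in details.

```python
from itertools import tee, zip_longest

def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ... (standard itertools recipe)."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks: 'ABCDE', 2 -> AB CD Ex."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

def bound(low, high, value):
    """Clamp `value` into the closed interval [low, high]."""
    return max(low, min(high, value))
```

These are sketches under the assumption that hangar follows the common recipe forms, not a copy of the repository's code.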
FILE: tests/bulk_importer/test_bulk_importer.py
function assert_equal (line 5) | def assert_equal(arr, arr2):
function test_bulk_importer_ndarray (line 10) | def test_bulk_importer_ndarray(repo):
function test_bulk_importer_pystr (line 59) | def test_bulk_importer_pystr(repo):
function test_bulk_importer_pybytes (line 104) | def test_bulk_importer_pybytes(repo):
function test_bulk_importer_two_col_pybytes_pystr (line 150) | def test_bulk_importer_two_col_pybytes_pystr(repo):
function test_signature_wrong (line 214) | def test_signature_wrong(repo):
FILE: tests/conftest.py
function monkeysession (line 20) | def monkeysession(request):
function classrepo (line 28) | def classrepo(tmp_path_factory) -> Repository:
function managed_tmpdir (line 64) | def managed_tmpdir(monkeypatch, tmp_path):
function managed_tmpdir_class (line 80) | def managed_tmpdir_class(monkeysession, tmp_path_factory):
function repo (line 99) | def repo(managed_tmpdir) -> Repository:
function aset_samples_initialized_repo (line 107) | def aset_samples_initialized_repo(repo) -> Repository:
function aset_subsamples_initialized_repo (line 116) | def aset_subsamples_initialized_repo(repo) -> Repository:
function repo_20_filled_samples (line 126) | def repo_20_filled_samples(request, aset_samples_initialized_repo, array...
function repo_20_filled_subsamples (line 141) | def repo_20_filled_subsamples(request, aset_subsamples_initialized_repo,...
function repo_300_filled_samples (line 158) | def repo_300_filled_samples(request, aset_samples_initialized_repo, arra...
function repo_20_filled_samples2 (line 171) | def repo_20_filled_samples2(repo) -> Repository:
function aset_samples_var_shape_initialized_repo (line 185) | def aset_samples_var_shape_initialized_repo(request, repo) -> Repository:
function aset_samples_initialized_w_checkout (line 195) | def aset_samples_initialized_w_checkout(aset_samples_initialized_repo) -...
function array5by7 (line 202) | def array5by7():
function randomsizedarray (line 207) | def randomsizedarray():
function two_commit_filled_samples_repo (line 214) | def two_commit_filled_samples_repo(request, repo, array5by7) -> Repository:
function repo_1_br_no_conf (line 234) | def repo_1_br_no_conf(repo) -> Repository:
function repo_2_br_no_conf (line 257) | def repo_2_br_no_conf(repo_1_br_no_conf) -> Repository:
function mock_server_config (line 271) | def mock_server_config(*args, **kwargs):
function server_instance (line 287) | def server_instance(monkeypatch, managed_tmpdir, worker_id):
function server_instance_class (line 307) | def server_instance_class(monkeysession, tmp_path_factory, worker_id):
function written_two_cmt_server_repo (line 334) | def written_two_cmt_server_repo(server_instance, two_commit_filled_sampl...
function server_instance_push_restricted (line 343) | def server_instance_push_restricted(monkeypatch, managed_tmpdir, worker_...
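The fixtures in tests/conftest.py above (e.g. `managed_tmpdir`) follow the common pytest pattern of yielding an isolated temporary directory and cleaning it up after the test. A minimal, dependency-free sketch of that pattern (the name `managed_tmpdir_sketch` is hypothetical; in a real conftest.py this generator would be decorated with `@pytest.fixture`):

```python
import shutil
import tempfile
from pathlib import Path

def managed_tmpdir_sketch():
    """Yield a fresh temporary directory; remove it on teardown.

    Hypothetical stand-in for a fixture like `managed_tmpdir`: setup runs
    before the yield, the test body receives the yielded path, and teardown
    runs after the generator resumes.
    """
    tmp = Path(tempfile.mkdtemp(prefix="hangar-test-"))
    try:
        yield tmp
    finally:
        shutil.rmtree(tmp, ignore_errors=True)
```

This only illustrates the setup/yield/teardown shape; the real fixtures additionally monkeypatch hangar configuration and construct `Repository` objects.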
FILE: tests/ml_datasets/test_dataset.py
class TestInternalDatasetClass (line 18) | class TestInternalDatasetClass:
method test_column_without_wrapping_list (line 20) | def test_column_without_wrapping_list(self, repo_20_filled_samples, ar...
method test_no_column (line 31) | def test_no_column(self):
method test_fails_on_write_enabled_columns (line 35) | def test_fails_on_write_enabled_columns(self, repo_20_filled_samples):
method test_columns_without_local_data_and_without_key_argument (line 44) | def test_columns_without_local_data_and_without_key_argument(self, rep...
method test_columns_without_common_keys_and_without_key_argument (line 71) | def test_columns_without_common_keys_and_without_key_argument(self, re...
method test_keys_single_column_success (line 84) | def test_keys_single_column_success(self, repo_20_filled_samples):
method test_keys_multiple_column_success (line 92) | def test_keys_multiple_column_success(self, repo_20_filled_samples):
method test_keys_nested_column_success (line 104) | def test_keys_nested_column_success(self, repo_20_filled_subsamples):
method test_keys_not_valid (line 122) | def test_keys_not_valid(self, repo_20_filled_samples):
method test_keys_non_local (line 132) | def test_keys_non_local(self, repo_20_filled_samples):
class TestNumpyDataset (line 154) | class TestNumpyDataset:
method test_multiple_dataset_batched_loader (line 155) | def test_multiple_dataset_batched_loader(self, repo_20_filled_samples):
method test_nested_column (line 189) | def test_nested_column(self, repo_20_filled_subsamples):
method test_lots_of_data_with_multiple_backend (line 210) | def test_lots_of_data_with_multiple_backend(self, repo_300_filled_samp...
method test_shuffle (line 220) | def test_shuffle(self, repo_20_filled_samples):
method test_collate_fn (line 243) | def test_collate_fn(self, repo_20_filled_subsamples):
class TestTorchDataset (line 281) | class TestTorchDataset(object):
method test_multiple_dataset_loader (line 283) | def test_multiple_dataset_loader(self, repo_20_filled_samples):
method test_return_as_dict (line 298) | def test_return_as_dict(self, repo_20_filled_samples):
method test_lots_of_data_with_multiple_backend (line 311) | def test_lots_of_data_with_multiple_backend(self, repo_300_filled_samp...
method test_lots_of_data_with_multiple_backend_multiple_worker_dataloader (line 324) | def test_lots_of_data_with_multiple_backend_multiple_worker_dataloader...
method test_two_aset_loader_two_worker_dataloader (line 336) | def test_two_aset_loader_two_worker_dataloader(self, repo_20_filled_sa...
class TestTfDataset (line 358) | class TestTfDataset(object):
method test_dataset_loader (line 361) | def test_dataset_loader(self, repo_20_filled_samples):
method test_variably_shaped (line 375) | def test_variably_shaped(self, aset_samples_var_shape_initialized_repo):
method test_lots_of_data_with_multiple_backend (line 397) | def test_lots_of_data_with_multiple_backend(self, repo_300_filled_samp...
method test_shuffle (line 407) | def test_shuffle(self, repo_20_filled_samples):
FILE: tests/property_based/test_pbt_column_flat.py
function fixed_shape_repo_co_float32_aset_flat (line 26) | def fixed_shape_repo_co_float32_aset_flat(classrepo, request) -> Reposit...
function variable_shape_repo_co_float32_aset_flat (line 44) | def variable_shape_repo_co_float32_aset_flat(classrepo, request) -> Repo...
function variable_shape_repo_co_uint8_aset_flat (line 62) | def variable_shape_repo_co_uint8_aset_flat(classrepo, request) -> Reposi...
function variable_shape_repo_co_str_aset_flat (line 80) | def variable_shape_repo_co_str_aset_flat(classrepo, request) -> Repository:
function variable_shape_repo_co_bytes_aset_flat (line 95) | def variable_shape_repo_co_bytes_aset_flat(classrepo, request) -> Reposi...
class TestColumn1 (line 144) | class TestColumn1:
method test_arrayset_fixed_key_values (line 148) | def test_arrayset_fixed_key_values(self, key, val, fixed_shape_repo_co...
class TestColumn2 (line 179) | class TestColumn2:
method test_arrayset_variable_shape_float32 (line 183) | def test_arrayset_variable_shape_float32(self, key, val, variable_shap...
class TestColumn3 (line 205) | class TestColumn3:
method test_arrayset_variable_shape_uint8 (line 209) | def test_arrayset_variable_shape_uint8(self, key, val, variable_shape_...
class TestColumn4 (line 229) | class TestColumn4:
method test_str_column_variable_shape (line 233) | def test_str_column_variable_shape(self, key, val, variable_shape_repo...
class TestColumn5 (line 253) | class TestColumn5:
method test_bytes_column_variable_shape (line 257) | def test_bytes_column_variable_shape(self, key, val, variable_shape_re...
FILE: tests/property_based/test_pbt_column_nested.py
function fixed_shape_repo_co_float32_aset_nested (line 28) | def fixed_shape_repo_co_float32_aset_nested(classrepo, request) -> Repos...
function variable_shape_repo_co_float32_aset_nested (line 46) | def variable_shape_repo_co_float32_aset_nested(classrepo, request) -> Re...
function variable_shape_repo_co_uint8_aset_nested (line 64) | def variable_shape_repo_co_uint8_aset_nested(classrepo, request) -> Repo...
function variable_shape_repo_co_str_aset_nested (line 82) | def variable_shape_repo_co_str_aset_nested(classrepo, request) -> Reposi...
function variable_shape_repo_co_bytes_aset_nested (line 97) | def variable_shape_repo_co_bytes_aset_nested(classrepo, request) -> Repo...
class TestColumn1 (line 145) | class TestColumn1:
method test_arrayset_fixed_key_values_nested (line 149) | def test_arrayset_fixed_key_values_nested(self, key, subkey, val, fixe...
class TestColumn2 (line 183) | class TestColumn2:
method test_arrayset_variable_shape_float32_nested (line 187) | def test_arrayset_variable_shape_float32_nested(self, key, val, subkey...
class TestColumn3 (line 211) | class TestColumn3:
method test_arrayset_variable_shape_uint8_nested (line 215) | def test_arrayset_variable_shape_uint8_nested(self, key, val, subkey, ...
class TestStrColumn (line 237) | class TestStrColumn:
method test_str_column_variable_shape_nested (line 241) | def test_str_column_variable_shape_nested(self, key, subkey, val, vari...
class TestBytesColumn (line 261) | class TestBytesColumn:
method test_bytes_column_variable_shape_nested (line 265) | def test_bytes_column_variable_shape_nested(self, key, subkey, val, va...
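The property-based tests listed in both files exercise one core invariant: any generated key/value pair written to a column can be read back unchanged. A dependency-free sketch of that round-trip property, with a plain dict standing in for a hangar column and a seeded random generator standing in for hypothesis strategies (both substitutions are assumptions for illustration only):

```python
import random
import string

def check_roundtrip(column, pairs):
    """Write each (key, value) pair, then assert the read-back value is identical."""
    for key, val in pairs:
        column[key] = val          # write the sample
        assert column[key] == val  # read it back unchanged
    return True

# Seeded stand-in for hypothesis-generated keys and byte payloads.
random.seed(0)
pairs = [
    ("".join(random.choices(string.ascii_lowercase, k=8)),
     bytes(random.getrandbits(8) for _ in range(16)))
    for _ in range(50)
]
```

The real tests drive this property through checkout/column objects across fixed-shape and variable-shape schemas; only the property itself is shown here.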
FILE: tests/test_backend_hdf5_00_hdf5_01.py
function be_filehandle (line 6) | def be_filehandle(request):
function test_blosc_filter_opts_result_in_correct_dataset_args (line 24) | def test_blosc_filter_opts_result_in_correct_dataset_args(
function test_lzf_filter_opts_result_in_correct_dataset_args (line 38) | def test_lzf_filter_opts_result_in_correct_dataset_args(be_filehandle, c...
function test_gzip_filter_opts_result_in_correct_dataset_args (line 51) | def test_gzip_filter_opts_result_in_correct_dataset_args(be_filehandle, ...
function test_arrayset_init_with_various_blosc_opts (line 72) | def test_arrayset_init_with_various_blosc_opts(repo, array5by7, clib, cl...
function test_arrayset_init_with_various_lzf_opts (line 99) | def test_arrayset_init_with_various_lzf_opts(repo, array5by7, cshuffle, ...
function test_arrayset_init_with_various_gzip_opts (line 124) | def test_arrayset_init_with_various_gzip_opts(repo, array5by7, clevel, c...
function test_arrayset_overflows_collection_size_collection_count (line 150) | def test_arrayset_overflows_collection_size_collection_count(be_code, re...
FILE: tests/test_branching.py
function test_create_branch_fails_invalid_name (line 8) | def test_create_branch_fails_invalid_name(aset_samples_initialized_repo,...
function test_list_branches_only_reports_master_upon_initialization (line 14) | def test_list_branches_only_reports_master_upon_initialization(repo):
function test_cannot_create_new_branch_from_initialized_repo_with_no_commits (line 19) | def test_cannot_create_new_branch_from_initialized_repo_with_no_commits(...
function test_can_create_new_branch_from_repo_with_one_commit (line 24) | def test_can_create_new_branch_from_repo_with_one_commit(repo):
function test_cannot_duplicate_branch_name (line 36) | def test_cannot_duplicate_branch_name(aset_samples_initialized_repo):
function test_create_multiple_branches_different_name_same_commit (line 42) | def test_create_multiple_branches_different_name_same_commit(aset_sample...
function test_create_branch_by_specifying_base_commit (line 53) | def test_create_branch_by_specifying_base_commit(repo):
function test_remove_branch_works_when_commits_align (line 79) | def test_remove_branch_works_when_commits_align(repo):
function test_delete_branch_raises_runtime_error_when_history_not_merged (line 96) | def test_delete_branch_raises_runtime_error_when_history_not_merged(repo):
function test_delete_branch_completes_when_history_not_merged_but_force_option_set (line 119) | def test_delete_branch_completes_when_history_not_merged_but_force_optio...
function test_delete_branch_raises_value_error_if_invalid_branch_name (line 144) | def test_delete_branch_raises_value_error_if_invalid_branch_name(repo):
function test_delete_branch_raises_permission_error_if_writer_lock_held (line 165) | def test_delete_branch_raises_permission_error_if_writer_lock_held(repo):
function test_delete_branch_raises_permission_error_if_branch_requested_is_staging_head (line 187) | def test_delete_branch_raises_permission_error_if_branch_requested_is_st...
function test_delete_branch_raises_permission_error_if_only_one_branch_left (line 206) | def test_delete_branch_raises_permission_error_if_only_one_branch_left(r...
FILE: tests/test_checkout.py
class TestCheckout (line 7) | class TestCheckout(object):
method test_write_checkout_specifying_commit_not_allowed_if_commit_exists (line 9) | def test_write_checkout_specifying_commit_not_allowed_if_commit_exists...
method test_write_checkout_specifying_commit_not_allowed_if_commit_does_not_exists (line 14) | def test_write_checkout_specifying_commit_not_allowed_if_commit_does_n...
method test_two_write_checkouts (line 19) | def test_two_write_checkouts(self, repo):
method test_two_read_checkouts (line 25) | def test_two_read_checkouts(self, repo, array5by7):
method test_write_with_read_checkout (line 40) | def test_write_with_read_checkout(self, aset_samples_initialized_repo,...
method test_writer_aset_obj_not_accessible_after_close (line 48) | def test_writer_aset_obj_not_accessible_after_close(self, two_commit_f...
method test_writer_aset_obj_arrayset_iter_values_not_accessible_after_close (line 62) | def test_writer_aset_obj_arrayset_iter_values_not_accessible_after_clo...
method test_writer_aset_obj_arrayset_iter_items_not_accessible_after_close (line 74) | def test_writer_aset_obj_arrayset_iter_items_not_accessible_after_clos...
method test_writer_aset_obj_not_accessible_after_commit_and_close (line 87) | def test_writer_aset_obj_not_accessible_after_commit_and_close(self, a...
method test_reader_aset_obj_not_accessible_after_close (line 105) | def test_reader_aset_obj_not_accessible_after_close(self, two_commit_f...
method test_reader_aset_obj_column_iter_values_not_accessible_after_close (line 119) | def test_reader_aset_obj_column_iter_values_not_accessible_after_close...
method test_reader_aset_obj_arrayset_iter_items_not_accessible_after_close (line 131) | def test_reader_aset_obj_arrayset_iter_items_not_accessible_after_clos...
method test_reader_arrayset_context_manager_not_accessible_after_close (line 144) | def test_reader_arrayset_context_manager_not_accessible_after_close(se...
method test_writer_arrayset_context_manager_not_accessible_after_close (line 162) | def test_writer_arrayset_context_manager_not_accessible_after_close(se...
method test_close_read_does_not_invalidate_write_checkout (line 180) | def test_close_read_does_not_invalidate_write_checkout(self, aset_samp...
method test_close_write_does_not_invalidate_read_checkout (line 198) | def test_close_write_does_not_invalidate_read_checkout(self, aset_samp...
method test_operate_on_arrayset_after_closing_old_checkout (line 216) | def test_operate_on_arrayset_after_closing_old_checkout(self, repo, ar...
method test_operate_on_closed_checkout (line 229) | def test_operate_on_closed_checkout(self, repo, array5by7):
method test_operate_on_arrayset_samples_after_commiting_but_not_closing_checkout (line 238) | def test_operate_on_arrayset_samples_after_commiting_but_not_closing_c...
method test_operate_on_arraysets_after_commiting_but_not_closing_checkout (line 254) | def test_operate_on_arraysets_after_commiting_but_not_closing_checkout...
method test_with_wrong_argument_value (line 274) | def test_with_wrong_argument_value(self, repo):
method test_reset_staging_area_no_changes_made_does_not_work (line 290) | def test_reset_staging_area_no_changes_made_does_not_work(self, aset1_...
method test_reset_staging_area_clears_arraysets (line 314) | def test_reset_staging_area_clears_arraysets(self, aset1_backend, aset...
method test_checkout_dunder_contains_method (line 337) | def test_checkout_dunder_contains_method(self, repo_20_filled_samples,...
method test_checkout_dunder_len_method (line 345) | def test_checkout_dunder_len_method(self, repo_20_filled_samples, write):
method test_checkout_dunder_iter_method (line 351) | def test_checkout_dunder_iter_method(self, repo_20_filled_samples, wri...
method test_checkout_keys_method (line 364) | def test_checkout_keys_method(self, repo_20_filled_samples, write):
method test_checkout_values_method (line 373) | def test_checkout_values_method(self, repo_20_filled_samples, write):
method test_checkout_items_method (line 388) | def test_checkout_items_method(self, repo_20_filled_samples, write):
method test_checkout_log_method (line 404) | def test_checkout_log_method(self, repo_20_filled_samples, write):
class TestBranchingMergingInCheckout (line 412) | class TestBranchingMergingInCheckout(object):
method test_merge (line 414) | def test_merge(self, aset_samples_initialized_repo, array5by7):
method test_merge_without_closing_previous_checkout (line 431) | def test_merge_without_closing_previous_checkout(self, aset_samples_in...
method test_merge_multiple_checkouts_same_aset (line 441) | def test_merge_multiple_checkouts_same_aset(self, aset_samples_initial...
method test_merge_multiple_checkouts_multiple_aset (line 469) | def test_merge_multiple_checkouts_multiple_aset(self, aset_samples_ini...
method test_merge_diverged_conflict (line 492) | def test_merge_diverged_conflict(self, aset_samples_initialized_repo, ...
method test_new_branch_from_where (line 518) | def test_new_branch_from_where(self, aset_samples_initialized_repo, ar...
method test_cannot_checkout_branch_with_staged_changes (line 541) | def test_cannot_checkout_branch_with_staged_changes(self, aset_samples...
function test_full_from_short_commit_digest (line 565) | def test_full_from_short_commit_digest(two_commit_filled_samples_repo):
function test_writer_context_manager_objects_are_gc_removed_after_co_close (line 580) | def test_writer_context_manager_objects_are_gc_removed_after_co_close(tw...
function test_reader_context_manager_objects_are_gc_removed_after_co_close (line 615) | def test_reader_context_manager_objects_are_gc_removed_after_co_close(tw...
function test_checkout_branch_not_existing_does_not_hold_writer_lock (line 641) | def test_checkout_branch_not_existing_does_not_hold_writer_lock(two_comm...
FILE: tests/test_checkout_arrayset_access.py
function test_arrayset_getattr_does_not_raise_permission_error_if_alive (line 9) | def test_arrayset_getattr_does_not_raise_permission_error_if_alive(write...
function test_write_in_context_manager_no_loop (line 26) | def test_write_in_context_manager_no_loop(aset_samples_initialized_repo,...
function test_write_in_context_manager_many_samples_looping (line 50) | def test_write_in_context_manager_many_samples_looping(aset_samples_init...
function test_write_fails_if_checkout_closed (line 86) | def test_write_fails_if_checkout_closed(aset_samples_initialized_repo, a...
function test_write_context_manager_fails_if_checkout_closed (line 105) | def test_write_context_manager_fails_if_checkout_closed(aset_samples_ini...
function test_writer_co_read_single_aset_single_sample (line 127) | def test_writer_co_read_single_aset_single_sample(aset_samples_initializ...
function test_writer_co_read_single_aset_multiple_samples (line 141) | def test_writer_co_read_single_aset_multiple_samples(aset_samples_initia...
function test_writer_co_read_multiple_aset_single_samples (line 156) | def test_writer_co_read_multiple_aset_single_samples(aset_samples_initia...
function test_writer_co_read_multtiple_aset_multiple_samples (line 180) | def test_writer_co_read_multtiple_aset_multiple_samples(aset_samples_ini...
function test_writer_co_read_fails_nonexistant_aset_name (line 205) | def test_writer_co_read_fails_nonexistant_aset_name(aset_samples_initial...
function test_writer_co_read_fails_nonexistant_sample_name (line 215) | def test_writer_co_read_fails_nonexistant_sample_name(aset_samples_initi...
function test_writer_co_get_returns_none_on_nonexistant_sample_name (line 225) | def test_writer_co_get_returns_none_on_nonexistant_sample_name(aset_samp...
function test_writer_co_read_in_context_manager_no_loop (line 235) | def test_writer_co_read_in_context_manager_no_loop(aset_samples_initiali...
function test_writer_co_read_in_context_manager_many_samples_looping (line 248) | def test_writer_co_read_in_context_manager_many_samples_looping(aset_sam...
function test_co_read_dunder_getitem_excepts_missing_sample (line 281) | def test_co_read_dunder_getitem_excepts_missing_sample(aset_samples_init...
function test_co_read_get_except_missing_true_excepts_missing_sample (line 289) | def test_co_read_get_except_missing_true_excepts_missing_sample(aset_sam...
function test_co_read_get_except_missing_false_returns_none_on_missing_sample (line 297) | def test_co_read_get_except_missing_false_returns_none_on_missing_sample...
function test_writer_co_aset_finds_connection_manager_of_any_aset_in_cm (line 306) | def test_writer_co_aset_finds_connection_manager_of_any_aset_in_cm(aset_...
function test_writer_co_aset_cm_not_allow_remove_aset (line 327) | def test_writer_co_aset_cm_not_allow_remove_aset(aset_samples_initialize...
function test_writer_co_column_instance_cm_not_allow_any_column_removal (line 359) | def test_writer_co_column_instance_cm_not_allow_any_column_removal(repo_...
function test_writer_co_aset_removes_all_samples_and_arrayset_still_exists (line 423) | def test_writer_co_aset_removes_all_samples_and_arrayset_still_exists(as...
function test_reader_co_read_single_aset_single_sample (line 452) | def test_reader_co_read_single_aset_single_sample(aset_samples_initializ...
function test_reader_co_read_single_aset_multiple_samples (line 469) | def test_reader_co_read_single_aset_multiple_samples(aset_samples_initia...
function test_reader_co_read_multiple_aset_single_samples (line 487) | def test_reader_co_read_multiple_aset_single_samples(aset_samples_initia...
function test_reader_co_read_multtiple_aset_multiple_samples (line 514) | def test_reader_co_read_multtiple_aset_multiple_samples(aset_samples_ini...
function test_reader_co_read_fails_nonexistant_aset_name (line 542) | def test_reader_co_read_fails_nonexistant_aset_name(aset_samples_initial...
function test_reader_co_read_fails_nonexistant_sample_name (line 549) | def test_reader_co_read_fails_nonexistant_sample_name(aset_samples_initi...
function test_reader_co_get_read_returns_none_nonexistant_sample_name (line 562) | def test_reader_co_get_read_returns_none_nonexistant_sample_name(aset_sa...
function test_reader_co_read_in_context_manager_no_loop (line 575) | def test_reader_co_read_in_context_manager_no_loop(aset_samples_initiali...
function test_reader_co_read_in_context_manager_many_samples_looping (line 592) | def test_reader_co_read_in_context_manager_many_samples_looping(aset_sam...
FILE: tests/test_cli.py
function test_help_option (line 49) | def test_help_option():
function test_help_no_args_option (line 57) | def test_help_no_args_option():
function test_version_long_option (line 65) | def test_version_long_option():
function test_init_repo (line 74) | def test_init_repo(managed_tmpdir):
function test_writer_lock_is_held_check (line 87) | def test_writer_lock_is_held_check(repo_20_filled_samples2):
function test_writer_lock_force_release (line 99) | def test_writer_lock_force_release(repo_20_filled_samples2):
function test_checkout_writer_branch_works (line 115) | def test_checkout_writer_branch_works(repo_20_filled_samples2):
function test_checkout_writer_branch_nonexistant_branch_errors (line 127) | def test_checkout_writer_branch_nonexistant_branch_errors(repo_20_filled...
function test_checkout_writer_branch_lock_held_errors (line 138) | def test_checkout_writer_branch_lock_held_errors(repo_20_filled_samples2):
function test_diff_command (line 157) | def test_diff_command(repo_2_br_no_conf):
function test_commit_cli_message (line 163) | def test_commit_cli_message(repo_20_filled_samples2):
function test_commit_cli_message_with_no_changes (line 187) | def test_commit_cli_message_with_no_changes(repo_20_filled_samples2):
function substitute_editor_commit_message (line 207) | def substitute_editor_commit_message(hint):
function test_commit_editor_message (line 211) | def test_commit_editor_message(monkeypatch, repo_20_filled_samples2):
function substitute_editor_empty_commit_message (line 238) | def substitute_editor_empty_commit_message(hint):
function test_commit_editor_empty_message (line 242) | def test_commit_editor_empty_message(monkeypatch, repo_20_filled_samples2):
function test_clone (line 265) | def test_clone(written_two_cmt_server_repo):
function test_push_fetch_records (line 287) | def test_push_fetch_records(server_instance, backend):
function test_fetch_records_and_data (line 333) | def test_fetch_records_and_data(server_instance, backend, options):
function test_add_remote (line 383) | def test_add_remote(managed_tmpdir):
function test_remove_remote (line 404) | def test_remove_remote(managed_tmpdir):
function test_list_all_remotes (line 430) | def test_list_all_remotes(managed_tmpdir):
function test_summary (line 463) | def test_summary(written_two_cmt_server_repo, capsys):
function test_summary_before_commit_made (line 487) | def test_summary_before_commit_made(managed_tmpdir):
function test_log (line 501) | def test_log(written_two_cmt_server_repo, capsys):
function test_status (line 525) | def test_status(repo_20_filled_samples2):
function test_arrayset_create_uint8 (line 544) | def test_arrayset_create_uint8(repo_20_filled_samples2):
function test_arrayset_create_float32 (line 562) | def test_arrayset_create_float32(repo_20_filled_samples2):
function test_arrayset_create_invalid_dtype_fails (line 580) | def test_arrayset_create_invalid_dtype_fails(repo_20_filled_samples2):
function test_arrayset_create_invalid_name_fails (line 597) | def test_arrayset_create_invalid_name_fails(repo_20_filled_samples2):
function test_arrayset_create_variable_shape (line 612) | def test_arrayset_create_variable_shape(repo_20_filled_samples2):
function test_arrayset_create_contains_subsamples (line 631) | def test_arrayset_create_contains_subsamples(repo_20_filled_samples2):
function test_remove_arrayset (line 650) | def test_remove_arrayset(repo_20_filled_samples2):
function test_remove_non_existing_arrayset (line 663) | def test_remove_non_existing_arrayset(repo_20_filled_samples2):
function test_branch_create_and_list (line 678) | def test_branch_create_and_list(written_two_cmt_server_repo):
function test_branch_create_and_delete (line 710) | def test_branch_create_and_delete(written_two_cmt_server_repo):
function test_start_server (line 765) | def test_start_server(managed_tmpdir):
function test_db_view_command (line 779) | def test_db_view_command(repo_20_filled_samples):
function monkeypatch_scan (line 798) | def monkeypatch_scan(provides, accepts, attribute, func):
function written_repo_with_1_sample (line 810) | def written_repo_with_1_sample(aset_samples_initialized_repo):
class TestImport (line 823) | class TestImport(object):
method load (line 826) | def load(fpath, *args, **kwargs):
method test_import (line 832) | def test_import(self, monkeypatch, written_repo_with_1_sample):
method test_import_wrong_args (line 870) | def test_import_wrong_args(self, monkeypatch, written_repo_with_1_samp...
method test_import_generator_on_load (line 909) | def test_import_generator_on_load(self, monkeypatch, written_repo_with...
class TestExport (line 936) | class TestExport(object):
method save (line 940) | def save(cls, data, outdir, sampleN, extension, *args, **kwargs):
method test_export_success (line 945) | def test_export_success(self, monkeypatch, written_repo_with_1_sample,...
method test_export_wrong_out_location (line 981) | def test_export_wrong_out_location(self, monkeypatch, written_repo_wit...
method test_export_wrong_arg (line 997) | def test_export_wrong_arg(self, monkeypatch, written_repo_with_1_sampl...
method test_export_without_specifying_out (line 1009) | def test_export_without_specifying_out(self, monkeypatch, written_repo...
method test_export_for_non_existent_sample (line 1021) | def test_export_for_non_existent_sample(self, monkeypatch, written_rep...
method test_export_for_specified_branch (line 1033) | def test_export_for_specified_branch(self, monkeypatch, written_repo_w...
class TestShow (line 1045) | class TestShow(object):
method show (line 1049) | def show(cls, fpath, *args, **kwargs):
method test_show_success (line 1052) | def test_show_success(self, monkeypatch, written_repo_with_1_sample):
method test_show_on_startpoint (line 1064) | def test_show_on_startpoint(self, monkeypatch, written_repo_with_1_sam...
method test_show_with_wrong_arg (line 1078) | def test_show_with_wrong_arg(self, monkeypatch, written_repo_with_1_sa...
method test_wrong_sample_name (line 1090) | def test_wrong_sample_name(self, monkeypatch, written_repo_with_1_samp...
FILE: tests/test_column.py
function assert_equal (line 7) | def assert_equal(arr, arr2):
class TestColumn (line 12) | class TestColumn(object):
method test_invalid_column_name (line 17) | def test_invalid_column_name(self, repo, randomsizedarray, name):
method test_read_only_mode (line 25) | def test_read_only_mode(self, aset_samples_initialized_repo):
method test_get_column (line 40) | def test_get_column(self, aset_samples_initialized_repo, array5by7):
method test_remove_column (line 63) | def test_remove_column(self, aset_backend, aset_samples_initialized_re...
method test_init_again (line 89) | def test_init_again(self, aset_backend, repo, randomsizedarray):
method test_column_with_more_dimension (line 97) | def test_column_with_more_dimension(self, aset_backend, repo):
method test_column_with_empty_dimension (line 112) | def test_column_with_empty_dimension(self, aset_backend, repo):
method test_column_with_int_specifier_as_dimension (line 130) | def test_column_with_int_specifier_as_dimension(self, aset_backend, re...
method test_getattr_does_not_raise_permission_error_if_alive (line 150) | def test_getattr_does_not_raise_permission_error_if_alive(self, aset_b...
class TestDataWithFixedSizedColumn (line 173) | class TestDataWithFixedSizedColumn(object):
method test_column_remote_references_property_with_none (line 178) | def test_column_remote_references_property_with_none(
method test_column_remote_references_property_with_remotes (line 198) | def test_column_remote_references_property_with_remotes(
method test_iterating_over (line 230) | def test_iterating_over(self, aset1_backend, aset2_backend, aset3_back...
method test_iterating_over_local_only (line 280) | def test_iterating_over_local_only(self, aset1_backend, aset2_backend,...
method test_get_data (line 363) | def test_get_data(self, aset_samples_initialized_repo, array5by7):
method test_get_sample_with_default_works (line 372) | def test_get_sample_with_default_works(self, aset_samples_initialized_...
method test_get_multiple_samples_fails (line 380) | def test_get_multiple_samples_fails(self, aset_samples_initialized_rep...
method test_getitem_multiple_samples_missing_key (line 399) | def test_getitem_multiple_samples_missing_key(self, aset_samples_initi...
method test_get_multiple_samples_missing_key (line 413) | def test_get_multiple_samples_missing_key(self, aset_samples_initializ...
method test_add_data_str_keys (line 425) | def test_add_data_str_keys(self, aset_samples_initialized_repo, array5...
method test_add_data_int_keys (line 440) | def test_add_data_int_keys(self, aset_samples_initialized_repo, array5...
method test_cannot_add_data_negative_int_key (line 454) | def test_cannot_add_data_negative_int_key(self, aset_samples_initializ...
method test_cannot_add_data_float_key (line 462) | def test_cannot_add_data_float_key(self, aset_samples_initialized_repo...
method test_add_data_mixed_int_str_keys (line 472) | def test_add_data_mixed_int_str_keys(self, aset_samples_initialized_re...
method test_cannot_add_data_sample_name_longer_than_64_characters (line 492) | def test_cannot_add_data_sample_name_longer_than_64_characters(self, a...
method test_add_with_wrong_argument_order (line 500) | def test_add_with_wrong_argument_order(self, aset_samples_initialized_...
method test_update_with_dict_single_item (line 505) | def test_update_with_dict_single_item(self, aset_samples_initialized_w...
method test_update_with_dict_multiple_items (line 511) | def test_update_with_dict_multiple_items(self, aset_samples_initialize...
method test_update_with_list_single_item (line 521) | def test_update_with_list_single_item(self, aset_samples_initialized_w...
method test_update_with_list_multiple_items (line 531) | def test_update_with_list_multiple_items(self, aset_samples_initialize...
method test_update_with_only_kwargs_single_item (line 541) | def test_update_with_only_kwargs_single_item(self, aset_samples_initia...
method test_update_with_only_kwargs_multiple_items (line 546) | def test_update_with_only_kwargs_multiple_items(self, aset_samples_ini...
method test_update_with_list_and_kwargs (line 552) | def test_update_with_list_and_kwargs(self, aset_samples_initialized_w_...
method test_update_with_dict_and_kwargs (line 563) | def test_update_with_dict_and_kwargs(self, aset_samples_initialized_w_...
method test_update_with_dict_and_kwargs_does_not_modify_input_in_calling_scopy (line 574) | def test_update_with_dict_and_kwargs_does_not_modify_input_in_calling_...
method test_update_with_invalid_data_map_fails (line 609) | def test_update_with_invalid_data_map_fails(self, aset_samples_initial...
method test_setitem_with_invalid_data_map_fails (line 626) | def test_setitem_with_invalid_data_map_fails(self, aset_samples_initia...
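The `update` test names above (dict single/multiple items, list of pairs, kwargs only, dict + kwargs, input not modified in the calling scope) suggest the column's `update` method follows Python's standard `MutableMapping.update` contract. A minimal sketch of that contract, using a plain `dict` as a stand-in for the column object (the column API itself is not shown in this index, so this is an illustration of the contract, not Hangar's implementation):

```python
# Stand-in mapping: a plain dict already implements the
# MutableMapping.update contract the test names exercise.
col = {}

# update with a dict (single and multiple items)
col.update({"s1": 1})
col.update({"s2": 2, "s3": 3})

# update with an iterable of (key, value) pairs
col.update([("s4", 4), ("s5", 5)])

# update with kwargs only, and dict + kwargs combined
col.update(s6=6)
src = {"s7": 7}
col.update(src, s8=8)

# the input mapping is not modified in the calling scope
assert src == {"s7": 7}
assert sorted(col) == ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]
```

The "invalid data map fails" entries that follow correspond to the error half of the same contract: passing a non-iterable, or an iterable whose elements are not key/value pairs, raises rather than silently updating.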
method test_add_multiple_data_single_commit (line 631) | def test_add_multiple_data_single_commit(self, aset_samples_initialize...
method test_add_same_data_same_key_does_not_duplicate_hash (line 646) | def test_add_same_data_same_key_does_not_duplicate_hash(self, aset_sam...
method test_multiple_data_multiple_commit (line 658) | def test_multiple_data_multiple_commit(self, aset_samples_initialized_...
method test_added_but_not_commited (line 679) | def test_added_but_not_commited(self, aset_samples_initialized_repo, a...
method test_remove_data (line 693) | def test_remove_data(self, aset_samples_initialized_repo, array5by7):
method test_remove_data_multiple_items (line 723) | def test_remove_data_multiple_items(self, aset_samples_initialized_rep...
method test_pop_data (line 757) | def test_pop_data(self, aset_samples_initialized_repo, array5by7):
method test_pop_data_multiple_items (line 792) | def test_pop_data_multiple_items(self, aset_samples_initialized_repo, ...
method test_remove_all_data (line 826) | def test_remove_all_data(self, aset_samples_initialized_repo, array5by7):
method test_remove_data_nonexistant_sample_key_raises (line 869) | def test_remove_data_nonexistant_sample_key_raises(self, aset_samples_...
method test_multiple_columns_single_commit (line 882) | def test_multiple_columns_single_commit(
method test_prototype_and_shape (line 899) | def test_prototype_and_shape(self, aset1_backend, aset2_backend, repo,...
method test_samples_without_name (line 915) | def test_samples_without_name(self, repo, randomsizedarray):
method test_append_samples (line 927) | def test_append_samples(self, repo, randomsizedarray):
method test_different_data_types_and_shapes (line 940) | def test_different_data_types_and_shapes(self, repo):
method test_add_sample_with_non_numpy_array_data_fails (line 959) | def test_add_sample_with_non_numpy_array_data_fails(self, aset_samples...
method test_add_sample_with_fortran_order_data_fails (line 965) | def test_add_sample_with_fortran_order_data_fails(self, aset_samples_i...
method test_add_sample_with_dimension_rank_fails (line 971) | def test_add_sample_with_dimension_rank_fails(self, repo):
method test_add_sample_with_dimension_exceeding_max_fails (line 979) | def test_add_sample_with_dimension_exceeding_max_fails(self, repo):
method test_writer_context_manager_column_add_sample (line 988) | def test_writer_context_manager_column_add_sample(self, aset_backend, ...
method test_column_context_manager_aset_sample_add (line 1000) | def test_column_context_manager_aset_sample_add(self, aset_backend, re...
method test_writer_column_properties_are_correct (line 1014) | def test_writer_column_properties_are_correct(self, aset_samples_initi...
method test_reader_column_properties_are_correct (line 1031) | def test_reader_column_properties_are_correct(self, aset_samples_initi...
method test_iter_column_samples_yields_keys (line 1047) | def test_iter_column_samples_yields_keys(self, aset_samples_initialize...
method test_iter_columns_yields_aset_names (line 1059) | def test_iter_columns_yields_aset_names(self, repo_20_filled_samples):
method test_set_item_column_fails (line 1065) | def test_set_item_column_fails(self, aset_samples_initialized_repo):
class TestVariableSizedColumn (line 1072) | class TestVariableSizedColumn(object):
method test_write_all_zeros_same_size_different_shape_does_not_store_as_identical_hashs (line 1083) | def test_write_all_zeros_same_size_different_shape_does_not_store_as_i...
method test_writer_can_create_variable_size_column (line 1153) | def test_writer_can_create_variable_size_column(
method test_reader_recieves_expected_values_for_variable_size_column (line 1185) | def test_reader_recieves_expected_values_for_variable_size_column(
method test_writer_reader_can_create_read_multiple_variable_size_column (line 1221) | def test_writer_reader_can_create_read_multiple_variable_size_column(
method test_writer_column_properties_are_correct (line 1254) | def test_writer_column_properties_are_correct(self, aset_samples_var_s...
method test_reader_column_properties_are_correct (line 1269) | def test_reader_column_properties_are_correct(self, aset_samples_var_s...
class TestMultiprocessColumnReads (line 1285) | class TestMultiprocessColumnReads(object):
method test_external_multi_process_pool (line 1288) | def test_external_multi_process_pool(self, repo, backend):
method test_external_multi_process_pool_fails_on_write_enabled_checkout (line 1323) | def test_external_multi_process_pool_fails_on_write_enabled_checkout(s...
method test_multiprocess_get_succeeds_on_superset_and_subset_of_keys (line 1345) | def test_multiprocess_get_succeeds_on_superset_and_subset_of_keys(self...
method test_writer_iterating_over_keys_can_have_additions_made_no_error (line 1381) | def test_writer_iterating_over_keys_can_have_additions_made_no_error(s...
method test_writer_iterating_over_values_can_have_additions_made_no_error (line 1401) | def test_writer_iterating_over_values_can_have_additions_made_no_error...
method test_writer_iterating_over_items_can_have_additions_made_no_error (line 1422) | def test_writer_iterating_over_items_can_have_additions_made_no_error(...
method test_reader_iterating_over_items_can_not_make_additions (line 1445) | def test_reader_iterating_over_items_can_not_make_additions(self, two_...
FILE: tests/test_column_backends.py
function test_backend_property_reports_correct_backend (line 7) | def test_backend_property_reports_correct_backend(repo, array5by7, backe...
function test_setting_backend_property_cannot_change_backend (line 23) | def test_setting_backend_property_cannot_change_backend(repo, array5by7,...
function test_setting_backend_opts_property_cannot_change_backend_opts (line 44) | def test_setting_backend_opts_property_cannot_change_backend_opts(repo, ...
function test_heuristics_select_backend (line 80) | def test_heuristics_select_backend(repo, shape, dtype, variable_shape, e...
function test_manual_override_heuristics_select_backend (line 111) | def test_manual_override_heuristics_select_backend(repo, prototype, back...
function test_manual_override_heuristics_invalid_value_raises_error (line 139) | def test_manual_override_heuristics_invalid_value_raises_error(repo):
function test_manual_change_backends_after_write_works (line 150) | def test_manual_change_backends_after_write_works(repo, array5by7, backe...
function test_manual_change_backend_to_invalid_fmt_code_fails (line 196) | def test_manual_change_backend_to_invalid_fmt_code_fails(repo, array5by7...
function test_manual_change_backend_fails_while_in_cm (line 239) | def test_manual_change_backend_fails_while_in_cm(repo, array5by7, backen...
function dummy_writer_checkout (line 297) | def dummy_writer_checkout(classrepo):
class TestComplibRestrictions (line 303) | class TestComplibRestrictions:
method test_schema_smaller_16_bytes_cannot_select_blosc_backend (line 317) | def test_schema_smaller_16_bytes_cannot_select_blosc_backend(
function test_schema_smaller_16_bytes_does_not_use_heuristic_to_select_blosc (line 346) | def test_schema_smaller_16_bytes_does_not_use_heuristic_to_select_blosc(
function test_schema_smaller_16_bytes_cannot_change_to_blosc_backend (line 375) | def test_schema_smaller_16_bytes_cannot_change_to_blosc_backend(
FILE: tests/test_column_definition_permutations.py
function assert_equal (line 10) | def assert_equal(expected, actual):
function ndarray_generate_data_fixed_shape (line 20) | def ndarray_generate_data_fixed_shape(shape, dtype, low=0, high=255):
function ndarray_generate_data_variable_shape (line 25) | def ndarray_generate_data_variable_shape(shape, dtype, low=0, high=255):
function str_generate_data_variable_shape (line 36) | def str_generate_data_variable_shape(
function bytes_generate_data_variable_shape (line 44) | def bytes_generate_data_variable_shape(
function add_data_to_column (line 88) | def add_data_to_column(col, data_gen, nsamples, nsubsamples=None):
function num_samples_gen (line 105) | def num_samples_gen(request):
function num_subsamples_gen (line 110) | def num_subsamples_gen(request):
function column_permutation_repo (line 115) | def column_permutation_repo(repo, num_samples_gen, num_subsamples_gen):
function column_permutations_read_write_checkout (line 172) | def column_permutations_read_write_checkout(request, column_permutation_...
function column_permutations_write_checkout (line 180) | def column_permutations_write_checkout(column_permutation_repo):
function test_cannot_create_column_within_cm (line 193) | def test_cannot_create_column_within_cm(repo, column_type, column_kwargs...
function test_contains_subsamples_non_bool_value_fails (line 216) | def test_contains_subsamples_non_bool_value_fails(repo, column_type, col...
function test_cannot_create_column_name_exists (line 239) | def test_cannot_create_column_name_exists(repo, column_type, column_kwar...
function test_read_data_from_column_permutations (line 268) | def test_read_data_from_column_permutations(column_permutations_read_wri...
function test_write_data_to_column_permutations (line 289) | def test_write_data_to_column_permutations(
function test_merge_write_data_to_column_permutations (line 330) | def test_merge_write_data_to_column_permutations(
FILE: tests/test_column_nested.py
function assert_equal (line 11) | def assert_equal(arr, arr2):
class TestArraysetSetup (line 19) | class TestArraysetSetup:
method test_does_not_allow_invalid_arrayset_names (line 24) | def test_does_not_allow_invalid_arrayset_names(self, repo, randomsized...
method test_read_only_mode_arrayset_methods_limited (line 30) | def test_read_only_mode_arrayset_methods_limited(self, aset_subsamples...
method test_get_arrayset_in_read_and_write_checkouts (line 43) | def test_get_arrayset_in_read_and_write_checkouts(self, aset_subsample...
method test_delete_arrayset (line 61) | def test_delete_arrayset(self, aset_backend, aset_subsamples_initializ...
method test_init_same_arrayset_twice_fails_again (line 95) | def test_init_same_arrayset_twice_fails_again(self, aset_backend, repo...
method test_arrayset_with_invalid_dimension_sizes_shapes (line 110) | def test_arrayset_with_invalid_dimension_sizes_shapes(self, aset_backe...
function multi_item_generator (line 135) | def multi_item_generator(request):
function iterable_subsamples (line 145) | def iterable_subsamples(request, multi_item_generator):
function iterable_samples (line 186) | def iterable_samples(request, multi_item_generator, iterable_subsamples):
function backend_params (line 209) | def backend_params(request):
function subsample_writer_written_aset (line 214) | def subsample_writer_written_aset(backend_params, repo, monkeypatch):
class TestAddData (line 231) | class TestAddData:
method test_update_sample_subsamples_empty_arrayset (line 233) | def test_update_sample_subsamples_empty_arrayset(self, subsample_write...
method test_update_sample_kwargs_only_empty_arrayset (line 241) | def test_update_sample_kwargs_only_empty_arrayset(self, subsample_writ...
method test_update_sample_kwargs_and_other_dict_doesnt_modify_input_in_calling_scope (line 258) | def test_update_sample_kwargs_and_other_dict_doesnt_modify_input_in_ca...
method test_update_sample_kwargs_and_iterably_empty_arrayset (line 280) | def test_update_sample_kwargs_and_iterably_empty_arrayset(
method test_update_sample_subsamples_duplicate_data_does_not_save_new (line 291) | def test_update_sample_subsamples_duplicate_data_does_not_save_new(
method test_update_sample_subsamples_context_manager (line 306) | def test_update_sample_subsamples_context_manager(self, subsample_writ...
method test_setitem_sample_subsamples_empty_arrayset (line 319) | def test_setitem_sample_subsamples_empty_arrayset(
method test_setitem_sample_subsamples_contextmanager (line 334) | def test_setitem_sample_subsamples_contextmanager(
method test_update_subsamples_empty_arrayset (line 354) | def test_update_subsamples_empty_arrayset(self, multi_item_generator, ...
method test_update_subsamples_via_kwargs_empty_arrayset (line 369) | def test_update_subsamples_via_kwargs_empty_arrayset(self, multi_item_...
method test_update_subsamples_kwargs_and_other_dict_doesnt_modify_input_in_calling_scopy (line 382) | def test_update_subsamples_kwargs_and_other_dict_doesnt_modify_input_i...
method test_update_subsamples_via_kwargs_and_iterable_empty_arrayset (line 408) | def test_update_subsamples_via_kwargs_and_iterable_empty_arrayset(
method test_update_subsamples_context_manager (line 425) | def test_update_subsamples_context_manager(
method test_setitem_sample_empty_arrayset (line 450) | def test_setitem_sample_empty_arrayset(
method test_setitem_sample_setitem_subsample_empty_arrayset_fails (line 469) | def test_setitem_sample_setitem_subsample_empty_arrayset_fails(self, s...
method test_setitem_subsamples_contextmanager (line 484) | def test_setitem_subsamples_contextmanager(self, multi_item_generator,...
method test_append_subsamples_empty_arrayset (line 506) | def test_append_subsamples_empty_arrayset(self, multi_item_generator, ...
method test_append_subsamples_contextmanager (line 519) | def test_append_subsamples_contextmanager(self, multi_item_generator, ...
method test_update_noniterable_subsample_iter_fails (line 540) | def test_update_noniterable_subsample_iter_fails(self, backend, other,...
method test_update_subsamples_with_too_many_arguments_fails (line 554) | def test_update_subsamples_with_too_many_arguments_fails(self, backend...
method test_update_subsamples_with_too_few_arguments_fails (line 568) | def test_update_subsamples_with_too_few_arguments_fails(self, backend,...
method test_update_noniterable_samples_fails (line 597) | def test_update_noniterable_samples_fails(self, other, subsample_write...
method test_update_noniterable_subsamples_fails (line 613) | def test_update_noniterable_subsamples_fails(self, other, subsample_wr...
method test_update_invalid_sample_key_fails (line 629) | def test_update_invalid_sample_key_fails(self, other, subsample_writer...
method test_update_sample_invalid_subsample_key_fails (line 645) | def test_update_sample_invalid_subsample_key_fails(self, other, subsam...
method test_update_sample_invalid_array_fails_fixed_shape (line 662) | def test_update_sample_invalid_array_fails_fixed_shape(self, backend, ...
method test_update_subsample_invalid_subsample_key_fails (line 682) | def test_update_subsample_invalid_subsample_key_fails(self, other, sub...
method test_update_subsample_invalid_array_fails_fixed_shape (line 701) | def test_update_subsample_invalid_array_fails_fixed_shape(self, backen...
function subsample_data_map (line 719) | def subsample_data_map():
function backend_param (line 736) | def backend_param(request):
function write_enabled (line 741) | def write_enabled(request):
function initialized_arrayset (line 746) | def initialized_arrayset(write_enabled, backend_param, classrepo, subsam...
function initialized_arrayset_write_only (line 765) | def initialized_arrayset_write_only(backend_param, repo, subsample_data_...
class TestRemoveData (line 774) | class TestRemoveData:
method test_delitem_single_sample_from_arrayset (line 778) | def test_delitem_single_sample_from_arrayset(self, initialized_arrayse...
method test_delitem_single_subsample_from_sample (line 784) | def test_delitem_single_subsample_from_sample(self, initialized_arrays...
method test_delitem_sample_nonexisting_keys_fails (line 790) | def test_delitem_sample_nonexisting_keys_fails(self, initialized_array...
method test_delitem_single_subsample_nonexisting_key_fails (line 797) | def test_delitem_single_subsample_nonexisting_key_fails(self, initiali...
method test_delitem_multiple_samples_fails_keyerror (line 806) | def test_delitem_multiple_samples_fails_keyerror(self, initialized_arr...
method test_pop_single_sample_from_arrayset (line 815) | def test_pop_single_sample_from_arrayset(self, initialized_arrayset_wr...
method test_pop_multiple_samples_from_arrayset_fails (line 824) | def test_pop_multiple_samples_from_arrayset_fails(self, initialized_ar...
method test_pop_single_subsample_from_sample (line 831) | def test_pop_single_subsample_from_sample(self, initialized_arrayset_w...
method test_pop_multiple_subsample_from_sample_fails (line 838) | def test_pop_multiple_subsample_from_sample_fails(self, initialized_ar...
class TestContainerIntrospection (line 849) | class TestContainerIntrospection:
method test_get_sample_returns_object (line 851) | def test_get_sample_returns_object(self, initialized_arrayset, subsamp...
method test_get_sample_test_subsample_len_method (line 862) | def test_get_sample_test_subsample_len_method(self, initialized_arrays...
method test_get_sample_test_subsample_contains_method (line 868) | def test_get_sample_test_subsample_contains_method(self, initialized_a...
method test_sample_len_reported_correctly (line 875) | def test_sample_len_reported_correctly(self, initialized_arrayset, sub...
method test_get_sample_test_subsample_sample_property (line 882) | def test_get_sample_test_subsample_sample_property(self, initialized_a...
method test_get_sample_test_subsample_arrayset_property (line 888) | def test_get_sample_test_subsample_arrayset_property(self, initialized...
method test_get_sample_test_data_property (line 894) | def test_get_sample_test_data_property(self, initialized_arrayset, sub...
method test_get_sample_test_subsample_contains_remote_references_property (line 904) | def test_get_sample_test_subsample_contains_remote_references_property(
method test_get_sample_test_subsample_remote_reference_keys_property (line 929) | def test_get_sample_test_subsample_remote_reference_keys_property(self...
method test_getattr_does_not_raise_permission_error_if_alive (line 952) | def test_getattr_does_not_raise_permission_error_if_alive(self, initia...
class TestGetDataMethods (line 988) | class TestGetDataMethods:
method test_get_sample_missing_key (line 990) | def test_get_sample_missing_key(self, initialized_arrayset):
method test_getitem_sample_missing_key (line 997) | def test_getitem_sample_missing_key(self, initialized_arrayset):
method test_get_sample_get_subsample (line 1002) | def test_get_sample_get_subsample(self, initialized_arrayset, subsampl...
method test_getitem_sample_getitem_subsample (line 1010) | def test_getitem_sample_getitem_subsample(self, initialized_arrayset, ...
method test_getitem_subsample_from_column (line 1021) | def test_getitem_subsample_from_column(self, initialized_arrayset, sub...
method test_recursive_subsample_getitem_from_column (line 1034) | def test_recursive_subsample_getitem_from_column(self, initialized_arr...
method test_get_subsample_from_column (line 1041) | def test_get_subsample_from_column(self, initialized_arrayset, subsamp...
method test_get_sample_get_subsample_missing_key (line 1054) | def test_get_sample_get_subsample_missing_key(self, initialized_arrays...
method test_getitem_sample_getitem_subsample_missing_key (line 1071) | def test_getitem_sample_getitem_subsample_missing_key(self, initialize...
method test_get_sample_get_multiple_subsamples_fails (line 1078) | def test_get_sample_get_multiple_subsamples_fails(self, initialized_ar...
method test_get_sample_getitem_single_subsample (line 1085) | def test_get_sample_getitem_single_subsample(self, initialized_arrayse...
method test_get_sample_getitem_single_subsample_missing_key (line 1093) | def test_get_sample_getitem_single_subsample_missing_key(self, initial...
method test_get_sample_getitem_multiple_subsamples_fails (line 1102) | def test_get_sample_getitem_multiple_subsamples_fails(self, initialize...
method test_get_sample_getitem_subsamples_with_ellipsis (line 1109) | def test_get_sample_getitem_subsamples_with_ellipsis(self, initialized...
method test_get_sample_getitem_subsamples_with_keys_and_ellipsis_fails (line 1119) | def test_get_sample_getitem_subsamples_with_keys_and_ellipsis_fails(se...
method test_get_sample_getitem_subsamples_with_unbound_slice (line 1129) | def test_get_sample_getitem_subsamples_with_unbound_slice(self, initia...
method test_get_sample_getitem_subsamples_with_bounded_slice (line 1140) | def test_get_sample_getitem_subsamples_with_bounded_slice(self, initia...
method test_subsample_getitem_with_bounded_slice_from_column (line 1150) | def test_subsample_getitem_with_bounded_slice_from_column(self, initia...
method test_get_sample_getitem_subsamples_with_out_of_bounds_slice_does_not_fail (line 1159) | def test_get_sample_getitem_subsamples_with_out_of_bounds_slice_does_n...
method test_aset_contextmanager (line 1176) | def test_aset_contextmanager(self, initialized_arrayset, subsample_dat...
method test_sample_contextmanager (line 1190) | def test_sample_contextmanager(self, initialized_arrayset, subsample_d...
method test_sample_subsample_contextmanager (line 1205) | def test_sample_subsample_contextmanager(self, initialized_arrayset, s...
method test_sample_reentrant_contextmanager_fails (line 1231) | def test_sample_reentrant_contextmanager_fails(self, initialized_array...
method test_calling_iter_on_arrayset (line 1270) | def test_calling_iter_on_arrayset(self, initialized_arrayset, subsampl...
method test_calling_iter_on_sample_in_arrayset (line 1277) | def test_calling_iter_on_sample_in_arrayset(self, initialized_arrayset...
method test_get_sample_keys_method (line 1289) | def test_get_sample_keys_method(self, initialized_arrayset):
method test_get_sample_keys_method_local_only (line 1298) | def test_get_sample_keys_method_local_only(self, initialized_arrayset):
method test_get_sample_subsample_keys_method (line 1315) | def test_get_sample_subsample_keys_method(self, initialized_arrayset, ...
method test_get_sample_subsample_keys_method_local_only (line 1326) | def test_get_sample_subsample_keys_method_local_only(self, initialized...
method test_get_sample_values_method (line 1359) | def test_get_sample_values_method(self, initialized_arrayset):
method test_get_sample_values_method_local_only (line 1371) | def test_get_sample_values_method_local_only(self, initialized_arrayset):
method test_get_sample_subsample_values_method (line 1390) | def test_get_sample_subsample_values_method(self, initialized_arrayset...
method test_get_sample_subsample_values_method_local_only (line 1401) | def test_get_sample_subsample_values_method_local_only(self, initializ...
method test_get_sample_items_method (line 1434) | def test_get_sample_items_method(self, initialized_arrayset):
method test_get_sample_items_method_local_only (line 1447) | def test_get_sample_items_method_local_only(self, initialized_arrayset):
method test_get_sample_subsample_items_method (line 1467) | def test_get_sample_subsample_items_method(self, initialized_arrayset,...
method test_get_sample_subsample_items_method_local_only (line 1478) | def test_get_sample_subsample_items_method_local_only(self, initialize...
method test_arrayset_remote_references_property_with_none (line 1514) | def test_arrayset_remote_references_property_with_none(
method test_arrayset_remote_references_property_with_remotes (line 1536) | def test_arrayset_remote_references_property_with_remotes(
class TestWriteThenReadCheckout (line 1568) | class TestWriteThenReadCheckout:
method test_add_data_commit_checkout_read_only_contains_same (line 1571) | def test_add_data_commit_checkout_read_only_contains_same(self, backen...
FILE: tests/test_column_pickle.py
function assert_equal (line 6) | def assert_equal(arr, arr2):
function subsample_data_map (line 13) | def subsample_data_map():
function sample_data_map (line 30) | def sample_data_map():
function backend_param (line 43) | def backend_param(request):
function write_enabled (line 48) | def write_enabled(request):
function contains_subsamples (line 53) | def contains_subsamples(request):
function initialized_column (line 58) | def initialized_column(
function initialized_column_read_only (line 82) | def initialized_column_read_only(backend_param, contains_subsamples, cla...
class TestPickleableColumns (line 99) | class TestPickleableColumns:
method test_is_pickleable (line 101) | def test_is_pickleable(self, initialized_column, sample_data_map, subs...
class TestLoadableColumns (line 113) | class TestLoadableColumns:
method test_is_pickle_is_loadable (line 115) | def test_is_pickle_is_loadable(self, initialized_column_read_only, sam...
FILE: tests/test_commit_ref_verification.py
function test_verify_corruption_in_commit_ref_alerts (line 4) | def test_verify_corruption_in_commit_ref_alerts(two_commit_filled_sample...
function test_verify_corruption_in_commit_parent_val_alerts (line 35) | def test_verify_corruption_in_commit_parent_val_alerts(two_commit_filled...
function test_verify_corruption_in_spec_val_alerts (line 65) | def test_verify_corruption_in_spec_val_alerts(two_commit_filled_samples_...
FILE: tests/test_context_management.py
function test_nested_context_manager_does_not_close_all_open (line 11) | def test_nested_context_manager_does_not_close_all_open(repo, backend1, ...
FILE: tests/test_diff.py
class TestReaderWriterDiff (line 5) | class TestReaderWriterDiff(object):
method test_diff_by_commit_and_branch (line 8) | def test_diff_by_commit_and_branch(self, repo_2_br_no_conf, writer):
method test_diff_with_wrong_commit_hash (line 19) | def test_diff_with_wrong_commit_hash(self, repo_2_br_no_conf, writer):
method test_diff_with_wrong_branch_name (line 30) | def test_diff_with_wrong_branch_name(self, repo_1_br_no_conf, writer):
method test_comparing_diffs_of_dev_and_master (line 38) | def test_comparing_diffs_of_dev_and_master(self, repo_1_br_no_conf, wr...
method test_diff_data_samples (line 64) | def test_diff_data_samples(self, repo_1_br_no_conf, writer):
method test_sample_addition_conflict (line 98) | def test_sample_addition_conflict(self, repo_1_br_no_conf, writer):
method test_sample_removal_conflict (line 126) | def test_sample_removal_conflict(self, repo_1_br_no_conf, writer):
method test_sample_mutation_conflict (line 154) | def test_sample_mutation_conflict(self, repo_1_br_no_conf, writer):
method test_aset_addition_conflict (line 178) | def test_aset_addition_conflict(self, aset_samples_initialized_repo, w...
method test_aset_removal_conflict (line 201) | def test_aset_removal_conflict(self, aset_samples_initialized_repo, wr...
method test_aset_mutation_conflict (line 234) | def test_aset_mutation_conflict(self, aset_samples_initialized_repo, w...
method test_commits_inside_cm (line 263) | def test_commits_inside_cm(self, aset_samples_initialized_repo, array5...
class TestWriterDiff (line 289) | class TestWriterDiff(object):
method test_status_and_staged_column (line 291) | def test_status_and_staged_column(self, aset_samples_initialized_repo):
method test_status_and_staged_samples (line 301) | def test_status_and_staged_samples(self, aset_samples_initialized_repo):
method test_status_and_staged_aset (line 322) | def test_status_and_staged_aset(self, aset_samples_initialized_repo):
function test_repo_diff_method_branch_names (line 334) | def test_repo_diff_method_branch_names(aset_samples_initialized_repo):
function test_repo_diff_method_commit_digests (line 363) | def test_repo_diff_method_commit_digests(aset_samples_initialized_repo):
function test_repo_diff_method_one_branch_one_commit_digest (line 394) | def test_repo_diff_method_one_branch_one_commit_digest(aset_samples_init...
FILE: tests/test_diff_staged_summary.py
function test_add_samples_to_existing_column (line 5) | def test_add_samples_to_existing_column(repo_20_filled_samples2):
function test_mutate_sample_values (line 44) | def test_mutate_sample_values(repo_20_filled_samples2):
function test_delete_samples (line 83) | def test_delete_samples(repo_20_filled_samples2):
function test_add_new_column_schema_and_samples (line 120) | def test_add_new_column_schema_and_samples(repo_20_filled_samples2):
function test_add_new_column_schema_and_sample_and_delete_old_column (line 172) | def test_add_new_column_schema_and_sample_and_delete_old_column(repo_20_...
function test_add_new_schema_and_samples_and_change_old_backend (line 237) | def test_add_new_schema_and_samples_and_change_old_backend(repo_20_fille...
FILE: tests/test_initiate.py
function test_imports (line 6) | def test_imports():
function test_starting_up_repo_warns_should_exist_no_args (line 11) | def test_starting_up_repo_warns_should_exist_no_args(managed_tmpdir):
function test_starting_up_repo_warns_should_exist_manual_args (line 24) | def test_starting_up_repo_warns_should_exist_manual_args(managed_tmpdir):
function test_starting_up_repo_does_not_warn_not_exist_manual_args (line 37) | def test_starting_up_repo_does_not_warn_not_exist_manual_args(managed_tm...
function test_initial_read_checkout (line 52) | def test_initial_read_checkout(managed_tmpdir):
function test_initial_arrayset (line 60) | def test_initial_arrayset(managed_tmpdir, randomsizedarray):
function test_empty_commit (line 74) | def test_empty_commit(managed_tmpdir, caplog):
function test_cannot_operate_without_repo_init (line 84) | def test_cannot_operate_without_repo_init(managed_tmpdir):
function test_check_repo_size (line 137) | def test_check_repo_size(repo_20_filled_samples):
function test_force_release_writer_lock (line 149) | def test_force_release_writer_lock(managed_tmpdir, monkeypatch):
function test_force_release_writer_lock_works (line 172) | def test_force_release_writer_lock_works(managed_tmpdir):
function test_repo_summary_does_not_error_before_any_commit_made (line 187) | def test_repo_summary_does_not_error_before_any_commit_made(capfd, manag...
function test_get_ecosystem_details (line 197) | def test_get_ecosystem_details(managed_tmpdir):
function test_inject_repo_version (line 209) | def test_inject_repo_version(monkeypatch):
function test_check_repository_version (line 215) | def test_check_repository_version(aset_samples_initialized_repo):
function test_check_repository_software_version_startup (line 223) | def test_check_repository_software_version_startup(managed_tmpdir):
function test_check_repository_software_version_fails_hangar_version (line 254) | def test_check_repository_software_version_fails_hangar_version(monkeypa...
function test_check_repository_software_version_works_on_newer_hangar_version (line 277) | def test_check_repository_software_version_works_on_newer_hangar_version...
FILE: tests/test_merging.py
function test_merge_fails_with_invalid_branch_name (line 5) | def test_merge_fails_with_invalid_branch_name(repo_1_br_no_conf):
function test_is_ff_merge (line 13) | def test_is_ff_merge(repo_1_br_no_conf):
function test_writer_checkout_ff_merge (line 19) | def test_writer_checkout_ff_merge(repo_1_br_no_conf):
function test_merge_fails_if_changes_staged (line 34) | def test_merge_fails_if_changes_staged(repo_1_br_no_conf):
function test_writer_checkout_merge_fails_if_changes_staged (line 43) | def test_writer_checkout_merge_fails_if_changes_staged(repo_1_br_no_conf):
function test_ff_merge_no_conf_correct_contents_for_name_or_hash_checkout (line 52) | def test_ff_merge_no_conf_correct_contents_for_name_or_hash_checkout(rep...
function test_ff_merge_no_conf_updates_head_commit_of_branches (line 69) | def test_ff_merge_no_conf_updates_head_commit_of_branches(repo_1_br_no_c...
function test_is_3_way_merge (line 87) | def test_is_3_way_merge(repo_2_br_no_conf):
function test_writer_checkout_is_3_way_merge (line 95) | def test_writer_checkout_is_3_way_merge(repo_2_br_no_conf):
function test_3_way_merge_no_conflict_correct_contents (line 105) | def test_3_way_merge_no_conflict_correct_contents(repo_2_br_no_conf):
function test_writer_checkout_3_way_merge_no_conflict_correct_contents (line 130) | def test_writer_checkout_3_way_merge_no_conflict_correct_contents(repo_2...
function test_3_way_merge_no_conflict_and_mutation_correct_contents (line 156) | def test_3_way_merge_no_conflict_and_mutation_correct_contents(repo_2_br...
function test_3_way_merge_updates_head_commit_of_branches (line 197) | def test_3_way_merge_updates_head_commit_of_branches(repo_2_br_no_conf):
function test_writer_checkout_3_way_merge_updates_head_commit_of_branches (line 211) | def test_writer_checkout_3_way_merge_updates_head_commit_of_branches(rep...
class TestArraysetSampleConflicts (line 227) | class TestArraysetSampleConflicts(object):
method test_conflict_additions_same_str_name_different_value (line 229) | def test_conflict_additions_same_str_name_different_value(self, repo_2...
method test_conflict_additions_same_int_name_different_value (line 242) | def test_conflict_additions_same_int_name_different_value(self, repo_2...
method test_conflict_additions_same_str_and_int_name_different_value (line 255) | def test_conflict_additions_same_str_and_int_name_different_value(self...
method test_no_conflict_additions_same_name_and_value (line 269) | def test_no_conflict_additions_same_name_and_value(self, repo_2_br_no_...
method test_conflict_mutations_same_name_different_value (line 287) | def test_conflict_mutations_same_name_different_value(self, repo_2_br_...
method test_conflict_mutation_and_removal (line 304) | def test_conflict_mutation_and_removal(self, repo_2_br_no_conf):
method test_no_conflict_both_removal (line 320) | def test_no_conflict_both_removal(self, repo_2_br_no_conf):
FILE: tests/test_optimized_utils.py
function test_sizeddict_maxsize_property (line 6) | def test_sizeddict_maxsize_property():
function test_sizeddict_setitem_no_overflow_retains_keys_and_len (line 13) | def test_sizeddict_setitem_no_overflow_retains_keys_and_len():
function test_sizeddict_setitem_overflow_truncates_keys_and_len (line 24) | def test_sizeddict_setitem_overflow_truncates_keys_and_len():
function test_sizeddict_update_no_overflow_retains_keys_and_len (line 39) | def test_sizeddict_update_no_overflow_retains_keys_and_len():
function test_sizeddict_updateoverflow_truncates_keys_and_len (line 50) | def test_sizeddict_updateoverflow_truncates_keys_and_len():
function test_sizeddict_get_returns_default_on_missing_key (line 65) | def test_sizeddict_get_returns_default_on_missing_key():
function test_sizeddict_delitem (line 73) | def test_sizeddict_delitem():
function test_sizeddict_pop (line 87) | def test_sizeddict_pop():
function test_sizeddict_popitem (line 100) | def test_sizeddict_popitem():
function test_sizeddict_keys (line 119) | def test_sizeddict_keys():
function test_sizeddict_values (line 129) | def test_sizeddict_values():
function test_sizeddict_keys (line 139) | def test_sizeddict_keys():
function test_sizeddict_setdefault (line 149) | def test_sizeddict_setdefault():
function test_sizeddict_clear (line 173) | def test_sizeddict_clear():
function test_sizeddict_repr (line 187) | def test_sizeddict_repr():
function test_sizeddict_is_pickleable (line 197) | def test_sizeddict_is_pickleable():
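The `SizedDict` tests above imply a dict-like container with a fixed `maxsize` that truncates its keys on overflow. The following is a minimal sketch of that behavior inferred purely from the test names; the real implementation lives in hangar's optimized utils module and its eviction order and internals may differ:

```python
from collections import OrderedDict


class SizedDict(OrderedDict):
    """Dict-like container holding at most ``maxsize`` keys.

    Oldest entries are evicted first once the limit is exceeded
    (the FIFO policy here is an assumption, not confirmed by the source).
    """

    def __init__(self, maxsize=1000):
        self._maxsize = maxsize
        super().__init__()

    @property
    def maxsize(self):
        return self._maxsize

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        while len(self) > self._maxsize:
            # evict in insertion order until back under the limit
            self.popitem(last=False)

    def update(self, *args, **kwargs):
        # route updates through __setitem__ so the size limit is enforced
        for k, v in dict(*args, **kwargs).items():
            self[k] = v


d = SizedDict(maxsize=3)
for i in range(5):
    d[i] = i * 10
print(len(d), list(d.keys()))  # 3 [2, 3, 4]
```

This mirrors what `test_sizeddict_setitem_overflow_truncates_keys_and_len` and `test_sizeddict_updateoverflow_truncates_keys_and_len` appear to assert: after overflow, only the newest `maxsize` keys remain.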
FILE: tests/test_remote_serialize.py
function assert_array_equal (line 12) | def assert_array_equal(arr, arr2):
function arr_shape (line 18) | def arr_shape(request):
function arr_dtype (line 22) | def arr_dtype(request):
function ident_digest (line 26) | def ident_digest(request):
function ident_schema (line 30) | def ident_schema(request):
function array_testcase (line 35) | def array_testcase(arr_shape, arr_dtype):
function str_testcase (line 45) | def str_testcase(request):
function bytes_testcase (line 53) | def bytes_testcase(request):
function ident_testcase (line 58) | def ident_testcase(ident_digest, ident_schema):
function test_serialize_deserialize_array (line 62) | def test_serialize_deserialize_array(array_testcase):
function test_serialize_deserialize_str (line 71) | def test_serialize_deserialize_str(str_testcase):
function test_serialize_deserialize_bytes (line 78) | def test_serialize_deserialize_bytes(bytes_testcase):
function test_serialize_deserialize_data (line 90) | def test_serialize_deserialize_data(expected_dtype_code, data):
function test_serialize_deserialize_ident (line 104) | def test_serialize_deserialize_ident(ident_testcase):
function test_serialize_deserialize_record (line 117) | def test_serialize_deserialize_record(array_testcase, ident_testcase):
function test_serialize_deserialize_record_pack (line 132) | def test_serialize_deserialize_record_pack(ident_testcase, nrecords):
function test_serialize_deserialize_ident_digest_field_only (line 166) | def test_serialize_deserialize_ident_digest_field_only(ident_testcase):
function test_serialize_deserialize_ident_schema_field_only (line 179) | def test_serialize_deserialize_ident_schema_field_only(ident_testcase):
function test_serialize_deserialize_ident_only_record_pack (line 193) | def test_serialize_deserialize_ident_only_record_pack(ident_testcase, nr...
function test_serialize_deserialize_ident_only_digest_only_record_pack (line 223) | def test_serialize_deserialize_ident_only_digest_only_record_pack(ident_...
function test_serialize_deserialize_ident_only_schema_only_record_pack (line 253) | def test_serialize_deserialize_ident_only_schema_only_record_pack(ident_...
FILE: tests/test_remotes.py
function test_cannot_add_invalid_remote_names (line 15) | def test_cannot_add_invalid_remote_names(repo, name):
function test_list_all_remotes_works (line 20) | def test_list_all_remotes_works(repo):
function test_cannot_add_remote_twice_with_same_name (line 46) | def test_cannot_add_remote_twice_with_same_name(repo):
function test_remote_remote_which_does_not_exist_fails (line 54) | def test_remote_remote_which_does_not_exist_fails(repo):
function test_can_update_remote_after_removal (line 59) | def test_can_update_remote_after_removal(repo):
function test_server_is_started_multiple_times_via_ping_pong (line 71) | def test_server_is_started_multiple_times_via_ping_pong(server_instance,
function test_push_and_clone_master_linear_history_multiple_commits (line 80) | def test_push_and_clone_master_linear_history_multiple_commits(
function test_server_push_second_branch_with_new_commit (line 136) | def test_server_push_second_branch_with_new_commit(server_instance, repo,
function test_server_push_second_branch_with_new_commit_then_clone_partial_fetch (line 184) | def test_server_push_second_branch_with_new_commit_then_clone_partial_fe...
function array5by7_class (line 288) | def array5by7_class():
function two_branch_multi_commit_repo_class (line 292) | def two_branch_multi_commit_repo_class(server_instance_class, classrepo,...
class TestLargeRemoteServer (line 392) | class TestLargeRemoteServer:
method test_server_push_two_branch_then_clone_fetch_data_options (line 401) | def test_server_push_two_branch_then_clone_fetch_data_options(
function two_multi_format_repo_class (line 489) | def two_multi_format_repo_class(server_instance_class, classrepo):
class TestRemoteServerFetchDataSample (line 527) | class TestRemoteServerFetchDataSample:
method test_server_fetch_data_sample (line 563) | def test_server_fetch_data_sample(
method test_server_fetch_data_sample_commit_not_existing (line 613) | def test_server_fetch_data_sample_commit_not_existing(
method test_server_fetch_data_sample_branch_not_existing (line 635) | def test_server_fetch_data_sample_branch_not_existing(
method test_server_fetch_data_sample_branch_and_commit_args_passed_fails (line 657) | def test_server_fetch_data_sample_branch_and_commit_args_passed_fails(
method test_server_fetch_data_sample_not_existing_fails (line 680) | def test_server_fetch_data_sample_not_existing_fails(
method test_server_fetch_data_sample_not_valid_type (line 716) | def test_server_fetch_data_sample_not_valid_type(
function test_push_unchanged_repo_makes_no_modifications (line 746) | def test_push_unchanged_repo_makes_no_modifications(written_two_cmt_serv...
function test_fetch_unchanged_repo_makes_no_modifications (line 753) | def test_fetch_unchanged_repo_makes_no_modifications(written_two_cmt_ser...
function test_fetch_newer_disk_repo_makes_no_modifications (line 760) | def test_fetch_newer_disk_repo_makes_no_modifications(written_two_cmt_se...
function test_fetch_branch_which_does_not_exist_client_server_raises_rpc_error (line 772) | def test_fetch_branch_which_does_not_exist_client_server_raises_rpc_erro...
function test_fetch_branch_on_client_which_does_not_existserver_raises_rpc_error (line 780) | def test_fetch_branch_on_client_which_does_not_existserver_raises_rpc_er...
function test_push_clone_three_way_merge (line 789) | def test_push_clone_three_way_merge(server_instance, repo_2_br_no_conf, ...
function test_push_restricted_with_right_username_password (line 824) | def test_push_restricted_with_right_username_password(server_instance_pu...
function test_push_restricted_wrong_user_and_password (line 871) | def test_push_restricted_wrong_user_and_password(server_instance_push_re...
FILE: tests/test_repo_integrity_verification.py
function diverse_repo (line 7) | def diverse_repo(repo):
function test_verify_correct (line 79) | def test_verify_correct(diverse_repo):
class TestVerifyCommitRefDigests (line 83) | class TestVerifyCommitRefDigests(object):
method test_remove_array_digest_is_caught (line 85) | def test_remove_array_digest_is_caught(self, diverse_repo):
method test_remove_schema_digest_is_caught (line 104) | def test_remove_schema_digest_is_caught(self, diverse_repo):
class TestVerifyCommitTree (line 123) | class TestVerifyCommitTree(object):
method test_parent_ref_digest_of_cmt_does_not_exist (line 125) | def test_parent_ref_digest_of_cmt_does_not_exist(self, diverse_repo):
method test_parent_ref_references_nonexisting_commits (line 144) | def test_parent_ref_references_nonexisting_commits(self, diverse_repo):
method test_parent_ref_has_two_initial_commits (line 181) | def test_parent_ref_has_two_initial_commits(self, diverse_repo):
class TestBranchIntegrity (line 206) | class TestBranchIntegrity(object):
method test_atleast_one_branch_exists (line 208) | def test_atleast_one_branch_exists(self, diverse_repo):
method test_branch_name_head_commit_digests_exist (line 225) | def test_branch_name_head_commit_digests_exist(self, diverse_repo):
method test_staging_head_branch_name_exists (line 256) | def test_staging_head_branch_name_exists(self, diverse_repo):
function test_data_digest_modification_is_caught (line 273) | def test_data_digest_modification_is_caught(diverse_repo):
function test_data_digest_remote_location_warns (line 294) | def test_data_digest_remote_location_warns(diverse_repo):
function test_schema_digest_modification_is_caught (line 308) | def test_schema_digest_modification_is_caught(diverse_repo):
FILE: tests/test_utils.py
function test_unique_everseen (line 10) | def test_unique_everseen(arg, key, expected):
function test_valid_directory_path_errors_on_invalid_path_arg (line 18) | def test_valid_directory_path_errors_on_invalid_path_arg(pth):
function test_valid_directory_path_recognizes_not_a_directory (line 24) | def test_valid_directory_path_recognizes_not_a_directory(managed_tmpdir):
function test_format_bytes (line 43) | def test_format_bytes(arg, expected):
function test_parse_bytes (line 61) | def test_parse_bytes(arg, expected):
function test_find_next_prime (line 79) | def test_find_next_prime(arg, expected):
FILE: tests/test_version.py
function test_infinity_repr (line 31) | def test_infinity_repr():
function test_negative_infinity_repr (line 35) | def test_negative_infinity_repr():
function test_infinity_hash (line 39) | def test_infinity_hash():
function test_negative_infinity_hash (line 43) | def test_negative_infinity_hash():
function test_infinity_comparison (line 48) | def test_infinity_comparison(left):
function test_negative_infinity_lesser (line 58) | def test_negative_infinity_lesser(left):
function test_infinty_equal (line 67) | def test_infinty_equal():
function test_negative_infinity_equal (line 71) | def test_negative_infinity_equal():
function test_negate_infinity (line 75) | def test_negate_infinity():
function test_negate_negative_infinity (line 79) | def test_negate_negative_infinity():
function test_parse (line 89) | def test_parse(version, klass):
function test_legacy_version_raises (line 93) | def test_legacy_version_raises():
class TestVersion (line 133) | class TestVersion:
method test_valid_versions (line 135) | def test_valid_versions(self, version):
method test_invalid_versions (line 151) | def test_invalid_versions(self, version):
method test_normalized_versions (line 273) | def test_normalized_versions(self, version, normalized):
method test_version_str_repr (line 328) | def test_version_str_repr(self, version, expected):
method test_version_rc_and_c_equals (line 332) | def test_version_rc_and_c_equals(self):
method test_version_hash (line 336) | def test_version_hash(self, version):
method test_version_public (line 373) | def test_version_public(self, version, public):
method test_version_base_version (line 410) | def test_version_base_version(self, version, base_version):
method test_version_epoch (line 447) | def test_version_epoch(self, version, epoch):
method test_version_release (line 484) | def test_version_release(self, version, release):
method test_version_local (line 521) | def test_version_local(self, version, local):
method test_version_pre (line 558) | def test_version_pre(self, version, pre):
method test_version_is_prerelease (line 588) | def test_version_is_prerelease(self, version, expected):
method test_version_dev (line 625) | def test_version_dev(self, version, dev):
method test_version_is_devrelease (line 662) | def test_version_is_devrelease(self, version, expected):
method test_version_post (line 699) | def test_version_post(self, version, post):
method test_version_is_postrelease (line 712) | def test_version_is_postrelease(self, version, expected):
method test_comparison_true (line 755) | def test_comparison_true(self, left, right, op):
method test_comparison_false (line 798) | def test_comparison_false(self, left, right, op):
method test_compare_other (line 802) | def test_compare_other(self, monkeypatch, op, expected):
method test_major_version (line 807) | def test_major_version(self):
method test_minor_version (line 810) | def test_minor_version(self):
method test_micro_version (line 814) | def test_micro_version(self):
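The `Infinity`/`NegativeInfinity` tests at the top of `tests/test_version.py` exercise sentinel objects that compare above (or below) every other value, as used in PEP 440 version-sorting code vendored from the `packaging` library. A minimal sketch of such sentinels (an illustration of the idea, not necessarily hangar's exact vendored source):

```python
class InfinityType:
    """Sentinel that compares greater than any other value."""

    def __repr__(self):
        return "Infinity"

    def __hash__(self):
        return hash(repr(self))

    def __lt__(self, other): return False
    def __le__(self, other): return False
    def __eq__(self, other): return isinstance(other, self.__class__)
    def __gt__(self, other): return True
    def __ge__(self, other): return True

    def __neg__(self):
        return NegativeInfinity


class NegativeInfinityType:
    """Sentinel that compares less than any other value."""

    def __repr__(self):
        return "-Infinity"

    def __hash__(self):
        return hash(repr(self))

    def __lt__(self, other): return True
    def __le__(self, other): return True
    def __eq__(self, other): return isinstance(other, self.__class__)
    def __gt__(self, other): return False
    def __ge__(self, other): return False

    def __neg__(self):
        return Infinity


Infinity = InfinityType()
NegativeInfinity = NegativeInfinityType()

print(repr(Infinity), Infinity > 10**9)  # Infinity True
```

Negating one sentinel yields the other, which matches the `test_negate_infinity` / `test_negate_negative_infinity` pairing above.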
FILE: tests/test_visualizations.py
function verify_out (line 5) | def verify_out(capfd, expected):
function test_flat_merge_graph (line 11) | def test_flat_merge_graph(capfd):
function test_three_way_merge_graph (line 55) | def test_three_way_merge_graph(capfd):
function test_octopus_merge_graph (line 115) | def test_octopus_merge_graph(capfd):
function test_octopus_large_merge_graph (line 222) | def test_octopus_large_merge_graph(capfd):
function test_repo_log_return_contents_correct_default_args (line 366) | def test_repo_log_return_contents_correct_default_args(repo):
function test_repo_log_return_contents_correct_when_specify_branch_name (line 407) | def test_repo_log_return_contents_correct_when_specify_branch_name(repo):
function test_repo_log_return_contents_correct_when_specify_digest (line 448) | def test_repo_log_return_contents_correct_when_specify_digest(repo):
FILE: tests/typesystem/test_ndarray_typesysem.py
class TestInvalidValues (line 8) | class TestInvalidValues:
method test_shape_not_tuple_of_int_less_than_32_dims (line 16) | def test_shape_not_tuple_of_int_less_than_32_dims(self, shape, expecte...
method test_column_type_must_be_ndarray (line 24) | def test_column_type_must_be_ndarray(self, coltype):
method test_column_layout_must_be_valid_value (line 32) | def test_column_layout_must_be_valid_value(self, collayout):
method test_fixed_shape_backend_code_valid_value (line 40) | def test_fixed_shape_backend_code_valid_value(self, backend):
method test_variable_shape_backend_code_valid_value (line 46) | def test_variable_shape_backend_code_valid_value(self, backend):
method test_backend_options_must_be_dict_or_nonetype (line 52) | def test_backend_options_must_be_dict_or_nonetype(self, opts):
method test_backend_must_be_specified_if_backend_options_provided (line 58) | def test_backend_must_be_specified_if_backend_options_provided(self):
method test_variable_shape_must_have_variable_shape_schema_type (line 66) | def test_variable_shape_must_have_variable_shape_schema_type(self, sch...
method test_fixed_shape_must_have_fixed_shape_schema_type (line 72) | def test_fixed_shape_must_have_fixed_shape_schema_type(self, schema_ty...
FILE: tests/typesystem/test_pybytes_typesystem.py
class TestInvalidValues (line 7) | class TestInvalidValues:
method test_column_type_must_be_str (line 10) | def test_column_type_must_be_str(self, coltype):
method test_column_layout_must_be_valid_value (line 15) | def test_column_layout_must_be_valid_value(self, collayout):
method test_variable_shape_backend_code_valid_value (line 20) | def test_variable_shape_backend_code_valid_value(self, backend):
method test_backend_options_must_be_dict_or_nonetype (line 25) | def test_backend_options_must_be_dict_or_nonetype(self, opts):
method test_backend_must_be_specified_if_backend_options_provided (line 29) | def test_backend_must_be_specified_if_backend_options_provided(self):
method test_variable_shape_must_have_variable_shape_schema_type (line 34) | def test_variable_shape_must_have_variable_shape_schema_type(self, sch...
function column_layout (line 43) | def column_layout(request):
function backend (line 48) | def backend(request):
function backend_options (line 53) | def backend_options(request):
function valid_schema (line 58) | def valid_schema(column_layout, backend, backend_options):
class TestValidSchema (line 64) | class TestValidSchema:
method test_valid_data (line 72) | def test_valid_data(self, valid_schema, data):
method test_data_over_2MB_size_not_allowed (line 77) | def test_data_over_2MB_size_not_allowed(self, valid_schema):
FILE: tests/typesystem/test_pystr_typesystem.py
class TestInvalidValues (line 9) | class TestInvalidValues:
method test_column_type_must_be_str (line 12) | def test_column_type_must_be_str(self, coltype):
method test_column_layout_must_be_valid_value (line 17) | def test_column_layout_must_be_valid_value(self, collayout):
method test_variable_shape_backend_code_valid_value (line 22) | def test_variable_shape_backend_code_valid_value(self, backend):
method test_backend_options_must_be_dict_or_nonetype (line 27) | def test_backend_options_must_be_dict_or_nonetype(self, opts):
method test_backend_must_be_specified_if_backend_options_provided (line 31) | def test_backend_must_be_specified_if_backend_options_provided(self):
method test_variable_shape_must_have_variable_shape_schema_type (line 36) | def test_variable_shape_must_have_variable_shape_schema_type(self, sch...
function column_layout (line 45) | def column_layout(request):
function backend (line 50) | def backend(request):
function backend_options (line 55) | def backend_options(request):
function valid_schema (line 60) | def valid_schema(column_layout, backend, backend_options):
class TestValidSchema (line 66) | class TestValidSchema:
method test_valid_data (line 72) | def test_valid_data(self, valid_schema, data):
method test_large_unicode_codepoints_strings_compatible (line 78) | def test_large_unicode_codepoints_strings_compatible(self, valid_schem...
method test_strings_over_2MB_size_not_allowed (line 83) | def test_strings_over_2MB_size_not_allowed(self, valid_schema):
Condensed preview — 190 files, each showing path, character count, and a content snippet.
[
{
"path": ".bumpversion.cfg",
"chars": 840,
"preview": "[bumpversion]\ncurrent_version = 0.5.2\ncommit = True\ntag = False\nparse = (?P<major>\\d+)\\.(?P<minor>\\d+)\\.(?P<patch>\\d+)(\\"
},
{
"path": ".coveragerc",
"chars": 387,
"preview": "[paths]\nsource =\n src\n\n[run]\nbranch = True\nparallel = True\nsource =\n hangar\n tests\nomit =\n */hangar/__main__."
},
{
"path": ".editorconfig",
"chars": 215,
"preview": "# see http://editorconfig.org\nroot = true\n\n[*]\nend_of_line = lf\ntrim_trailing_whitespace = true\ninsert_final_newline = t"
},
{
"path": ".gitattributes",
"chars": 79,
"preview": "* text=auto\n\n*.bat eol=crlf\n*.cmd eol=crlf\n*.ps1 eol=lf\n*.sh eol=lf\n*.rtf -text"
},
{
"path": ".github/ISSUE_TEMPLATE/bug_report.md",
"chars": 856,
"preview": "---\nname: Bug report\nabout: Create a report to help us improve\ntitle: \"[BUG REPORT]\"\nlabels: 'Bug: Awaiting Priority Ass"
},
{
"path": ".github/ISSUE_TEMPLATE/feature_request.md",
"chars": 621,
"preview": "---\nname: Feature request\nabout: Suggest an idea for this project\ntitle: \"[FEATURE REQUEST]\"\nlabels: enhancement\nassigne"
},
{
"path": ".github/ISSUE_TEMPLATE/questions_and_documentation.md",
"chars": 943,
"preview": "---\nname: Questions and Documentation\nabout: Is something confusing? The documentation not clear? We can help\ntitle: \"[Q"
},
{
"path": ".github/PULL_REQUEST_TEMPLATE.md",
"chars": 1440,
"preview": "## Motivation and Context\n#### _Why is this change required? What problem does it solve?:_\n\n\n\n#### _If it fixes an open "
},
{
"path": ".github/workflows/asvbench.yml",
"chars": 1448,
"preview": "name: ASV Benchmarking\n\non:\n pull_request:\n branches:\n - master\n\njobs:\n run_benchmarks:\n runs-on: ${{ matrix."
},
{
"path": ".github/workflows/release.yml",
"chars": 6408,
"preview": "name: release\n\non:\n release:\n types: [published, prereleased]\n\njobs:\n build-linux-cp36:\n runs-on: ubuntu-latest\n"
},
{
"path": ".github/workflows/testsphinx.yml",
"chars": 653,
"preview": "name: Build Sphinx Docs\n\non:\n pull_request:\n branches:\n - master\n push:\n branches:\n - master\n\njobs:\n "
},
{
"path": ".github/workflows/testsuite.yml",
"chars": 2087,
"preview": "name: Run Test Suite\n\non:\n pull_request:\n branches:\n - master\n push:\n branches:\n - master\n\njobs:\n run"
},
{
"path": ".gitignore",
"chars": 897,
"preview": "*.py[cod]\n\n# C extensions\n*.c\n*.so\ncython_debug/\n\n# cython annotation files\nsrc/hangar/backends/*.html\ndocs/_build\n\n# Pa"
},
{
"path": ".readthedocs.yml",
"chars": 636,
"preview": "# .readthedocs.yml\n# Read the Docs configuration file\n# See https://docs.readthedocs.io/en/stable/config-file/v2.html fo"
},
{
"path": "AUTHORS.rst",
"chars": 174,
"preview": "Authors\n=======\n\n* Richard Izzo - rick@tensorwerk.com\n* Luca Antiga - luca@tensorwerk.com\n* Sherin Thomas - sherin@tenso"
},
{
"path": "CHANGELOG.rst",
"chars": 19774,
"preview": "==========\nChange Log\n==========\n\n\n_`In-Progress`\n==============\n\nImprovements\n------------\n\n* New API design for datase"
},
{
"path": "CODE_OF_CONDUCT.rst",
"chars": 3512,
"preview": "===========================\nContributor Code of Conduct\n===========================\n\nOur Pledge\n----------\n\nIn the inter"
},
{
"path": "CONTRIBUTING.rst",
"chars": 2818,
"preview": "============\nContributing\n============\n\nContributions are welcome, and they are greatly appreciated! Every\nlittle bit he"
},
{
"path": "LICENSE",
"chars": 10757,
"preview": "\n Apache License\n Version 2.0, January 2004\n "
},
{
"path": "MANIFEST.in",
"chars": 390,
"preview": "graft docs\ngraft src\ngraft tests\n\ninclude .bumpversion.cfg\ninclude .coveragerc\ninclude .editorconfig\n\ninclude AUTHORS.rs"
},
{
"path": "README.rst",
"chars": 4898,
"preview": "========\nOverview\n========\n\n.. start-badges\n\n.. list-table::\n :stub-columns: 1\n\n * - docs\n - |docs|\n * - t"
},
{
"path": "asv_bench/README.rst",
"chars": 6126,
"preview": "Hangar Performance Benchmarking Suite\n=====================================\n\nA set of benchmarking tools are included in"
},
{
"path": "asv_bench/asv.conf.json",
"chars": 6944,
"preview": "{\n // The version of the config file format. Do not change, unless\n // you know what you are doing.\n \"version\""
},
{
"path": "asv_bench/benchmarks/__init__.py",
"chars": 1,
"preview": "\n"
},
{
"path": "asv_bench/benchmarks/backend_comparisons.py",
"chars": 5649,
"preview": "# Write the benchmarking functions here.\n# See \"Writing benchmarks\" in the asv docs for more information.\nimport numpy a"
},
{
"path": "asv_bench/benchmarks/backends/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "asv_bench/benchmarks/backends/hdf5_00.py",
"chars": 3853,
"preview": "# Write the benchmarking functions here.\n# See \"Writing benchmarks\" in the asv docs for more information.\nimport numpy a"
},
{
"path": "asv_bench/benchmarks/backends/hdf5_01.py",
"chars": 3947,
"preview": "# Write the benchmarking functions here.\n# See \"Writing benchmarks\" in the asv docs for more information.\nimport numpy a"
},
{
"path": "asv_bench/benchmarks/backends/numpy_10.py",
"chars": 3865,
"preview": "# Write the benchmarking functions here.\n# See \"Writing benchmarks\" in the asv docs for more information.\nimport numpy a"
},
{
"path": "asv_bench/benchmarks/commit_and_checkout.py",
"chars": 2707,
"preview": "from tempfile import mkdtemp\nfrom shutil import rmtree\nimport numpy as np\nfrom hangar import Repository\n\n\nclass MakeComm"
},
{
"path": "asv_bench/benchmarks/package.py",
"chars": 173,
"preview": "\n\nclass TimeImport(object):\n\n processes = 2\n repeat = (5, 10, 10.0)\n\n def timeraw_import(self):\n return "
},
{
"path": "codecov.yml",
"chars": 192,
"preview": "comment:\n layout: \"diff, files\"\n behavior: default\n require_changes: false # if true: only post the comment if cover"
},
{
"path": "docs/Tutorial-001.ipynb",
"chars": 43572,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {},\n \"source\": [\n \"## Part 1: Creating A Repository An"
},
{
"path": "docs/Tutorial-002.ipynb",
"chars": 51352,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {},\n \"source\": [\n \"## Part 2: Checkouts, Branching, & "
},
{
"path": "docs/Tutorial-003.ipynb",
"chars": 56307,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {},\n \"source\": [\n \"## Part 3: Working With Remote Serv"
},
{
"path": "docs/Tutorial-Dataset.ipynb",
"chars": 243381,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"colab_type\": \"text\",\n \"id\": \"CQhd0TTQCMeh\"\n },\n"
},
{
"path": "docs/Tutorial-QuickStart.ipynb",
"chars": 30134,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {},\n \"source\": [\n \"## Quick Start Tutorial\"\n ]\n },\n"
},
{
"path": "docs/Tutorial-RealQuickStart.ipynb",
"chars": 33026,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {},\n \"source\": [\n \"## \\\"Real World\\\" Quick Start Tutor"
},
{
"path": "docs/api.rst",
"chars": 3030,
"preview": ".. _ref-api:\n\n==========\nPython API\n==========\n\nThis is the python API for the Hangar project.\n\n\nRepository\n==========\n\n"
},
{
"path": "docs/authors.rst",
"chars": 28,
"preview": ".. include:: ../AUTHORS.rst\n"
},
{
"path": "docs/backends/hdf5_00.rst",
"chars": 79,
"preview": "Local HDF5 Backend\n==================\n\n.. automodule:: hangar.backends.hdf5_00\n"
},
{
"path": "docs/backends/hdf5_01.rst",
"chars": 106,
"preview": "Fixed Shape Optimized Local HDF5\n================================\n\n.. automodule:: hangar.backends.hdf5_01"
},
{
"path": "docs/backends/lmdb_30.rst",
"chars": 117,
"preview": "Variable Shape LMDB String Data Store\n=====================================\n\n.. automodule:: hangar.backends.lmdb_30\n"
},
{
"path": "docs/backends/numpy_10.rst",
"chars": 90,
"preview": "Local NP Memmap Backend\n=======================\n\n.. automodule:: hangar.backends.numpy_10\n"
},
{
"path": "docs/backends/remote_50.rst",
"chars": 103,
"preview": "Remote Server Unknown Backend\n=============================\n\n.. automodule:: hangar.backends.remote_50\n"
},
{
"path": "docs/backends.rst",
"chars": 732,
"preview": ".. _ref-backends:\n\n.. note::\n\n The following documentation contains highly technical descriptions of the\n data writi"
},
{
"path": "docs/benchmarking.rst",
"chars": 36,
"preview": ".. include:: ../asv_bench/README.rst"
},
{
"path": "docs/changelog.rst",
"chars": 30,
"preview": ".. include:: ../CHANGELOG.rst\n"
},
{
"path": "docs/cli.rst",
"chars": 469,
"preview": "Hangar CLI Documentation\n========================\n\nThe CLI described below is automatically available after the Hangar P"
},
{
"path": "docs/codeofconduct.rst",
"chars": 61,
"preview": ".. _ref-code-of-conduct:\n\n.. include:: ../CODE_OF_CONDUCT.rst"
},
{
"path": "docs/concepts.rst",
"chars": 27938,
"preview": ".. _ref-concepts:\n\n####################\nHangar Core Concepts\n####################\n\n.. warning::\n\n The usage info displa"
},
{
"path": "docs/conf.py",
"chars": 3426,
"preview": "# -*- coding: utf-8 -*-\nfrom __future__ import unicode_literals\n\nimport os\n\n\nextensions = [\n 'sphinx.ext.autodoc',\n "
},
{
"path": "docs/contributing.rst",
"chars": 32,
"preview": ".. include:: ../CONTRIBUTING.rst"
},
{
"path": "docs/contributingindex.rst",
"chars": 172,
"preview": ".. _ref-contributing:\n\n######################\nContributing to Hangar\n######################\n\n.. toctree::\n :maxdepth: "
},
{
"path": "docs/design.rst",
"chars": 15509,
"preview": ".. _ref-hangar-under-the-hood:\n\n=====================\nHangar Under The Hood\n=====================\n\nAt its core, Hangar i"
},
{
"path": "docs/externals.rst",
"chars": 321,
"preview": ".. _ref-external:\n\n===============\nHangar External\n===============\n\nHigh level interaction interface between hangar and "
},
{
"path": "docs/faq.rst",
"chars": 11067,
"preview": ".. _ref-faq:\n\n==========================\nFrequently Asked Questions\n==========================\n\nThe following documentat"
},
{
"path": "docs/index.rst",
"chars": 316,
"preview": ".. include:: ../README.rst\n\n.. toctree::\n :maxdepth: 3\n\n readme\n quickstart\n installation\n concepts\n api\n "
},
{
"path": "docs/installation.rst",
"chars": 1854,
"preview": ".. _ref_installation:\n\n============\nInstallation\n============\n\nFor general usage it is recommended that you use a pre-bu"
},
{
"path": "docs/noindexapi/apiinit.rst",
"chars": 233,
"preview": ".. automethod:: hangar.checkout.WriterCheckout.add_ndarray_column\n :noindex:\n\n.. automethod:: hangar.checkout.WriterCh"
},
{
"path": "docs/noindexapi/apiremotefetchdata.rst",
"chars": 65,
"preview": ".. automethod:: hangar.repository.Remotes.fetch_data\n :noindex:"
},
{
"path": "docs/quickstart.rst",
"chars": 208,
"preview": "=====\nUsage\n=====\n\nTo use Hangar in a project::\n\n\tfrom hangar import Repository\n\n\nPlease refer to the :ref:`ref-tutorial"
},
{
"path": "docs/readme.rst",
"chars": 27,
"preview": ".. include:: ../README.rst\n"
},
{
"path": "docs/requirements.txt",
"chars": 96,
"preview": "sphinx>=2.0\nsphinx-material\nsphinx-click\nnbsphinx\nsphinx-copybutton\nrecommonmark\nIPython\nCython\n"
},
{
"path": "docs/requirements_rtd.txt",
"chars": 254,
"preview": "https://files.pythonhosted.org/packages/84/ad/ee890cbea43dd97cbb05aa30b9b08ff908efa8407f514e9d447dd365ef15/tensorflow_cp"
},
{
"path": "docs/spelling_wordlist.txt",
"chars": 109,
"preview": "builtin\nbuiltins\nclassmethod\nstaticmethod\nclassmethods\nstaticmethods\nargs\nkwargs\ncallstack\nChangelog\nIndices\n"
},
{
"path": "docs/tutorial.rst",
"chars": 235,
"preview": ".. _ref-tutorial:\n\n###############\nHangar Tutorial\n###############\n\n.. toctree::\n :maxdepth: 2\n :titlesonly:\n\n Tut"
},
{
"path": "hangar.yml",
"chars": 1214,
"preview": "# Metadata file for Zenoodo source code upload\n# This is part of the Escape 2020 project and was originally\n# requested "
},
{
"path": "mypy.ini",
"chars": 325,
"preview": "# ------------------------- Global Options ------------------------------------\n\n[mypy]\nwarn_unused_configs = True\n\n# --"
},
{
"path": "scripts/run_proto_codegen.py",
"chars": 1829,
"preview": "import os\nfrom shutil import move\n\nfrom grpc_tools import protoc\n\n\n# ------------------------- output locations --------"
},
{
"path": "setup.cfg",
"chars": 525,
"preview": "[bdist_wheel]\nuniversal = 0\n\n\n[flake8]\nmax-line-length = 150\nexclude = */migrations/*\n\n[tool:pytest]\nnorecursedirs =\n "
},
{
"path": "setup.py",
"chars": 7726,
"preview": "#!/usr/bin/env python\n# -*- encoding: utf-8 -*-\nimport os\nimport platform\nimport sys\nfrom os.path import join\nfrom distu"
},
{
"path": "src/hangar/__init__.py",
"chars": 84,
"preview": "__version__ = '0.5.2'\n__all__ = ('Repository',)\n\nfrom .repository import Repository\n"
},
{
"path": "src/hangar/__main__.py",
"chars": 360,
"preview": "\"\"\"\nEntrypoint module, in case you use `python -m hangar`.\n\n\nWhy does this file exist, and why __main__? For more info, "
},
{
"path": "src/hangar/_version.py",
"chars": 13587,
"preview": "# -*- coding: utf-8 -*-\n\"\"\"\nPortions of this code have been taken and modified from the \"packaging\" project.\n\nURL: "
},
{
"path": "src/hangar/backends/__init__.py",
"chars": 5414,
"preview": "\"\"\"Definition and dynamic routing to Hangar backend implementations.\n\nThis module defines the available backends for a H"
},
{
"path": "src/hangar/backends/chunk.py",
"chars": 5248,
"preview": "\"\"\"\nPortions of this code have been taken and modified from the \"PyTables\" project.\n\nURL: https://github.com/PyTabl"
},
{
"path": "src/hangar/backends/hdf5_00.py",
"chars": 33140,
"preview": "\"\"\"Local HDF5 Backend Implementation, Identifier: ``HDF5_00``\n\nBackend Identifiers\n===================\n\n* Backend: ``0`"
},
{
"path": "src/hangar/backends/hdf5_01.py",
"chars": 34247,
"preview": "\"\"\"Local HDF5 Backend Implementation, Identifier: ``HDF5_01``\n\nBackend Identifiers\n===================\n\n* Backend: ``0`"
},
{
"path": "src/hangar/backends/lmdb_30.py",
"chars": 13892,
"preview": "\"\"\"Local LMDB Backend Implementation, Identifier: ``LMDB_30``\n\nBackend Identifiers\n===================\n\n* Backend: ``3`"
},
{
"path": "src/hangar/backends/lmdb_31.py",
"chars": 13790,
"preview": "\"\"\"Local LMDB Backend Implementation, Identifier: ``LMDB_30``\n\nBackend Identifiers\n===================\n\n* Backend: ``3`"
},
{
"path": "src/hangar/backends/numpy_10.py",
"chars": 14813,
"preview": "\"\"\"Local Numpy memmap Backend Implementation, Identifier: ``NUMPY_10``\n\nBackend Identifiers\n===================\n\n* Back"
},
{
"path": "src/hangar/backends/remote_50.py",
"chars": 5338,
"preview": "\"\"\"Remote server location unknown backend, Identifier: ``REMOTE_50``\n\nBackend Identifiers\n===================\n\n* Backen"
},
{
"path": "src/hangar/backends/specparse.pyx",
"chars": 5725,
"preview": "# decoding methods to convert from byte string -> spec struct.\n\nfrom .specs cimport HDF5_01_DataHashSpec, \\\n HDF5_00_"
},
{
"path": "src/hangar/backends/specs.pxd",
"chars": 1069,
"preview": "# header files for spec containers\n\ncdef class HDF5_01_DataHashSpec:\n\n cdef readonly str backend\n cdef readonly st"
},
{
"path": "src/hangar/backends/specs.pyx",
"chars": 4528,
"preview": "# memory efficient container classes for data backends specs.\n# Allow for attribute access similar to named tuples.\n\ncde"
},
{
"path": "src/hangar/bulk_importer.py",
"chars": 42172,
"preview": "\"\"\"Bulk importer methods to ingest large quantities of data into Hangar.\n\nThe following module is designed to address ch"
},
{
"path": "src/hangar/checkout.py",
"chars": 48955,
"preview": "import atexit\nfrom pathlib import Path\nimport weakref\nfrom contextlib import suppress, ExitStack\nfrom uuid import uuid4\n"
},
{
"path": "src/hangar/cli/__init__.py",
"chars": 42,
"preview": "from .cli import main\n\n__all__ = ['main']\n"
},
{
"path": "src/hangar/cli/cli.py",
"chars": 31482,
"preview": "\"\"\"Module that contains the command line app.\n\nWhy does this file exist, and why not put this in __main__?\n\n You might"
},
{
"path": "src/hangar/cli/utils.py",
"chars": 1520,
"preview": "import click\n\n\nclass StrOrIntType(click.ParamType):\n \"\"\"Custom type for click to parse the sample name\n argument t"
},
{
"path": "src/hangar/columns/__init__.py",
"chars": 451,
"preview": "from .column import Columns, ModifierTypes\nfrom .common import ColumnTxn\nfrom .constructors import (\n generate_flat_c"
},
{
"path": "src/hangar/columns/column.py",
"chars": 18504,
"preview": "\"\"\"Constructor and Interaction Class for Columns\n\"\"\"\nfrom contextlib import ExitStack\nfrom pathlib import Path\nfrom typi"
},
{
"path": "src/hangar/columns/common.py",
"chars": 5233,
"preview": "from contextlib import contextmanager\nfrom typing import Optional\n\nimport lmdb\n\nfrom ..txnctx import TxnRegister\n\n\nclass"
},
{
"path": "src/hangar/columns/constructors.py",
"chars": 8472,
"preview": "\"\"\"Constructors for initializing FlatSampleReader and NestedSampleReader columns\n\"\"\"\nimport warnings\nfrom _weakref impor"
},
{
"path": "src/hangar/columns/introspection.py",
"chars": 774,
"preview": "from .layout_flat import FlatSampleReader, FlatSampleWriter\nfrom .layout_nested import (\n FlatSubsampleReader,\n Fl"
},
{
"path": "src/hangar/columns/layout_flat.py",
"chars": 26560,
"preview": "\"\"\"Accessor class for columns containing single-level key/value mappings\n\nThe FlatSampleReader container is used to stor"
},
{
"path": "src/hangar/columns/layout_nested.py",
"chars": 43305,
"preview": "\"\"\"Accessor column containing nested mapping of data under top level keys.\n\"\"\"\nfrom contextlib import ExitStack\nfrom pat"
},
{
"path": "src/hangar/constants.py",
"chars": 1309,
"preview": "from .utils import is_64bits, parse_bytes\n\n# parsing constants\n\nSEP_KEY = ':'\nSEP_LST = ' '\nSEP_CMT = ' << '\nSEP_SLC = \""
},
{
"path": "src/hangar/context.py",
"chars": 8873,
"preview": "import configparser\nimport os\nfrom pathlib import Path\nimport platform\nimport shutil\nimport tempfile\nimport warnings\nfro"
},
{
"path": "src/hangar/dataset/__init__.py",
"chars": 8003,
"preview": "__all__ = ('make_numpy_dataset', 'make_torch_dataset', 'make_tensorflow_dataset')\n\nfrom typing import Sequence, Callable"
},
{
"path": "src/hangar/dataset/common.py",
"chars": 4501,
"preview": "import typing\nfrom typing import Union, Sequence, Tuple, List, Dict\nfrom collections import OrderedDict\n\nfrom ..columns "
},
{
"path": "src/hangar/dataset/numpy_dset.py",
"chars": 9010,
"preview": "from typing import Sequence, Callable, TYPE_CHECKING, Union, List, Tuple\nimport random\n\nimport numpy as np\n\nfrom .common"
},
{
"path": "src/hangar/dataset/tensorflow_dset.py",
"chars": 4721,
"preview": "from typing import Sequence, Callable, List, Tuple, Union\nimport typing\nfrom functools import partial\nimport random\n\ntry"
},
{
"path": "src/hangar/dataset/torch_dset.py",
"chars": 4503,
"preview": "from typing import Sequence, TYPE_CHECKING, Union, List, Tuple\nfrom collections import OrderedDict\n\ntry:\n import torc"
},
{
"path": "src/hangar/diagnostics/__init__.py",
"chars": 71,
"preview": "__version__ = '0.5.2'\n\nfrom .graphing import Graph\n\n__all__ = ['Graph']"
},
{
"path": "src/hangar/diagnostics/ecosystem.py",
"chars": 3258,
"preview": "from typing import Dict, List, Tuple, Union\n\n\nrequired_packages = [\n ('hangar', lambda p: p.__version__),\n ('click"
},
{
"path": "src/hangar/diagnostics/graphing.py",
"chars": 33462,
"preview": "# -*- coding: utf-8 -*-\n\"\"\"\nPortions of this code have been taken and modified from the \"asciidag\" project.\n\nURL: h"
},
{
"path": "src/hangar/diagnostics/integrity.py",
"chars": 7836,
"preview": "from pathlib import Path\nimport warnings\n\nimport lmdb\nfrom tqdm import tqdm\n\nfrom ..records import (\n hash_data_db_ke"
},
{
"path": "src/hangar/diff.py",
"chars": 22959,
"preview": "from itertools import starmap\nfrom typing import Iterable, List, NamedTuple, Set, Tuple, Union\n\nimport lmdb\n\nfrom .recor"
},
{
"path": "src/hangar/external/__init__.py",
"chars": 213,
"preview": "from ._external import load, save, show, board_show\nfrom .plugin_manager import PluginManager\nfrom .base_plugin import B"
},
{
"path": "src/hangar/external/_external.py",
"chars": 6320,
"preview": "\"\"\"\nHigh level methods let user interact with hangar without diving into the internal\nmethods of hangar. We have enabled"
},
{
"path": "src/hangar/external/base_plugin.py",
"chars": 6710,
"preview": "\"\"\"\nHangar's external plugin system is designed to make it flexible for users to\nwrite custom plugins for custom data fo"
},
{
"path": "src/hangar/external/plugin_manager.py",
"chars": 3988,
"preview": "import pkg_resources\nfrom typing import Callable\n\n\nclass PluginManager(object):\n \"\"\"\n Container class that holds t"
},
{
"path": "src/hangar/external_cpython.pxd",
"chars": 495,
"preview": "\"\"\" Additional bindings to Python's C-API.\nThese differ from Cython's bindings in ``cpython``.\n\"\"\"\nfrom cpython.ref cimp"
},
{
"path": "src/hangar/merger.py",
"chars": 10501,
"preview": "\"\"\"Merge Methods\n\nIn the current implementation only fast-forward and a competent, but limited,\nthree-way merge algorith"
},
{
"path": "src/hangar/mixins/__init__.py",
"chars": 203,
"preview": "from .checkout_iteration import CheckoutDictIteration\nfrom .datasetget import GetMixin\nfrom .recorditer import CursorRan"
},
{
"path": "src/hangar/mixins/checkout_iteration.py",
"chars": 1184,
"preview": "\n\nclass CheckoutDictIteration:\n \"\"\"Mixin class for checkout objects which mock common iter methods\n\n Methods\n -"
},
{
"path": "src/hangar/mixins/datasetget.py",
"chars": 8779,
"preview": "from functools import reduce\nfrom operator import getitem as op_getitem\nfrom contextlib import ExitStack\n\n# noinspection"
},
{
"path": "src/hangar/mixins/recorditer.py",
"chars": 3046,
"preview": "from typing import Iterable, Union, Tuple\nimport lmdb\n\n\nclass CursorRangeIterator:\n\n @staticmethod\n def cursor_ran"
},
{
"path": "src/hangar/op_state.py",
"chars": 5088,
"preview": "import types\nimport sys\n\nimport wrapt\n\n\n@wrapt.decorator\ndef writer_checkout_only(wrapped, instance, args, kwargs) -> ty"
},
{
"path": "src/hangar/optimized_utils.pxd",
"chars": 1296,
"preview": "\"\"\"\nPortions of this code have been taken and modified from the \"cytoolz\" project.\n\nURL: https://github.com/pytoolz"
},
{
"path": "src/hangar/optimized_utils.pyx",
"chars": 9817,
"preview": "\"\"\"\nPortions of this code have been taken and modified from the \"cytoolz\" project.\n\nURL: https://github.com/pytoolz"
},
{
"path": "src/hangar/records/__init__.py",
"chars": 402,
"preview": "from .hashmachine import hash_func_from_tcode\nfrom .column_parsers import *\nfrom .recordstructs import (\n CompatibleD"
},
{
"path": "src/hangar/records/column_parsers.pyx",
"chars": 7312,
"preview": "from .recordstructs cimport ColumnSchemaKey, \\\n FlatColumnDataKey, \\\n NestedColumnDataKey, \\\n DataRecordVal\n\nfr"
},
{
"path": "src/hangar/records/commiting.py",
"chars": 23582,
"preview": "import configparser\nimport os\nimport shutil\nimport tempfile\nimport time\nfrom contextlib import contextmanager, closing\nf"
},
{
"path": "src/hangar/records/hashmachine.pyx",
"chars": 4032,
"preview": "import array\nfrom cpython cimport array\n\nimport numpy as np\nfrom hashlib import blake2b\n\n\ncpdef str hash_type_code_from_"
},
{
"path": "src/hangar/records/hashs.py",
"chars": 8523,
"preview": "from pathlib import Path\nfrom typing import Iterable, List, Tuple, Union, Set\n\nimport lmdb\n\nfrom .column_parsers import "
},
{
"path": "src/hangar/records/heads.py",
"chars": 22273,
"preview": "import warnings\nfrom collections import defaultdict\nfrom typing import NamedTuple\n\nimport lmdb\n\nfrom .parsing import (\n "
},
{
"path": "src/hangar/records/parsing.py",
"chars": 15885,
"preview": "import json\nfrom hashlib import blake2b\nfrom itertools import cycle\nfrom random import randint\nfrom time import perf_cou"
},
{
"path": "src/hangar/records/queries.py",
"chars": 10645,
"preview": "from typing import Dict, Iterable, Iterator, List, Set, Tuple, Union, Sequence\n\nimport lmdb\n\nfrom .column_parsers import"
},
{
"path": "src/hangar/records/recordstructs.pxd",
"chars": 501,
"preview": "# header file for record containers\n\ncdef class CompatibleData:\n cdef readonly bint compatible\n cdef readonly str "
},
{
"path": "src/hangar/records/recordstructs.pyx",
"chars": 4618,
"preview": "\ncdef class CompatibleData:\n \"\"\"Bool recording if data `compatible` and if False the rejection `reason`.\n \"\"\"\n\n "
},
{
"path": "src/hangar/records/summarize.py",
"chars": 11348,
"preview": "from pathlib import Path\nimport time\nfrom io import StringIO\n\nimport lmdb\n\nfrom .commiting import (\n get_commit_ances"
},
{
"path": "src/hangar/records/vcompat.py",
"chars": 4755,
"preview": "from pathlib import Path\n\nimport lmdb\n\nfrom .parsing import (\n repo_version_db_key,\n repo_version_db_val_from_raw_"
},
{
"path": "src/hangar/remote/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "src/hangar/remote/chunks.py",
"chars": 7223,
"preview": "import math\nimport struct\nfrom io import BytesIO\nfrom typing import NamedTuple, List, Union, Tuple, Iterable\n\nimport blo"
},
{
"path": "src/hangar/remote/client.py",
"chars": 25993,
"preview": "import concurrent.futures\nimport logging\nimport os\nimport tempfile\nimport time\nfrom threading import Lock\nfrom typing im"
},
{
"path": "src/hangar/remote/config_server.ini",
"chars": 375,
"preview": "[SERVER_GRPC]\nchannel_address = [::]:50051\nmax_thread_pool_workers = 200\nmax_concurrent_rpcs = 100\nenable_compression = "
},
{
"path": "src/hangar/remote/content.py",
"chars": 10162,
"preview": "from typing import NamedTuple, Union, Optional\n\nimport numpy as np\n\nfrom ..columns.constructors import open_file_handles"
},
{
"path": "src/hangar/remote/hangar_service.proto",
"chars": 9042,
"preview": "syntax = \"proto3\";\n\npackage hangar;\noption optimize_for = SPEED;\n\n\nservice HangarService {\n\n rpc PING (PingRequest) r"
},
{
"path": "src/hangar/remote/hangar_service_pb2.py",
"chars": 85284,
"preview": "# -*- coding: utf-8 -*-\n# Generated by the protocol buffer compiler. DO NOT EDIT!\n# source: hangar_service.proto\n\nfrom "
},
{
"path": "src/hangar/remote/hangar_service_pb2.pyi",
"chars": 29598,
"preview": "# @generated by generate_proto_mypy_stubs.py. Do not edit!\nimport sys\nfrom google.protobuf.descriptor import (\n Desc"
},
{
"path": "src/hangar/remote/hangar_service_pb2_grpc.py",
"chars": 32854,
"preview": "# Generated by the gRPC Python protocol compiler plugin. DO NOT EDIT!\nimport grpc\n\nfrom . import hangar_service_pb2 as h"
},
{
"path": "src/hangar/remote/header_manipulator_client_interceptor.py",
"chars": 3996,
"preview": "\"\"\"Interceptor that adds headers to outgoing requests\n\nPortions of this code have been taken and modified from the \"gRPC"
},
{
"path": "src/hangar/remote/request_header_validator_interceptor.py",
"chars": 4020,
"preview": "\"\"\"Interceptor that ensures a specific header is present.\n\nPortions of this code have been taken and modified from the \""
},
{
"path": "src/hangar/remote/server.py",
"chars": 32225,
"preview": "import configparser\nimport os\nimport shutil\nimport tempfile\nimport traceback\nimport warnings\nfrom concurrent import futu"
},
{
"path": "src/hangar/remotes.py",
"chars": 34777,
"preview": "import logging\nimport tempfile\nimport time\nimport warnings\nfrom collections import defaultdict\nfrom contextlib import cl"
},
{
"path": "src/hangar/repository.py",
"chars": 34890,
"preview": "from pathlib import Path\nimport weakref\nimport warnings\nfrom typing import Union, Optional, List\nfrom io import StringIO"
},
{
"path": "src/hangar/txnctx.py",
"chars": 5657,
"preview": "from collections import Counter\nfrom typing import MutableMapping\n\nimport lmdb\n\n\nclass TxnRegisterSingleton(type):\n _"
},
{
"path": "src/hangar/typesystem/__init__.py",
"chars": 449,
"preview": "from .descriptors import (\n Descriptor, OneOf, DictItems, EmptyDict, SizedIntegerTuple, checkedmeta\n)\nfrom .ndarray i"
},
{
"path": "src/hangar/typesystem/base.py",
"chars": 3043,
"preview": "from .descriptors import OneOf, String, checkedmeta\nfrom ..records import hash_func_from_tcode\n\n\n@OneOf(['flat', 'nested"
},
{
"path": "src/hangar/typesystem/descriptors.py",
"chars": 7885,
"preview": "\"\"\"\nPortions of this code have been taken and modified from the book:\n\nBeazley, D. and B. K. Jones (2013). Python Cookbo"
},
{
"path": "src/hangar/typesystem/ndarray.py",
"chars": 6616,
"preview": "import numpy as np\n\nfrom .base import ColumnBase\nfrom .descriptors import OneOf, String, OptionalString, SizedIntegerTup"
},
{
"path": "src/hangar/typesystem/pybytes.py",
"chars": 3931,
"preview": "from .base import ColumnBase\nfrom .descriptors import OneOf, Descriptor, String, OptionalString, OptionalDict\nfrom ..rec"
},
{
"path": "src/hangar/typesystem/pystring.py",
"chars": 3995,
"preview": "from .base import ColumnBase\nfrom .descriptors import OneOf, Descriptor, String, OptionalString, OptionalDict\nfrom ..rec"
},
{
"path": "src/hangar/utils.py",
"chars": 11245,
"preview": "import os\nimport re\nimport secrets\nimport string\nimport sys\nimport time\nfrom collections import deque\nfrom io import Str"
},
{
"path": "tests/bulk_importer/test_bulk_importer.py",
"chars": 6739,
"preview": "import pytest\nimport numpy as np\n\n\ndef assert_equal(arr, arr2):\n assert np.array_equal(arr, arr2)\n assert arr.dtyp"
},
{
"path": "tests/conftest.py",
"chars": 13669,
"preview": "import time\nimport shutil\nimport random\nfrom os.path import join as pjoin\nfrom os import mkdir\n\nimport pytest\nimport num"
},
{
"path": "tests/ml_datasets/test_dataset.py",
"chars": 17873,
"preview": "import sys\n\nimport numpy as np\nimport pytest\nfrom torch.utils.data import DataLoader\nimport warnings\nwith warnings.catch"
},
{
"path": "tests/property_based/conftest.py",
"chars": 196,
"preview": "import pytest\n\nvariable_shape_backend_params = ['00', '10']\nfixed_shape_backend_params = ['00', '01', '10']\nstr_variable"
},
{
"path": "tests/property_based/test_pbt_column_flat.py",
"chars": 9965,
"preview": "import pytest\nimport numpy as np\n\nfrom conftest import (\n variable_shape_backend_params,\n fixed_shape_backend_para"
},
{
"path": "tests/property_based/test_pbt_column_nested.py",
"chars": 11203,
"preview": "from collections import defaultdict\n\nimport pytest\nimport numpy as np\n\nfrom conftest import (\n variable_shape_backend"
},
{
"path": "tests/test_backend_hdf5_00_hdf5_01.py",
"chars": 6988,
"preview": "import pytest\nimport numpy as np\n\n\n@pytest.fixture(params=['00', '01'])\ndef be_filehandle(request):\n if request.param"
},
{
"path": "tests/test_branching.py",
"chars": 7093,
"preview": "import pytest\n\n\n@pytest.mark.parametrize('name', [\n 'dummy branch', 'origin/master', '\\nmaster', '\\\\master', 'master\\"
},
{
"path": "tests/test_checkout.py",
"chars": 26458,
"preview": "import atexit\nimport numpy as np\nimport pytest\nfrom conftest import fixed_shape_backend_params\n\n\nclass TestCheckout(obje"
},
{
"path": "tests/test_checkout_arrayset_access.py",
"chars": 22429,
"preview": "import pytest\nimport numpy as np\n\n\n# -------------------------- Reader Checkout ----------------------------------\n\n\n@py"
},
{
"path": "tests/test_cli.py",
"chars": 43183,
"preview": "from os import getcwd\nimport os\nfrom pathlib import Path\n\nimport numpy as np\nimport pytest\nfrom click.testing import Cli"
},
{
"path": "tests/test_column.py",
"chars": 62584,
"preview": "import pytest\nimport numpy as np\nfrom conftest import fixed_shape_backend_params, variable_shape_backend_params\nfrom ite"
},
{
"path": "tests/test_column_backends.py",
"chars": 13197,
"preview": "import pytest\nimport numpy as np\nfrom conftest import fixed_shape_backend_params\n\n\n@pytest.mark.parametrize('backend', f"
},
{
"path": "tests/test_column_definition_permutations.py",
"chars": 15416,
"preview": "from collections import defaultdict\nfrom functools import partial\nimport secrets\nimport string\n\nimport pytest\nimport num"
},
{
"path": "tests/test_column_nested.py",
"chars": 74930,
"preview": "\"\"\"Tests for the class methods contained in the nested subsample column accessor.\n\"\"\"\nimport numpy as np\nimport pytest\nf"
},
{
"path": "tests/test_column_pickle.py",
"chars": 5035,
"preview": "import pytest\nimport numpy as np\nfrom conftest import fixed_shape_backend_params\n\n\ndef assert_equal(arr, arr2):\n asse"
},
{
"path": "tests/test_commit_ref_verification.py",
"chars": 3510,
"preview": "import pytest\n\n\ndef test_verify_corruption_in_commit_ref_alerts(two_commit_filled_samples_repo):\n from hangar.records"
},
{
"path": "tests/test_context_management.py",
"chars": 1212,
"preview": "import pytest\nimport numpy as np\n\nfrom conftest import fixed_shape_backend_params, variable_shape_backend_params\n\nall_ba"
},
{
"path": "tests/test_diff.py",
"chars": 15897,
"preview": "import pytest\nimport numpy as np\n\n\nclass TestReaderWriterDiff(object):\n\n @pytest.mark.parametrize('writer', [False, T"
},
{
"path": "tests/test_diff_staged_summary.py",
"chars": 9912,
"preview": "import pytest\nimport numpy as np\n\n\ndef test_add_samples_to_existing_column(repo_20_filled_samples2):\n from hangar.rec"
},
{
"path": "tests/test_initiate.py",
"chars": 10276,
"preview": "import os\nimport pytest\nfrom hangar import Repository\n\n\ndef test_imports():\n import hangar\n from hangar import Rep"
},
{
"path": "tests/test_merging.py",
"chars": 12645,
"preview": "import pytest\nimport numpy as np\n\n\ndef test_merge_fails_with_invalid_branch_name(repo_1_br_no_conf):\n with pytest.rai"
},
{
"path": "tests/test_optimized_utils.py",
"chars": 4665,
"preview": "import pytest\n\nfrom hangar.optimized_utils import SizedDict\n\n\ndef test_sizeddict_maxsize_property():\n d = SizedDict(m"
},
{
"path": "tests/test_remote_serialize.py",
"chars": 9230,
"preview": "import pytest\n\nimport numpy as np\n\n\nparam_shapes = [(1,), (1000,), (1, 1), (623, 3, 5), (2, 4, 5, 6, 1, 3)]\nparam_dtypes"
},
{
"path": "tests/test_remotes.py",
"chars": 36288,
"preview": "import pytest\n\nimport numpy as np\nimport time\nfrom os.path import join as pjoin\nfrom os import mkdir\nfrom random import "
},
{
"path": "tests/test_repo_integrity_verification.py",
"chars": 13873,
"preview": "import pytest\n\nimport numpy as np\n\n\n@pytest.fixture()\ndef diverse_repo(repo):\n co = repo.checkout(write=True)\n co."
},
{
"path": "tests/test_utils.py",
"chars": 2116,
"preview": "import pytest\n\n\n@pytest.mark.parametrize('arg,key,expected', [\n ['AAABBBCCC', None, ['A', 'B', 'C']],\n ['AAABbBCcC"
},
{
"path": "tests/test_version.py",
"chars": 27619,
"preview": "# -*- coding: utf-8 -*-\n\"\"\"\nPortions of this code have been taken and modified from the \"packaging\" project.\n\nURL: "
},
{
"path": "tests/test_visualizations.py",
"chars": 27736,
"preview": "import pytest\nimport numpy as np\n\n\ndef verify_out(capfd, expected):\n out, _ = capfd.readouterr()\n print(out)\n a"
},
{
"path": "tests/typesystem/test_ndarray_typesysem.py",
"chars": 3702,
"preview": "import pytest\nimport numpy as np\n\n\nfrom hangar.typesystem import NdarrayFixedShape, NdarrayVariableShape\n\n\nclass TestInv"
},
{
"path": "tests/typesystem/test_pybytes_typesystem.py",
"chars": 3099,
"preview": "import pytest\nimport numpy as np\n\nfrom hangar.typesystem import BytesVariableShape\n\n\nclass TestInvalidValues:\n\n @pyte"
},
{
"path": "tests/typesystem/test_pystr_typesystem.py",
"chars": 3282,
"preview": "import pytest\nimport numpy as np\nfrom random import randint, choices\n\n\nfrom hangar.typesystem import StringVariableShape"
},
{
"path": "tox.ini",
"chars": 2683,
"preview": "[tox]\nenvlist =\n clean,\n docs,\n py{36,37,38}-cov{yes,no}-ml{yes,no},\n report,\n mypy\n\n# -------------- dep"
}
]
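The listing above is structured data: each entry maps a repository `path` to its size in characters (`chars`) and a short content `preview`. A minimal sketch of consuming it programmatically, assuming the array has been saved to a JSON file (the inline sample below uses a hypothetical three-entry subset for illustration):

```python
import json

# Hypothetical subset of the manifest above, inlined for a self-contained
# example; in practice you would json.load() the saved extraction file.
manifest = json.loads("""[
  {"path": "docs/backends.rst",         "chars": 732},
  {"path": "src/hangar/repository.py",  "chars": 34890},
  {"path": "tests/test_cli.py",         "chars": 43183}
]""")

def chars_by_top_dir(entries):
    """Sum file sizes (in characters) grouped by top-level directory."""
    totals = {}
    for entry in entries:
        top = entry["path"].split("/", 1)[0]  # "docs/backends.rst" -> "docs"
        totals[top] = totals.get(top, 0) + entry["chars"]
    return totals

print(chars_by_top_dir(manifest))
# → {'docs': 732, 'src': 34890, 'tests': 43183}
```

The same grouping applied to the full 190-entry array gives a quick per-directory size breakdown of the extraction.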
About this extraction
This page contains the full source code of the tensorwerk/hangar-py GitHub repository, extracted and formatted as plain text. The extraction includes 190 files (2.1 MB), approximately 550.1k tokens, and a symbol index with 2,012 extracted functions, classes, methods, constants, and types.