[
  {
    "path": ".coveragerc",
    "content": "[run]\nbranch = True\nomit =\n    */pymilo/__main__.py\n[report]\n# Regexes for lines to exclude from consideration\nexclude_lines =\n    pragma: no cover\n"
  },
  {
    "path": ".github/CODE_OF_CONDUCT.md",
    "content": "# Contributor Covenant Code of Conduct\n\n## Our Pledge\n\nWe as members, contributors, and leaders pledge to make participation in our\ncommunity a harassment-free experience for everyone, regardless of age, body\nsize, visible or invisible disability, ethnicity, sex characteristics, gender\nidentity and expression, level of experience, education, socio-economic status,\nnationality, personal appearance, race, caste, color, religion, or sexual\nidentity and orientation.\n\nWe pledge to act and interact in ways that contribute to an open, welcoming,\ndiverse, inclusive, and healthy community.\n\n## Our Standards\n\nExamples of behavior that contributes to a positive environment for our\ncommunity include:\n\n* Demonstrating empathy and kindness toward other people\n* Being respectful of differing opinions, viewpoints, and experiences\n* Giving and gracefully accepting constructive feedback\n* Accepting responsibility and apologizing to those affected by our mistakes,\n  and learning from the experience\n* Focusing on what is best not just for us as individuals, but for the overall\n  community\n\nExamples of unacceptable behavior include:\n\n* The use of sexualized language or imagery, and sexual attention or advances of\n  any kind\n* Trolling, insulting or derogatory comments, and personal or political attacks\n* Public or private harassment\n* Publishing others' private information, such as a physical or email address,\n  without their explicit permission\n* Other conduct which could reasonably be considered inappropriate in a professional setting\n\n## Enforcement Responsibilities\n\nCommunity leaders are responsible for clarifying and enforcing our standards of\nacceptable behavior and will take appropriate and fair corrective action in\nresponse to any behavior that they deem inappropriate, threatening, offensive,\nor harmful.\n\nCommunity leaders have the right and responsibility to remove, edit, or reject\ncomments, commits, code, wiki 
edits, issues, and other contributions that are\nnot aligned to this Code of Conduct, and will communicate reasons for moderation\ndecisions when appropriate.\n\n## Scope\nThis Code of Conduct applies both within project spaces and in public spaces\nwhen an individual is representing the project or its community.\nExamples of representing our community include using an official e-mail address,\nposting via an official social media account, or acting as an appointed\nrepresentative at an online or offline event.\n\n## Enforcement\n\nInstances of abusive, harassing, or otherwise unacceptable behavior may be\nreported to the community leaders responsible for enforcement at\npymilo@openscilab.com.\nAll complaints will be reviewed and investigated promptly and fairly.\n\nAll community leaders are obligated to respect the privacy and security of the\nreporter of any incident.\n\n## Enforcement Guidelines\n\nCommunity leaders will follow these Community Impact Guidelines in determining\nthe consequences for any action they deem in violation of this Code of Conduct:\n\n### 1. Correction\n\n**Community Impact**: Use of inappropriate language or other behavior deemed\nunprofessional or unwelcome in the community.\n\n**Consequence**: A private, written warning from community leaders, providing\nclarity around the nature of the violation and an explanation of why the\nbehavior was inappropriate. A public apology may be requested.\n\n### 2. Warning\n\n**Community Impact**: A violation through a single incident or series of\nactions.\n\n**Consequence**: A warning with consequences for continued behavior. No\ninteraction with the people involved, including unsolicited interaction with\nthose enforcing the Code of Conduct, for a specified period of time. This\nincludes avoiding interactions in community spaces as well as external channels\nlike social media. Violating these terms may lead to a temporary or permanent\nban.\n\n### 3. 
Temporary Ban\n\n**Community Impact**: A serious violation of community standards, including\nsustained inappropriate behavior.\n\n**Consequence**: A temporary ban from any sort of interaction or public\ncommunication with the community for a specified period of time. No public or\nprivate interaction with the people involved, including unsolicited interaction\nwith those enforcing the Code of Conduct, is allowed during this period.\nViolating these terms may lead to a permanent ban.\n\n### 4. Permanent Ban\n\n**Community Impact**: Demonstrating a pattern of violation of community\nstandards, including sustained inappropriate behavior, harassment of an\nindividual, or aggression toward or disparagement of classes of individuals.\n\n**Consequence**: A permanent ban from any sort of public interaction within the\ncommunity.\n\n## Attribution\n\nThis Code of Conduct is adapted from the [Contributor Covenant][homepage],\nversion 2.1, available at\n[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].\n\nCommunity Impact Guidelines were inspired by\n[Mozilla's code of conduct enforcement ladder][Mozilla CoC].\n\nFor answers to common questions about this code of conduct, see the FAQ at\n[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at\n[https://www.contributor-covenant.org/translations][translations].\n\n[homepage]: https://www.contributor-covenant.org\n[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html\n[Mozilla CoC]: https://github.com/mozilla/diversity\n[FAQ]: https://www.contributor-covenant.org/faq\n[translations]: https://www.contributor-covenant.org/translations\n"
  },
  {
    "path": ".github/CONTRIBUTING.md",
    "content": "# Contribution\n\nChanges and improvements are more than welcome! ❤️ Feel free to fork and open a pull request.\n\n\nPlease consider the following:\n\n\n1. Fork it!\n2. Create your feature branch (from the `dev` branch)\n3. Add your functions/methods to the proper files\n4. Add a standard `docstring` to your functions/methods\n5. Add tests for your functions/methods (`tests` folder)\n6. Pass all CI tests\n7. Update `CHANGELOG.md`\n\t- Describe changes under `[Unreleased]` section\n8. Submit a pull request into `dev` (please complete the pull request template)\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.yml",
    "content": "name: Bug Report\ndescription: File a bug report\ntitle: \"[Bug]: \"\nbody:\n  - type: markdown\n    attributes:\n      value: |\n        Thanks for taking the time to fill out this bug report!\n  - type: input\n    id: contact\n    attributes:\n      label: Contact details\n      description: How can we get in touch with you if we need more info?\n      placeholder: ex. email@example.com\n    validations:\n      required: false\n  - type: textarea\n    id: what-happened\n    attributes:\n      label: What happened?\n      description: Provide a clear and concise description of what the bug is.\n      placeholder: >\n        Describe the bug.\n    validations:\n      required: true\n  - type: textarea\n    id: step-to-reproduce\n    attributes:\n      label: Steps to reproduce\n      description: Provide details of how to reproduce the bug.\n      placeholder: >\n        ex. 1. Go to '...'\n    validations:\n      required: true\n  - type: textarea\n    id: expected-behavior\n    attributes:\n      label: Expected behavior\n      description: What did you expect to happen?\n      placeholder: >\n        ex. I expected '...' to happen\n    validations:\n      required: true\n  - type: textarea\n    id: actual-behavior\n    attributes:\n      label: Actual behavior\n      description: What actually happened?\n      placeholder: >\n        ex. Instead '...' 
happened\n    validations:\n      required: true\n  - type: dropdown\n    id: operating-system\n    attributes:\n      label: Operating system\n      description: Which operating system are you using?\n      options:\n        - Windows\n        - macOS\n        - Linux\n      default: 0\n    validations:\n      required: true\n  - type: dropdown\n    id: python-version\n    attributes:\n      label: Python version\n      description: Which version of Python are you using?\n      options:\n        - Python 3.14\n        - Python 3.13\n        - Python 3.12\n        - Python 3.11\n        - Python 3.10\n        - Python 3.9\n        - Python 3.8\n        - Python 3.7\n      default: 1\n    validations:\n      required: true\n  - type: dropdown\n    id: pymilo-version\n    attributes:\n      label: PyMilo version\n      description: Which version of PyMilo are you using?\n      options:\n        - PyMilo 1.6\n        - PyMilo 1.5\n        - PyMilo 1.4\n        - PyMilo 1.3\n        - PyMilo 1.2\n        - PyMilo 1.1\n        - PyMilo 1.0\n        - PyMilo 0.9\n        - PyMilo 0.8\n        - PyMilo 0.7\n        - PyMilo 0.6\n        - PyMilo 0.5\n        - PyMilo 0.4\n        - PyMilo 0.3\n        - PyMilo 0.2\n        - PyMilo 0.1\n      default: 0\n    validations:\n      required: true\n  - type: textarea\n    id: logs\n    attributes:\n      label: Relevant log output\n      description: Please copy and paste any relevant log output. This will be automatically formatted into code, so no need for backticks.\n      render: shell\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/config.yml",
    "content": "blank_issues_enabled: false\ncontact_links:\n  - name: Discord\n    url: https://discord.com/invite/mtuMS8AjDS\n    about: Ask questions and discuss with other PyMilo community members\n  - name: Website\n    url: https://openscilab.com/\n    about: Check out our website for more information\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.yml",
    "content": "name: Feature Request\ndescription: Suggest a feature for this project\ntitle: \"[Feature]: \"\nbody:\n  - type: textarea\n    id: description\n    attributes:\n      label: Describe the feature you want to add\n      placeholder: >\n        I'd like to be able to [...]\n    validations:\n      required: true\n  - type: textarea\n    id: possible-solution\n    attributes:\n      label: Describe your proposed solution\n      placeholder: >\n        I think this could be done by [...]\n    validations:\n      required: false\n  - type: textarea\n    id: alternatives\n    attributes:\n      label: Describe alternatives you've considered, if relevant\n      placeholder: >\n        Another way to do this would be [...]\n    validations:\n      required: false\n  - type: textarea\n    id: additional-context\n    attributes:\n      label: Additional context\n      placeholder: >\n        Add any other context or screenshots about the feature request here.\n    validations:\n      required: false\n"
  },
  {
    "path": ".github/PULL_REQUEST_TEMPLATE.md",
    "content": "#### Reference Issues/PRs\n\n#### What does this implement/fix? Explain your changes.\n\n#### Any other comments?\n\n"
  },
  {
    "path": ".github/dependabot.yml",
    "content": "version: 2\nupdates:\n- package-ecosystem: pip\n  directory: \"/\"\n  schedule:\n    interval: weekly\n    time: \"01:30\"\n  open-pull-requests-limit: 10\n  target-branch: dev\n  assignees:\n      - \"AHReccese\"\n"
  },
  {
    "path": ".github/workflows/publish_conda.yml",
    "content": "name: publish_conda\n\npermissions: read-all\n\non:\n  push:\n    # Sequence of patterns matched against refs/tags\n    tags:\n      - '*' # Push events for any tag, e.g. v1.0, v20.15.10\n\njobs:\n  publish:\n    runs-on: ubuntu-22.04\n    steps:\n      - uses: actions/checkout@v4\n      - name: publish-to-conda\n        uses: sepandhaghighi/conda-package-publish-action@v1.2\n        with:\n          subDir: 'otherfiles'\n          AnacondaToken: ${{ secrets.ANACONDA_TOKEN }}\n"
  },
  {
    "path": ".github/workflows/publish_pypi.yml",
    "content": "# This workflow will upload a Python Package using Twine when a tag is pushed\n# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries\n\nname: Upload Python Package\n\non:\n  push:\n    # Sequence of patterns matched against refs/tags\n    tags:\n      - '*' # Push events for any tag, e.g. v1.0, v20.15.10\n\njobs:\n  deploy:\n\n    runs-on: ubuntu-22.04\n\n    steps:\n    - uses: actions/checkout@v4\n    - name: Set up Python\n      uses: actions/setup-python@v4\n      with:\n        python-version: '3.x'\n    - name: Install dependencies\n      run: |\n        python -m pip install --upgrade pip\n        pip install setuptools wheel twine\n    - name: Build and publish\n      env:\n        TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n        TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n      run: |\n        python setup.py sdist bdist_wheel\n        twine upload dist/*.tar.gz\n        twine upload dist/*.whl"
  },
  {
    "path": ".github/workflows/test.yml",
    "content": "# This workflow will install Python dependencies, run tests and lint with a variety of Python versions\n# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions\n\nname: CI\n\non:\n  push:\n    branches:\n      - main\n      - dev\n\n  pull_request:\n    branches:\n      - dev\n      - main\n\nenv:\n  TEST_PYTHON_VERSION: 3.9\n  TEST_OS: 'ubuntu-22.04'\n\njobs:\n  build:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-22.04, windows-2022, macos-15-intel]\n        python-version: [3.7, 3.8, 3.9, 3.10.5, 3.11.0, 3.12.0, 3.13.0, 3.14.0]\n    steps:\n      - uses: actions/checkout@v4\n      - name: Set up Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v4\n        with:\n          python-version: ${{ matrix.python-version }}\n      - name: Installation\n        run: |\n          python -m pip install --upgrade pip\n          pip install .[\"streaming\"]\n      - name: Test requirements Installation\n        run: |\n          python otherfiles/requirements-splitter.py\n          pip install --upgrade --upgrade-strategy=only-if-needed -r test-requirements.txt\n      - name: Pymilo Core Functionality Tests with pytest\n        env:\n          COVERAGE_FILE: .coverage.core\n        run: |\n          python -m pytest . 
--ignore=./tests/test_ml_streaming --cov=pymilo --cov-report=term --cov-report=xml:coverage_core.xml\n      - name: Pymilo Streaming Functionality Tests with pytest\n        env:\n          COVERAGE_FILE: .coverage.streaming\n        run: |\n          python -m pytest ./tests/test_ml_streaming --cov=pymilo --cov-report=term --cov-report=xml:coverage_streaming.xml\n      - name: Merge coverage files\n        run: |\n          coverage combine\n          coverage xml -o coverage_combined.xml\n      - name: Upload coverage to Codecov\n        uses: codecov/codecov-action@v3\n        with:\n            files: coverage_combined.xml\n            fail_ci_if_error: false\n        if: matrix.python-version == env.TEST_PYTHON_VERSION && matrix.os == env.TEST_OS\n      - name: Vulture, Bandit and Pydocstyle Tests\n        run: |\n          python -m vulture pymilo/ otherfiles/ setup.py --min-confidence 65 --exclude=__init__.py --sort-by-size\n          python -m bandit -r pymilo -s B311\n          python -m pydocstyle -v\n        if: matrix.python-version == env.TEST_PYTHON_VERSION\n      - name: Version check\n        run: |\n          python otherfiles/version_check.py\n        if: matrix.python-version == env.TEST_PYTHON_VERSION\n"
  },
  {
    "path": ".gitignore",
    "content": "# Created by .ignore support plugin (hsz.mobi)\n### Python template\n# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nenv/\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\n*.egg-info/\n.installed.cfg\n*.egg\n\n# PyInstaller\n#  Usually these files are written by a python script from a template\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*,cover\n.hypothesis/\n\n# Translations\n*.mo\n*.pot\n\n# Django stuff:\n*.log\nlocal_settings.py\n\n# Flask stuff:\ninstance/\n.webassets-cache\n\n# Scrapy stuff:\n.scrapy\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\ntarget/\n\n# Jupyter Notebook\n.ipynb_checkpoints\n\n# pyenv\n.python-version\n\n# celery beat schedule file\ncelerybeat-schedule\n\n# dotenv\n.env\n\n# virtualenv\n.venv/\nvenv/\nENV/\n\n# Spyder project settings\n.spyderproject\n\n# Rope project settings\n.ropeproject\n### Example user template template\n### Example user template\n\n# IntelliJ project files\n.idea\n*.iml\nout\ngen\n\n/.VSCodeCounter\n/tests/exported*\n/paper/refs"
  },
  {
    "path": ".pydocstyle",
    "content": "[pydocstyle]\nmatch_dir = ^(?!(tests|build)).*\nmatch = .*\\.py\n\n"
  },
  {
    "path": "AUTHORS.md",
    "content": "# Core Developers\n----------\n- AmirHosein Rostami  - Open Science Laboratory ([Github](https://github.com/AHReccese)) **\n- Sepand Haghighi - Open Science Laboratory ([Github](https://github.com/sepandhaghighi))\n- Alireza Zolanvari  - Open Science Laboratory ([Github](https://github.com/AlirezaZolanvari))\n- Sadra Sabouri - Open Science Laboratory ([Github](https://github.com/sadrasabouri))\n\n\n** **Maintainer**\n\n# Other Contributors\n----------\n- [@zhmbshr](https://github.com/zhmbshr) ++\n\n\n++ **Graphic designer**\n"
  },
  {
    "path": "CHANGELOG.md",
    "content": "# Changelog\nAll notable changes to this project will be documented in this file.\n\nThe format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)\nand this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).\n\n## [Unreleased]\n\n## [1.6] - 2026-04-03\n### Added\n- `_check_response_error` function in `WebSocketClientCommunicator`\n- `register_client` function in `WebSocketClientCommunicator`\n- `remove_client` function in `WebSocketClientCommunicator`\n- `register_model` function in `WebSocketClientCommunicator`\n- `remove_model` function in `WebSocketClientCommunicator`\n- `get_ml_models` function in `WebSocketClientCommunicator`\n- `grant_access` function in `WebSocketClientCommunicator`\n- `revoke_access` function in `WebSocketClientCommunicator`\n- `get_allowance` function in `WebSocketClientCommunicator`\n- `get_allowed_models` function in `WebSocketClientCommunicator`\n- `_action_handlers` registry in `WebSocketServerCommunicator`\n- `_handle_register_client` function in `WebSocketServerCommunicator`\n- `_handle_remove_client` function in `WebSocketServerCommunicator`\n- `_handle_register_model` function in `WebSocketServerCommunicator`\n- `_handle_remove_model` function in `WebSocketServerCommunicator`\n- `_handle_get_ml_models` function in `WebSocketServerCommunicator`\n- `_handle_grant_access` function in `WebSocketServerCommunicator`\n- `_handle_revoke_access` function in `WebSocketServerCommunicator`\n- `_handle_get_allowance` function in `WebSocketServerCommunicator`\n- `_handle_get_allowed_models` function in `WebSocketServerCommunicator`\n- `scenario4` access control test for streaming\n- `--port` argument in test server runner\n- `close` function in `WebSocketClientCommunicator`\n- `close` function in `PymiloClient`\n- Context manager support (`__enter__`/`__exit__`) in `PymiloClient`\n### Changed\n- Test system modified\n- `_ArrayFunctionDispatcher` import in `FunctionTransporter` to use 
new numpy API\n- `datetime.utcnow()` to `datetime.now(timezone.utc)` in `PymiloException`\n- `get_allowance` endpoint in `RESTServerCommunicator` to handle empty allowances\n- `send_message` function in `WebSocketClientCommunicator`\n- `download` function in `WebSocketClientCommunicator`\n- `upload` function in `WebSocketClientCommunicator`\n- `attribute_call` function in `WebSocketClientCommunicator`\n- `attribute_type` function in `WebSocketClientCommunicator`\n- `__init__` function in `WebSocketServerCommunicator`\n- `handle_message` function in `WebSocketServerCommunicator`\n- `_handle_download` function in `WebSocketServerCommunicator`\n- `_handle_upload` function in `WebSocketServerCommunicator`\n- `_handle_attribute_call` function in `WebSocketServerCommunicator`\n- `_handle_attribute_type` function in `WebSocketServerCommunicator`\n- `parse` function in `WebSocketServerCommunicator`\n- WebSocket streaming tests enabled\n## [1.5] - 2026-01-26\n### Added\n- `_is_remainder_cols_list` function in GeneralDataStructureTransporter\n- `ComposeTransporter` Transporter\n- Composite params initialized in `pymilo_param.py`\n- `get_transporter` in `chains/util.py`\n- `deserialize_possible_ml_model` in `chains/util.py`\n- `serialize_possible_ml_model` in `chains/util.py`\n- `TransformedTargetRegressor` model\n- `ColumnTransformer` model\n- Composite models test runner\n- Composite models chain\n- JOSS paper\n### Changed\n- `serialize` function in FunctionTransporter\n- `serialize_spline` function in PreprocessingTransporter\n- `deserialize_spline` function in PreprocessingTransporter\n- Ensemble models test runner\n- `get_deserialized_list` function in GeneralDataStructureTransporter\n- `deserialize` function in GeneralDataStructureTransporter \n- `serialize` function in GeneralDataStructureTransporter \n- `get_deserialized_dict` function in GeneralDataStructureTransporter\n- `serialize_dict` function in GeneralDataStructureTransporter\n- `serialize_tuple` function in 
GeneralDataStructureTransporter\n- Test system modified\n- `README.md` updated\n### Removed\n- `get_transporter` in `ensemble_chain.py`\n- `deserialize_possible_ml_model` in `ensemble_chain.py`\n- `serialize_possible_ml_model` in `ensemble_chain.py`\n## [1.4] - 2025-12-01\n### Added\n- `get_allowed_models` function in `PymiloClient`\n- `get_allowance` function in `PymiloClient`\n- `revoke_access` function in `PymiloClient`\n- `grant_access` function in `PymiloClient`\n- `get_ml_models` function in `PymiloClient`\n- `deregister_ml_model` function in `PymiloClient`\n- `register_ml_model` function in `PymiloClient`\n- `deregister` function in `PymiloClient`\n- `register` function in `PymiloClient`\n- `REST_API_PREFIX ` function in `streaming.param.py`\n- `register_client` function in `RESTClientCommunicator`\n- `remove_client` function in `RESTClientCommunicator`\n- `register_model` function in `RESTClientCommunicator`\n- `remove_model` function in `RESTClientCommunicator`\n- `get_ml_models` function in `RESTClientCommunicator`\n- `grant_access` function in `RESTClientCommunicator`\n- `revoke_access` function in `RESTClientCommunicator`\n- `get_allowance` function in `RESTClientCommunicator`\n- `get_allowed_models` function in `RESTClientCommunicator`\n- `_validate_id` function in `PymiloServer`\n- `init_client` function in `PymiloServer`\n- `remove_client` function in `PymiloServer`\n- `grant_access` function in `PymiloServer`\n- `revoke_access` function in `PymiloServer`\n- `get_allowed_models` function in `PymiloServer`\n- `get_clients_allowance` function in `PymiloServer`\n- `get_clients` function in `PymiloServer`\n- `init_ml_model` function in `PymiloServer`\n- `set_ml_model` function in `PymiloServer`\n- `remove_ml_model` function in `PymiloServer`\n- `get_ml_models` function in `PymiloServer`\n### Changed\n- `is_callable_attribute` function in `PymiloServer`\n- `execute_model` function in `PymiloServer`\n- `update_model` function in `PymiloServer`\n- 
`export_model` function in `PymiloServer`\n- `__getattr__` in `PymiloClient`\n- `upload` function in `PymiloClient`\n- `download` function in `PymiloClient`\n- `encrypt_compress` function in `PymiloClient`\n- `ClientCommunicator` interface\n- `handle_message` function in `WebSocketServerCommunicator`\n- `_handle_download` function in `WebSocketServerCommunicator`\n- `setup_routes` function in `RESTServerCommunicator`\n- `__init__` function in `RESTClientCommunicator`\n- `download` function in `RESTClientCommunicator`\n- `upload` function in `RESTClientCommunicator`\n- `attribute_call` function in `RESTClientCommunicator`\n- `attribute_type` function in `RESTClientCommunicator`\n- `README.md` updated\n- `__init__` function in `PyMiloServer`\n- Test system modified\n- `Python 3.14` added to `test.yml`\n### Removed\n- Python 3.6 support\n## [1.3] - 2025-02-26\n### Added\n- `TfidfVectorizer` feature extractor\n- `TfidfTransformer` feature extractor\n- `HashingVectorizer` feature extractor\n- `CountVectorizer` feature extractor\n- `PatchExtractor` feature extractor\n- `DictVectorizer` feature extractor\n- `FeatureHasher` feature extractor\n- `FeatureExtractorTransporter` Transporter\n- `FeatureExtraction` support added to Ensemble chain\n- FeatureExtraction params initialized in `pymilo_param.py`\n- Feature Extraction models test runner\n- Zenodo badge to `README.md`\n### Changed\n- `get_deserialized_list` in `GeneralDataStructureTransporter`\n- `get_deserialized_dict` in `GeneralDataStructureTransporter`\n- `serialize` in `GeneralDataStructureTransporter`\n- `serialize_tuple` in `GeneralDataStructureTransporter`\n- `AttributeCallPayload` in `streaming.communicator.py`\n- `get_deserialized_regular_primary_types` in `GeneralDataStructureTransporter`\n- Test system modified\n## [1.2] - 2025-01-22\n### Added\n- `generate_dockerfile` testcases\n- `generate_dockerfile` function in `streaming.util.py`\n- `cite` section in `README.md`\n- `CLI` handler\n- 
`print_supported_ml_models` function in `pymilo_func.py`\n- `pymilo_help` function in `pymilo_func.py`\n- `SKLEARN_SUPPORTED_CATEGORIES` in `pymilo_param.py`\n- `OVERVIEW` in `pymilo_param.py`\n- `get_sklearn_class` in `utils.util.py`\n### Changed\n- `ML Streaming` testcases modified to use PyMilo CLI\n- `to_pymilo_issue` function in `PymiloException`\n- `valid_url_valid_file` testcase added in `test_exceptions.py`\n- `valid_url_valid_file` function in `import_exceptions.py`\n- `StandardPayload` in `RESTServerCommunicator`\n- testcase for LogisticRegressionCV, LogisticRegression\n- `README.md` updated\n- `AUTHORS.md` updated\n## [1.1] - 2024-11-25\n### Added\n- `is_socket_closed` function in `streaming.communicator.py`\n- `validate_http_url` function in `streaming.util.py`\n- `validate_websocket_url` function in `streaming.util.py`\n- `ML Streaming` WebSocket testcases\n- `CommunicationProtocol` Enum in `streaming.communicator.py`\n- `WebSocketClientCommunicator` class in `streaming.communicator.py`\n- `WebSocketServerCommunicator` class in `streaming.communicator.py`\n- batch operation testcases\n- `batch_export` function in `pymilo/pymilo_obj.py` \n- `batch_import` function in `pymilo/pymilo_obj.py`\n- `CCA` model\n- `PLSCanonical` model\n- `PLSRegression` model\n- Cross decomposition models test runner\n- Cross decomposition chain\n- PyMilo exception types added in `pymilo/exceptions/__init__.py`\n- PyMilo exception types added in `pymilo/__init__.py`\n### Changed\n- `core` and `streaming` tests divided in `test.yml`\n- `communication_protocol` parameter added to `PyMiloClient` class\n- `communication_protocol` parameter added to `PyMiloServer` class\n- `ML Streaming` testcases updated to support protocol selection\n- `README.md` updated\n- Tests config modified\n- Cross decomposition params initialized in `pymilo_param`\n- Cross decomposition support added to `pymilo_func.py`\n- `SUPPORTED_MODELS.md` updated\n- `README.md` updated\n- GitHub actions are limited 
to the `dev` and `main` branches\n- `Python 3.13` added to `test.yml`\n## [1.0] - 2024-09-16\n### Added\n- Compression method test in `ML Streaming` RESTful testcases\n- `CLI` handler in `tests/test_ml_streaming/run_server.py`\n- `Compression` Enum in `streaming.compressor.py`\n- `GZIPCompressor` class in `streaming.compressor.py`\n- `ZLIBCompressor` class in `streaming.compressor.py`\n- `LZMACompressor` class in `streaming.compressor.py`\n- `BZ2Compressor` class in `streaming.compressor.py`\n- `encrypt_compress` function in `PymiloClient`\n- `parse` function in `RESTServerCommunicator`\n- `is_callable_attribute` function in `PymiloServer`\n- `streaming.param.py`\n- `attribute_type` function in `RESTServerCommunicator`\n- `AttributeTypePayload` class in `RESTServerCommunicator`\n- `attribute_type` function in `RESTClientCommunicator`\n- `Mode` Enum in `PymiloClient`\n- Import from url testcases\n- `download_model` function in `utils.util.py`\n- `PymiloServer` class in `streaming.pymilo_server.py`\n- `PymiloClient` class in `PymiloClient`\n- `Communicator` interface in `streaming.interfaces.py`\n- `RESTClientCommunicator` class in `streaming.communicator.py`\n- `RESTServerCommunicator` class in `streaming.communicator.py`\n- `Compressor` interface in `streaming.interfaces.py`\n- `DummyCompressor` class in `streaming.compressor.py`\n- `Encryptor` interface in `streaming.interfaces.py`\n- `DummyEncryptor` class in `streaming.encryptor.py`\n- `ML Streaming` RESTful testcases\n- `streaming-requirements.txt`\n### Changed\n- `README.md` updated\n- `ML Streaming` RESTful testcases\n- `attribute_call` function in `RESTServerCommunicator`\n- `AttributeCallPayload` class in `RESTServerCommunicator`\n- upload function in `RESTClientCommunicator`\n- download function in `RESTClientCommunicator`\n- `__init__` function in `RESTClientCommunicator`\n- `attribute_calls` function in `RESTClientCommunicator`\n- `requests` added to `requirements.txt`\n- `uvicorn`, `fastapi`, `requests` 
and `pydantic` added to `dev-requirements.txt`\n- `ML Streaming` RESTful testcases\n- `__init__` function in `PymiloServer`\n- `__getattr__` function in `PymiloClient`\n- `__init__` function in `PymiloClient`\n- `toggle_mode` function in `PymiloClient`\n- `upload` function in `PymiloClient`\n- `download` function in `PymiloClient`\n- `__init__` function in `PymiloServer`\n- `serialize_cfnode` function in `transporters.cfnode_transporter.py`\n- `__init__` function in `Import` class\n- `serialize` function in `transporters.tree_transporter.py`\n- `deserialize` function in `transporters.tree_transporter.py`\n- `serialize` function in `transporters.sgdoptimizer_transporter.py`\n- `deserialize` function in `transporters.sgdoptimizer_transporter.py`\n- `serialize` function in `transporters.randomstate_transporter.py`\n- `deserialize` function in `transporters.randomstate_transporter.py`\n- `serialize` function in `transporters.bunch_transporter.py`\n- `deserialize` function in `transporters.bunch_transporter.py`\n- `serialize` function in `transporters.adamoptimizer_transporter.py`\n- `deserialize` function in `transporters.adamoptimizer_transporter.py`\n- `serialize_linear_model` function in `chains.linear_model_chain.py`\n- `serialize_ensemble` function in `chains.ensemble_chain.py`\n- `serialize` function in `GeneralDataStructureTransporter` Transporter refactored\n- `get_deserialized_list` function in `GeneralDataStructureTransporter` Transporter refactored\n- `Export` class call by reference bug fixed\n## [0.9] - 2024-07-01\n### Added\n- Anaconda workflow\n- `prefix_list` function in `utils.util.py`\n- `KBinsDiscretizer` preprocessing model\n- `PowerTransformer` preprocessing model\n- `SplineTransformer` preprocessing model\n- `TargetEncoder` preprocessing model\n- `QuantileTransformer` preprocessing model\n- `RobustScaler` preprocessing model\n- `PolynomialFeatures` preprocessing model\n- `OrdinalEncoder` preprocessing model\n- `Normalizer` preprocessing model\n- 
`MaxAbsScaler` preprocessing model\n- `MultiLabelBinarizer` preprocessing model\n- `KernelCenterer` preprocessing model\n- `FunctionTransformer` preprocessing model\n- `Binarizer` preprocessing model\n- Preprocessing models test runner\n### Changed\n- `Command` enum class in `transporter.py`\n- `SerializationErrorTypes` enum class in `serialize_exception.py`\n- `DeserializationErrorTypes` enum class in `deserialize_exception.py`\n- `meta.yaml` modified\n- `NaN` type in `pymilo_param`\n- `NaN` type transportation in `GeneralDataStructureTransporter` Transporter\n- `BSpline` Transportation in `PreprocessingTransporter` Transporter\n- one layer deeper transportation in `PreprocessingTransporter` Transporter\n- dictating outer ndarray dtype in `GeneralDataStructureTransporter` Transporter \n- preprocessing params fulfilled in `pymilo_param`\n- `SUPPORTED_MODELS.md` updated\n- `README.md` updated\n- `serialize_possible_ml_model` in the Ensemble chain\n## [0.8] - 2024-05-06\n### Added\n- `StandardScaler` Transformer in `pymilo_param.py`\n- `PreprocessingTransporter` Transporter\n- ndarray shape config in `GeneralDataStructure` Transporter\n- `util.py` in chains\n- `BinMapperTransporter` Transporter\n- `BunchTransporter` Transporter\n- `GeneratorTransporter` Transporter\n- `TreePredictorTransporter` Transporter\n- `AdaboostClassifier` model\n- `AdaboostRegressor` model\n- `BaggingClassifier` model\n- `BaggingRegressor` model\n- `ExtraTreesClassifier` model\n- `ExtraTreesRegressor` model\n- `GradientBoosterClassifier` model\n- `GradientBoosterRegressor` model\n- `HistGradientBoosterClassifier` model\n- `HistGradientBoosterRegressor` model\n- `RandomForestClassifier` model\n- `RandomForestRegressor` model\n- `IsolationForest` model\n- `RandomTreesEmbedding` model\n- `StackingClassifier` model\n- `StackingRegressor` model\n- `VotingClassifier` model\n- `VotingRegressor` model\n- `Pipeline` model\n- Ensemble models test runner\n- Ensemble chain\n- `SECURITY.md`\n### 
Changed\n- `Pipeline` test updated\n- `LabelBinarizer`,`LabelEncoder` and `OneHotEncoder` got embedded in `PreprocessingTransporter`\n- Preprocessing support added to Ensemble chain\n- Preprocessing params initialized in `pymilo_param`\n- `util.py` in utils updated\n- `test_pymilo.py` updated\n- `pymilo_func.py` updated\n- `linear_model_chain.py` updated\n- `neural_network_chain.py` updated\n- `decision_tree_chain.py` updated\n- `clustering_chain.py` updated\n- `naive_bayes_chain.py` updated\n- `neighbours_chain.py` updated\n- `svm_chain.py` updated\n- `GeneralDataStructure` Transporter updated\n- `LossFunction` Transporter updated\n- `AbstractTransporter` updated\n- Tests config modified\n- Unequal sklearn version error added in `pymilo_param.py`\n- Ensemble params initialized in `pymilo_param`\n- Ensemble support added to `pymilo_func.py`\n- `SUPPORTED_MODELS.md` updated\n- `README.md` updated\n## [0.7] - 2024-04-03\n### Added\n- `pymilo_nearest_neighbor_test` function added to `test_pymilo.py`\n- `NeighborsTreeTransporter` Transporter\n- `LocalOutlierFactor` model\n- `RadiusNeighborsClassifier` model\n- `RadiusNeighborsRegressor` model\n- `NearestCentroid` model\n- `NearestNeighbors` model\n- `KNeighborsClassifier` model\n- `KNeighborsRegressor` model\n- Neighbors models test runner\n- Neighbors chain\n### Changed\n- Tests config modified\n- Neighbors params initialized in `pymilo_param`\n- Neighbors support added to `pymilo_func.py`\n- `SUPPORTED_MODELS.md` updated\n- `README.md` updated\n## [0.6] - 2024-03-27\n### Added\n- `deserialize_primitive_type` function in `GeneralDataStructureTransporter`\n- `is_deserialized_ndarray` function in `GeneralDataStructureTransporter`\n- `deep_deserialize_ndarray` function in `GeneralDataStructureTransporter`\n- `deep_serialize_ndarray`  function in `GeneralDataStructureTransporter`\n- `SVR` model\n- `SVC` model\n- `One Class SVM` model\n- `NuSVR` model\n- `NuSVC` model\n- `Linear SVR` model\n- `Linear SVC` model\n- SVM 
models test runner\n- SVM chain\n### Changed\n- `pymilo_param.py` updated\n- `pymilo_obj.py` updated to use predefined strings\n- `TreeTransporter` updated\n- `get_homogeneous_type` function in `util.py` updated\n- `GeneralDataStructureTransporter` updated to use deep ndarray serializer & deserializer\n- `check_str_in_iterable` updated\n- `Label Binarizer` Transporter updated\n- `Function` Transporter updated\n- `CFNode` Transporter updated\n- `Bisecting Tree` Transporter updated\n- Tests config modified\n- SVM params initialized in `pymilo_param`\n- SVM support added to `pymilo_func.py`\n- `SUPPORTED_MODELS.md` updated\n- `README.md` updated\n## [0.5] - 2024-01-31\n### Added\n- `reset` function in the `Transport` interface\n- `reset` function implementation in `AbstractTransporter`\n- `Gaussian Naive Bayes` declared as `GaussianNB` model \n- `Multinomial Naive Bayes` model declared as `MultinomialNB` model\n- `Complement Naive Bayes` model declared as `ComplementNB` model\n- `Bernoulli Naive Bayes` model declared as `BernoulliNB` model\n- `Categorical Naive Bayes` model declared as `CategoricalNB` model\n- Naive Bayes models test runner\n- Naive Bayes chain\n### Changed\n- `Transport` function of `AbstractTransporter` updated\n- fix the order of `CFNode` fields serialization in `CFNodeTransporter`\n- `GeneralDataStructureTransporter` support list of ndarray with different shapes\n- Tests config modified\n- Naive Bayes params initialized in `pymilo_param`\n- Naive Bayes support added to `pymilo_func.py`\n- `SUPPORTED_MODELS.md` updated\n- `README.md` updated\n## [0.4] - 2024-01-22\n### Added\n- `has_named_parameter` method in `util.py`\n- `CFSubcluster` Transporter(inside `CFNode` Transporter)\n- `CFNode` Transporter\n- `Birch` model\n- `SpectralBiclustering` model\n- `SpectralCoclustering` model\n- `MiniBatchKMeans` model\n- `feature_request.yml` template\n- `config.yml` for issue template\n- `BayesianGaussianMixture` model\n- `serialize_tuple` method in 
`GeneralDataStructureTransporter`\n- `import_function` method in `util.py`\n- `Function` Transporter\n- `FeatureAgglomeration` model\n- `HDBSCAN` model\n- `GaussianMixture` model\n- `OPTICS` model\n- `DBSCAN` model\n- `AgglomerativeClustering` model\n- `SpectralClustering` model\n- `MeanShift` model \n- `AffinityPropagation` model\n- `Kmeans` model\n- Clustering models test runner\n- Clustering chain \n### Changed\n- `LossFunctionTransporter` enhanced to handle scikit 1.4.0 `_loss_function_` field\n- Codacy Static Code Analyzer's suggestions applied\n- Spectral Clustering test folder refactored\n- Bug report template modified\n- `GeneralDataStructureTransporter` updated\n- Tests config modified\n- Clustering data set preparation added to `data_exporter.py`\n- Clustering params initialized in `pymilo_param`\n- Clustering support added to `pymilo_func.py`\n- `Python 3.12` added to `test.yml`\n- `dev-requirements.txt` updated\n- Code quality badges added to `README.md`\n- `SUPPORTED_MODELS.md` updated\n- `README.md` updated\n## [0.3] - 2023-09-27\n### Added\n- scikit-learn decision tree models\n- `ExtraTreeClassifier` model\n- `ExtraTreeRegressor` model\n- `DecisionTreeClassifier` model\n- `DecisionTreeRegressor` model\n- `Tree` Transporter\n- Decision Tree chain\n### Changed\n- Tests config modified\n- DecisionTree params initialized in `pymilo_param`\n- Decision Tree support added to `pymilo_func.py`\n## [0.2] - 2023-08-02\n### Added\n- scikit-learn neural network models \n- `MLP Regressor` model \n- `MLP Classifier` model\n- `BernoulliRBN` model\n- `SGDOptimizer` transporter\n- `RandomState(MT19937)` transporter\n- `Adamoptimizer` transporter\n- Neural Network chain\n- Neural Network exceptions \n- `ndarray_to_list` method in `GeneralDataStructureTransporter`\n- `list_to_ndarray` method in `GeneralDataStructureTransporter` \n- `neural_network_chain.py` chain\n### Changed\n- `GeneralDataStructure` Transporter updated\n- `LabelBinerizer` Transporter updated\n- 
`linear model` chain updated\n- GeneralDataStructure transporter enhanced\n- LabelBinerizer transporter updated\n- transporters' chain router added to `pymilo func`\n- NeuralNetwork params initialized in `pymilo_param`\n- `pymilo_test` updated to support multiple models\n- `linear_model_chain` refactored\n## [0.1] - 2023-06-29\n### Added\n- scikit-learn linear models support\n- `Export` class\n- `Import` class\n\n[Unreleased]: https://github.com/openscilab/pymilo/compare/v1.6...dev\n[1.6]: https://github.com/openscilab/pymilo/compare/v1.5...v1.6\n[1.5]: https://github.com/openscilab/pymilo/compare/v1.4...v1.5\n[1.4]: https://github.com/openscilab/pymilo/compare/v1.3...v1.4\n[1.3]: https://github.com/openscilab/pymilo/compare/v1.2...v1.3\n[1.2]: https://github.com/openscilab/pymilo/compare/v1.1...v1.2\n[1.1]: https://github.com/openscilab/pymilo/compare/v1.0...v1.1\n[1.0]: https://github.com/openscilab/pymilo/compare/v0.9...v1.0\n[0.9]: https://github.com/openscilab/pymilo/compare/v0.8...v0.9\n[0.8]: https://github.com/openscilab/pymilo/compare/v0.7...v0.8\n[0.7]: https://github.com/openscilab/pymilo/compare/v0.6...v0.7\n[0.6]: https://github.com/openscilab/pymilo/compare/v0.5...v0.6\n[0.5]: https://github.com/openscilab/pymilo/compare/v0.4...v0.5\n[0.4]: https://github.com/openscilab/pymilo/compare/v0.3...v0.4\n[0.3]: https://github.com/openscilab/pymilo/compare/v0.2...v0.3\n[0.2]: https://github.com/openscilab/pymilo/compare/v0.1...v0.2\n[0.1]: https://github.com/openscilab/pymilo/compare/e887108...v0.1\n"
  },
  {
    "path": "CITATION.cff",
    "content": "cff-version: \"1.2.0\"\nmessage: If you use this software, please cite our article in the\n  Journal of Open Source Software.\ntitle: \"PyMilo: A Python Library for ML I/O\"\nabstract: >-\n  PyMilo is an open-source Python package that addresses the limitations of\n  existing machine learning (ML) model storage formats by providing a\n  transparent, reliable, end-to-end, and safe method for exporting and\n  deploying trained models. Current tools rely on black-box or executable\n  formats that obscure internal model structures, making them difficult to\n  audit, verify, or safely share. Meanwhile, tensor-centric formats securely\n  store and transfer numerical tensors but do not capture the internal and\n  structural composition of classical machine-learning models (e.g.,\n  scikit-learn pipelines), which remain PyMilo’s primary focus. Others apply\n  structural transformations during export that may degrade predictive\n  performance and reduce the model to a limited inference-only interface. 
In\n  contrast, PyMilo serializes models in a transparent human-readable format\n  that preserves end-to-end model fidelity and enables reliable, safe, and\n  interpretable exchange.\nauthors:\n- family-names: Rostami\n  given-names: AmirHosein\n  orcid: \"https://orcid.org/0009-0000-0638-2263\"\n- family-names: Haghighi\n  given-names: Sepand\n  orcid: \"https://orcid.org/0000-0001-9450-2375\"\n- family-names: Sabouri\n  given-names: Sadra\n  orcid: \"https://orcid.org/0000-0003-1047-2346\"\n- family-names: Zolanvari\n  given-names: Alireza\n  orcid: \"https://orcid.org/0000-0003-2367-8343\"\ncontact:\n- family-names: Rostami\n  given-names: AmirHosein\n  orcid: \"https://orcid.org/0009-0000-0638-2263\"\nversion: 1.4\ndate-released: 2025-12-23\nrepository-code: \"https://github.com/openscilab/pymilo\"\nurl: \"https://github.com/openscilab/pymilo\"\nlicense: MIT\nkeywords:\n    - Machine Learning\n    - Model Deployment\n    - Model Serialization\n    - Transparency\n    - MLOPS\ndoi: 10.5281/zenodo.17783630\npreferred-citation:\n  authors:\n  - family-names: Rostami\n    given-names: AmirHosein\n    orcid: \"https://orcid.org/0009-0000-0638-2263\"\n  - family-names: Haghighi\n    given-names: Sepand\n    orcid: \"https://orcid.org/0000-0001-9450-2375\"\n  - family-names: Sabouri\n    given-names: Sadra\n    orcid: \"https://orcid.org/0000-0003-1047-2346\"\n  - family-names: Zolanvari\n    given-names: Alireza\n    orcid: \"https://orcid.org/0000-0003-2367-8343\"\n  date-published: 2025-12-20\n  doi: 10.21105/joss.08858\n  issn: 2475-9066\n  issue: 116\n  journal: Journal of Open Source Software\n  publisher:\n    name: Open Journals\n  start: 8858\n  title: \"PyMilo: A Python Library for ML I/O\"\n  type: article\n  url: \"https://joss.theoj.org/papers/10.21105/joss.08858\"\n  volume: 10\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2022 OpenSciLab\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "<div align=\"center\">\n    <img src=\"https://github.com/openscilab/pymilo/raw/main/otherfiles/logo.png\" width=\"500\" height=\"300\">\n    <br/>\n    <br/>\n    <a href=\"https://codecov.io/gh/openscilab/pymilo\"><img src=\"https://codecov.io/gh/openscilab/pymilo/branch/main/graph/badge.svg\" alt=\"Codecov\"/></a>\n    <a href=\"https://badge.fury.io/py/pymilo\"><img src=\"https://badge.fury.io/py/pymilo.svg\" alt=\"PyPI version\"></a>\n    <a href=\"https://anaconda.org/openscilab/pymilo\"><img src=\"https://anaconda.org/openscilab/pymilo/badges/version.svg\"></a>\n    <a href=\"https://www.python.org/\"><img src=\"https://img.shields.io/badge/built%20with-Python3-green.svg\" alt=\"built with Python3\"></a>\n    <a href=\"https://github.com/openscilab/pymilo\"><img alt=\"GitHub repo size\" src=\"https://img.shields.io/github/repo-size/openscilab/pymilo\"></a>\n    <a href=\"https://discord.gg/mtuMS8AjDS\"><img src=\"https://img.shields.io/discord/1064533716615049236.svg\" alt=\"Discord Channel\"></a>\n</div>\n\n----------\n\n## Overview\n<p align=\"justify\">\nPyMilo is an open source Python package that provides a simple, efficient, and safe way for users to export pre-trained machine learning models in a transparent way. By this, the exported model can be used in other environments, transferred across different platforms, and shared with others. PyMilo allows the users to export the models that are trained using popular Python libraries like scikit-learn, and then use them in deployment environments, or share them without exposing the underlying code or dependencies. 
The transparency of the exported models ensures reliability and safety for the end users, as it eliminates the risks of binary or pickle formats.\n</p>\n<table>\n    <tr>\n        <td align=\"center\">PyPI Counter</td>\n        <td align=\"center\">\n\t    <a href=\"https://pepy.tech/projects/pymilo\">\n\t        <img src=\"https://static.pepy.tech/badge/pymilo\" alt=\"PyPI Downloads\">\n\t    </a>\n        </td>\n    </tr>\n    <tr>\n        <td align=\"center\">Github Stars</td>\n        <td align=\"center\">\n            <a href=\"https://github.com/openscilab/pymilo\">\n                <img src=\"https://img.shields.io/github/stars/openscilab/pymilo.svg?style=social&label=Stars\">\n            </a>\n        </td>\n    </tr>\n</table>\n<table>\n    <tr> \n        <td align=\"center\">Branch</td>\n        <td align=\"center\">main</td>\n        <td align=\"center\">dev</td>\n    </tr>\n    <tr>\n        <td align=\"center\">CI</td>\n        <td align=\"center\">\n            <img src=\"https://github.com/openscilab/pymilo/actions/workflows/test.yml/badge.svg?branch=main\">\n        </td>\n        <td align=\"center\">\n            <img src=\"https://github.com/openscilab/pymilo/actions/workflows/test.yml/badge.svg?branch=dev\">\n            </td>\n    </tr>\n</table>\n\n<table>\n\t<tr> \n\t\t<td align=\"center\">Code Quality</td>\n\t\t<td align=\"center\"><a href=\"https://www.codefactor.io/repository/github/openscilab/pymilo\"><img src=\"https://www.codefactor.io/repository/github/openscilab/pymilo/badge\" alt=\"CodeFactor\" /></a></td>\n\t\t<td align=\"center\"><a href=\"https://app.codacy.com/gh/openscilab/pymilo/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade\"><img src=\"https://app.codacy.com/project/badge/Grade/9eeec99ed11f4d9b86af36dc90f5f753\"></a></td>\n\t</tr>\n</table>\n\n\n## Installation\n\n### PyPI\n\n- Check [Python Packaging User Guide](https://packaging.python.org/installing/)\n- Run `pip install 
pymilo==1.6`\n### Source code\n- Download [Version 1.6](https://github.com/openscilab/pymilo/archive/v1.6.zip) or [Latest Source](https://github.com/openscilab/pymilo/archive/dev.zip)\n- Run `pip install .`\n\n### Conda\n\n- Check [Conda Managing Package](https://conda.io/)\n- Update Conda using `conda update conda`\n- Run `conda install -c openscilab pymilo`\n\n\n## Usage\n### Import/Export\nImagine you want to train a `LinearRegression` model representing this equation: $y = x_0 + 2x_1 + 3$. You will create data points (`X`, `y`) and train your model as follows.\n```python\nimport numpy as np\nfrom sklearn.linear_model import LinearRegression\nX = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])\ny = np.dot(X, np.array([1, 2])) + 3\n# y = 1 * x_0 + 2 * x_1 + 3\nmodel = LinearRegression().fit(X, y)\npred = model.predict(np.array([[3, 5]]))\n# pred = [16.] (=1 * 3 + 2 * 5 + 3)\n```\n\nUsing PyMilo `Export` class you can easily serialize and export your trained model into a JSON file.\n```python\nfrom pymilo import Export\nExport(model).save(\"model.json\")\n```\n\n#### Export\n\nThe `Export` class facilitates exporting of machine learning models to JSON files.\n\n| **Parameter** | **Description** |\n| ------------- | --------------- |\n| model | The machine learning model to be exported |\n\n| **Property** | **Description** |\n| ------------ | --------------- |\n| data | The serialized model data including all learned parameters |\n| version | The scikit-learn version used to train the model |\n| type | The type/class name of the exported model |\n\n| **Method** | **Description** |\n| ---------- | --------------- |\n| save | Save the exported model to a JSON file |\n| to_json | Return the model as a JSON string representation |\n| batch_export | Export multiple models to individual JSON files in a directory |\n\nYou can check out your model as a JSON file now.\n```json\n{\n    \"data\": {\n        \"fit_intercept\": true,\n        \"copy_X\": true,\n        \"n_jobs\": 
null,\n        \"positive\": false,\n        \"n_features_in_\": 2,\n        \"coef_\": {\n            \"pymiloed-ndarray-list\": [\n                1.0000000000000002,\n                1.9999999999999991\n            ],\n            \"pymiloed-ndarray-dtype\": \"float64\",\n            \"pymiloed-ndarray-shape\": [\n                2\n            ],\n            \"pymiloed-data-structure\": \"numpy.ndarray\"\n        },\n        \"rank_\": 2,\n        \"singular_\": {\n            \"pymiloed-ndarray-list\": [\n                1.618033988749895,\n                0.6180339887498948\n            ],\n            \"pymiloed-ndarray-dtype\": \"float64\",\n            \"pymiloed-ndarray-shape\": [\n                2\n            ],\n            \"pymiloed-data-structure\": \"numpy.ndarray\"\n        },\n        \"intercept_\": {\n            \"value\": 3.0000000000000018,\n            \"np-type\": \"numpy.float64\"\n        }\n    },\n    \"sklearn_version\": \"1.4.2\",\n    \"pymilo_version\": \"0.8\",\n    \"model_type\": \"LinearRegression\"\n}\n```\nYou can see all the learned parameters of the model in this file and change them if you want. This JSON representation is a transparent version of your model.\n\nNow let's load it back. You can do it easily by using PyMilo `Import` class.\n```python\nfrom pymilo import Import\nmodel = Import(\"model.json\").to_model()\npred = model.predict(np.array([[3, 5]]))\n# pred = [16.] 
(=1 * 3 + 2 * 5 + 3)\n```\n\n#### Import\n\nThe `Import` class facilitates importing of serialized models from JSON files, JSON strings, or URLs.\n\n| **Parameter** | **Description** |\n| ------------- | --------------- |\n| file_adr | Path to the JSON file containing the serialized model |\n| json_dump | JSON string representation of the serialized model |\n| url | URL to download the serialized model from |\n\n| **Property** | **Description** |\n| ------------ | --------------- |\n| data | The deserialized model data |\n| version | The scikit-learn version of the original model |\n| type | The type/class name of the imported model |\n\n| **Method** | **Description** |\n| ---------- | --------------- |\n| to_model | Convert the imported data back to a scikit-learn model |\n| batch_import | Import multiple models from JSON files in a directory |\n\nThis loaded model is exactly the same as the original trained model.\n\n### ML streaming\nYou can easily serve your ML model from a remote server using `ML streaming` feature of PyMilo.\n\n⚠️ `ML streaming` feature exists in versions `>=1.0`\n\n⚠️ In order to use `ML streaming` feature, make sure you've installed the `streaming` mode of PyMilo\n\n⚠️ The `ML streaming` feature is under construction and is not yet considered stable.\n\nYou can choose either `REST` or `WebSocket` as the communication medium protocol.\n\n#### Server\nLet's assume you are in the remote server and you want to import the exported JSON file and start serving your model through `REST` protocol!\n```python\nfrom pymilo import Import\nfrom pymilo.streaming import PymiloServer, CommunicationProtocol\nmy_model = Import(\"model.json\").to_model()\ncommunicator = PymiloServer(\n    model=my_model,\n    port=8000,\n    communication_protocol=CommunicationProtocol[\"REST\"],\n    ).communicator\ncommunicator.run()\n```\n\n#### PymiloServer\n\nThe `PymiloServer` class facilitates streaming machine learning models over a network.\n\n| **Parameter** | 
**Description** |\n| ------------- | --------------- |\n| port | Port number for the server to listen on (default: 8000) |\n| host | Host address for the server (default: \"127.0.0.1\") |\n| compressor | Compression method from `Compression` enum |\n| communication_protocol | Communication protocol from `CommunicationProtocol` enum |\n\nThe `compressor` parameter accepts values from the `Compression` enum: `NULL` (no compression), `GZIP`, `ZLIB`, `LZMA`, or `BZ2`. The `communication_protocol` parameter accepts values from the `CommunicationProtocol` enum: `REST` or `WEBSOCKET`.\n\n| **Method** | **Description** |\n| ---------- | --------------- |\n| init_client | Initialize a new client with the given client ID |\n| remove_client | Remove an existing client by client ID |\n| init_ml_model | Initialize a new ML model for a given client |\n| set_ml_model | Set or update the ML model for a client |\n| remove_ml_model | Remove an existing ML model for a client |\n| get_ml_models | Get all ML model IDs for a client |\n| execute_model | Execute model methods or access attributes |\n| grant_access | Allow a client to access another client's model |\n| revoke_access | Revoke access to a client's model |\n| get_allowed_models | Get models a client is allowed to access |\n\nNow `PymiloServer` runs on port `8000` and exposes a REST API to `upload` and `download` the model and to retrieve its **attributes**: either **data attributes** like `model.coef_` or **method attributes** like `model.predict(x_test)`.\n\nℹ️ By default, `PymiloServer` listens on the loopback interface (`127.0.0.1`). 
To make it accessible over a local network (LAN), specify your machine’s LAN IP address in the `host` parameter of the `PymiloServer` constructor.\n\n#### Client\nWith `PymiloClient` you can connect to the remote `PymiloServer` and use any functionality that the given ML model offers. For example, to run the `predict` function on your remote ML model and get the result:\n```python\nfrom pymilo.streaming import PymiloClient, CommunicationProtocol\npymilo_client = PymiloClient(\n    mode=PymiloClient.Mode.LOCAL,\n    server_url=\"SERVER_URL\",\n    communication_protocol=CommunicationProtocol[\"REST\"],\n    )\npymilo_client.toggle_mode(PymiloClient.Mode.DELEGATE)\nresult = pymilo_client.predict(x_test)\n```\n\n#### PymiloClient\n\nThe `PymiloClient` class facilitates working with remote PyMilo servers.\n\n| **Parameter** | **Description** |\n| ------------- | --------------- |\n| model | The local ML model to wrap around |\n| mode | Operating mode (LOCAL or DELEGATE) |\n| compressor | Compression method from `Compression` enum |\n| server_url | URL of the PyMilo server |\n| communication_protocol | Communication protocol from `CommunicationProtocol` enum |\n\nThe `mode` parameter accepts two values: `LOCAL` to execute operations on the local model, or `DELEGATE` to delegate operations to the remote server. The `compressor` parameter accepts values from the `Compression` enum: `NULL` (no compression), `GZIP`, `ZLIB`, `LZMA`, or `BZ2`. 
The `communication_protocol` parameter accepts values from the `CommunicationProtocol` enum: `REST` or `WEBSOCKET`.\n\n| **Method** | **Description** |\n| ---------- | --------------- |\n| toggle_mode | Switch between LOCAL and DELEGATE modes |\n| register | Register the client with the remote server |\n| deregister | Deregister the client from the server |\n| register_ml_model | Register an ML model with the server |\n| deregister_ml_model | Deregister an ML model from the server |\n| upload | Upload the local model to the remote server |\n| download | Download the remote model to the local machine |\n| get_ml_models | Get all registered ML models for this client |\n| grant_access | Grant access to this client's model to another client |\n| revoke_access | Revoke access previously granted to another client |\n| get_allowance | Get clients who have access to this client's models |\n| get_allowed_models | Get models this client is allowed to access from another client |\n\nℹ️ If you've deployed `PymiloServer` locally (on port `8000` for instance), then `SERVER_URL` would be `http://127.0.0.1:8000` or `ws://127.0.0.1:8000`, depending on the selected communication protocol.\n\nYou can also download the remote ML model to your local machine and execute functions locally on your model.\n\nCalling the `download` function on `PymiloClient` syncs the local model that `PymiloClient` wraps with the remote ML model; it does not save the model directly to a file.\n\n```python\npymilo_client.download()\n```\nIf you want to save the ML model to a local file, you can use the `Export` class.\n```python\nfrom pymilo import Export\nExport(pymilo_client.model).save(\"model.json\")\n```\nNow that you've synced the remote model with your local model, you can run functions.\n```python\npymilo_client.toggle_mode(mode=PymiloClient.Mode.LOCAL)\nresult = pymilo_client.predict(x_test)\n```\n`PymiloClient` wraps around the ML model, either to the local ML model or the remote ML 
model, and you can work with `PymiloClient` exactly as you would with the ML model itself, calling the same functions with the same signatures.\n\nℹ️ Use the `toggle_mode` function to specify whether `PymiloClient` applies requests to the local ML model (`pymilo_client.toggle_mode(mode=PymiloClient.Mode.LOCAL)`) or delegates them to the remote server (`pymilo_client.toggle_mode(mode=PymiloClient.Mode.DELEGATE)`).\n\n\n## Supported ML models\n| scikit-learn | PyTorch | \n| ---------------- | ---------------- | \n| Linear Models &#x2705; | - | \n| Neural Networks &#x2705; | -  | \n| Trees &#x2705; | -  | \n| Clustering &#x2705; | -  | \n| Naïve Bayes &#x2705; | -  | \n| Support Vector Machines (SVMs) &#x2705; | -  | \n| Nearest Neighbors &#x2705; | -  |  \n| Ensemble Models &#x2705; | - | \n| Pipeline Model &#x2705; | - |\n| Preprocessing Models &#x2705; | - |\n| Cross Decomposition Models &#x2705; | - |\n| Feature Extractor Models &#x2705; | - |\n| Composite Models &#x2705; | - |\n\n\nDetails are available in [Supported Models](https://github.com/openscilab/pymilo/blob/main/SUPPORTED_MODELS.md).\n\n## Issues & bug reports\n\nJust file an issue and describe it, or send an email to [pymilo@openscilab.com](mailto:pymilo@openscilab.com \"pymilo@openscilab.com\"). We'll check it ASAP!\n\n- Please complete the issue template\n \nYou can also join our Discord server\n\n<a href=\"https://discord.gg/mtuMS8AjDS\">\n  <img src=\"https://img.shields.io/discord/1064533716615049236.svg?style=for-the-badge\" alt=\"Discord Channel\">\n</a>\n\n## Contributing\n\nWe welcome contributions! Please read our **[Contributing Guidelines](.github/CONTRIBUTING.md)** before submitting any changes.\n\n## Acknowledgments\n\n[Python Software Foundation (PSF)](https://www.python.org/psf/) grants PyMilo library partially for versions **1.0, 1.1**. [PSF](https://www.python.org/psf/) is the organization behind Python. 
Their mission is to promote, protect, and advance the Python programming language and to support and facilitate the growth of a diverse and international community of Python programmers.\n\n<a href=\"https://www.python.org/psf/\"><img src=\"https://github.com/openscilab/pymilo/raw/main/otherfiles/psf.png\" height=\"65px\" alt=\"Python Software Foundation\"></a>\n\n[Trelis Research](https://trelis.com/) grants PyMilo library partially for version **1.0**. [Trelis Research](https://trelis.com/) provides tools and tutorials for businesses and developers looking to fine-tune and deploy large language models.\n\n<a href=\"https://trelis.com/\"><img src=\"https://trelis.com/wp-content/uploads/2023/10/android-chrome-512x512-1.png\" height=\"75px\" alt=\"Trelis Research\"></a>\n\n## Cite\n\nIf you use PyMilo in your research, we would appreciate citations to the following paper:\n\n[Rostami, A., Haghighi, S., Sabouri, S. and Zolanvari, A., 2025. PyMilo: A Python Library for ML I/O. *Journal of Open Source Software*, 10(116), p.8858.](https://joss.theoj.org/papers/10.21105/joss.08858)\n\n```bibtex\n@article{Rostami2025,\n  doi = {10.21105/joss.08858},\n  url = {https://doi.org/10.21105/joss.08858},\n  year = {2025},\n  publisher = {The Open Journal},\n  volume = {10},\n  number = {116},\n  pages = {8858},\n  author = {Rostami, AmirHosein and Haghighi, Sepand and Sabouri, Sadra and Zolanvari, Alireza},\n  title = {PyMilo: A Python Library for ML I/O},\n  journal = {Journal of Open Source Software}\n}\n```\n\nDownload [PyMilo.bib](https://raw.githubusercontent.com/openscilab/pymilo/main/paper/PyMilo.bib)\n\n<a href=\"https://doi.org/10.21105/joss.08858\">\n  <img src=\"https://joss.theoj.org/papers/10.21105/joss.08858/status.svg\" alt=\"JOSS DOI: 10.21105/joss.08858\">\n</a>\n\n## Show your support\n\n\n### Star this repo\n\nGive a ⭐️ if this project helped you!\n\n### Donate to our project\nIf you do like our project and we hope that you do, can you please support us? 
Our project is not and is never going to be working for profit. We need the money just so we can continue doing what we do ;-) .\t\t\t\n\n<a href=\"https://openscilab.com/#donation\" target=\"_blank\"><img src=\"https://github.com/openscilab/pymilo/raw/main/otherfiles/donation.png\" height=\"90px\" width=\"270px\" alt=\"PyMilo Donation\"></a>\n"
  },
  {
    "path": "SECURITY.md",
    "content": "# Security policy\n\n## Supported versions\n\n| Version       | Supported          |\n| ------------- | ------------------ |\n| 1.6           | :white_check_mark: |\n| < 1.6         | :x:                |\n\n## Reporting a vulnerability\n\nPlease report security vulnerabilities by email to [pymilo@openscilab.com](mailto:pymilo@openscilab.com \"pymilo@openscilab.com\").\n\nIf the security vulnerability is accepted, a dedicated bugfix release will be issued as soon as possible (depending on the complexity of the fix)."
  },
  {
    "path": "SUPPORTED_MODELS.md",
    "content": "# Supported Models\n\n**Last Update: 2025-12-30**\n\n\n<h2 id=\"scikit-learn\">Scikit-Learn</h2> \n<h3 id=\"scikit-learn-linear\">Linear Models</h3>\n\n📚 <a href=\"https://scikit-learn.org/stable/modules/linear_model.html\" target=\"_blank\"><b>Models Document</b></a>\n\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n\t\t<th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>Linear Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>Ridge Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>3</td>\n\t\t<td><b>Ridge Classifier</b></td>\n        <td>>=0.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>4</td>\n\t\t<td><b>Ridge Regressor CV</b></td>\n        <td>>=0.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>5</td>\n\t\t<td><b>Ridge Classifier CV</b></td>\n        <td>>=0.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>6</td>\n\t\t<td><b>Lasso</b></td>\n        <td>>=0.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>7</td>\n\t\t<td><b>Lasso CV</b></td>\n        <td>>=0.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>8</td>\n\t\t<td><b>Lasso Lars</b></td>\n        <td>>=0.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>9</td>\n\t\t<td><b>Lasso Lars CV</b></td>\n        <td>>=0.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>10</td>\n\t\t<td><b>Lasso Lars IC</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>11</td>\n\t\t<td><b>Multi Task Lasso</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>12</td>\n\t\t<td><b>Multi Task Lasso CV</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>13</td>\n\t\t<td><b>Elastic Net</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>14</td>\n\t\t<td><b>Elastic Net CV</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr 
align=\"center\">\n\t\t<td>15</td>\n\t\t<td><b>Multi Task Elastic Net</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>16</td>\n\t\t<td><b>Multi Task Elastic Net CV</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>17</td>\n\t\t<td><b>Orthogonal Matching Pursuit</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>18</td>\n\t\t<td><b>Orthogonal Matching Pursuit CV</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>19</td>\t\n\t\t<td><b>Bayesian Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>20</td>\n\t\t<td><b>Automatic Relevance Determination Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>21</td>\n\t\t<td><b>Logistic Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>22</td>\n\t\t<td><b>Logistic Regressor CV</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>23</td>\n\t\t<td><b>Tweedie Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>24</td>\n\t\t<td><b>Poisson Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>25</td>\n\t\t<td><b>Gamma Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>26</td>\n\t\t<td><b>SGD Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>27</td>\n\t\t<td><b>SGD Classifier</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>28</td>\n\t\t<td><b>SGD oneclass SVM</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>29</td>\n\t\t<td><b>Perceptron</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>30</td>\n\t\t<td><b>Passive Aggressive Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>31</td>\n\t\t<td><b>Passive Aggressive 
Classifier</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>32</td>\n\t\t<td><b>OMP</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>33</td>\n\t\t<td><b>OMP CV</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>34</td>\n\t\t<td><b>Ransac Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>35</td>\n\t\t<td><b>Theil Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>36</td>\n\t\t<td><b>Huber Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>\n    <tr align=\"center\">\n\t\t<td>37</td>\n\t\t<td><b>Quantile Regressor</b></td>\n        <td>>=0.1</td>\n\t</tr>   \n</table>\n\n<h3 id=\"scikit-learn-nn\">Neural Networks</h3>\n\n📚 <a href=\"https://scikit-learn.org/stable/modules/neural_networks_supervised.html\" target=\"_blank\"><b>Models Document</b></a>\n\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n        <th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>Multi Layer Perceptron Regression</b></td>\n        <td>>=0.2</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>Multi Layer Perceptron Classifier</b></td>\n        <td>>=0.2</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>3</td>\n\t\t<td><b>Bernoulli RBM</b></td>\n        <td>>=0.2</td>\n\t</tr>\n</table>\n\n<h3 id=\"scikit-learn-trees\">Decision Trees</h3> \n\n📚 <a href=\"https://scikit-learn.org/stable/modules/tree.html\" target=\"_blank\"><b>Models Document</b></a>\n\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n        <th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>Decision Tree Regressor</b></td>\n        <td>>=0.3</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>Decision Tree Classifier</b></td>\n        <td>>=0.3</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>3</td>\n\t\t<td><b>Extra 
Tree Regressor</b></td>\n        <td>>=0.3</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>4</td>\n\t\t<td><b>Extra Tree Classifier</b></td>\n        <td>>=0.3</td>\n\t</tr>\n</table>\n\n<h3 id=\"scikit-learn-clustering\">Clustering Models</h3>\n\n📚 <a href=\"https://scikit-learn.org/stable/modules/clustering.html\" target=\"_blank\"><b>Models Document</b></a>\n\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n\t\t<th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>Kmeans</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>Bisecting Kmeans</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>3</td>\n\t\t<td><b>Mini Batch KMeans</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>4</td>\n\t\t<td><b>Affinity Propagation</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>5</td>\n\t\t<td><b>Mean Shift</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>6</td>\n\t\t<td><b>Spectral Clustering</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>7</td>\n\t\t<td><b>Spectral Biclustering</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>8</td>\n\t\t<td><b>Spectral Coclustering</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>9</td>\n\t\t<td><b>Agglomerative Clustering</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>10</td>\n\t\t<td><b>Feature Agglomeration</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>11</td>\n\t\t<td><b>DBScan</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>12</td>\n\t\t<td><b>HDBScan</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>13</td>\n\t\t<td><b>Optics</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>14</td>\n\t\t<td><b>Gaussian 
Mixture</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>15</td>\n\t\t<td><b>Bayesian Gaussian Mixture</b></td>\n        <td>>=0.4</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>16</td>\n\t\t<td><b>Birch</b></td>\n        <td>>=0.4</td>\n\t</tr>\n</table>\n\n<h3 id=\"scikit-learn-naivebayes\">Naive Bayes</h3> \n\n📚 <a href=\"https://scikit-learn.org/stable/modules/naive_bayes.html\" target=\"_blank\"><b>Models Document</b></a>\n\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n        <th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>Gaussian Naive Bayes</b></td>\n        <td>>=0.5</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>Multinomial Naive Bayes</b></td>\n        <td>>=0.5</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>3</td>\n\t\t<td><b>Bernoulli Naive Bayes</b></td>\n        <td>>=0.5</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>4</td>\n\t\t<td><b>Complement Naive Bayes</b></td>\n        <td>>=0.5</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>5</td>\n\t\t<td><b>Categorical Naive Bayes</b></td>\n        <td>>=0.5</td>\n\t</tr>\n</table>\n\n<h3 id=\"scikit-learn-svm\">Support Vector Machine</h3> \n\n📚 <a href=\"https://scikit-learn.org/stable/modules/svm.html\" target=\"_blank\"><b>Models Document</b></a>\n\n\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n        <th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>Linear SVC</b></td>\n        <td>>=0.6</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>Linear SVR</b></td>\n        <td>>=0.6</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>3</td>\n\t\t<td><b>NuSVC</b></td>\n        <td>>=0.6</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>4</td>\n\t\t<td><b>NuSVR</b></td>\n        <td>>=0.6</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>5</td>\n\t\t<td><b>One Class SVM</b></td>\n        
<td>>=0.6</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>6</td>\n\t\t<td><b>SVC</b></td>\n        <td>>=0.6</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>7</td>\n\t\t<td><b>SVR</b></td>\n        <td>>=0.6</td>\n\t</tr>\n</table>\n\n\n<h3 id=\"scikit-learn-neighbors\">Neighbors</h3> \n\n📚 <a href=\"https://scikit-learn.org/stable/modules/neighbors.html\" target=\"_blank\"><b>Models Document</b></a>\n\n\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n        <th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>KNeighborsClassifier</b></td>\n        <td>>=0.7</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>KNeighborsRegressor</b></td>\n        <td>>=0.7</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>3</td>\n\t\t<td><b>NearestNeighbors</b></td>\n        <td>>=0.7</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>4</td>\n\t\t<td><b>NearestCentroid</b></td>\n        <td>>=0.7</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>5</td>\n\t\t<td><b>RadiusNeighborsClassifier</b></td>\n        <td>>=0.7</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>6</td>\n\t\t<td><b>RadiusNeighborsRegressor</b></td>\n        <td>>=0.7</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>7</td>\n\t\t<td><b>LocalOutlierFactor</b></td>\n        <td>>=0.7</td>\n\t</tr>\n</table>\n\n<h3 id=\"scikit-learn-ensemble\">Ensemble</h3> \n📚 <a href=\"https://scikit-learn.org/stable/modules/ensemble.html\" target=\"_blank\"><b>Models Document</b></a>\n\n\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n        <th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>AdaboostClassifier</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>AdaboostRegressor</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>3</td>\n\t\t<td><b>BaggingClassifier</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr 
align=\"center\">\n\t\t<td>4</td>\n\t\t<td><b>BaggingRegressor</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>5</td>\n\t\t<td><b>ExtraTreesClassifier</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>6</td>\n\t\t<td><b>ExtraTreesRegressor</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>7</td>\n\t\t<td><b>GradientBoosterClassifier</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>8</td>\n\t\t<td><b>GradientBoosterRegressor</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>9</td>\n\t\t<td><b>HistGradientBoostingClassifier</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>10</td>\n\t\t<td><b>HistGradientBoostingRegressor</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>11</td>\n\t\t<td><b>RandomForestClassifier</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>12</td>\n\t\t<td><b>RandomForestRegressor</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>13</td>\n\t\t<td><b>StackingClassifier</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>14</td>\n\t\t<td><b>StackingRegressor</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>15</td>\n\t\t<td><b>VotingClassifier</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>16</td>\n\t\t<td><b>VotingRegressor</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>17</td>\n\t\t<td><b>IsolationForest</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>18</td>\n\t\t<td><b>RandomTreesEmbedding</b></td>\n        <td>>=0.8</td>\n\t</tr>\n</table>\n\n<h3 id=\"scikit-learn-pipeline\">Pipeline</h3> \n📚 <a href=\"https://scikit-learn.org/stable/modules/compose.html#pipeline-chaining-estimators\" target=\"_blank\"><b>Models Document</b></a>\n\n\n<table>\n\t<tr 
align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n        <th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>Pipeline</b></td>\n        <td>>=0.8</td>\n\t</tr>\n</table>\n\n<h3 id=\"scikit-learn-preprocessing\">Preprocessing Modules</h3> \n📚 <a href=\"https://scikit-learn.org/stable/modules/classes.html#module-sklearn.preprocessing\" target=\"_blank\"><b>Models Document</b></a>\n\n\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n        <th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>OneHotEncoder</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>LabelBinarizer</b></td>\n        <td>>=0.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>3</td>\n\t\t<td><b>LabelEncoder</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>4</td>\n\t\t<td><b>StandardScaler</b></td>\n        <td>>=0.8</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>5</td>\n\t\t<td><b>Binarizer</b></td>\n        <td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>6</td>\n\t\t<td><b>FunctionTransformer</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>7</td>\n\t\t<td><b>KernelCenterer</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>8</td>\n\t\t<td><b>MultiLabelBinarizer</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>9</td>\n\t\t<td><b>MaxAbsScaler</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>10</td>\n\t\t<td><b>Normalizer</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>11</td>\n\t\t<td><b>OrdinalEncoder</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>12</td>\n\t\t<td><b>PolynomialFeatures</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>13</td>\n\t\t<td><b>RobustScaler</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr 
align=\"center\">\n\t\t<td>14</td>\n\t\t<td><b>QuantileTransformer</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>15</td>\n\t\t<td><b>KBinsDiscretizer</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>16</td>\n\t\t<td><b>PowerTransformer</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>17</td>\n\t\t<td><b>SplineTransformer</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>18</td>\n\t\t<td><b>TargetEncoder</b></td>\n\t\t<td>>=0.9</td>\n\t</tr>\n\n</table>\n\n\n<h3 id=\"scikit-learn-cross-decomposition\">Cross Decomposition Modules</h3> \n📚 <a href=\"https://scikit-learn.org/stable/api/sklearn.cross_decomposition.html\" target=\"_blank\"><b>Models Document</b></a>\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n        <th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>PLSRegression</b></td>\n\t\t<td>>=1.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>PLSCanonical</b></td>\n\t\t<td>>=1.1</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>3</td>\n\t\t<td><b>CCA</b></td>\n\t\t<td>>=1.1</td>\n\t</tr>\n</table>\n\n<h3 id=\"scikit-learn-feature-extraction\">Feature Extraction Modules</h3> \n📚 <a href=\"https://scikit-learn.org/stable/api/sklearn.feature_extraction.html\" target=\"_blank\"><b>Models Document</b></a>\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n        <th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>DictVectorizer</b></td>\n\t\t<td>>=1.3</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>FeatureHasher</b></td>\n\t\t<td>>=1.3</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>3</td>\n\t\t<td><b>PatchExtractor</b></td>\n\t\t<td>>=1.3</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>4</td>\n\t\t<td><b>CountVectorizer</b></td>\n\t\t<td>>=1.3</td>\n\t</tr>\n\t<tr 
align=\"center\">\n\t\t<td>5</td>\n\t\t<td><b>HashingVectorizer</b></td>\n\t\t<td>>=1.3</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>6</td>\n\t\t<td><b>TfidfTransformer</b></td>\n\t\t<td>>=1.3</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>7</td>\n\t\t<td><b>TfidfVectorizer</b></td>\n\t\t<td>>=1.3</td>\n\t</tr>\n</table>\n\n<h3 id=\"scikit-learn-compose\">Composite Modules</h3> \n📚 <a href=\"https://scikit-learn.org/stable/api/sklearn.compose.html\" target=\"_blank\"><b>Models Document</b></a>\n<table>\n\t<tr align=\"center\">\n\t\t<th>ID</th>\n\t\t<th>Model Name</th>\n        <th>PyMilo Version</th>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>1</td>\n\t\t<td><b>ColumnTransformer</b></td>\n\t\t<td>>=1.5</td>\n\t</tr>\n\t<tr align=\"center\">\n\t\t<td>2</td>\n\t\t<td><b>TransformedTargetRegressor</b></td>\n\t\t<td>>=1.5</td>\n\t</tr>\n</table>\n"
  },
  {
    "path": "autopep8.bat",
    "content": "python -m autopep8 pymilo --recursive --aggressive --aggressive --in-place --pep8-passes 2000 --max-line-length 120 --verbose --ignore=E721\npython -m autopep8 otherfiles --recursive --aggressive --aggressive --in-place --pep8-passes 2000 --max-line-length 120 --verbose --ignore=E721\npython -m autopep8 setup.py --recursive --aggressive --aggressive --in-place --pep8-passes 2000 --max-line-length 120 --verbose\n"
  },
  {
    "path": "autopep8.sh",
    "content": "#!/bin/sh\npython -m autopep8 pymilo --recursive --aggressive --aggressive --in-place --pep8-passes 2000 --max-line-length 120 --verbose --ignore=E721\npython -m autopep8 otherfiles --recursive --aggressive --aggressive --in-place --pep8-passes 2000 --max-line-length 120 --verbose --ignore=E721\npython -m autopep8 setup.py --recursive --aggressive --aggressive --in-place --pep8-passes 2000 --max-line-length 120 --verbose\n"
  },
  {
    "path": "codecov.yml",
    "content": "codecov:\n  require_ci_to_pass: yes\n\ncoverage:\n  precision: 2\n  round: up\n  range: \"70...100\"\n  status:\n    patch:\n      default:\n        enabled: no\n    project:\n      default:\n        threshold: 1%\n"
  },
  {
    "path": "dev-requirements.txt",
    "content": "numpy==2.2.4\nscikit-learn==1.6.1\nscipy>=0.19.1\nuvicorn==0.39.0\nfastapi==0.128.8\nrequests==2.32.5\nwebsockets==15.0.1\npydantic==2.12.5\nsetuptools>=40.8.0\nvulture>=1.0\nbandit>=1.5.1\npydocstyle>=3.0.0\npytest>=4.3.1\npytest-cov>=2.6.1\nPillow>=8.4.0"
  },
  {
    "path": "otherfiles/RELEASE.md",
    "content": "# PyMilo Release Instructions\n\n#### Last Update: 2024-04-24\n\n1. Create the `release` branch under `dev`\n2. Update all version tags\n\t1. `setup.py`\n\t2. `README.md`\n\t3. `SECURITY.md`\n\t4. `otherfiles/version_check.py`\n\t5. `otherfiles/meta.yaml`\n\t6. `pymilo/pymilo_param.py`\n3. Update `CHANGELOG.md`\n\t1. Add a new header under `Unreleased` section (Example: `## [0.1] - 2022-08-17`)\n\t2. Add a new compare link to the end of the file (Example: `[0.2]: https://github.com/openscilab/pymilo/compare/v0.1...v0.2`)\n\t3. Update `dev` compare link (Example: `[Unreleased]: https://github.com/openscilab/pymilo/compare/v0.2...dev`)\n4. Update `.github/ISSUE_TEMPLATE/bug_report.yml`\n   1. Add new version tag to `PyMilo version` dropbox options\n5. Create a PR from `release` to `dev`\n\t1. Title: `Version x.x` (Example: `Version 0.1`)\n\t2. Tag all related issues\n\t3. Labels: `release`\n\t4. Set milestone\n\t5. Wait for all CI pass\n\t6. Need review (**2** reviewers)\n\t7. Squash and merge\n\t8. Delete `release` branch\n6. Merge `dev` branch into `main`\n\t1. `git checkout main`\n\t2. `git merge dev`\n\t3. `git push origin main`\n\t4. Wait for all CI pass\n7. Create a new release\n\t1. Target branch: `main`\n\t2. Tag: `vx.x` (Example: `v0.1`)\n\t3. Title: `Version x.x` (Example: `Version 0.1`)\n\t4. Copy changelogs\n\t5. Tag all related issues\n8. Bump!!\n9. Close this version issues\n10. Close milestone"
  },
  {
    "path": "otherfiles/meta.yaml",
    "content": "{% set name = \"pymilo\" %}\n{% set version = \"1.6\" %}\n\npackage:\n    name: {{ name|lower }}\n    version: {{ version }}\nsource:\n    git_url: https://github.com/openscilab/pymilo\n    git_rev: v{{ version }}\nbuild:\n    noarch: python\n    number: 0\n    script: {{ PYTHON }} -m pip install . -vv\nrequirements:\n    host:\n        - pip\n        - setuptools\n        - python >=3.7\n    run:\n        - python >=3.7\n        - numpy >=1.9.0\n        - scikit-learn >=0.22.2\n        - scipy >=0.19.1\n        - requests>=2.0.0\n        - uvicorn>=0.14.0\n        - fastapi>=0.68.0\n        - pydantic>=1.5.0\n        - websockets>=9.0\n\nabout:\n    home: https://github.com/openscilab/pymilo\n    license: MIT\n    license_family: MIT\n    summary: Python library for machine learning input and output\n    description: |\n        Pymilo is an open source Python package that provides a simple, efficient, and\n        safe way for users to export pre-trained machine learning models in a transparent way.\n        By this, the exported model can be used in other environments, transferred across different platforms,\n        and shared with others. Pymilo allows the users to export the models that are\n        trained using popular Python libraries like scikit-learn, and then use them in deployment environments,\n        or share them without exposing the underlying code or dependencies.\n        The transparency of the exported models ensures reliability and safety for the end users,\n        as it eliminates the risks of binary or pickle formats.\n        \n        Website: https://openscilab.com\n\n        Repo: https://github.com/openscilab/pymilo\nextra:\n    recipe-maintainers:\n        - AHReccese\n"
  },
  {
    "path": "otherfiles/requirements-splitter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"Requirements splitter.\"\"\"\n\ntest_req = \"\"\n\nwith open('dev-requirements.txt', 'r') as f:\n    for line in f:\n        if '==' not in line:\n            test_req += line\n\nwith open('test-requirements.txt', 'w') as f:\n    f.write(test_req)\n"
  },
  {
    "path": "otherfiles/version_check.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"Version-check script.\"\"\"\nimport os\nimport sys\nimport codecs\nFailed = 0\nPYMILO_VERSION = \"1.6\"\n\n\nSETUP_ITEMS = [\n    \"version='{0}'\",\n    'https://github.com/openscilab/pymilo/tarball/v{0}']\nREADME_ITEMS = [\n    \"[Version {0}](https://github.com/openscilab/pymilo/archive/v{0}.zip)\",\n    \"pip install pymilo=={0}\"]\nCHANGELOG_ITEMS = [\n    \"## [{0}]\",\n    \"https://github.com/openscilab/pymilo/compare/v{0}...dev\",\n    \"[{0}]:\"]\nPARAMS_ITEMS = ['PYMILO_VERSION = \"{0}\"']\nMETA_ITEMS = ['% set version = \"{0}\" %']\nISSUE_TEMPLATE_ITEMS = [\"- PyMilo {0}\"]\nSECURITY_ITEMS = [\"| {0}           | :white_check_mark: |\", \"| < {0}         | :x:                |\"]\n\nFILES = {\n    os.path.join(\"otherfiles\", \"meta.yaml\"): META_ITEMS,\n    \"setup.py\": SETUP_ITEMS,\n    \"README.md\": README_ITEMS,\n    \"CHANGELOG.md\": CHANGELOG_ITEMS,\n    \"SECURITY.md\": SECURITY_ITEMS,\n    os.path.join(\"pymilo\", \"pymilo_param.py\"): PARAMS_ITEMS,\n    os.path.join(\".github\", \"ISSUE_TEMPLATE\", \"bug_report.yml\"): ISSUE_TEMPLATE_ITEMS,\n}\n\nTEST_NUMBER = len(FILES)\n\n\ndef print_result(failed=False):\n    \"\"\"\n    Print final result.\n\n    :param failed: failed flag\n    :type failed: bool\n    :return: None\n    \"\"\"\n    message = \"Version tag tests \"\n    if not failed:\n        print(\"\\n\" + message + \"passed!\")\n    else:\n        print(\"\\n\" + message + \"failed!\")\n    print(\"Passed : \" + str(TEST_NUMBER - Failed) + \"/\" + str(TEST_NUMBER))\n\n\nif __name__ == \"__main__\":\n    for file_name in FILES:\n        try:\n            file_content = codecs.open(\n                file_name, \"r\", \"utf-8\", 'ignore').read()\n            for test_item in FILES[file_name]:\n                if file_content.find(test_item.format(PYMILO_VERSION)) == -1:\n                    print(\"Incorrect version tag in \" + file_name)\n                    Failed += 1\n                    
break\n        except Exception as e:\n            Failed += 1\n            print(\"Error in \" + file_name + \"\\n\" + \"Message : \" + str(e))\n\n    if Failed == 0:\n        print_result(False)\n        sys.exit(0)\n    else:\n        print_result(True)\n        sys.exit(1)\n"
  },
  {
    "path": "paper/.gitignore",
    "content": "/refs"
  },
  {
    "path": "paper/PyMilo.bib",
    "content": "@article{Rostami2025,\n  doi = {10.21105/joss.08858},\n  url = {https://doi.org/10.21105/joss.08858},\n  year = {2025},\n  publisher = {The Open Journal},\n  volume = {10},\n  number = {116},\n  pages = {8858},\n  author = {Rostami, AmirHosein and Haghighi, Sepand and Sabouri, Sadra and Zolanvari, Alireza},\n  title = {PyMilo: A Python Library for ML I/O},\n  journal = {Journal of Open Source Software}\n}\n"
  },
  {
    "path": "paper/paper.bib",
    "content": "@article{Raschka2020,\n  author    = {Sebastian Raschka and Joshua Patterson and Corey Nolet},\n  title     = {Machine Learning in {P}ython: Main Developments and Technology Trends in Data Science, Machine Learning, and Artificial Intelligence},\n  journal   = {Information},\n  volume    = {11},\n  number    = {4},\n  pages     = {193},\n  year      = {2020},\n  doi       = {10.3390/info11040193}\n}\n\n@inproceedings{parida2025model,\n  author={Parida, Shreyas Kumar and Gerostathopoulos, Ilias and Bogner, Justus},\n  booktitle={2025 IEEE/ACM 4th International Conference on AI Engineering – Software Engineering for AI (CAIN)}, \n  title={How Do Model Export Formats Impact the Development of {ML}-Enabled Systems? A Case Study on Model Integration}, \n  year={2025},\n  volume={},\n  number={},\n  pages={48-59},\n  doi={10.1109/CAIN66642.2025.00014}\n}\n\n@inproceedings{davis2023reusing,\n  title={Reusing Deep Learning Models: Challenges and Directions in Software Engineering},\n  author={Davis, James C and Jajal, Purvish and Jiang, Wenxin and Schorlemmer, Taylor R and Synovic, Nicholas and Thiruvathukal, George K},\n  booktitle={2023 IEEE John Vincent Atanasoff International Symposium on Modern Computing (JVA)},\n  pages={17--30},\n  year={2023},\n  organization={IEEE},\n  doi={10.1109/JVA60410.2023.00015}\n}\n\n@article{Garbin2022,\n  author    = {Cristina Garbin and Osvaldo Marques},\n  title     = {Assessing Methods and Tools to Improve Reporting, Increase Transparency, and Reduce Failures in Machine Learning Applications in Health Care},\n  journal   = {Radiology: Artificial Intelligence},\n  volume    = {4},\n  number    = {2},\n  pages     = {e210127},\n  year      = {2022},\n  doi       = {10.1148/ryai.210127},\n}\n\n@article{bodimani2024assessing, \n  title={Assessing The Impact of Transparent {AI} Systems in Enhancing User Trust and Privacy}, \n  volume={5}, \n  url={https://thesciencebrigade.com/jst/article/view/68}, \n  number={1}, \n  
journal={Journal of Science \\& Technology},\n  author={Bodimani, Meghasai},\n  year={2024},\n  month={Feb.},\n  pages={50--67},\n}\n\n@misc{Brownlee2018,\n  author    = {Jason Brownlee},\n  title     = {Save and Load Machine Learning Models in {P}ython with scikit-learn},\n  howpublished = {\\url{https://machinelearningmastery.com/save-load-machine-learning-models-python-scikit-learn/}},\n  year      = {2018},\n  note      = {Accessed: 2024-05-22}\n}\n\n@misc{PythonPickleDocs,\n  author       = {{Python Software Foundation}},\n  title        = {{p}ickle --- {P}ython object serialization},\n  year         = {2024},\n  howpublished = {\\url{https://docs.python.org/3/library/pickle.html#security}},\n}\n\n@software{onnx,\n  author       = {Bai, Junjie and Lu, Fang and Zhang, Ke and others},\n  title        = {{ONNX} ({O}pen {N}eural {N}etwork {E}xchange)},\n  url          = {https://github.com/onnx/onnx},\n  version      = {1.18.0},\n  date         = {2025-05-12},\n}\n\n@article{pmml,\n  title={{PMML}: An Open Standard for Sharing Models},\n  author={Guazzelli, Alex and Zeller, Michael and Lin, Wen-Ching and Williams, Graham},\n  year={2009},\n  doi={10.32614/RJ-2009-010}\n}\n\n@article{jajal2023analysis,\n  title={Analysis of Failures and Risks in Deep Learning Model Converters: A Case Study in the {ONNX} Ecosystem},\n  author={Jajal, Purvish and Jiang, Wenxin and Tewari, Arav and Kocinare, Erik and Woo, Joseph and Sarraf, Anusha and Lu, Yung-Hsiang and Thiruvathukal, George K and Davis, James C},\n  journal={arXiv preprint arXiv:2303.17708},\n  year={2023},\n  doi={10.48550/arXiv.2303.17708}\n}\n\n@inproceedings{cody2024extending,\n  title={On Extending the {A}utomatic {T}est {M}arkup {L}anguage ({ATML}) for Machine Learning},\n  author={Cody, Tyler and Li, Bingtong and Beling, Peter},\n  booktitle={2024 IEEE International Systems Conference (SysCon)},\n  pages={1--8},\n  year={2024},\n  organization={IEEE},\n  
doi={10.1109/SysCon61195.2024.10553464}\n}\n\n@software{skops,\n  author       = {{skops-dev}},\n  title        = {{SKOPS}},\n  url          = {https://github.com/skops-dev/skops},\n  version      = {0.11.0},\n  date         = {2024-12-10},\n}\n\n@article{tfjs2019,\n  title={Tensor{F}low.js: Machine Learning for the Web and Beyond},\n  author={Smilkov, Daniel and Thorat, Nikhil and Assogba, Yannick and Nicholson, Charles and Kreeger, Nick and Yu, Ping and Cai, Shanqing and Nielsen, Eric and Soergel, David and Bileschi, Stan and others},\n  journal={Proceedings of Machine Learning and Systems},\n  volume={1},\n  pages={309--321},\n  year={2019},\n  doi={10.48550/arXiv.1901.05350}\n}\n\n@inproceedings{quan2022towards,\n  title={Towards Understanding the Faults of {J}ava{S}cript-Based Deep Learning Systems},\n  author={Quan, Lili and Guo, Qianyu and Xie, Xiaofei and Chen, Sen and Li, Xiaohong and Liu, Yang},\n  booktitle={Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering},\n  pages={1--13},\n  year={2022},\n  doi={10.1145/3551349.3560427}\n}\n\n@misc{NerdCorner2025,\n  author    = {{Nerd Corner}},\n  title     = {Tensor{F}low.js vs {T}ensor{F}low ({P}ython)},\n  year      = {2025},\n  month     = {Mar},\n  howpublished = {\\url{https://nerd-corner.com/tensorflow-js-vs-tensorflow-python/}}\n}\n\n@inproceedings{rauker2023toward,\n  title={Toward Transparent {AI}: A Survey on Interpreting the Inner Structures of Deep Neural Networks},\n  author={R{\\\"a}uker, Tilman and Ho, Anson and Casper, Stephen and Hadfield-Menell, Dylan},\n  booktitle={2023 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML)},\n  pages={464--483},\n  year={2023},\n  organization={IEEE},\n  doi={10.1109/SaTML54575.2023.00039}\n}\n\n@article{macrae2019governing,\n  title={Governing the Safety of Artificial Intelligence in Healthcare},\n  author={Macrae, Carl},\n  journal={BMJ Quality \\& Safety},\n  volume={28},\n  number={6},\n  
pages={495--498},\n  year={2019},\n  publisher={BMJ Publishing Group Ltd},\n  doi={10.1136/bmjqs-2019-009484}\n}\n\n@software{huggingface_safetensors_2022,\n  author       = {{Hugging Face}},\n  title        = {Safetensors - {ML} Safer for All},\n  year         = {2025},\n  version      = {0.6.2},\n  url          = {https://github.com/huggingface/safetensors},\n}\n"
  },
  {
    "path": "paper/paper.md",
    "content": "---\ntitle: 'PyMilo: A Python Library for ML I/O'\ntags:\n  - Machine Learning\n  - Model Deployment\n  - Model Serialization\n  - Transparency\n  - MLOPS\nauthors:\n  - name: AmirHosein Rostami\n    orcid: 0009-0000-0638-2263\n    corresponding: true\n    affiliation: \"1, 2\"\n  - name: Sepand Haghighi\n    orcid: 0000-0001-9450-2375\n    corresponding: false\n    affiliation: 1\n  - name: Sadra Sabouri\n    orcid: 0000-0003-1047-2346\n    corresponding: false\n    affiliation: \"1, 3\"\n  - name: Alireza Zolanvari\n    orcid: 0000-0003-2367-8343\n    corresponding: false\n    affiliation: 1\naffiliations:\n - index: 1\n   name: Open Science Lab\n - index: 2\n   name: University of Toronto, Toronto, Canada\n   ror: 03dbr7087\n - index: 3\n   name: University of Southern California, Los Angeles, United States\n   ror: 03taz7m60\ndate: 24 June 2025\nbibliography: paper.bib\n---\n\n# Summary\nPyMilo is an open-source Python package that addresses the limitations of existing machine learning (ML) model storage formats by providing a transparent, reliable, end-to-end, and safe method for exporting and deploying trained models. \nCurrent tools rely on black-box or executable formats that obscure internal model structures, making them difficult to audit, verify, or safely share. Meanwhile, tensor-centric formats such as Safetensors [@huggingface_safetensors_2022] securely store and transfer numerical tensors but do not capture the internal and structural composition of classical machine-learning models (e.g., scikit-learn pipelines), which remain PyMilo’s primary focus. \nOthers apply structural transformations during export that may degrade predictive performance and reduce the model to a limited inference-only interface. \nIn contrast, PyMilo serializes models in a transparent human-readable format that preserves end-to-end model fidelity and enables reliable, safe, and interpretable exchange. 
Here, transparent refers to the ability to inspect model internals through a human-readable structure without execution, and end-to-end fidelity denotes that a model exported and re-imported with PyMilo retains the exact same signature, functionality, parameters, and internal structure as the original, ensuring complete behavioral and structural equivalence.\nThis package is designed to make the preservation and reuse of trained ML models safer, more interpretable, and easier to manage across different stages of the ML workflow (\\autoref{fig:overall}).\n\n![PyMilo is an end-to-end, transparent, and safe solution for transporting models from machine learning frameworks to the target devices. PyMilo preserves the original model's structure while transferring, allowing it to be imported back as the exact same object in its native framework. Currently, PyMilo (v1.4) supports models built with scikit-learn. Support for PyTorch and TensorFlow is planned in upcoming releases.\\label{fig:overall}](pymilo_outlook.png)\n\n\\newpage\n\n# Statement of Need\nModern machine learning development is largely centered around the Python ecosystem, which has become a dominant platform for building and training models due to its rich libraries and community support [@Raschka2020]. \nHowever, once a model is trained, sharing or deploying it securely and transparently remains a significant challenge [@parida2025model; @davis2023reusing]. This issue is especially important in high-stakes domains such as healthcare, where ensuring model accountability and integrity is critical [@Garbin2022].\nIn such settings, any lack of clarity about a model’s internal logic or origin can reduce trust in its predictions. 
Researchers have increasingly emphasized that greater transparency in AI systems is critical for maintaining user trust and protecting privacy in machine learning applications [@bodimani2024assessing].\n\nDespite ongoing concerns around transparency and safety, the dominant approach for exchanging pretrained models remains ad hoc binary serialization, most commonly through Python’s `pickle` module or its variant `joblib`. \nThese formats allow developers to store complex model objects with minimal effort, but they were never designed with security or human interpretability in mind [@parida2025model]. In fact, loading a pickle file may execute arbitrary code contained within it, a known vulnerability that can be exploited if the file is maliciously crafted [@Brownlee2018; @PythonPickleDocs]. \nWhile these methods preserve full model fidelity within the Python ecosystem, they pose serious security risks and lack transparency, as the serialized files are opaque binary blobs that cannot be inspected without loading. \nFurthermore, compatibility is fragile because pickled models often depend on specific library versions, which may hinder long-term reproducibility [@Brownlee2018].\n\nTo improve portability across environments, several standardized model interchange formats have been developed alongside `pickle`. \nMost notably, Open Neural Network Exchange (ONNX) and Predictive Model Markup Language (PMML) convert trained models into framework-agnostic representations [@onnx; @pmml], enabling deployment in diverse systems without relying on the original training code. \nONNX uses a graph-based structure built from primitive operators (e.g., linear transforms, activations), while PMML provides an XML-based specification for traditional models like decision trees and regressions.\n\nAlthough these formats enhance security by avoiding executable serialization, they introduce compatibility and fidelity challenges. 
\nExporting complex pipelines to ONNX or PMML often leads to structural approximations, missing metadata, or unsupported components, especially for customized models [@pmml]. \nAs a result, the exported model may differ in behavior, resulting in performance degradation or loss of accuracy [@jajal2023analysis]. \n@jajal2023analysis found that models exported to ONNX can produce incorrect predictions despite successful conversion, indicating semantic inconsistencies between the original and exported versions. This reflects predictive performance degradation and highlights the risks of silent behavioral drift in deployed systems.\n\nBeyond concerns about end-to-end model preservation, ONNX and PMML also present limitations in transparency, scope, and reversibility. ONNX uses a binary protocol buffer format that is not human-readable, which limits interpretability and makes auditing difficult. \nPMML, although XML-based and readable, is verbose and narrowly scoped, supporting only a limited subset of scikit-learn models. As noted by @cody2024extending, both ONNX and PMML focus on static model specification rather than operational testing or lifecycle validation workflows. Moreover, PMML does not provide a mechanism to restore exported models into Python, making it a one-way format that limits reproducibility across ML workflows.\n\nOther tools have been developed to address specific use cases, though they remain limited in scope. For example, SKOPS improves the safety of scikit-learn model storage by enabling limited inspection of model internals without requiring code execution [@skops]. \nHowever, it supports only scikit-learn models, lacks compatibility with other frameworks, and does not provide a fully transparent or human-readable structure. \nTensorFlow.js targets JavaScript environments by converting TensorFlow or Keras models into a JSON configuration file and binary weight files for execution in the browser or Node.js [@tfjs2019]. 
\nHowever, this process has been shown to introduce compatibility issues, performance degradation, and inconsistencies in inference behavior due to backend limitations and environment-specific faults [@quan2022towards]. \nModels from other frameworks, such as scikit-learn or PyTorch, must be re-implemented or retrained in TensorFlow to be exported. \nAdditionally, running complex models in JavaScript runtimes introduces memory and performance limitations, often making the deployment of large neural networks prohibitively slow or even infeasible in browser environments [@NerdCorner2025].\n\nIn summary, current solutions force practitioners into trade-offs between security, transparency, end-to-end fidelity, and performance preservation. \nThe machine learning community still lacks a safe and transparent end-to-end model serialization framework through which users can securely share models, inspect them easily, and accurately reconstruct them for use across diverse frameworks and environments.\n\nPyMilo is proposed to address the above gaps. It is an open-source Python library that provides an end-to-end solution for exporting and importing machine learning models in a safe, non-executable, and human-readable format such as JSON. PyMilo serializes trained models into a transparent format and fully reconstructs them without structural changes, preserving their original functionality and behavior. \nThis process does not affect inference time or predictive performance, and the exported models can be imported on any target device without additional dependencies, enabling seamless execution in inference mode.
\nWhile PyMilo may import functions from widely used scientific libraries during deserialization to restore model behavior (for example, NumPy or SciPy), the JSON representation itself never contains executable code; any remaining security risk is therefore inherited from these already-trusted dependencies rather than introduced by PyMilo’s serialization mechanism.\nPyMilo benefits a wide range of stakeholders, including machine learning engineers, data scientists, and AI practitioners, by facilitating the development of more transparent and accountable AI systems. Furthermore, researchers working on transparent AI [@rauker2023toward], user privacy in ML [@bodimani2024assessing], and safe AI [@macrae2019governing] can use PyMilo as a framework that provides transparency and safety in the machine learning environment.\n\n# References\n"
  },
  {
    "path": "pymilo/__init__.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo modules.\"\"\"\nfrom .pymilo_param import PYMILO_VERSION\nfrom .pymilo_obj import Export, Import\nfrom .exceptions import PymiloException, PymiloSerializationException, PymiloDeserializationException\n\n__version__ = PYMILO_VERSION\n"
  },
  {
    "path": "pymilo/__main__.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo main.\"\"\"\nimport re\nimport argparse\nfrom art import tprint\nfrom .pymilo_param import (\n    PYMILO_VERSION,\n    URL_REGEX,\n    CLI_MORE_INFO,\n    CLI_UNKNOWN_MODEL,\n    CLI_ML_STREAMING_NOT_INSTALLED,\n)\nfrom .pymilo_func import print_supported_ml_models, pymilo_help\nfrom .pymilo_obj import Import\nfrom .utils.util import get_sklearn_class\n\nml_streaming_support = True\ntry:\n    from .streaming import PymiloServer, Compression, CommunicationProtocol\nexcept BaseException:\n    ml_streaming_support = False\n\n\ndef main():\n    \"\"\"\n    CLI main function.\n\n    :return: None\n    \"\"\"\n    parser = argparse.ArgumentParser(description='Run the Pymilo server with a specified compression method.')\n    parser.add_argument(\n        '--compression',\n        type=str,\n        choices=['NULL', 'GZIP', 'ZLIB', 'LZMA', 'BZ2'],\n        default='NULL',\n        help='Specify the compression method (NULL, GZIP, ZLIB, LZMA, or BZ2). Default is NULL.'\n    )\n    parser.add_argument(\n        '--port',\n        type=int,\n        default=8000,\n        help='Specify PyMiloServer port number',\n        metavar=\"\",\n    )\n    parser.add_argument(\n        '--protocol',\n        type=str,\n        choices=['REST', 'WEBSOCKET'],\n        default='REST',\n        help='Specify the communication protocol (REST or WEBSOCKET). 
Default is REST.'\n    )\n    parser.add_argument(\n        '--load',\n        type=str,\n        default=None,\n        help='the `load` command specifies the path to the JSON file of the previously exported ML model by PyMilo.',\n        metavar=\"\",\n    )\n    parser.add_argument(\n        '--init',\n        type=str,\n        default=None,\n        help='the `init` command specifies the ML model to initialize the PyMilo Server with.',\n        metavar=\"\",\n    )\n    parser.add_argument(\n        '--bare',\n        default=False,\n        action='store_true',\n        help='The `bare` command starts the PyMilo Server without an internal ML model.',\n    )\n    parser.add_argument('--version', action='store_true', default=False, help='PyMilo version')\n    parser.add_argument('-v', action='store_true', default=False, help='PyMilo version')\n    args = parser.parse_args()\n    if args.version or args.v:\n        print(PYMILO_VERSION)\n        return\n    if not ml_streaming_support:\n        print(CLI_ML_STREAMING_NOT_INSTALLED)\n        print(CLI_MORE_INFO)\n        tprint(\"PyMilo\")\n        tprint(\"V:\" + PYMILO_VERSION)\n        pymilo_help()\n        parser.print_help()\n        return\n    run_ps = False\n    _model = None\n    _port = args.port\n    _compressor = Compression[args.compression]\n    _communication_protocol = CommunicationProtocol[args.protocol]\n    if args.load:\n        path = args.load\n        run_ps = True\n        _model = Import(url=path) if re.match(URL_REGEX, path) else Import(file_adr=path)\n        _model = _model.to_model()\n    elif args.init:\n        model_name = args.init\n        model_class = get_sklearn_class(model_name)\n        if model_class is None:\n            print(f\"{CLI_UNKNOWN_MODEL}\\n{print_supported_ml_models()}\")\n            return\n        run_ps = True\n        _model = model_class()\n    elif args.bare:\n        run_ps = True\n    if not run_ps:\n        tprint(\"PyMilo\")\n        tprint(\"V:\" + 
PYMILO_VERSION)\n        pymilo_help()\n        parser.print_help()\n    else:\n        PymiloServer(\n            model=_model,\n            port=_port,\n            compressor=_compressor,\n            communication_protocol=_communication_protocol,\n        ).communicator.run()\n\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "pymilo/chains/__init__.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo chains.\"\"\"\n"
  },
  {
    "path": "pymilo/chains/chain.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Chain Module.\"\"\"\n\nfrom traceback import format_exc\nfrom abc import ABC, abstractmethod\n\nfrom ..utils.util import get_sklearn_type\nfrom ..transporters.transporter import Command\nfrom ..exceptions.serialize_exception import PymiloSerializationException, SerializationErrorTypes\nfrom ..exceptions.deserialize_exception import PymiloDeserializationException, DeserializationErrorTypes\n\n\nclass Chain(ABC):\n    \"\"\"\n    Chain Interface.\n\n    Each Chain serializes/deserializes the given model.\n    \"\"\"\n\n    @abstractmethod\n    def is_supported(self, model):\n        \"\"\"\n        Check if the given model is a sklearn's ML model supported by this chain.\n\n        :param model: a string name of an ML model or a sklearn object of it\n        :type model: any object\n        :return: check result as bool\n        \"\"\"\n\n    @abstractmethod\n    def transport(self, request, command, is_inner_model=False):\n        \"\"\"\n        Return the transported (serialized or deserialized) model.\n\n        :param request: given ML model to be transported\n        :type request: any object\n        :param command: command to specify whether the request should be serialized or deserialized\n        :type command: transporter.Command\n        :param is_inner_model: determines whether it is an inner model of a super ML model\n        :type is_inner_model: boolean\n        :return: the transported request as a json string or sklearn ML model\n        \"\"\"\n\n    @abstractmethod\n    def serialize(self, model):\n        \"\"\"\n        Return the serialized json string of the given model.\n\n        :param model: given ML model to be get serialized\n        :type model: sklearn ML model\n        :return: the serialized json string of the given ML model\n        \"\"\"\n\n    @abstractmethod\n    def deserialize(self, serialized_model, is_inner_model=False):\n        \"\"\"\n        Return the associated 
sklearn ML model of the given previously serialized ML model.\n\n        :param serialized_model: given json string of a ML model to get deserialized to associated sklearn ML model\n        :type serialized_model: obj\n        :param is_inner_model: determines whether it is an inner ML model of a super ML model\n        :type is_inner_model: boolean\n        :return: associated sklearn ML model\n        \"\"\"\n\n    @abstractmethod\n    def validate(self, model, command):\n        \"\"\"\n        Check if the provided inputs are valid in relation to each other.\n\n        :param model: a sklearn ML model or a json string of it, serialized through the pymilo export\n        :type model: obj\n        :param command: command to specify whether the request should be serialized or deserialized\n        :type command: transporter.Command\n        :return: None\n        \"\"\"\n\n\nclass AbstractChain(Chain):\n    \"\"\"Abstract Chain with the general implementation of the Chain interface.\"\"\"\n\n    def __init__(self, transporters, supported_models):\n        \"\"\"\n        Initialize the AbstractChain instance.\n\n        :param transporters: worker transporters dedicated to this chain\n        :type transporters: transporter.AbstractTransporter[]\n        :param supported_models: supported sklearn ML models belong to this chain\n        :type supported_models: dict\n        :return: an instance of the AbstractChain class\n        \"\"\"\n        self._transporters = transporters\n        self._supported_models = supported_models\n\n    def is_supported(self, model):\n        \"\"\"\n        Check if the given model is a sklearn's ML model supported by this chain.\n\n        :param model: a string name of an ML model or a sklearn object of it\n        :type model: any object\n        :return: check result as bool\n        \"\"\"\n        model_name = model if isinstance(model, str) else get_sklearn_type(model)\n        return model_name in self._supported_models\n\n 
   def transport(self, request, command, is_inner_model=False):\n        \"\"\"\n        Return the transported (serialized or deserialized) model.\n\n        :param request: given ML model to be transported\n        :type request: any object\n        :param command: command to specify whether the request should be serialized or deserialized\n        :type command: transporter.Command\n        :param is_inner_model: determines whether it is an inner model of a super ML model\n        :type is_inner_model: boolean\n        :return: the transported request as a json string or sklearn ML model\n        \"\"\"\n        if not is_inner_model:\n            self.validate(request, command)\n\n        if command == Command.SERIALIZE:\n            try:\n                return self.serialize(request)\n            except Exception as e:\n                raise PymiloSerializationException(\n                    {\n                        'error_type': SerializationErrorTypes.VALID_MODEL_INVALID_INTERNAL_STRUCTURE,\n                        'error': {\n                            'Exception': repr(e),\n                            'Traceback': format_exc(),\n                        },\n                        'object': request,\n                    })\n\n        elif command == Command.DESERIALIZE:\n            try:\n                return self.deserialize(request, is_inner_model)\n            except Exception as e:\n                raise PymiloDeserializationException(\n                    {\n                        'error_type': DeserializationErrorTypes.VALID_MODEL_INVALID_INTERNAL_STRUCTURE,\n                        'error': {\n                            'Exception': repr(e),\n                            'Traceback': format_exc()},\n                        'object': request\n                    })\n\n    def serialize(self, model):\n        \"\"\"\n        Return the serialized json string of the given model.\n\n        :param model: given ML model to be get serialized\n       
 :type model: sklearn ML model\n        :return: the serialized json string of the given ML model\n        \"\"\"\n        for transporter in self._transporters:\n            self._transporters[transporter].transport(model, Command.SERIALIZE)\n        return model.__dict__\n\n    def deserialize(self, serialized_model, is_inner_model=False):\n        \"\"\"\n        Return the associated sklearn ML model of the given previously serialized ML model.\n\n        :param serialized_model: given json string of a ML model to get deserialized to associated sklearn ML model\n        :type serialized_model: obj\n        :param is_inner_model: determines whether it is an inner ML model of a super ML model\n        :type is_inner_model: boolean\n        :return: associated sklearn ML model\n        \"\"\"\n        raw_model = None\n        data = None\n        if is_inner_model:\n            raw_model = self._supported_models[serialized_model[\"type\"]]()\n            data = serialized_model[\"data\"]\n        else:\n            raw_model = self._supported_models[serialized_model.type]()\n            data = serialized_model.data\n        for transporter in self._transporters:\n            self._transporters[transporter].transport(\n                serialized_model, Command.DESERIALIZE, is_inner_model)\n        for item in data:\n            setattr(raw_model, item, data[item])\n        return raw_model\n\n    def validate(self, model, command):\n        \"\"\"\n        Check if the provided inputs are valid in relation to each other.\n\n        :param model: a sklearn ML model or a json string of it, serialized through the pymilo export\n        :type model: obj\n        :param command: command to specify whether the request should be serialized or deserialized\n        :type command: transporter.Command\n        :return: None\n        \"\"\"\n        if command == Command.SERIALIZE:\n            if self.is_supported(model):\n                return\n            else:\n         
       raise PymiloSerializationException(\n                    {\n                        'error_type': SerializationErrorTypes.INVALID_MODEL,\n                        'object': model\n                    }\n                )\n        elif command == Command.DESERIALIZE:\n            if self.is_supported(model.type):\n                return\n            else:\n                raise PymiloDeserializationException(\n                    {\n                        'error_type': DeserializationErrorTypes.INVALID_MODEL,\n                        'object': model\n                    }\n                )\n"
  },
  {
    "path": "pymilo/chains/clustering_chain.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo chain for Clustering models.\"\"\"\n\nfrom ..chains.chain import AbstractChain\nfrom ..pymilo_param import SKLEARN_CLUSTERING_TABLE, NOT_SUPPORTED\nfrom ..transporters.cfnode_transporter import CFNodeTransporter\nfrom ..transporters.function_transporter import FunctionTransporter\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\nfrom ..transporters.preprocessing_transporter import PreprocessingTransporter\n\nCLUSTERING_CHAIN = {\n    \"PreprocessingTransporter\": PreprocessingTransporter(),\n    \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n    \"FunctionTransporter\": FunctionTransporter(),\n    \"CFNodeTransporter\": CFNodeTransporter(),\n}\n\nif SKLEARN_CLUSTERING_TABLE[\"BisectingKMeans\"] != NOT_SUPPORTED:\n    from ..transporters.bisecting_tree_transporter import BisectingTreeTransporter\n    from ..transporters.randomstate_transporter import RandomStateTransporter\n    CLUSTERING_CHAIN[\"RandomStateTransporter\"] = RandomStateTransporter()\n    CLUSTERING_CHAIN[\"BisectingTreeTransporter\"] = BisectingTreeTransporter()\n\nclustering_chain = AbstractChain(CLUSTERING_CHAIN, SKLEARN_CLUSTERING_TABLE)\n"
  },
  {
    "path": "pymilo/chains/compose_chain.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo chain for compose models.\"\"\"\n\nfrom ..chains.chain import AbstractChain\nfrom ..transporters.compose_transporter import ComposeTransporter\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\nfrom ..transporters.function_transporter import FunctionTransporter\nfrom ..transporters.transporter import Command\nfrom ..pymilo_param import SKLEARN_COMPOSE_TABLE\n\nCOMPOSE_CHAIN = {\n    \"ComposeTransporter\": ComposeTransporter(),\n    \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n    \"FunctionTransporter\": FunctionTransporter(),\n}\n\n\nclass ComposeModelChain(AbstractChain):\n    \"\"\"ComposeModelChain developed to handle sklearn Compose ML model transportation.\"\"\"\n\n    def deserialize(self, compose, is_inner_model=False):\n        \"\"\"\n        Return the associated sklearn compose model of the given compose.\n\n        :param compose: given json string of a compose model to get deserialized to associated sklearn compose model\n        :type compose: obj\n        :param is_inner_model: determines whether it is an inner compose model of a super ml model\n        :type is_inner_model: boolean\n        :return: associated sklearn compose model\n        \"\"\"\n        data = compose[\"data\"] if is_inner_model else compose.data\n        _type = compose[\"type\"] if is_inner_model else compose.type\n\n        # ColumnTransformer requires 'transformers' arg; others use default constructor\n        if _type == \"ColumnTransformer\":\n            raw_model = self._supported_models[_type](transformers=data.get(\"transformers\", []))\n        else:\n            raw_model = self._supported_models[_type]()\n\n        for transporter in self._transporters:\n            self._transporters[transporter].transport(compose, Command.DESERIALIZE, is_inner_model)\n\n        for item in data:\n            setattr(raw_model, item, data[item])\n        
return raw_model\n\n\ncompose_chain = ComposeModelChain(COMPOSE_CHAIN, SKLEARN_COMPOSE_TABLE)\n\n\ndef get_transporter(model):\n    \"\"\"\n    Get associated transporter for the given ML model.\n\n    :param model: given model to get its transporter\n    :type model: scikit ML model\n    :return: model category and transporter function\n    \"\"\"\n    if isinstance(model, str):\n        if model.upper() == \"COMPOSE\":\n            return \"COMPOSE\", compose_chain.transport\n    if compose_chain.is_supported(model):\n        return \"COMPOSE\", compose_chain.transport\n    else:\n        return None, None\n"
  },
  {
    "path": "pymilo/chains/cross_decomposition_chain.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo chain for Cross Decomposition models.\"\"\"\n\nfrom ..chains.chain import AbstractChain\nfrom ..pymilo_param import SKLEARN_CROSS_DECOMPOSITION_TABLE\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\nfrom ..transporters.preprocessing_transporter import PreprocessingTransporter\n\ncross_decomposition_chain = AbstractChain(\n    {\n        \"PreprocessingTransporter\": PreprocessingTransporter(),\n        \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n    },\n    SKLEARN_CROSS_DECOMPOSITION_TABLE,\n)\n"
  },
  {
    "path": "pymilo/chains/decision_tree_chain.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo chain for Decision Trees models.\"\"\"\n\nfrom ..chains.chain import AbstractChain\nfrom ..pymilo_param import SKLEARN_DECISION_TREE_TABLE\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\nfrom ..transporters.preprocessing_transporter import PreprocessingTransporter\nfrom ..transporters.randomstate_transporter import RandomStateTransporter\nfrom ..transporters.tree_transporter import TreeTransporter\n\ndecision_trees_chain = AbstractChain(\n    {\n        \"PreprocessingTransporter\": PreprocessingTransporter(),\n        \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n        \"RandomStateTransporter\": RandomStateTransporter(),\n        \"TreeTransporter\": TreeTransporter(),\n    },\n    SKLEARN_DECISION_TREE_TABLE,\n)\n"
  },
  {
    "path": "pymilo/chains/ensemble_chain.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo chain for ensemble models.\"\"\"\n\nimport copy\nfrom ast import literal_eval\n\nfrom numpy import ndarray, asarray\n\nfrom ..chains.chain import AbstractChain\nfrom ..transporters.feature_extraction_transporter import FeatureExtractorTransporter\nfrom ..transporters.binmapper_transporter import BinMapperTransporter\nfrom ..transporters.bunch_transporter import BunchTransporter\nfrom ..transporters.transporter import Command\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\nfrom ..transporters.generator_transporter import GeneratorTransporter\nfrom ..transporters.lossfunction_transporter import LossFunctionTransporter\nfrom ..transporters.preprocessing_transporter import PreprocessingTransporter\nfrom ..transporters.randomstate_transporter import RandomStateTransporter\nfrom ..transporters.treepredictor_transporter import TreePredictorTransporter\nfrom ..pymilo_param import SKLEARN_ENSEMBLE_TABLE\nfrom ..utils.util import check_str_in_iterable\nfrom .util import serialize_possible_ml_model, deserialize_possible_ml_model\n\nENSEMBLE_CHAIN = {\n    \"FeatureExtractorTransporter\": FeatureExtractorTransporter(),\n    \"PreprocessingTransporter\": PreprocessingTransporter(),\n    \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n    \"TreePredictorTransporter\": TreePredictorTransporter(),\n    \"BinMapperTransporter\": BinMapperTransporter(),\n    \"GeneratorTransporter\": GeneratorTransporter(),\n    \"RandomStateTransporter\": RandomStateTransporter(),\n    \"LossFunctionTransporter\": LossFunctionTransporter(),\n    \"BunchTransporter\": BunchTransporter(),\n}\n\n\nclass EnsembleModelChain(AbstractChain):\n    \"\"\"EnsembleModelChain developed to handle sklearn Ensemble ML model transportation.\"\"\"\n\n    def serialize(self, ensemble_object):\n        \"\"\"\n        Return the serialized json string of the given ensemble model.\n\n        
:param ensemble_object: given model to be serialized\n        :type ensemble_object: any sklearn ensemble model\n        :return: the serialized json string of the given ensemble\n        \"\"\"\n        for transporter in self._transporters:\n            if transporter != \"GeneralDataStructureTransporter\":\n                self._transporters[transporter].transport(\n                    ensemble_object, Command.SERIALIZE)\n\n        pt = ENSEMBLE_CHAIN[\"PreprocessingTransporter\"]\n        fe = ENSEMBLE_CHAIN[\"FeatureExtractorTransporter\"]\n        for key, value in ensemble_object.__dict__.items():\n            if isinstance(value, list):\n                has_inner_tuple_with_ml_model = False\n                for idx, item in enumerate(value):\n                    if isinstance(item, tuple):\n                        listed_tuple = list(item)\n                        for inner_idx, inner_item in enumerate(listed_tuple):\n                            if pt.is_preprocessing_module(inner_item):\n                                listed_tuple[inner_idx] = pt.serialize_pre_module(inner_item)\n                            elif fe.is_fe_module(inner_item):\n                                listed_tuple[inner_idx] = fe.serialize_fe_module(inner_item)\n                            else:\n                                has_inner_model, result = serialize_possible_ml_model(inner_item)\n                                if has_inner_model:\n                                    has_inner_tuple_with_ml_model = True\n                                listed_tuple[inner_idx] = result\n                        value[idx] = listed_tuple\n                    else:\n                        value[idx] = serialize_possible_ml_model(item)[1]\n                if has_inner_tuple_with_ml_model:\n                    ensemble_object.__dict__[key] = {\n                        \"pymiloed-data-structure\": \"list of (str, estimator) tuples\",\n                        \"pymiloed-data\": value,\n    
                }\n\n            elif isinstance(value, dict):\n                if check_str_in_iterable(\"pymilo-bunch\", value):\n                    new_value = {}\n                    for inner_key, inner_value in value[\"pymilo-bunch\"].items():\n                        new_value[inner_key] = serialize_possible_ml_model(inner_value)[1]\n                    value[\"pymilo-bunch\"] = new_value\n                else:\n                    new_value = {}\n                    for inner_key, inner_value in value.items():\n                        new_value[inner_key] = serialize_possible_ml_model(inner_value)[1]\n                    ensemble_object.__dict__[key] = new_value\n\n            elif isinstance(value, ndarray):\n                has_inner_model, result = serialize_models_in_ndarray(value)\n                if has_inner_model:\n                    ensemble_object.__dict__[key] = result\n\n            else:\n                ensemble_object.__dict__[key] = serialize_possible_ml_model(value)[1]\n\n        self._transporters[\"GeneralDataStructureTransporter\"].transport(ensemble_object, Command.SERIALIZE)\n\n        return ensemble_object.__dict__\n\n    def deserialize(self, ensemble, is_inner_model=False):\n        \"\"\"\n        Return the associated sklearn ensemble model of the given ensemble.\n\n        :param ensemble: given json string of a ensemble model to get deserialized to associated sklearn ensemble model\n        :type ensemble: obj\n        :param is_inner_model: determines whether it is an inner ensemble model of a super ml model\n        :type is_inner_model: boolean\n        :return: associated sklearn ensemble model\n        \"\"\"\n        data = None\n        if is_inner_model:\n            data = ensemble[\"data\"]\n        else:\n            data = ensemble.data\n\n        for transporter in self._transporters:\n            if transporter != \"GeneralDataStructureTransporter\":\n                self._transporters[transporter].transport(\n  
                  ensemble, Command.DESERIALIZE, is_inner_model)\n\n        pt = ENSEMBLE_CHAIN[\"PreprocessingTransporter\"]\n        fe = ENSEMBLE_CHAIN[\"FeatureExtractorTransporter\"]\n        for key, value in data.items():\n            if isinstance(value, dict):\n                if check_str_in_iterable(\"pymiloed-data-structure\",\n                                         value) and value[\"pymiloed-data-structure\"] == \"list of (str, estimator) tuples\":\n                    listed_tuples = value[\"pymiloed-data\"]\n                    list_of_tuples = []\n                    for listed_tuple in listed_tuples:\n                        name, serialized_model = listed_tuple\n                        retrieved_model = None\n                        if pt.is_preprocessing_module(serialized_model):\n                            retrieved_model = pt.deserialize_pre_module(serialized_model)\n                        elif fe.is_fe_module(serialized_model):\n                            retrieved_model = fe.deserialize_fe_module(serialized_model)\n                        else:\n                            retrieved_model = deserialize_possible_ml_model(serialized_model)[1]\n                        list_of_tuples.append(\n                            (name, retrieved_model)\n                        )\n                    data[key] = list_of_tuples\n\n                elif GeneralDataStructureTransporter().is_deserialized_ndarray(value):\n                    has_inner_model, result = deserialize_models_in_ndarray(value)\n                    if has_inner_model:\n                        data[key] = result\n\n            if isinstance(value, list):\n                for idx, item in enumerate(value):\n                    has_ml_model, result = deserialize_possible_ml_model(item)\n                    if has_ml_model:\n                        value[idx] = result\n\n            has_ml_model, result = deserialize_possible_ml_model(value)\n            if has_ml_model:\n             
   data[key] = result\n\n        self._transporters[\"GeneralDataStructureTransporter\"].transport(ensemble, Command.DESERIALIZE, is_inner_model)\n\n        _type = None\n        raw_model = None\n        meta_learnings = [\"StackingRegressor\", \"StackingClassifier\", \"VotingRegressor\", \"VotingClassifier\"]\n        pipeline_models = [\"Pipeline\"]\n        if is_inner_model:\n            _type = ensemble[\"type\"]\n        else:\n            _type = ensemble.type\n\n        if _type in meta_learnings:\n            raw_model = self._supported_models[_type](estimators=data[\"estimators\"])\n        elif _type in pipeline_models:\n            raw_model = self._supported_models[_type](steps=data[\"steps\"])\n        else:\n            raw_model = self._supported_models[_type]()\n\n        for item in data:\n            setattr(raw_model, item, data[item])\n        return raw_model\n\n\nensemble_chain = EnsembleModelChain(ENSEMBLE_CHAIN, SKLEARN_ENSEMBLE_TABLE)\n\n\ndef serialize_models_in_ndarray(ndarray_instance):\n    \"\"\"\n    Serialize the ml models inside the given ndarray.\n\n    :param ndarray_instance: given ndarray needed to get it's inner ML models serialized\n    :type ndarray_instance: numpy.ndarray\n    :return: dict\n    \"\"\"\n    if not isinstance(ndarray_instance, ndarray):\n        return None  # throw error\n\n    ndarray_instance_copy = copy.deepcopy(ndarray_instance)\n    has_inner_model = True\n\n    dtype = ndarray_instance.dtype\n\n    new_list = []\n    for item in ndarray_instance:\n        if isinstance(item, ndarray):\n            has_inside_model, result = serialize_models_in_ndarray(item)\n            if not has_inside_model:\n                has_inner_model = False\n                break\n            else:\n                new_list.append(result)\n        else:\n            has_ml_model, result = serialize_possible_ml_model(item)\n            if has_ml_model:\n                new_list.append(result)\n            else:\n            
    has_inner_model = False\n                break\n\n    if not has_inner_model:\n        return False, ndarray_instance_copy\n    else:\n        return True, {\n            'pymiloed-ndarray-list': new_list,\n            'pymiloed-ndarray-dtype': str(dtype),\n            'pymiloed-data-structure': 'numpy.ndarray'\n        }\n\n\ndef deserialize_models_in_ndarray(serialized_ndarray):\n    \"\"\"\n    Deserialize possible ML models within the given serialized ndarray.\n\n    :param serialized_ndarray: given serialized ndarray whose previously serialized inner ML models get deserialized\n    :type serialized_ndarray: obj\n    :return: inner model flag and the resulting numpy.ndarray\n    \"\"\"\n    gdst = GeneralDataStructureTransporter()\n    if not gdst.is_deserialized_ndarray(serialized_ndarray):\n        return False, None  # throw error\n\n    serialized_ndarray_copy = copy.deepcopy(serialized_ndarray)\n    has_inner_model = True\n\n    inner_list = serialized_ndarray['pymiloed-ndarray-list']\n    new_list = []\n    for item in inner_list:\n        if gdst.is_deserialized_ndarray(item):\n            has_inside_model, result = deserialize_models_in_ndarray(item)\n            if not has_inside_model:\n                has_inner_model = False\n                break\n            else:\n                new_list.append(result)\n\n        else:\n            has_ml_model, result = deserialize_possible_ml_model(item)\n            if has_ml_model:\n                new_list.append(result)\n            else:\n                has_inner_model = False\n                break\n\n    if not has_inner_model:\n        return False, serialized_ndarray_copy\n    else:\n        dtype = serialized_ndarray['pymiloed-ndarray-dtype']\n        if dtype.startswith(\"[\"):\n            dtype = literal_eval(dtype)\n\n        return True, asarray(new_list, dtype=dtype)\n"
  },
  {
    "path": "pymilo/chains/linear_model_chain.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo chain for linear models.\"\"\"\n\nfrom .chain import AbstractChain\nfrom ..transporters.baseloss_transporter import BaseLossTransporter\nfrom ..transporters.transporter import Command\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\nfrom ..transporters.lossfunction_transporter import LossFunctionTransporter\nfrom ..transporters.preprocessing_transporter import PreprocessingTransporter\n\nfrom ..utils.util import get_sklearn_type, is_iterable\nfrom ..pymilo_param import SKLEARN_LINEAR_MODEL_TABLE\n\nLINEAR_MODEL_CHAIN = {\n    \"PreprocessingTransporter\": PreprocessingTransporter(),\n    \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n    \"BaseLossTransporter\": BaseLossTransporter(),\n    \"LossFunctionTransporter\": LossFunctionTransporter(),\n}\n\n\nclass LinearModelChain(AbstractChain):\n    \"\"\"LinearModelChain developed to handle sklearn Linear ML model transportation.\"\"\"\n\n    def serialize(self, linear_model_object):\n        \"\"\"\n        Return the serialized json string of the given linear model.\n\n        :param linear_model_object: given model to be get serialized\n        :type linear_model_object: any sklearn linear model\n        :return: the serialized json string of the given linear model\n        \"\"\"\n        # first serializing the inner linear models...\n        for key in linear_model_object.__dict__:\n            if self.is_supported(linear_model_object.__dict__[key]):\n                linear_model_object.__dict__[key] = {\n                    \"pymilo-inner-model-data\": self.transport(linear_model_object.__dict__[key], Command.SERIALIZE, True),\n                    \"pymilo-inner-model-type\": get_sklearn_type(linear_model_object.__dict__[key]),\n                    \"pymilo-bypass\": True\n                }\n        # now serializing non-linear model fields\n        for transporter in 
self._transporters:\n            self._transporters[transporter].transport(\n                linear_model_object, Command.SERIALIZE)\n        return linear_model_object.__dict__\n\n    def deserialize(self, linear_model, is_inner_model=False):\n        \"\"\"\n        Return the associated sklearn linear model of the given linear_model.\n\n        :param linear_model: given json string of a linear model to get deserialized to associated sklearn linear model\n        :type linear_model: obj\n        :param is_inner_model: determines whether it is an inner model of a super ml model\n        :type is_inner_model: boolean\n        :return: associated sklearn linear model\n        \"\"\"\n        raw_model = None\n        data = None\n        if is_inner_model:\n            raw_model = self._supported_models[linear_model[\"type\"]]()\n            data = linear_model[\"data\"]\n        else:\n            raw_model = self._supported_models[linear_model.type]()\n            data = linear_model.data\n        # first deserializing the inner linear models(one depth inner linear\n        # models have been deserialized -> TODO full depth).\n        for key in data:\n            if is_deserialized_linear_model(data[key]):\n                data[key] = self.transport({\n                    \"data\": data[key][\"pymilo-inner-model-data\"],\n                    \"type\": data[key][\"pymilo-inner-model-type\"]\n                }, Command.DESERIALIZE, is_inner_model=True)\n        # now deserializing non-linear models fields\n        for transporter in self._transporters:\n            self._transporters[transporter].transport(\n                linear_model, Command.DESERIALIZE, is_inner_model)\n        for item in data:\n            setattr(raw_model, item, data[item])\n        return raw_model\n\n\nlinear_chain = LinearModelChain(LINEAR_MODEL_CHAIN, SKLEARN_LINEAR_MODEL_TABLE)\n\n\ndef is_deserialized_linear_model(content):\n    \"\"\"\n    Check if the given content is a previously 
serialized model by PyMilo's Export.\n\n    :param content: given object to be validated as a serialized model exported by PyMilo\n    :type content: any object\n    :return: check result as bool\n    \"\"\"\n    if not is_iterable(content):\n        return False\n    return \"pymilo-inner-model-type\" in content and \"pymilo-inner-model-data\" in content\n"
  },
  {
    "path": "pymilo/chains/naive_bayes_chain.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo chain for Naive Bayes models.\"\"\"\n\nfrom ..chains.chain import AbstractChain\nfrom ..pymilo_param import SKLEARN_NAIVE_BAYES_TABLE\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\nfrom ..transporters.preprocessing_transporter import PreprocessingTransporter\n\nnaive_bayes_chain = AbstractChain(\n    {\n        \"PreprocessingTransporter\": PreprocessingTransporter(),\n        \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n    },\n    SKLEARN_NAIVE_BAYES_TABLE,\n)\n"
  },
  {
    "path": "pymilo/chains/neighbours_chain.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo chain for Neighbors models.\"\"\"\n\nfrom ..chains.chain import AbstractChain\nfrom ..pymilo_param import SKLEARN_NEIGHBORS_TABLE\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\nfrom ..transporters.neighbors_tree_transporter import NeighborsTreeTransporter\nfrom ..transporters.preprocessing_transporter import PreprocessingTransporter\n\nneighbors_chain = AbstractChain(\n    {\n        \"PreprocessingTransporter\": PreprocessingTransporter(),\n        \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n        \"NeighborsTreeTransporter\": NeighborsTreeTransporter(),\n    },\n    SKLEARN_NEIGHBORS_TABLE,\n)\n"
  },
  {
    "path": "pymilo/chains/neural_network_chain.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo chain for Neural Network models.\"\"\"\n\nfrom ..chains.chain import AbstractChain\nfrom ..pymilo_param import SKLEARN_NEURAL_NETWORK_TABLE\nfrom ..transporters.adamoptimizer_transporter import AdamOptimizerTransporter\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\nfrom ..transporters.preprocessing_transporter import PreprocessingTransporter\nfrom ..transporters.randomstate_transporter import RandomStateTransporter\nfrom ..transporters.sgdoptimizer_transporter import SGDOptimizerTransporter\n\nneural_network_chain = AbstractChain(\n    {\n        \"PreprocessingTransporter\": PreprocessingTransporter(),\n        \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n        \"RandomStateTransporter\": RandomStateTransporter(),\n        \"SGDOptimizer\": SGDOptimizerTransporter(),\n        \"AdamOptimizerTransporter\": AdamOptimizerTransporter(),\n    },\n    SKLEARN_NEURAL_NETWORK_TABLE,\n)\n"
  },
  {
    "path": "pymilo/chains/svm_chain.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo chain for SVM models.\"\"\"\n\nfrom ..chains.chain import AbstractChain\nfrom ..pymilo_param import SKLEARN_SVM_TABLE\nfrom ..transporters.preprocessing_transporter import PreprocessingTransporter\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\nfrom ..transporters.randomstate_transporter import RandomStateTransporter\n\nsvm_chain = AbstractChain(\n    {\n        \"PreprocessingTransporter\": PreprocessingTransporter(),\n        \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n        \"RandomStateTransporter\": RandomStateTransporter(),\n    },\n    SKLEARN_SVM_TABLE,\n)\n"
  },
  {
    "path": "pymilo/chains/util.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"useful utilities for chains.\"\"\"\n\nfrom .linear_model_chain import linear_chain\nfrom .neural_network_chain import neural_network_chain\nfrom .decision_tree_chain import decision_trees_chain\nfrom .clustering_chain import clustering_chain\nfrom .naive_bayes_chain import naive_bayes_chain\nfrom .svm_chain import svm_chain\nfrom .neighbours_chain import neighbors_chain\nfrom .cross_decomposition_chain import cross_decomposition_chain\nfrom ..utils.util import get_sklearn_type, check_str_in_iterable\n\n\nMODEL_TYPE_TRANSPORTER = {\n    \"LINEAR_MODEL\": linear_chain.transport,\n    \"NEURAL_NETWORK\": neural_network_chain.transport,\n    \"DECISION_TREE\": decision_trees_chain.transport,\n    \"CLUSTERING\": clustering_chain.transport,\n    \"NAIVE_BAYES\": naive_bayes_chain.transport,\n    \"SVM\": svm_chain.transport,\n    \"NEIGHBORS\": neighbors_chain.transport,\n    \"CROSS_DECOMPOSITION\": cross_decomposition_chain.transport,\n}\n\n\ndef get_concrete_transporter(model):\n    \"\"\"\n    Get associated transporter for the given concrete(not ensemble) ML model.\n\n    :param model: given model to get it's transporter\n    :type model: scikit ML model\n    :return: model category and transporter function\n    \"\"\"\n    if isinstance(model, str):\n        upper_model = model.upper()\n        if upper_model in MODEL_TYPE_TRANSPORTER.keys():\n            return upper_model, MODEL_TYPE_TRANSPORTER[upper_model]\n\n    if linear_chain.is_supported(model):\n        return \"LINEAR_MODEL\", linear_chain.transport\n    elif neural_network_chain.is_supported(model):\n        return \"NEURAL_NETWORK\", neural_network_chain.transport\n    elif decision_trees_chain.is_supported(model):\n        return \"DECISION_TREE\", decision_trees_chain.transport\n    elif clustering_chain.is_supported(model):\n        return \"CLUSTERING\", clustering_chain.transport\n    elif naive_bayes_chain.is_supported(model):\n        return 
\"NAIVE_BAYES\", naive_bayes_chain.transport\n    elif svm_chain.is_supported(model):\n        return \"SVM\", svm_chain.transport\n    elif neighbors_chain.is_supported(model):\n        return \"NEIGHBORS\", neighbors_chain.transport\n    elif cross_decomposition_chain.is_supported(model):\n        return \"CROSS_DECOMPOSITION\", cross_decomposition_chain.transport\n    else:\n        return None, None\n\n\ndef get_transporter(model):\n    \"\"\"\n    Get associated transporter for the given ML model.\n\n    :param model: given model to get it's transporter\n    :type model: scikit ML model or str\n    :return: model category and transporter function\n    \"\"\"\n    # String routing\n    if isinstance(model, str):\n        upper = model.upper()\n        if upper == \"COMPOSE\":\n            from .compose_chain import compose_chain\n            return \"COMPOSE\", compose_chain.transport\n        if upper == \"ENSEMBLE\":\n            from .ensemble_chain import ensemble_chain\n            return \"ENSEMBLE\", ensemble_chain.transport\n\n    # Object routing (check higher-level categories first)\n    from .compose_chain import compose_chain\n    if compose_chain.is_supported(model):\n        return \"COMPOSE\", compose_chain.transport\n\n    from .ensemble_chain import ensemble_chain\n    if ensemble_chain.is_supported(model):\n        return \"ENSEMBLE\", ensemble_chain.transport\n\n    return get_concrete_transporter(model)\n\n\ndef serialize_possible_ml_model(model):\n    \"\"\"\n    Serialize the given object if it is a supported ML model.\n\n    :param model: given object\n    :type model: any\n    :return: ML model flag and serialized result\n    \"\"\"\n    if isinstance(model, str):\n        return False, model\n    ml_category, transporter = get_transporter(model)\n    if transporter is None:\n        return False, model\n    from ..transporters.transporter import Command\n    return True, {\n        \"pymilo-bypass\": True,\n        
\"pymilo-inner-model-data\": transporter(model, Command.SERIALIZE),\n        \"pymilo-inner-model-type\": get_sklearn_type(model),\n        \"pymilo-ml-category\": ml_category\n    }\n\n\ndef deserialize_possible_ml_model(serialized_model):\n    \"\"\"\n    Deserialize the given object if it is a previously serialized ML model.\n\n    :param serialized_model: given obj to check\n    :type serialized_model: obj\n    :return: ML model flag and deserialized result\n    \"\"\"\n    if check_str_in_iterable(\"pymilo-inner-model-type\", serialized_model):\n        _, transporter = get_transporter(serialized_model[\"pymilo-ml-category\"])\n        from ..transporters.transporter import Command\n        return True, transporter({\n            \"data\": serialized_model[\"pymilo-inner-model-data\"],\n            \"type\": serialized_model[\"pymilo-inner-model-type\"]\n        }, Command.DESERIALIZE, is_inner_model=True)\n    else:\n        return False, serialized_model\n"
  },
  {
    "path": "pymilo/exceptions/__init__.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo exceptions.\"\"\"\nfrom .pymilo_exception import PymiloException\nfrom .serialize_exception import PymiloSerializationException\nfrom .deserialize_exception import PymiloDeserializationException\n"
  },
  {
    "path": "pymilo/exceptions/deserialize_exception.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Deserialization Exception.\"\"\"\nfrom enum import Enum\nfrom .pymilo_exception import PymiloException\n\n\nclass DeserializationErrorTypes(Enum):\n    \"\"\"An enum class to determine the type of deserialization errors.\"\"\"\n\n    CORRUPTED_JSON_FILE = 1\n    INVALID_MODEL = 2\n    VALID_MODEL_INVALID_INTERNAL_STRUCTURE = 3\n\n\nclass PymiloDeserializationException(PymiloException):\n    \"\"\"\n    Handle exceptions associated with Deserialization.\n\n    There are 3 different types of deserialization exceptions:\n\n        1-CORRUPTED_JSON_FILE: This error type claims that the given json string file which is supposed to be an\n        output of Pymilo Export, is corrupted and can not be parsed as a valid json.\n\n        2-INVALID_MODEL: This error type claims that the given json string file(or object) is not a deserialized export of\n        a valid sklearn linear model.\n\n        3-VALID_MODEL_INVALID_INTERNAL_STRUCTURE: This error occurs when attempting to load a JSON file or object that\n        does not conform to the expected format of a serialized scikit-learn linear model.\n        The file may have been modified after being exported from Pymilo Export, causing it to become invalid.\n    \"\"\"\n\n    def __init__(self, meta_data):\n        \"\"\"\n        Initialize the PymiloDeserializationException instance.\n\n        :param meta_data: Details pertain to the populated error.\n        :type meta_data: dict [str:str]\n        :return: an instance of the PymiloDeserializationException class\n        \"\"\"\n        # Call the base class constructor with the parameters it needs\n        message = \"Pymilo Deserialization failed since {reason}\"\n        error_type = meta_data['error_type']\n        error_type_to_message = {\n            DeserializationErrorTypes.CORRUPTED_JSON_FILE:\n            'the given file is not a valid .json file.',\n            DeserializationErrorTypes.INVALID_MODEL:\n     
       'the given model is not supported or is not a valid model.',\n            DeserializationErrorTypes.VALID_MODEL_INVALID_INTERNAL_STRUCTURE:\n            'the given model has some non-standard customized internal objects or functions.'}\n        if error_type in error_type_to_message:\n            reason = error_type_to_message[error_type]\n        else:\n            reason = \"an unknown error occurred.\"\n        # str.format returns a new string, so the result must be assigned back\n        message = message.format(reason=reason)\n        super().__init__(message, meta_data)\n\n    def to_pymilo_log(self):\n        \"\"\"\n        Generate a comprehensive report of the populated error.\n\n        :return: a dictionary of error details.\n        \"\"\"\n        pymilo_report = super().to_pymilo_log()\n        if self.meta_data['error_type'] == DeserializationErrorTypes.CORRUPTED_JSON_FILE:\n            pymilo_report['object']['json_file'] = self.meta_data['json_file']\n        return pymilo_report\n"
  },
  {
    "path": "pymilo/exceptions/pymilo_exception.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Abstract Exception Class.\"\"\"\n\nimport pymilo\nimport sklearn\nimport platform\nfrom datetime import datetime, timezone\nfrom abc import ABC, abstractmethod\n\n\nclass PymiloException(Exception, ABC):\n    \"\"\"An abstract class for handling pymilo associated exceptions.\"\"\"\n\n    def __init__(self, message, meta_data):\n        \"\"\"\n        Initialize the PymiloException instance.\n\n        :param message: Error message associated with the populated error.\n        :type message: str\n        :param meta_data: Details pertain to the populated error.\n        :type meta_data: dict [str:str]\n        :return: an instance of the PymiloDeserializationException class\n        \"\"\"\n        # Call the base class constructor with the parameters it needs\n        super().__init__(message)\n        # gathered meta_data\n        self.message = message\n        self.meta_data = meta_data\n\n    def to_pymilo_log(self):\n        \"\"\"\n        Generate a comprehensive report of the populated error.\n\n        :return: error's details as dictionary\n        \"\"\"\n        pymilo_report = {\n            'os': {\n                'name': platform.system(),\n                'version': platform.version(),\n                'release': platform.release(),\n                'full-description': platform.platform()\n            },\n            'versions': {\n                'pymilo-version': pymilo.__version__,\n                'scikit-version': sklearn.__version__,\n                'python-version': platform.python_version()\n            },\n            'object': {\n                'type': type(self.meta_data['object']),\n                'content': self.meta_data['object']\n            },\n            'error': {\n                'date-utc': datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M:%S'),\n                'pymilo-error': self.message,\n                'inner-error': self.meta_data['error'] if \"error\" in 
self.meta_data else \"\"\n            }\n        }\n\n        return pymilo_report\n\n    def to_pymilo_issue(self):\n        \"\"\"\n        Generate an issue form from the populated error.\n\n        :return: issue form of the associated error as string\n        \"\"\"\n        pymilo_report = self.to_pymilo_log()\n        help_request = \"\"\"\n        \\n\\nIn order to help us enhance Pymilo's functionality, please open an issue associated with this error and put the message below inside.\\n\n        \"\"\"\n        associated_pymilo_class = \"Export\" if \"Serialization\" in self.message else \"Import\"\n        description = \"#### Description\\n Pymilo {pymilo_class} failed.\".format(pymilo_class=associated_pymilo_class)\n        steps_to_produce = \"\\n#### Steps/Code to Reproduce\\n It is auto-reported from the pymilo logger.\"\n        expected_behavior = \"\\n#### Expected Behavior\\n A successful Pymilo {pymilo_class}.\".format(\n            pymilo_class=associated_pymilo_class)\n        actual_behavior = \"\\n#### Actual Behavior\\n Pymilo {pymilo_class} failed.\".format(\n            pymilo_class=associated_pymilo_class)\n        # each section starts with a newline so that its heading begins on its own line\n        operating_system = \"\\n#### Operating System\\n {os}\".format(os=pymilo_report['os']['full-description'])\n        python_version = \"\\n#### Python Version\\n {python_version}\".format(\n            python_version=pymilo_report['versions'][\"python-version\"])\n        pymilo_version = \"\\n#### PyMilo Version\\n {pymilo_version}\".format(\n            pymilo_version=pymilo_report['versions'][\"pymilo-version\"])\n        gathered_data = \"\\n#### Logged Data\\n {logged_data}\".format(logged_data=str(pymilo_report))\n\n        full_issue_form = help_request + description + steps_to_produce + expected_behavior + \\\n            actual_behavior + operating_system + python_version + pymilo_version + gathered_data\n        return full_issue_form\n\n    def __str__(self):\n        \"\"\"\n        Override the base __str__ function.\n\n      
  :return: issue form of the associated error as string\n        \"\"\"\n        return self.to_pymilo_issue()\n"
  },
  {
    "path": "pymilo/exceptions/serialize_exception.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Serialization Exception.\"\"\"\n\nfrom enum import Enum\nfrom .pymilo_exception import PymiloException\n\n\nclass SerializationErrorTypes(Enum):\n    \"\"\"An enum class used to determine the type of serialization errors.\"\"\"\n\n    INVALID_MODEL = 1\n    VALID_MODEL_INVALID_INTERNAL_STRUCTURE = 2\n\n\nclass PymiloSerializationException(PymiloException):\n    \"\"\"\n    Handle exceptions associated with Serializations.\n\n    There are 2 different types of serialization exceptions:\n\n        1-INVALID_MODEL: This error type claims that the given model is not a valid sklearn's linear model.\n\n        2-VALID_MODEL_INVALID_INTERNAL_STRUCTURE: This error occurs when attempting to serialize a model that\n        is one of the sklearn's linear models but it's internal structure has changed in a way that can't be serialized.\n    \"\"\"\n\n    def __init__(self, meta_data):\n        \"\"\"\n        Initialize the PymiloSerializationException instance.\n\n        :param meta_data: Details pertain to the populated error.\n        :type meta_data: dict [str:str]\n        :return: an instance of the PymiloSerializationException class\n        \"\"\"\n        # Call the base class constructor with the parameters it needs\n        message = \"Pymilo Serialization failed since \"\n        error_type = meta_data['error_type']\n        error_type_to_message = {\n            SerializationErrorTypes.INVALID_MODEL: 'the given model is not supported or is not a valid model.',\n            SerializationErrorTypes.VALID_MODEL_INVALID_INTERNAL_STRUCTURE: 'the given model has some non-standard customized internal objects or functions.'}\n        if error_type in error_type_to_message:\n            message += error_type_to_message[error_type]\n        else:\n            message += \"an Unknown error occurred.\"\n        super().__init__(message, meta_data)\n\n    def to_pymilo_log(self):\n        \"\"\"\n        Generate a 
comprehensive report of the populated error.\n\n        :return: error's details as dictionary\n        \"\"\"\n        pymilo_report = super().to_pymilo_log()\n        # TODO add any serializable field to `object` field of pymilo_report\n        return pymilo_report\n"
  },
  {
    "path": "pymilo/pymilo_func.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"Functions.\"\"\"\nimport numpy as np\nimport sklearn\n\nfrom .chains.util import get_transporter\nfrom .transporters.transporter import Command\nfrom .pymilo_param import SKLEARN_SUPPORTED_CATEGORIES, NOT_SUPPORTED, OVERVIEW\n\n\ndef get_sklearn_version():\n    \"\"\"\n    Return sklearn version.\n\n    :return: sklearn version as a str\n    \"\"\"\n    return sklearn.__version__\n\n\ndef get_sklearn_data(model):\n    \"\"\"\n    Return sklearn data by serializing given model.\n\n    :param model: given model\n    :type model: any sklearn's model class\n    :return: sklearn data\n    \"\"\"\n    _, transporter = get_transporter(model)\n    return transporter(model, Command.SERIALIZE)\n\n\ndef to_sklearn_model(import_obj):\n    \"\"\"\n    Deserialize the imported object as a sklearn model.\n\n    :param import_obj: given object\n    :type import_obj: pymilo.Import\n    :return: sklearn model\n    \"\"\"\n    _, transporter = get_transporter(import_obj.type)\n    return transporter(import_obj, Command.DESERIALIZE)\n\n\ndef compare_model_outputs(exported_output,\n                          imported_output,\n                          epsilon_error=10**(-8)):\n    \"\"\"\n    Check if the given models outputs are the same.\n\n    :param exported_output: exported model output\n    :type exported_output: dict\n    :param imported_output: imported model output\n    :type imported_output: dict\n    :param epsilon_error: error threshold for numeric comparisons\n    :type epsilon_error: float\n    :return: check result as bool\n    \"\"\"\n    if len(exported_output) != len(imported_output):\n        return False  # TODO: throw exception\n    total_error = 0\n    for key in exported_output:\n        if key not in imported_output:\n            return False  # TODO: throw exception\n        total_error += np.abs(imported_output[key] - exported_output[key])\n    return np.abs(total_error) < epsilon_error\n\n\ndef 
print_supported_ml_models():\n    \"\"\"\n    Print the supported sklearn ML models categorized by type.\n\n    :return: None\n    \"\"\"\n    print(\"Supported Machine Learning Models:\")\n    for category, table in SKLEARN_SUPPORTED_CATEGORIES.items():\n        print(f\"**{category}**:\")\n        for model_name in table:\n            if table[model_name] != NOT_SUPPORTED:\n                print(f\"- {model_name}\")\n\n\ndef pymilo_help():\n    \"\"\"\n    Print PyMilo details.\n\n    :return: None\n    \"\"\"\n    print(OVERVIEW)\n    print(\"Repo : https://github.com/openscilab/pymilo\")\n    print(\"Webpage : https://openscilab.com/\\n\")\n"
  },
  {
    "path": "pymilo/pymilo_obj.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo modules.\"\"\"\nimport os\nimport re\nimport json\nfrom copy import deepcopy\nfrom warnings import warn\nfrom traceback import format_exc\nfrom .utils.util import get_sklearn_type, download_model\nfrom concurrent.futures import ThreadPoolExecutor, as_completed\nfrom .pymilo_func import get_sklearn_data, get_sklearn_version, to_sklearn_model\nfrom .exceptions.serialize_exception import PymiloSerializationException, SerializationErrorTypes\nfrom .exceptions.deserialize_exception import PymiloDeserializationException, DeserializationErrorTypes\nfrom .pymilo_param import PYMILO_VERSION, UNEQUAL_PYMILO_VERSIONS, UNEQUAL_SKLEARN_VERSIONS\nfrom .pymilo_param import INVALID_IMPORT_INIT_PARAMS, BATCH_IMPORT_INVALID_DIRECTORY\n\n\nclass Export:\n    \"\"\"\n    The Pymilo Export class facilitates exporting of models to json files.\n\n    >>> exported_model = Export(model) # the model could be any sklearn linear model.\n    >>> exported_model_serialized_path = os.path.join(os.getcwd(), \"MODEL_NAME.json\")\n    >>> exported_model.save(exported_model_serialized_path)\n    \"\"\"\n\n    def __init__(self, model):\n        \"\"\"\n        Initialize the Pymilo Export instance.\n\n        :param model: given model(any sklearn linear model)\n        :type model: any class of the sklearn's linear models\n        :return: an instance of the Pymilo Export class\n        \"\"\"\n        self.data = get_sklearn_data(deepcopy(model))\n        self.version = get_sklearn_version()\n        self.type = get_sklearn_type(model)\n\n    def save(self, file_adr):\n        \"\"\"\n        Save model in a file.\n\n        :param file_adr: file address\n        :type file_adr: str\n        :return: None\n        \"\"\"\n        with open(file_adr, 'w') as fp:\n            fp.write(self.to_json())\n\n    def to_json(self):\n        \"\"\"\n        Return a json-like representation of model.\n\n        :return: model's representation as str\n     
   \"\"\"\n        try:\n            return json.dumps(\n                {\n                    \"data\": self.data,\n                    \"sklearn_version\": self.version,\n                    \"pymilo_version\": PYMILO_VERSION,\n                    \"model_type\": self.type\n                },\n                indent=4\n            )\n        except Exception as e:\n            raise PymiloSerializationException(\n                {\n                    'error_type': SerializationErrorTypes.VALID_MODEL_INVALID_INTERNAL_STRUCTURE,\n                    'error': {\n                        'Exception': repr(e),\n                        'Traceback': format_exc()},\n                    'object': {\n                        \"data\": self.data,\n                        \"sklearn_version\": self.version,\n                        \"pymilo_version\": PYMILO_VERSION,\n                        \"model_type\": self.type},\n                })\n\n    @staticmethod\n    def batch_export(models, file_addr, run_parallel=False):\n        \"\"\"\n        Export a batch of models to individual JSON files in a specified directory.\n\n        This method takes a list of trained models and exports each one into a JSON file. 
The models\n        are exported sequentially by default, or concurrently using multiple threads when `run_parallel` is True; each model is saved to a file named\n        'model_{index}.json' in the provided directory.\n\n        :param models: list of models to get exported.\n        :type models: list\n        :param file_addr: the directory where exported JSON files will be saved.\n        :type file_addr: str\n        :param run_parallel: flag indicating the parallel execution of exports\n        :type run_parallel: boolean\n        :return: the count of models exported successfully\n        \"\"\"\n        if not os.path.exists(file_addr):\n            os.mkdir(file_addr)\n\n        def export_model(model, index):\n            try:\n                Export(model).save(file_adr=os.path.join(file_addr, f\"model_{index}.json\"))\n                return 1\n            except Exception as _:\n                return 0\n        if run_parallel:\n            with ThreadPoolExecutor() as executor:\n                futures = [executor.submit(export_model, model, index) for index, model in enumerate(models)]\n                count = 0\n                for future in as_completed(futures):\n                    count += future.result()\n                return count\n        else:\n            count = 0\n            for index, model in enumerate(models):\n                count += export_model(model, index)\n            return count\n\n\nclass Import:\n    \"\"\"\n    The Pymilo Import class facilitates importing of serialized models from a designated file path, a JSON string dump, or a URL.\n\n    >>> imported_model = Import(exported_model_serialized_path)\n    >>> imported_sklearn_model = imported_model.to_model()\n    >>> imported_sklearn_model.predict(x_test)\n    \"\"\"\n\n    def __init__(self, file_adr=None, json_dump=None, url=None):\n        \"\"\"\n        Initialize the Pymilo Import instance.\n\n        :param file_adr: the file path where the serialized model's JSON file is located.\n        :type file_adr: str or 
None\n        :param json_dump: the JSON dump of the associated model; it can be None (in which case the model is read from file_adr)\n        :type json_dump: str or None\n        :param url: url to exported JSON file\n        :type url: str or None\n        :return: an instance of the Pymilo Import class\n        \"\"\"\n        serialized_model_obj = None\n        if url is not None:\n            serialized_model_obj = download_model(url)\n        elif json_dump is not None and isinstance(json_dump, str):\n            serialized_model_obj = json.loads(json_dump)\n        elif file_adr is not None:\n            with open(file_adr, 'r') as fp:\n                serialized_model_obj = json.load(fp)\n        else:\n            raise Exception(INVALID_IMPORT_INIT_PARAMS)\n        try:\n            if not serialized_model_obj[\"pymilo_version\"] == PYMILO_VERSION:\n                warn(UNEQUAL_PYMILO_VERSIONS, category=Warning)\n            if not serialized_model_obj[\"sklearn_version\"] == get_sklearn_version():\n                warn(UNEQUAL_SKLEARN_VERSIONS, category=Warning)\n            self.data = serialized_model_obj[\"data\"]\n            self.version = serialized_model_obj[\"sklearn_version\"]\n            self.type = serialized_model_obj[\"model_type\"]\n        except Exception as e:\n            json_content = None\n            if json_dump and isinstance(json_dump, str):\n                json_content = json_dump\n            elif file_adr is not None:\n                with open(file_adr) as f:\n                    json_content = f.readlines()\n            else:\n                json_content = serialized_model_obj\n            raise PymiloDeserializationException(\n                {\n                    'json_file': json_content,\n                    'error_type': DeserializationErrorTypes.CORRUPTED_JSON_FILE,\n                    'error': {\n                        'Exception': repr(e),\n                        'Traceback': format_exc()},\n                    'object': \"\"})\n\n  
  def to_model(self):\n        \"\"\"\n        Convert imported model to sklearn model.\n\n        :return: sklearn model\n        \"\"\"\n        return to_sklearn_model(self)\n\n    @staticmethod\n    def batch_import(file_addr, run_parallel=False):\n        \"\"\"\n        Import a batch of models from individual JSON files in a specified directory.\n\n        This method takes a directory containing JSON files and imports each one into a model.\n        The models are imported concurrently using multiple threads, ensuring that the files are\n        processed in the order determined by their numeric suffixes. The function returns the\n        successfully imported models in the same order as their filenames.\n\n        :param file_addr: the directory where the JSON files to be imported are located.\n        :type file_addr: str\n        :param run_parallel: flag indicating the parallel execution of imports\n        :type run_parallel: boolean\n        :return: a tuple containing the count of models imported successfully and a list of the\n                imported models in their filename order.\n        \"\"\"\n        if not os.path.exists(file_addr):\n            raise FileNotFoundError(BATCH_IMPORT_INVALID_DIRECTORY)\n\n        json_files = [f for f in os.listdir(file_addr) if f.endswith('.json')]\n        json_files.sort(key=lambda x: int(re.search(r'_(\\d+)\\.json$', x).group(1)))\n\n        models = [None] * len(json_files)\n        count = 0\n\n        def import_model(file_path, index):\n            try:\n                model = Import(file_path).to_model()\n                return index, model\n            except Exception as _:\n                return index, None\n\n        if run_parallel:\n            with ThreadPoolExecutor() as executor:\n                futures = {\n                    executor.submit(\n                        import_model,\n                        os.path.join(\n                            file_addr,\n                            
file),\n                        index): index for index,\n                    file in enumerate(json_files)}\n                for future in as_completed(futures):\n                    index, model = future.result()\n                    if model is not None:\n                        models[index] = model\n                        count += 1\n                return count, models\n        else:\n            count = 0\n            for index, file in enumerate(json_files):\n                _, model = import_model(os.path.join(file_addr, file), index)\n                if model is not None:\n                    models[index] = model\n                    count += 1\n            return count, models\n"
  },
  {
    "path": "pymilo/pymilo_param.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"Parameters and constants.\"\"\"\nimport numpy as np\nimport sklearn.linear_model as linear_model\nimport sklearn.neural_network as neural_network\nimport sklearn.tree as tree\nimport sklearn.cluster as cluster\nimport sklearn.mixture as mixture\nimport sklearn.naive_bayes as naive_bayes\nimport sklearn.svm as svm\nimport sklearn.neighbors as neighbors\nimport sklearn.dummy as dummy\nimport sklearn.ensemble as ensemble\nimport sklearn.pipeline as pipeline\nimport sklearn.preprocessing as preprocessing\nimport sklearn.cross_decomposition as cross_decomposition\nimport sklearn.feature_extraction as feature_extraction\nimport sklearn.compose as compose\n\nquantile_regressor_support = False\ntry:\n    from sklearn.linear_model import QuantileRegressor\n    quantile_regressor_support = True\nexcept BaseException:\n    pass\n\nglm_support = {\n    'GammaRegressor': False,\n    'PoissonRegressor': False,\n    'TweedieRegressor': False\n}\ntry:\n    from sklearn.linear_model import TweedieRegressor\n    glm_support['TweedieRegressor'] = True\n    from sklearn.linear_model import PoissonRegressor\n    glm_support['PoissonRegressor'] = True\n    from sklearn.linear_model import GammaRegressor\n    glm_support['GammaRegressor'] = True\nexcept BaseException:\n    pass\n\nsgd_one_class_svm_support = False\ntry:\n    from sklearn.linear_model import SGDOneClassSVM\n    sgd_one_class_svm_support = True\nexcept BaseException:\n    pass\n\n\nbisecting_kmeans_support = False\ntry:\n    from sklearn.cluster import BisectingKMeans\n    bisecting_kmeans_support = True\nexcept BaseException:\n    pass\n\nhdbscan_support = False\ntry:\n    from sklearn.cluster import HDBSCAN\n    hdbscan_support = True\nexcept BaseException:\n    pass\n\n\nhist_gradient_boosting_support = False\ntry:\n    from sklearn.ensemble import HistGradientBoostingRegressor\n    from sklearn.ensemble import HistGradientBoostingClassifier\n    
hist_gradient_boosting_support = True\nexcept BaseException:\n    pass\n\nspline_transformer_support = False\ntry:\n    from sklearn.preprocessing import SplineTransformer\n    spline_transformer_support = True\nexcept BaseException:\n    pass\n\ntarget_encoder_support = False\ntry:\n    from sklearn.preprocessing import TargetEncoder\n    target_encoder_support = True\nexcept BaseException:\n    pass\n\nOVERVIEW = \"\"\"\nPyMilo is an open source Python package that provides a simple, efficient, and safe way for users to export pre-trained machine learning models in a transparent way.\n\"\"\"\nPYMILO_VERSION = \"1.6\"\nNOT_SUPPORTED = \"NOT_SUPPORTED\"\nPYMILO_VERSION_DOES_NOT_EXIST = \"Corrupted JSON file, `pymilo_version` doesn't exist in this file.\"\nUNEQUAL_PYMILO_VERSIONS = \"warning: Installed PyMilo version differs from the PyMilo version used to create the JSON file.\"\nUNEQUAL_SKLEARN_VERSIONS = \"warning: Installed Scikit version differs from the Scikit version used to create the JSON file and it may prevent PyMilo from transporting seamlessly.\"\nINVALID_IMPORT_INIT_PARAMS = \"Invalid input parameters, you should either pass a valid file_adr or a json_dump or a url to initiate Import class.\"\nURL_REGEX = r'^(http|https)://[a-zA-Z0-9.-_]+\\.[a-zA-Z]{2,}(/\\S*)?$'\nDOWNLOAD_MODEL_FAILED = \"Failed to download the JSON file, Server didn't respond.\"\nINVALID_DOWNLOADED_MODEL = \"The downloaded content is not a valid JSON file.\"\nBATCH_IMPORT_INVALID_DIRECTORY = \"The given directory does not exist.\"\n\nCLI_ML_STREAMING_NOT_INSTALLED = \"\"\"ML Streaming is not installed.\nTo install ML Streaming, run the following command:\npip install pymilo[streaming]\"\"\"\nCLI_MORE_INFO = \"For more information, visit the PyMilo README at https://github.com/openscilab/pymilo\"\nCLI_UNKNOWN_MODEL = \"The provided ML model name is either invalid or unsupported.\"\n\nSKLEARN_LINEAR_MODEL_TABLE = {\n    \"DummyRegressor\": dummy.DummyRegressor,\n    
\"DummyClassifier\": dummy.DummyClassifier,\n    \"LinearRegression\": linear_model.LinearRegression,\n    \"Ridge\": linear_model.Ridge,\n    \"RidgeCV\": linear_model.RidgeCV,\n    \"RidgeClassifier\": linear_model.RidgeClassifier,\n    \"RidgeClassifierCV\": linear_model.RidgeClassifierCV,\n    \"Lasso\": linear_model.Lasso,\n    \"LassoCV\": linear_model.LassoCV,\n    \"LassoLars\": linear_model.LassoLars,\n    \"LassoLarsCV\": linear_model.LassoLarsCV,\n    \"LassoLarsIC\": linear_model.LassoLarsIC,\n    \"MultiTaskLasso\": linear_model.MultiTaskLasso,\n    \"MultiTaskLassoCV\": linear_model.MultiTaskLassoCV,\n    \"ElasticNet\": linear_model.ElasticNet,\n    \"ElasticNetCV\": linear_model.ElasticNetCV,\n    \"MultiTaskElasticNet\": linear_model.MultiTaskElasticNet,\n    \"MultiTaskElasticNetCV\": linear_model.MultiTaskElasticNetCV,\n    \"OrthogonalMatchingPursuit\": linear_model.OrthogonalMatchingPursuit,\n    \"OrthogonalMatchingPursuitCV\": linear_model.OrthogonalMatchingPursuitCV,\n    \"BayesianRidge\": linear_model.BayesianRidge,\n    \"ARDRegression\": linear_model.ARDRegression,\n    \"LogisticRegression\": linear_model.LogisticRegression,\n    \"LogisticRegressionCV\": linear_model.LogisticRegressionCV,\n    \"TweedieRegressor\": TweedieRegressor if glm_support['TweedieRegressor'] else NOT_SUPPORTED,\n    \"PoissonRegressor\": PoissonRegressor if glm_support['PoissonRegressor'] else NOT_SUPPORTED,\n    \"GammaRegressor\": GammaRegressor if glm_support['GammaRegressor'] else NOT_SUPPORTED,\n    \"SGDRegressor\": linear_model.SGDRegressor,\n    \"SGDClassifier\": linear_model.SGDClassifier,\n    \"SGDOneClassSVM\": SGDOneClassSVM if sgd_one_class_svm_support else NOT_SUPPORTED,\n    \"Perceptron\": linear_model.Perceptron,\n    \"PassiveAggressiveRegressor\": linear_model.PassiveAggressiveRegressor,\n    \"PassiveAggressiveClassifier\": linear_model.PassiveAggressiveClassifier,\n    \"RANSACRegressor\": linear_model.RANSACRegressor,\n    
\"TheilSenRegressor\": linear_model.TheilSenRegressor,\n    \"HuberRegressor\": linear_model.HuberRegressor,\n    \"QuantileRegressor\": QuantileRegressor if quantile_regressor_support else NOT_SUPPORTED,\n}\n\nSKLEARN_NEURAL_NETWORK_TABLE = {\n    \"MLPRegressor\": neural_network.MLPRegressor,\n    \"MLPClassifier\": neural_network.MLPClassifier,\n    \"BernoulliRBM\": neural_network.BernoulliRBM,\n}\n\nSKLEARN_DECISION_TREE_TABLE = {\n    \"DecisionTreeRegressor\": tree.DecisionTreeRegressor,\n    \"DecisionTreeClassifier\": tree.DecisionTreeClassifier,\n    \"ExtraTreeRegressor\": tree.ExtraTreeRegressor,\n    \"ExtraTreeClassifier\": tree.ExtraTreeClassifier\n}\n\nSKLEARN_CLUSTERING_TABLE = {\n    \"KMeans\": cluster.KMeans,\n    \"MiniBatchKMeans\": cluster.MiniBatchKMeans,\n    \"BisectingKMeans\": BisectingKMeans if bisecting_kmeans_support else NOT_SUPPORTED,\n    \"AffinityPropagation\": cluster.AffinityPropagation,\n    \"MeanShift\": cluster.MeanShift,\n    \"SpectralClustering\": cluster.SpectralClustering,\n    \"SpectralBiclustering\": cluster.SpectralBiclustering,\n    \"SpectralCoclustering\": cluster.SpectralCoclustering,\n    \"AgglomerativeClustering\": cluster.AgglomerativeClustering,\n    \"FeatureAgglomeration\": cluster.FeatureAgglomeration,\n    \"DBSCAN\": cluster.DBSCAN,\n    \"HDBSCAN\": HDBSCAN if hdbscan_support else NOT_SUPPORTED,\n    \"OPTICS\": cluster.OPTICS,\n    \"Birch\": cluster.Birch,\n    \"GaussianMixture\": mixture.GaussianMixture,\n    \"BayesianGaussianMixture\": mixture.BayesianGaussianMixture,\n}\n\nSKLEARN_NAIVE_BAYES_TABLE = {\n    \"GaussianNB\": naive_bayes.GaussianNB,\n    \"MultinomialNB\": naive_bayes.MultinomialNB,\n    \"ComplementNB\": naive_bayes.ComplementNB,\n    \"BernoulliNB\": naive_bayes.BernoulliNB,\n    \"CategoricalNB\": naive_bayes.CategoricalNB,\n}\n\nSKLEARN_SVM_TABLE = {\n    \"LinearSVC\": svm.LinearSVC,\n    \"LinearSVR\": svm.LinearSVR,\n    \"NuSVC\": svm.NuSVC,\n    \"NuSVR\": svm.NuSVR,\n   
 \"OneClassSVM\": svm.OneClassSVM,\n    \"SVC\": svm.SVC,\n    \"SVR\": svm.SVR,\n}\n\nSKLEARN_NEIGHBORS_TABLE = {\n    \"KNeighborsRegressor\": neighbors.KNeighborsRegressor,\n    \"KNeighborsClassifier\": neighbors.KNeighborsClassifier,\n    \"RadiusNeighborsRegressor\": neighbors.RadiusNeighborsRegressor,\n    \"RadiusNeighborsClassifier\": neighbors.RadiusNeighborsClassifier,\n    \"NearestNeighbors\": neighbors.NearestNeighbors,\n    \"NearestCentroid\": neighbors.NearestCentroid,\n    \"LocalOutlierFactor\": neighbors.LocalOutlierFactor,\n}\n\nSKLEARN_ENSEMBLE_TABLE = {\n    \"AdaBoostRegressor\": ensemble.AdaBoostRegressor,\n    \"AdaBoostClassifier\": ensemble.AdaBoostClassifier,\n    \"BaggingRegressor\": ensemble.BaggingRegressor,\n    \"BaggingClassifier\": ensemble.BaggingClassifier,\n    \"ExtraTreesClassifier\": ensemble.ExtraTreesClassifier,\n    \"ExtraTreesRegressor\": ensemble.ExtraTreesRegressor,\n    \"GradientBoostingRegressor\": ensemble.GradientBoostingRegressor,\n    \"GradientBoostingClassifier\": ensemble.GradientBoostingClassifier,\n    \"IsolationForest\": ensemble.IsolationForest,\n    \"RandomForestClassifier\": ensemble.RandomForestClassifier,\n    \"RandomForestRegressor\": ensemble.RandomForestRegressor,\n    \"RandomTreesEmbedding\": ensemble.RandomTreesEmbedding,\n    \"StackingRegressor\": ensemble.StackingRegressor,\n    \"StackingClassifier\": ensemble.StackingClassifier,\n    \"VotingRegressor\": ensemble.VotingRegressor,\n    \"VotingClassifier\": ensemble.VotingClassifier,\n    \"HistGradientBoostingRegressor\": HistGradientBoostingRegressor if hist_gradient_boosting_support else NOT_SUPPORTED,\n    \"HistGradientBoostingClassifier\": HistGradientBoostingClassifier if hist_gradient_boosting_support else NOT_SUPPORTED,\n    ####\n    \"Pipeline\": pipeline.Pipeline,\n}\n\nSKLEARN_PREPROCESSING_TABLE = {\n    \"StandardScaler\": preprocessing.StandardScaler,\n    \"MinMaxScaler\": preprocessing.MinMaxScaler,\n    
\"OneHotEncoder\": preprocessing.OneHotEncoder,\n    \"LabelBinarizer\": preprocessing.LabelBinarizer,\n    \"LabelEncoder\": preprocessing.LabelEncoder,\n    \"Binarizer\": preprocessing.Binarizer,\n    \"FunctionTransformer\": preprocessing.FunctionTransformer,\n    \"KernelCenterer\": preprocessing.KernelCenterer,\n    \"MultiLabelBinarizer\": preprocessing.MultiLabelBinarizer,\n    \"MaxAbsScaler\": preprocessing.MaxAbsScaler,\n    \"Normalizer\": preprocessing.Normalizer,\n    \"OrdinalEncoder\": preprocessing.OrdinalEncoder,\n    \"PolynomialFeatures\": preprocessing.PolynomialFeatures,\n    \"RobustScaler\": preprocessing.RobustScaler,\n    \"QuantileTransformer\": preprocessing.QuantileTransformer,\n    \"KBinsDiscretizer\": preprocessing.KBinsDiscretizer,\n    \"PowerTransformer\": preprocessing.PowerTransformer,\n    \"SplineTransformer\": SplineTransformer if spline_transformer_support else NOT_SUPPORTED,\n    \"TargetEncoder\": TargetEncoder if target_encoder_support else NOT_SUPPORTED,\n}\n\nSKLEARN_FEATURE_EXTRACTION_TABLE = {\n    # for raw data:\n    \"DictVectorizer\": feature_extraction.DictVectorizer,\n    \"FeatureHasher\": feature_extraction.FeatureHasher,\n\n    # for image data:\n    \"PatchExtractor\": feature_extraction.image.PatchExtractor,\n\n    # for text data:\n    \"CountVectorizer\": feature_extraction.text.CountVectorizer,\n    \"HashingVectorizer\": feature_extraction.text.HashingVectorizer,\n    \"TfidfTransformer\": feature_extraction.text.TfidfTransformer,\n    \"TfidfVectorizer\": feature_extraction.text.TfidfVectorizer,\n}\n\nSKLEARN_CROSS_DECOMPOSITION_TABLE = {\n    \"PLSRegression\": cross_decomposition.PLSRegression,\n    \"PLSCanonical\": cross_decomposition.PLSCanonical,\n    \"CCA\": cross_decomposition.CCA,\n}\n\nSKLEARN_COMPOSE_TABLE = {\n    \"ColumnTransformer\": compose.ColumnTransformer,\n    \"TransformedTargetRegressor\": compose.TransformedTargetRegressor,\n}\n\nKEYS_NEED_PREPROCESSING_BEFORE_DESERIALIZATION = 
{\n    \"_label_binarizer\": preprocessing.LabelBinarizer,  # in Ridge Classifier\n    \"active_\": np.int32,  # in Lasso Lars\n    \"n_nonzero_coefs_\": np.int64,  # in OMP-CV\n    \"scores_\": dict,  # in Logistic Regression CV,\n    \"_base_loss\": {},  # BaseLoss in Logistic Regression,\n    \"loss_function_\": {},  # LossFunction in SGD Classifier,\n    \"estimator_\": {},  # LinearRegression model inside RANSAC\n}\n\nNUMPY_TYPE_DICT = {\n    \"numpy.intc\": np.intc,\n    \"numpy.int32\": np.int32,\n    \"numpy.int64\": np.int64,\n    \"numpy.float64\": np.float64,\n    \"numpy.infinity\": lambda _: np.inf,\n    \"numpy.uint8\": np.uint8,\n    \"numpy.uint64\": np.uint64,\n    \"numpy.dtype\": np.dtype,\n    \"numpy.nan\": np.nan,\n}\n\nEXPORTED_MODELS_PATH = {\n    \"LINEAR_MODEL\": \"exported_linear_models\",\n    \"NEURAL_NETWORK\": \"exported_neural_networks\",\n    \"DECISION_TREE\": \"exported_decision_trees\",\n    \"CLUSTERING\": \"exported_clusterings\",\n    \"NAIVE_BAYES\": \"exported_naive_bayes\",\n    \"SVM\": \"exported_svms\",\n    \"NEIGHBORS\": \"exported_neighbors\",\n    \"ENSEMBLE\": \"exported_ensembles\",\n    \"CROSS_DECOMPOSITION\": \"exported_cross_decomposition\",\n    \"COMPOSE\": \"exported_composes\",\n}\n\nSKLEARN_SUPPORTED_CATEGORIES = {\n    \"LINEAR_MODEL\": SKLEARN_LINEAR_MODEL_TABLE,\n    \"NEURAL_NETWORK\": SKLEARN_NEURAL_NETWORK_TABLE,\n    \"DECISION_TREE\": SKLEARN_DECISION_TREE_TABLE,\n    \"CLUSTERING\": SKLEARN_CLUSTERING_TABLE,\n    \"NAIVE_BAYES\": SKLEARN_NAIVE_BAYES_TABLE,\n    \"SVM\": SKLEARN_SVM_TABLE,\n    \"NEIGHBORS\": SKLEARN_NEIGHBORS_TABLE,\n    \"ENSEMBLE\": SKLEARN_ENSEMBLE_TABLE,\n    \"CROSS_DECOMPOSITION\": SKLEARN_CROSS_DECOMPOSITION_TABLE,\n    \"COMPOSE\": SKLEARN_COMPOSE_TABLE,\n}\n"
  },
  {
    "path": "pymilo/streaming/__init__.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo ML Streaming.\"\"\"\nfrom .pymilo_client import PymiloClient\nfrom .pymilo_server import PymiloServer\nfrom .compressor import Compression\nfrom .communicator import CommunicationProtocol\n"
  },
  {
    "path": "pymilo/streaming/communicator.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Communication Mediums.\"\"\"\nfrom __future__ import annotations\n\nimport uuid\nimport json\nimport asyncio\nfrom typing import Any, Dict, List, Optional, Union\nimport uvicorn\nimport requests\nimport websockets\nfrom enum import Enum\nfrom fastapi import FastAPI, Request, WebSocket, WebSocketDisconnect, HTTPException\nfrom .interfaces import ClientCommunicator\nfrom .param import (\n    PYMILO_INVALID_URL,\n    PYMILO_CLIENT_WEBSOCKET_NOT_CONNECTED,\n    REST_API_PREFIX,\n    MSG_DOWNLOAD_REQUEST,\n    MSG_UPLOAD_REQUEST,\n    MSG_ATTRIBUTE_CALL_REQUEST,\n    MSG_ATTRIBUTE_TYPE_REQUEST,\n    MSG_REST_DOWNLOAD_REQUEST,\n    MSG_REST_UPLOAD_REQUEST,\n    MSG_REST_ATTRIBUTE_CALL_REQUEST,\n    MSG_REST_ATTRIBUTE_TYPE_REQUEST,\n)\nfrom .util import validate_websocket_url, validate_http_url\n\n\nclass RESTClientCommunicator(ClientCommunicator):\n    \"\"\"Facilitate working with the communication medium from the client side for the REST protocol.\"\"\"\n\n    def __init__(self, server_url: str) -> None:\n        \"\"\"\n        Initialize the Pymilo RESTClientCommunicator instance.\n\n        :param server_url: the url to which PyMilo Server listens\n        \"\"\"\n        is_valid, server_url = validate_http_url(server_url)\n        if not is_valid:\n            raise Exception(PYMILO_INVALID_URL)\n        self._server_url = server_url.rstrip(\"/\") + \"/api/v1\"\n        self.session = requests.Session()\n        retries = requests.adapters.Retry(\n            total=10,\n            backoff_factor=0.1,\n            status_forcelist=[500, 502, 503, 504]\n        )\n        adapter = requests.adapters.HTTPAdapter(max_retries=retries)\n        self.session.mount('http://', adapter)\n        self.session.mount('https://', adapter)\n\n    def download(self, client_id: str, model_id: str) -> str:\n        \"\"\"\n        Request for the remote ML model to download.\n\n        :param client_id: ID of the requesting 
client\n        :param model_id: ID of the model to download\n        \"\"\"\n        url = f\"{self._server_url}/clients/{client_id}/models/{model_id}/download\"\n        response = self.session.get(url, timeout=5)\n        response.raise_for_status()\n        return response.json()[\"payload\"]\n\n    def upload(self, client_id: str, model_id: str, model: Any) -> bool:\n        \"\"\"\n        Upload the local ML model to the remote server.\n\n        :param client_id: ID of the client\n        :param model_id: ID of the model\n        :param model: serialized model content\n        \"\"\"\n        url = f\"{self._server_url}/clients/{client_id}/models/{model_id}/upload\"\n        response = self.session.post(url, json=model, timeout=5)\n        return response.status_code == 200\n\n    def attribute_call(self, client_id: str, model_id: str, call_payload: Dict) -> Dict:\n        \"\"\"\n        Delegate the requested attribute call to the remote server.\n\n        :param client_id: ID of the client\n        :param model_id: ID of the model\n        :param call_payload: payload containing attribute name, args, and kwargs\n        \"\"\"\n        url = f\"{self._server_url}/clients/{client_id}/models/{model_id}/attribute-call\"\n        response = self.session.post(url, json=call_payload, timeout=5)\n        response.raise_for_status()\n        return response.json()\n\n    def attribute_type(self, client_id: str, model_id: str, type_payload: Dict) -> Dict:\n        \"\"\"\n        Identify the attribute type of the requested attribute.\n\n        :param client_id: ID of the client\n        :param model_id: ID of the model\n        :param type_payload: payload containing attribute data to inspect\n        \"\"\"\n        url = f\"{self._server_url}/clients/{client_id}/models/{model_id}/attribute-type\"\n        response = self.session.post(url, json=type_payload, timeout=5)\n        response.raise_for_status()\n        return response.json()\n\n    def 
register_client(self) -> str:\n        \"\"\"Register client in the PyMiloServer.\"\"\"\n        response = self.session.get(f\"{self._server_url}/clients/register\", timeout=5)\n        response.raise_for_status()\n        return response.json()[\"client_id\"]\n\n    def remove_client(self, client_id: str) -> bool:\n        \"\"\"\n        Remove client from the PyMiloServer.\n\n        :param client_id: id of the client to remove\n        \"\"\"\n        response = self.session.delete(f\"{self._server_url}/clients/{client_id}\", timeout=5)\n        return response.status_code == 200\n\n    def register_model(self, client_id: str) -> str:\n        \"\"\"\n        Register ML model in the PyMiloServer.\n\n        :param client_id: id of the client who owns the model\n        \"\"\"\n        response = self.session.post(f\"{self._server_url}/clients/{client_id}/models/register\", timeout=5)\n        response.raise_for_status()\n        return response.json()[\"ml_model_id\"]\n\n    def remove_model(self, client_id: str, model_id: str) -> bool:\n        \"\"\"\n        Remove ML model from the PyMiloServer.\n\n        :param client_id: client owning the model\n        :param model_id: model to remove\n        \"\"\"\n        response = self.session.delete(f\"{self._server_url}/clients/{client_id}/models/{model_id}\", timeout=5)\n        return response.status_code == 200\n\n    def get_ml_models(self, client_id: str) -> List[str]:\n        \"\"\"\n        Get all ML models registered for this specific client in the PyMiloServer.\n\n        :param client_id: client whose models are being queried\n        \"\"\"\n        response = self.session.get(f\"{self._server_url}/clients/{client_id}/models\", timeout=5)\n        response.raise_for_status()\n        return response.json()[\"ml_models_id\"]\n\n    def grant_access(self, allower_id: str, allowee_id: str, model_id: str) -> bool:\n        \"\"\"\n        Grant access to a model to another client.\n\n        :param 
allower_id: ID of the client granting access\n        :param allowee_id: ID of the client being granted access\n        :param model_id: ID of the model being shared\n        \"\"\"\n        url = f\"{self._server_url}/clients/{allower_id}/grant/{allowee_id}/models/{model_id}\"\n        response = self.session.post(url, timeout=5)\n        return response.status_code == 200\n\n    def revoke_access(self, revoker_id: str, revokee_id: str, model_id: str) -> bool:\n        \"\"\"\n        Revoke previously granted model access.\n\n        :param revoker_id: ID of the client revoking access\n        :param revokee_id: ID of the client whose access is being revoked\n        :param model_id: ID of the model\n        \"\"\"\n        url = f\"{self._server_url}/clients/{revoker_id}/revoke/{revokee_id}/models/{model_id}\"\n        response = self.session.post(url, timeout=5)\n        return response.status_code == 200\n\n    def get_allowance(self, allower_id: str) -> Dict[str, List[str]]:\n        \"\"\"\n        Get the list of all allowees and their allowed models from a given allower.\n\n        :param allower_id: ID of the allower\n        \"\"\"\n        response = self.session.get(f\"{self._server_url}/clients/{allower_id}/allowances\", timeout=5)\n        response.raise_for_status()\n        return response.json()[\"allowance\"]\n\n    def get_allowed_models(self, allower_id: str, allowee_id: str) -> List[str]:\n        \"\"\"\n        Get the list of models that one client is allowed to access from another.\n\n        :param allower_id: ID of the model owner\n        :param allowee_id: ID of the requesting client\n        \"\"\"\n        url = f\"{self._server_url}/clients/{allower_id}/allowances/{allowee_id}\"\n        response = self.session.get(url, timeout=5)\n        response.raise_for_status()\n        return response.json()[\"allowed_models\"]\n\n\nclass RESTServerCommunicator():\n    \"\"\"Facilitate working with the communication medium from the server 
side for the REST protocol.\"\"\"\n\n    def __init__(\n            self,\n            ps,\n            host: str = \"127.0.0.1\",\n            port: int = 8000,\n    ):\n        \"\"\"\n        Initialize the Pymilo RESTServerCommunicator instance.\n\n        :param ps: reference to the PyMilo server\n        :param host: the url to which PyMilo Server listens\n        :param port: the port to which PyMilo Server listens\n        \"\"\"\n        self._ps = ps\n        self.host = host\n        self.port = port\n        self.app = FastAPI()\n        self.setup_routes()\n\n    def setup_routes(self):\n        \"\"\"Configure endpoints to handle RESTClientCommunicator requests.\"\"\"\n\n        @self.app.get(f\"{REST_API_PREFIX}/health\")\n        async def health():\n            return {\"status\": \"ok\"}\n\n        @self.app.get(f\"{REST_API_PREFIX}/clients/register\")\n        async def request_client_id():\n            client_id = str(uuid.uuid4())\n            self._ps.init_client(client_id)\n            return {\"client_id\": client_id}\n\n        @self.app.delete(f\"{REST_API_PREFIX}/clients/{{client_id}}\")\n        async def remove_client(client_id: str):\n            is_succeed, detail_message = self._ps.remove_client(client_id)\n            if not is_succeed:\n                raise HTTPException(status_code=404, detail=detail_message)\n            return {\"client_id\": client_id}\n\n        @self.app.post(f\"{REST_API_PREFIX}/clients/{{client_id}}/models/register\")\n        async def request_model(client_id: str):\n            model_id = str(uuid.uuid4())\n            is_succeed, detail_message = self._ps.init_ml_model(client_id, model_id)\n            if not is_succeed:\n                raise HTTPException(status_code=404, detail=detail_message)\n            return {\"client_id\": client_id, \"ml_model_id\": model_id}\n\n        @self.app.delete(f\"{REST_API_PREFIX}/clients/{{client_id}}/models/{{ml_model_id}}\")\n        async def 
remove_model(client_id: str, ml_model_id: str):\n            is_succeed, detail_message = self._ps.remove_ml_model(client_id, ml_model_id)\n            if not is_succeed:\n                raise HTTPException(status_code=404, detail=detail_message)\n            return {\"client_id\": client_id, \"ml_model_id\": ml_model_id}\n\n        @self.app.get(f\"{REST_API_PREFIX}/clients/{{client_id}}/models\")\n        async def get_client_models(client_id: str):\n            return {\"client_id\": client_id, \"ml_models_id\": self._ps.get_ml_models(client_id)}\n\n        @self.app.post(f\"{REST_API_PREFIX}/clients/{{allower_id}}/grant/{{allowee_id}}/models/{{model_id}}\")\n        async def grant_model_access(allower_id: str, allowee_id: str, model_id: str):\n            is_succeed, detail_message = self._ps.grant_access(allower_id, allowee_id, model_id)\n            if not is_succeed:\n                raise HTTPException(status_code=404, detail=detail_message)\n            return {\n                \"allower_id\": allower_id,\n                \"allowee_id\": allowee_id,\n                \"allowed_model_id\": model_id\n            }\n\n        @self.app.post(f\"{REST_API_PREFIX}/clients/{{revoker_id}}/revoke/{{revokee_id}}/models/{{model_id}}\")\n        async def revoke_model_access(revoker_id: str, revokee_id: str, model_id: str):\n            is_succeed, detail_message = self._ps.revoke_access(revoker_id, revokee_id, model_id)\n            if not is_succeed:\n                raise HTTPException(status_code=404, detail=detail_message)\n            return {\n                \"revoker_id\": revoker_id,\n                \"revokee_id\": revokee_id,\n                \"revoked_model_id\": model_id\n            }\n\n        @self.app.get(f\"{REST_API_PREFIX}/clients/{{allower_id}}/allowances\")\n        async def get_allowance(allower_id: str):\n            allowance, reason = self._ps.get_clients_allowance(allower_id)\n            if allowance is None:\n                raise 
HTTPException(status_code=404, detail=reason)\n            return {\"allower_id\": allower_id, \"allowance\": allowance}\n\n        @self.app.get(f\"{REST_API_PREFIX}/clients/{{allower_id}}/allowances/{{allowee_id}}\")\n        async def get_allowed_models(allower_id: str, allowee_id: str):\n            models, reason = self._ps.get_allowed_models(allower_id, allowee_id)\n            if models is None:\n                raise HTTPException(status_code=404, detail=reason)\n            return {\"allower_id\": allower_id, \"allowee_id\": allowee_id, \"allowed_models\": models}\n\n        @self.app.get(f\"{REST_API_PREFIX}/clients/{{client_id}}/models/{{ml_model_id}}/download\")\n        async def download_model(client_id: str, ml_model_id: str):\n            is_valid, reason = self._ps._validate_id(client_id, ml_model_id)\n            if not is_valid:\n                raise HTTPException(status_code=404, detail=reason)\n            return {\n                \"message\": MSG_REST_DOWNLOAD_REQUEST.format(client_id=client_id, ml_model_id=ml_model_id),\n                \"payload\": self._ps.export_model(client_id, ml_model_id)\n            }\n\n        @self.app.post(f\"{REST_API_PREFIX}/clients/{{client_id}}/models/{{ml_model_id}}/upload\")\n        async def upload_model(client_id: str, ml_model_id: str, request: Request):\n            model_data = self.parse(await request.json()).get(\"model\")\n            if model_data is None:\n                raise HTTPException(status_code=400, detail=\"Missing 'model' in request\")\n\n            is_valid, reason = self._ps._validate_id(client_id, ml_model_id)\n            if not is_valid:\n                raise HTTPException(status_code=404, detail=reason)\n\n            return {\n                \"message\": MSG_REST_UPLOAD_REQUEST.format(client_id=client_id, ml_model_id=ml_model_id),\n                \"payload\": self._ps.update_model(client_id, ml_model_id, model_data)\n            }\n\n        
@self.app.post(f\"{REST_API_PREFIX}/clients/{{client_id}}/models/{{ml_model_id}}/attribute-call\")\n        async def attribute_call(client_id: str, ml_model_id: str, request: Request):\n            request_payload = self.parse(await request.json())\n            is_valid, reason = self._ps._validate_id(client_id, ml_model_id)\n            if not is_valid:\n                raise HTTPException(status_code=404, detail=reason)\n            result = self._ps.execute_model(request_payload)\n            return {\n                \"message\": MSG_REST_ATTRIBUTE_CALL_REQUEST.format(client_id=client_id, ml_model_id=ml_model_id),\n                \"payload\": result or \"The ML model has been updated in place.\"\n            }\n\n        @self.app.post(f\"{REST_API_PREFIX}/clients/{{client_id}}/models/{{ml_model_id}}/attribute-type\")\n        async def attribute_type(client_id: str, ml_model_id: str, request: Request):\n            request_payload = self.parse(await request.json())\n            is_valid, reason = self._ps._validate_id(client_id, ml_model_id)\n            if not is_valid:\n                raise HTTPException(status_code=404, detail=reason)\n            is_callable, field_value = self._ps.is_callable_attribute(request_payload)\n            return {\n                \"message\": MSG_REST_ATTRIBUTE_TYPE_REQUEST.format(client_id=client_id, ml_model_id=ml_model_id),\n                \"attribute type\": \"method\" if is_callable else \"field\",\n                \"attribute value\": \"\" if is_callable else field_value,\n            }\n\n    def parse(self, body: str) -> Dict:\n        \"\"\"\n        Parse the compressed encrypted body of the request.\n\n        :param body: request body\n        \"\"\"\n        return json.loads(\n            self._ps._compressor.extract(\n                self._ps._encryptor.decrypt(body)\n            )\n        )\n\n    def run(self):\n        \"\"\"Run the internal FastAPI server.\"\"\"\n        uvicorn.run(self.app, host=self.host, 
port=self.port)\n\n\nclass WebSocketClientCommunicator(ClientCommunicator):\n    \"\"\"Facilitate working with the communication medium from the client side for the WebSocket protocol.\"\"\"\n\n    def __init__(\n            self,\n            server_url: str = \"ws://127.0.0.1:8000\"\n    ):\n        \"\"\"\n        Initialize the WebSocketClientCommunicator instance.\n\n        :param server_url: the WebSocket server URL to connect to.\n        \"\"\"\n        is_valid, url = validate_websocket_url(server_url)\n        if not is_valid:\n            raise Exception(PYMILO_INVALID_URL)\n        self.server_url = url\n        self.websocket = None\n        self.connection_established = asyncio.Event()\n        if asyncio._get_running_loop() is None:\n            self.loop = asyncio.new_event_loop()\n            asyncio.set_event_loop(self.loop)\n        else:\n            self.loop = asyncio.get_event_loop()\n        self.loop.run_until_complete(self.connect())\n\n    def is_socket_closed(self) -> bool:\n        \"\"\"Check if the WebSocket connection is closed.\"\"\"\n        if self.websocket is None:\n            return True\n        elif hasattr(self.websocket, \"closed\"):\n            return self.websocket.closed\n        elif hasattr(self.websocket, \"state\"):\n            return self.websocket.state is websockets.protocol.State.CLOSED\n        return False\n\n    async def connect(self):\n        \"\"\"Establish a WebSocket connection with the server.\"\"\"\n        if self.is_socket_closed():\n            self.websocket = await websockets.connect(self.server_url)\n            print(\"Connected to the WebSocket server.\")\n            self.connection_established.set()\n\n    async def disconnect(self):\n        \"\"\"Close the WebSocket connection.\"\"\"\n        if self.websocket:\n            await self.websocket.close()\n            self.websocket = None\n\n    def _disconnect_sync(self) -> None:\n        \"\"\"Close the WebSocket connection synchronously if still 
open.\"\"\"\n        if self.websocket and not self.is_socket_closed():\n            if self.loop and not self.loop.is_closed():\n                self.loop.run_until_complete(self.disconnect())\n\n    def close(self) -> None:\n        \"\"\"\n        Close the WebSocket connection synchronously.\n\n        This method should be called before the event loop is closed\n        to ensure proper cleanup.\n        \"\"\"\n        self._disconnect_sync()\n\n    def __del__(self) -> None:\n        \"\"\"Clean up WebSocket connection on object destruction.\"\"\"\n        self._disconnect_sync()\n\n    async def send_message(self, action: str, payload: Optional[Dict] = None) -> Dict:\n        \"\"\"\n        Send a message to the WebSocket server.\n\n        :param action: the type of action to perform (e.g., 'download', 'upload').\n        :param payload: the payload associated with the action.\n        \"\"\"\n        await self.connection_established.wait()\n\n        if self.is_socket_closed():\n            raise RuntimeError(PYMILO_CLIENT_WEBSOCKET_NOT_CONNECTED)\n\n        message = json.dumps({\"action\": action, \"payload\": payload or {}})\n        await self.websocket.send(message)\n        response = await self.websocket.recv()\n        return json.loads(response)\n\n    def _check_response_error(self, response: Dict) -> None:\n        \"\"\"\n        Check if the server response contains an error and raise an exception if so.\n\n        :param response: the server's response.\n        \"\"\"\n        if \"error\" in response:\n            raise RuntimeError(response[\"error\"])\n\n    def download(self, client_id: str, model_id: str) -> str:\n        \"\"\"\n        Request for the remote ML model to download.\n\n        :param client_id: ID of the requesting client\n        :param model_id: ID of the model to download\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"download\", {\n                \"client_id\": 
client_id,\n                \"ml_model_id\": model_id,\n            })\n        )\n        self._check_response_error(response)\n        return response.get(\"payload\")\n\n    def upload(self, client_id: str, model_id: str, model: Any) -> bool:\n        \"\"\"\n        Upload the local ML model to the remote server.\n\n        :param client_id: ID of the client\n        :param model_id: ID of the model\n        :param model: serialized model content\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"upload\", {\n                \"client_id\": client_id,\n                \"ml_model_id\": model_id,\n                \"model\": model,\n            })\n        )\n        self._check_response_error(response)\n        return True\n\n    def attribute_call(self, client_id: str, model_id: str, call_payload: Dict) -> Dict:\n        \"\"\"\n        Delegate the requested attribute call to the remote server.\n\n        :param client_id: ID of the client\n        :param model_id: ID of the model\n        :param call_payload: payload containing attribute name, args, and kwargs\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"attribute_call\", {\n                \"client_id\": client_id,\n                \"ml_model_id\": model_id,\n                \"call_payload\": call_payload,\n            })\n        )\n        self._check_response_error(response)\n        return response\n\n    def attribute_type(self, client_id: str, model_id: str, type_payload: Dict) -> Dict:\n        \"\"\"\n        Identify the attribute type of the requested attribute.\n\n        :param client_id: ID of the client\n        :param model_id: ID of the model\n        :param type_payload: payload containing attribute data to inspect\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"attribute_type\", {\n                \"client_id\": client_id,\n                
\"ml_model_id\": model_id,\n                \"type_payload\": type_payload,\n            })\n        )\n        self._check_response_error(response)\n        return response\n\n    def register_client(self) -> str:\n        \"\"\"Register client in the PyMiloServer.\"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"register_client\")\n        )\n        self._check_response_error(response)\n        return response[\"client_id\"]\n\n    def remove_client(self, client_id: str) -> bool:\n        \"\"\"\n        Remove client from the PyMiloServer.\n\n        :param client_id: id of the client to remove\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"remove_client\", {\"client_id\": client_id})\n        )\n        self._check_response_error(response)\n        return True\n\n    def register_model(self, client_id: str) -> str:\n        \"\"\"\n        Register ML model in the PyMiloServer.\n\n        :param client_id: id of the client who owns the model\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"register_model\", {\"client_id\": client_id})\n        )\n        self._check_response_error(response)\n        return response[\"ml_model_id\"]\n\n    def remove_model(self, client_id: str, model_id: str) -> bool:\n        \"\"\"\n        Remove ML model from the PyMiloServer.\n\n        :param client_id: client owning the model\n        :param model_id: model to remove\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"remove_model\", {\n                \"client_id\": client_id,\n                \"ml_model_id\": model_id,\n            })\n        )\n        self._check_response_error(response)\n        return True\n\n    def get_ml_models(self, client_id: str) -> List[str]:\n        \"\"\"\n        Get all ML models registered for this specific client in the PyMiloServer.\n\n        
:param client_id: client whose models are being queried\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"get_ml_models\", {\"client_id\": client_id})\n        )\n        self._check_response_error(response)\n        return response[\"ml_models_id\"]\n\n    def grant_access(self, allower_id: str, allowee_id: str, model_id: str) -> bool:\n        \"\"\"\n        Grant access to a model to another client.\n\n        :param allower_id: ID of the client granting access\n        :param allowee_id: ID of the client being granted access\n        :param model_id: ID of the model being shared\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"grant_access\", {\n                \"allower_id\": allower_id,\n                \"allowee_id\": allowee_id,\n                \"model_id\": model_id,\n            })\n        )\n        self._check_response_error(response)\n        return True\n\n    def revoke_access(self, revoker_id: str, revokee_id: str, model_id: str) -> bool:\n        \"\"\"\n        Revoke previously granted model access.\n\n        :param revoker_id: ID of the client revoking access\n        :param revokee_id: ID of the client whose access is being revoked\n        :param model_id: ID of the model\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"revoke_access\", {\n                \"revoker_id\": revoker_id,\n                \"revokee_id\": revokee_id,\n                \"model_id\": model_id,\n            })\n        )\n        self._check_response_error(response)\n        return True\n\n    def get_allowance(self, allower_id: str) -> Dict[str, List[str]]:\n        \"\"\"\n        Get the list of all allowees and their allowed models from a given allower.\n\n        :param allower_id: ID of the allower\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"get_allowance\", 
{\"allower_id\": allower_id})\n        )\n        self._check_response_error(response)\n        return response[\"allowance\"]\n\n    def get_allowed_models(self, allower_id: str, allowee_id: str) -> List[str]:\n        \"\"\"\n        Get the list of models that one client is allowed to access from another.\n\n        :param allower_id: ID of the model owner\n        :param allowee_id: ID of the requesting client\n        \"\"\"\n        response = self.loop.run_until_complete(\n            self.send_message(\"get_allowed_models\", {\n                \"allower_id\": allower_id,\n                \"allowee_id\": allowee_id,\n            })\n        )\n        self._check_response_error(response)\n        return response[\"allowed_models\"]\n\n\nclass WebSocketServerCommunicator:\n    \"\"\"Facilitate working with the communication medium from the server side for the WebSocket protocol.\"\"\"\n\n    def __init__(\n            self,\n            ps,\n            host: str = \"127.0.0.1\",\n            port: int = 8000,\n    ):\n        \"\"\"\n        Initialize the WebSocketServerCommunicator instance.\n\n        :param ps: reference to the PyMilo server.\n        :param host: the WebSocket server host address.\n        :param port: the WebSocket server port.\n        \"\"\"\n        self._ps = ps\n        self.host = host\n        self.port = port\n        self.app = FastAPI()\n        self.active_connections: list[WebSocket] = []\n        self._action_handlers = {\n            \"register_client\": self._handle_register_client,\n            \"remove_client\": self._handle_remove_client,\n            \"register_model\": self._handle_register_model,\n            \"remove_model\": self._handle_remove_model,\n            \"get_ml_models\": self._handle_get_ml_models,\n            \"grant_access\": self._handle_grant_access,\n            \"revoke_access\": self._handle_revoke_access,\n            \"get_allowance\": self._handle_get_allowance,\n            
\"get_allowed_models\": self._handle_get_allowed_models,\n            \"download\": self._handle_download,\n            \"upload\": self._handle_upload,\n            \"attribute_call\": self._handle_attribute_call,\n            \"attribute_type\": self._handle_attribute_type,\n        }\n        self.setup_routes()\n\n    def setup_routes(self):\n        \"\"\"Configure the WebSocket endpoint to handle client connections.\"\"\"\n        @self.app.websocket(\"/\")\n        async def websocket_endpoint(websocket: WebSocket):\n            await self.connect(websocket)\n            try:\n                while True:\n                    message = await websocket.receive_text()\n                    await self.handle_message(websocket, message)\n            except WebSocketDisconnect:\n                self.disconnect(websocket)\n\n    async def connect(self, websocket: WebSocket) -> None:\n        \"\"\"\n        Accept a WebSocket connection and store it.\n\n        :param websocket: the WebSocket connection to accept.\n        \"\"\"\n        await websocket.accept()\n        self.active_connections.append(websocket)\n\n    def disconnect(self, websocket: WebSocket) -> None:\n        \"\"\"\n        Handle WebSocket disconnection.\n\n        :param websocket: the WebSocket connection to remove.\n        \"\"\"\n        self.active_connections.remove(websocket)\n\n    async def handle_message(self, websocket: WebSocket, message: str) -> None:\n        \"\"\"\n        Handle messages received from WebSocket clients.\n\n        :param websocket: the WebSocket connection from which the message was received.\n        :param message: the message received from the client.\n        \"\"\"\n        try:\n            message = json.loads(message)\n            action = message.get(\"action\")\n            print(f\"Server received action: {action}\")\n            raw_payload = message.get(\"payload\", {})\n            payload = self.parse(raw_payload) if raw_payload else {}\n\n     
       handler = self._action_handlers.get(action)\n            if handler:\n                response = handler(payload)\n            else:\n                response = {\"error\": f\"Unknown action: {action}\"}\n\n            await websocket.send_text(json.dumps(response))\n        except Exception as e:\n            await websocket.send_text(json.dumps({\"error\": str(e)}))\n\n    def _handle_register_client(self, payload: dict) -> dict:\n        \"\"\"\n        Handle client registration requests.\n\n        :param payload: the payload (empty for this action).\n        \"\"\"\n        client_id = str(uuid.uuid4())\n        self._ps.init_client(client_id)\n        return {\n            \"message\": \"Client registered successfully.\",\n            \"client_id\": client_id,\n        }\n\n    def _handle_remove_client(self, payload: dict) -> dict:\n        \"\"\"\n        Handle client removal requests.\n\n        :param payload: the payload containing the client ID to remove.\n        \"\"\"\n        client_id = payload.get(\"client_id\")\n        is_succeed, detail_message = self._ps.remove_client(client_id)\n        if not is_succeed:\n            return {\"error\": detail_message}\n        return {\n            \"message\": \"Client removed successfully.\",\n            \"client_id\": client_id,\n        }\n\n    def _handle_register_model(self, payload: dict) -> dict:\n        \"\"\"\n        Handle ML model registration requests.\n\n        :param payload: the payload containing the client ID.\n        \"\"\"\n        client_id = payload.get(\"client_id\")\n        ml_model_id = str(uuid.uuid4())\n        is_succeed, detail_message = self._ps.init_ml_model(client_id, ml_model_id)\n        if not is_succeed:\n            return {\"error\": detail_message}\n        return {\n            \"message\": \"Model registered successfully.\",\n            \"client_id\": client_id,\n            \"ml_model_id\": ml_model_id,\n        }\n\n    def 
_handle_remove_model(self, payload: dict) -> dict:\n        \"\"\"\n        Handle ML model removal requests.\n\n        :param payload: the payload containing the client ID and model ID.\n        \"\"\"\n        client_id = payload.get(\"client_id\")\n        ml_model_id = payload.get(\"ml_model_id\")\n        is_succeed, detail_message = self._ps.remove_ml_model(client_id, ml_model_id)\n        if not is_succeed:\n            return {\"error\": detail_message}\n        return {\n            \"message\": \"Model removed successfully.\",\n            \"client_id\": client_id,\n            \"ml_model_id\": ml_model_id,\n        }\n\n    def _handle_get_ml_models(self, payload: dict) -> dict:\n        \"\"\"\n        Handle requests to get all ML models for a client.\n\n        :param payload: the payload containing the client ID.\n        \"\"\"\n        client_id = payload.get(\"client_id\")\n        return {\n            \"message\": \"ML models retrieved successfully.\",\n            \"client_id\": client_id,\n            \"ml_models_id\": self._ps.get_ml_models(client_id),\n        }\n\n    def _handle_grant_access(self, payload: dict) -> dict:\n        \"\"\"\n        Handle requests to grant model access to another client.\n\n        :param payload: the payload containing allower_id, allowee_id, and model_id.\n        \"\"\"\n        allower_id = payload.get(\"allower_id\")\n        allowee_id = payload.get(\"allowee_id\")\n        model_id = payload.get(\"model_id\")\n        is_succeed, detail_message = self._ps.grant_access(allower_id, allowee_id, model_id)\n        if not is_succeed:\n            return {\"error\": detail_message}\n        return {\n            \"message\": \"Access granted successfully.\",\n            \"allower_id\": allower_id,\n            \"allowee_id\": allowee_id,\n            \"allowed_model_id\": model_id,\n        }\n\n    def _handle_revoke_access(self, payload: dict) -> dict:\n        \"\"\"\n        Handle requests to revoke 
model access from another client.\n\n        :param payload: the payload containing revoker_id, revokee_id, and model_id.\n        \"\"\"\n        revoker_id = payload.get(\"revoker_id\")\n        revokee_id = payload.get(\"revokee_id\")\n        model_id = payload.get(\"model_id\")\n        is_succeed, detail_message = self._ps.revoke_access(revoker_id, revokee_id, model_id)\n        if not is_succeed:\n            return {\"error\": detail_message}\n        return {\n            \"message\": \"Access revoked successfully.\",\n            \"revoker_id\": revoker_id,\n            \"revokee_id\": revokee_id,\n            \"revoked_model_id\": model_id,\n        }\n\n    def _handle_get_allowance(self, payload: dict) -> dict:\n        \"\"\"\n        Handle requests to get all allowances for a client.\n\n        :param payload: the payload containing the allower_id.\n        \"\"\"\n        allower_id = payload.get(\"allower_id\")\n        allowance, reason = self._ps.get_clients_allowance(allower_id)\n        if allowance is None:\n            return {\"error\": reason}\n        return {\n            \"message\": \"Allowance retrieved successfully.\",\n            \"allower_id\": allower_id,\n            \"allowance\": allowance,\n        }\n\n    def _handle_get_allowed_models(self, payload: dict) -> dict:\n        \"\"\"\n        Handle requests to get allowed models for a specific allowee.\n\n        :param payload: the payload containing allower_id and allowee_id.\n        \"\"\"\n        allower_id = payload.get(\"allower_id\")\n        allowee_id = payload.get(\"allowee_id\")\n        models, reason = self._ps.get_allowed_models(allower_id, allowee_id)\n        if models is None:\n            return {\"error\": reason}\n        return {\n            \"message\": \"Allowed models retrieved successfully.\",\n            \"allower_id\": allower_id,\n            \"allowee_id\": allowee_id,\n            \"allowed_models\": models,\n        }\n\n    def 
_handle_download(self, payload: dict) -> dict:\n        \"\"\"\n        Handle download requests.\n\n        :param payload: the payload containing the ids associated with the requested model for download.\n        \"\"\"\n        client_id = payload.get(\"client_id\")\n        ml_model_id = payload.get(\"ml_model_id\")\n        is_valid, reason = self._ps._validate_id(client_id, ml_model_id)\n        if not is_valid:\n            return {\"error\": reason}\n        return {\n            \"message\": MSG_DOWNLOAD_REQUEST.format(client_id=client_id, ml_model_id=ml_model_id),\n            \"payload\": self._ps.export_model(client_id, ml_model_id),\n        }\n\n    def _handle_upload(self, payload: dict) -> dict:\n        \"\"\"\n        Handle upload requests.\n\n        :param payload: the payload containing the model data to upload.\n        \"\"\"\n        client_id = payload.get(\"client_id\")\n        ml_model_id = payload.get(\"ml_model_id\")\n        encrypted_model = payload.get(\"model\")\n        if encrypted_model is None:\n            return {\"error\": \"Missing 'model' in request\"}\n        decrypted_model = self.parse(encrypted_model)\n        model_data = decrypted_model.get(\"model\")\n        if model_data is None:\n            return {\"error\": \"Missing 'model' in decrypted request\"}\n        is_valid, reason = self._ps._validate_id(client_id, ml_model_id)\n        if not is_valid:\n            return {\"error\": reason}\n        self._ps.update_model(client_id, ml_model_id, model_data)\n        return {\n            \"message\": MSG_UPLOAD_REQUEST.format(client_id=client_id, ml_model_id=ml_model_id),\n        }\n\n    def _handle_attribute_call(self, payload: dict) -> dict:\n        \"\"\"\n        Handle attribute call requests.\n\n        :param payload: the payload containing the attribute call details.\n        \"\"\"\n        client_id = payload.get(\"client_id\")\n        ml_model_id = payload.get(\"ml_model_id\")\n        
encrypted_call_payload = payload.get(\"call_payload\")\n        if encrypted_call_payload is None:\n            return {\"error\": \"Missing 'call_payload' in request\"}\n        call_payload = self.parse(encrypted_call_payload)\n        is_valid, reason = self._ps._validate_id(client_id, ml_model_id)\n        if not is_valid:\n            return {\"error\": reason}\n        result = self._ps.execute_model(call_payload)\n        return {\n            \"message\": MSG_ATTRIBUTE_CALL_REQUEST.format(client_id=client_id, ml_model_id=ml_model_id),\n            \"payload\": result if result else \"The ML model has been updated in place.\",\n        }\n\n    def _handle_attribute_type(self, payload: dict) -> dict:\n        \"\"\"\n        Handle attribute type queries.\n\n        :param payload: the payload containing the attribute to query.\n        \"\"\"\n        client_id = payload.get(\"client_id\")\n        ml_model_id = payload.get(\"ml_model_id\")\n        encrypted_type_payload = payload.get(\"type_payload\")\n        if encrypted_type_payload is None:\n            return {\"error\": \"Missing 'type_payload' in request\"}\n        type_payload = self.parse(encrypted_type_payload)\n        is_valid, reason = self._ps._validate_id(client_id, ml_model_id)\n        if not is_valid:\n            return {\"error\": reason}\n        is_callable, field_value = self._ps.is_callable_attribute(type_payload)\n        return {\n            \"message\": MSG_ATTRIBUTE_TYPE_REQUEST.format(client_id=client_id, ml_model_id=ml_model_id),\n            \"attribute type\": \"method\" if is_callable else \"field\",\n            \"attribute value\": \"\" if is_callable else field_value,\n        }\n\n    def parse(self, message: Union[str, Dict]) -> Dict:\n        \"\"\"\n        Parse the encrypted and compressed message.\n\n        :param message: the encrypted and compressed message to parse.\n        \"\"\"\n        if isinstance(message, dict):\n            return message\n        
return json.loads(\n            self._ps._compressor.extract(\n                self._ps._encryptor.decrypt(message)\n            )\n        )\n\n    def run(self):\n        \"\"\"Run the internal FastAPI server.\"\"\"\n        uvicorn.run(self.app, host=self.host, port=self.port)\n\n\nclass CommunicationProtocol(Enum):\n    \"\"\"Communication protocol.\"\"\"\n\n    REST = {\n        \"CLIENT\": RESTClientCommunicator,\n        \"SERVER\": RESTServerCommunicator,\n    }\n    WEBSOCKET = {\n        \"CLIENT\": WebSocketClientCommunicator,\n        \"SERVER\": WebSocketServerCommunicator,\n    }\n"
  },
  {
    "path": "pymilo/streaming/compressor.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"Implementations of Compressor interface.\"\"\"\nimport gzip\nimport zlib\nimport lzma\nimport bz2\nimport json\nimport base64\nfrom enum import Enum\nfrom pymilo.streaming.interfaces import Compressor\n\n\nclass DummyCompressor(Compressor):\n    \"\"\"A dummy implementation of the Compressor interface.\"\"\"\n\n    @staticmethod\n    def compress(payload):\n        \"\"\"Return the payload as a string without compressing it (non-string payloads are serialized to JSON).\"\"\"\n        return payload if isinstance(payload, str) else json.dumps(payload)\n\n    @staticmethod\n    def extract(payload):\n        \"\"\"Return the given payload unchanged (no extraction applied).\"\"\"\n        return payload\n\n\nclass GZIPCompressor(Compressor):\n    \"\"\"GZIP implementation of the Compressor interface.\"\"\"\n\n    @staticmethod\n    def compress(payload):\n        \"\"\"Compress the given payload using gzip.\"\"\"\n        if isinstance(payload, str):\n            data = payload.encode('utf-8')\n        else:\n            data = json.dumps(payload).encode('utf-8')\n        compressed_data = gzip.compress(data)\n        return base64.b64encode(compressed_data).decode('utf-8')\n\n    @staticmethod\n    def extract(payload):\n        \"\"\"Extract the given payload using gzip.\"\"\"\n        data = base64.b64decode(payload)\n        return gzip.decompress(data).decode('utf-8')\n\n\nclass ZLIBCompressor(Compressor):\n    \"\"\"ZLIB implementation of the Compressor interface.\"\"\"\n\n    @staticmethod\n    def compress(payload):\n        \"\"\"Compress the given payload using zlib.\"\"\"\n        if isinstance(payload, str):\n            data = payload.encode('utf-8')\n        else:\n            data = json.dumps(payload).encode('utf-8')\n        compressed_data = zlib.compress(data)\n        return base64.b64encode(compressed_data).decode('utf-8')\n\n    @staticmethod\n    def extract(payload):\n        
\"\"\"Extract the given payload using zlib.\"\"\"\n        data = base64.b64decode(payload)\n        return zlib.decompress(data).decode('utf-8')\n\n\nclass LZMACompressor(Compressor):\n    \"\"\"LZMA implementation of the Compressor interface.\"\"\"\n\n    @staticmethod\n    def compress(payload):\n        \"\"\"Compress the given payload using lzma.\"\"\"\n        if isinstance(payload, str):\n            data = payload.encode('utf-8')\n        else:\n            data = json.dumps(payload).encode('utf-8')\n        compressed_data = lzma.compress(data)\n        return base64.b64encode(compressed_data).decode('utf-8')\n\n    @staticmethod\n    def extract(payload):\n        \"\"\"Extract the given payload using lzma.\"\"\"\n        data = base64.b64decode(payload)\n        return lzma.decompress(data).decode('utf-8')\n\n\nclass BZ2Compressor(Compressor):\n    \"\"\"BZ2 implementation of the Compressor interface.\"\"\"\n\n    @staticmethod\n    def compress(payload):\n        \"\"\"Compress the given payload using bz2.\"\"\"\n        if isinstance(payload, str):\n            data = payload.encode('utf-8')\n        else:\n            data = json.dumps(payload).encode('utf-8')\n        compressed_data = bz2.compress(data)\n        return base64.b64encode(compressed_data).decode('utf-8')\n\n    @staticmethod\n    def extract(payload):\n        \"\"\"Extract the given payload using bz2.\"\"\"\n        data = base64.b64decode(payload)\n        return bz2.decompress(data).decode('utf-8')\n\n\nclass Compression(Enum):\n    \"\"\"Compression method used in end to end communication.\"\"\"\n\n    NULL = DummyCompressor\n    GZIP = GZIPCompressor\n    ZLIB = ZLIBCompressor\n    LZMA = LZMACompressor\n    BZ2 = BZ2Compressor\n"
  },
  {
    "path": "pymilo/streaming/encryptor.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"Implementations of Encryptor interface.\"\"\"\nfrom .interfaces import Encryptor\n\n\nclass DummyEncryptor(Encryptor):\n    \"\"\"A dummy implementation of the Encryptor interface.\"\"\"\n\n    @staticmethod\n    def encrypt(payload):\n        \"\"\"Encrypt the given payload in a dummy way, simply just return it (no encryption applied).\"\"\"\n        return payload\n\n    @staticmethod\n    def decrypt(payload):\n        \"\"\"Decrypt the given payload in a dummy way, simply just return it (no decryption applied).\"\"\"\n        return payload\n"
  },
  {
    "path": "pymilo/streaming/interfaces.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo ML Streaming Interfaces.\"\"\"\nfrom abc import ABC, abstractmethod\n\n\nclass Compressor(ABC):\n    \"\"\"\n    Compressor Interface.\n\n    Each Compressor has methods to compress the given payload or extract it back to the original one.\n    \"\"\"\n\n    @abstractmethod\n    def compress(payload):\n        \"\"\"\n        Compress the given payload.\n\n        :param payload: payload to get compressed\n        :type payload: str\n        :return: the compressed version\n        \"\"\"\n\n    @abstractmethod\n    def extract(payload):\n        \"\"\"\n        Extract the given previously compressed payload.\n\n        :param payload: payload to get extracted\n        :type payload: str\n        :return: the extracted version\n        \"\"\"\n\n\nclass Encryptor(ABC):\n    \"\"\"\n    Encryptor Interface.\n\n    Each Encryptor has methods to encrypt the given payload or decrypt it back to the original one.\n    \"\"\"\n\n    @abstractmethod\n    def encrypt(payload):\n        \"\"\"\n        Encrypt the given payload.\n\n        :param payload: payload to get encrypted\n        :type payload: str\n        :return: the encrypted version\n        \"\"\"\n\n    @abstractmethod\n    def decrypt(payload):\n        \"\"\"\n        Decrypt the given previously encrypted payload.\n\n        :param payload: payload to get decrypted\n        :type payload: str\n        :return: the decrypted version\n        \"\"\"\n\n\nclass ClientCommunicator(ABC):\n    \"\"\"\n    ClientCommunicator Interface.\n\n    Defines the contract for client-server communication. 
Each implementation is responsible for:\n    - Registering and removing clients and models\n    - Uploading and downloading ML models\n    - Handling delegated attribute access\n    - Managing model allowances between clients\n    \"\"\"\n\n    @abstractmethod\n    def register_client(self):\n        \"\"\"\n        Register the client in the remote server.\n\n        :return: newly allocated client ID\n        :rtype: str\n        \"\"\"\n\n    @abstractmethod\n    def remove_client(self, client_id):\n        \"\"\"\n        Remove the client from the remote server.\n\n        :param client_id: client ID to remove\n        :type client_id: str\n        :return: success status\n        :rtype: bool\n        \"\"\"\n\n    @abstractmethod\n    def register_model(self, client_id):\n        \"\"\"\n        Register an ML model for the given client.\n\n        :param client_id: client ID\n        :type client_id: str\n        :return: newly allocated model ID\n        :rtype: str\n        \"\"\"\n\n    @abstractmethod\n    def remove_model(self, client_id, model_id):\n        \"\"\"\n        Remove the specified ML model for the client.\n\n        :param client_id: client ID\n        :type client_id: str\n        :param model_id: model ID\n        :type model_id: str\n        :return: success status\n        :rtype: bool\n        \"\"\"\n\n    @abstractmethod\n    def get_ml_models(self, client_id):\n        \"\"\"\n        Get the list of ML models for the given client.\n\n        :param client_id: client ID\n        :type client_id: str\n        :return: list of model IDs\n        :rtype: list[str]\n        \"\"\"\n\n    @abstractmethod\n    def grant_access(self, allower_id, allowee_id, model_id):\n        \"\"\"\n        Grant access to a model from one client to another.\n\n        :param allower_id: client who owns the model\n        :type allower_id: str\n        :param allowee_id: client to be granted access\n        :type allowee_id: str\n        :param 
model_id: model ID\n        :type model_id: str\n        :return: success status\n        :rtype: bool\n        \"\"\"\n\n    @abstractmethod\n    def revoke_access(self, revoker_id, revokee_id, model_id):\n        \"\"\"\n        Revoke model access from one client to another.\n\n        :param revoker_id: client who owns the model\n        :type revoker_id: str\n        :param revokee_id: client to be revoked\n        :type revokee_id: str\n        :param model_id: model ID\n        :type model_id: str\n        :return: success status\n        :rtype: bool\n        \"\"\"\n\n    @abstractmethod\n    def get_allowance(self, allower_id):\n        \"\"\"\n        Get all clients and models this client has allowed.\n\n        :param allower_id: client who granted access\n        :type allower_id: str\n        :return: dictionary mapping allowee_id to list of model_ids\n        :rtype: dict\n        \"\"\"\n\n    @abstractmethod\n    def get_allowed_models(self, allower_id, allowee_id):\n        \"\"\"\n        Get the list of model IDs that `allowee_id` is allowed to access from `allower_id`.\n\n        :param allower_id: model owner\n        :type allower_id: str\n        :param allowee_id: recipient\n        :type allowee_id: str\n        :return: list of allowed model IDs\n        :rtype: list[str]\n        \"\"\"\n\n    @abstractmethod\n    def upload(self, client_id, model_id, model):\n        \"\"\"\n        Upload the local ML model to the remote server.\n\n        :param client_id: ID of the client\n        :param model_id: ID of the model\n        :param model: serialized model content\n        :return: True if upload was successful, False otherwise\n        \"\"\"\n\n    @abstractmethod\n    def download(self, client_id, model_id):\n        \"\"\"\n        Download the remote ML model.\n\n        :param client_id: ID of the requesting client\n        :param model_id: ID of the model to download\n        :return: string serialized model\n        \"\"\"\n\n   
 @abstractmethod\n    def attribute_call(self, client_id, model_id, call_payload):\n        \"\"\"\n        Execute an attribute call on the remote server.\n\n        :param client_id: ID of the client\n        :param model_id: ID of the model\n        :param call_payload: payload containing attribute name, args, and kwargs\n        :return: remote server response\n        \"\"\"\n\n    @abstractmethod\n    def attribute_type(self, client_id, model_id, type_payload):\n        \"\"\"\n        Identify the attribute type (method or field) on the remote model.\n\n        :param client_id: client ID\n        :param model_id: model ID\n        :param type_payload: payload containing targeted attribute\n        :return: remote server response\n        \"\"\"\n"
  },
  {
    "path": "pymilo/streaming/param.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"Streaming Parameters and constants.\"\"\"\nPYMILO_CLIENT_INVALID_MODE = \"Invalid mode, the given mode should be either `LOCAL`[default] or `DELEGATE`.\"\nPYMILO_CLIENT_MODEL_SYNCHED = \"PyMiloClient synched the local ML model with the remote one successfully.\"\nPYMILO_CLIENT_LOCAL_MODEL_UPLOADED = \"PyMiloClient uploaded the local model successfully.\"\nPYMILO_CLIENT_LOCAL_MODEL_UPLOAD_FAILED = \"PyMiloClient failed to upload the local model.\"\nPYMILO_CLIENT_INVALID_ATTRIBUTE = \"This attribute doesn't exist in either PymiloClient or the inner ML model.\"\nPYMILO_CLIENT_FAILED_TO_DOWNLOAD_REMOTE_MODEL = \"PyMiloClient failed to download the remote ML model.\"\n\nPYMILO_SERVER_NON_EXISTENT_ATTRIBUTE = \"The requested attribute doesn't exist in this model.\"\nPYMILO_INVALID_URL = \"The given URL is not valid.\"\nPYMILO_CLIENT_WEBSOCKET_NOT_CONNECTED = \"WebSocket is not connected.\"\n\nREST_API_PREFIX = \"/api/v1\"\n\nMSG_DOWNLOAD_REQUEST = \"Download request from client: {client_id} for model: {ml_model_id}\"\nMSG_UPLOAD_REQUEST = \"Upload request from client: {client_id} for model: {ml_model_id}\"\nMSG_ATTRIBUTE_CALL_REQUEST = \"Attribute call request from client: {client_id} for model: {ml_model_id}\"\nMSG_ATTRIBUTE_TYPE_REQUEST = \"Attribute type request from client: {client_id} for model: {ml_model_id}\"\nMSG_REST_DOWNLOAD_REQUEST = \"/download request from client: {client_id} for model: {ml_model_id}\"\nMSG_REST_UPLOAD_REQUEST = \"/upload request from client: {client_id} for model: {ml_model_id}\"\nMSG_REST_ATTRIBUTE_CALL_REQUEST = \"/attribute_call request from client: {client_id} for model: {ml_model_id}\"\nMSG_REST_ATTRIBUTE_TYPE_REQUEST = \"/attribute_type request from client: {client_id} for model: {ml_model_id}\"\n"
  },
  {
    "path": "pymilo/streaming/pymilo_client.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMiloClient for RESTFull Protocol.\"\"\"\nfrom enum import Enum\nfrom .encryptor import DummyEncryptor\nfrom .compressor import Compression\nfrom ..pymilo_obj import Export, Import\nfrom .param import PYMILO_CLIENT_INVALID_MODE, PYMILO_CLIENT_MODEL_SYNCHED, \\\n    PYMILO_CLIENT_LOCAL_MODEL_UPLOADED, PYMILO_CLIENT_LOCAL_MODEL_UPLOAD_FAILED, \\\n    PYMILO_CLIENT_INVALID_ATTRIBUTE, PYMILO_CLIENT_FAILED_TO_DOWNLOAD_REMOTE_MODEL\nfrom .communicator import CommunicationProtocol\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\n\n\nclass PymiloClient:\n    \"\"\"Facilitate working with the PyMilo server.\"\"\"\n\n    class Mode(Enum):\n        \"\"\"fallback state of the PyMiloClient.\"\"\"\n\n        LOCAL = 1\n        DELEGATE = 2\n\n    def __init__(\n            self,\n            model=None,\n            mode=Mode.LOCAL,\n            compressor=Compression.NULL,\n            server_url=\"127.0.0.1:8000\",\n            communication_protocol=CommunicationProtocol.REST,\n    ):\n        \"\"\"\n        Initialize the Pymilo PymiloClient instance.\n\n        :param model: the ML model PyMiloClient wrapped around\n        :type model: Any\n        :param mode: the mode in which PymiloClient should work, either LOCAL mode or DELEGATE\n        :type mode: str (LOCAL|DELEGATE)\n        :param compressor: the compression method to be used in client-server communications\n        :type compressor: pymilo.streaming.compressor.Compression\n        :param server_url: the url to which PyMilo Server listens\n        :type server_url: str\n        :param communication_protocol: The communication protocol to be used by PymiloClient\n        :type communication_protocol: pymilo.streaming.communicator.CommunicationProtocol\n        :return: an instance of the Pymilo PymiloClient class\n        \"\"\"\n        self.model = model\n        self.client_id = \"0x_client_id\"\n        
self.ml_model_id = \"0x_ml_model_id\"\n        self._mode = mode\n        self._compressor = compressor.value\n        self._encryptor = DummyEncryptor()\n        self._communicator = communication_protocol.value[\"CLIENT\"](server_url)\n\n    def encrypt_compress(self, body):\n        \"\"\"\n        Compress and encrypt the body payload.\n\n        :param body: body payload of the request\n        :type body: dict\n        :return: the compressed and encrypted version of the body payload\n        \"\"\"\n        return self._encryptor.encrypt(\n            self._compressor.compress(body)\n        )\n\n    def toggle_mode(self, mode=Mode.LOCAL):\n        \"\"\"\n        Set the PyMiloClient mode to either LOCAL or DELEGATE.\n\n        :param mode: the mode to switch to\n        :type mode: PymiloClient.Mode\n        :return: None\n        \"\"\"\n        if mode not in PymiloClient.Mode.__members__.values():\n            raise Exception(PYMILO_CLIENT_INVALID_MODE)\n        if mode != self._mode:\n            self._mode = mode\n\n    def download(self):\n        \"\"\"\n        Download the remote ML model and replace the local model with it.\n\n        :return: None\n        \"\"\"\n        serialized_model = self._communicator.download(\n            self.client_id,\n            self.ml_model_id\n        )\n        if serialized_model is None:\n            print(PYMILO_CLIENT_FAILED_TO_DOWNLOAD_REMOTE_MODEL)\n            return\n        self.model = Import(file_adr=None, json_dump=serialized_model).to_model()\n        print(PYMILO_CLIENT_MODEL_SYNCHED)\n\n    def upload(self):\n        \"\"\"\n        Upload the local ML model to the remote server.\n\n        :return: None\n        \"\"\"\n        succeed = self._communicator.upload(\n            self.client_id,\n            self.ml_model_id,\n            self.encrypt_compress({\"model\": Export(self.model).to_json()})\n        )\n        if succeed:\n            print(PYMILO_CLIENT_LOCAL_MODEL_UPLOADED)\n        else:\n            print(PYMILO_CLIENT_LOCAL_MODEL_UPLOAD_FAILED)\n\n    def 
register(self):\n        \"\"\"\n        Register client in the remote server.\n\n        :return: None\n        \"\"\"\n        self.client_id = self._communicator.register_client()\n\n    def deregister(self):\n        \"\"\"\n        Deregister client in the remote server.\n\n        :return: None\n        \"\"\"\n        self._communicator.remove_client(self.client_id)\n        self.client_id = \"0x_client_id\"\n\n    def register_ml_model(self):\n        \"\"\"\n        Register ML model in the remote server.\n\n        :return: None\n        \"\"\"\n        self.ml_model_id = self._communicator.register_model(self.client_id)\n\n    def deregister_ml_model(self):\n        \"\"\"\n        Deregister ML model in the remote server.\n\n        :return: None\n        \"\"\"\n        self._communicator.remove_model(self.client_id, self.ml_model_id)\n        self.ml_model_id = \"0x_ml_model_id\"\n\n    def get_ml_models(self):\n        \"\"\"\n        Get all registered ml models in the remote server for this client.\n\n        :return: list of ml model ids\n        \"\"\"\n        return self._communicator.get_ml_models(self.client_id)\n\n    def grant_access(self, allowee_id):\n        \"\"\"\n        Grant access to one of this client's models to another client.\n\n        :param allowee_id: The client ID to grant access to\n        :return: True if successful, False otherwise\n        \"\"\"\n        return self._communicator.grant_access(\n            self.client_id,\n            allowee_id,\n            self.ml_model_id\n        )\n\n    def revoke_access(self, revokee_id):\n        \"\"\"\n        Revoke access previously granted to another client.\n\n        :param revokee_id: The client ID to revoke access from\n        :return: True if successful, False otherwise\n        \"\"\"\n        return self._communicator.revoke_access(\n            self.client_id,\n            revokee_id,\n            self.ml_model_id\n        )\n\n    def get_allowance(self):\n    
    \"\"\"\n        Get a dictionary of all clients who have access to this client's models.\n\n        :return: Dict of allowee_id -> list of model_ids\n        \"\"\"\n        return self._communicator.get_allowance(self.client_id)\n\n    def get_allowed_models(self, allower_id):\n        \"\"\"\n        Get a list of models you are allowed to access from another client.\n\n        :param allower_id: The client ID who owns the models\n        :return: list of allowed model IDs\n        \"\"\"\n        return self._communicator.get_allowed_models(allower_id, self.client_id)\n\n    def close(self):\n        \"\"\"\n        Close the client connection and release resources.\n\n        :return: None\n        \"\"\"\n        if hasattr(self._communicator, 'close'):\n            self._communicator.close()\n\n    def __enter__(self):\n        \"\"\"\n        Enter the context manager.\n\n        :return: self\n        \"\"\"\n        return self\n\n    def __exit__(self, _exc_type, _exc_val, _exc_tb):\n        \"\"\"\n        Exit the context manager and close the connection.\n\n        :return: False\n        \"\"\"\n        self.close()\n        return False\n\n    def __del__(self):\n        \"\"\"Clean up resources on object destruction.\"\"\"\n        self.close()\n\n    def __getattr__(self, attribute):\n        \"\"\"\n        Overwrite the __getattr__ default function to extract requested.\n\n            1. If self._mode is LOCAL, extract the requested from inner ML model and returns it\n            2. 
If self._mode is DELEGATE, returns a wrapper relayer which delegates the request to the remote server by execution\n\n        :return: Any\n        \"\"\"\n        if self._mode == PymiloClient.Mode.LOCAL:\n            if attribute in dir(self.model):\n                return getattr(self.model, attribute)\n            else:\n                raise AttributeError(PYMILO_CLIENT_INVALID_ATTRIBUTE)\n        elif self._mode == PymiloClient.Mode.DELEGATE:\n            gdst = GeneralDataStructureTransporter()\n            response = self._communicator.attribute_type(\n                self.client_id,\n                self.ml_model_id,\n                self.encrypt_compress({\n                    \"attribute\": attribute,\n                    \"client_id\": self.client_id,\n                    \"ml_model_id\": self.ml_model_id,\n                })\n            )\n            if response[\"attribute type\"] == \"field\":\n                return gdst.deserialize(response, \"attribute value\", None)\n\n            def relayer(*args, **kwargs):\n                payload = {\n                    \"client_id\": self.client_id,\n                    \"ml_model_id\": self.ml_model_id,\n                    'attribute': attribute,\n                    'args': args,\n                    'kwargs': kwargs,\n                }\n                payload[\"args\"] = gdst.serialize(payload, \"args\", None)\n                payload[\"kwargs\"] = gdst.serialize(payload, \"kwargs\", None)\n                result = self._communicator.attribute_call(\n                    self.client_id,\n                    self.ml_model_id,\n                    self.encrypt_compress(payload)\n                )\n                return gdst.deserialize(result, \"payload\", None)\n            return relayer\n"
  },
  {
    "path": "pymilo/streaming/pymilo_server.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMiloServer for RESTFull protocol.\"\"\"\nfrom ..pymilo_obj import Export, Import\nfrom .compressor import Compression\nfrom .encryptor import DummyEncryptor\nfrom .communicator import CommunicationProtocol\nfrom .param import PYMILO_SERVER_NON_EXISTENT_ATTRIBUTE\nfrom ..transporters.general_data_structure_transporter import GeneralDataStructureTransporter\n\n\nclass PymiloServer:\n    \"\"\"Facilitate streaming the ML models.\"\"\"\n\n    def __init__(\n            self,\n            port=8000,\n            host=\"127.0.0.1\",\n            compressor=Compression.NULL,\n            communication_protocol=CommunicationProtocol.REST,\n    ):\n        \"\"\"\n        Initialize the Pymilo PymiloServer instance.\n\n        :param model: the ML model which will be streamed\n        :type model: any\n        :param port: the port to which PyMiloServer listens\n        :type port: int\n        :param host: the url to which PyMilo Server listens\n        :type host: str\n        :param compressor: the compression method to be used in client-server communications\n        :type compressor: pymilo.streaming.compressor.Compression\n        :param communication_protocol: The communication protocol to be used by PymiloServer\n        :type communication_protocol: pymilo.streaming.communicator.CommunicationProtocol\n        :return: an instance of the PymiloServer class\n        \"\"\"\n        self._compressor = compressor.value\n        self._encryptor = DummyEncryptor()\n        # In-memory storage (replace with a database for persistence)\n        self.communicator = communication_protocol.value[\"SERVER\"](ps=self, host=host, port=port)\n        self._clients = {}\n        self._allowance = {}\n\n    def export_model(self, client_id, ml_model_id):\n        \"\"\"\n        Export the ML model to string json dump using PyMilo Export class.\n\n        :return: str\n        \"\"\"\n        return 
Export(self._clients[client_id][ml_model_id]).to_json()\n\n    def update_model(self, client_id, ml_model_id, serialized_model):\n        \"\"\"\n        Update the PyMilo Server's ML model.\n\n        :param serialized_model: the json dump of a pymilo export ml model\n        :type serialized_model: str\n        :return: None\n        \"\"\"\n        self._clients[client_id][ml_model_id] = Import(file_adr=None, json_dump=serialized_model).to_model()\n\n    def execute_model(self, request):\n        \"\"\"\n        Execute the request attribute call from PyMilo Client.\n\n        :param request: request obj containing requested attribute to call with the associated args and kwargs\n        :type request: obj\n        :return: str | dict\n        \"\"\"\n        gdst = GeneralDataStructureTransporter()\n        attribute = request[\"attribute\"] if isinstance(request, dict) else request.attribute\n        _client_id = request[\"client_id\"] if isinstance(request, dict) else request.client_id\n        _ml_model_id = request[\"ml_model_id\"] if isinstance(request, dict) else request.ml_model_id\n        _ml_model = self._clients[_client_id][_ml_model_id]\n        retrieved_attribute = getattr(_ml_model, attribute, None)\n        if retrieved_attribute is None:\n            raise Exception(PYMILO_SERVER_NON_EXISTENT_ATTRIBUTE)\n        arguments = {\n            'args': request[\"args\"] if isinstance(request, dict) else request.args,\n            'kwargs': request[\"kwargs\"] if isinstance(request, dict) else request.kwargs,\n        }\n        args = gdst.deserialize(arguments, 'args', None)\n        kwargs = gdst.deserialize(arguments, 'kwargs', None)\n        output = retrieved_attribute(*args, **kwargs)\n        if isinstance(output, type(_ml_model)):\n            self._clients[_client_id][_ml_model_id] = output\n            return None\n        return gdst.serialize({'output': output}, 'output', None)\n\n    def is_callable_attribute(self, request):\n        
\"\"\"\n        Check whether the requested attribute is callable or not.\n\n        :param request: request obj containing requested attribute to check it's type\n        :type request: obj\n        :return: True if it is callable False otherwise\n        \"\"\"\n        attribute = request[\"attribute\"] if isinstance(request, dict) else request.attribute\n        _client_id = request[\"client_id\"] if isinstance(request, dict) else request.client_id\n        _ml_model_id = request[\"ml_model_id\"] if isinstance(request, dict) else request.ml_model_id\n        _ml_model = self._clients[_client_id][_ml_model_id]\n        retrieved_attribute = getattr(_ml_model, attribute, None)\n        if callable(retrieved_attribute):\n            return True, None\n        else:\n            return False, GeneralDataStructureTransporter().serialize({'output': retrieved_attribute}, 'output', None)\n\n    def _validate_id(self, client_id, ml_model_id):\n        \"\"\"\n        Validate the provided client ID and machine learning model ID.\n\n        :param client_id: The ID of the client to validate.\n        :type client_id: str\n        :param ml_model_id: The ID of the machine learning model to validate.\n        :type ml_model_id: str\n        :return: A tuple containing a boolean indicating validity and an error message if invalid.\n        \"\"\"\n        if client_id not in self._clients:\n            return False, \"The given client_id is invalid.\"\n        if ml_model_id not in self._clients[client_id]:\n            return False, \"The given client_id is valid but requested ml_model_id is invalid.\"\n        return True, None\n\n    def init_client(self, client_id):\n        \"\"\"\n        Initialize a new client with the given client ID.\n\n        :param client_id: The ID of the client to initialize.\n        :type client_id: str\n        :return: A tuple containing a boolean indicating success and an error message if the client already exists.\n        \"\"\"\n      
  if client_id in self._clients:\n            return False, f\"The client with client_id: {client_id} already exists.\"\n        self._clients[client_id] = {}\n        self._allowance[client_id] = {}\n        return True, None\n\n    def remove_client(self, client_id):\n        \"\"\"\n        Remove an existing client by the given client ID.\n\n        :param client_id: The ID of the client to remove.\n        :type client_id: str\n        :return: A tuple containing a boolean indicating success and an error message if the client does not exist.\n        \"\"\"\n        if client_id not in self._clients:\n            return False, f\"The client with client_id: {client_id} doesn't exist.\"\n        del self._clients[client_id]\n        del self._allowance[client_id]\n        return True, None\n\n    def grant_access(self, allower_id, allowee_id, allowed_model_id):\n        \"\"\"\n        Allow a client to access a specific machine learning model of another client.\n\n        :param allower_id: The ID of the client granting access.\n        :type allower_id: str\n        :param allowee_id: The ID of the client being granted access.\n        :type allowee_id: str\n        :param allowed_model_id: The ID of the machine learning model to be accessed.\n        :type allowed_model_id: str\n        :return: A tuple containing a boolean indicating success and an error message if the operation fails.\n        \"\"\"\n        if allower_id not in self._clients:\n            return False, f\"The allower client with client_id: {allower_id} doesn't exist.\"\n        if allowee_id not in self._clients:\n            return False, f\"The allowee client with client_id: {allowee_id} doesn't exist.\"\n        if allowed_model_id not in self._clients[allower_id]:\n            return False, f\"The model with ml_model_id: {allowed_model_id} doesn't exist for the allower client with client_id: {allower_id}.\"\n\n        if allowed_model_id in 
self._allowance.get(allower_id).get(allowee_id, []):\n            return False, f\"The model with ml_model_id: {allowed_model_id} is already allowed for the allowee client with client_id: {allowee_id} by the allower client with client_id: {allower_id}.\"\n\n        if allowee_id not in self._allowance[allower_id]:\n            self._allowance[allower_id][allowee_id] = [allowed_model_id]\n            return True, None\n\n        self._allowance[allower_id][allowee_id].append(allowed_model_id)\n        return True, None\n\n    def revoke_access(self, allower_id, allowee_id, allowed_model_id=None):\n        \"\"\"\n        Revoke a client's access to a specific machine learning model of another client.\n\n        :param allower_id: The ID of the client revoking access.\n        :type allower_id: str\n        :param allowee_id: The ID of the client whose access is being revoked.\n        :type allowee_id: str\n        :param allowed_model_id: The ID of the machine learning model whose access is being revoked.\n        :type allowed_model_id: str\n        :return: A tuple containing a boolean indicating success and an error message if the operation fails.\n        \"\"\"\n        if allower_id not in self._clients:\n            return False, f\"The allower client with client_id: {allower_id} doesn't exist.\"\n        if allowee_id not in self._clients:\n            return False, f\"The allowee client with client_id: {allowee_id} doesn't exist.\"\n\n        if allowed_model_id is None:\n            if allowee_id in self._allowance[allower_id]:\n                del self._allowance[allower_id][allowee_id]\n            return True, None\n\n        if allowed_model_id not in self._clients[allower_id]:\n            return False, f\"The model with ml_model_id: {allowed_model_id} doesn't exist for the allower client with client_id: {allower_id}.\"\n\n        if allowee_id not in self._allowance[allower_id]:\n            return False, f\"The allowee client with client_id: 
{allowee_id} doesn't have any access granted by the allower client with client_id: {allower_id}.\"\n\n        if allowed_model_id not in self._allowance[allower_id][allowee_id]:\n            return False, f\"The model with ml_model_id: {allowed_model_id} is not allowed for the allowee client with client_id: {allowee_id} by the allower client with client_id: {allower_id}.\"\n\n        self._allowance[allower_id][allowee_id].remove(allowed_model_id)\n        return True, None\n\n    def get_allowed_models(self, allower_id, allowee_id):\n        \"\"\"\n        Retrieve a list of machine learning model IDs that a client is allowed to access from another client.\n\n        :param allower_id: The ID of the client who granted access.\n        :type allower_id: str\n        :param allowee_id: The ID of the client who has been granted access.\n        :type allowee_id: str\n        :return: A list of allowed machine learning model IDs or an error message if access is not granted.\n        \"\"\"\n        if allower_id not in self._clients:\n            return None, f\"The allower client with client_id: {allower_id} doesn't exist.\"\n        if allowee_id not in self._clients:\n            return None, f\"The allowee client with client_id: {allowee_id} doesn't exist.\"\n\n        return self._allowance.get(allower_id).get(allowee_id, []), None\n\n    def get_clients_allowance(self, client_id):\n        \"\"\"\n        Retrieve the allowance dictionary for a given client.\n\n        :param client_id: The ID of the client whose allowance is being retrieved.\n        :type client_id: str\n        :return: A dictionary containing the allowance information for the client.\n        \"\"\"\n        if client_id not in self._allowance:\n            return None, f\"The client with client_id: {client_id} doesn't exist.\"\n        return self._allowance[client_id], None\n\n    def get_clients(self):\n        \"\"\"\n        Retrieve a list of all registered client IDs.\n\n        
:return: A list of client IDs.\n        \"\"\"\n        return list(self._clients.keys())\n\n    def init_ml_model(self, client_id, ml_model_id):\n        \"\"\"\n        Initialize a new machine learning model for a given client.\n\n        :param client_id: The ID of the client to associate with the model.\n        :type client_id: str\n        :param ml_model_id: The ID of the machine learning model to initialize.\n        :type ml_model_id: str\n        :return: A tuple containing a boolean indicating success and an error message if the model already exists or the client ID is invalid.\n        \"\"\"\n        if client_id not in self._clients:\n            return False, \"The given client_id is invalid.\"\n\n        if ml_model_id in self._clients[client_id]:\n            return False, f\"The given ml_model_id: {ml_model_id} already exists within ml models of the client with client_id of {client_id}.\"\n\n        self._clients[client_id][ml_model_id] = {}\n        return True, None\n\n    def set_ml_model(self, client_id, ml_model_id, ml_model):\n        \"\"\"\n        Set or update the machine learning model for a given client.\n\n        :param client_id: The ID of the client.\n        :type client_id: str\n        :param ml_model_id: The ID of the machine learning model.\n        :type ml_model_id: str\n        :param ml_model: The machine learning model object to be set.\n        :type ml_model: obj\n        :return: None\n        \"\"\"\n        self._clients[client_id][ml_model_id] = ml_model\n\n    def remove_ml_model(self, client_id, ml_model_id):\n        \"\"\"\n        Remove an existing machine learning model for a given client.\n\n        :param client_id: The ID of the client.\n        :type client_id: str\n        :param ml_model_id: The ID of the machine learning model to remove.\n        :type ml_model_id: str\n        :return: A tuple containing a boolean indicating success and an error message if the client ID or model ID is 
invalid.\n        \"\"\"\n        if client_id not in self._clients:\n            return False, \"The given client_id is invalid.\"\n\n        if ml_model_id not in self._clients[client_id]:\n            return False, f\"The client with client_id: {client_id} doesn't have any model with ml_model_id of {ml_model_id}.\"\n\n        del self._clients[client_id][ml_model_id]\n        return True, None\n\n    def get_ml_models(self, client_id):\n        \"\"\"\n        Retrieve a list of all machine learning model IDs associated with a given client.\n\n        :param client_id: The ID of the client.\n        :type client_id: str\n        :return: A list of machine learning model IDs.\n        \"\"\"\n        return [id for id in self._clients[client_id].keys()]\n"
  },
  {
    "path": "pymilo/streaming/util.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"ML Streaming utility module.\"\"\"\nimport os\nimport re\nfrom ..pymilo_param import URL_REGEX\n\n\ndef validate_websocket_url(url: str) -> str:\n    \"\"\"\n    Validate a WebSocket URL and add the 'ws://' protocol if missing.\n\n    :param url: The WebSocket URL to validate.\n    :type url: str\n    :return: A tuple where the first element is a boolean indicating whether the URL is valid,\n             and the second element is the possibly corrected URL (or None if invalid).\n    \"\"\"\n    pattern = r\"^(ws|wss)://\"\n    if not re.match(pattern, url):\n        protocol = \"ws://\"\n        url = protocol + url\n    full_pattern = r\"^(ws|wss)://([a-zA-Z0-9.-]+)(:[0-9]+)?(/.*)?$\"\n    if not re.match(full_pattern, url):\n        return False, None\n    return True, url\n\n\ndef validate_http_url(url: str) -> str:\n    \"\"\"\n    Validate a HTTP URL and add the 'http://' protocol if missing.\n\n    :param url: The HTTP URL to validate.\n    :type url: str\n    :return: A tuple where the first element is a boolean indicating whether the URL is valid,\n             and the second element is the possibly corrected URL (or None if invalid).\n    \"\"\"\n    pattern = r\"^(http|https)://\"\n    if not re.match(pattern, url):\n        protocol = \"http://\"\n        url = protocol + url\n    full_pattern = r\"^(http|https)://([a-zA-Z0-9.-]+)(:[0-9]+)?(/.*)?$\"\n    if not re.match(full_pattern, url):\n        return False, None\n    return True, url\n\n\ndef generate_dockerfile(\n        dockerfile_name=\"Dockerfile\",\n        model_path=None,\n        compression='NULL',\n        protocol='REST',\n        port=8000,\n        init_model=None,\n        bare=False\n):\n    \"\"\"\n    Generate a Dockerfile for running a PyMilo server with specified configurations.\n\n    :param dockerfile_name: Name of the dockerfile.\n    :type dockerfile_name: str\n    :param model_path: Path or URL to the exported model JSON file.\n  
  :type model_path: str\n    :param compression: Compression method (default: NULL).\n    :type compression: str\n    :param protocol: Communication protocol (default: REST).\n    :type protocol: str\n    :param port: Port for the PyMilo server (default: 8000).\n    :type port: int\n    :param init_model: The model that the server is initialized with.\n    :type init_model: boolean\n    :param bare: A flag that sets if the server runs without an internal ML model.\n    :type bare: boolean\n    \"\"\"\n    dockerfile_content = f\"\"\"# Use an official Python runtime as a parent image\nFROM python:3.11-slim\n\n# Set the working directory in the container\nWORKDIR /app\n\n# Install pymilo\nRUN pip install pymilo[streaming]\n    \"\"\"\n    is_url = False\n    if model_path:\n        if re.match(URL_REGEX, model_path):\n            is_url = True\n        else:\n            dockerfile_content += f\"\\nCOPY {os.path.basename(model_path)} /app/model.json\"\n\n    # Expose the specified port\n    dockerfile_content += f\"\\nEXPOSE {port}\"\n\n    cmd = \"CMD [\\\"python\\\", \\\"-m\\\", \\\"pymilo\\\"\"\n    cmd += f\", \\\"--compression\\\", \\\"{compression}\\\"\"\n    cmd += f\", \\\"--protocol\\\", \\\"{protocol}\\\"\"\n    cmd += f\", \\\"--port\\\", \\\"{port}\\\"\"\n\n    if model_path:\n        if is_url:\n            cmd += f\", \\\"--load\\\", \\\"{model_path}\\\"\"\n        else:\n            cmd += f\", \\\"--load\\\", \\\"/app/model.json\\\"\"\n    elif init_model:\n        cmd += f\", \\\"--init\\\", \\\"{init_model}\\\"\"\n    elif bare:\n        cmd += f\", \\\"--bare\\\"\"\n\n    cmd += \"]\"\n    dockerfile_content += f\"\\n{cmd}\\n\"\n\n    with open(dockerfile_name, 'w') as f:\n        f.write(dockerfile_content)\n"
  },
  {
    "path": "pymilo/transporters/__init__.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo transporters.\"\"\"\n"
  },
  {
    "path": "pymilo/transporters/adamoptimizer_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Adam optimizer object transporter.\"\"\"\nfrom .transporter import AbstractTransporter\nfrom ..utils.util import check_str_in_iterable\nfrom sklearn.neural_network._stochastic_optimizers import AdamOptimizer\n\n\nclass AdamOptimizerTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle AdamOptimizer field.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize instances of the AdamOptimizer class.\n\n        Record the `learning_rate`, `beta_1`, `beta_2` and `epsilon` fields of AdamOptimizer object.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], AdamOptimizer):\n            optimizer = data[key]\n            data[key] = {\n                \"pymilo-bypass\": True,\n                'pymilo-adamoptimizer': {\n                    \"params\": data[\"coefs_\"] + data[\"intercepts_\"],\n                    'type': \"AdamOptimizer\",\n                    'beta_1': optimizer.beta_1,\n                    'beta_2': optimizer.beta_2,\n                    'epsilon': optimizer.epsilon\n                }\n            }\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize the special _optimizer field of the AdamOptimizer.\n\n        The associated _optimizer field of the pymilo serialized model, is extracted through\n        it's previously serialized parameters.\n\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should 
traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if check_str_in_iterable(\"pymilo-adamoptimizer\", content):\n            optimizer = content[\"pymilo-adamoptimizer\"]\n            if (optimizer[\"type\"] == \"AdamOptimizer\"):\n                return AdamOptimizer(\n                    params=optimizer[\"params\"],\n                    beta_1=optimizer['beta_1'],\n                    beta_2=optimizer['beta_2'],\n                    epsilon=optimizer['epsilon'])\n            else:\n                return content\n        else:\n            return content\n"
  },
  {
    "path": "pymilo/transporters/baseloss_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Base loss transporter.\"\"\"\n\n# Handle python 3.5 issues.\nfrom .transporter import AbstractTransporter\nfrom ..utils.util import check_str_in_iterable\nglm_models = [\n    'GammaRegressor',\n    'PoissonRegressor',\n    'TweedieRegressor'\n]\nlegacy_version = False\ntry:\n    from sklearn._loss.loss import BaseLoss\n    # So the python version is >= 3.8\n    from sklearn.linear_model._glm import GammaRegressor\n    from sklearn.linear_model._glm import PoissonRegressor\n    from sklearn.linear_model._glm import TweedieRegressor\nexcept BaseException:  # pragma: no cover\n    # if all bypasses are true, then we either don't have\n    # TweedieRegression(3.5) or we have other kind of TweedieRegreesion(3.7)\n    try:\n        from sklearn._loss.glm_distribution import (\n            ExponentialDispersionModel,\n            TweedieDistribution,\n            EDM_DISTRIBUTIONS,\n        )\n        from sklearn.linear_model._glm.link import (\n            BaseLink,\n            IdentityLink,\n            LogLink,\n        )\n        from sklearn.linear_model._glm import GammaRegressor\n        from sklearn.linear_model._glm import PoissonRegressor\n        from sklearn.linear_model._glm import TweedieRegressor\n        legacy_version = True\n    except BaseException:\n        # there is no glm models.\n        pass\n\n\nclass BaseLossTransporter(AbstractTransporter):  # pragma: no cover\n    \"\"\"Customized PyMilo Transporter developed to handle BaseLoss field.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize the special by-default unserializable BaseLoss field of the Tweedie, Poisson and Gamma regression.\n\n        serialize the data[key] of the given model which type is model_type.\n        basically in order to fully serialize a model, we should traverse over all the keys of its data dictionary and\n        pass it through the chain of associated transporters to get 
fully serialized.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        # bypass when it's not supported\n        # special legacy mode.\n        if model_type in glm_models:\n            if not legacy_version:\n                # Handling latest GLMs with Loss function of GLMs\n                if isinstance(data[key], BaseLoss):\n                    if model_type == \"TweedieRegressor\":\n                        data[key] = {\n                            \"power\": data[\"power\"],\n                            \"link\": data[\"link\"],\n                            \"pymilo_glm_base_loss\": True\n                        }\n                    elif model_type == \"PoissonRegressor\":\n                        data[key] = {\n                            \"pymilo_glm_base_loss\": True\n                            # nothing for now.\n                        }\n                    elif model_type == \"GammaRegressor\":\n                        data[key] = {\n                            \"pymilo_glm_base_loss\": True\n                            # nothing for now\n                        }\n                return data[key]\n\n            else:\n                # it's legacy version of GLMs\n                if key == \"_family_instance\":\n                    if model_type == \"TweedieRegressor\":\n                        data[\"_family_instance\"] = {\n                            \"family\": {\n                                'state': 'not-direct-serializable',\n                                'value': {\n                                    'power': data[\"power\"]\n                
                }\n                            }\n                        }\n                    elif model_type == \"PoissonRegressor\":\n                        data[\"_family_instance\"] = {\n                            \"family\": {\n                                'state': 'direct-serializable',\n                                'value': \"poisson\"\n                            }\n                        }\n                    elif model_type == \"GammaRegressor\":\n                        data[\"_family_instance\"] = {\n                            \"family\": {\n                                'state': 'direct-serializable',\n                                'value': \"gamma\"\n                            }\n                        }\n                    return data[\"_family_instance\"]\n\n                elif key == \"_link_instance\":\n                    return \"sklean-mirror-link\"\n                elif key == \"link\":\n                    if data[key] in ['auto', 'identity', 'log']:\n                        data[key] = {\n                            'state': 'direct-serializable',\n                            'value': data[key]\n                        }\n                        return data[key]\n                    else:\n                        if isinstance(data[key], LogLink):\n                            data[key] = {\n                                'state': 'not-direct-serializable',\n                                'value': {\n                                    'abstract-class': \"LogLink\"\n                                }\n                            }\n                            return data[key]\n\n                        elif isinstance(data[key], IdentityLink):\n                            data[key] = {\n                                'state': 'not-direct-serializable',\n                                'value': {\n                                    'abstract-class': \"IdentityLink\"\n                                }\n                  
          }\n                            return data[key]\n\n                        else:\n                            # isinstance(data[key], BaseLink) == True\n                            data[key] = {\n                                'state': 'not-direct-serializable',\n                                'value': {\n                                    'abstract-class': \"BaseLink\"\n                                }\n                            }\n                            return data[key]\n                else:\n                    return data[key]\n        else:\n            return data[key]\n\n    def get_deserialized_base_loss(self, model_type, content):\n        \"\"\"\n        Extract the original BaseLoss object out of the associated core data recorded by pymilo.\n\n        :param model_type: the model type of the ML model, which data dictionary is given as the data param\n        :type model_type: str\n        :param content: the internal data dictionary of the given model\n        :type content: dict\n        :return: original BaseLoss field\n        \"\"\"\n        if model_type == \"TweedieRegressor\":\n            if not (\"power\" in content and \"link\" in content):\n                return None  # TODO EXCEPTION HANDLING\n            power, link = content[\"power\"], content[\"link\"]\n            return TweedieRegressor(power=power, link=link)._get_loss()\n        elif model_type == \"PoissonRegressor\":\n            return PoissonRegressor()._get_loss()\n        elif model_type == \"GammaRegressor\":\n            return GammaRegressor()._get_loss()\n        else:\n            return content\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize the previously pymilo made serializable BaseLoss field to its original form.\n\n        deserialize the special loss_function_ of the SGDClassifier, SGDOneClassSVM, Perceptron and PassiveAggressiveClassifier.\n        the associated loss_function_ field of the pymilo 
serialized model, is extracted through the SGDClassifier's _get_loss_function function\n        with enough feeding of the needed inputs.\n\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which internal serialized data dictionary is given as the data param.\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        # bypass when it's not supported\n        # special legacy mode.\n        if model_type in glm_models:\n            if not legacy_version:\n                # latest GLMs or irrelevant models.\n                content = data[key]\n                if not check_str_in_iterable(\n                        \"pymilo_glm_base_loss\", content):\n                    return content\n                return self.get_deserialized_base_loss(model_type, content)\n            else:\n                # it's legacy version of GLMs\n                if key == \"_family_instance\":\n                    # family field retrieval...\n                    family = data[\"_family_instance\"][\"family\"]\n                    if family['state'] == 'direct-serializable':\n                        family = family['value']\n                    else:\n                        family = TweedieDistribution(\n                            power=family['value']['power'])\n\n                    if isinstance(family, 
ExponentialDispersionModel):\n                        return family\n                    elif family in EDM_DISTRIBUTIONS:\n                        return EDM_DISTRIBUTIONS[family]()\n                    else:\n                        raise ValueError(\n                            \"The family must be an instance of class\"\n                            \" ExponentialDispersionModel or an element of\"\n                            \" ['normal', 'poisson', 'gamma', 'inverse-gaussian']\"\n                            \"; got (family={0})\".format(family))\n                elif key == \"link\":\n                    if data[key]['state'] == 'direct-serializable':\n                        data[key] = data[key]['value']\n                        return data[key]\n                    else:\n                        innerMap = {\n                            'LogLink': LogLink,\n                            'IdentityLink': IdentityLink,\n                            'BaseLink': BaseLink\n                        }\n                        return innerMap[data[key]['value']['abstract-class']]()\n                elif key == \"_link_instance\":\n                    # make sure it has been deserialized.\n                    try:\n                        data[\"link\"] = self.deserialize(\n                            data, \"link\", model_type)\n                    except BaseException:\n                        # it has been serialized.\n                        pass\n                    if isinstance(data[\"link\"], BaseLink):\n                        return data[\"link\"]\n                    else:\n                        if data[\"link\"] == \"auto\":\n                            if isinstance(\n                                    data[\"_family_instance\"],\n                                    TweedieDistribution):\n                                if data[\"_family_instance\"].power() <= 0:\n                                    return IdentityLink()\n                                
if data[\"_family_instance\"].power() >= 1:\n                                    return LogLink()\n                            else:\n                                raise ValueError(\n                                    \"No default link known for the \"\n                                    \"specified distribution family. Please \"\n                                    \"set link manually, i.e. not to 'auto'; \"\n                                    \"got (link='auto', family={})\".format(\n                                        data[\"family\"]))\n                        elif data[\"link\"] == \"identity\":\n                            return IdentityLink()\n                        elif data[\"link\"] == \"log\":\n                            return LogLink()\n                        else:\n                            raise ValueError(\n                                \"The link must be an instance of class Link or \"\n                                \"an element of ['auto', 'identity', 'log']; \"\n                                \"got (link={0})\".format(\n                                    data[\"link\"]))\n                else:\n                    return data[key]\n        else:\n            return data[key]\n"
  },
  {
    "path": "pymilo/transporters/binmapper_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo BinMapper transporter.\"\"\"\nfrom sklearn.ensemble._hist_gradient_boosting.binning import _BinMapper\nfrom ..utils.util import is_primitive, check_str_in_iterable\nfrom .transporter import AbstractTransporter\nfrom .general_data_structure_transporter import GeneralDataStructureTransporter\n\n\nclass BinMapperTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle _BinMapper objects.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize _BinMapper object[useful in HistGradientBoosting(Regressor,Classifier)].\n\n        serialize the data[key] of the given model which type is model_type.\n        basically in order to fully serialize a model, we should traverse over all the keys of its data dictionary and\n        pass it through the chain of associated transporters to get fully serialized.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], _BinMapper):\n            binMapper = data[key]\n            _data = binMapper.__dict__\n            gdst = GeneralDataStructureTransporter()\n            for _key in _data:\n                _data[_key] = gdst.serialize(_data, _key, model_type + \":_BinMapper\")\n            return {\n                \"pymilo-bypass\": True,\n                \"pymilo-binmapper\": {\n                    \"__dict__\": _data\n                }\n            }\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize previously pymilo serialized 
_BinMapper object[useful in HistGradientBoosting(Regressor,Classifier)].\n\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which internal serialized data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if is_primitive(content) or content is None:\n            return content\n\n        if check_str_in_iterable(\"pymilo-binmapper\", content):\n            __dict__ = content[\"pymilo-binmapper\"][\"__dict__\"]\n            binMapper = _BinMapper()\n            gdst = GeneralDataStructureTransporter()\n            for key in __dict__:\n                __dict__[key] = gdst.deserialize(__dict__, key, model_type + \":_BinMapper\")\n            for key in __dict__:\n                setattr(binMapper, key, __dict__[key])\n            return binMapper\n\n        return content\n"
  },
  {
    "path": "pymilo/transporters/bisecting_tree_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo BisectingTree(sklearn.cluster._bisect_k_means) object transporter.\"\"\"\n\nfrom ..pymilo_param import SKLEARN_CLUSTERING_TABLE, NOT_SUPPORTED\nfrom .transporter import AbstractTransporter\nfrom .general_data_structure_transporter import GeneralDataStructureTransporter\nfrom ..utils.util import check_str_in_iterable\n\nbisecting_tree_support = SKLEARN_CLUSTERING_TABLE[\"BisectingKMeans\"] != NOT_SUPPORTED\nif bisecting_tree_support:\n    from sklearn.cluster._bisect_k_means import _BisectingTree\n\n\nclass BisectingTreeTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle BisectingTree object.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize instances of the BisectingTree class.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if bisecting_tree_support and isinstance(data[key], _BisectingTree):\n            data[key] = self.serialize_bisecting_tree(data[key], GeneralDataStructureTransporter())\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize _BisectingTree fields of the Decision Trees.\n\n        The associated tree_ field of the pymilo serialized model, is extracted through\n        it's previously serialized parameters.\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully 
deserialized.\n\n        :param data: the internal data dictionary of the associated JSON file of the ML model generated by pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if bisecting_tree_support and is_pymilo_serialized_bisecting_tree(content):\n            return self.deserialize_bisecting_tree(content, GeneralDataStructureTransporter())\n        else:\n            return content\n\n    def serialize_bisecting_tree(self, bisecting_tree, gdst=None):\n        \"\"\"\n        Serialize the bisecting_tree object recursively.\n\n        :param bisecting_tree: the bisecting_tree object which is going to get serialized.\n        :type bisecting_tree: dict\n        :param gdst: an instance of GeneralDataStructureTransporter class.\n        :type gdst: GeneralDataStructureTransporter\n        :return: pymilo-serialized bisecting_tree\n        \"\"\"\n        if (gdst is None):\n            gdst = GeneralDataStructureTransporter()\n        data = bisecting_tree.__dict__\n        data[\"pymilo_model_type\"] = \"_BisectingTree\"\n        for key, value in data.items():\n            if (isinstance(value, _BisectingTree)):\n                data[key] = self.serialize_bisecting_tree(value, gdst)\n            else:\n                data[key] = gdst.serialize(data, key, str(_BisectingTree))\n        return data\n\n    def deserialize_bisecting_tree(self, bisecting_tree_obj, gdst=None):\n        \"\"\"\n        Deserialize the bisecting_tree object recursively.\n\n        :param bisecting_tree_obj: the bisecting_tree object which is going to get deserialized.\n        :type bisecting_tree_obj: dict\n        :param gdst: an instance of 
GeneralDataStructureTransporter class.\n        :type gdst: GeneralDataStructureTransporter\n        :return: _BisectingTree object generated from bisecting_tree_obj\n        \"\"\"\n        if (gdst is None):\n            gdst = GeneralDataStructureTransporter()\n        data = bisecting_tree_obj\n        for key, value in data.items():\n            if is_pymilo_serialized_bisecting_tree(value):\n                data[key] = self.deserialize_bisecting_tree(value, gdst)\n            else:\n                data[key] = gdst.deserialize(data, key, str(_BisectingTree))\n\n        center = data[\"center\"]\n        indices = data[\"indices\"]\n        score = data[\"score\"]\n\n        reconstructed_bisecting_tree = _BisectingTree(center, indices, score)\n\n        for item in data.keys():\n            setattr(reconstructed_bisecting_tree, item, data[item])\n        return reconstructed_bisecting_tree\n\n\ndef is_pymilo_serialized_bisecting_tree(psbt):\n    \"\"\"\n    Check whether the given object is a bisecting tree which is serialized by pymilo.\n\n    :param psbt: the pymilo serialized bisecting tree object\n    :type psbt: dict\n    :return: boolean\n    \"\"\"\n    return (\n        check_str_in_iterable(\"pymilo_model_type\", psbt) and\n        psbt[\"pymilo_model_type\"] == \"_BisectingTree\"\n    )\n"
  },
  {
    "path": "pymilo/transporters/bunch_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Bunch transporter.\"\"\"\nfrom .transporter import AbstractTransporter\nfrom ..utils.util import check_str_in_iterable\n\nbunch_support = False\ntry:\n    from sklearn.utils._bunch import Bunch\n    bunch_support = True\nexcept BaseException:\n    pass\n\n\nclass BunchTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle Bunch objects.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize Bunch object.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if bunch_support and isinstance(data[key], Bunch):\n            bunch = data[key]\n            _dict = {}\n            for key, value in bunch.items():\n                _dict[key] = value\n            return {\n                \"pymilo-bypass\": True,\n                \"pymilo-bunch\": _dict,\n            }\n\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize previously pymilo serialized Bunch object.\n\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its 
value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if bunch_support and check_str_in_iterable(\"pymilo-bunch\", content):\n            bunch = Bunch()\n            for key, value in content[\"pymilo-bunch\"].items():\n                bunch[key] = value\n            return bunch\n        else:\n            return content\n"
  },
  {
    "path": "pymilo/transporters/cfnode_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo CFnode(from sklearn.cluster._birch) object transporter.\"\"\"\nfrom sklearn.cluster._birch import _CFNode\nfrom sklearn.cluster._birch import _CFSubcluster\n\nfrom .transporter import AbstractTransporter\nfrom .general_data_structure_transporter import GeneralDataStructureTransporter\n\nfrom ..utils.util import has_named_parameter, check_str_in_iterable\n\n\nclass CFNodeTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle CFnode object.\"\"\"\n\n    def __init__(self):\n        \"\"\"\n        Initialize the CFNodeTransporter instance.\n\n        :return: an instance of the CFNodeTransporter class\n        \"\"\"\n        self.all_cfnodes = set()\n        self.retrieved_cfnodes = {}\n\n    def reset(self):\n        \"\"\"\n        Reset the CFNodeTransporter's internal data structures.\n\n        :return: None\n        \"\"\"\n        self.all_cfnodes = set()\n        self.retrieved_cfnodes = {}\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize data[key] if it is an instance of _CFNode.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], _CFNode):\n            data[key] = self.serialize_cfnode(data[key], GeneralDataStructureTransporter())\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize data[key] if it is a pymilo serialized _CFNode object.\n\n        :param data: the internal data dictionary of the associated JSON file of the ML model generated by pymilo export.\n        :type data: dict\n        
:param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if check_str_in_iterable(\"pymilo_model_type\", content) and content[\"pymilo_model_type\"] == \"_CFNode\":\n            return self.deserialize_cfnode(content, GeneralDataStructureTransporter())\n        else:\n            return content\n\n    def serialize_cfnode(self, cfnode, gdst):\n        \"\"\"\n        Serialize given _CFnode instance recursively.\n\n        :param cfnode: given _CFnode object to get serialized\n        :type cfnode: sklearn.cluster._birch._CFNode\n        :param gdst: an instance of GeneralDataStructureTransporter class\n        :type gdst: pymilo.transporters.general_data_structure_transporter.GeneralDataStructureTransporter\n        :return: dict\n        \"\"\"\n        data = cfnode.__dict__\n        cfnode_id = self.get_cfnode_id(cfnode)\n        data[\"pymilo_cfnode_id\"] = cfnode_id\n        data[\"pymilo_model_type\"] = \"_CFNode\"\n        self.all_cfnodes.add(cfnode_id)\n        for key, value in data.items():\n            if (isinstance(value, _CFNode)):\n                value_id = self.get_cfnode_id(value)\n                data[key] = {\n                    \"pymilo_model_type\": \"_CFNode\",\n                    \"pymilo_cfnode_value\": \"PYMILO_CFNODE_RECURSION\" if value_id in self.all_cfnodes else self.serialize_cfnode(\n                        value,\n                        gdst),\n                    \"pymilo_cfnode_id\": value_id,\n                }\n            elif (isinstance(value, list) and key == \"subclusters_\"):\n                if len(value) > 0:\n                    if isinstance(value[0], _CFSubcluster):\n                        data[key] = {\"pymilo_model_type\": \"_CFSubcluster\", 
\"pymilo_subclusters_value\": [\n                            self.serialize_cfsubcluster(cf_subcluster, gdst) for cf_subcluster in value], }\n            else:\n                data[key] = gdst.serialize(data, key, str(_CFNode))\n        return data\n\n    def deserialize_cfnode(self, cfnode_pymiloed_obj, gdst):\n        \"\"\"\n        Deserialize given serialized object of _CFnode class recursively.\n\n        :param cfnode_pymiloed_obj: given serialized _CFnode object to get deserialized\n        :type cfnode_pymiloed_obj: obj\n        :param gdst: an instance of GeneralDataStructureTransporter class\n        :type gdst: pymilo.transporters.general_data_structure_transporter.GeneralDataStructureTransporter\n        :return: sklearn.cluster._birch._CFNode\n        \"\"\"\n        # this object is a previously pymiloed cfnode object\n        if not cfnode_pymiloed_obj[\"pymilo_cfnode_id\"] in self.retrieved_cfnodes.keys():\n            self.retrieved_cfnodes[cfnode_pymiloed_obj[\"pymilo_cfnode_id\"]] = self.get_base_cfnode(cfnode_pymiloed_obj)\n\n        current_cfnode = self.retrieved_cfnodes[cfnode_pymiloed_obj[\"pymilo_cfnode_id\"]]\n        # init non left, right and subcluster.\n        for key, value in cfnode_pymiloed_obj.items():\n\n            if check_str_in_iterable(\"pymilo_model_type\", value) and value[\"pymilo_model_type\"] == \"_CFNode\":\n                if value[\"pymilo_cfnode_id\"] in self.retrieved_cfnodes.keys():\n                    cfnode_pymiloed_obj[key] = self.retrieved_cfnodes[value[\"pymilo_cfnode_id\"]]\n                    # case of recursion.\n                else:\n                    new_cfnode = self.deserialize_cfnode(value[\"pymilo_cfnode_value\"], gdst)\n                    self.retrieved_cfnodes[value[\"pymilo_cfnode_id\"]] = new_cfnode\n                    cfnode_pymiloed_obj[key] = new_cfnode\n\n            elif isinstance(value, dict) and \"pymilo_model_type\" in value and value[\"pymilo_model_type\"] == 
\"_CFSubcluster\":\n                # has a >0 length subclusters_ fields.\n                cfnode_pymiloed_obj[key] = [self.deserialize_cfsubcluster(\n                    subcluster, gdst) for subcluster in value[\"pymilo_subclusters_value\"]]\n            else:\n                cfnode_pymiloed_obj[key] = gdst.deserialize(cfnode_pymiloed_obj, key, str(_CFNode))\n\n        for key, value in cfnode_pymiloed_obj.items():\n            setattr(current_cfnode, key, value)\n        return current_cfnode\n\n    def serialize_cfsubcluster(self, cfsubcluster, gdst):\n        \"\"\"\n        Serialize given _CFSubcluster instance.\n\n        :param cfsubcluster: given _CFSubcluster object to get serialized\n        :type cfsubcluster: sklearn.cluster._birch._CFSubcluster\n        :param gdst: an instance of GeneralDataStructureTransporter class\n        :type gdst: pymilo.transporters.general_data_structure_transporter.GeneralDataStructureTransporter\n        :return: dict\n        \"\"\"\n        data = cfsubcluster.__dict__\n        for key, value in data.items():\n            if (isinstance(value, _CFNode)):\n                data[key] = self.serialize_cfnode(value, gdst)\n            else:\n                data[key] = gdst.serialize(data, key, str(_CFSubcluster))\n        return data\n\n    def deserialize_cfsubcluster(self, cfsubcluster_pymiloed_obj, gdst):\n        \"\"\"\n        Deserialize given serialized object of _CFSubcluster class recursively.\n\n        :param cfsubcluster_pymiloed_obj: given serialized _CFSubcluster object to get deserialized\n        :type cfsubcluster_pymiloed_obj: obj\n        :param gdst: an instance of GeneralDataStructureTransporter class\n        :type gdst: pymilo.transporters.general_data_structure_transporter.GeneralDataStructureTransporter\n        :return: sklearn.cluster._birch._CFSubcluster\n        \"\"\"\n        for key, value in cfsubcluster_pymiloed_obj.items():\n            if check_str_in_iterable(\"pymilo_model_type\", 
value) and value[\"pymilo_model_type\"] == \"_CFNode\":\n                cfsubcluster_pymiloed_obj[key] = self.deserialize_cfnode(value[\"pymilo_cfnode_value\"], gdst)\n            else:\n                cfsubcluster_pymiloed_obj[key] = gdst.deserialize(cfsubcluster_pymiloed_obj, key, str(_CFSubcluster))\n\n        subcluster_instance = _CFSubcluster()\n        for key, value in cfsubcluster_pymiloed_obj.items():\n            setattr(subcluster_instance, key, value)\n        return subcluster_instance\n\n    def get_cfnode_id(self, cfnode):\n        \"\"\"\n        Create a unique id for the given cfnode.\n\n        :param cfnode: given _CFNode object to generate its id\n        :type cfnode: sklearn.cluster._birch._CFNode\n        :return: str\n        \"\"\"\n        if not isinstance(cfnode, _CFNode):\n            return \"None\"\n        else:\n            return str(cfnode).split(\" at \")[1][:-1]\n\n    def get_base_cfnode(self, cfnode_pymiloed_obj):\n        \"\"\"\n        Create a basic _CFNode instance from constructor parameters existing in cfnode_pymiloed_obj.\n\n        :param cfnode_pymiloed_obj: given serialized _CFNode object from which to build its basic _CFNode instance\n        :type cfnode_pymiloed_obj: dict\n        :return: _CFNode\n        \"\"\"\n        threshold = cfnode_pymiloed_obj[\"threshold\"]\n        branching_factor = cfnode_pymiloed_obj[\"branching_factor\"]\n        is_leaf = cfnode_pymiloed_obj[\"is_leaf\"]\n        n_features = cfnode_pymiloed_obj[\"n_features\"]\n        dtype = GeneralDataStructureTransporter().list_to_ndarray(cfnode_pymiloed_obj[\"init_centroids_\"]).dtype\n        if has_named_parameter(_CFNode.__init__, \"dtype\"):\n            return _CFNode(\n                threshold=threshold,\n                branching_factor=branching_factor,\n                is_leaf=is_leaf,\n                n_features=n_features,\n                dtype=dtype,\n            )\n        else:\n            return 
_CFNode(\n                threshold=threshold,\n                branching_factor=branching_factor,\n                is_leaf=is_leaf,\n                n_features=n_features,\n            )\n"
  },
  {
    "path": "pymilo/transporters/compose_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Compose transporter.\"\"\"\n\nfrom ..utils.util import check_str_in_iterable\nfrom .transporter import AbstractTransporter\nfrom .general_data_structure_transporter import GeneralDataStructureTransporter\nfrom .preprocessing_transporter import PreprocessingTransporter\nfrom .feature_extraction_transporter import FeatureExtractorTransporter\nfrom ..chains.util import serialize_possible_ml_model, deserialize_possible_ml_model\n\nCOMPOSE_CHAIN = {\n    \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n    \"PreprocessingTransporter\": PreprocessingTransporter(),\n    \"FeatureExtractorTransporter\": FeatureExtractorTransporter(),\n}\n\n\nclass ComposeTransporter(AbstractTransporter):\n    \"\"\"Compose object dedicated Transporter.\"\"\"\n\n    def is_compose_internal_model(self, internal_model):\n        \"\"\"\n        Check whether the given content is a nested estimator/module.\n\n        This is used to detect objects that should be (de)serialized when found inside\n        compose models (e.g., inside ColumnTransformer.transformers).\n\n        :param internal_model: given object\n        :type internal_model: any\n        :return: bool\n        \"\"\"\n        pt = COMPOSE_CHAIN[\"PreprocessingTransporter\"]\n        fe = COMPOSE_CHAIN[\"FeatureExtractorTransporter\"]\n        if isinstance(internal_model, dict):\n            return (\n                check_str_in_iterable(\"pymilo-inner-model-type\", internal_model) or\n                pt.is_preprocessing_module(internal_model) or\n                fe.is_fe_module(internal_model)\n            )\n        return (\n            self._is_ml_model(internal_model) or\n            pt.is_preprocessing_module(internal_model) or\n            fe.is_fe_module(internal_model)\n        )\n\n    def _is_ml_model(self, obj):\n        \"\"\"\n        Check if the object is an ML model that needs serialization.\n\n        :param obj: given object\n      
  :type obj: any\n        :return: bool\n        \"\"\"\n        return hasattr(obj, 'fit') and (hasattr(obj, 'predict') or hasattr(obj, 'transform'))\n\n    def serialize_compose_internal_model(self, internal_model):\n        \"\"\"\n        Serialize internal model of compose objects.\n\n        :param internal_model: given sklearn internal model\n        :type internal_model: sklearn model or function\n        :return: pymilo serialized internal_model\n        \"\"\"\n        # Handle preprocessing modules\n        pt = COMPOSE_CHAIN[\"PreprocessingTransporter\"]\n        if pt.is_preprocessing_module(internal_model):\n            return pt.serialize_pre_module(internal_model)\n\n        # Handle feature extraction modules\n        fe = COMPOSE_CHAIN[\"FeatureExtractorTransporter\"]\n        if fe.is_fe_module(internal_model):\n            return fe.serialize_fe_module(internal_model)\n\n        # Handle ML models (including compose/ensemble/concrete estimators) using the project-wide schema.\n        if self._is_ml_model(internal_model):\n            has_ml_model, result = serialize_possible_ml_model(internal_model)\n            if has_ml_model:\n                return result\n        return internal_model\n\n    def deserialize_compose_internal_model(self, serialized_internal_model):\n        \"\"\"\n        Deserialize internal model of compose objects.\n\n        :param serialized_internal_model: serialized internal model(by pymilo)\n        :type serialized_internal_model: dict\n        :return: retrieved associated sklearn internal model\n        \"\"\"\n        # Preprocessing / feature extraction modules\n        pt = COMPOSE_CHAIN[\"PreprocessingTransporter\"]\n        if pt.is_preprocessing_module(serialized_internal_model):\n            return pt.deserialize_pre_module(serialized_internal_model)\n\n        fe = COMPOSE_CHAIN[\"FeatureExtractorTransporter\"]\n        if fe.is_fe_module(serialized_internal_model):\n            return 
fe.deserialize_fe_module(serialized_internal_model)\n\n        # Project-wide nested-model schema\n        has_ml_model, result = deserialize_possible_ml_model(serialized_internal_model)\n        if has_ml_model:\n            return result\n\n        return serialized_internal_model\n\n    def _serialize_nested(self, obj):\n        \"\"\"\n        Recursively serialize nested structures containing internal models.\n\n        :param obj: object to serialize (dict, list, tuple, or internal model)\n        :type obj: any\n        :return: serialized object with internal models converted to pymilo format\n        \"\"\"\n        if isinstance(obj, dict):\n            return {k: self._serialize_nested(v) for k, v in obj.items()}\n        if isinstance(obj, list):\n            return [self._serialize_nested(v) for v in obj]\n        if isinstance(obj, tuple):\n            return tuple(self._serialize_nested(v) for v in obj)\n        if self.is_compose_internal_model(obj):\n            return self.serialize_compose_internal_model(obj)\n        return obj\n\n    def _deserialize_nested(self, obj):\n        \"\"\"\n        Recursively deserialize nested structures containing serialized internal models.\n\n        :param obj: object to deserialize (dict, list, tuple, or serialized internal model)\n        :type obj: any\n        :return: deserialized object with internal models restored to sklearn objects\n        \"\"\"\n        if isinstance(obj, dict):\n            # Try leaf deserialization first (nested model schemas are dicts)\n            if self.is_compose_internal_model(obj):\n                return self.deserialize_compose_internal_model(obj)\n            return {k: self._deserialize_nested(v) for k, v in obj.items()}\n        if isinstance(obj, list):\n            return [self._deserialize_nested(v) for v in obj]\n        if isinstance(obj, tuple):\n            return tuple(self._deserialize_nested(v) for v in obj)\n        return obj\n\n    def serialize(self, 
data, key, model_type):\n        \"\"\"\n        Serialize Compose object.\n\n        Serialize the data[key] of the given model which type is model_type.\n        To fully serialize a model, we should traverse over all the keys of its data dictionary and\n        pass it through the chain of associated transporters to get fully serialized.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], (dict, list, tuple)):\n            return self._serialize_nested(data[key])\n        if self.is_compose_internal_model(data[key]):\n            return self.serialize_compose_internal_model(data[key])\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize previously pymilo serialized compose object.\n\n        Deserialize the data[key] of the given model which type is model_type.\n        To fully deserialize a model, we should traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which internal serialized data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo deserialized output of 
data[key]\n        \"\"\"\n        if isinstance(data[key], (dict, list, tuple)):\n            return self._deserialize_nested(data[key])\n        if self.is_compose_internal_model(data[key]):\n            return self.deserialize_compose_internal_model(data[key])\n        return data[key]\n"
  },
  {
    "path": "pymilo/transporters/feature_extraction_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Feature Extraction transporter.\"\"\"\nfrom scipy.sparse import csr_matrix\n\nfrom ..pymilo_param import SKLEARN_FEATURE_EXTRACTION_TABLE\nfrom ..utils.util import check_str_in_iterable, get_sklearn_type\nfrom .transporter import AbstractTransporter, Command\nfrom .general_data_structure_transporter import GeneralDataStructureTransporter\nfrom .randomstate_transporter import RandomStateTransporter\n\nFEATURE_EXTRACTION_CHAIN = {\n    \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n    \"RandomStateTransporter\": RandomStateTransporter(),\n}\n\n\nclass FeatureExtractorTransporter(AbstractTransporter):\n    \"\"\"Feature Extractor object dedicated Transporter.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize Feature Extractor object.\n\n        serialize the data[key] of the given model which type is model_type.\n        basically in order to fully serialize a model, we should traverse over all the keys of its data dictionary and\n        pass it through the chain of associated transporters to get fully serialized.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if self.is_fe_module(data[key]):\n            return self.serialize_fe_module(data[key])\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize previously pymilo serialized feature extraction object.\n\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a 
model, we should traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which internal serialized data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if self.is_fe_module(content):\n            return self.deserialize_fe_module(content)\n        return content\n\n    def is_fe_module(self, fe_module):\n        \"\"\"\n        Check whether the given module is a sklearn Feature Extraction module or not.\n\n        :param fe_module: given object\n        :type fe_module: any\n        :return: bool\n        \"\"\"\n        if isinstance(fe_module, dict):\n            return check_str_in_iterable(\n                \"pymilo-feature_extraction-type\",\n                fe_module) and fe_module[\"pymilo-feature_extraction-type\"] in SKLEARN_FEATURE_EXTRACTION_TABLE\n        return get_sklearn_type(fe_module) in SKLEARN_FEATURE_EXTRACTION_TABLE\n\n    def serialize_fe_module(self, fe_module):\n        \"\"\"\n        Serialize Feature Extraction object.\n\n        :param fe_module: given sklearn feature extraction module\n        :type fe_module: sklearn.feature_extraction\n        :return: pymilo serialized fe_module\n        \"\"\"\n        # add one depth inner feature extraction module population\n        for key, value in fe_module.__dict__.items():\n            if self.is_fe_module(value):\n                fe_module.__dict__[key] = self.serialize_fe_module(value)\n           
 elif isinstance(value, csr_matrix):\n                fe_module.__dict__[key] = {\n                    \"pymilo-bypass\": True,\n                    \"pymilo-csr_matrix\": FEATURE_EXTRACTION_CHAIN[\"GeneralDataStructureTransporter\"].serialize_dict(\n                        value.__dict__\n                    )\n                }\n\n        for transporter in FEATURE_EXTRACTION_CHAIN:\n            FEATURE_EXTRACTION_CHAIN[transporter].transport(\n                fe_module, Command.SERIALIZE)\n        return {\n            \"pymilo-bypass\": True,\n            \"pymilo-feature_extraction-type\": get_sklearn_type(fe_module),\n            \"pymilo-feature_extraction-data\": fe_module.__dict__\n        }\n\n    def deserialize_fe_module(self, serialized_fe_module):\n        \"\"\"\n        Deserialize Feature Extraction object.\n\n        :param serialized_fe_module: serialized feature extraction module(by pymilo)\n        :type serialized_fe_module: dict\n        :return: retrieved associated sklearn.feature_extraction module\n        \"\"\"\n        data = serialized_fe_module[\"pymilo-feature_extraction-data\"]\n        associated_type = SKLEARN_FEATURE_EXTRACTION_TABLE[serialized_fe_module[\"pymilo-feature_extraction-type\"]]\n        retrieved_fe_module = associated_type()\n        for key in data:\n            # add one depth inner feature extraction module population\n            if self.is_fe_module(data[key]):\n                data[key] = self.deserialize_fe_module(data[key])\n            elif check_str_in_iterable(\"pymilo-csr_matrix\", data[key]):\n                csr_matrix_dict = FEATURE_EXTRACTION_CHAIN[\"GeneralDataStructureTransporter\"].get_deserialized_dict(\n                    data[key][\"pymilo-csr_matrix\"])\n                cm = csr_matrix(csr_matrix_dict['_shape'])\n                for _key in csr_matrix_dict:\n                    setattr(cm, _key, csr_matrix_dict[_key])\n                data[key] = cm\n            for transporter in 
FEATURE_EXTRACTION_CHAIN:\n                data[key] = FEATURE_EXTRACTION_CHAIN[transporter].deserialize(data, key, \"\")\n        for key in data:\n            setattr(retrieved_fe_module, key, data[key])\n        return retrieved_fe_module\n"
  },
  {
    "path": "pymilo/transporters/function_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Function transporter.\"\"\"\n\nfrom ..utils.util import import_function, check_str_in_iterable\nfrom .transporter import AbstractTransporter\nfrom types import FunctionType, BuiltinFunctionType\nfrom numpy import ufunc\n\narray_function_dispatcher_support = False\ntry:\n    from numpy._core._multiarray_umath import _ArrayFunctionDispatcher\n    array_function_dispatcher_support = True\nexcept ImportError:\n    try:\n        from numpy.core._multiarray_umath import _ArrayFunctionDispatcher\n        array_function_dispatcher_support = True\n    except ImportError:\n        pass\n\n\nclass FunctionTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle function field transportation.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize Function type fields.\n\n        Record associated function's name and it's parent module in order to retrieve it accordingly.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], ufunc):\n            function = data[key]\n            data[key] = {\n                \"function_name\": function.__name__,\n                \"function_module\": \"numpy\",\n            }\n            return data[key]\n\n        elif isinstance(data[key], (FunctionType, BuiltinFunctionType)) or (\n                array_function_dispatcher_support and\n                isinstance(data[key], _ArrayFunctionDispatcher)):\n            function = data[key]\n            data[key] = {\n                \"function_name\": function.__name__,\n                
\"function_module\": function.__module__,\n            }\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize a serialized function object back to its original function object.\n\n        Deserialize the data[key] of the given model whose type is model_type.\n        To fully deserialize a model, we should traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if check_str_in_iterable(\"function_name\", content):\n            return import_function(\n                content[\"function_module\"],\n                content[\"function_name\"]\n            )\n        else:\n            return content\n"
  },
  {
    "path": "pymilo/transporters/general_data_structure_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo GeneralDataStructure transporter.\"\"\"\nimport numpy as np\nfrom ast import literal_eval\n\nfrom ..pymilo_param import NUMPY_TYPE_DICT\n\nfrom ..utils.util import get_homogeneous_type, all_same, prefix_list\nfrom ..utils.util import is_primitive, check_str_in_iterable\n\nfrom .transporter import AbstractTransporter\n\n\nclass GeneralDataStructureTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle fields with general datastructures.\"\"\"\n\n    def _is_remainder_cols_list(self, obj):\n        \"\"\"\n        Check whether the given object is sklearn's internal _RemainderColsList.\n\n        :param obj: given object\n        :type obj: any\n        :return: bool\n        \"\"\"\n        try:\n            return (\n                obj.__class__.__name__ == \"_RemainderColsList\" and\n                \"_column_transformer\" in obj.__class__.__module__\n            )\n        except Exception:\n            return False\n\n    def serialize_tuple(self, tuple_field):\n        \"\"\"\n        Check for non-serializable fields in tuple and serialize them.\n\n            1. Serialize inner np.ndarray fields in tuple\n            2. Serialize inner slice fields in tuple\n            3. 
Convert sklearn's _RemainderColsList to list\n\n        :param tuple_field: given tuple\n        :type tuple_field: tuple\n        :return: serializable tuple\n        \"\"\"\n        new_tuple = tuple()\n        for item in tuple_field:\n            if (isinstance(item, np.ndarray)):\n                new_tuple += (self.deep_serialize_ndarray(item),)\n            elif self._is_remainder_cols_list(item):\n                new_tuple += (list(item),)\n            elif isinstance(item, slice):\n                new_tuple += ({\n                    \"pymilo-slice\": True,\n                    \"start\": item.start,\n                    \"stop\": item.stop,\n                    \"step\": item.step,\n                },)\n            else:\n                new_tuple += (item,)\n        return {\n            \"pymilo-tuple\": new_tuple,\n        }\n\n    # dict serializer for Logistic regression CV\n    def serialize_dict(self, dictionary):\n        \"\"\"\n        Make all the fields of the given dictionary serializable.\n\n            1. Changing ndarray values to list,\n            2. 
save values stored under unserializable numpy.int32|int64 keys in a serializable custom object form.\n\n        :param dictionary: given dictionary\n        :type dictionary: dict\n        :return: fully serializable dictionary\n        \"\"\"\n        black_list_key_values = []\n        for key in dictionary:\n            # check inner field as a np.ndarray\n            if isinstance(dictionary[key], np.ndarray):\n                dictionary[key] = self.deep_serialize_ndarray(dictionary[key])\n            # check inner field as sklearn's _RemainderColsList\n            if self._is_remainder_cols_list(dictionary[key]):\n                dictionary[key] = list(dictionary[key])\n            # check inner field as a slice\n            if isinstance(dictionary[key], slice):\n                dictionary[key] = {\n                    \"pymilo-slice\": True,\n                    \"start\": dictionary[key].start,\n                    \"stop\": dictionary[key].stop,\n                    \"step\": dictionary[key].step,\n                }\n            # check whether the key itself is a np.int32\n            if isinstance(key, np.int32):\n                new_value = {\n                    \"np-type\": \"numpy.int32\",\n                    \"key-value\": dictionary[key]\n                }\n                black_list_key_values.append([key, new_value])\n            # check whether the key itself is a np.int64\n            if isinstance(key, np.int64):\n                new_value = {\n                    \"np-type\": \"numpy.int64\",\n                    \"key-value\": dictionary[key]\n                }\n                black_list_key_values.append([key, new_value])\n        for black_key_value in black_list_key_values:\n            prev_key = black_key_value[0]\n            new_value = black_key_value[1]\n            del dictionary[prev_key]\n            dictionary[int(prev_key)] = new_value\n        return dictionary\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize the general datastructures.\n\n            1. 
handling numpy infinity(which is an issue in ransac model)\n            2. unserializable type numpy.int32\n            3. unserializable type numpy.int64\n            4. list type which may contain unserializable type numpy.int32|int64\n            5. object of  unserializable numpy.ndarray class\n            6. dictionary serialization\n            7. tuple serialization\n\n        Serialize the data[key] of the given model which its type is model_type.\n        basically in order to fully serialize a model, we should traverse over all the keys of its data dictionary and\n        pass it through the chain of associated transporters to get fully serialized.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which its data dictionary is given as the data param.\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if not (isinstance(data[key], object) or isinstance(data[key], str)):\n            if np.isnan(data[key]):  # throws exception on object & str types\n                data[key] = {\n                    \"np-type\": \"numpy.nan\",\n                    \"value\": \"NaN\"\n                }\n        elif isinstance(data[key], type):\n            raw_type = str(data[key])\n            raw_type = \"numpy\" + str(raw_type).split(\"numpy\")[-1][:-2]\n            if raw_type in NUMPY_TYPE_DICT.keys():\n                data[key] = {\n                    \"np-type\": \"numpy.dtype\",\n                    \"value\": raw_type\n                }\n        # 1. 
Handling numpy infinity, ransac\n        elif isinstance(data[key], np.float64):\n            if np.inf == data[key]:\n                data[key] = {\n                    \"np-type\": \"numpy.infinity\",\n                    \"value\": \"infinite\"  # added for compatibility\n                }\n            else:\n                data[key] = {\"value\": data[key], \"np-type\": \"numpy.float64\"}\n\n        elif isinstance(data[key], np.intc):\n            data[key] = {\"value\": int(data[key]), \"np-type\": \"numpy.intc\"}\n\n        elif isinstance(data[key], np.int32):\n            data[key] = {\"value\": int(data[key]), \"np-type\": \"numpy.int32\"}\n\n        elif isinstance(data[key], np.int64):\n            data[key] = {\"value\": int(data[key]), \"np-type\": \"numpy.int64\"}\n\n        elif isinstance(data[key], np.uint64):\n            data[key] = {\"value\": int(data[key]), \"np-type\": \"numpy.uint64\"}\n\n        elif isinstance(data[key], list):\n            new_list = []\n            for item in data[key]:\n                if isinstance(item, np.int32):\n                    new_list.append(\n                        {\"value\": int(item), \"np-type\": \"numpy.int32\"})\n                elif isinstance(item, np.int64):\n                    new_list.append(\n                        {\"value\": int(item), \"np-type\": \"numpy.int64\"})\n                elif isinstance(item, np.ndarray):\n                    new_list.append(self.deep_serialize_ndarray(item))\n                elif isinstance(item, tuple):\n                    new_list.append(self.serialize_tuple(item))\n                elif self._is_remainder_cols_list(item):\n                    new_list.append(list(item))\n                elif isinstance(item, slice):\n                    new_list.append({\n                        \"pymilo-slice\": True,\n                        \"start\": item.start,\n                        \"stop\": item.stop,\n                        \"step\": item.step\n                 
   })\n                else:\n                    new_list.append(item)\n            data[key] = new_list\n\n        elif isinstance(data[key], np.ndarray):\n            data[key] = self.deep_serialize_ndarray(data[key])\n\n        elif isinstance(data[key], set):\n            data[key] = {\n                \"pymilo-set\": list(data[key])\n            }\n\n        elif isinstance(data[key], dict):\n            data[key] = self.serialize_dict(data[key])\n\n        elif isinstance(data[key], slice):\n            data[key] = {\n                \"pymilo-slice\": True,\n                \"start\": data[key].start,\n                \"stop\": data[key].stop,\n                \"step\": data[key].step\n            }\n\n        elif isinstance(data[key], tuple):\n            data[key] = self.serialize_tuple(data[key])\n\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize the general datastructures.\n\n            1. Dictionary deserialization\n            2. Deep conversion of lists to numpy.ndarray class\n            3. 
Convert custom serializable object of np.int32|int64 to the main np.int32|int64 type\n\n        Deserialize the data[key] of the given model whose type is model_type.\n        In order to fully deserialize a model, we should traverse over all the keys of its serialized data dictionary and\n        pass each one through the chain of associated transporters to get it fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, whose value (data[key]) we're going to deserialize\n        :type key: object\n        :param model_type: the model type of the ML model, whose internal serialized data dictionary is given as the data param.\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], dict):\n            if 'pymilo-bypass' in data[key]:\n                return data[key]\n            else:\n                return self.get_deserialized_dict(data[key])\n\n        elif isinstance(data[key], list):\n            return self.get_deserialized_list(data[key])\n\n        # NOTE: dict values are already handled above (slices through get_deserialized_dict)\n        elif isinstance(data[key], dict) and data[key].get(\"pymilo-slice\", False):\n            return slice(data[key][\"start\"], data[key][\"stop\"], data[key][\"step\"])\n\n        elif self.is_numpy_primary_type(data[key]):\n            return self.get_deserialized_regular_primary_types(data[key])\n        else:\n            # TODO\n            return data[key]\n\n    def get_deserialized_dict(self, 
content):\n        \"\"\"\n        Deserialize the given previously made serializable dictionary.\n\n            1. convert numpy types values which previously made serializable to its original form\n            2. deep conversion of list values to nd arrays\n\n        It is mainly used in serializing/deserializing the \"scores_\" field in Logistic regression([+CV]).\n\n        :param content: given dictionary\n        :type content: dict\n        :return: the original dictionary\n        \"\"\"\n        black_list_key_values = []\n\n        if not isinstance(content, dict):\n            return content\n\n        if check_str_in_iterable(\"pymilo-tuple\", content):\n            return tuple(self.get_deserialized_list(content[\"pymilo-tuple\"]))\n\n        if check_str_in_iterable(\"pymilo-set\", content):\n            return set(self.get_deserialized_list(content[\"pymilo-set\"]))\n\n        if check_str_in_iterable(\"pymilo-slice\", content):\n            return slice(content[\"start\"], content[\"stop\"], content[\"step\"])\n\n        if self.is_deserialized_ndarray(content):\n            return self.deep_deserialize_ndarray(content)\n\n        if check_str_in_iterable(\"np-type\", content) and check_str_in_iterable(\"value\", content):\n            return self.get_deserialized_regular_primary_types(content)\n\n        for key in content:\n\n            if isinstance(content[key], dict):\n                content[key] = self.get_deserialized_dict(content[key])\n\n            elif isinstance(content[key], list):\n                new_list = []\n                for item in content[key]:\n                    if self.is_deserialized_ndarray(item):\n                        new_list.append(self.deep_deserialize_ndarray(item))\n                    else:\n                        new_list.append(self.deserialize_primitive_type(item))\n                content[key] = new_list\n\n            if check_str_in_iterable(\n                    \"np-type\", content[key]):\n           
     new_key = NUMPY_TYPE_DICT[content[key][\"np-type\"]](key)\n                new_value = content[key][\"key-value\"]\n                black_list_key_values.append([key, new_key, new_value])\n\n        for black_key_value in black_list_key_values:\n            prev_key, new_key, new_value = black_key_value\n            del content[prev_key]\n            content[new_key] = new_value\n\n        return content\n\n    def get_deserialized_list(self, content):\n        \"\"\"\n        Deserialize the given list to its original form.\n\n            1. convert previously made serializable numpy types to its original form\n            2. convert list to nd array\n\n        It is mainly used in serializing/deserializing the \"active_\" array field in Lasso Lars.\n\n        :param content: given list to get\n        :type content: list\n        :return: the original list\n        \"\"\"\n        new_list = []\n        for item in content:\n            if isinstance(item, dict) and check_str_in_iterable(\"pymilo-tuple\", item):\n                new_list.append(tuple(self.get_deserialized_list(item[\"pymilo-tuple\"])))\n            elif self.is_deserialized_ndarray(item):\n                new_list.append(self.deep_deserialize_ndarray(item))\n            elif isinstance(item, dict) and item.get(\"pymilo-slice\", False):\n                new_list.append(slice(item[\"start\"], item[\"stop\"], item[\"step\"]))\n            else:\n                new_list.append(self.deserialize_primitive_type(item))\n        return new_list\n\n    def get_deserialized_regular_primary_types(self, content):\n        \"\"\"\n        Deserialize the given item to its original form.\n\n            1. handling np.int32 type\n            2. handling np.int64 type\n            3. 
handling np.infinity type\n\n        :param content: given item needed to get back to its original form\n        :type content: object\n        :return: the associated np.int32|np.int64|np.inf\n        \"\"\"\n        if \"np-type\" in content:\n            if content[\"np-type\"] == \"numpy.dtype\":\n                if isinstance(content[\"value\"], str):\n                    # when the value is the associated type name like numpy.float64\n                    return NUMPY_TYPE_DICT[content[\"value\"]]\n                else:\n                    return NUMPY_TYPE_DICT[content[\"np-type\"]](NUMPY_TYPE_DICT[content['value']])\n            if content[\"np-type\"] == \"numpy.nan\":\n                return NUMPY_TYPE_DICT[content[\"np-type\"]]\n            return NUMPY_TYPE_DICT[content[\"np-type\"]](content['value'])\n\n    def is_numpy_primary_type(self, content):\n        \"\"\"\n        Check whether the given object is a numpy primary type.\n\n        :param content: given object to get checked whether it is a numpy primary type or not\n        :type content: object\n        :return: boolean representing whether the associated content is a numpy primary type or not\n        \"\"\"\n        if is_primitive(content):\n            return False\n        # \"np-type\" holds a type name (e.g. \"numpy.int32\"), so compare against the dict keys\n        current_supported_primary_types = NUMPY_TYPE_DICT.keys()\n        if check_str_in_iterable(\"np-type\", content) and content[\"np-type\"] in current_supported_primary_types:\n            return True\n        else:\n            return False\n\n    def ndarray_to_list(self, ndarray_item):\n        \"\"\"\n        Convert the given ndarray to its fully listed format.\n\n            1. convert itself to a list\n            2. 
iterate over its elements and apply ndarray to list conversion if it's eligible\n\n        :param ndarray_item: given ndarray needed to get converted to its fully listed form\n        :type ndarray_item: numpy.ndarray\n        :return: list\n        \"\"\"\n        if isinstance(ndarray_item, np.ndarray):\n            listed_ndarray = ndarray_item.tolist()\n            new_list = []\n            for item in listed_ndarray:\n                new_list.append(self.ndarray_to_list(item))\n            return new_list\n        else:\n            return ndarray_item\n\n    def list_to_ndarray(self, list_item):\n        \"\"\"\n        Convert the given list to its fully ndarray format.\n\n            1. iterate over its elements and apply list to ndarray conversion if it's eligible\n            2. convert the coarse-grained list to ndarray\n\n        :param list_item: given list needed to get converted to its np.ndarray form\n        :type list_item: list\n        :return: numpy.ndarray\n        \"\"\"\n        if isinstance(list_item, list):\n\n            if len(list_item) == 0:\n                return np.asarray(list_item)\n\n            new_list = []\n            for item in list_item:\n                new_list.append(self.list_to_ndarray(item))\n\n            is_homogeneous_type, the_homogeneous_type = get_homogeneous_type(\n                new_list)\n            if is_homogeneous_type:\n                if the_homogeneous_type in [int, float, str, bool]:\n                    return np.asarray(new_list)\n                elif the_homogeneous_type == np.ndarray:\n                    is_homogeneous_type, _ = get_homogeneous_type(\n                        [x.dtype for x in new_list])\n                    if (is_homogeneous_type):\n                        if all_same([len(x) for x in new_list]):\n                            try:\n                                return np.asarray(new_list)\n                            except Exception as _:\n                              
  # when we have a list of ndarrays with different shapes.\n                                return new_list\n\n            return np.asarray(new_list, dtype=object)\n        else:\n            return self.deserialize_primitive_type(list_item)\n\n    def deserialize_primitive_type(self, primitive):\n        \"\"\"\n        Deserialize the given primitive data type.\n\n        :param primitive: given primitive needed to get deserialized to its pure primitive form\n        :type primitive: pure python primitive or dict\n        :return: pure python primitive or numpy primitive data type\n        \"\"\"\n        if is_primitive(primitive):\n            return primitive\n        elif check_str_in_iterable(\"np-type\", primitive):\n            return self.get_deserialized_regular_primary_types(primitive)\n        else:\n            return primitive\n\n    def deep_serialize_ndarray(self, ndarray):\n        \"\"\"\n        Serialize the given ndarray.\n\n        :param ndarray: given ndarray needed to get serialized\n        :type ndarray: numpy.ndarray\n        :return: dict\n        \"\"\"\n        if (not (isinstance(ndarray, np.ndarray))):\n            return None  # throw error\n\n        listed_ndarray = ndarray.tolist()\n        dtype = ndarray.dtype\n\n        new_list = []\n        for item in listed_ndarray:\n            if isinstance(item, np.ndarray):\n                new_list.append(self.deep_serialize_ndarray(item))\n            else:\n                new_list.append(item)\n\n        return {\n            'pymiloed-ndarray-list': new_list,\n            'pymiloed-ndarray-dtype': str(dtype),\n            'pymiloed-ndarray-shape': ndarray.shape,\n            'pymiloed-data-structure': 'numpy.ndarray'\n        }\n\n    def is_deserialized_ndarray(self, deserialized_ndarray):\n        \"\"\"\n        Check whether the given input is a previously pymilo-serialized ndarray.\n\n        :param deserialized_ndarray: given input to get checked\n        
:type deserialized_ndarray: obj\n        :return: bool\n        \"\"\"\n        if not (isinstance(deserialized_ndarray, dict)):\n            return False\n\n        if not (\n                'pymiloed-data-structure' in deserialized_ndarray and deserialized_ndarray['pymiloed-data-structure'] == 'numpy.ndarray'):\n            return False\n\n        return True\n\n    def deep_deserialize_ndarray(self, deserialized_ndarray):\n        \"\"\"\n        Deserialize the given pymilo-serialized ndarray to its fully ndarray format.\n\n        :param deserialized_ndarray: given pymilo-serialized ndarray needed to get deserialized to its np.ndarray form\n        :type deserialized_ndarray: dict\n        :return: numpy.ndarray\n        \"\"\"\n        if not self.is_deserialized_ndarray(deserialized_ndarray):\n            return None  # throw error\n\n        inner_list = deserialized_ndarray['pymiloed-ndarray-list']\n        dtype = deserialized_ndarray['pymiloed-ndarray-dtype']\n        shape = deserialized_ndarray['pymiloed-ndarray-shape']\n\n        if dtype.startswith(\"[\"):\n            dtype = literal_eval(dtype)\n\n        new_list = []\n        for item in inner_list:\n            if self.is_deserialized_ndarray(item):\n                new_list.append(self.deep_deserialize_ndarray(item))\n            else:\n                if len(shape) == 1:\n                    # shape is of the form [int], so inner items should not be lists.\n                    # convert each inner item to a tuple (if it is a list)\n                    if isinstance(item, list):\n                        new_list.append(tuple(item))\n                    else:\n                        new_list.append(item)\n                else:\n                    new_list.append(item)\n\n        pre_result = np.asarray(new_list, dtype=dtype)\n        # guard against an empty list before inspecting its first item\n        if dtype == \"object\" and len(new_list) > 0 and hasattr(new_list[0], \"dtype\"):\n            # check if inner items have specific dtype.\n            pre_result = np.asarray(new_list)\n        if not 
prefix_list(list(pre_result.shape), shape):\n            return pre_result.reshape(shape)\n        return pre_result\n"
  },
  {
    "path": "pymilo/transporters/generator_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Generator transporter.\"\"\"\nfrom numpy.random._generator import Generator\nfrom ..utils.util import is_primitive, check_str_in_iterable\nfrom .transporter import AbstractTransporter\nfrom numpy.random import default_rng\n\n\nclass GeneratorTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle Generator objects.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize Generator object.\n\n        serialize the data[key] of the given model which type is model_type.\n        basically in order to fully serialize a model, we should traverse over all the keys of its data dictionary and\n        pass it through the chain of associated transporters to get fully serialized.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], Generator):\n            generator = data[key]\n            data[key] = {\n                \"pymilo-bypass\": True,\n                \"pymilo-generator\": {\n                    \"state\": generator.__getstate__()\n                }\n            }\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize previously pymilo serialized Generator object.\n\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully 
deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which internal serialized data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if is_primitive(content) or content is None:\n            return content\n\n        if check_str_in_iterable(\"pymilo-generator\", content):\n            serialized_generator = content[\"pymilo-generator\"]\n            generator = default_rng(0)\n            generator.__setstate__(serialized_generator[\"state\"])\n            return generator\n\n        return content\n"
  },
  {
    "path": "pymilo/transporters/lossfunction_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Loss function transporter.\"\"\"\nfrom .transporter import AbstractTransporter\nfrom ..utils.util import is_primitive, check_str_in_iterable\nfrom sklearn.linear_model._stochastic_gradient import SGDClassifier\nfrom sklearn.ensemble import GradientBoostingRegressor\nfrom sklearn.ensemble import GradientBoostingClassifier\n\nfrom ..pymilo_param import SKLEARN_ENSEMBLE_TABLE, NOT_SUPPORTED\nif SKLEARN_ENSEMBLE_TABLE[\"HistGradientBoostingRegressor\"] != NOT_SUPPORTED:\n    from sklearn.ensemble import HistGradientBoostingRegressor\n    from sklearn.ensemble import HistGradientBoostingClassifier\n\nloss_function_dict = {}\nsklearn_baseloss_support = False\ntry:\n    from sklearn._loss.loss import BaseLoss\n    sklearn_baseloss_support = True\n    loss_function_dict[\"sklearn._loss.loss.BaseLoss\"] = BaseLoss\nexcept BaseException:\n    pass\n\nhist_baseloss_support = False\ntry:\n    from sklearn.ensemble._hist_gradient_boosting.loss import BaseLoss\n    hist_baseloss_support = True\n    loss_function_dict[\"sklearn.ensemble._hist_gradient_boosting.loss.BaseLoss\"] = BaseLoss\nexcept BaseException:\n    pass\n\ngb_losses_support = False\ntry:\n    from sklearn.ensemble._gb_losses import LossFunction\n    gb_losses_support = True\n    loss_function_dict[\"sklearn.ensemble._gb_losses.LossFunction\"] = LossFunction\nexcept BaseException:\n    pass\n\n\nclass LossFunctionTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle Loss function field.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize the special loss_function_ of the SGDClassifier, SGDOneClassSVM, Perceptron and PassiveAggressiveClassifier.\n\n        serialize the data[key] of the given model which type is model_type.\n        basically in order to fully serialize a model, we should traverse over all the keys of its data dictionary and\n        pass it through the chain of 
associated transporters to get fully serialized.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if (\n            (model_type == \"SGDClassifier\" and (key == \"loss_function_\" or key == \"_loss_function_\")) or\n            (model_type == \"SGDOneClassSVM\" and (key == \"loss_function_\" or key == \"_loss_function_\")) or\n            (model_type == \"Perceptron\" and (key == \"loss_function_\" or key == \"_loss_function_\")) or\n            (model_type == \"PassiveAggressiveClassifier\" and (key == \"loss_function_\" or key == \"_loss_function_\"))\n        ):\n            data[key] = {\n                \"pymilo-sgd-loss\": data[\"loss\"]\n            }\n\n        if sklearn_baseloss_support:\n            if isinstance(data[key], BaseLoss):\n                if (\n                        model_type == \"GradientBoostingRegressor\" or\n                        model_type == \"GradientBoostingClassifier\"):\n                    data[key] = {\n                        \"pymilo-ensemble-loss\": {\n                            \"loss-library\": \"sklearn._loss.loss.BaseLoss\",\n                            \"loss\": data[\"loss\"],\n                            \"constant_hessian\": data[key].__dict__[\"constant_hessian\"],\n                            \"n_classes\": data[key].__dict__[\"n_classes\"],\n                            \"alpha\": data[\"alpha\"],\n                            \"model_type\": model_type,\n                        }\n                    }\n                elif (\n                        model_type == \"HistGradientBoostingRegressor\" or\n             
           model_type == \"HistGradientBoostingClassifier\"):\n                    data[key] = {\n                        \"pymilo-ensemble-loss\": {\n                            \"loss-library\": \"sklearn._loss.loss.BaseLoss\",\n                            \"loss\": data[\"loss\"],\n                            \"constant_hessian\": data[key].__dict__[\"constant_hessian\"],\n                            \"n_trees_per_iteration_\": data[\"n_trees_per_iteration_\"],\n                            \"model_type\": model_type,\n                        }\n                    }\n\n        if gb_losses_support:\n            if isinstance(data[key], LossFunction):\n                if model_type == \"GradientBoostingRegressor\":\n                    data[key] = {\n                        \"pymilo-ensemble-loss\": {\n                            \"loss-library\": \"sklearn.ensemble._gb_losses.LossFunction\",\n                            \"loss\": data[\"loss\"],\n                            \"alpha\": data[\"alpha\"],\n                            \"model_type\": model_type,\n                        }\n                    }\n                elif model_type == \"GradientBoostingClassifier\":\n                    data[key] = {\n                        \"pymilo-ensemble-loss\": {\n                            \"loss-library\": \"sklearn.ensemble._gb_losses.LossFunction\",\n                            \"len(classes_)\": len(data[\"classes_\"]),\n                            \"loss\": data[\"loss\"],\n                            \"n_classes_\": data[\"n_classes_\"],\n                            \"model_type\": model_type,\n                        }\n                    }\n\n        if hist_baseloss_support:\n            if isinstance(data[key], BaseLoss):\n                if (\n                        model_type == \"HistGradientBoostingRegressor\" or\n                        model_type == \"HistGradientBoostingClassifier\"):\n                    data[key] = {\n                        
\"pymilo-ensemble-loss\": {\n                            \"loss-library\": \"sklearn.ensemble._hist_gradient_boosting.loss.BaseLoss\",\n                            \"loss\": data[\"loss\"],\n                            \"hessians_are_constant\": data[key].__dict__[\"hessians_are_constant\"],\n                            \"n_threads\": data[key].__dict__[\"n_threads\"],\n                            \"n_trees_per_iteration_\": data[\"n_trees_per_iteration_\"],\n                            \"model_type\": model_type,\n                        }\n                    }\n\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize the special loss_function_ of the SGDClassifier, SGDOneClassSVM, Perceptron and PassiveAggressiveClassifier.\n\n        the associated loss_function_ field of the pymilo serialized model, is extracted through\n        the SGDClassifier's _get_loss_function function with enough feeding of the needed inputs.\n\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which internal serialized data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if is_primitive(content) or content is None:\n            return content\n\n        if 
check_str_in_iterable(\"pymilo-sgd-loss\", content):\n            return SGDClassifier(\n                loss=content[\"pymilo-sgd-loss\"])._get_loss_function(\n                content[\"pymilo-sgd-loss\"])\n\n        if check_str_in_iterable(\"pymilo-ensemble-loss\", content):\n            ensemble_loss = content[\"pymilo-ensemble-loss\"]\n            model_type = ensemble_loss[\"model_type\"]\n            loss_type = ensemble_loss[\"loss-library\"]\n            if loss_type == \"sklearn._loss.loss.BaseLoss\":\n                sample_weight = None if ensemble_loss[\"constant_hessian\"] else True\n                if model_type == \"GradientBoostingRegressor\":\n                    return GradientBoostingRegressor(\n                        loss=ensemble_loss[\"loss\"],\n                        alpha=ensemble_loss[\"alpha\"])._get_loss(sample_weight)\n\n                elif model_type == \"GradientBoostingClassifier\":\n                    gbs = GradientBoostingClassifier(loss=ensemble_loss[\"loss\"])\n                    gbs.__dict__[\"n_classes_\"] = ensemble_loss[\"n_classes\"]\n                    return gbs._get_loss(sample_weight)\n\n                elif model_type == \"HistGradientBoostingRegressor\":\n                    return HistGradientBoostingRegressor(\n                        loss=ensemble_loss[\"loss\"])._get_loss(sample_weight)\n\n                elif model_type == \"HistGradientBoostingClassifier\":\n                    n_trees_per_iteration_ = ensemble_loss[\"n_trees_per_iteration_\"]\n                    hgbc = HistGradientBoostingClassifier()\n                    hgbc.__dict__[\"n_trees_per_iteration_\"] = n_trees_per_iteration_\n                    return hgbc._get_loss(sample_weight)\n\n            elif loss_type == \"sklearn.ensemble._hist_gradient_boosting.loss.BaseLoss\" and model_type in [\"HistGradientBoostingRegressor\", \"HistGradientBoostingClassifier\"]:\n                sample_weight = None if ensemble_loss[\"hessians_are_constant\"] 
else True\n                if model_type == \"HistGradientBoostingRegressor\":\n                    return HistGradientBoostingRegressor(\n                        loss=ensemble_loss[\"loss\"])._get_loss(sample_weight, ensemble_loss[\"n_threads\"])\n                elif model_type == \"HistGradientBoostingClassifier\":\n                    n_trees_per_iteration_ = ensemble_loss[\"n_trees_per_iteration_\"]\n                    hgbc = HistGradientBoostingClassifier(loss=ensemble_loss[\"loss\"])\n                    hgbc.__dict__[\"n_trees_per_iteration_\"] = n_trees_per_iteration_\n                    return hgbc._get_loss(sample_weight, ensemble_loss[\"n_threads\"])\n\n            elif loss_type == \"sklearn.ensemble._gb_losses.LossFunction\" and model_type in [\"GradientBoostingRegressor\", \"GradientBoostingClassifier\"]:\n                from sklearn.ensemble._gb_losses import MultinomialDeviance, BinomialDeviance, LOSS_FUNCTIONS\n                if ensemble_loss[\"loss\"] in [\"deviance\", \"log_loss\"]:\n                    loss_class = (\n                        MultinomialDeviance\n                        if ensemble_loss[\"len(classes_)\"] > 2\n                        else BinomialDeviance\n                    )\n                else:\n                    loss_class = LOSS_FUNCTIONS[ensemble_loss[\"loss\"]]\n                if model_type == \"GradientBoostingRegressor\":\n                    if ensemble_loss[\"loss\"] in (\"huber\", \"quantile\"):\n                        return loss_class(ensemble_loss[\"alpha\"])\n                    else:\n                        return loss_class()\n                elif model_type == \"GradientBoostingClassifier\":\n                    return loss_class(ensemble_loss[\"n_classes_\"])\n\n        return content\n"
  },
  {
    "path": "pymilo/transporters/neighbors_tree_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Tree(from sklearn.tree._tree) object transporter.\"\"\"\nfrom sklearn.neighbors._kd_tree import KDTree\nfrom sklearn.neighbors._ball_tree import BallTree\n\nfrom .transporter import AbstractTransporter\n\ntype_to_tree = {\n    \"KDTree\": KDTree,\n    \"BallTree\": BallTree\n}\n\n\nclass NeighborsTreeTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle (pyi,pyx) NeighborsTreeTransporter object.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize the special _tree field of the Neighbors model.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], KDTree):\n            data[key] = {\n                'pymilo-bypass': True,\n                'pymilo-tree-type': \"KDTree\",\n            }\n        elif isinstance(data[key], BallTree):\n            data[key] = {\n                'pymilo-bypass': True,\n                'pymilo-tree-type': \"BallTree\",\n            }\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize the special _tree field of the Neighbors model.\n\n        :param data: the internal data dictionary of the associated JSON file of the ML model generated by pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        
\"\"\"\n        content = data[key]\n        if (key == \"_tree\" and content is not None and\n            (model_type == \"KNeighborsRegressor\" or\n             model_type == \"KNeighborsClassifier\" or\n             model_type == \"RadiusNeighborsRegressor\" or\n             model_type == \"RadiusNeighborsClassifier\" or\n             model_type == \"NearestNeighbors\" or\n             model_type == \"NearestCentroid\"\n             )):\n            _tree = type_to_tree[content[\"pymilo-tree-type\"]](\n                data[\"_fit_X\"],\n                data[\"leaf_size\"],\n                data[\"effective_metric_\"],\n                **data[\"effective_metric_params_\"],\n            )\n            return _tree\n\n        else:\n            return content\n"
  },
  {
    "path": "pymilo/transporters/preprocessing_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Preprocessing transporter.\"\"\"\nfrom ..pymilo_param import SKLEARN_PREPROCESSING_TABLE\nfrom ..utils.util import check_str_in_iterable, get_sklearn_type\nfrom .transporter import AbstractTransporter, Command\nfrom .general_data_structure_transporter import GeneralDataStructureTransporter\nfrom .function_transporter import FunctionTransporter\nfrom scipy.interpolate._bsplines import BSpline\n\nPREPROCESSING_CHAIN = {\n    \"GeneralDataStructureTransporter\": GeneralDataStructureTransporter(),\n    \"FunctionTransporter\": FunctionTransporter(),\n}\n\n\nclass PreprocessingTransporter(AbstractTransporter):\n    \"\"\"Preprocessing object dedicated Transporter.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize Preprocessing object.\n\n        serialize the data[key] of the given model which type is model_type.\n        basically in order to fully serialize a model, we should traverse over all the keys of its data dictionary and\n        pass it through the chain of associated transporters to get fully serialized.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if self.is_preprocessing_module(data[key]):\n            return self.serialize_pre_module(data[key])\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize previously pymilo serialized preprocessing object.\n\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should 
traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which internal serialized data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if self.is_preprocessing_module(content):\n            return self.deserialize_pre_module(content)\n        return content\n\n    def is_preprocessing_module(self, pre_module):\n        \"\"\"\n        Check whether the given module is a sklearn Preprocessing module or not.\n\n        :param pre_module: given object\n        :type pre_module: any\n        :return: bool\n        \"\"\"\n        if isinstance(pre_module, dict):\n            return check_str_in_iterable(\n                \"pymilo-preprocessing-type\",\n                pre_module) and pre_module[\"pymilo-preprocessing-type\"] in SKLEARN_PREPROCESSING_TABLE.keys()\n        return get_sklearn_type(pre_module) in SKLEARN_PREPROCESSING_TABLE.keys()\n\n    def serialize_pre_module(self, pre_module):\n        \"\"\"\n        Serialize Preprocessing object.\n\n        :param pre_module: given sklearn preprocessing module\n        :type pre_module: sklearn.preprocessing\n        :return: pymilo serialized pre_module\n        \"\"\"\n        # add one depth inner preprocessing module population\n        for key, value in pre_module.__dict__.items():\n            if self.is_preprocessing_module(value):\n                pre_module.__dict__[key] = self.serialize_pre_module(value)\n     
       elif isinstance(value, BSpline):\n                pre_module.__dict__[key] = self.serialize_spline(value)\n            elif isinstance(value, list):\n                if len(value) > 0 and isinstance(value[0], BSpline):\n                    pre_module.__dict__[key] = [self.serialize_spline(bspline) for bspline in value]\n\n        for transporter in PREPROCESSING_CHAIN:\n            PREPROCESSING_CHAIN[transporter].transport(\n                pre_module, Command.SERIALIZE)\n        return {\n            \"pymilo-bypass\": True,\n            \"pymilo-preprocessing-type\": get_sklearn_type(pre_module),\n            \"pymilo-preprocessing-data\": pre_module.__dict__\n        }\n\n    def deserialize_pre_module(self, serialized_pre_module):\n        \"\"\"\n        Deserialize Preprocessing object.\n\n        :param serialized_pre_module: serialized preprocessing module(by pymilo)\n        :type serialized_pre_module: dict\n        :return: retrieved associated sklearn.preprocessing module\n        \"\"\"\n        data = serialized_pre_module[\"pymilo-preprocessing-data\"]\n        associated_type = SKLEARN_PREPROCESSING_TABLE[serialized_pre_module[\"pymilo-preprocessing-type\"]]\n        retrieved_pre_module = associated_type()\n        for key in data:\n            # deserialize nested preprocessing modules (one level deep)\n            if self.is_preprocessing_module(data[key]):\n                data[key] = self.deserialize_pre_module(data[key])\n            # check whether the inner field is a serialized BSpline\n            if self.is_bspline(data[key]):\n                data[key] = self.deserialize_spline(data[key])\n            # check whether the inner field is a list of serialized BSplines\n            if isinstance(data[key], list) and len(data[key]) > 0 and self.is_bspline(data[key][0]):\n                data[key] = [self.deserialize_spline(serialized_bspline) for serialized_bspline in data[key]]\n\n            for transporter in PREPROCESSING_CHAIN:\n                data[key] = 
PREPROCESSING_CHAIN[transporter].deserialize(data, key, \"\")\n        for key in data:\n            setattr(retrieved_pre_module, key, data[key])\n        return retrieved_pre_module\n\n    def is_bspline(self, bspline):\n        \"\"\"\n        Check whether the given module is a scipy.interpolate._bsplines.BSpline or not.\n\n        :param bspline: given object\n        :type bspline: any\n        :return: bool\n        \"\"\"\n        if isinstance(bspline, dict):\n            return check_str_in_iterable(\n                \"pymilo-preprocessing-type\",\n                bspline) and bspline[\"pymilo-preprocessing-type\"] == \"BSpline\"\n        return get_sklearn_type(bspline) == \"BSpline\"\n\n    def serialize_spline(self, bspline):\n        \"\"\"\n        Serialize scipy.interpolate._bsplines.BSpline object.\n\n        :param bspline: given scipy.interpolate._bsplines.BSpline module\n        :type bspline: scipy.interpolate._bsplines.BSpline\n        :return: pymilo serialized bspline\n        \"\"\"\n        for transporter in PREPROCESSING_CHAIN:\n            PREPROCESSING_CHAIN[transporter].transport(bspline, Command.SERIALIZE)\n        return {\n            \"pymilo-bypass\": True,\n            \"pymilo-preprocessing-type\": get_sklearn_type(bspline),\n            \"pymilo-preprocessing-data\": bspline.__dict__\n        }\n\n    def deserialize_spline(self, serialized_bspline):\n        \"\"\"\n        Deserialize BSpline object.\n\n        :param serialized_bspline: serialized BSpline object(by pymilo)\n        :type serialized_bspline: dict\n        :return: retrieved associated scipy.interpolate._bsplines.BSpline object\n        \"\"\"\n        data = serialized_bspline[\"pymilo-preprocessing-data\"]\n        associated_type = BSpline\n        for key in list(data.keys()):\n            for transporter in PREPROCESSING_CHAIN:\n                data[key] = PREPROCESSING_CHAIN[transporter].deserialize(data, key, \"\")\n        # Handle both old scipy (t, 
c, k) and new scipy (_t, _c, _k) attribute names\n        t = data.get(\"t\", data.get(\"_t\"))\n        c = data.get(\"c\", data.get(\"_c\"))\n        k = data.get(\"k\", data.get(\"_k\"))\n        retrieved_pre_module = associated_type(t=t, k=k, c=c)\n        for key in data:\n            setattr(retrieved_pre_module, key, data[key])\n        return retrieved_pre_module\n"
  },
  {
    "path": "pymilo/transporters/randomstate_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo RandomState(MT19937) object transporter.\"\"\"\nimport numpy as np\nfrom .transporter import AbstractTransporter\nfrom ..utils.util import check_str_in_iterable\n\n\nclass RandomStateTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle RandomState field.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize instances of the RandomState class.\n\n        Record the `state` associated fields of random state object.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], np.random.RandomState):\n            inner_random_state = data[key]\n            data[key] = {\n                \"pymilo-bypass\": True,\n                \"pymilo-randomstate\": (\n                    inner_random_state.get_state()[0],\n                    inner_random_state.get_state()[1].tolist(),\n                    inner_random_state.get_state()[2],\n                    inner_random_state.get_state()[3],\n                    inner_random_state.get_state()[4],\n                ),\n            }\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize the previously serialized RandomState object.\n\n        The associated _random_state field of the pymilo serialized NN model, is extracted through\n        it's previously serialized parameters.\n\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should traverse over all the keys of its serialized data 
dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n\n        if check_str_in_iterable(\"pymilo-randomstate\", content):\n            inner_random_state = content[\"pymilo-randomstate\"]\n            inner_random_state = (\n                inner_random_state[0],\n                np.array(inner_random_state[1]),\n                inner_random_state[2],\n                inner_random_state[3],\n                inner_random_state[4],\n            )\n            _random_state = np.random.RandomState()\n            _random_state.set_state(inner_random_state)\n            return _random_state\n        else:\n            return content\n"
  },
  {
    "path": "pymilo/transporters/sgdoptimizer_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo SGDOptimizer object transporter.\"\"\"\nfrom .transporter import AbstractTransporter\nfrom ..utils.util import check_str_in_iterable\nfrom sklearn.neural_network._stochastic_optimizers import SGDOptimizer\n\n\nclass SGDOptimizerTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle SGDOptimizer field.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize instances of the SGDOptimizer class.\n\n        Record the `learning_rate`, `momentum`, `decay` and `nesterov` fields of random state object.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], SGDOptimizer):\n            optimizer = data[key]\n            data[key] = {\n                \"pymilo-bypass\": True,\n                'pymilo-sgdoptimizer': {\n                    'type': \"SGDOptimizer\",\n                    'learning_rate': optimizer.learning_rate,\n                    'momentum': optimizer.momentum,\n                    'decay': optimizer.decay,\n                    'nesterov': optimizer.nesterov\n                }\n            }\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize the special _optimizer field of the SGDOptimizer.\n\n        The associated _optimizer field of the pymilo serialized model, is extracted through\n        it's previously serialized parameters.\n\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should traverse over all the 
keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if check_str_in_iterable('pymilo-sgdoptimizer', content):\n            optimizer = content['pymilo-sgdoptimizer']\n            if (optimizer[\"type\"] == \"SGDOptimizer\"):\n                return SGDOptimizer(\n                    learning_rate=optimizer['learning_rate'],\n                    momentum=optimizer['momentum'],\n                    decay=optimizer['decay'],\n                    nesterov=optimizer['nesterov'])\n            else:\n                return content\n        else:\n            return content\n"
  },
  {
    "path": "pymilo/transporters/transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Transporter.\"\"\"\nfrom ..utils.util import get_sklearn_type\nfrom abc import ABC, abstractmethod\nfrom enum import Enum\nfrom ..utils.util import is_primitive, check_str_in_iterable\n\n\nclass Command(Enum):\n    \"\"\"Command is an enum class used to determine the type of transportation.\"\"\"\n\n    SERIALIZE = 1\n    DESERIALIZE = 2\n\n\nclass Transporter(ABC):\n    \"\"\"\n    Transporter Interface.\n\n    Each Transporter transports(either serializes or deserializes) the input according to the given command.\n    \"\"\"\n\n    @abstractmethod\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize the data[key] of the given model which type is model_type.\n\n        basically in order to fully serialize a model, we should traverse over all the keys of its data dictionary and\n        pass it through the chain of associated transporters to get fully serialized.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n\n    @abstractmethod\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize the data[key] of the given model which type is model_type.\n\n        basically in order to fully deserialize a model, we should traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model\n            which is generated previously by pymilo export.\n        :type data: dict\n        
:param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which internal serialized data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n\n    @abstractmethod\n    def bypass(self, content):\n        \"\"\"\n        Determine whether to bypass transporting on this content or not.\n\n        :param content: either a ML model object's internal data dictionary(.__dict__) or an object associated with the json string of a pymilo serialized ML model.\n        :type content: object\n        :return: boolean, whether to bypass or not\n        \"\"\"\n\n    @abstractmethod\n    def reset(self):\n        \"\"\"\n        Reset internal data structures of the transport object.\n\n        Some Transporters may be stateful and have internal data structures getting filled during transportation.\n\n        :return: None\n        \"\"\"\n\n    @abstractmethod\n    def transport(self, request, command):\n        \"\"\"\n        Either serializes or deserializes the request according to the given command.\n\n        basically in order to fully transport a request, we should traverse over all the keys of its internal data dictionary and\n        pass it through the chain of associated transporters to get fully transported.\n\n        :param request: either a ML model object itself(when command is serialize) or\n        an object associated with the json string of a pymilo serialized ML model(when command is deserialize)\n        :type request: object\n        :param command: determines the type of transportation, it can be either Serialize or Deserialize\n        :type command: Command class\n        :return: pymilo transported output of data[key]\n        \"\"\"\n\n\nclass AbstractTransporter(Transporter):\n    \"\"\"Abstract Transporter with the implementation 
of the traversing through the given input according to the associated command.\"\"\"\n\n    def bypass(self, content):\n        \"\"\"\n        Determine whether to bypass transporting on this content or not.\n\n        :param content: either a ML model object's internal data dictionary or an object associated with the json string of a pymilo serialized ML model.\n        :type content: object\n        :return: boolean, whether to bypass or not\n        \"\"\"\n        if is_primitive(content):\n            return False\n\n        if check_str_in_iterable(\"pymilo-bypass\", content):\n            return content[\"pymilo-bypass\"]\n        else:\n            return False\n\n    def transport(self, request, command, is_inner_model=False):\n        \"\"\"\n        Either serializes or deserializes the request according to the given command.\n\n        basically in order to fully transport a request, we should traverse over all the keys of its internal data dictionary and\n        pass it through the chain of associated transporters to get fully transported.\n\n        :param request: either a ML model object itself(when command is serialize) or an object associated with the json string of a pymilo serialized ML model(when command is deserialize)\n        :type request: object\n        :param command: determines the type of transportation, it can be either Serialize or Deserialize\n        :type command: Command class\n        :param is_inner_model: determines whether it is an inner linear model of a super ml model\n        :type is_inner_model: boolean\n        :return: pymilo transported output of data[key]\n        \"\"\"\n        if command == Command.SERIALIZE:\n            # request is a sklearn model\n            data = request.__dict__\n            for key in data:\n                if self.bypass(data[key]):\n                    continue  # by-pass!!\n                data[key] = self.serialize(\n                    data, key, get_sklearn_type(request))\n        
    self.reset()\n\n        elif command == Command.DESERIALIZE:\n            # request is a pymilo-created import object\n            data = None\n            model_type = None\n            if is_inner_model:\n                data = request[\"data\"]\n                model_type = request[\"type\"]\n            else:\n                data = request.data\n                model_type = request.type\n            for key in data:\n                data[key] = self.deserialize(data, key, model_type)\n            self.reset()\n            return\n\n        else:\n            # TODO error handling.\n            return None\n\n    def reset(self):\n        \"\"\"\n        Reset internal data structures of the transport object.\n\n        Some Transporters may be stateful and have internal data structures getting filled during transportation.\n\n        :return: None\n        \"\"\"\n        return\n"
  },
  {
    "path": "pymilo/transporters/tree_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo Tree(from sklearn.tree._tree) object transporter.\"\"\"\nimport numpy as np\nfrom sklearn.tree._tree import Tree\nfrom ..pymilo_param import NUMPY_TYPE_DICT\nfrom .transporter import AbstractTransporter\nfrom ..utils.util import check_str_in_iterable\nfrom .general_data_structure_transporter import GeneralDataStructureTransporter\n\n\nclass TreeTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle (pyi,pyx) Tree object.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize instances of the Tree class by recording the n_features, n_classes and n_outputs fields of tree object.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], Tree):\n            gdst = GeneralDataStructureTransporter()\n            tree = data[key]\n            tree_inner_state = tree.__getstate__()\n            data[key] = {\n                'pymilo-bypass': True,\n                'pymilo-tree': {\n                    'internal_state': {\n                        \"max_depth\": tree_inner_state[\"max_depth\"],\n                        \"node_count\": tree_inner_state[\"node_count\"],\n                        \"nodes\": {\n                            \"types\": [str(np.dtype(i).name) for i in tree_inner_state[\"nodes\"][0]],\n                            \"field-names\": list(tree_inner_state[\"nodes\"][0].dtype.names),\n                            \"values\": [node.tolist() for node in tree_inner_state[\"nodes\"]],\n                        },\n                        \"values\": 
gdst.ndarray_to_list(tree_inner_state[\"values\"]),\n                    },\n                    'n_features': tree.n_features,\n                    'n_classes': gdst.ndarray_to_list(tree.n_classes),\n                    'n_outputs': tree.n_outputs,\n                }\n            }\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize the special tree_ field of the Decision Trees.\n\n        The associated tree_ field of the pymilo serialized model, is extracted through\n        it's previously serialized parameters.\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should traverse over all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated JSON file of the ML model generated by pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if check_str_in_iterable('pymilo-tree', content):\n            gdst = GeneralDataStructureTransporter()\n            tree_params = content['pymilo-tree']\n            tree_internal_state = tree_params[\"internal_state\"]\n            nodes_dtype_spec = []\n\n            for idx, node_type in enumerate(tree_internal_state[\"nodes\"][\"types\"]):\n                nodes_dtype_spec.append(\n                    (tree_internal_state[\"nodes\"][\"field-names\"][idx], NUMPY_TYPE_DICT[\"numpy.\" + node_type]))\n            nodes = [tuple(node)\n                     for node in tree_internal_state[\"nodes\"][\"values\"]]\n  
          nodes = np.array(nodes, dtype=nodes_dtype_spec)\n\n            tree_internal_state = {\n                \"max_depth\": tree_internal_state[\"max_depth\"],\n                \"node_count\": tree_internal_state[\"node_count\"],\n                \"nodes\": nodes,\n                \"values\": gdst.list_to_ndarray(tree_internal_state[\"values\"]),\n            }\n\n            n_classes = np.ndarray(\n                shape=(np.intp(len(tree_params[\"n_classes\"])),), dtype=np.intp)\n            for idx, _ in enumerate(n_classes):\n                n_classes[idx] = tree_params[\"n_classes\"][idx]\n\n            _tree = Tree(\n                tree_params[\"n_features\"],\n                n_classes,\n                tree_params[\"n_outputs\"]\n            )\n            _tree.__setstate__(tree_internal_state)\n            return _tree\n        else:\n            return content\n"
  },
  {
    "path": "pymilo/transporters/treepredictor_transporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo TreePredictor transporter.\"\"\"\nfrom sklearn.ensemble._hist_gradient_boosting.predictor import TreePredictor\nfrom ..utils.util import check_str_in_iterable\nfrom .transporter import AbstractTransporter\nfrom .general_data_structure_transporter import GeneralDataStructureTransporter\n\n\nclass TreePredictorTransporter(AbstractTransporter):\n    \"\"\"Customized PyMilo Transporter developed to handle TreePredictor objects.\"\"\"\n\n    def serialize(self, data, key, model_type):\n        \"\"\"\n        Serialize TreePredictor object[useful in HistGradientBoosting(Regressor,Classifier)].\n\n        serialize the data[key] of the given model which type is model_type.\n        basically in order to fully serialize a model, we should traverse over all the keys of its data dictionary and\n        pass it through the chain of associated transporters to get fully serialized.\n\n        :param data: the internal data dictionary of the given model\n        :type data: dict\n        :param key: the special key of the data param, which we're going to serialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo serialized output of data[key]\n        \"\"\"\n        if isinstance(data[key], TreePredictor):\n            return self.serialize_tree_predictor(data[key])\n        elif isinstance(data[key], list):\n            return self.serialize_possible_inner_tree_predictor(data[key])\n        return data[key]\n\n    def deserialize(self, data, key, model_type):\n        \"\"\"\n        Deserialize previously pymilo serialized TreePredictor object[useful in HistGradientBoosting(Regressor,Classifier)].\n\n        deserialize the data[key] of the given model which type is model_type.\n        basically in order to fully deserialize a model, we should traverse over 
all the keys of its serialized data dictionary and\n        pass it through the chain of associated transporters to get fully deserialized.\n\n        :param data: the internal data dictionary of the associated json file of the ML model which is generated previously by\n        pymilo export.\n        :type data: dict\n        :param key: the special key of the data param, which we're going to deserialize its value(data[key])\n        :type key: object\n        :param model_type: the model type of the ML model, which internal serialized data dictionary is given as the data param\n        :type model_type: str\n        :return: pymilo deserialized output of data[key]\n        \"\"\"\n        content = data[key]\n        if self.is_serialized_treepredictor(content):\n            return self.deserialize_tree_predictor(content)\n        if isinstance(content, list):\n            return self.deserialize_possible_inner_tree_predictor(content)\n        return content\n\n    def is_treepredictor(self, treepredictor):\n        \"\"\"\n        Check if the given object is an instance of TreePredictor class.\n\n        :param treepredictor: given object to check\n        :type treepredictor: any\n\n        :return: bool\n        \"\"\"\n        return isinstance(treepredictor, TreePredictor)\n\n    def is_serialized_treepredictor(self, serialized_treepredictor):\n        \"\"\"\n        Check if the given object is a previously pymilo-serialized TreePredictor.\n\n        :param serialized_treepredictor: given object to check\n        :type serialized_treepredictor: any\n\n        :return: bool\n        \"\"\"\n        return check_str_in_iterable(\n            \"pymiloed-data-structure\",\n            serialized_treepredictor) and serialized_treepredictor[\"pymiloed-data-structure\"] == \"TreePredictor\"\n\n    def serialize_tree_predictor(self, treepredictor):\n        \"\"\"\n        Serialize given Treepredictor instance.\n\n        :param treepredictor: given 
treepredictor to get serialized\n        :type treepredictor: Treepredictor\n\n        :return: dict\n        \"\"\"\n        gdst = GeneralDataStructureTransporter()\n        return {\n            \"pymilo-bypass\": True,\n            \"pymiloed-data-structure\": 'TreePredictor',\n            \"pymiloed-data\": {\n                \"nodes\": gdst.deep_serialize_ndarray(treepredictor.nodes),\n                \"binned_left_cat_bitsets\": gdst.deep_serialize_ndarray(treepredictor.binned_left_cat_bitsets),\n                \"raw_left_cat_bitsets\": gdst.deep_serialize_ndarray(treepredictor.raw_left_cat_bitsets),\n            },\n        }\n\n    def deserialize_tree_predictor(self, serialized_tree_predictor):\n        \"\"\"\n        Deserialize to pure Treepredictor object.\n\n        :param serialized_tree_predictor: pymilo-serialized treepredictor\n        :type serialized_tree_predictor: dict\n\n        :return: Treepredictor\n        \"\"\"\n        gdst = GeneralDataStructureTransporter()\n        nodes = serialized_tree_predictor[\"pymiloed-data\"][\"nodes\"][\"pymiloed-ndarray-list\"]\n        for idx, value in enumerate(nodes):\n            nodes[idx] = tuple(value)\n\n        binned_left_cat_bitsets = serialized_tree_predictor[\"pymiloed-data\"][\"binned_left_cat_bitsets\"]\n        raw_left_cat_bitsets = serialized_tree_predictor[\"pymiloed-data\"][\"raw_left_cat_bitsets\"]\n\n        return TreePredictor(\n            nodes=gdst.deep_deserialize_ndarray(\n                serialized_tree_predictor[\"pymiloed-data\"][\"nodes\"]),\n            binned_left_cat_bitsets=gdst.deep_deserialize_ndarray(\n                binned_left_cat_bitsets),\n            raw_left_cat_bitsets=gdst.deep_deserialize_ndarray(\n                raw_left_cat_bitsets)\n        )\n\n    def serialize_possible_inner_tree_predictor(self, _list):\n        \"\"\"\n        Traverse over list and serialize Treepredictor objects.\n\n        :param _list: given list to serialize inner 
Treepredictor objects\n        :type _list: list\n\n        :return: list\n        \"\"\"\n        for idx, value in enumerate(_list):\n            if self.is_treepredictor(value):\n                _list[idx] = self.serialize_tree_predictor(value)\n            if isinstance(value, list):\n                _list[idx] = self.serialize_possible_inner_tree_predictor(value)\n        return _list\n\n    def deserialize_possible_inner_tree_predictor(self, _list):\n        \"\"\"\n        Traverse over list and deserialize previously pymilo-serialized Treepredictor objects.\n\n        :param _list: given list to deserialize inner Treepredictor objects\n        :type _list: list\n\n        :return: list\n        \"\"\"\n        for idx, value in enumerate(_list):\n            if self.is_serialized_treepredictor(value):\n                _list[idx] = self.deserialize_tree_predictor(value)\n            if isinstance(value, list):\n                _list[idx] = self.deserialize_possible_inner_tree_predictor(value)\n        return _list\n"
  },
  {
    "path": "pymilo/utils/__init__.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"PyMilo utils.\"\"\"\n"
  },
  {
    "path": "pymilo/utils/data_exporter.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"data exporter modules.\"\"\"\nfrom sklearn import datasets\n\n\ndef _split_X_y(X, y, threshold=20):\n    \"\"\"\n    Split X and y into train and test sets.\n\n    :param X: the data\n    :type X: list or np.ndarray\n    :param y: the targets\n    :type y: list or np.ndarray\n    :param threshold: number of trailing samples held out as the test set\n    :type threshold: int\n    :return: X train, y train, X test, y test\n    \"\"\"\n    X_train, X_test = X[:-threshold], X[-threshold:]\n    y_train, y_test = y[:-threshold], y[-threshold:]\n    return X_train, y_train, X_test, y_test\n\n\ndef prepare_simple_classification_datasets(threshold=50):\n    \"\"\"\n    Generate a dataset for classification (Breast Cancer Wisconsin).\n\n    :param threshold: number of trailing samples held out as the test set\n    :type threshold: int\n    :return: split dataset for classification\n    \"\"\"\n    cancer_X, cancer_y = datasets.load_breast_cancer(return_X_y=True)\n    return _split_X_y(cancer_X, cancer_y, threshold)\n\n\ndef prepare_simple_regression_datasets(threshold=20):\n    \"\"\"\n    Generate a dataset for regression (the diabetes dataset).\n\n    :param threshold: number of trailing samples held out as the test set\n    :type threshold: int\n    :return: split dataset for regression\n    \"\"\"\n    diabetes_X, diabetes_y = datasets.load_diabetes(return_X_y=True)\n    return _split_X_y(diabetes_X, diabetes_y, threshold)\n\n\ndef prepare_simple_clustering_datasets():\n    \"\"\"\n    Generate a dataset for clustering (the Iris dataset).\n\n    :return: dataset for clustering\n    \"\"\"\n    # Load the Iris dataset\n    iris = datasets.load_iris()\n    # Access the features and target\n    X = iris.data  # Features\n    y = iris.target  # Target (labels)\n    return X, y\n"
  },
  {
    "path": "pymilo/utils/test_pymilo.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"pymilo test modules.\"\"\"\nimport os\nimport copy\nfrom numpy import array_equal\n\nfrom ..pymilo_obj import Export\nfrom ..pymilo_obj import Import\nfrom ..chains.util import get_transporter\nfrom ..pymilo_func import compare_model_outputs\nfrom ..pymilo_param import EXPORTED_MODELS_PATH\n\nfrom sklearn.metrics import mean_squared_error, r2_score\nfrom sklearn.metrics import accuracy_score, hinge_loss\n\n\ndef pymilo_export_path(model):\n    \"\"\"\n    Return the associated folder name to save the json file generated by pymilo Export(applied to the given model).\n\n    :param model: given model\n    :type model: any sklearn's model class\n    :return: folder name\n    \"\"\"\n    model_type, _ = get_transporter(model)\n    return EXPORTED_MODELS_PATH[model_type]\n\n\ndef pymilo_test(model, model_name):\n    \"\"\"\n    Return the pymilo imported model's outputs for given test_data.\n\n    :param model: given model\n    :type model: any sklearn's model class\n    :param model_name: model name\n    :type model_name: str\n    :return: imported model's output\n    \"\"\"\n    export_model_path = pymilo_export_path(model)\n    exported_model = Export(model)\n    exported_model_serialized_path = os.path.join(\n        os.getcwd(), \"tests\", export_model_path, model_name + '.json')\n    exported_model.save(exported_model_serialized_path)\n\n    imported_model = Import(exported_model_serialized_path)\n    imported_sklearn_model = imported_model.to_model()\n    return imported_sklearn_model\n\n\ndef pymilo_regression_test(regressor, model_name, test_data):\n    \"\"\"\n    Test the package's main structure in regression task.\n\n    :param regressor: the given regressor model\n    :type regressor: any valid sklearn's regressor class\n    :param model_name: model name\n    :type model_name: str\n    :param test_data: data for testing\n    :type test_data: np.ndarray or list\n    :return: True if the test succeed\n    
\"\"\"\n    x_test, y_test = test_data\n    pre_pymilo_model_y_pred = regressor.predict(x_test)\n    pre_pymilo_model_prediction_output = {\n        \"mean-error\": mean_squared_error(y_test, pre_pymilo_model_y_pred),\n        \"r2-score\": r2_score(y_test, pre_pymilo_model_y_pred)\n    }\n    post_pymilo_model_y_pred = pymilo_test(regressor, model_name).predict(x_test)\n    post_pymilo_model_prediction_outputs = {\n        \"mean-error\": mean_squared_error(y_test, post_pymilo_model_y_pred),\n        \"r2-score\": r2_score(y_test, post_pymilo_model_y_pred)\n    }\n    comparison_result = compare_model_outputs(\n        pre_pymilo_model_prediction_output,\n        post_pymilo_model_prediction_outputs)\n    report_status(comparison_result, model_name)\n    return comparison_result\n\n\ndef pymilo_classification_test(classifier, model_name, test_data):\n    \"\"\"\n    Test the package's main structure in classification task.\n\n    :param classifier: the given classifier model\n    :type classifier: any valid sklearn's classifier class\n    :param model_name: model name\n    :type model_name: str\n    :param test_data: data for testing\n    :type test_data: np.ndarray or list\n    :return: True if the test succeed\n    \"\"\"\n    x_test, y_test = test_data\n    pre_pymilo_model_y_pred = classifier.predict(x_test)\n    pre_pymilo_model_prediction_output = {\n        \"accuracy-score\": accuracy_score(y_test, pre_pymilo_model_y_pred),\n        \"hinge-loss\": hinge_loss(y_test, pre_pymilo_model_y_pred)\n    }\n    post_pymilo_model_y_pred = pymilo_test(classifier, model_name).predict(x_test)\n    post_pymilo_model_prediction_outputs = {\n        \"accuracy-score\": accuracy_score(y_test, post_pymilo_model_y_pred),\n        \"hinge-loss\": hinge_loss(y_test, post_pymilo_model_y_pred)\n    }\n    comparison_result = compare_model_outputs(\n        pre_pymilo_model_prediction_output,\n        post_pymilo_model_prediction_outputs)\n    report_status(comparison_result, 
model_name)\n    return comparison_result\n\n\ndef pymilo_clustering_test(clusterer, model_name, x_test, support_prediction=False):\n    \"\"\"\n    Test the package's main structure in clustering task.\n\n    :param clusterer: the given clusterer model\n    :type clusterer: any valid sklearn's clusterer class\n    :param model_name: model name\n    :type model_name: str\n    :param x_test: data for testing\n    :type x_test: np.ndarray or list\n    :param support_prediction: whether the given clusterer supports the .predict function or not\n    :type support_prediction: boolean\n    :return: True if the test succeeds\n    \"\"\"\n    pre_pymilo_model = copy.deepcopy(clusterer)\n    post_pymilo_model = pymilo_test(clusterer, model_name)\n    if support_prediction:\n        pre_pymilo_model_y_pred = pre_pymilo_model.predict(x_test)\n        post_pymilo_model_y_pred = post_pymilo_model.predict(x_test)\n        mse = ((post_pymilo_model_y_pred - pre_pymilo_model_y_pred)**2).mean(axis=0)\n        epsilon_error = 10**(-8)\n        return report_status(mse < epsilon_error, model_name)\n    else:\n        # TODO, apply peer to peer\n        # Evaluation: peer to peer field type & value check\n        return report_status(True, model_name)\n\n\ndef pymilo_nearest_neighbor_test(nearest_neighbor, model_name, test_data):\n    \"\"\"\n    Test the package's main structure in nearest neighbor task.\n\n    :param nearest_neighbor: the given nearest neighbor model\n    :type nearest_neighbor: sklearn's nearest neighbor class\n    :param model_name: model name\n    :type model_name: str\n    :param test_data: data for testing\n    :type test_data: np.ndarray or list\n    :return: True if the test succeeds\n    \"\"\"\n    x_test, _ = test_data\n    pre_pymilo_kneighbors = nearest_neighbor.kneighbors([x_test[0]], 3, return_distance=True)\n    post_pymilo_nearest_neighbor = pymilo_test(nearest_neighbor, model_name)\n    post_pymilo_kneighbors = 
post_pymilo_nearest_neighbor.kneighbors([x_test[0]], 3, return_distance=True)\n    return report_status(array_equal(pre_pymilo_kneighbors, post_pymilo_kneighbors), model_name)\n\n\ndef report_status(result, model_name):\n    \"\"\"\n    Print status for each model and return the given result.\n\n    :param result: the test result\n    :type result: bool\n    :param model_name: model name\n    :type model_name: str\n    :return: the given test result\n    \"\"\"\n    if result:\n        print('Pymilo Test for Model: ' + model_name + ' succeeded.')\n    else:\n        print('Pymilo Test for Model: ' + model_name + ' failed.')\n    return result\n"
  },
  {
    "path": "pymilo/utils/util.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"utility module.\"\"\"\nimport requests\nimport importlib\nfrom inspect import signature\nfrom ..pymilo_param import DOWNLOAD_MODEL_FAILED, INVALID_DOWNLOADED_MODEL, SKLEARN_SUPPORTED_CATEGORIES\n\n\ndef get_sklearn_type(model):\n    \"\"\"\n    Return sklearn model type.\n\n    :param model: sklearn model\n    :type model: any sklearn's model class\n    :return: model type as str\n    \"\"\"\n    raw_type = type(model)\n    return str(raw_type).split(\".\")[-1][:-2]\n\n\ndef is_primitive(obj):\n    \"\"\"\n    Check if the given object is primitive.\n\n    :param obj: given object\n    :type obj: any valid type\n    :return: True if object is primitive\n    \"\"\"\n    if isinstance(obj, dict):\n        return False\n    return not hasattr(obj, '__dict__')\n\n\ndef is_iterable(obj):\n    \"\"\"\n    Check if the given object is iterable.\n\n    :param obj: given object\n    :type obj: any valid type\n    :return: True if object is iterable\n    \"\"\"\n    try:\n        iter(obj)\n        return True\n    except TypeError:\n        return False\n\n\ndef check_str_in_iterable(field, content):\n    \"\"\"\n    Check if the specified string field exists in content, which is supposed to be a dictionary.\n\n    :param field: given string field\n    :type field: str\n    :param content: given supposed to be a dictionary\n    :type content: obj\n    :return: True if associated field is an iterable string in content and False otherwise.\n    \"\"\"\n    if isinstance(content, dict):\n        return field in content\n    else:\n        return False\n\n\ndef get_homogeneous_type(seq):\n    \"\"\"\n    Check if the given sequence's inner items have the same type or not and if they do, return the associated type.\n\n    :param seq: given sequence\n    :type seq: sequence\n\n    :return: Tuple of (True, inner_type) or (False, None)\n    \"\"\"\n    iseq = iter(seq)\n    first_type = type(next(iseq))\n    if all((isinstance(x, 
first_type)) for x in iseq):\n        return True, first_type\n    else:\n        return False, None\n\n\ndef all_same(arr):\n    \"\"\"\n    Check if the given array's items are the same or not.\n\n    :param arr: given array\n    :type arr: array\n\n    :return: bool\n    \"\"\"\n    return all(x == arr[0] for x in arr)\n\n\ndef import_function(module_name, function_name):\n    \"\"\"\n    Import function with name function_name from module called module_name.\n\n    :param module_name: module to import the function from\n    :type module_name: str\n    :param function_name: function's name to get imported\n    :type function_name: str\n\n    :return: function\n    \"\"\"\n    module = importlib.import_module(module_name)\n    function = getattr(module, function_name)\n    return function\n\n\ndef has_named_parameter(func, param_name):\n    \"\"\"\n    Check whether the given function has a parameter named param_name or not.\n\n    :param func: function to check its params\n    :type func: function\n    :param param_name: parameter's name\n    :type param_name: str\n\n    :return: boolean\n    \"\"\"\n    _signature = signature(func)\n    parameter_names = [p.name for p in _signature.parameters.values()]\n    return param_name in parameter_names\n\n\ndef prefix_list(list1, list2):\n    \"\"\"\n    Check whether list2 is a prefix of list1.\n\n    :param list1: outer list\n    :type list1: list\n    :param list2: inner list\n    :type list2: list\n\n    :return: boolean\n    \"\"\"\n    if len(list1) < len(list2):\n        return False\n    return all(list1[j] == list2[j] for j in range(len(list2)))\n\n\ndef download_model(url):\n    \"\"\"\n    Download the model from the given url.\n\n    :param url: url to exported JSON file\n    :type url: str\n\n    :return: obj\n    \"\"\"\n    s = requests.Session()\n    retries = requests.adapters.Retry(\n        total=5,\n        backoff_factor=0.1,\n        status_forcelist=[500, 502, 503, 504]\n    
)\n    s.mount('http://', requests.adapters.HTTPAdapter(max_retries=retries))\n    s.mount('https://', requests.adapters.HTTPAdapter(max_retries=retries))\n    try:\n        response = s.get(url)\n    except Exception:\n        raise Exception(DOWNLOAD_MODEL_FAILED)\n    try:\n        if response.status_code == 200:\n            return response.json()\n    except ValueError:\n        raise Exception(INVALID_DOWNLOADED_MODEL)\n\n\ndef get_sklearn_class(model_name):\n    \"\"\"\n    Return the sklearn class of the requested model name.\n\n    :param model_name: model name\n    :type model_name: str\n\n    :return: sklearn ML model class\n    \"\"\"\n    for _, category_models in SKLEARN_SUPPORTED_CATEGORIES.items():\n        if model_name in category_models:\n            return category_models[model_name]\n    return None\n"
  },
  {
    "path": "requirements.txt",
    "content": "art>=1.8\nnumpy>=1.9.0\nrequests>=2.0.0\nscikit-learn>=0.22.2\nscipy>=0.19.1\n"
  },
  {
    "path": "setup.py",
    "content": "# -*- coding: utf-8 -*-\n\"\"\"Setup module.\"\"\"\ntry:\n    from setuptools import setup, find_packages\nexcept ImportError:\n    from distutils.core import setup\n\nINSTALLATION_MODES = {\n    'core': 'requirements.txt',\n    'streaming': 'streaming-requirements.txt',\n}\n\n\ndef get_requires(mode='core'):\n    \"\"\"Read associated requirements to install.\"\"\"\n    reqs_path = INSTALLATION_MODES[mode]\n    with open(reqs_path, \"r\") as reqs_file:\n        requirements = reqs_file.read()\n    return list(filter(lambda x: x != \"\", requirements.split()))\n\n\ndef read_description():\n    \"\"\"Read README.md and CHANGELOG.md.\"\"\"\n    try:\n        with open(\"README.md\") as r:\n            description = \"\\n\"\n            description += r.read()\n        with open(\"CHANGELOG.md\") as c:\n            description += \"\\n\"\n            description += c.read()\n        return description\n    except Exception:\n        return '''PyMilo: Python for ML I/O'''\n\n\nsetup(\n    name='pymilo',\n    packages=find_packages(include=['pymilo*'], exclude=['tests*']),\n    version='1.6',\n    description='PyMilo: Python for ML I/O',\n    long_description=read_description(),\n    long_description_content_type='text/markdown',\n    author='PyMilo Development Team',\n    author_email='pymilo@openscilab.com',\n    url='https://github.com/openscilab/pymilo',\n    download_url='https://github.com/openscilab/pymilo/tarball/v1.6',\n    keywords=\"machine_learning ml ai mlops model export import\",\n    project_urls={\n            'Source': 'https://github.com/openscilab/pymilo',\n    },\n    install_requires=get_requires(),\n    extras_require={\n        'streaming': get_requires(mode='streaming'),\n    },\n    python_requires='>=3.7',\n    classifiers=[\n        'Development Status :: 3 - Alpha',\n        'Natural Language :: English',\n        'License :: OSI Approved :: MIT License',\n        'Operating System :: OS Independent',\n        'Programming Language :: Python :: 3.7',\n        
'Programming Language :: Python :: 3.8',\n        'Programming Language :: Python :: 3.9',\n        'Programming Language :: Python :: 3.10',\n        'Programming Language :: Python :: 3.11',\n        'Programming Language :: Python :: 3.12',\n        'Programming Language :: Python :: 3.13',\n        'Programming Language :: Python :: 3.14',\n        'Intended Audience :: Developers',\n        'Intended Audience :: Education',\n        'Intended Audience :: End Users/Desktop',\n        'Intended Audience :: Manufacturing',\n        'Intended Audience :: Science/Research',\n        'Topic :: Scientific/Engineering :: Information Analysis',\n        'Topic :: Education',\n        'Topic :: Scientific/Engineering',\n        'Topic :: Scientific/Engineering :: Artificial Intelligence',\n        'Topic :: Scientific/Engineering :: Human Machine Interfaces',\n        'Topic :: Scientific/Engineering :: Mathematics',\n        'Topic :: Scientific/Engineering :: Physics',\n    ],\n    license='MIT',\n    entry_points={\n            'console_scripts': [\n                'pymilo = pymilo.__main__:main',\n            ]\n    }\n)\n"
  },
  {
    "path": "streaming-requirements.txt",
    "content": "uvicorn>=0.14.0\nfastapi>=0.68.0\npydantic>=1.5.0\nwebsockets>=9.0\n"
  },
  {
    "path": "tests/test_clusterings/affinity_propagation.py",
    "content": "from sklearn.cluster import AffinityPropagation\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Affinity Propagation\"\n\ndef affinity_propagation():\n    x, y = prepare_simple_clustering_datasets()\n    affinity_propagation = AffinityPropagation(random_state=5).fit(x, y)\n    pymilo_clustering_test(affinity_propagation, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/birch.py",
    "content": "from sklearn.cluster import Birch\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Birch\"\n\ndef birch():\n    x, y = prepare_simple_clustering_datasets()\n    birch = Birch().fit(x, y)\n    pymilo_clustering_test(birch, MODEL_NAME, x, True)\n"
  },
  {
    "path": "tests/test_clusterings/bisecting_kmeans.py",
    "content": "from sklearn.cluster import BisectingKMeans\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Bisecting KMeans\"\n\ndef bisecting_kmeans():\n    x, y = prepare_simple_clustering_datasets()\n    bisecting_kmeans = BisectingKMeans(n_clusters=3, random_state=0).fit(x, y)\n    pymilo_clustering_test(bisecting_kmeans, MODEL_NAME, x, True)\n"
  },
  {
    "path": "tests/test_clusterings/dbscan.py",
    "content": "from sklearn.cluster import DBSCAN\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"DBSCAN\"\n\ndef dbscan():\n    x, y = prepare_simple_clustering_datasets()\n    dbscan = DBSCAN(eps=3, min_samples=2).fit(x, y)\n    pymilo_clustering_test(dbscan, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/gaussian_mixture/bayesian_gaussian_mixture.py",
    "content": "from sklearn.mixture import BayesianGaussianMixture\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Bayesian Gaussian Mixture\"\n\ndef bayesian_gaussian_mixture():\n    x, y = prepare_simple_clustering_datasets()\n    bayesian_gaussian_mixture = BayesianGaussianMixture(n_components=2, random_state=42).fit(x, y)\n    pymilo_clustering_test(bayesian_gaussian_mixture, MODEL_NAME, x, True)\n"
  },
  {
    "path": "tests/test_clusterings/gaussian_mixture/gaussian_mixture.py",
    "content": "from sklearn.mixture import GaussianMixture\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Gaussian Mixture\"\n\ndef gaussian_mixture():\n    x, y = prepare_simple_clustering_datasets()\n    gaussian_mixture = GaussianMixture(n_components=2, random_state=0).fit(x, y)\n    pymilo_clustering_test(gaussian_mixture, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/hdbscan.py",
    "content": "from sklearn.cluster import HDBSCAN\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"HDBSCAN\"\n\ndef hdbscan():\n    x, y = prepare_simple_clustering_datasets()\n    hdbscan = HDBSCAN(min_cluster_size=20).fit(x, y)\n    pymilo_clustering_test(hdbscan, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/hierarchical_clustering/agglomerative_clustering.py",
    "content": "from sklearn.cluster import AgglomerativeClustering\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Agglomerative Clustering\"\n\ndef agglomerative_clustering():\n    x, y = prepare_simple_clustering_datasets()\n    agglomerative_clustering = AgglomerativeClustering().fit(x, y)\n    pymilo_clustering_test(agglomerative_clustering, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/hierarchical_clustering/feature_agglomeration.py",
    "content": "from sklearn.cluster import FeatureAgglomeration\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Feature Agglomeration\"\n\ndef feature_agglomeration():\n    x, y = prepare_simple_clustering_datasets()\n    feature_agglomeration = FeatureAgglomeration(n_clusters=2).fit(x, y)\n    pymilo_clustering_test(feature_agglomeration, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/kmeans.py",
    "content": "from sklearn.cluster import KMeans\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Kmeans\"\n\ndef kmeans():\n    x, y = prepare_simple_clustering_datasets()\n    kmeans = KMeans(n_clusters=2, random_state=0).fit(x, y)\n    pymilo_clustering_test(kmeans, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/mean_shift.py",
    "content": "from sklearn.cluster import MeanShift\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Mean Shift\"\n\ndef mean_shift():\n    x, y = prepare_simple_clustering_datasets()\n    mean_shift = MeanShift(bandwidth=2).fit(x, y)\n    pymilo_clustering_test(mean_shift, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/minibatch_kmeans.py",
    "content": "from sklearn.cluster import MiniBatchKMeans\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"MiniBatch KMeans\"\n\ndef minibatch_kmeans():\n    x, y = prepare_simple_clustering_datasets()\n    minibatch_kmeans = MiniBatchKMeans(n_clusters=2, random_state=2, batch_size=6, max_iter=10).fit(x, y)\n    pymilo_clustering_test(minibatch_kmeans, MODEL_NAME, x, True)\n"
  },
  {
    "path": "tests/test_clusterings/optics.py",
    "content": "from sklearn.cluster import OPTICS\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"OPTICS\"\n\ndef optics():\n    x, y = prepare_simple_clustering_datasets()\n    optics = OPTICS().fit(x, y)\n    pymilo_clustering_test(optics, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/spectral_clustering/spectral_biclustering.py",
    "content": "from sklearn.cluster import SpectralBiclustering\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Spectral Biclustering\"\n\ndef spectral_biclustering():\n    x, y = prepare_simple_clustering_datasets()\n    spectral_biclustering = SpectralBiclustering(n_clusters=2, random_state=0).fit(x, y)\n    pymilo_clustering_test(spectral_biclustering, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/spectral_clustering/spectral_clustering.py",
    "content": "from sklearn.cluster import SpectralClustering\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Spectral Clustering\"\n\ndef spectral_clustering():\n    x, y = prepare_simple_clustering_datasets()\n    spectral_clustering = SpectralClustering(random_state=5).fit(x, y)\n    pymilo_clustering_test(spectral_clustering, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/spectral_clustering/spectral_coclustering.py",
    "content": "from sklearn.cluster import SpectralCoclustering\n\nfrom pymilo.utils.test_pymilo import pymilo_clustering_test\nfrom pymilo.utils.data_exporter import prepare_simple_clustering_datasets\n\nMODEL_NAME = \"Spectral Coclustering\"\n\ndef spectral_coclustering():\n    x, y = prepare_simple_clustering_datasets()\n    spectral_coclustering = SpectralCoclustering(n_clusters=2, random_state=0).fit(x, y)\n    pymilo_clustering_test(spectral_coclustering, MODEL_NAME, x)\n"
  },
  {
    "path": "tests/test_clusterings/test_clusterings.py",
    "content": "import os\nimport pytest\n\nfrom pymilo.pymilo_param import SKLEARN_CLUSTERING_TABLE, NOT_SUPPORTED\n\nfrom birch import birch\nfrom kmeans import kmeans\nfrom minibatch_kmeans import minibatch_kmeans\nfrom affinity_propagation import affinity_propagation\nfrom mean_shift import mean_shift\nfrom dbscan import dbscan\nfrom optics import optics\nfrom spectral_clustering.spectral_clustering import spectral_clustering\nfrom spectral_clustering.spectral_biclustering import spectral_biclustering\nfrom spectral_clustering.spectral_coclustering import spectral_coclustering\nfrom gaussian_mixture.gaussian_mixture import gaussian_mixture\nfrom gaussian_mixture.bayesian_gaussian_mixture import bayesian_gaussian_mixture\nfrom hierarchical_clustering.agglomerative_clustering import agglomerative_clustering\nfrom hierarchical_clustering.feature_agglomeration import feature_agglomeration\n\nbisecting_kmeans_support = SKLEARN_CLUSTERING_TABLE[\"BisectingKMeans\"] != NOT_SUPPORTED\nif bisecting_kmeans_support:\n    from bisecting_kmeans import bisecting_kmeans\n\nhdbscan_support = SKLEARN_CLUSTERING_TABLE[\"HDBSCAN\"] != NOT_SUPPORTED\nif hdbscan_support:\n    from hdbscan import hdbscan\n\nCLUSTERINGS = {\n    \"KMEANS\": [kmeans, bisecting_kmeans if bisecting_kmeans_support else (None,\"BisectingKMeans\"), minibatch_kmeans],\n    \"AFFINITY_PROPAGATION\": [affinity_propagation],\n    \"MEAN_SHIFT\": [mean_shift],\n    \"DBSCAN\": [dbscan, hdbscan if hdbscan_support else (None,\"HDBSCAN\")],\n    \"OPTICS\": [optics],\n    \"SPECTRAL_CLUSTERING\": [spectral_clustering, spectral_biclustering, spectral_coclustering],\n    \"GAUSSIAN_MIXTURE\": [gaussian_mixture, bayesian_gaussian_mixture],\n    \"HIERARCHICAL_CLUSTERING\": [agglomerative_clustering, feature_agglomeration],\n    \"BIRCH\": [birch],\n}\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), 
\"tests\", \"exported_clusterings\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        # construct full file path\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for category in CLUSTERINGS:\n        for model in CLUSTERINGS[category]:\n            if isinstance(model, tuple):\n                func, model_name = model\n                if func is None:\n                    print(\"Model: \" + model_name + \" is not supported in this python version.\")\n                    continue\n            model()\n"
  },
  {
    "path": "tests/test_composes/column_transformer.py",
    "content": "from numpy import array, array_equal\nfrom sklearn.compose import ColumnTransformer\nfrom sklearn.preprocessing import Normalizer, MinMaxScaler\nfrom sklearn.feature_extraction.text import CountVectorizer\nfrom util import get_path, write_and_read\nfrom pymilo.transporters.compose_transporter import ComposeTransporter\nfrom pymilo.utils.test_pymilo import report_status\n\nMODEL_NAME = \"ColumnTransformer\"\n\ndef column_transformer():\n    model = ColumnTransformer(\n        [(\"norm1\", Normalizer(norm='l1'), [0, 1]),\n        (\"norm2\", Normalizer(norm='l1'), slice(2, 4))])\n    X = array([[0., 1., 2., 2.],\n                [1., 1., 0., 1.]])\n    # Normalizer scales each row of X to unit norm. A separate scaling\n    # is applied for the two first and two last elements of each\n    # row independently.\n    pre_result = model.fit_transform(X)\n\n    ct = ComposeTransporter()\n    post_pymilo_ct_model = ct.deserialize_compose_internal_model(\n        write_and_read(\n            ct.serialize_compose_internal_model(model),\n            get_path(MODEL_NAME,1)))\n    post_result = post_pymilo_ct_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n\n\ndef complex_column_transformer():\n    # Use a plain 2D numpy array (no pandas dependency). 
With numpy inputs, select columns by index.\n    X = array(\n        [[\"First item\", 3.0], [\"second one here\", 4.0], [\"Is this the last?\", 5.0]],\n        dtype=object,\n    )\n    ct = ColumnTransformer(\n        [(\"text_preprocess\", CountVectorizer(), 0),\n        (\"num_preprocess\", MinMaxScaler(), [1])])\n    pre_result = ct.fit_transform(X)\n\n    pt = ComposeTransporter()\n    post_pymilo_pre_model = pt.deserialize_compose_internal_model(\n        write_and_read(\n            pt.serialize_compose_internal_model(ct),\n            get_path(MODEL_NAME, 2)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n\n\ndef nested_column_transformer_with_pipeline():\n    # Numeric-only example to keep output dense and easily comparable\n    X = array([[0., 1., 2., 3.],\n               [1., 1., 0., 1.],\n               [2., 0., 1., 0.]])\n\n    # Inner ColumnTransformer working on array indices (post MinMaxScaler)\n    inner_ct = ColumnTransformer([\n        (\"minmax_first_two\", MinMaxScaler(), [0, 1])\n    ])\n\n    # Pipeline used as a transformer inside the outer ColumnTransformer\n    from sklearn.pipeline import Pipeline\n    pipe = Pipeline([\n        (\"scale\", MinMaxScaler()),\n        (\"inner_ct\", inner_ct),\n    ])\n\n    # Outer ColumnTransformer applies the pipeline on all columns\n    outer_ct = ColumnTransformer([\n        (\"pipe\", pipe, slice(0, 4))\n    ])\n\n    pre_result = outer_ct.fit_transform(X)\n\n    pt = ComposeTransporter()\n    post_model = pt.deserialize_compose_internal_model(\n        write_and_read(\n            pt.serialize_compose_internal_model(outer_ct),\n            get_path(MODEL_NAME, 3)))\n    post_result = post_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert 
comparison_result\n"
  },
  {
    "path": "tests/test_composes/test_compose_models.py",
    "content": "import os\nimport pytest\n\nfrom column_transformer import (\n    column_transformer,\n    complex_column_transformer,\n    nested_column_transformer_with_pipeline,\n  )\n\nfrom transformed_target_regressor import (\n    transformed_target_regressor,\n)\n\nCOMPOSE_MODEL_TESTS = [\n    column_transformer,\n    complex_column_transformer,\n    transformed_target_regressor,\n    nested_column_transformer_with_pipeline,\n]\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), \"tests\", \"exported_composes\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        # construct full file path\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for model in COMPOSE_MODEL_TESTS:\n        print(\"Testing model: \", model.__name__)\n        model()\n"
  },
  {
    "path": "tests/test_composes/transformed_target_regressor.py",
    "content": "import numpy as np\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.preprocessing import StandardScaler\nfrom sklearn.linear_model import LinearRegression, SGDRegressor\nfrom sklearn.compose import TransformedTargetRegressor\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\n\nMODEL_NAME = \"TransformedTargetRegressor\"\n\n\ndef transformed_target_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n\n    tt_regressor = TransformedTargetRegressor(regressor=LinearRegression(),\n                                func=np.log, inverse_func=np.exp)\n\n    tt_regressor.fit(x_train, y_train)\n\n    assert pymilo_regression_test(\n        tt_regressor, MODEL_NAME, (x_test, y_test)) == True\n\n\ndef complex_transformed_target_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create SGD Regression object\n    sgd_max_iter = 1000\n    sgd_tol = 1e-3\n    sgd_regression = SGDRegressor(max_iter=sgd_max_iter, tol=sgd_tol)\n    # Train the model using the training sets\n    sgd_regression.fit(x_train, y_train)\n\n    x_scaler = StandardScaler()\n    y_scaler = StandardScaler()\n    ttr = TransformedTargetRegressor(regressor=sgd_regression, transformer=y_scaler)\n\n    pipeline = Pipeline([(\"x_scaler\", x_scaler), (\"ttr\", ttr)])\n    pipeline.fit(x_train, y_train)\n\n    assert pymilo_regression_test(\n        pipeline, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_composes/util.py",
    "content": "import os\nimport json\n\ndef write_and_read(serialized_model, file_addr):\n    with open(file_addr, 'w') as fp:\n        fp.write(json.dumps(serialized_model, indent=4))\n    with open(file_addr, 'r') as fp:\n        return json.load(fp)\n\ndef get_path(model_name, index=None):\n    index = \"\" if index is None else f\"_{index}\"\n    return os.path.join(os.getcwd(), \"tests\", \"exported_composes\", model_name + index + \".json\")\n"
  },
  {
    "path": "tests/test_cross_decomposition/cca.py",
    "content": "from sklearn.cross_decomposition import CCA\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"CCA\"\n\ndef cca():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    cca = CCA(n_components=1).fit(x_train, y_train)\n    pymilo_regression_test(cca, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_cross_decomposition/pls_canonical.py",
    "content": "from sklearn.cross_decomposition import PLSCanonical\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"PLSCanonical\"\n\ndef pls_canonical():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    pls_canonical = PLSCanonical(n_components=1).fit(x_train, y_train)\n    pymilo_regression_test(pls_canonical, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_cross_decomposition/pls_regression.py",
    "content": "from sklearn.cross_decomposition import PLSRegression\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"PLSRegression\"\n\ndef pls_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    pls_regressor = PLSRegression(n_components=2).fit(x_train, y_train)\n    pymilo_regression_test(pls_regressor, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_cross_decomposition/test_cross_decompositions.py",
    "content": "import os\nimport pytest\n\nfrom pls_regression import pls_regressor\nfrom pls_canonical import pls_canonical\nfrom cca import cca\n\nCROSS_DECOMPOSITIONS = {\n    \"PLS_REGRESSION\": [pls_regressor],\n    \"PLS_CANONICAL\": [pls_canonical],\n    \"CCA\": [cca],\n}\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), \"tests\", \"exported_cross_decomposition\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        # construct full file path\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for category in CROSS_DECOMPOSITIONS:\n        for model in CROSS_DECOMPOSITIONS[category]:\n            model()\n"
  },
  {
    "path": "tests/test_decision_trees/decision_tree/decision_tree_classification.py",
    "content": "from sklearn.tree import DecisionTreeClassifier\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"Decision Tree Classifier\"\n\ndef decision_tree_classification():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    # Create Decision Tree Classifier\n    decision_tree_classifier = DecisionTreeClassifier(random_state=1)\n    decision_tree_classifier = decision_tree_classifier.fit(x_train, y_train)\n    assert pymilo_classification_test(\n        decision_tree_classifier, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_decision_trees/decision_tree/decision_tree_regression.py",
    "content": "from sklearn.tree import DecisionTreeRegressor\n\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Decision Tree Regressor\"\n\ndef decision_tree_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Decision Tree Regressor\n    decision_tree_regressor = DecisionTreeRegressor(random_state=1)\n    decision_tree_regressor = decision_tree_regressor.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        decision_tree_regressor, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_decision_trees/extra_tree/extra_tree_classification.py",
    "content": "from sklearn.tree import ExtraTreeClassifier\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"Extra Tree Classifier\"\n\ndef extra_tree_classification():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    # Create Extra Tree Classifier\n    extra_tree_classifier = ExtraTreeClassifier(random_state=1)\n    extra_tree_classifier = extra_tree_classifier.fit(x_train, y_train)\n    assert pymilo_classification_test(\n        extra_tree_classifier, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_decision_trees/extra_tree/extra_tree_regression.py",
    "content": "from sklearn.tree import ExtraTreeRegressor\n\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Extra Tree Regressor\"\n\ndef extra_tree_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Extra Tree Regressor\n    extra_tree_regressor = ExtraTreeRegressor(random_state=0)\n    extra_tree_regressor = extra_tree_regressor.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        extra_tree_regressor, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_decision_trees/test_decision_trees.py",
    "content": "import os\nimport pytest\n\nfrom decision_tree.decision_tree_regression import decision_tree_regression\nfrom decision_tree.decision_tree_classification import decision_tree_classification\nfrom extra_tree.extra_tree_regression import extra_tree_regression\nfrom extra_tree.extra_tree_classification import extra_tree_classification\n\nDECISION_TREES = {\n    \"DECISION_TREE\": [decision_tree_regression, decision_tree_classification],\n    \"EXTRA TREE\": [extra_tree_regression, extra_tree_classification],\n}\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), \"tests\", \"exported_decision_trees\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        # construct full file path\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for category in DECISION_TREES:\n        for model in DECISION_TREES[category]:\n            model()\n"
  },
  {
    "path": "tests/test_ensembles/adaboost/adaboost_classifier.py",
    "content": "from sklearn.ensemble import AdaBoostClassifier\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"AdaBoostClassifier\"\n\ndef adaboost_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    adaboost_classifier = AdaBoostClassifier(n_estimators=100, random_state=0).fit(x_train, y_train)\n    pymilo_classification_test(adaboost_classifier, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/adaboost/adaboost_regressor.py",
    "content": "from sklearn.ensemble import AdaBoostRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"AdaBoostRegressor\"\n\ndef adaboost_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    adaboost_regressor = AdaBoostRegressor(random_state=0, n_estimators=100).fit(x_train, y_train)\n    pymilo_regression_test(adaboost_regressor, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/bagging/bagging_classifier.py",
    "content": "from sklearn.ensemble import BaggingClassifier\nfrom sklearn.svm import SVC\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\nfrom pymilo.utils.util import has_named_parameter\n\nMODEL_NAME = \"BaggingClassifier\"\n\ndef bagging_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    if has_named_parameter(BaggingClassifier, \"estimator\"):\n        bagging_classifier = BaggingClassifier(estimator=SVC(), n_estimators=10, random_state=0).fit(x_train, y_train)\n    else:\n        bagging_classifier = BaggingClassifier(n_estimators=10, random_state=0).fit(x_train, y_train)\n    pymilo_classification_test(bagging_classifier, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/bagging/bagging_regressor.py",
    "content": "from sklearn.ensemble import BaggingRegressor\nfrom sklearn.svm import SVR\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\nfrom pymilo.utils.util import has_named_parameter\n\nMODEL_NAME = \"BaggingRegressor\"\n\ndef bagging_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    if has_named_parameter(BaggingRegressor, \"estimator\"):\n        bagging_regressor = BaggingRegressor(estimator=SVR(), n_estimators=10, random_state=0).fit(x_train, y_train)\n    else:\n        bagging_regressor = BaggingRegressor(n_estimators=10, random_state=0).fit(x_train, y_train)\n    pymilo_regression_test(bagging_regressor, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/extra_trees/extra_trees_classifier.py",
    "content": "from sklearn.ensemble import ExtraTreesClassifier\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"ExtraTreesClassifier\"\n\ndef extra_trees_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    extra_trees_classifier = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(x_train, y_train)\n    pymilo_classification_test(extra_trees_classifier, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/extra_trees/extra_trees_regressor.py",
    "content": "from sklearn.ensemble import ExtraTreesRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"ExtraTreesRegressor\"\n\ndef extra_trees_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    extra_trees_regressor = ExtraTreesRegressor(n_estimators=100, random_state=0).fit(x_train, y_train)\n    pymilo_regression_test(extra_trees_regressor, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/gradient_booster/gradient_booster_classifier.py",
    "content": "from sklearn.ensemble import GradientBoostingClassifier\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"GradientBoostingClassifier\"\n\ndef gradient_booster_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    gradient_booster_classifier = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0).fit(x_train, y_train)\n    pymilo_classification_test(gradient_booster_classifier, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/gradient_booster/gradient_booster_regressor.py",
    "content": "from sklearn.ensemble import GradientBoostingRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"GradientBoostingRegressor\"\n\ndef gradient_booster_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    gradient_booster_regressor = GradientBoostingRegressor(random_state=0).fit(x_train, y_train, sample_weight=1)\n    pymilo_regression_test(gradient_booster_regressor, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/hist_gradient_boosting/hist_gradient_boosting_classifier.py",
    "content": "from sklearn.ensemble import HistGradientBoostingClassifier\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"HistGradientBoostingClassifier\"\n\ndef hist_gradient_boosting_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    hist_gradient_boosting_classifier = HistGradientBoostingClassifier().fit(x_train, y_train)\n    pymilo_classification_test(hist_gradient_boosting_classifier, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/hist_gradient_boosting/hist_gradient_boosting_regressor.py",
    "content": "from sklearn.ensemble import HistGradientBoostingRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"HistGradientBoostingRegressor\"\n\ndef hist_gradient_boosting_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    hist_gradient_boosting_regressor = HistGradientBoostingRegressor().fit(x_train, y_train)\n    pymilo_regression_test(hist_gradient_boosting_regressor, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/isolation_forest.py",
    "content": "from sklearn.ensemble import IsolationForest\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"IsolationForest\"\n\ndef isolation_forest():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    isolation_forest = IsolationForest(random_state=0).fit(x_train, y_train)\n    pymilo_regression_test(isolation_forest, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/pipeline.py",
    "content": "from sklearn.svm import SVC\nfrom sklearn.preprocessing import StandardScaler\nfrom sklearn.pipeline import Pipeline\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"Pipeline\"\n\ndef pipeline():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    pipeline = Pipeline([\n        ('scaler', StandardScaler()),\n        ('svc', SVC())]).fit(x_train, y_train)\n    pymilo_classification_test(pipeline, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/random_forests/random_forest_classifier.py",
    "content": "from sklearn.ensemble import RandomForestClassifier\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"RandomForestClassifier\"\n\ndef random_forest_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    random_forest_classifier = RandomForestClassifier(max_depth=2, random_state=0).fit(x_train, y_train)\n    pymilo_classification_test(random_forest_classifier, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/random_forests/random_forest_regressor.py",
    "content": "from sklearn.ensemble import RandomForestRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"RandomForestRegressor\"\n\ndef random_forest_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    random_forest_regressor = RandomForestRegressor(max_depth=2, random_state=0).fit(x_train, y_train, sample_weight=1)\n    pymilo_regression_test(random_forest_regressor, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/random_trees_embedding.py",
    "content": "from sklearn.ensemble import RandomTreesEmbedding\nfrom pymilo.utils.test_pymilo import pymilo_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"RandomTreesEmbedding\"\n\ndef random_trees_embedding():\n    x_train, y_train, _, _ = prepare_simple_regression_datasets()\n    random_trees_embedding = RandomTreesEmbedding(n_estimators=5, random_state=0, max_depth=1).fit(x_train, y_train)\n    pymilo_test(random_trees_embedding, MODEL_NAME)\n"
  },
  {
    "path": "tests/test_ensembles/stacking/stacking_classifier.py",
    "content": "from sklearn.ensemble import StackingClassifier\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.pipeline import make_pipeline\nfrom sklearn.svm import LinearSVC\nfrom sklearn.linear_model import LogisticRegression\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"StackingClassifier\"\n\ndef stacking_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    estimators = [\n        ('rf', RandomForestClassifier(n_estimators=10, random_state=42)),\n        ('svr', make_pipeline(LinearSVC(random_state=42)))\n        ]\n    stacking_classifier = StackingClassifier(\n        estimators=estimators, final_estimator=LogisticRegression()\n    ).fit(x_train, y_train)\n\n    pymilo_classification_test(stacking_classifier, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/stacking/stacking_regressor.py",
    "content": "from sklearn.ensemble import StackingRegressor\nfrom sklearn.linear_model import RidgeCV\nfrom sklearn.svm import LinearSVR\nfrom sklearn.ensemble import RandomForestRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"StackingRegressor\"\n\ndef stacking_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    estimators = [\n        ('lr', RidgeCV()),\n        ('svr', LinearSVR(random_state=42))]\n    stacking_regressor = StackingRegressor(\n        estimators=estimators,\n        final_estimator=RandomForestRegressor(n_estimators=10,random_state=42)).fit(x_train,y_train)\n    pymilo_regression_test(stacking_regressor, MODEL_NAME,(x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/test_ensembles.py",
    "content": "import os\nimport pytest\n\nfrom adaboost.adaboost_regressor import adaboost_regressor\nfrom adaboost.adaboost_classifier import adaboost_classifier\n\nfrom bagging.bagging_regressor import bagging_regressor\nfrom bagging.bagging_classifier import bagging_classifier\n\nfrom extra_trees.extra_trees_regressor import extra_trees_regressor\nfrom extra_trees.extra_trees_classifier import extra_trees_classifier\n\nfrom gradient_booster.gradient_booster_regressor import gradient_booster_regressor\nfrom gradient_booster.gradient_booster_classifier import gradient_booster_classifier\n\nfrom random_forests.random_forest_regressor import random_forest_regressor\nfrom random_forests.random_forest_classifier import random_forest_classifier\nfrom isolation_forest import isolation_forest\nfrom random_trees_embedding import random_trees_embedding\n\nfrom stacking.stacking_regressor import stacking_regressor\nfrom stacking.stacking_classifier import stacking_classifier\n\nfrom voting.voting_regressor import voting_regressor\nfrom voting.voting_classifier import voting_classifier\n\nfrom pipeline import pipeline\nfrom tests.test_composes.transformed_target_regressor import complex_transformed_target_regressor\n\nfrom pymilo.pymilo_param import SKLEARN_ENSEMBLE_TABLE, NOT_SUPPORTED\n\nif SKLEARN_ENSEMBLE_TABLE[\"HistGradientBoostingRegressor\"] != NOT_SUPPORTED:\n    from hist_gradient_boosting.hist_gradient_boosting_regressor import hist_gradient_boosting_regressor\n    from hist_gradient_boosting.hist_gradient_boosting_classifier import hist_gradient_boosting_classifier\n\nENSEMBLES = {\n    \"Adaboost\": [adaboost_regressor, adaboost_classifier],\n    \"Bagging\": [bagging_regressor, bagging_classifier],\n    \"ExtraTrees\": [extra_trees_regressor, extra_trees_classifier],\n    \"GradientBooster\": [gradient_booster_regressor, gradient_booster_classifier],\n    \"HistGradientBooster\": [\n        hist_gradient_boosting_regressor if 
SKLEARN_ENSEMBLE_TABLE[\"HistGradientBoostingRegressor\"] != NOT_SUPPORTED else (None, \"HistGradientBoostingRegressor\"),\n        hist_gradient_boosting_classifier if SKLEARN_ENSEMBLE_TABLE[\"HistGradientBoostingClassifier\"] != NOT_SUPPORTED else (None, \"HistGradientBoostingClassifier\")\n        ],\n    \"Forests\": [random_forest_regressor, random_forest_classifier, isolation_forest, random_trees_embedding],\n    \"Stacking\": [stacking_regressor, stacking_classifier],\n    \"Voting\": [voting_regressor, voting_classifier],\n    \"Pipeline\": [pipeline, complex_transformed_target_regressor],\n}\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), \"tests\", \"exported_ensembles\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        # construct full file path\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for category in ENSEMBLES:\n        for model in ENSEMBLES[category]:\n            if isinstance(model, tuple):\n                func, model_name = model\n                if func is None:\n                    print(\"Model: \" + model_name + \" is not supported in this Python version.\")\n                    continue\n            model()\n"
  },
  {
    "path": "tests/test_ensembles/voting/voting_classifier.py",
    "content": "from sklearn.linear_model import LogisticRegression\nfrom sklearn.naive_bayes import GaussianNB\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.ensemble import VotingClassifier\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"VotingClassifier\"\n\ndef voting_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    r1 = LogisticRegression(random_state=1)\n    r2 = RandomForestClassifier(n_estimators=50, random_state=1)\n    r3 = GaussianNB()\n    voting_classifier = VotingClassifier([('lr', r1), ('rf', r2), ('r3', r3)], voting='hard').fit(x_train,y_train)\n    pymilo_classification_test(voting_classifier, MODEL_NAME,(x_test, y_test))\n"
  },
  {
    "path": "tests/test_ensembles/voting/voting_regressor.py",
    "content": "from sklearn.linear_model import LinearRegression\nfrom sklearn.ensemble import RandomForestRegressor\nfrom sklearn.ensemble import VotingRegressor\nfrom sklearn.neighbors import KNeighborsRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"VotingRegressor\"\n\ndef voting_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    r1 = LinearRegression()\n    r2 = RandomForestRegressor(n_estimators=10, random_state=1)\n    r3 = KNeighborsRegressor()\n    voting_regressor = VotingRegressor([('lr', r1), ('rf', r2), ('r3', r3)]).fit(x_train,y_train)\n    pymilo_regression_test(voting_regressor, MODEL_NAME,(x_test, y_test))\n"
  },
  {
    "path": "tests/test_exceptions/custom_models.py",
    "content": "from collections import namedtuple\n\nimport numpy as np\n\nDistributionBoundary = namedtuple(\"DistributionBoundary\", (\"value\", \"inclusive\"))\nclass CustomizedTweedieDistribution():\n\n    def __init__(self, power=0):\n        self.power = power\n\n    @property\n    def power(self):\n        return self._power\n\n    @power.setter\n    def power(self, power):\n        self._lower_bound = DistributionBoundary(0, inclusive=True)\n        self._power = power\n\n    def unit_variance(self, y_pred):\n        return np.power(y_pred, self.power)\n\n    def unit_deviance(self, y, y_pred, check_input=False):\n        return (y - y_pred) ** 2\n\n"
  },
  {
    "path": "tests/test_exceptions/export_exceptions.py",
    "content": "# INVALID_MODEL = 1 -> tested.\n# VALID_MODEL_INVALID_INTERNAL_STRUCTURE = 2 -> tested.\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.chains.neural_network_chain import neural_network_chain\nfrom pymilo.transporters.transporter import Command\n\nfrom sklearn.linear_model import LinearRegression\nfrom sklearn.neural_network import MLPRegressor\nfrom custom_models import CustomizedTweedieDistribution\n\nimport numpy as np\n\n# Learning model, but an invalid one.\n# test case for INVALID_MODEL.\nclass InvalidModel:\n    def __init__(self):\n        self.name = \"Invalid Linear Model\"\n\n    def fit(self, x, y):\n        return\n\n    def predict(self, x):\n        return [0 for _ in range(np.shape(x)[0])]\n\ndef invalid_model(print_output=True):\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create the invalid model object\n    model = InvalidModel()\n    # Train the model using the training sets\n    model.fit(x_train, y_train)\n    try:\n        pymilo_regression_test(\n            model, model.name, (x_test, y_test))\n        return False\n    except Exception as e:\n        if print_output: print(\"An Exception occurred\\n\", e)\n        return True\n\ndef valid_model_invalid_structure_linear_model(print_output=True):\n    MODEL_NAME = \"LinearRegression\"\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create linear regression object\n    linear_regression = LinearRegression()\n    # Attach a custom object as an extra field to make the internal structure invalid\n    linear_regression.__dict__[\"invalid_field\"] = CustomizedTweedieDistribution(power=1.5)\n    # Train the model using the training sets\n    linear_regression.fit(x_train, y_train)\n    try:\n        pymilo_regression_test(\n            linear_regression, MODEL_NAME, (x_test, y_test))\n        return False\n    except Exception as e:\n        if print_output: print(\"An Exception occurred\\n\", e)\n        return True\n\ndef 
valid_model_irrelevant_chain(print_output = True):\n    x_train, y_train, _, _ = prepare_simple_regression_datasets()\n    # Create linear regression object\n    linear_regression = LinearRegression()\n    # Train the model using the training sets\n    linear_regression.fit(x_train, y_train)\n    try:\n      neural_network_chain.transport(linear_regression, Command.SERIALIZE)\n      return False\n    except Exception as e:\n      if print_output: print(\"An Exception occured\\n\", e)\n      return True\n\ndef valid_model_invalid_structure_neural_network(print_output = True):\n    MODEL_NAME = \"MLPRegressor\"\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create linear regression object\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Passive Aggressive Regression object\n    multi_layer_perceptron_regression = MLPRegressor(random_state=1, max_iter=500).fit(x_train, y_train)\n    multi_layer_perceptron_regression.__dict__[\"invalid_field\"] = CustomizedTweedieDistribution(power= 1.5)\n    # Train the model using the training sets\n    multi_layer_perceptron_regression.fit(x_train, y_train)\n    try:\n      pymilo_regression_test(\n        multi_layer_perceptron_regression, MODEL_NAME, (x_test, y_test))\n      return False\n    except Exception as e:\n      if print_output: print(\"An Exception occured\\n\", e)\n      return True\n"
  },
  {
    "path": "tests/test_exceptions/import_exceptions.py",
    "content": "# CORRUPTED_JSON_FILE = 1 -> tested.\n# INVALID_MODEL = 2 -> tested.\n# VALID_MODEL_INVALID_INTERNAL_STRUCTURE = 3 -> tested.\nimport os\nfrom pymilo.pymilo_obj import Import\n\n\ndef invalid_json(print_output = True):\n    json_files = [\"corrupted\", \"unknown-model\"]\n    for json_file in json_files:\n      json_path = os.path.join(os.getcwd(), \"tests\", \"test_exceptions\", \"invalid_jsons\", json_file + '.json')\n      try:\n        imported_model = Import(json_path)\n        imported_model.to_model()\n        return False\n      except Exception as e:\n        if print_output: print(\"An Exception occured\\n\", e)\n        return True\n\ndef invalid_url():\n  try:\n    url = \"https://invalid_url\"\n    Import(url=url)\n    return False\n  except Exception:\n    return True\n\ndef valid_url_invalid_file():\n  try:\n    url = \"https://filesamples.com/samples/code/json/sample1.json\"\n    Import(url=url)\n    return False\n  except Exception:\n    return True\n\ndef valid_url_valid_file():\n    url = \"https://raw.githubusercontent.com/openscilab/pymilo/main/tests/test_exceptions/valid_jsons/linear_regression.json\"\n    _ = Import(url=url).to_model()\n    return True\n"
  },
  {
    "path": "tests/test_exceptions/invalid_jsons/corrupted.json",
    "content": "{\n    \"data\": {\n        \"n_iter\": 300,\n        \"tol\": 0.001,\n        \"alpha_1\": 1e-06,\n        \"alpha_2\": 1e-06,\n        \"lambda_1\": 1e-06,\n        \"lambda_2\": 1e-06,\n        \"alpha_init\": null,\n        \"lambda_init\": null,\n        \"compute_score\": false,\n        \"fit_intercept\": true,\n        \"copy_X\": true,\n        \"verbose\": false,\n        \"n_features_in_\": 10,\n        \"X_offset_\": [\n            0.00039047198185187474,\n            -0.00014309199603861062,\n            0.0004729050423879101,\n            0.00031769088082990975,\n            -0.0005321043488653789,\n            -0.000540281996323959,\n            -0.0005080203823721925,\n            9.749742869583833e-05,\n            0.00016354453731930825,\n            -0.0004691415082391016\n        ],\n        \"X_scale_\": [\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0\n        ],\n        \"scores_\": [],\n        \"n_iter_\": 10,\n        \"alpha_\": 0.00033567214754955896,\n        \"lambda_\": 1.1696814565888858e-05,\n        \"coef_\": [\n            6.260169432328754,\n            -223.51864434743877,\n            504.4898155237174,\n            317.6826922409544,\n            -183.93709375460858,\n            -3.0140347144499344,\n            -166.81028226122402,\n            117.74970819732141,\n            490.6339122980116,\n            85.56330705023953\n        ],\n        \"sigma_\": [\n            [\n                3579.602330752307,\n                -360.00990476089714,\n                -39.833887053829145,\n                -677.6426736494097,\n                -192.69161195980587,\n                -333.1369639704782,\n                -291.5207855281649,\n                166.8639609758337,\n                -416.48851363073044,\n                -487.84362180798183\n            ],\n            [\n       
         -360.00990476089686,\n                3738.821765173381,\n                544.655627817642,\n                -674.7534735548464,\n                494.03892752340954,\n                -298.82757068683424,\n                902.917044082385,\n                -682.3225387742524,\n                305.1778851029596,\n                -257.3341683192198\n            ],\n            [\n                -39.83388705382913,\n                544.6556278176421,\n                4338.24999893544,\n                -967.992302745072,\n                366.5816726368117,\n                -714.3457193814772,\n                902.2808250243974,\n                -17.853540520319132,\n                -865.0410004760379,\n                -616.8235023579258\n            ],\n            [\n                -677.6426736494096,\n                -674.7534735548462,\n                -967.9923027450723,\n                4204.47184045613,\n                -351.55865308243597,\n                97.58076887959795,\n                16.213647168832498,\n                605.3948813145282,\n                -778.2934065916294,\n                -709.062846767269\n            ],\n            [\n                -192.69161195980578,\n                494.03892752340954,\n                366.58167263681156,\n                -351.558653082436,\n                35727.450648570455,\n                -26540.29423432095,\n                -15225.494106017897,\n                -5527.399988117267,\n                -12198.04092770419,\n                -124.95284890762083\n            ],\n            [\n                -333.1369639704781,\n                -298.8275706868342,\n                -714.3457193814772,\n                97.58076887959798,\n                -26540.29423432095,\n                26727.310400999704,\n                7445.10478909175,\n                -3536.960212881747,\n                10440.664494101131,\n                -37.504835508766185\n            ],\n            [\n                
-291.52078552816477,\n                902.917044082385,\n                902.2808250243974,\n                16.21364716883258,\n                -15225.4941060179,\n                7445.104789091751,\n                15547.2038315758,\n                11175.055399649176,\n                4141.551253444934,\n                -42.36715192288026\n            ],\n            [\n                166.8639609758336,\n                -682.3225387742525,\n                -17.853540520319097,\n                605.3948813145282,\n                -5527.399988117267,\n                -3536.9602128817464,\n                11175.055399649176,\n                17732.9762426958,\n                -2377.7448150592045,\n                -628.0335721125343\n            ],\n            [\n                -416.4885136307306,\n                305.1778851029597,\n                -865.0410004760382,\n                -778.2934065916294,\n                -12198.04092770419,\n                10440.664494101131,\n                4141.551253444934,\n                -2377.7448150592045,\n                9938.406157954483,\n                -834.1616253656699\n            ],\n            [\n                -487.8436218079817,\n                -257.33416831921977,\n                -616.8235023579258,\n                -709.0628467672691,\n                -124.9528489076208,\n                -37.50483550876613,\n                -42.36715192288027,\n                -628.0335721125343,\n                -834.16162536567,\n                4391.940199555901\n            ]\n        ],\n        \"intercept_\": 152.75280574943835\n    },\n    \"sklearn_version\": \"1.2.0\",\n    \"pymilo_version\": \"0.1\" \n    \"model_type\": \"Unknown-Model\"\n}"
  },
  {
    "path": "tests/test_exceptions/invalid_jsons/unknown-model.json",
    "content": "{\n    \"data\": {\n        \"n_iter\": 300,\n        \"tol\": 0.001,\n        \"alpha_1\": 1e-06,\n        \"alpha_2\": 1e-06,\n        \"lambda_1\": 1e-06,\n        \"lambda_2\": 1e-06,\n        \"alpha_init\": null,\n        \"lambda_init\": null,\n        \"compute_score\": false,\n        \"fit_intercept\": true,\n        \"copy_X\": true,\n        \"verbose\": false,\n        \"n_features_in_\": 10,\n        \"X_offset_\": [\n            0.00039047198185187474,\n            -0.00014309199603861062,\n            0.0004729050423879101,\n            0.00031769088082990975,\n            -0.0005321043488653789,\n            -0.000540281996323959,\n            -0.0005080203823721925,\n            9.749742869583833e-05,\n            0.00016354453731930825,\n            -0.0004691415082391016\n        ],\n        \"X_scale_\": [\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0,\n            1.0\n        ],\n        \"scores_\": [],\n        \"n_iter_\": 10,\n        \"alpha_\": 0.00033567214754955896,\n        \"lambda_\": 1.1696814565888858e-05,\n        \"coef_\": [\n            6.260169432328754,\n            -223.51864434743877,\n            504.4898155237174,\n            317.6826922409544,\n            -183.93709375460858,\n            -3.0140347144499344,\n            -166.81028226122402,\n            117.74970819732141,\n            490.6339122980116,\n            85.56330705023953\n        ],\n        \"sigma_\": [\n            [\n                3579.602330752307,\n                -360.00990476089714,\n                -39.833887053829145,\n                -677.6426736494097,\n                -192.69161195980587,\n                -333.1369639704782,\n                -291.5207855281649,\n                166.8639609758337,\n                -416.48851363073044,\n                -487.84362180798183\n            ],\n            [\n       
         -360.00990476089686,\n                3738.821765173381,\n                544.655627817642,\n                -674.7534735548464,\n                494.03892752340954,\n                -298.82757068683424,\n                902.917044082385,\n                -682.3225387742524,\n                305.1778851029596,\n                -257.3341683192198\n            ],\n            [\n                -39.83388705382913,\n                544.6556278176421,\n                4338.24999893544,\n                -967.992302745072,\n                366.5816726368117,\n                -714.3457193814772,\n                902.2808250243974,\n                -17.853540520319132,\n                -865.0410004760379,\n                -616.8235023579258\n            ],\n            [\n                -677.6426736494096,\n                -674.7534735548462,\n                -967.9923027450723,\n                4204.47184045613,\n                -351.55865308243597,\n                97.58076887959795,\n                16.213647168832498,\n                605.3948813145282,\n                -778.2934065916294,\n                -709.062846767269\n            ],\n            [\n                -192.69161195980578,\n                494.03892752340954,\n                366.58167263681156,\n                -351.558653082436,\n                35727.450648570455,\n                -26540.29423432095,\n                -15225.494106017897,\n                -5527.399988117267,\n                -12198.04092770419,\n                -124.95284890762083\n            ],\n            [\n                -333.1369639704781,\n                -298.8275706868342,\n                -714.3457193814772,\n                97.58076887959798,\n                -26540.29423432095,\n                26727.310400999704,\n                7445.10478909175,\n                -3536.960212881747,\n                10440.664494101131,\n                -37.504835508766185\n            ],\n            [\n                
-291.52078552816477,\n                902.917044082385,\n                902.2808250243974,\n                16.21364716883258,\n                -15225.4941060179,\n                7445.104789091751,\n                15547.2038315758,\n                11175.055399649176,\n                4141.551253444934,\n                -42.36715192288026\n            ],\n            [\n                166.8639609758336,\n                -682.3225387742525,\n                -17.853540520319097,\n                605.3948813145282,\n                -5527.399988117267,\n                -3536.9602128817464,\n                11175.055399649176,\n                17732.9762426958,\n                -2377.7448150592045,\n                -628.0335721125343\n            ],\n            [\n                -416.4885136307306,\n                305.1778851029597,\n                -865.0410004760382,\n                -778.2934065916294,\n                -12198.04092770419,\n                10440.664494101131,\n                4141.551253444934,\n                -2377.7448150592045,\n                9938.406157954483,\n                -834.1616253656699\n            ],\n            [\n                -487.8436218079817,\n                -257.33416831921977,\n                -616.8235023579258,\n                -709.0628467672691,\n                -124.9528489076208,\n                -37.50483550876613,\n                -42.36715192288027,\n                -628.0335721125343,\n                -834.16162536567,\n                4391.940199555901\n            ]\n        ],\n        \"intercept_\": 152.75280574943835\n    },\n    \"sklearn_version\": \"1.2.0\",\n    \"pymilo_version\": \"0.1\",\n    \"model_type\": \"Unknown-Model\"\n}"
  },
  {
    "path": "tests/test_exceptions/test_exceptions.py",
    "content": "from export_exceptions import invalid_model\nfrom export_exceptions import valid_model_invalid_structure_linear_model\nfrom export_exceptions import valid_model_invalid_structure_neural_network\nfrom export_exceptions import valid_model_irrelevant_chain\n\nfrom import_exceptions import invalid_json, invalid_url, valid_url_invalid_file, valid_url_valid_file\n\nEXCEPTION_TESTS = {\n    'IMPORT': [\n        invalid_json,\n        invalid_url,\n        valid_url_invalid_file,\n        valid_url_valid_file,\n        ],\n    'EXPORT': [\n        invalid_model,\n        valid_model_invalid_structure_linear_model,\n        valid_model_invalid_structure_neural_network,\n        valid_model_irrelevant_chain\n        ]\n}\n\ndef test_full():\n    for category in EXCEPTION_TESTS:\n        category_all_test_pass = True\n        for test in EXCEPTION_TESTS[category]:\n            category_all_test_pass = category_all_test_pass and test()\n            assert category_all_test_pass == True\n            print(\"Test of Category: \" + category + \" with granularity of: \" + test.__name__ + \" executed successfully.\" )"
  },
  {
    "path": "tests/test_exceptions/valid_jsons/linear_regression.json",
    "content": "{\n    \"data\": {\n        \"fit_intercept\": true,\n        \"copy_X\": true,\n        \"n_jobs\": null,\n        \"positive\": false,\n        \"n_features_in_\": 10,\n        \"coef_\": {\n            \"pymiloed-ndarray-list\": [\n                0.30609424754267966,\n                -237.63557011300716,\n                510.53804765114097,\n                327.7298779909887,\n                -814.1119263534517,\n                492.7995945034062,\n                102.84123996793083,\n                184.6034960903708,\n                743.5093875957093,\n                76.09664636971895\n            ],\n            \"pymiloed-ndarray-dtype\": \"float64\",\n            \"pymiloed-ndarray-shape\": [\n                10\n            ],\n            \"pymiloed-data-structure\": \"numpy.ndarray\"\n        },\n        \"rank_\": 10,\n        \"singular_\": {\n            \"pymiloed-ndarray-list\": [\n                1.9578051002417796,\n                1.1797491126040702,\n                1.0755406405377144,\n                0.9579192686906345,\n                0.7980638292867588,\n                0.7594342409324799,\n                0.7216957209064547,\n                0.6459380350140406,\n                0.27271507089040337,\n                0.0915832239699\n            ],\n            \"pymiloed-ndarray-dtype\": \"float64\",\n            \"pymiloed-ndarray-shape\": [\n                10\n            ],\n            \"pymiloed-data-structure\": \"numpy.ndarray\"\n        },\n        \"intercept_\": {\n            \"value\": 152.76429169049118,\n            \"np-type\": \"numpy.float64\"\n        }\n    },\n    \"sklearn_version\": \"1.3.0\",\n    \"pymilo_version\": \"0.9\",\n    \"model_type\": \"LinearRegression\"\n}"
  },
  {
    "path": "tests/test_feature_extraction/count_vectorizer.py",
    "content": "from numpy import array_equal\nfrom util import get_path, write_and_read\nfrom pymilo.utils.test_pymilo import report_status\nfrom sklearn.feature_extraction.text import CountVectorizer\nfrom pymilo.transporters.feature_extraction_transporter import FeatureExtractorTransporter\n\nMODEL_NAME = \"CountVectorizer\"\n\ndef count_vectorizer():\n    corpus = [\n        'This is the first document.',\n        'This document is the second document.',\n        'And this is the third one.',\n        'Is this the first document?',\n    ]\n    cv = CountVectorizer(analyzer='word', ngram_range=(2, 2))\n    X = cv.fit_transform(corpus)\n    pre_result = X.toarray()\n\n    fe = FeatureExtractorTransporter()\n    post_pymilo_pre_model = fe.deserialize_fe_module(\n        write_and_read(\n            fe.serialize_fe_module(cv),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.fit_transform(corpus).toarray()\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_feature_extraction/dict_vectorizer.py",
    "content": "from numpy import array_equal\nfrom sklearn.feature_extraction import DictVectorizer\nfrom pymilo.transporters.feature_extraction_transporter import FeatureExtractorTransporter\nfrom pymilo.utils.test_pymilo import report_status\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"DictVectorizer\"\n\ndef dict_vectorizer():\n    v = DictVectorizer(sparse=False)\n    D = [{'foo': 1, 'bar': 2}, {'foo': 3, 'baz': 1}]\n    _ = v.fit_transform(D)\n\n    pre_result = v.transform({'foo': 4, 'unseen_feature': 3})\n\n    fe = FeatureExtractorTransporter()\n    post_pymilo_pre_model = fe.deserialize_fe_module(\n        write_and_read(\n            fe.serialize_fe_module(v),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform({'foo': 4, 'unseen_feature': 3})\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_feature_extraction/feature_hasher.py",
    "content": "from numpy import array_equal\nfrom sklearn.feature_extraction import FeatureHasher\nfrom pymilo.transporters.feature_extraction_transporter import FeatureExtractorTransporter\nfrom pymilo.utils.test_pymilo import report_status\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"FeatureHasher\"\n\ndef feature_hasher():\n    h = FeatureHasher(n_features=10)\n    D = [{'dog': 1, 'cat':2, 'elephant':4},{'dog': 2, 'run': 5}]\n    f = h.transform(D)\n\n    pre_result = f.toarray()\n\n    fe = FeatureExtractorTransporter()\n    post_pymilo_pre_model = fe.deserialize_fe_module(\n        write_and_read(\n            fe.serialize_fe_module(h),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(D).toarray()\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_feature_extraction/hashing_vectorizer.py",
    "content": "from numpy import array_equal\nfrom util import get_path, write_and_read\nfrom pymilo.utils.test_pymilo import report_status\nfrom sklearn.feature_extraction.text import HashingVectorizer\nfrom pymilo.transporters.feature_extraction_transporter import FeatureExtractorTransporter\n\nMODEL_NAME = \"HashingVectorizer\"\n\ndef hashing_vectorizer():\n    corpus = [\n        'This is the first document.',\n        'This document is the second document.',\n        'And this is the third one.',\n        'Is this the first document?',\n    ]\n    hv = HashingVectorizer(n_features=2**4)\n    X = hv.fit_transform(corpus)\n\n    pre_result = X.toarray()\n    fe = FeatureExtractorTransporter()\n    post_pymilo_pre_model = fe.deserialize_fe_module(\n        write_and_read(\n            fe.serialize_fe_module(hv),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.fit_transform(corpus).toarray()\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_feature_extraction/patch_extractor.py",
    "content": "from numpy import array_equal, random\nfrom sklearn.datasets import load_sample_images\nfrom sklearn.feature_extraction import image\nfrom pymilo.transporters.feature_extraction_transporter import FeatureExtractorTransporter\nfrom pymilo.utils.test_pymilo import report_status\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"PatchExtractor\"\n\ndef patch_extractor():\n    X = load_sample_images().images[1]\n    X = X[None, ...]\n    pe = image.PatchExtractor(patch_size=(10, 10), random_state=random.RandomState(42))\n    pre_result = pe.transform(X)\n\n    fe = FeatureExtractorTransporter()\n\n    post_pymilo_pre_model = fe.deserialize_fe_module(\n        write_and_read(\n            fe.serialize_fe_module(pe),\n            get_path(MODEL_NAME)))\n\n    post_result = post_pymilo_pre_model.transform(X)\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_feature_extraction/pipeline.py",
    "content": "from sklearn.feature_extraction.text import TfidfTransformer\nfrom sklearn.feature_extraction.text import CountVectorizer\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.preprocessing import LabelEncoder\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\n\nMODEL_NAME = \"Pipeline\"\n\ndef pipeline():\n    corpus = ['this is the first document',\n            'this document is the second document',\n            'and this is the third one',\n            'is this the first document']\n\n    labels = ['A', 'B', 'A', 'B']\n    label_encoder = LabelEncoder()\n    y = label_encoder.fit_transform(labels)\n\n    X_train, X_test, y_train, y_test = train_test_split(corpus, y, test_size=0.2, random_state=42)\n\n    pipe = Pipeline([\n        ('count', CountVectorizer(vocabulary=['this', 'document', 'first', 'is', 'second', 'the', 'and', 'one'])),\n        ('tfid', TfidfTransformer()),\n        ('clf', LogisticRegression())\n    ])\n\n    pipe.fit(X_train, y_train)\n    pymilo_classification_test(pipe, MODEL_NAME, (X_test, y_test))\n"
  },
  {
    "path": "tests/test_feature_extraction/test_feature_extractions.py",
    "content": "import os\nimport pytest\nfrom count_vectorizer import count_vectorizer\nfrom dict_vectorizer import dict_vectorizer\nfrom feature_hasher import feature_hasher\nfrom hashing_vectorizer import hashing_vectorizer\nfrom patch_extractor import patch_extractor\nfrom tfidf_transformer import tfidf_transformer\nfrom tfidf_vectorizer import tfidf_vectorizer\n\nFEATURE_EXTRACTIONS = [\n    count_vectorizer,\n    dict_vectorizer,\n    feature_hasher,\n    hashing_vectorizer,\n    patch_extractor,\n    tfidf_transformer,\n    tfidf_vectorizer,\n]\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), \"tests\", \"exported_feature_extraction\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for model in FEATURE_EXTRACTIONS:\n        if isinstance(model, tuple):\n            func, model_name = model\n            if func == None:\n                print(\"Model: \" + model_name + \" is not supported in this python version.\")\n                continue\n        model()\n"
  },
  {
    "path": "tests/test_feature_extraction/tfidf_transformer.py",
    "content": "from numpy import array_equal\nfrom util import get_path, write_and_read\nfrom pymilo.utils.test_pymilo import report_status\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.feature_extraction.text import CountVectorizer\nfrom sklearn.feature_extraction.text import TfidfTransformer\nfrom pymilo.transporters.feature_extraction_transporter import FeatureExtractorTransporter\n\nMODEL_NAME = \"TfidfTransformer\"\n\ndef tfidf_transformer():\n    corpus = ['this is the first document',\n            'this document is the second document',\n            'and this is the third one',\n            'is this the first document']\n    vocabulary = ['this', 'document', 'first', 'is', 'second', 'the',\n                'and', 'one']\n    pipe = Pipeline([('count', CountVectorizer(vocabulary=vocabulary)),\n                    ('tfid', TfidfTransformer())]).fit(corpus)\n\n    _tfidf = pipe['tfid']\n    pre_result = _tfidf.idf_\n\n    fe = FeatureExtractorTransporter()\n\n    post_pymilo_pre_model = fe.deserialize_fe_module(\n        write_and_read(\n            fe.serialize_fe_module(_tfidf),\n            get_path(MODEL_NAME)))\n\n    post_result = post_pymilo_pre_model.idf_\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_feature_extraction/tfidf_vectorizer.py",
    "content": "from numpy import array_equal\nfrom util import get_path, write_and_read\nfrom pymilo.utils.test_pymilo import report_status\nfrom sklearn.feature_extraction.text import TfidfVectorizer\nfrom pymilo.transporters.feature_extraction_transporter import FeatureExtractorTransporter\n\nMODEL_NAME = \"TfidfVectorizer\"\n\ndef tfidf_vectorizer():\n    corpus = [\n        'This is the first document.',\n        'This document is the second document.',\n        'And this is the third one.',\n        'Is this the first document?',\n    ]\n    tfidf = TfidfVectorizer()\n    X = tfidf.fit_transform(corpus)\n    pre_result = X.toarray()\n\n    fe = FeatureExtractorTransporter()\n\n    post_pymilo_pre_model = fe.deserialize_fe_module(\n        write_and_read(\n            fe.serialize_fe_module(tfidf),\n            get_path(MODEL_NAME)))\n\n    post_result = post_pymilo_pre_model.fit_transform(corpus).toarray()\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_feature_extraction/util.py",
    "content": "import os\nimport json\n\ndef write_and_read(serialized_model, file_addr):\n    with open(file_addr, 'w') as fp:\n        fp.write(json.dumps(serialized_model, indent=4))\n    with open(file_addr, 'r') as fp:\n        return json.load(fp)\n\ndef get_path(model_name):\n    return  os.path.join(os.getcwd(), \"tests\", \"exported_feature_extraction\", model_name + \".json\")\n"
  },
  {
    "path": "tests/test_linear_models/bayesian/ard_regression.py",
    "content": "from sklearn.linear_model import ARDRegression\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Automatic-Relevance-Determination-Regression\"\n\n\ndef ard_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create ARD regression object\n    ard_regression = ARDRegression()\n    # Train the model using the training sets\n    ard_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        ard_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/bayesian/bayesian_regression.py",
    "content": "from sklearn.linear_model import BayesianRidge\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Bayesian-Ridge-Regression\"\n\n\ndef bayesian_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create bayesian ridge regression object\n    bayesian_ridge_regression = BayesianRidge()\n    # Train the model using the training sets\n    bayesian_ridge_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        bayesian_ridge_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/elasticnet/elastic_net.py",
    "content": "from sklearn.linear_model import ElasticNet\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Elastic-Net-Regression\"\n\n\ndef elastic_net():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Elastic Net regression object\n    elasticnet_alpha = 0.1\n    elasticnet_random_state = 0\n    elasticnet_regression = ElasticNet(\n        random_state=elasticnet_random_state,\n        alpha=elasticnet_alpha)\n    # Train the model using the training sets\n    elasticnet_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        elasticnet_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/elasticnet/elastic_net_cv.py",
    "content": "from sklearn.linear_model import ElasticNetCV\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Elastic-Net-CV-Regression\"\n\n\ndef elastic_net_cv():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Elastic Net CV regression object\n    elasticnet_alphas = [1e-3, 1e-2, 1e-1, 1]\n    elasticnet_cv = 5\n    elasticnet_random_state = 0\n    elasticnet_cv_regression = ElasticNetCV(\n        cv=elasticnet_cv,\n        alphas=elasticnet_alphas,\n        random_state=elasticnet_random_state)\n    # Train the model using the training sets\n    elasticnet_cv_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        elasticnet_cv_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/elasticnet/multi_task_elastic_net.py",
    "content": "from sklearn.linear_model import MultiTaskElasticNet\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Multi-Task-Elastic-Net-Regression\"\n\n\ndef multi_task_elastic_net():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    y_train = [[y, y**2] for y in y_train]\n    y_test = [[y, y**2] for y in y_test]\n    # Create MultiTaskElasticNet regression object\n    elasticnet_alpha = 0.01\n    elasticnet_random_state = 0\n    multitask_elasticnet_regression = MultiTaskElasticNet(\n        random_state=elasticnet_random_state, alpha=elasticnet_alpha)\n    # Train the model using the training sets\n    multitask_elasticnet_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        multitask_elasticnet_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/elasticnet/multi_task_elastic_net_cv.py",
    "content": "from sklearn.linear_model import MultiTaskElasticNetCV\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Multi-Task-Elastic-Net-CV-Regression\"\n\n\ndef multi_task_elastic_net_cv():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    y_train = [[y, y**2] for y in y_train]\n    y_test = [[y, y**2] for y in y_test]\n    # Create MultiTaskElasticNetCV regression object\n    elasticnet_alphas = [1e-3, 1e-2, 1e-1, 1]\n    elasticnet_cv = 5\n    elasticnet_random_state = 0\n    multitask_elasticnet_cv_regression = MultiTaskElasticNetCV(\n        random_state=elasticnet_random_state, alphas=elasticnet_alphas, cv=elasticnet_cv)\n    # Train the model using the training sets\n    multitask_elasticnet_cv_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        multitask_elasticnet_cv_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/glm/gamma_regression.py",
    "content": "from sklearn.linear_model import GammaRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Gamma-Regression\"\n\n\ndef gamma_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Gamma regression object\n    gamma_alpha = 0.5\n    gamma_regression = GammaRegressor(alpha=gamma_alpha)\n    # Train the model using the training sets\n    gamma_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        gamma_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/glm/poisson_regression.py",
    "content": "from sklearn.linear_model import PoissonRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Poisson-Regression\"\n\n\ndef poisson_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Poisson regression object\n    poisson_alpha = 0.5\n    poisson_regression = PoissonRegressor(alpha=poisson_alpha)\n    # Train the model using the training sets\n    poisson_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        poisson_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/glm/tweedie_regression.py",
    "content": "from sklearn.linear_model import TweedieRegressor\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\n\nMODEL_NAME = \"Tweedie-Regression\"\n\n\ndef tweedie_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Tweedie Regression object\n    tweedie_alpha = 0.5\n    tweedie_link = 'log'\n    tweedie_power = 1\n    tweedie_regression = TweedieRegressor(\n        power=tweedie_power,\n        alpha=tweedie_alpha,\n        link=tweedie_link)\n    # Train the model using the training sets\n    tweedie_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        tweedie_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/lasso_lars/lasso.py",
    "content": "from sklearn.linear_model import Lasso\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Lasso-Regression\"\n\n\ndef lasso():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Lasso regression object\n    lasso_alpha = 0.2\n    lasso_regression = Lasso(lasso_alpha)\n    # Train the model using the training sets\n    lasso_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        lasso_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/lasso_lars/lasso_cv.py",
    "content": "from sklearn.linear_model import LassoCV\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Lasso-Regression-CV\"\n\n\ndef lasso_cv():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Lasso CV regression object\n    lasso_alphas = [1e-3, 1e-2, 1e-1, 1]\n    lasso_cv = 5\n    lasso_random_state = 0\n    lasso_cv_regression = LassoCV(\n        alphas=lasso_alphas,\n        cv=lasso_cv,\n        random_state=lasso_random_state)\n    # Train the model using the training sets\n    lasso_cv_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        lasso_cv_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/lasso_lars/lasso_lars.py",
    "content": "from sklearn.linear_model import LassoLars\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Lasso-Lars-Regression\"\n\n\ndef lasso_lars():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Lasso Lars regression object\n    lasso_alpha = 0.2\n    lasso_lars_regression = LassoLars(lasso_alpha)\n    # Train the model using the training sets\n    lasso_lars_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        lasso_lars_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/lasso_lars/lasso_lars_cv.py",
    "content": "from sklearn.linear_model import LassoLarsCV\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Lasso-Lars-CV-Regression\"\n\n\ndef lasso_lars_cv():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Lasso Lars CV regression object\n    lasso_cv = 5\n    lasso_lars_cv_regression = LassoLarsCV(cv=lasso_cv)\n    # Train the model using the training sets\n    lasso_lars_cv_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        lasso_lars_cv_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/lasso_lars/lasso_lars_ic.py",
    "content": "from sklearn.linear_model import LassoLarsIC\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Lasso-Lars-IC-Regression\"\n\n\ndef lasso_lars_ic():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Lasso Lars IC regression object\n    lasso_criterian = \"bic\"\n    lass_lars_ic_regression = LassoLarsIC(criterion=lasso_criterian)\n    # Train the model using the training sets\n    lass_lars_ic_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        lass_lars_ic_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/lasso_lars/multi_task_lasso.py",
    "content": "from sklearn.linear_model import MultiTaskLasso\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Multi-Task-Lasso-Regression\"\n\n\ndef multi_task_lasso():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    y_train = [[y, y**2] for y in y_train]\n    y_test = [[y, y**2] for y in y_test]\n    # Create MultiTaskLasso regression object\n    lasso_alpha = 0.1\n    lasso_random_state = 0\n    multi_task_lasso = MultiTaskLasso(\n        random_state=lasso_random_state,\n        alpha=lasso_alpha)\n    # Train the model using the training sets\n    multi_task_lasso.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        multi_task_lasso, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/lasso_lars/multi_task_lasso_cv.py",
    "content": "from sklearn.linear_model import MultiTaskLassoCV\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Multi-Task-Lasso-CV-Regression\"\n\n\ndef multi_task_lasso_cv():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    y_train = [[y, y**2] for y in y_train]\n    y_test = [[y, y**2] for y in y_test]\n    # Create Multi Task Lasso CV regression object\n    lasso_alphas = [1e-3, 1e-2, 1e-1, 1]\n    lasso_cv = 5\n    lasso_random_state = 0\n    multi_task_lasso_cv = MultiTaskLassoCV(\n        random_state=lasso_random_state,\n        alphas=lasso_alphas,\n        cv=lasso_cv)\n    # Train the model using the training sets\n    multi_task_lasso_cv.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        multi_task_lasso_cv, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/linear_regression/linear_regression.py",
    "content": "from sklearn.linear_model import LinearRegression\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\n\nMODEL_NAME = \"Linear-Regression\"\n\n\ndef linear_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create linear regression object\n    linear_regression = LinearRegression()\n    # Train the model using the training sets\n    linear_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        linear_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/logistic/logistic_regression.py",
    "content": "from sklearn.linear_model import LogisticRegression\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"Logistic-Regression\"\n\n\ndef logistic_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    # Create Logistic regression object\n    logistic_regression_random_state = 4\n    logistic_regression = LogisticRegression(\n        random_state=logistic_regression_random_state)\n    # Train the model using the training sets\n    logistic_regression.fit(x_train, y_train)\n    assert pymilo_classification_test(\n        logistic_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/logistic/logistic_regression_cv.py",
    "content": "from sklearn.linear_model import LogisticRegressionCV\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"Logistic-Regression-CV\"\n\n\ndef logistic_regression_cv():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    # Create Logistic regression cv object\n    logistic_regression_cv = 5\n    logistic_regression_random_state = 0\n    logistic_regression_cv = LogisticRegressionCV(\n        cv=logistic_regression_cv,\n        random_state=logistic_regression_random_state)\n    # Train the model using the training sets\n    logistic_regression_cv.fit(x_train, y_train)\n    assert pymilo_classification_test(\n        logistic_regression_cv, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/omp/omp.py",
    "content": "from sklearn.linear_model import OrthogonalMatchingPursuit\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Orthogonal-Matching-Pursuit-Regression\"\n\n\ndef omp():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Orthogonal Matching Pursuit regression object\n    omp_n_nonzero_coefs = 10\n    omp_regression = OrthogonalMatchingPursuit(\n        n_nonzero_coefs=omp_n_nonzero_coefs)\n    # Train the model using the training sets\n    omp_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        omp_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/omp/omp_cv.py",
    "content": "from sklearn.linear_model import OrthogonalMatchingPursuitCV\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Orthogonal-Matching-Pursuit-CV-Regression\"\n\n\ndef omp_cv():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Orthogonal Matching Pursuit CV regression object\n    omp_cv = 5\n    omp_cv_regression = OrthogonalMatchingPursuitCV(cv=omp_cv)\n    # Train the model using the training sets\n    omp_cv_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        omp_cv_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/passive_aggressive/passive_aggressive_classifier.py",
    "content": "from sklearn.linear_model import PassiveAggressiveClassifier\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"Passive-Aggressive-Classifier\"\n\n\ndef passive_aggressive_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    # Create ridge regression object\n    pac_max_iter = 1000\n    pac_random_state = 0\n    pac_tol = 1e-3\n    passive_aggressive_classifier = PassiveAggressiveClassifier(\n        max_iter=pac_max_iter, random_state=pac_random_state, tol=pac_tol)\n    # Train the model using the training sets\n    passive_aggressive_classifier.fit(x_train, y_train)\n    assert pymilo_classification_test(\n        passive_aggressive_classifier, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/passive_aggressive/passive_aggressive_regressor.py",
    "content": "from sklearn.linear_model import PassiveAggressiveRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Passive-Aggressive-Regressor\"\n\n\ndef passive_agressive_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Passive Aggressive Regression object\n    par_random_state = 2\n    par_max_iter = 100\n    passive_aggressive_regression = PassiveAggressiveRegressor(\n        max_iter=par_max_iter, random_state=par_random_state)\n    # Train the model using the training sets\n    passive_aggressive_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        passive_aggressive_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/perceptron/perception.py",
    "content": "from sklearn.linear_model import Perceptron\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Perceptron\"\n\n\ndef perceptron():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create perceptron regression object\n\n    perceptron_random_state = 0\n    perceptron_tol = 1e-3\n    perceptron = Perceptron(\n        random_state=perceptron_random_state,\n        tol=perceptron_tol)\n    # Train the model using the training sets\n    perceptron.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        perceptron, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/quantile/quantile.py",
    "content": "from sklearn.linear_model import QuantileRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Quantile-Regressor\"\n\n\ndef quantile_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Quantile regression object\n    quantile_regression = QuantileRegressor(quantile=0.8, solver=\"highs\")\n    # Train the model using the training sets\n    quantile_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        quantile_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/ridge/ridge_classifier.py",
    "content": "from sklearn.linear_model import RidgeClassifier\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"Ridge-Classifier\"\n\n\ndef ridge_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    # Create ridge classifier object\n    ridge_alpha = 0.4\n    ridge_classifier = RidgeClassifier(alpha=ridge_alpha)\n    # Train the model using the training sets\n    ridge_classifier.fit(x_train, y_train)\n    assert pymilo_classification_test(\n        ridge_classifier, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/ridge/ridge_classifier_cv.py",
    "content": "from sklearn.linear_model import RidgeClassifierCV\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"Ridge-Classifier-CV\"\n\n\ndef ridge_classifier_cv():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    # Create ridge classifier cv object\n    ridge_cv_alphas = [1e-3, 1e-2, 1e-1, 1]\n    ridge_classifier = RidgeClassifierCV(alphas=ridge_cv_alphas)\n    # Train the model using the training sets\n    ridge_classifier.fit(x_train, y_train)\n    assert pymilo_classification_test(\n        ridge_classifier, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/ridge/ridge_regression.py",
    "content": "from sklearn.linear_model import Ridge\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nMODEL_NAME = \"Ridge-Regression\"\n\n\ndef ridge_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create ridge regression object\n    ridge_alpha = 0.5\n    ridge_regression = Ridge(alpha=ridge_alpha)\n    # Train the model using the training sets\n    ridge_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        ridge_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/ridge/ridge_regression_cv.py",
    "content": "from sklearn.linear_model import RidgeCV\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Ridge-Regression-CV\"\n\n\ndef ridge_regression_cv():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create ridgeCV regression object\n    ridge_cv_alphas = [1e-3, 1e-2, 1e-1, 1]\n    ridge_regression_cv = RidgeCV(alphas=ridge_cv_alphas)\n    # Train the model using the training sets\n    ridge_regression_cv.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        ridge_regression_cv, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/robustness/huber_regression.py",
    "content": "from sklearn.linear_model import HuberRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Huber-Regressor\"\n\n\ndef huber_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create Huber regression object\n    huber_regresion = HuberRegressor(max_iter=300)\n    # Train the model using the training sets\n    huber_regresion.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        huber_regresion, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/robustness/ransac_regression.py",
    "content": "from sklearn.linear_model import RANSACRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"RANSAC-Regressor\"\n\n\ndef ransac_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create ransac regression object\n    ransac_random_state = 3\n    ransac_regression = RANSACRegressor(random_state=ransac_random_state)\n    # Train the model using the training sets\n    ransac_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        ransac_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/robustness/theil_sen_regression.py",
    "content": "from sklearn.linear_model import TheilSenRegressor\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\n\nMODEL_NAME = \"Theil-Sen-Regressor\"\n\n\ndef theil_sen_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create TheilSen Regression object\n    theilsen_random_state = 4\n    theilsen_regresion = TheilSenRegressor(random_state=theilsen_random_state)\n    # Train the model using the training sets\n    theilsen_regresion.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        theilsen_regresion, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/sgd/sgd_classifier.py",
    "content": "from sklearn.linear_model import SGDClassifier\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nMODEL_NAME = \"SGD-Classifier\"\n\n\ndef sgd_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    # Create SGDClassifier regression object\n    sgd_max_iter = 100000\n    sgd_tol = 1e-3\n    sgd_classifier = SGDClassifier(max_iter=sgd_max_iter, tol=sgd_tol)\n    # Train the model using the training sets\n    sgd_classifier.fit(x_train, y_train)\n    assert pymilo_classification_test(\n        sgd_classifier, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/sgd/sgd_oneclass_svm.py",
    "content": "from sklearn.linear_model import SGDOneClassSVM\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\n\nMODEL_NAME = \"SGD-OneClass-Regression\"\n\n\ndef sgd_oneclass_svm():\n    x_train, _, x_test, y_test = prepare_simple_regression_datasets()\n    # Create SGDOneClassSVM regression object\n    sgd_random_state = 34\n    sgd_oneclass_svm = SGDOneClassSVM(random_state=sgd_random_state)\n    # Train the model using the training sets\n    sgd_oneclass_svm.fit(x_train)\n    assert pymilo_regression_test(\n        sgd_oneclass_svm, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/sgd/sgd_regression.py",
    "content": "from sklearn.linear_model import SGDRegressor\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\n\nMODEL_NAME = \"SGD-Regression\"\n\n\ndef sgd_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    # Create SGD Regression object\n    sgd_max_iter = 100000\n    sgd_tol = 1e-3\n    sgd_regression = SGDRegressor(max_iter=sgd_max_iter, tol=sgd_tol)\n    # Train the model using the training sets\n    sgd_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        sgd_regression, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_linear_models/test_linear_models.py",
    "content": "import os\nimport pytest\n\nfrom pymilo.pymilo_param import SKLEARN_LINEAR_MODEL_TABLE, NOT_SUPPORTED\n\nfrom linear_regression.linear_regression import linear_regression\n\nfrom ridge.ridge_regression import ridge_regression\nfrom ridge.ridge_regression_cv import ridge_regression_cv\nfrom ridge.ridge_classifier import ridge_classifier\nfrom ridge.ridge_classifier_cv import ridge_classifier_cv\n\nfrom lasso_lars.lasso import lasso\nfrom lasso_lars.lasso_cv import lasso_cv\nfrom lasso_lars.lasso_lars import lasso_lars\nfrom lasso_lars.lasso_lars_cv import lasso_lars_cv\nfrom lasso_lars.lasso_lars_ic import lasso_lars_ic\nfrom lasso_lars.multi_task_lasso import multi_task_lasso\nfrom lasso_lars.multi_task_lasso_cv import multi_task_lasso_cv\n\nfrom elasticnet.elastic_net import elastic_net\nfrom elasticnet.elastic_net_cv import elastic_net_cv\nfrom elasticnet.multi_task_elastic_net import multi_task_elastic_net\nfrom elasticnet.multi_task_elastic_net_cv import multi_task_elastic_net_cv\n\nfrom omp.omp import omp\nfrom omp.omp_cv import omp_cv\n\nfrom bayesian.bayesian_regression import bayesian_regression\nfrom bayesian.ard_regression import ard_regression\n\nfrom logistic.logistic_regression import logistic_regression\nfrom logistic.logistic_regression_cv import logistic_regression_cv\n\nfrom sgd.sgd_regression import sgd_regression\nfrom sgd.sgd_classifier import sgd_classifier\n\nfrom perceptron.perception import perceptron\n\nfrom passive_aggressive.passive_aggressive_regressor import passive_agressive_regressor\nfrom passive_aggressive.passive_aggressive_classifier import passive_aggressive_classifier\n\nfrom robustness.ransac_regression import ransac_regression\nfrom robustness.theil_sen_regression import theil_sen_regression\nfrom robustness.huber_regression import huber_regression\n\nif SKLEARN_LINEAR_MODEL_TABLE[\"TweedieRegressor\"] != NOT_SUPPORTED:\n    from glm.tweedie_regression import tweedie_regression\nif 
SKLEARN_LINEAR_MODEL_TABLE[\"PoissonRegressor\"] != NOT_SUPPORTED:\n    from glm.poisson_regression import poisson_regression\nif SKLEARN_LINEAR_MODEL_TABLE[\"GammaRegressor\"] != NOT_SUPPORTED:\n    from glm.gamma_regression import gamma_regression\nif SKLEARN_LINEAR_MODEL_TABLE[\"SGDOneClassSVM\"] != NOT_SUPPORTED:\n    from sgd.sgd_oneclass_svm import sgd_oneclass_svm\nif SKLEARN_LINEAR_MODEL_TABLE[\"QuantileRegressor\"] != NOT_SUPPORTED:\n    from quantile.quantile import quantile_regressor\n\nLINEAR_MODELS = {\n    \"LINEAR_REGRESSION\": [linear_regression],\n    \"RIDGE_REGRESSION_AND_CLASSIFICATION\": [\n        ridge_regression,\n        ridge_regression_cv,\n        ridge_classifier,\n        ridge_classifier_cv],\n    \"LASSO_AND_LARS\": [\n        lasso,\n        lasso_cv,\n        lasso_lars,\n        lasso_lars_cv,\n        lasso_lars_ic,\n        multi_task_lasso,\n        multi_task_lasso_cv],\n    \"ELASTIC_NET\": [\n        elastic_net,\n        elastic_net_cv],\n    \"MULTI_CLASS_ELASTIC_NET\": [\n        multi_task_elastic_net,\n        multi_task_elastic_net_cv],\n    \"OMP\": [\n        omp,\n        omp_cv],\n    \"BAYESIAN_REGRESSION\": [\n        bayesian_regression,\n        ard_regression],\n    \"LOGISTIC_REGRESSION\": [\n        logistic_regression,\n        logistic_regression_cv],\n    \"GLM\": [\n        tweedie_regression if SKLEARN_LINEAR_MODEL_TABLE[\"TweedieRegressor\"] != NOT_SUPPORTED else (None,\"TweedieRegressor\"),\n        poisson_regression if SKLEARN_LINEAR_MODEL_TABLE[\"PoissonRegressor\"] != NOT_SUPPORTED else (None,\"PoissonRegressor\"),\n        gamma_regression if SKLEARN_LINEAR_MODEL_TABLE[\"GammaRegressor\"] != NOT_SUPPORTED else (None,\"GammaRegressor\")],\n    \"SGD\": [\n        sgd_regression,\n        sgd_classifier,\n        sgd_oneclass_svm if SKLEARN_LINEAR_MODEL_TABLE[\"SGDOneClassSVM\"] != NOT_SUPPORTED else (None,\"SGDOneClassSVM\")],\n    \"PERCEPTRON\": [perceptron],\n    
\"PASSIVE_AGGRESSIVE_REGRESSION_AND_CLASSIFIER\": [\n        passive_agressive_regressor,\n        passive_aggressive_classifier],\n    \"ROBUSTNESS_REGRESSION\": [\n        ransac_regression,\n        theil_sen_regression,\n        huber_regression],\n    \"QUANTILE_REGRESSION\": [quantile_regressor if SKLEARN_LINEAR_MODEL_TABLE[\"QuantileRegressor\"] != NOT_SUPPORTED else (None,\"QuantileRegressor\")]}\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), \"tests\", \"exported_linear_models\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        # construct full file path\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for category in LINEAR_MODELS:\n        for model in LINEAR_MODELS[category]:\n            if isinstance(model, tuple):\n                func, model_name = model\n                if func == None:\n                    print(\"Model: \" + model_name + \" is not supported in this python version.\")\n                    continue\n            model()\n"
  },
  {
    "path": "tests/test_misc_functionalities.py/test_batch.py",
    "content": "import os\nimport re\nimport random\nimport numpy as np\nfrom pymilo import Export, Import\nfrom sklearn.metrics import mean_squared_error\nfrom sklearn.linear_model import LinearRegression\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\n\ndef test_batch_execution():\n    x_train, y_train, x_test, _ = prepare_simple_regression_datasets()\n    linear_regression = LinearRegression()\n    linear_regression.fit(x_train, y_train)\n    pre_models = [linear_regression]*100\n    exp_n = Export.batch_export(pre_models, os.getcwd())\n    imp_n, post_models = Import.batch_import(os.getcwd())\n    r_index = random.randint(0, len(post_models) - 1)\n    pre_result = pre_models[r_index].predict(x_test)\n    post_result = post_models[r_index].predict(x_test)\n    mse = mean_squared_error(post_result, pre_result)\n    pattern = re.compile(r'model_\\d+\\.json')\n    for filename in os.listdir(os.getcwd()):\n        if pattern.match(filename):\n            file_path = os.path.join(os.getcwd(), filename)\n            os.remove(file_path)\n    assert exp_n == imp_n and np.abs(mse) <= 10**(-8)\n"
  },
  {
    "path": "tests/test_ml_streaming/docker_files/Dockerfile1",
    "content": "# Use an official Python runtime as a parent image\nFROM python:3.11-slim\n\n# Set the working directory in the container\nWORKDIR /app\n\n# Install pymilo\nRUN pip install pymilo[streaming]\n    \nEXPOSE 8000\nCMD [\"python\", \"-m\", \"pymilo\", \"--compression\", \"NULL\", \"--protocol\", \"REST\", \"--port\", \"8000\", \"--load\", \"https://raw.githubusercontent.com/openscilab/pymilo/main/tests/test_exceptions/valid_jsons/linear_regression.json\"]\n"
  },
  {
    "path": "tests/test_ml_streaming/docker_files/Dockerfile2",
    "content": "# Use an official Python runtime as a parent image\nFROM python:3.11-slim\n\n# Set the working directory in the container\nWORKDIR /app\n\n# Install pymilo\nRUN pip install pymilo[streaming]\n    \nCOPY linear_regression.json /app/model.json\nEXPOSE 8000\nCMD [\"python\", \"-m\", \"pymilo\", \"--compression\", \"NULL\", \"--protocol\", \"REST\", \"--port\", \"8000\", \"--load\", \"/app/model.json\"]\n"
  },
  {
    "path": "tests/test_ml_streaming/run_server.py",
    "content": "import argparse\nfrom sklearn.linear_model import LinearRegression\nfrom pymilo.streaming import PymiloServer, Compression, CommunicationProtocol\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\n\ndef main():\n    parser = argparse.ArgumentParser(description='Run the Pymilo server with a specified compression method.')\n    parser.add_argument(\n        '--compression',\n        type=str,\n        choices=['NULL', 'GZIP', 'ZLIB', 'LZMA', 'BZ2'],\n        default='NULL',\n        help='Specify the compression method (NULL, GZIP, ZLIB, LZMA, or BZ2). Default is NULL.'\n        )\n    parser.add_argument(\n        '--protocol',\n        type=str,\n        choices=['REST', 'WEBSOCKET'],\n        default='REST',\n        help='Specify the communication protocol (REST or WEBSOCKET). Default is REST.'\n        )\n    parser.add_argument(\n        '--init',\n        action=\"store_true\",\n        default=False,\n        help='the `init` command specifies whether or not initializing the PyMilo Server with a ML model.',\n    )\n    parser.add_argument(\n        '--port',\n        type=int,\n        default=None,\n        help='Override the default port.',\n    )\n    args = parser.parse_args()\n    communicator = None\n    if args.init:\n        port = args.port if args.port else 9000\n        x_train, y_train, _, _ = prepare_simple_regression_datasets()\n        linear_regression = LinearRegression()\n        linear_regression.fit(x_train, y_train)\n        ps = PymiloServer(\n            port=port,\n            compressor=Compression[args.compression],\n            communication_protocol=CommunicationProtocol[args.protocol],\n            )\n        sample_client_id = \"0x_demo_client_id\"\n        sample_ml_model_id = \"0x_demo_ml_model_id\"\n        ps.init_client(sample_client_id)\n        ps.init_ml_model(sample_client_id, sample_ml_model_id)\n        ps.set_ml_model(sample_client_id, sample_ml_model_id, linear_regression)\n 
       communicator = ps.communicator\n    else:\n        port = args.port if args.port else 8000\n        communicator = PymiloServer(\n            port=port,\n            compressor=Compression[args.compression],\n            communication_protocol=CommunicationProtocol[args.protocol],\n            ).communicator\n\n    communicator.run()\n\nif __name__ == '__main__':\n    main()"
  },
  {
    "path": "tests/test_ml_streaming/scenarios/scenario1.py",
    "content": "import numpy as np\nfrom pymilo.streaming import PymiloClient, Compression, CommunicationProtocol\nfrom sklearn.metrics import mean_squared_error\nfrom sklearn.linear_model import LinearRegression\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\n\ndef scenario1(compression_method, communication_protocol):\n    # [PyMilo Server is not initialized with ML Model]\n    # 1. create model in local\n    # 2. train model in local\n    # 3. calculate mse before streaming\n    # 4. upload model to server\n    # 5. download model to local\n    # 6. calculate mse after streaming\n\n\n    # 1.\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    linear_regression = LinearRegression()\n\n    # 2.\n    linear_regression.fit(x_train, y_train)\n    client = PymiloClient(\n        model=linear_regression,\n        mode=PymiloClient.Mode.LOCAL,\n        compressor=Compression[compression_method],\n        communication_protocol=CommunicationProtocol[communication_protocol],\n        )\n\n    # 3. get client id + get ml model id [from remote server]\n    client.register()\n    client.register_ml_model()\n\n    # 4.\n    result = client.predict(x_test)\n    mse_before = mean_squared_error(y_test, result)\n\n    # 5.\n    client.upload()\n    # 6.\n    client.download()\n\n    # 7.\n    result = client.predict(x_test)\n    mse_after = mean_squared_error(y_test, result)\n\n    return np.abs(mse_after-mse_before)\n"
  },
  {
    "path": "tests/test_ml_streaming/scenarios/scenario2.py",
    "content": "import numpy as np\nfrom pymilo.streaming import PymiloClient, Compression, CommunicationProtocol\nfrom sklearn.metrics import mean_squared_error\nfrom sklearn.linear_model import LinearRegression\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\n\ndef scenario2(compression_method, communication_protocol):\n    # [PyMilo Server is not initialized with ML Model]\n    # 1. create model in local\n    # 2. upload model to server\n    # 3. train model in server\n    # 4. calculate mse in server\n    # 5. download model to local\n    # 6. calculate mse in local\n\n\n    # 1.\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    linear_regression = LinearRegression()\n    client = PymiloClient(\n        model=linear_regression,\n        mode=PymiloClient.Mode.LOCAL,\n        compressor=Compression[compression_method],\n        communication_protocol=CommunicationProtocol[communication_protocol],\n        )\n\n    # 2. get client id + get ml model id [from remote server]\n    client.register()\n    client.register_ml_model()\n\n    # 3.\n    client.upload()\n\n    # 4.\n    client.toggle_mode(PymiloClient.Mode.DELEGATE)\n    client.fit(x_train, y_train)\n    remote_field = client.coef_\n\n    # 5.\n    result = client.predict(x_test)\n    mse_server = mean_squared_error(y_test, result)\n\n    # 6.\n    client.download()\n\n    # 7.\n    client.toggle_mode(mode=PymiloClient.Mode.LOCAL)\n    local_field = client.coef_\n    result = client.predict(x_test)\n    mse_local = mean_squared_error(y_test, result)\n\n    return np.abs(mse_server-mse_local) + np.abs(np.sum(local_field-remote_field))\n"
  },
  {
    "path": "tests/test_ml_streaming/scenarios/scenario3.py",
    "content": "import numpy as np\nfrom sklearn.metrics import mean_squared_error\nfrom pymilo.streaming import PymiloClient, Compression, CommunicationProtocol\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\n\ndef scenario3(compression_method, communication_protocol):\n    # [PyMilo Server is initialized with ML Model]\n    # 1. calculate mse in server\n    # 2. download model in local\n    # 3. calculate mse in local\n    # 4. compare results\n\n    # 1.\n    _, _, x_test, y_test = prepare_simple_regression_datasets()\n    client = PymiloClient(\n        mode=PymiloClient.Mode.LOCAL,\n        compressor=Compression[compression_method],\n        server_url=\"127.0.0.1:9000\",\n        communication_protocol=CommunicationProtocol[communication_protocol],\n        )\n    client.client_id = \"0x_demo_client_id\"\n    client.ml_model_id = \"0x_demo_ml_model_id\"\n\n    client.toggle_mode(PymiloClient.Mode.DELEGATE)\n    result = client.predict(x_test)\n    mse_server = mean_squared_error(y_test, result)\n\n    # 2.\n    client.download()\n\n    # 3.\n    client.toggle_mode(mode=PymiloClient.Mode.LOCAL)\n    result = client.predict(x_test)\n    mse_local = mean_squared_error(y_test, result)\n\n    # 4.\n    return np.abs(mse_server-mse_local)\n"
  },
  {
    "path": "tests/test_ml_streaming/scenarios/scenario4.py",
    "content": "from pymilo.streaming import PymiloClient, Compression, CommunicationProtocol\nfrom sklearn.linear_model import LinearRegression\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\n\ndef scenario4(compression_method, communication_protocol):\n    \"\"\"\n    Test access control management features between multiple clients.\n\n    This scenario tests:\n    1. Client registration/deregistration\n    2. Model registration/deregistration\n    3. get_ml_models\n    4. grant_access / revoke_access\n    5. get_allowance / get_allowed_models\n    \"\"\"\n    x_train, y_train, _, _ = prepare_simple_regression_datasets()\n\n    # Create and train a model locally\n    linear_regression = LinearRegression()\n    linear_regression.fit(x_train, y_train)\n\n    # Initialize client_a (model owner)\n    client_a = PymiloClient(\n        model=linear_regression,\n        mode=PymiloClient.Mode.LOCAL,\n        compressor=Compression[compression_method],\n        server_url=\"127.0.0.1:8500\",\n        communication_protocol=CommunicationProtocol[communication_protocol],\n    )\n\n    # Initialize client_b (will be granted access)\n    client_b = PymiloClient(\n        mode=PymiloClient.Mode.LOCAL,\n        compressor=Compression[compression_method],\n        server_url=\"127.0.0.1:8500\",\n        communication_protocol=CommunicationProtocol[communication_protocol],\n    )\n\n    # 1. Register both clients\n    client_a.register()\n    client_b.register()\n\n    assert client_a.client_id is not None, \"client_a registration failed\"\n    assert client_b.client_id is not None, \"client_b registration failed\"\n    assert client_a.client_id != client_b.client_id, \"clients should have different IDs\"\n\n    # 2. Register model for client_a\n    client_a.register_ml_model()\n    assert client_a.ml_model_id is not None, \"model registration failed\"\n\n    # 3. 
Test get_ml_models\n    models_a = client_a.get_ml_models()\n    assert client_a.ml_model_id in models_a, \"registered model should appear in get_ml_models\"\n\n    # 4. Upload model from client_a\n    client_a.upload()\n\n    # 5. Grant access from client_a to client_b (uses client_a.ml_model_id implicitly)\n    grant_result = client_a.grant_access(client_b.client_id)\n    assert grant_result is True, \"grant_access should succeed\"\n\n    # 6. Verify allowance updated\n    allowance = client_a.get_allowance()\n    assert isinstance(allowance, dict), \"get_allowance should return a dict\"\n    assert client_b.client_id in allowance, \"client_b should be in allowance after grant\"\n    assert client_a.ml_model_id in allowance[client_b.client_id], \"model should be in allowance\"\n\n    # 7. Test get_allowed_models (from client_b's perspective)\n    allowed_models = client_b.get_allowed_models(client_a.client_id)\n    assert client_a.ml_model_id in allowed_models, \"model should be in allowed_models\"\n\n    # 8. Revoke access (uses client_a.ml_model_id implicitly)\n    revoke_result = client_a.revoke_access(client_b.client_id)\n    assert revoke_result is True, \"revoke_access should succeed\"\n\n    # 9. Verify allowance updated after revoke\n    allowed_models_after_revoke = client_b.get_allowed_models(client_a.client_id)\n    assert client_a.ml_model_id not in allowed_models_after_revoke, \"model should not be in allowed_models after revoke\"\n\n    # 10. Test model deregistration\n    models_before_deregister = client_a.get_ml_models()\n    client_a.deregister_ml_model()\n    models_after_deregister = client_a.get_ml_models()\n    assert len(models_after_deregister) == len(models_before_deregister) - 1, \"model count should decrease after deregister\"\n\n    # 11. Test client deregistration\n    client_b.deregister()\n    client_a.deregister()\n\n    return 0\n"
  },
  {
    "path": "tests/test_ml_streaming/test_streaming.py",
    "content": "import os\nimport time\nimport pytest\nimport subprocess\nfrom filecmp import cmp\nfrom sys import executable\nfrom scenarios.scenario1 import scenario1\nfrom scenarios.scenario2 import scenario2\nfrom scenarios.scenario3 import scenario3\nfrom scenarios.scenario4 import scenario4\nfrom pymilo.streaming.util import generate_dockerfile\n\n@pytest.fixture(\n    scope=\"session\",\n    params=[\"NULL\", \"GZIP\", \"ZLIB\", \"LZMA\", \"BZ2\"])\ndef prepare_bare_server(request):\n    compression_method = request.param\n    # Using PyMilo direct CLI\n    # server_proc = subprocess.Popen(\n    #     [\n    #         executable,\n    #         \"-m\", \"pymilo\",\n    #         \"--compression\", compression_method,\n    #         \"--protocol\", \"REST\",\n    #         \"--port\", \"8000\",\n    #         \"--bare\",\n    #     ],\n    #     )\n    path = os.path.join(\n        os.getcwd(),\n        \"tests\",\n        \"test_ml_streaming\",\n        \"run_server.py\",\n        )\n    server_proc = subprocess.Popen(\n        [\n            executable,\n            path,\n            \"--compression\", compression_method,\n            \"--protocol\", \"REST\"\n        ],\n        )\n    time.sleep(10)\n    yield (compression_method, \"REST\")\n    server_proc.terminate()\n\n\n@pytest.fixture(\n    scope=\"session\",\n    params=[\"REST\", \"WEBSOCKET\"])\ndef prepare_ml_server(request):\n    communication_protocol = request.param\n    compression_method = \"ZLIB\"\n    # Using PyMilo direct CLI\n    # server_proc = subprocess.Popen(\n    #     [\n    #         executable,\n    #         \"-m\", \"pymilo\",\n    #         \"--compression\", compression_method,\n    #         \"--protocol\", communication_protocol,\n    #         \"--port\", \"9000\",\n    #         \"--load\", os.path.join(os.getcwd(), \"tests\", \"test_exceptions\", \"valid_jsons\", \"linear_regression.json\")\n    #         # \"--load\", 
\"https://raw.githubusercontent.com/openscilab/pymilo/main/tests/test_exceptions/valid_jsons/linear_regression.json\",\n    #     ],\n    #     )\n    path = os.path.join(\n        os.getcwd(),\n        \"tests\",\n        \"test_ml_streaming\",\n        \"run_server.py\",\n        )\n    server_proc = subprocess.Popen(\n        [\n            executable,\n            path,\n            \"--compression\", compression_method,\n            \"--protocol\", communication_protocol,\n            \"--init\",\n        ],\n        )\n    time.sleep(10)\n    yield (compression_method, communication_protocol)\n    server_proc.terminate()\n\n\n@pytest.fixture(\n    scope=\"function\",\n    params=[\"REST\", \"WEBSOCKET\"])\ndef prepare_access_control_server(request):\n    communication_protocol = request.param\n    compression_method = \"ZLIB\"\n    path = os.path.join(\n        os.getcwd(),\n        \"tests\",\n        \"test_ml_streaming\",\n        \"run_server.py\",\n        )\n    server_proc = subprocess.Popen(\n        [\n            executable,\n            path,\n            \"--compression\", compression_method,\n            \"--protocol\", communication_protocol,\n            \"--port\", \"8500\",\n        ],\n        )\n    time.sleep(10)\n    yield (compression_method, communication_protocol)\n    server_proc.terminate()\n    time.sleep(2)\n\n\ndef test1(prepare_bare_server):\n    compression_method, communication_protocol = prepare_bare_server\n    assert scenario1(compression_method, communication_protocol) == 0\n\n\ndef test2(prepare_bare_server):\n    compression_method, communication_protocol = prepare_bare_server\n    assert scenario2(compression_method, communication_protocol) == 0\n\n\ndef test3(prepare_ml_server):\n    compression_method, communication_protocol = prepare_ml_server\n    assert scenario3(compression_method, communication_protocol) == 0\n\n\ndef test4(prepare_access_control_server):\n    compression_method, communication_protocol = 
prepare_access_control_server\n    assert scenario4(compression_method, communication_protocol) == 0\n\n\ndef test_dockerfile():\n    docker_files_folder = os.path.join(\n        os.getcwd(),\n        \"tests\",\n        \"test_ml_streaming\",\n        \"docker_files\",\n    )\n    generate_dockerfile(\n        dockerfile_name=\"Dockerfile\",\n        model_path=\"https://raw.githubusercontent.com/openscilab/pymilo/main/tests/test_exceptions/valid_jsons/linear_regression.json\")\n    r1 = cmp('Dockerfile', os.path.join(\n        docker_files_folder,\n        \"Dockerfile1\"\n        )\n    )\n    generate_dockerfile(\n        dockerfile_name=\"Dockerfile\",\n        model_path=\"linear_regression.json\",\n        )\n    r2 = cmp('Dockerfile', os.path.join(\n        docker_files_folder,\n        \"Dockerfile2\"\n        )\n    )\n    os.remove(path='Dockerfile')\n    assert r1 and r2\n"
  },
  {
    "path": "tests/test_naive_bayes/bernoulli.py",
    "content": "from sklearn.naive_bayes import BernoulliNB\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"BernoulliNB\"\n\ndef bernoulli_naive_bayes():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    bernoulli_naive_bayes = BernoulliNB().fit(x_train, y_train)\n    pymilo_classification_test(bernoulli_naive_bayes, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_naive_bayes/categorical.py",
    "content": "from sklearn.naive_bayes import CategoricalNB\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"CategoricalNB\"\n\ndef categorical_naive_bayes():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    categorical_naive_bayes = CategoricalNB().fit(x_train, y_train)\n    pymilo_classification_test(categorical_naive_bayes, MODEL_NAME, (x_test, y_test))\n\n"
  },
  {
    "path": "tests/test_naive_bayes/complement.py",
    "content": "from sklearn.naive_bayes import ComplementNB\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"ComplementNB\"\n\ndef complement_naive_bayes():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    complement_naive_bayes = ComplementNB().fit(x_train, y_train)\n    pymilo_classification_test(complement_naive_bayes, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_naive_bayes/gaussian.py",
    "content": "from sklearn.naive_bayes import GaussianNB\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"GaussianNB\"\n\ndef gaussian_naive_bayes():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    gaussian_naive_bayes = GaussianNB().fit(x_train, y_train)\n    pymilo_classification_test(gaussian_naive_bayes, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_naive_bayes/multinomial.py",
    "content": "from sklearn.naive_bayes import MultinomialNB\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"MultinomialNB\"\n\ndef multinomial_naive_bayes():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    multinomial_naive_bayes = MultinomialNB().fit(x_train, y_train)\n    pymilo_classification_test(multinomial_naive_bayes, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_naive_bayes/test_naive_bayes_models.py",
    "content": "import os\nimport pytest\n\nfrom gaussian import gaussian_naive_bayes\nfrom multinomial import multinomial_naive_bayes\nfrom complement import complement_naive_bayes\nfrom bernoulli import bernoulli_naive_bayes\nfrom categorical import categorical_naive_bayes\n\nNAIVE_BAYES_MODELS = [\n    gaussian_naive_bayes,\n    multinomial_naive_bayes,\n    complement_naive_bayes,\n    bernoulli_naive_bayes,\n    categorical_naive_bayes\n]\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), \"tests\", \"exported_naive_bayes\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        # construct full file path\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for model in NAIVE_BAYES_MODELS:\n        model()\n"
  },
  {
    "path": "tests/test_neighbors/kneighbors_classifier.py",
    "content": "from sklearn.neighbors import KNeighborsClassifier\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"KNeighborsClassifier\"\n\ndef kneighbors_classifier():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    kneighbors_classifier = KNeighborsClassifier(n_neighbors=3).fit(x_train, y_train)\n    pymilo_classification_test(kneighbors_classifier, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_neighbors/kneighbors_regressor.py",
    "content": "from sklearn.neighbors import KNeighborsRegressor\n\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"KNeighborsRegressor\"\n\ndef kneighbors_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    kneighbors_regressor = KNeighborsRegressor(n_neighbors=2).fit(x_train, y_train)\n    pymilo_regression_test(kneighbors_regressor, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_neighbors/local_outlier_factor.py",
    "content": "from sklearn.neighbors import LocalOutlierFactor\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"LocalOutlierFactor\"\n\ndef local_outlier_factor():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    local_outlier_factor = LocalOutlierFactor(n_neighbors=2, novelty= True).fit(x_train, y_train)\n    pymilo_classification_test(local_outlier_factor, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_neighbors/nearest_centroid.py",
    "content": "from sklearn.neighbors import NearestCentroid\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"NearestCentroid\"\n\ndef nearest_centroid():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    nearest_centroid = NearestCentroid().fit(x_train, y_train)\n    pymilo_classification_test(nearest_centroid, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_neighbors/nearest_neighbor.py",
    "content": "from sklearn.neighbors import NearestNeighbors\n\nfrom pymilo.utils.test_pymilo import pymilo_nearest_neighbor_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"NearestNeighbors\"\n\ndef nearest_neighbor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    nearest_neighbor = NearestNeighbors(n_neighbors=2, radius=0.4).fit(x_train, y_train)\n    pymilo_nearest_neighbor_test(nearest_neighbor, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_neighbors/radius_neighbors_classifier.py",
    "content": "from sklearn.neighbors import RadiusNeighborsClassifier\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"RadiusNeighborsClassifier\"\n\ndef radius_neighbors_classifier():\n    x_train, y_train, _, _ = prepare_simple_classification_datasets()\n    radius_neighbors_classifier = RadiusNeighborsClassifier(radius=1.0).fit(x_train, y_train)\n    pymilo_classification_test(radius_neighbors_classifier, MODEL_NAME, (x_train, y_train))\n"
  },
  {
    "path": "tests/test_neighbors/radius_neighbors_regressor.py",
    "content": "from sklearn.neighbors import RadiusNeighborsRegressor\n\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"RadiusNeighborsRegressor\"\n\ndef radius_neighbors_regressor():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    radius_neighbors_regressor = RadiusNeighborsRegressor(radius=1.0).fit(x_train, y_train)\n    pymilo_regression_test(radius_neighbors_regressor, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_neighbors/test_neighbors.py",
    "content": "import os\nimport pytest\n\nfrom kneighbors_regressor import kneighbors_regressor\nfrom kneighbors_classifier import kneighbors_classifier\nfrom radius_neighbors_regressor import radius_neighbors_regressor\nfrom radius_neighbors_classifier import radius_neighbors_classifier\nfrom nearest_neighbor import nearest_neighbor\nfrom nearest_centroid import nearest_centroid\nfrom local_outlier_factor import local_outlier_factor\n\nNEIGHBORS = {\n    \"KNeighbors\": [kneighbors_regressor, kneighbors_classifier],\n    \"RadiusNeighbors\": [radius_neighbors_regressor, radius_neighbors_classifier],\n    \"Nearests\": [nearest_neighbor, nearest_centroid],\n    \"LocalOutlierDetectors\": [local_outlier_factor],\n}\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), \"tests\", \"exported_neighbors\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        # construct full file path\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for category in NEIGHBORS:\n        for model in NEIGHBORS[category]:\n            model()\n"
  },
  {
    "path": "tests/test_neural_networks/bernoulli_rbm/bernoulli_rbm.py",
    "content": "import os\n\nfrom sklearn import metrics\nfrom sklearn.base import clone\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.neural_network import BernoulliRBM\nfrom sklearn.linear_model import LogisticRegression\n\nfrom pymilo.pymilo_obj import Export\nfrom pymilo.pymilo_obj import Import\nfrom pymilo.utils.test_pymilo import pymilo_export_path\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"Bernoulli Restricted Boltzmann Machine (RBM)\"\n\ndef bernoulli_rbm():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    # Create Bernoulli RBM object\n    logistic = LogisticRegression(solver=\"newton-cg\", tol=1)\n    rbm = BernoulliRBM(random_state=0, verbose=True)\n    rbm_features_classifier = Pipeline(steps=[(\"rbm\", rbm), (\"logistic\", logistic)])\n\n    # Hyper-parameters. These were set by cross-validation,\n    # using a GridSearchCV. Here we are not performing cross-validation to\n    # save time.\n    rbm.learning_rate = 0.06\n    rbm.n_iter = 10\n\n    # More components tend to give better prediction performance, but larger\n    # fitting time\n    rbm.n_components = 100\n    logistic.C = 6000\n\n    # Training RBM-Logistic Pipeline\n    rbm_features_classifier.fit(x_train, y_train)\n\n    # Training the Logistic regression classifier directly on the pixel\n    raw_pixel_classifier = clone(logistic)\n    raw_pixel_classifier.C = 100.0\n    raw_pixel_classifier.fit(x_train, y_train)\n\n    Y_pred = rbm_features_classifier.predict(x_test)\n    before_report = metrics.classification_report(y_test, Y_pred)\n\n    export_model_path = pymilo_export_path(rbm)\n    exported_model = Export(rbm)\n    exported_model_serialized_path = os.path.join(\n        os.getcwd(), \"tests\", export_model_path, MODEL_NAME + '.json')\n    exported_model.save(exported_model_serialized_path)\n\n    imported_model = Import(exported_model_serialized_path)\n    imported_rbm = 
imported_model.to_model()\n\n    logistic = LogisticRegression(solver=\"newton-cg\", tol=1)\n    rbm_features_classifier = Pipeline(steps=[(\"rbm\", imported_rbm), (\"logistic\", logistic)])\n    logistic.C = 6000\n\n    # Training RBM-Logistic Pipeline\n    rbm_features_classifier.fit(x_train, y_train)\n\n    # Training the Logistic regression classifier directly on the pixel\n    raw_pixel_classifier = clone(logistic)\n    raw_pixel_classifier.C = 100.0\n    raw_pixel_classifier.fit(x_train, y_train)\n\n    Y_pred = rbm_features_classifier.predict(x_test)\n    after_report = metrics.classification_report(y_test, Y_pred)\n\n    assert before_report == after_report\n\n\n\n\n\n"
  },
  {
    "path": "tests/test_neural_networks/mlp/mlp_classification.py",
    "content": "from sklearn.neural_network import MLPClassifier\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"Multi Layer Perceptron Classification\"\n\n\ndef multi_layer_perceptron_classification():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    # Create MLPClassifier object\n    multi_layer_perceptron_classifier = MLPClassifier(random_state=1, max_iter=500).fit(x_train, y_train)\n    # Train the model using the training sets\n    multi_layer_perceptron_classifier.fit(x_train, y_train)\n    assert pymilo_classification_test(\n        multi_layer_perceptron_classifier, MODEL_NAME, (x_test, y_test)) == True\n"
  },
  {
    "path": "tests/test_neural_networks/mlp/mlp_regression.py",
    "content": "from sklearn.neural_network import MLPRegressor\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"Multi Layer Perceptron Regression\"\n\n\ndef multi_layer_perceptron_regression():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    multi_layer_perceptron_regression = MLPRegressor(random_state=1, max_iter=500).fit(x_train, y_train)\n    # Train the model using the training sets\n    multi_layer_perceptron_regression.fit(x_train, y_train)\n    assert pymilo_regression_test(\n        multi_layer_perceptron_regression, MODEL_NAME, (x_test, y_test)) == True\n\n"
  },
  {
    "path": "tests/test_neural_networks/test_neural_networks.py",
    "content": "import os\nimport pytest\n\nfrom mlp.mlp_regression import multi_layer_perceptron_regression\nfrom mlp.mlp_classification import multi_layer_perceptron_classification\n\nfrom bernoulli_rbm.bernoulli_rbm import bernoulli_rbm\n\nNEURAL_NETWORKS = {\n    \"MLP_REGRESSION\": [multi_layer_perceptron_regression, multi_layer_perceptron_classification],\n    \"BERNOULLI_RBM\": [bernoulli_rbm]\n}\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), \"tests\", \"exported_neural_networks\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        # construct full file path\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for category in NEURAL_NETWORKS:\n        for model in NEURAL_NETWORKS[category]:\n            model()\n"
  },
  {
    "path": "tests/test_preprocessings/binarizer.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import Binarizer\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"Binarizer\"\n\ndef binarizer():\n    X = [[ 1., -1.,  2.],\n        [ 2.,  0.,  0.],\n        [ 0.,  1., -1.]]\n\n    _binarizer = Binarizer().fit(X)\n    pre_result = _binarizer.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(_binarizer),\n            get_path(MODEL_NAME)))\n\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/function_transformer.py",
    "content": "from numpy import array_equal, log1p\nfrom sklearn.preprocessing import FunctionTransformer\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom pymilo.utils.test_pymilo import report_status\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"FunctionTransformer\"\n\ndef function_transformer():\n    f = log1p\n    X = [[0, 1], [2, 3]]\n\n    _function_transformer = FunctionTransformer(f).fit(X)\n    pre_result = _function_transformer.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(_function_transformer),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/kbins_discretizer.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import KBinsDiscretizer\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom pymilo.utils.test_pymilo import report_status\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"KBinsDiscretizer\"\n\ndef kbins_discretizer():\n    X = [[-2, 1, -4,   -1],\n        [-1, 2, -3, -0.5],\n        [ 0, 3, -2,  0.5],\n        [ 1, 4, -1,    2]]\n    est = KBinsDiscretizer(\n        n_bins=3, encode='ordinal', strategy='uniform'\n    )\n    est = est.fit(X)\n    pre_result = est.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(est),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/kernel_centerer.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import KernelCenterer\nfrom sklearn.metrics.pairwise import pairwise_kernels\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"KernelCenterer\"\n\ndef kernel_centerer():\n    X = [[ 1., -2., 2.],\n         [-2., 1., 3.],\n         [ 4., 1., -2.]]\n    kernel = pairwise_kernels(X, metric='linear')\n    _kernel_centerer = KernelCenterer().fit(kernel)\n    pre_result = _kernel_centerer.transform(kernel)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(_kernel_centerer),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(kernel)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/label_binarizer.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import LabelBinarizer\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom pymilo.utils.test_pymilo import report_status\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"LabelBinarizer\"\n\ndef label_binarizer():\n    X = ['yes', 'no', 'no', 'yes']\n\n    lb = LabelBinarizer().fit(X)\n    pre_result = lb.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(lb),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/label_encoder.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import LabelEncoder\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom pymilo.utils.test_pymilo import report_status\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"LabelEncoder\"\n\ndef label_encoder():\n    X = [\"paris\", \"paris\", \"tokyo\", \"amsterdam\"]\n\n    le = LabelEncoder().fit(X)\n    pre_result = le.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(le),\n            get_path(MODEL_NAME)))\n\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/max_abs_scaler.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import MaxAbsScaler\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"MaxAbsScaler\"\n\ndef max_abs_scaler():\n    X = [[ 1., -1.,  2.],\n        [ 2.,  0.,  0.],\n        [ 0.,  1., -1.]]\n\n    _max_abs_scaler = MaxAbsScaler().fit(X)\n    pre_result = _max_abs_scaler.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(_max_abs_scaler),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/multilabel_binarizer.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import MultiLabelBinarizer\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"MultiLabelBinarizer\"\n\ndef multilabel_binarizer():\n    X = [{'sci-fi', 'thriller'}, {'comedy'}]\n\n    mlb = MultiLabelBinarizer().fit(X)\n    pre_result = mlb.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(mlb),\n            get_path(MODEL_NAME)))\n\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/normalizer.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import Normalizer\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"Normalizer\"\n\ndef normalizer():\n    X = [[4, 1, 2, 2],\n         [1, 3, 9, 3],\n         [5, 7, 5, 1]]\n\n    _normalizer = Normalizer().fit(X)\n    pre_result = _normalizer.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(_normalizer),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/one_hot_encoder.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import OneHotEncoder\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"OneHotEncoder\"\n\ndef one_hot_encoder():\n    X = [['Male', 1], ['Female', 3], ['Female', 2]]\n\n    _one_hot_encoder = OneHotEncoder(handle_unknown='ignore').fit(X)\n    pre_result = _one_hot_encoder.transform(X).toarray()\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(_one_hot_encoder),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X).toarray()\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/ordinal_encoder.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import OrdinalEncoder\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"OrdinalEncoder\"\n\ndef ordinal_encoder():\n    X = [['Male', 1], ['Female', 3], ['Female,', 2]]\n    _ordinal_encoder = OrdinalEncoder().fit(X)\n    pre_result = _ordinal_encoder.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(_ordinal_encoder),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/polynomial_features.py",
    "content": "from numpy import array_equal, arange\nfrom sklearn.preprocessing import PolynomialFeatures\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom pymilo.utils.test_pymilo import report_status\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"PolynomialFeatures\"\n\ndef polynomial_features():\n    X = arange(6).reshape(3, 2)\n\n    _polynomial_features = PolynomialFeatures().fit(X)\n    pre_result = _polynomial_features.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(_polynomial_features),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/power_transformer.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import PowerTransformer\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"PowerTransformer\"\n\ndef power_transformer():\n    power_transformer = PowerTransformer()\n    X = [[1, 2], [3, 2], [4, 5]]\n    power_transformer = power_transformer.fit(X)\n    pre_result = power_transformer.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(power_transformer),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/quantile_transformer.py",
    "content": "from numpy import array_equal, random, sort\nfrom sklearn.preprocessing import QuantileTransformer\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"QuantileTransformer\"\n\ndef quantile_transformer():\n    rng = random.RandomState(0)\n    X = sort(rng.normal(loc=0.5, scale=0.25, size=(25, 1)), axis=0)\n\n    _quantile_transformer = QuantileTransformer(n_quantiles=10, random_state=0).fit(X)\n    pre_result = _quantile_transformer.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(_quantile_transformer),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/robust_scaler.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import RobustScaler\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"RobustScaler\"\n\ndef robust_scaler():\n    X = [[ 1., -2.,  2.],\n        [ -2.,  1.,  3.],\n        [ 4.,  1., -2.]]\n\n    _robust_scaler = RobustScaler().fit(X)\n    pre_result = _robust_scaler.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(_robust_scaler),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/spline_transformer.py",
    "content": "from numpy import array_equal, arange\nfrom sklearn.preprocessing import SplineTransformer\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"SplineTransformer\"\n\ndef spline_transformer():\n    X = arange(6).reshape(6, 1)\n    spline = SplineTransformer(degree=2, n_knots=3)\n    pre_result = spline.fit_transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(spline),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.fit_transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/standard_scaler.py",
    "content": "from numpy import array_equal\nfrom sklearn.preprocessing import StandardScaler\nfrom pymilo.utils.test_pymilo import report_status\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"StandardScaler\"\n\ndef standard_scaler():\n    X = [[0, 0], [0, 0], [1, 1], [1, 1]]\n\n    _standard_scaler = StandardScaler().fit(X)\n    pre_result = _standard_scaler.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(_standard_scaler),\n            get_path(MODEL_NAME)))\n    post_result = post_pymilo_pre_model.transform(X)\n\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/target_encoder.py",
    "content": "from numpy import array_equal, array\nfrom sklearn.preprocessing import TargetEncoder\nfrom pymilo.transporters.preprocessing_transporter import PreprocessingTransporter\nfrom pymilo.utils.test_pymilo import report_status\nfrom util import get_path, write_and_read\n\nMODEL_NAME = \"TargetEncoder\"\n\ndef target_encoder():\n    X = array([[\"dog\"] * 20 + [\"cat\"] * 30 + [\"snake\"] * 38], dtype=object).T\n    y = [90.3] * 5 + [80.1] * 15 + [20.4] * 5 + [20.1] * 25 + [21.2] * 8 + [49] * 30\n    enc_auto = TargetEncoder(smooth=\"auto\")\n    enc_auto = enc_auto.fit(X, y)\n    pre_result = enc_auto.transform(X)\n\n    pt = PreprocessingTransporter()\n    post_pymilo_pre_model = pt.deserialize_pre_module(\n        write_and_read(\n            pt.serialize_pre_module(enc_auto),\n            get_path(MODEL_NAME)))\n\n    post_result = post_pymilo_pre_model.transform(X)\n    comparison_result = array_equal(pre_result, post_result)\n    report_status(comparison_result, MODEL_NAME)\n    assert comparison_result\n"
  },
  {
    "path": "tests/test_preprocessings/test_preprocessings.py",
    "content": "import os\nimport pytest\nfrom pymilo.pymilo_param import SKLEARN_PREPROCESSING_TABLE, NOT_SUPPORTED\n\nfrom one_hot_encoder import one_hot_encoder\nfrom label_binarizer import label_binarizer\nfrom label_encoder import label_encoder\nfrom standard_scaler import standard_scaler\nfrom binarizer import binarizer\nfrom function_transformer import function_transformer\nfrom kernel_centerer import kernel_centerer\nfrom multilabel_binarizer import multilabel_binarizer\nfrom max_abs_scaler import max_abs_scaler\nfrom normalizer import normalizer\nfrom ordinal_encoder import ordinal_encoder\nfrom polynomial_features import polynomial_features\nfrom robust_scaler import robust_scaler\nfrom quantile_transformer import quantile_transformer\nfrom kbins_discretizer import kbins_discretizer\nfrom power_transformer import power_transformer\n\nif SKLEARN_PREPROCESSING_TABLE[\"SplineTransformer\"] != NOT_SUPPORTED:\n    from spline_transformer import spline_transformer\nif SKLEARN_PREPROCESSING_TABLE[\"TargetEncoder\"] != NOT_SUPPORTED:\n    from target_encoder import target_encoder\n\nPREPROCESSINGS = [one_hot_encoder,\n                  label_binarizer,\n                  label_encoder,\n                  standard_scaler,\n                  binarizer,\n                  function_transformer,\n                  kernel_centerer,\n                  multilabel_binarizer,\n                  max_abs_scaler,\n                  normalizer,\n                  ordinal_encoder,\n                  polynomial_features,\n                  robust_scaler,\n                  quantile_transformer,\n                  kbins_discretizer,\n                  power_transformer,\n                  spline_transformer if SKLEARN_PREPROCESSING_TABLE[\"SplineTransformer\"] != NOT_SUPPORTED else (None, \"SplineTransformer\"),\n                  target_encoder if SKLEARN_PREPROCESSING_TABLE[\"TargetEncoder\"] != NOT_SUPPORTED else (None, \"TargetEncoder\")\n                  
]\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), \"tests\", \"exported_preprocessings\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        # construct full file path\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for model in PREPROCESSINGS:\n        if isinstance(model, tuple):\n            func, model_name = model\n            if func is None:\n                print(\"Model: \" + model_name + \" is not supported in this python version.\")\n                continue\n        model()\n"
  },
  {
    "path": "tests/test_preprocessings/util.py",
    "content": "import os\nimport json\n\ndef write_and_read(serialized_model, file_addr):\n    with open(file_addr, 'w') as fp:\n        fp.write(json.dumps(serialized_model, indent=4))\n    with open(file_addr, 'r') as fp:\n        return json.load(fp)\n\ndef get_path(model_name):\n    return  os.path.join(os.getcwd(), \"tests\", \"exported_preprocessings\", model_name + \".json\")\n"
  },
  {
    "path": "tests/test_svms/linear_svc.py",
    "content": "from sklearn.svm import LinearSVC\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"LinearSVC\"\n\ndef linear_svc():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    linear_svc = LinearSVC(random_state=0, tol=1e-5).fit(x_train, y_train)\n    pymilo_classification_test(linear_svc, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_svms/linear_svr.py",
    "content": "from sklearn.svm import LinearSVR\n\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"LinearSVR\"\n\ndef linear_svr():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    linear_svr = LinearSVR(random_state=3, tol=1e-5).fit(x_train, y_train)\n    pymilo_regression_test(linear_svr, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_svms/nu_svc.py",
    "content": "from sklearn.svm import NuSVC\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"NuSVC\"\n\ndef nu_svc():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    nu_svc = NuSVC().fit(x_train, y_train)\n    pymilo_classification_test(nu_svc, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_svms/nu_svr.py",
    "content": "from sklearn.svm import NuSVR\n\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"NuSVC\"\n\ndef nu_svr():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    nu_svr = NuSVR(C=1.0, nu=0.1).fit(x_train, y_train)\n    pymilo_regression_test(nu_svr, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_svms/one_class_svm.py",
    "content": "from sklearn.svm import OneClassSVM\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"OneClassSVM\"\n\ndef one_class_svm():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    one_class_svm = OneClassSVM(gamma='auto').fit(x_train, y_train)\n    pymilo_classification_test(one_class_svm, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_svms/svc.py",
    "content": "from sklearn.svm import SVC\n\nfrom pymilo.utils.test_pymilo import pymilo_classification_test\nfrom pymilo.utils.data_exporter import prepare_simple_classification_datasets\n\nMODEL_NAME = \"SVC\"\n\ndef svc():\n    x_train, y_train, x_test, y_test = prepare_simple_classification_datasets()\n    svc = SVC(gamma='auto').fit(x_train, y_train)\n    pymilo_classification_test(svc, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_svms/svr.py",
    "content": "from sklearn.svm import SVR\n\nfrom pymilo.utils.test_pymilo import pymilo_regression_test\nfrom pymilo.utils.data_exporter import prepare_simple_regression_datasets\n\nMODEL_NAME = \"SVR\"\n\ndef svr():\n    x_train, y_train, x_test, y_test = prepare_simple_regression_datasets()\n    svr = SVR(C=1.0, epsilon=0.2).fit(x_train, y_train)\n    pymilo_regression_test(svr, MODEL_NAME, (x_test, y_test))\n"
  },
  {
    "path": "tests/test_svms/test_svms.py",
    "content": "import os\nimport pytest\n\nfrom linear_svc import linear_svc\nfrom linear_svr import linear_svr\nfrom nu_svc import nu_svc\nfrom nu_svr import nu_svr\nfrom one_class_svm import one_class_svm\nfrom svc import svc\nfrom svr import svr\n\nSVMS = {\n    \"LINEAR\": [linear_svc, linear_svr],\n    \"Nu\": [nu_svc, nu_svr],\n    \"ONE_CLASS\": [one_class_svm],\n    \"SVC\": [svc],\n    \"SVR\": [svr],\n}\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef reset_exported_models_directory():\n    exported_models_directory = os.path.join(\n        os.getcwd(), \"tests\", \"exported_svms\")\n    if not os.path.isdir(exported_models_directory):\n        os.mkdir(exported_models_directory)\n        return\n    for file_name in os.listdir(exported_models_directory):\n        # construct full file path\n        json_file = os.path.join(exported_models_directory, file_name)\n        if os.path.isfile(json_file):\n            os.remove(json_file)\n\ndef test_full():\n    for category in SVMS:\n        for model in SVMS[category]:\n            model()\n"
  }
]