[
  {
    "path": ".circleci/config.yml",
    "content": "# Python CircleCI 2.0 configuration file\nversion: 2\njobs:\n  build:\n    docker:\n      - image: circleci/python:3.7.3\n\n    working_directory: ~/repo\n\n    steps:\n      - checkout\n\n      - restore_cache:\n          keys:\n          - env-build\n\n      - run:\n          name: setup env\n          command: |\n            python3 -m venv venv\n            . venv/bin/activate\n            pip install --upgrade pip\n            pip install ./[tensorboard,tests,docs]\n      - save_cache:\n          paths:\n            - ./venv\n          key: env-build\n\n      - run:\n          name: run tests\n          command: |\n            . venv/bin/activate\n            py.test --verbose --runslow hypertunity\n      - store_artifacts:\n          path: test-reports\n          destination: test-reports\n"
  },
  {
    "path": ".gitignore",
    "content": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\n*.egg-info/\n.installed.cfg\n*.egg\nMANIFEST\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit tests / coverage reports\nhtmlcov/\n.tox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*.cover\n.hypothesis/\n.pytest_cache/\n\n# Jupyter Notebook\n.ipynb_checkpoints\n\n# Environments\n.venv*\n\n# Pycharm project settings\n.idea\n\n# mkdocs documentation\n/site\n\n# mypy\n.mypy_cache/\n\n# Sphinx documentation\n/docs/_build\n"
  },
  {
    "path": ".readthedocs.yml",
    "content": "# Read the Docs configuration file\n# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details\n\nversion: 2\n\n# Sphinx settings\nsphinx:\n  builder: html\n  configuration: docs/conf.py\n  fail_on_warning: true\n\n# Python settings\npython:\n   version: 3.7\n   install:\n      - method: pip\n        path: .\n        extra_requirements:\n            - docs\n"
  },
  {
    "path": "CHANGELOG.md",
    "content": "# Changelog\nAll notable changes to this project will be documented in this file.\n\n## [Unreleased]\n\n## [1.0.1] - 2020-01-27\n## Changed\n- some code style related changes are applied, such as import sorting and line length shortening.\n- refactoring in tests to use pytest parameterisation and fixtures.\n\n## Fixed\n- issue with running callables from script thanks to David Turner (https://github.com/gdikov/hypertunity/pull/43).\n- issue with tensorflow version comparison in the tensorboard reporter.\n\n## [1.0.0] - 2019-11-10\n## Added\n- `Reporter` instance can be loaded with data from the database of another reporter using a `from_database()` method.\n- data from a `Reporter` instance can be exported into a `HistoryPoint` list to load into an optimiser.\n- compiled documentation and logo.\n- `BayesianOptimisation` raises `ExhaustedSearchSpaceError` if a discrete domain is exhausted.\n\n## Changed\n- minor fixes in documentation typos, argument names and tests.\n- `Domain` is moved from `hypertunity.optimisation` to the `hypertunity` package.\n- rename `TableReporter` to `Table` and `TensorboardReporter` to `Tensorboard`.\n- `ExhaustedSearchSpaceError` is moved from `optimisation.exhastive` to `optimisation.base` module.\n- `Trial` running a task from a job is now done with dict as input keyword arguments or named command line arguments.\n\n## Fixed\n- bug in `BayesianOptimisation` sample conversion for nested dictionaries.\n- bug in `BayesianOptimisation` type preserving between the domain and the sample value.\n- bug in `Tensorboard` reporter for real intervals with integer boundaries. 
\n- bug in `Reporter` for not using the default metric name during logging.\n\n## [0.4.0] - 2019-09-15\n### Added\n- `Trial`, a wrapper class for high-level usage, which runs the optimiser, evaluates the objective\n by scheduling jobs, updates the optimiser and summarises the results.\n- a `Job` from a script with command line arguments can now be run with \n named arguments passed as a dictionary instead of a tuple.\n- checkpointing of results on disk when calling `log()` or a `Reporter` object.\n- optimisation history can now be loaded into an `Optimiser`. An example use case would be to warm-start\n`BayesianOptimisation` from the history of the quicker `RandomSearch`.\n\n### Changed\n- every `Reporter` instance has a `primary_metric` attribute, which is an argument to `__init__`.\n\n### Fixed\n- validation of `Domain` is not allowing for intervals with more than 2 numbers.\n- minor bugs in tests.\n\n## [0.3.1] - 2019-09-10\n### Fixed\n- `Optimiser.update()` now accepts evaluation arguments that are float, `EvaluationScore` or a dict\n with metric names and floats or `EvaluationScore`s. This is valid for all optimisers. \n\n## [0.3.0] - 2019-09-08\n### Added\n- `Job` can now be scheduled locally to run command line scripts with arguments.\n- `BayesianOptimisation.run_step` can pass arguments to the backend for better customisation.\n\n### Changed\n- any `Reporter` object can be fed with data from a tuple of a \n`Sample` object and a score, which can be a float or an `EvaluationScore`.\n- `BayesianOptimisation` optimiser can be updated with a `Sample` and \na float or `EvaluationScore` objective evaluation types.\n- a discrete/categorical `Domain` is defined with a set literal instead of a tuple.\n- `Job` supports running functions from within a script by specifying 'script_path::func_name'.\n- `batch_size` is no longer an attribute of an `Optimiser` but an argument to `run_step`. 
\n- `minimise` is no longer an attribute of `BayesianOptimisation` but an argument to `run_step`.\n\n## [0.2.0] - 2019-08-28\n### Added\n- `Scheduler` to run jobs locally using joblib.\n- `SlurmJob` and `Job` dataclasses defining the tasks to be scheduled.\n- `Result` dataclass encapsulating the results from the tasks.\n- `TableReporter` class for reporting results in tabular format.\n- `Reporter` base class for extending reporters.\n\n### Changed\n- `Base`-prefix is removed from all base classes which reside \n in `base.py` modules.\n- `split_by_type` is now a method of the `Domain` class.\n- `Optimiser` has a `batch_size` attribute accessible as a property.\n\n### Removed\n- `optimisation.bo` package has been removed. Instead a single `bo.py`\n module supports the only BO backend---GPyOpt, as of now.\n- prefix for the file encoding (default is utf-8).\n \n## [0.1.0] - 2019-07-27\n### Added\n- `TensorboardReporter` result logger using `HParams`.\n- `GpyOpt` backend for `BayesianOptimisation`.\n- `RandomSearch` optimiser.\n- `GridSearch` optimiser.\n- `Domain` and `Sample` classes as foundations for the optimisers.\n"
  },
  {
    "path": "LICENSE",
    "content": "                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      \"License\" shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      \"Licensor\" shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      \"Legal Entity\" shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      \"control\" means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      \"You\" (or \"Your\") shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      \"Source\" form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      \"Object\" form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      \"Work\" shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      \"Derivative Works\" shall mean any work, whether in Source or Object\n      
form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      \"Contribution\" shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, \"submitted\"\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as \"Not a Contribution.\"\n\n      \"Contributor\" shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a \"NOTICE\" text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an \"AS IS\" BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets \"[]\"\n      replaced with your own identifying information. 
(Don't include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same \"printed page\" as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright [yyyy] [name of copyright owner]\n\n   Licensed under the Apache License, Version 2.0 (the \"License\");\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an \"AS IS\" BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n"
  },
  {
    "path": "README.md",
    "content": "<div align=\"center\">\n  <img src=\"https://raw.githubusercontent.com/gdikov/hypertunity/master/docs/_static/images/logo.svg?sanitize=true\" width=\"100%\">\n</div>\n\n[![CircleCI](https://img.shields.io/circleci/build/github/gdikov/hypertunity)](https://circleci.com/gh/gdikov/hypertunity)\n[![Documentation Status](https://readthedocs.org/projects/hypertunity/badge/?version=latest)](https://hypertunity.readthedocs.io/en/latest/?badge=latest)\n![GitHub](https://img.shields.io/github/license/gdikov/hypertunity)\n\n## Why Hypertunity\n\nHypertunity is a lightweight, high-level library for hyperparameter optimisation. \nAmong others, it supports:\n * Bayesian optimisation by wrapping [GPyOpt](http://sheffieldml.github.io/GPyOpt/),\n * external or internal objective function evaluation by a scheduler, also compatible with [Slurm](https://slurm.schedmd.com),\n * real-time visualisation of results in [Tensorboard](https://www.tensorflow.org/tensorboard) \n via the [HParams](https://www.tensorflow.org/tensorboard/r2/hyperparameter_tuning_with_hparams) plugin.\n\nFor the full set of features refer to the [documentation](https://hypertunity.readthedocs.io).\n\n## Quick start\n\nDefine the objective function to optimise. 
For example, it can take the hyperparameters `params` as input and \nreturn a raw value `score` as output:\n\n```python\nimport hypertunity as ht\n\ndef foo(**params) -> float:\n    # do some very costly computations\n    ...\n    return score\n```\n\nTo define the valid ranges for the values of `params` we create a `Domain` object:\n\n```python\ndomain = ht.Domain({\n    \"x\": [-10., 10.],         # continuous variable within the interval [-10., 10.]\n    \"y\": {\"opt1\", \"opt2\"},    # categorical variable from the set {\"opt1\", \"opt2\"}\n    \"z\": set(range(4))        # discrete variable from the set {0, 1, 2, 3}\n})\n```\n\nThen we set up the optimiser:\n\n```python\nbo = ht.BayesianOptimisation(domain=domain)\n```\n\nAnd we run the optimisation for 10 steps. Each result is used to update the optimiser so that informed domain \nsamples are drawn:\n\n```python\nn_steps = 10\nfor i in range(n_steps):\n    samples = bo.run_step(batch_size=2, minimise=True)      # suggest next samples\n    evaluations = [foo(**s.as_dict()) for s in samples]     # evaluate foo\n    bo.update(samples, evaluations)                         # update the optimiser\n```\n\nFinally, we visualise the results in Tensorboard: \n\n```python\nimport hypertunity.reports.tensorboard as tb\n\nresults = tb.Tensorboard(domain=domain, metrics=[\"score\"], logdir=\"path/to/logdir\")\nresults.from_history(bo.history)\n```\n\n## Even quicker start\n\nA high-level wrapper class `Trial` allows for seamless parallel optimisation\nwithout bothering with scheduling jobs, updating optimisers and logging:\n   \n```python\ntrial = ht.Trial(objective=foo,\n                 domain=domain,\n                 optimiser=\"bo\",\n                 reporter=\"tensorboard\",\n                 metrics=[\"score\"])\ntrial.run(n_steps, batch_size=2, n_parallel=2)\n```\n\n## Installation\n\n### Using PyPI\nTo install the base version run:\n```bash\npip install hypertunity\n```\nTo use the Tensorboard dashboard, build 
the docs or run the test suite, you will need the following extras:\n```bash\npip install hypertunity[tensorboard,docs,tests]\n```\n\n### From source\nCheck out the latest master and install locally:\n```bash\ngit clone https://github.com/gdikov/hypertunity.git\ncd hypertunity\npip install ./[tensorboard]\n```\n"
  },
  {
    "path": "conftest.py",
    "content": "import pytest\n\n\ndef pytest_addoption(parser):\n    parser.addoption(\n        \"--runslow\",\n        action=\"store_true\",\n        default=False,\n        help=\"run slow tests\"\n    )\n    parser.addoption(\n        \"--runslurm\",\n        action=\"store_true\",\n        default=False,\n        help=\"run slurm tests\"\n    )\n\n\ndef pytest_configure(config):\n    config.addinivalue_line(\n        \"markers\", \"slow: mark test as slow to run\"\n    )\n    config.addinivalue_line(\n        \"markers\", \"slurm: mark test which require slurm to run\"\n    )\n\n\ndef pytest_collection_modifyitems(config, items):\n    def mark_skip(keyword):\n        if config.getoption(f\"--run{keyword}\"):\n            return\n        skip = pytest.mark.skip(reason=f\"need --run{keyword} option to run\")\n        for item in items:\n            if keyword in item.keywords:\n                item.add_marker(skip)\n\n    mark_skip(\"slow\")\n    mark_skip(\"slurm\")\n"
  },
  {
    "path": "docs/Makefile",
    "content": "# Minimal makefile for Sphinx documentation\n#\n\n# You can set these variables from the command line, and also\n# from the environment for the first two.\nSPHINXOPTS    ?=\nSPHINXBUILD   ?= sphinx-build\nSOURCEDIR     = .\nBUILDDIR      = _build\n\n# Put it first so that \"make\" without argument is like \"make help\".\nhelp:\n\t@$(SPHINXBUILD) -M help \"$(SOURCEDIR)\" \"$(BUILDDIR)\" $(SPHINXOPTS) $(O)\n\n.PHONY: help Makefile\n\n# Catch-all target: route all unknown targets to Sphinx using the new\n# \"make mode\" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).\n%: Makefile\n\t@$(SPHINXBUILD) -M $@ \"$(SOURCEDIR)\" \"$(BUILDDIR)\" $(SPHINXOPTS) $(O)\n"
  },
  {
    "path": "docs/conf.py",
    "content": "# Configuration file for the Sphinx documentation builder.\n#\n# This file only contains a selection of the most common options. For a full\n# list see the documentation:\n# https://www.sphinx-doc.org/en/master/usage/configuration.html\n\n# -- Path setup --------------------------------------------------------------\n\n# If extensions (or modules to document with autodoc) are in another directory,\n# add these directories to sys.path here. If the directory is relative to the\n# documentation root, use os.path.abspath to make it absolute, like shown here.\n#\nimport os\nimport sys\n\n\nsys.path.insert(0, os.path.abspath('..'))\nimport hypertunity\n\n# The short X.Y version.\nversion = '.'.join(hypertunity.__version__.split('.', 2)[:2])\n# The full version, including alpha/beta/rc tags.\nrelease = hypertunity.__version__\n\n\n# -- Project information -----------------------------------------------------\n\nproject = 'Hypertunity'\ncopyright = '2019, Georgi Dikov'\nauthor = 'Georgi Dikov'\n\n\n# -- General configuration ---------------------------------------------------\n\n# Add any Sphinx extension module names here, as strings. 
They can be\n# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom\n# ones.\nextensions = [\n    'sphinx.ext.autodoc',\n    'sphinx.ext.autosummary',\n    'sphinx.ext.napoleon',\n    'sphinx.ext.viewcode'\n]\n\n# Napoleon settings\nnapoleon_google_docstring = True\nnapoleon_numpy_docstring = False\nnapoleon_include_init_with_doc = True\nnapoleon_include_private_with_doc = False\nnapoleon_include_special_with_doc = True\nnapoleon_use_admonition_for_examples = False\nnapoleon_use_admonition_for_notes = True\nnapoleon_use_admonition_for_references = True\nnapoleon_use_ivar = True\nnapoleon_use_param = True\nnapoleon_use_keyword = True\nnapoleon_use_rtype = True\n\nautodoc_typehints = 'none'\nautodoc_mock_imports = ['tensorflow', 'tensorboard']\n\n\nsource_suffix = '.rst'\n\n\n# Add any paths that contain templates here, relative to this directory.\ntemplates_path = ['_templates']\n\n# List of patterns, relative to source directory, that match files and\n# directories to ignore when looking for source files.\n# This pattern also affects html_static_path and html_extra_path.\nexclude_patterns = ['_build', 'Thumbs.db', '.DS_Store', 'test*']\n\n\n# -- Options for HTML output -------------------------------------------------\nhtml_theme = 'sphinx_rtd_theme'\n\npygments_style = 'sphinx'\nadd_module_names = False\n\n# Add any paths that contain custom static files (such as style sheets) here,\n# relative to this directory. They are copied after the builtin static files,\n# so a file named \"default.css\" will overwrite the builtin \"default.css\".\nhtml_static_path = ['_static']\n\n# this is needed as HTML5 causes an ugly rendering of the \"Parameters\", \"Returns\", etc. 
fields\nhtml4_writer = True\n\nhtml_theme_options = {\n    \"logo_only\": True,\n    'display_version': True,\n    'style_nav_header_background': '#002A3F',\n    # Toc options\n    'collapse_navigation': True\n}\n\nhtml_context = {\n    \"display_github\": True,     # Add 'Edit on Github' link instead of 'View page source'\n    # \"last_updated\": True,\n    # \"commit\": False,\n}\n\nhtml_logo = \"_static/images/logo_inverted.svg\"\nhtml_favicon = '_static/images/favicon.ico'\n\ngithub_url = \"https://github.com/gdikov/hypertunity\"\n"
  },
  {
    "path": "docs/index.rst",
    "content": ":github_url: https://github.com/gdikov/hypertunity\n\n.. image:: _static/images/logo.svg\n  :width: 800\n  :align: center\n  :alt: Hypertunity logo\n\n========\nWelcome!\n========\n\nHypertunity is a lightweight, high-level library for hyperparameter optimisation.\nAmong others, it supports:\n\n* Bayesian optimisation by wrapping `GPyOpt <http://sheffieldml.github.io/GPyOpt/>`_\n* external or internal objective evaluation using a scheduler, also compatible with `Slurm <https://slurm.schedmd.com>`_\n* real-time visualisation of results in `Tensorboard <https://www.tensorflow.org/tensorboard>`_ using the `HParams <https://www.tensorflow.org/tensorboard/r2/hyperparameter_tuning_with_hparams>`_ plugin.\n\nThe main guiding design principles are:\n\n* **Modular**: you can use any optimiser and reporter as well as schedule jobs locally or on Slurm without changes in the API.\n* **Simple**: the small codebase (just about 1000 LOC) and the flat subpackage hierarchy makes it easy to use, maintain and extend.\n* **Extensible**: base classes such as :class:`Optimiser`, :class:`Job` and :class:`Reporter` allow for seamless implementation of customized functionality.\n\n\n.. toctree::\n  :maxdepth: 2\n  :caption: User Guide\n\n  manual/installation\n  manual/quickstart\n  manual/domain\n  manual/optimisation\n  manual/reports\n  manual/scheduling\n\n\n.. toctree::\n  :maxdepth: 2\n  :caption: API Reference\n\n  source/hypertunity\n  source/optimisation\n  source/reports\n  source/scheduling\n\n\nIndices and tables\n------------------\n\n* :ref:`genindex`\n* :ref:`modindex`\n* :ref:`search`\n"
  },
  {
    "path": "docs/manual/domain.rst",
    "content": "Domain\n======\n\nThe set of all hyperparameters and the corresponding ranges of possible values is specified using the :class:`Domain` class.\nIt can be initialised with a dictionary mapping parameter names to continuous numeric intervals or discrete sets.\nThe former are given as python :obj:`list` and the latter---as :obj:`set`.\n\nFor example, to define a domain over the continuous interval [-10, 10] and the discrete set of\nstrings {\"option_1\", \"option_2\"}, it suffices to write:\n\n.. code-block:: python\n\n    domain = Domain({\"var_1\": [-10, 10], \"var_2\": {\"option_1\", \"option_2\"}})\n\nwhere ``\"var_1\"`` and ``\"var_2\"`` are two arbitrary names for the two subdomains.\n\nGiven this domain we can now generate samples from it using the :py:meth:`sample()` method:\n\n.. code-block:: python\n\n    >>> domain.sample()\n    {'var_1': -8.529187978165552, 'var_2': 'option_1'}\n\nThe returned objects are of class :class:`Sample` and represent one realisation of the domain.\nIt is represented as a mapping of parameter names to samples from the set of possible values.\nIt also has a handy conversion methods such as :py:meth:`as_dict()` or :py:meth:`as_namedtuple()` which enable accessing\nparameters using the `[\"var_1\"]` or `.var_1` notation.\n\nBoth :class:`Domain` and :class:`Sample` objects allow for nested subdomains, e.g.:\n\n.. code-block:: python\n\n    >>> domain = Domain({\n    ...    \"subdomain_a\": {\"var_1\": [-10, 10], \"var_2\": {\"option_1\", \"option_2\"}},\n    ...    \"subdomain_b\": {\"var_1\": [-1, 1], \"var_2\": {\"option_1\", \"option_2\"}}\n    ... })\n    >>> sample = domain.sample()\n    >>> sample\n    {\n        'subdomain_a': {'var_1': -6.892359956494582, 'var_2': 'option_2'},\n        'subdomain_b': {'var_1': 0.21004903180560652, 'var_2': 'option_1'}\n    }\n    >>> nt_sample = sample.as_namedtuple()\n    >>> nt_sample.subdomain_a.var_2\n    'option_2'\n"
  },
  {
    "path": "docs/manual/installation.rst",
    "content": "Installation\n============\n\nRequirements\n------------\n\nHypertunity has been tested with Python 3.6 and 3.7. As of now, there are no plans to support earlier versions of Python.\nThe reason for that is the usage of variable and function annotations, dataclasses as well as relying on the fact that the\ninsertion order of the keys in a dictionary is preserved during iteration. Porting Hypertunity to earlier versions will\nonly make it unnecessarily hard to maintain.\n\nFrom PyPI\n---------\n\nTo get the latest stable release just run:\n\n.. code-block:: bash\n\n    pip install hypertunity\n\nNote that this will install the basic version only, without support for Tensorboard visualisations.\nTo enable this feature you will need to specify the option `tensorboard`.\nTo run the tests or compile the docs add the `tests` and `docs` options respectively:\n\n.. code-block:: bash\n\n    pip install hypertunity[tensorboard,tests,docs]\n\n\nFrom source\n-----------\n\nTo install the bleeding-edge version of Hypertunity, clone the repository, checkout the master branch\nand install from source:\n\n.. code-block:: bash\n\n    git clone https://github.com/gdikov/hypertunity.git\n    cd hypertunity\n    git checkout master\n    pip install ./[tensorboard,tests,docs]\n"
  },
  {
    "path": "docs/manual/optimisation.rst",
    "content": "Optimisation\n============\n\nHypertunity ships with three types of hyperparameter space exploration algorithms. A Bayesian optimisation, random and\ngrid search. While the first one is sequential in nature and requires evaluations to update its internal model of the\nobjective function, so that more informed sample suggestions are generated, the latter two are able to generate all samples\nin parallel and do not require updating. In this section we will give a brief overview of each.\n\nBayesian optimisation\n---------------------\n\n:class:`BayesianOptimisation` in Hypertunity is a wrapper around `GPyOpt.methods.BayesianOptimization` which uses\nGaussian Process regression to build a surrogate model of the objective function. It is initialised from a :class:`Domain`\nobject:\n\n.. code-block:: python\n\n    bo = BayesianOptimization(domain)\n\nThe :class:`BayesianOptimisation` optimiser is highly customisable during sampling. This enables the user to\ndynamically refine the model during calling :py:meth:`run_step()`. This approach introduces however the computational\nburden of recomputing the surrogate model at each query. In the following example we show how one can set the GP model\nusing readily available ones from `GPy.models`, e.g. a `GPHeteroschedasticRegression`:\n\n.. code-block:: python\n\n    bo = BayesianOptimisation(domain=domain, seed=7)                    # initialise BO optimiser\n    kernel = GPy.kern.RBF(1) + GPy.kern.Bias(1)                         # create a custom kernel\n    custom_model = GPy.models.GPHeteroscedasticRegression(..., kernel)  # create a custom model\n    samples = bayes_opt.run_step(model=custom_model)                    # generate samples\n\n\nRandom search\n-------------\n\nThis class is a wrapper around the :py:meth:`Domain.sample()` method. 
It has the API of\nan :class:`Optimiser` class and yields samples which are uniformly drawn from the domain.\nThere is no limitation on the number of samples that can be returned in a single call of :py:meth:`run_step()`,\neven if this leads to repetitions.\n\n\nGrid search\n-----------\n\n:class:`GridSearch` is a wrapper around the iteration over a domain. It goes over each point in the Cartesian product of\nall discrete subdomains. If one of the subdomains is continuous, :class:`GridSearch` will sample uniformly from\nthis interval. Once the domain is exhausted, further iteration will be prevented by raising an :class:`ExhaustedSearchSpaceError`.\nTo iterate again, the :class:`GridSearch` optimiser must be reset by calling the :py:meth:`reset()` method.\n\n.. code-block:: python\n\n    >>> domain = Domain({\"x\": {1, 2, 3}, \"y\": {\"a\", \"b\"}, \"z\": [0, 1]})\n    >>> gs = GridSearch(domain, sample_continuous=True)\n    >>> gs.run_step(batch_size=6)\n    [\n        {'x': 1, 'y': 'b', 'z': 0.054781406913364084},\n        {'x': 2, 'y': 'b', 'z': 0.7006391867439882},\n        {'x': 3, 'y': 'b', 'z': 0.9674445624792569},\n        {'x': 1, 'y': 'a', 'z': 0.7837727333178091},\n        {'x': 2, 'y': 'a', 'z': 0.17240297136803384},\n        {'x': 3, 'y': 'a', 'z': 0.844465575155033}\n    ]\n    >>> gs.reset()\n\n\n\n\nCustom optimiser\n----------------\n\nIf none of the predefined optimisers suits your problem, you can easily roll out a custom one.\nThe only thing you have to do is to inherit from the base :class:`Optimiser` class and implement the :py:meth:`run_step` method.\n\n.. code-block:: python\n\n    class CustomOptimiser(Optimiser):\n        def __init__(self, domain, *args, **kwargs):\n            super(CustomOptimiser, self).__init__(domain)\n            ...\n\n        def run_step(self, batch_size, *args, **kwargs):\n            ...\n            return [samples]\n"
  },
  {
    "path": "docs/manual/quickstart.rst",
    "content": "Quick start\n===========\n\nA worked example\n~~~~~~~~~~~~~~~~\n\nLet's delve in into the API of Hypertunity by going through a worked example---neural network hyperparameter optimisation.\nIn the following we will tune the number of layers and units, the non-linearity type, as well as the dropout rate and the\nlearning rate of the optimiser.\n\n**Disclaimer:** This example serves a demonstration purpose only. It does not represent an advanced way of performing\nneural network architecture search!\n\nFirst thing we do it to import Hypertunity, tensorflow and numpy and define a helper data loading function:\n\n.. code-block:: python\n\n    import hypertunity as ht\n    import numpy as np\n    import tensorflow as tf\n\n    import hypertunity.reports.tensorboard as ht_tb\n\n\n    def load_mnist():\n        (train_x, train_y), (test_x, test_y) = tf.keras.datasets.mnist.load_data()\n        data_shape = train_x.shape[1:]\n        train_x = train_x.reshape(-1, np.prod(data_shape)).astype(np.float32) / 255.\n        mean_train = np.mean(train_x, axis=0)\n        train_x -= mean_train\n        test_x = test_x.reshape(-1, np.prod(data_shape)).astype(np.float32) / 255.\n        test_x -= mean_train\n        train_y = tf.keras.utils.to_categorical(train_y, num_classes=10)\n        test_y = tf.keras.utils.to_categorical(test_y, num_classes=10)\n        return (train_x, train_y), (test_x, test_y)\n\n\nNext we define a function that will build the model given the architectural hyperparameters and the learning rate,\nfollowed by the objective which will wrap the model building and evaluation:\n\n.. 
code-block:: python\n\n    def build_model(inp_size, out_size, n_layers, n_units, p_dropout, activation):\n        inp = tf.keras.Input(inp_size)\n        h = inp\n        for _ in range(n_layers - 1):\n            h = tf.keras.layers.Dense(n_units, activation=activation)(h)\n            h = tf.keras.layers.Dropout(rate=p_dropout)(h)\n        h = tf.keras.layers.Dense(out_size, activation=None)(h)\n        out = tf.keras.layers.Softmax()(h)\n        model = tf.keras.models.Model(inputs=inp, outputs=out)\n        return model\n\n\n    def objective_fn(**config) -> float:\n        (train_x, train_y), (test_x, test_y) = load_mnist()\n        model = build_model(train_x.shape[-1], train_y.shape[-1],\n                            config[\"arch\"][\"n_layers\"],\n                            config[\"arch\"][\"n_units\"],\n                            config[\"arch\"][\"p_dropout\"],\n                            config[\"arch\"][\"activation\"])\n        opt = tf.keras.optimizers.Adam(learning_rate=config[\"opt\"][\"lr\"])\n        model.compile(optimizer=opt, loss=\"categorical_crossentropy\")\n        model.fit(train_x, train_y, batch_size=100, epochs=1)\n        score = model.evaluate(test_x, test_y, batch_size=test_x.shape[0])\n        return score\n\nNow that we can build a model, we should define the ranges of possible values for these hyperparameters.\nThis can be done by creating a :class:`Domain` instance as follows:\n\n.. code-block:: python\n\n    domain = ht.Domain({\n        \"arch\": {\n            \"n_layers\": {1, 3, 5},\n            \"n_units\": {10, 50, 100, 500},\n            \"p_dropout\": [0, 0.9999],\n            \"activation\": {\"relu\", \"selu\", \"elu\"}\n        },\n        \"opt\": {\n            \"lr\": [1e-9, 1e-2]\n        }\n    })\n\nThe :class:`Domain` plays a central role in Hypertunity and we will make frequent use of it later as well.\nAn important related class is the :class:`Sample`. 
It can be thought of as one realisation of the variables from the domain,\nwhich in our case is one particular configuration of network hyperparameters.\n\nUsing the domain, we can set up the optimiser and the result visualiser also used for experiment logging.\nIn this case we use :class:`BayesianOptimisation` and :class:`Tensorboard` respectively:\n\n.. code-block:: python\n\n    optimiser = ht.BayesianOptimisation(domain)\n    tb_rep = ht_tb.Tensorboard(domain,\n                               metrics=[\"cross-entropy\"],\n                               logdir=\"./mnist_mlp\",\n                               database_path=\"./mnist_mlp\")\n\n\nAfter we create the :class:`Tensorboard` reporter we will be prompted to run `tensorboard --logdir=./mnist_mlp`\nin the console and open Tensorboard in the browser. We can do this also before we launch the actual optimisation.\n\nOne last bit before running it is the definition of the job schedule as well as optimiser and reporter update loop.\nThis is to ensure that samples are generated, experiments are run and the results used to improve the underlying model of the :class:`BayesianOptimisation` optimiser.\nTo schedule one experiment at a time, for 50 consecutive steps we create a :class:`Job` for each function call of ``objective_fn``\nwith a set of suggested hyperparameters:\n\n.. 
code-block:: python\n\n    n_steps = 50\n    batch_size = 1\n    with ht.Scheduler(n_parallel=batch_size) as scheduler:\n        for i in range(n_steps):\n            samples = optimiser.run_step(batch_size=batch_size, minimise=True)\n            jobs = [ht.Job(task=objective_fn, args=s.as_dict()) for s in samples]\n            scheduler.dispatch(jobs)\n            evaluations = [r.data for r in scheduler.collect(n_results=batch_size, timeout=100.0)]\n            optimiser.update(samples, evaluations)\n            for sample_evaluation_pair in zip(samples, evaluations):\n                tb_rep.log(sample_evaluation_pair)\n\nIf we have a look at the Tensorboard dashboard while this is running, we should be able to see results being updated live!\n\n.. image:: ../_static/images/tensorboard.gif\n  :width: 800\n  :align: center\n  :alt: Tensorboard\n\nEven quicker start\n~~~~~~~~~~~~~~~~~~\n\nA high-level wrapper class :class:`Trial` allows for seamless parallel optimisation without having to schedule jobs,\nupdate the optimiser or log results explicitly. The API is reduced to the minimum and yet remains flexible as\none can specify any optimiser or reporter:\n\n.. code-block:: python\n\n    trial = ht.Trial(objective=objective_fn,\n                     domain=domain,\n                     optimiser=\"bo\",\n                     reporter=\"tensorboard\",\n                     logdir=\"./mnist_mlp\",\n                     database_path=\"./mnist_mlp\",\n                     metrics=[\"cross-entropy\"])\n\n    trial.run(n_steps, batch_size=batch_size, n_parallel=batch_size)\n"
  },
  {
    "path": "docs/manual/reports.rst",
    "content": "Reports\n=======\n\nSaving and visualising progress can be accomplished by using :class:`Reporter` instance.\nThe reporter is supplied with data using the :py:meth:`log()` method which takes a tuple of a sample and score.\nOptionally one can store additional information about the current experiment, e.g. the output directory or the job id,\nusing the ``meta`` keyword argument:\n\n.. code-block:: python\n\n    for s, e, m in zip(samples, evaluations, meta_infos):\n        reporter.log((s, e), meta=m)\n\nTable\n-----\n\nHypertunity comes with a built-in reporter which organises the experiment results into an ascii table.\nIt is initialised from a domain and a list of metrics and can be viewed as a formatted string table by calling :obj:`str`\non the object.\nThe table can be sorted in ascending or descending order and the best results can be emphasised:\n\n.. code-block:: python\n\n    >>> domain = ht.Domain({\"x\": [-5., 6.], \"y\": {\"sin\", \"cos\"}, \"z\": set(range(4))})\n    >>> reporter = ht.Table(domain, metrics=[\"score\"])\n    >>> # run experiment and call reporter.log(...)\n    ...\n    >>> print(reporter.format(order=\"descending\"))\n    +=====+========+=====+===+==============+\n    | No. 
|   x    |  y  | z |    score     |\n    +=====+========+=====+===+==============+\n    |  6  | -4.35  | cos | 1 | 15.921 ± 0.0 |\n    +-----+--------+-----+---+--------------+\n    |  5  | -4.232 | cos | 3 | 8.906 ± 0.0  |\n    +-----+--------+-----+---+--------------+\n    |  4  | -4.588 | sin | 3 | 6.134 ± 0.0  |\n    +-----+--------+-----+---+--------------+\n    |  2  |  2.16  | cos | 0 | 4.667 ± 0.0  |\n    +-----+--------+-----+---+--------------+\n    |  3  | -0.977 | cos | 1 | -2.045 ± 0.0 |\n    +-----+--------+-----+---+--------------+\n    |  1  | -1.438 | cos | 3 | -6.933 ± 0.0 |\n    +-----+--------+-----+---+--------------+\n\nTensorboard\n-----------\n\nIf Hypertunity is installed with the `tensorboard` option, a suitable version of Tensorflow and Tensorboard will be installed.\nThis will enable a :class:`Tensorboard` reporter which, using the HParams plugin, will generate live visualisations\nas experiments are being logged. One can start the Tensorboard dashboard in the browser as usual, using the `logdir` supplied\nat initialisation.\n\nNote that to create a Tensorboard reporter one will have to import ``hypertunity.reports.tensorboard`` explicitly:\n\n.. code-block:: python\n\n    import hypertunity.reports.tensorboard as tb\n    tb_reporter = tb.Tensorboard(domain, metrics=[\"score\"], logdir=\"./logs\")\n\nSee the :doc:`quickstart` guide for a preview of the dashboard visualisation.\n"
  },
  {
    "path": "docs/manual/scheduling.rst",
    "content": "Scheduling jobs\n===============\n\nOften in practice the objective function is a python script that might take command line arguments as parameters or define a function that has lots of dependencies.\nImporting this function into the hyperparameter optimisation script or wrapping the target script involves some boilerplate code.\nTo help with that Hypertunity allows for specifying objective functions as ``Job`` instances which are then run in succession or in parallel using a ``Scheduler``.\nThe latter is a wrapper around `joblib <https://joblib.readthedocs.io>`_ and takes care of both running jobs and collecting results.\n\nScheduling of ``Job`` instances is done using the ``dispatch`` method of a ``Scheduler``:\n\n.. code-block:: python\n\n    jobs = [Job(...) for _ in range(10)]\n    scheduler.dispatch(jobs)\n    evaluations = [r.data for r in scheduler.collect(n_results=batch_size, timeout=10.0)]\n\nThere are multiple ways to define a job depending on the target to optimise.\n\nLocal python callable\n~~~~~~~~~~~~~~~~~~~~~\n\nIf the function is defined or imported within the hyperparameter optimisation script, the ``task`` argument is the callable instance.\nThe ``args`` is then a tuple of arguments or a dict of named arguments which are supplied to the task function during calling.\nFor example:\n\n.. code-block:: python\n\n    jobs = [ht.Job(task=foo, args=(*s.as_namedtuple(),)) for s in samples]\n\n\nPython callable in a script\n~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nIf the function to optimise resides in a script, Hypertunity allows for specifying a target by the full path to the script.\nTo select the objective function from the script append ``:`` and the function name:\n\n.. 
code-block:: python\n\n    jobs = [Job(task=\"path/to/script.py:foo\", args=(*s.as_namedtuple(),)) for s in samples]\n\n\nA script\n~~~~~~~~\n\nIf the objective function is a full command line application or a script that accepts the hyperparameters to tune as command line arguments, then you should create a job as follows:\n\n.. code-block:: python\n\n    jobs = [Job(task=\"path/to/script.py\",\n                args=(*s.as_namedtuple(),),\n                meta={\"binary\": \"python\"}) for s in samples]\n\n\nUsing Slurm\n~~~~~~~~~~~\n\nTo schedule jobs using Slurm a special job type is available. It allows configuring resources and other Slurm parameters but also requires that the target script is able to write a results file on disk.\n\n.. code-block:: python\n\n    jobs = [SlurmJob(task=\"path/to/script.py\",\n                     args=(*sample.as_namedtuple(),),\n                     output_file=\"path/to/results.pkl\",\n                     meta={\"binary\": \"python\", \"resources\": {\"cpu\": 1}}) for sample in samples]\n\n"
  },
  {
    "path": "docs/source/hypertunity.rst",
    "content": ":mod:`hypertunity`\n==================\n\n.. automodule:: hypertunity\n\nSummary\n-------\n\n.. autosummary::\n   :nosignatures:\n\n   Domain\n   Sample\n   Trial\n\nAPI documentation\n-----------------\n\n.. autoclass:: Domain\n   :members:\n\n.. autoclass:: Sample\n   :members:\n\n.. autoclass:: Trial\n   :members:\n"
  },
  {
    "path": "docs/source/optimisation.rst",
    "content": ":mod:`hypertunity.optimisation`\n===============================\n\n.. currentmodule:: hypertunity.optimisation\n\nSummary\n-------\n\nData classes\n~~~~~~~~~~~~\n\n.. autosummary::\n   :nosignatures:\n\n   EvaluationScore\n   HistoryPoint\n\nOptimisers\n~~~~~~~~~~\n\n.. autosummary::\n   :nosignatures:\n\n   Optimiser\n   BayesianOptimisation\n   GridSearch\n   RandomSearch\n\nAPI documentation\n-----------------\n\n.. autoclass:: EvaluationScore\n   :members:\n\n.. autoclass:: HistoryPoint\n   :members:\n\n.. autoclass:: Optimiser\n   :members:\n\n.. autoclass:: BayesianOptimisation\n   :members:\n\n.. autoclass:: GridSearch\n   :members:\n\n.. autoclass:: RandomSearch\n   :members:\n"
  },
  {
    "path": "docs/source/reports.rst",
    "content": ":mod:`hypertunity.reports`\n==========================\n\n.. currentmodule:: hypertunity.reports\n\nSummary\n-------\n\nDefault\n~~~~~~~\n\n.. autosummary::\n   :nosignatures:\n\n   Reporter\n   Table\n\nOptional\n~~~~~~~~\n\n.. autosummary::\n    :nosignatures:\n\n    tensorboard.Tensorboard\n\nAPI documentation\n-----------------\n\n.. autoclass:: Reporter\n  :members:\n\n.. autoclass:: Table\n  :members:\n\n.. currentmodule:: hypertunity.reports.tensorboard\n\n.. autoclass:: Tensorboard\n   :members:\n"
  },
  {
    "path": "docs/source/scheduling.rst",
    "content": ":mod:`hypertunity.scheduling`\n=============================\n\n.. currentmodule:: hypertunity.scheduling\n\nSummary\n-------\n\n.. autosummary::\n   :nosignatures:\n\n   Scheduler\n   Job\n   SlurmJob\n   Result\n\nAPI documentation\n-----------------\n\n.. autoclass:: Scheduler\n   :members:\n\n.. autoclass:: Job\n   :members:\n\n.. autoclass:: SlurmJob\n   :members:\n\n.. autoclass:: Result\n   :members:\n"
  },
  {
    "path": "hypertunity/__init__.py",
    "content": "from .domain import *\nfrom .optimisation import *\nfrom .reports import *\nfrom .scheduling import *\nfrom .trial import *\n\n__version__ = \"1.0.1\"\n"
  },
  {
    "path": "hypertunity/domain.py",
    "content": "\"\"\"Definition of the optimisation domain and a sample.\"\"\"\n\nimport ast\nimport copy\nimport os\nimport pickle\nimport random\nfrom collections import namedtuple\nfrom typing import Tuple\n\n__all__ = [\n    \"Domain\",\n    \"DomainNotIterableError\",\n    \"DomainSpecificationError\",\n    \"Sample\"\n]\n\n\nclass _RecursiveDict:\n    \"\"\"Helper base class for the :class:`Domain` and :class:`Sample` classes.\n\n    It implements common logic for creation, representation, type conversion\n    and serialisation.\n    \"\"\"\n\n    def __init__(self, dct):\n        if isinstance(dct, dict):\n            self._data = dct\n        elif isinstance(dct, str):\n            self._data = ast.literal_eval(dct)\n        else:\n            raise TypeError(\n                f\"A {self.__class__.__name__} object can be created from a \"\n                f\"Python dict or str objects only. \"\n                f\"Unknown type {type(dct)} at initialisation.\"\n            )\n\n        self._ndim = 0\n        for _, val in _deepiter_dict(self._data):\n            self._ndim += 1\n\n    def __hash__(self):\n        return hash(str(self))\n\n    def __repr__(self):\n        \"\"\"Return the representation of the recursive dict using the\n        string method.\n        \"\"\"\n        return str(self)\n\n    def __str__(self):\n        \"\"\"Return the string representation of the recursive dict.\"\"\"\n        return str(self._data)\n\n    def __eq__(self, other):\n        \"\"\"Compare all subdomains for equal bounds and sets. 
The order of the\n        subdomains is not important.\n        \"\"\"\n        return self.as_dict() == other.as_dict()\n\n    def __len__(self):\n        \"\"\"Compute the dimensionality of the recursive dict as the length of\n        the flattened dict.\n        \"\"\"\n        return self._ndim\n\n    def __getitem__(self, item):\n        \"\"\"Return the item (possibly a subdomain) for a given key.\n\n        Args:\n            item: str or tuple of str. If the latter, nested structures are\n            accessed with each successive str in the tuple.\n        \"\"\"\n        if isinstance(item, str):\n            return self._data.__getitem__(item)\n        elif isinstance(item, tuple) and all(map(lambda x: isinstance(x, str), item)):\n            sub_dict = self._data\n            for it in item:\n                if not isinstance(sub_dict, dict):\n                    raise KeyError(f\"Unknown sub-key {it}.\")\n                sub_dict = sub_dict[it]\n            return sub_dict\n\n    def __add__(self, other: '_RecursiveDict'):\n        \"\"\"Merge self with the `other` :class:`_RecursiveDict`.\n\n        Args:\n            other: :class:`_RecursiveDict`. The recursive dictionary that will\n                be merged into the current one.\n\n        Returns:\n            A new :class:`_RecursiveDict` object consisting of the subdomains\n            of both domains. 
If the keys overlap and the subdomains are discrete\n            or categorical, the values will be unified.\n\n        Raises:\n            :obj:`ValueError`: if identical keys point to different values.\n        \"\"\"\n        flattened_a = self.flatten()\n        flattened_b = other.flatten()\n        # validate that the two _RecursiveDicts are disjoint\n        if len(flattened_a.keys()) > len(flattened_a.keys() - flattened_b.keys()):\n            raise ValueError(\n                f\"Ambiguous addition of {self.__class__.__name__} objects.\"\n            )\n        merged = list(flattened_a.items())\n        merged.extend(list(flattened_b.items()))\n        return self.__class__.from_list(merged)\n\n    def flatten(self):\n        \"\"\"Return the flattened version of the recursive dict, i.e. without\n        nested dicts.\n\n        The keys of the nested subdomains are collected in a tuple to create a\n        new unique key. For the sake of type consistency, the key of a\n        non-nested subdomain is converted to a tuple with a single element.\n        \"\"\"\n        return {keys: val for keys, val in _deepiter_dict(self._data)}\n\n    def as_dict(self):\n        \"\"\"Convert the recursive dict object from :class:`_RecursiveDict`\n        to :obj:`dict` type.\n        \"\"\"\n        return copy.deepcopy(self._data)\n\n    @classmethod\n    def from_list(cls, lst):\n        \"\"\"Create a :class:`_RecursiveDict` object from a list of tuples.\n\n        Args:\n            lst: :obj:`List[Tuple]`. 
Each element is a pair of the keys\n            (tuple of strings) and the value.\n\n        Returns:\n            A :class:`_RecursiveDict` object.\n\n        Raises:\n            :obj:`ValueError`: if the list contains duplicating keys with\n            different values.\n\n        Examples:\n        ```python\n            >>> lst = [((\"a\", \"b\"), {2, 3, 4}), ((\"c\",), [0, 0.1])]\n            >>> _RecursiveDict.from_list(lst)\n            {\"a\": {\"b\": {2, 3, 4}}, \"c\": [0, 0.1]}\n        ```\n        \"\"\"\n        dct = {}\n        head = dct\n        for keys, vals in lst:\n            if not keys:\n                continue\n            for k in keys[:-1]:\n                if k not in dct:\n                    dct[k] = {}\n                dct = dct[k]\n            if keys[-1] in dct and dct[keys[-1]] == vals:\n                raise ValueError(f\"Duplicating entries for keys {keys}.\")\n            dct[keys[-1]] = vals\n            dct = head\n        return cls(head)\n\n    def serialise(self, filepath=None):\n        \"\"\"Serialise the :class:`_RecursiveDict` object to a file or a string\n        if `filepath` is not supplied.\n\n        Args:\n            filepath: (optional) :obj:`str`. Filepath as to dump the serialised\n            :class:`_RecursiveDict` object.\n\n        Returns:\n            The bytes representing the serialised :class:`_RecursiveDict` object.\n        \"\"\"\n        serialised = pickle.dumps(self._data)\n        if filepath is not None:\n            with open(filepath, \"wb\") as fp:\n                pickle.dump(self._data, fp)\n        return serialised\n\n    @classmethod\n    def deserialise(cls, series):\n        \"\"\"Deserialise a serialised :class:`_RecursiveDict` object from a byte\n        stream or file.\n\n        Args:\n            series: :obj:`str`. 
The serialised :class:`_RecursiveDict` object or\n                a filepath to it.\n\n        Returns:\n            A :class:`_RecursiveDict` object.\n        \"\"\"\n        if not isinstance(series, (bytes, bytearray)) and os.path.isfile(series):\n            with open(series, \"rb\") as fp:\n                return cls(pickle.load(fp))\n        return cls(pickle.loads(series))\n\n    def as_namedtuple(self):\n        \"\"\"Convert a :class:`_RecursiveDict` to a namedtuple type.\n\n        Returns:\n            A Python namedtuple object with names the same as the keys of the\n            :class:`_RecursiveDict` dict. Nested dicts are accessed by\n            successive attribute getters.\n\n        Examples:\n        ```python\n            >>> rd = _RecursiveDict({\"a\": {\"b\": [1, 2]}, \"c\": {1, 2, 3}, \"d\": 2.})\n            >>> nt = rd.as_namedtuple()\n            >>> nt.a.b\n            [1, 2]\n            >>> nt.c == {1, 2, 3} and nt.d == 2.\n            True\n        ```\n        \"\"\"\n\n        def helper(dct):\n            keys, vals = [], []\n            for k, v in dct.items():\n                keys.append(k)\n                if isinstance(v, dict):\n                    vals.append(helper(v))\n                else:\n                    vals.append(v)\n            # The dict.keys() and dict.values() will iterate in the same order\n            # as long as dct is not modified.\n            return namedtuple(\"NT_\" + self.__class__.__name__, keys)(*vals)\n\n        return helper(self._data)\n\n\nclass Domain(_RecursiveDict):\n    \"\"\"Defines the optimisation domain of the objective function. It can be a\n    continuous interval or a discrete set of numeric or non-numeric values.\n    The latter is also designated as a categorical domain. It is represented as\n    a Python dict object with the keys naming the variables and the values defining\n    the set of allowed values. A :class:`Domain` can also be recursively\n    specified. 
That is, a key can name a subdomain represented as a Python dict.\n\n    For continuous sets use a Python list to define an interval in the form\n    [a, b], a < b. For discrete sets use Python sets, e.g. {1, 2, 5, -0.1}\n    or {\"option_a\", \"option_b\"}.\n\n    Examples:\n        >>> simple_domain = {\"x\": {0, 1},\n        ...                  \"y\": [-1, 1],\n        ...                  \"z\": {-1, 2, 4}}\n        >>> nested_domain = {\"discrete\": {\"x\": {1, 2, 3}, \"y\": {4, 5, 6}},\n        ...                  \"continuous\": {\"x\": [-4, 4], \"y\": [0, 1]},\n        ...                  \"categorical\": {\"opt1\", \"opt2\"}}\n    \"\"\"\n    # Domain types\n    Continuous = 1\n    Discrete = 2\n    Categorical = 3\n    Invalid = 4\n\n    def __init__(self, dct, seed=None):\n        \"\"\"Initialise the :class:`Domain`.\n\n        Args:\n            dct: :obj:`dict`. The mapping of variable names to sets of\n                allowed values.\n            seed: (optional) :obj:`int`. 
Seed for the randomised sampling.\n        \"\"\"\n        super(Domain, self).__init__(dct)\n        self._validate()\n        self._rng = random.Random(seed)\n        self._is_continuous = False\n        for _, val in _deepiter_dict(self._data):\n            if isinstance(val, list):\n                self._is_continuous = True\n\n    def __iter__(self):\n        \"\"\"Iterate over the domain if it is fully discrete.\n\n        The iterations are over the Cartesian product of all 1-dim discrete\n        subdomains.\n\n        Raises:\n            :class:`DomainNotIterableError`: if the domain has at least one\n            continuous subdomain.\n        \"\"\"\n        if self._is_continuous:\n            raise DomainNotIterableError(\n                \"The domain has a continuous subdomain and cannot be iterated.\"\n            )\n\n        def cartesian_walk(dct):\n            if dct:\n                key, vals = dct.popitem()\n                if isinstance(vals, set):\n                    for v in vals:\n                        yield from (\n                            dict(**rem, **{key: v})\n                            for rem in cartesian_walk(copy.deepcopy(dct))\n                        )\n                elif isinstance(vals, dict):\n                    for sub_v in cartesian_walk(copy.deepcopy(vals)):\n                        yield from (\n                            dict(**rem, **{key: sub_v})\n                            for rem in cartesian_walk(copy.deepcopy(dct))\n                        )\n                else:\n                    raise TypeError(\n                        f\"Unexpected subdomain of type {type(vals)}.\"\n                    )\n            else:\n                yield {}\n\n        yield from map(Sample, cartesian_walk(copy.deepcopy(self._data)))\n\n    def _validate(self):\n        \"\"\"Check for invalid domain specifications.\"\"\"\n        for keys, values in _deepiter_dict(self._data):\n            if not (all(map(lambda x: 
isinstance(x, str), keys))\n                    and isinstance(values, (set, list, dict))):\n                raise DomainSpecificationError(\n                    \"Keys must be of type string and values \"\n                    \"must be either of type set, list or dict.\"\n                )\n            if (isinstance(values, list)\n                    and (len(values) != 2 or values[0] >= values[1])):\n                raise DomainSpecificationError(\n                    \"Interval must be specified by two numbers: [a, b], a < b.\"\n                )\n\n    def sample(self):\n        \"\"\"Draw a sample from the domain. All subdomains are sampled uniformly.\n\n        Returns:\n            A :class:`Sample` object.\n        \"\"\"\n\n        def sample_dict(dct):\n            sample = {}\n            for key, vals in dct.items():\n                if isinstance(vals, set):\n                    sample[key] = self._rng.choice(list(vals))\n                elif isinstance(vals, list):\n                    sample[key] = self._rng.uniform(*vals)\n                else:\n                    sample[key] = sample_dict(vals)\n            return sample\n\n        return Sample(sample_dict(self._data))\n\n    @property\n    def is_continuous(self):\n        \"\"\"Return `True` if at least one subdomain is continuous.\"\"\"\n        return self._is_continuous\n\n    @classmethod\n    def get_type(cls, subdomain):\n        \"\"\"Return the type of the set of values in a subdomain.\n\n        Args:\n            subdomain: one of :obj:`dict`, :obj:`list` or :obj:`set`. 
The\n                subdomain to get the type for.\n\n        Returns:\n            One of `Domain.Continuous`, `Domain.Discrete`, `Domain.Categorical`\n            or `Domain.Invalid`.\n        \"\"\"\n\n        def is_numeric(x):\n            # float() raises TypeError for non-string, non-numeric values\n            # (e.g. None or a tuple) and ValueError for non-numeric strings.\n            try:\n                float(x)\n            except (TypeError, ValueError):\n                return False\n            return True\n\n        if isinstance(subdomain, list):\n            return Domain.Continuous\n        if isinstance(subdomain, set):\n            if all(map(is_numeric, subdomain)):\n                return Domain.Discrete\n            return Domain.Categorical\n        return Domain.Invalid\n\n    def split_by_type(self) -> Tuple['Domain', 'Domain', 'Domain']:\n        \"\"\"Split the domain into discrete, categorical and continuous\n        subdomains respectively.\n\n        Returns:\n            A tuple of three :class:`Domain` objects for the discrete\n            numerical, categorical and continuous subdomains.\n        \"\"\"\n        discrete, categorical, continuous = [], [], []\n        for keys, vals in self.flatten().items():\n            if Domain.get_type(vals) == Domain.Continuous:\n                continuous.append((keys, vals))\n            elif Domain.get_type(vals) == Domain.Categorical:\n                categorical.append((keys, vals))\n            elif Domain.get_type(vals) == Domain.Discrete:\n                discrete.append((keys, vals))\n            else:\n                raise ValueError(\"Encountered an invalid subdomain.\")\n        return (\n            Domain.from_list(discrete),\n            Domain.from_list(categorical),\n            Domain.from_list(continuous)\n        )\n\n\nclass DomainNotIterableError(TypeError):\n    \"\"\"Alias for the :obj:`TypeError` raised during iteration of (partially)\n    continuous :class:`Domain` object.\n    \"\"\"\n    pass\n\n\nclass DomainSpecificationError(ValueError):\n    \"\"\"Alias for the :obj:`ValueError` raised during :class:`Domain` object\n    
creation from an invalid set of values.\n    \"\"\"\n    pass\n\n\nclass Sample(_RecursiveDict):\n    \"\"\"Defines a sample from the optimisation domain.\n\n    It has the same recursive structure as a :class:`Domain` object, but each\n    dimension is represented by one value only. The keys are exactly the\n    keys of the respective domain.\n\n    Examples:\n        >>> domain = Domain({\"x\": {\"y\": {0, 1, 2}}, \"z\": [3, 4]})\n        >>> domain.sample()\n        {'x': {'y': 0}, 'z': 3.1415926535897932}\n    \"\"\"\n\n    def __init__(self, dct):\n        \"\"\"Initialise the :class:`Sample` object from a dict.\"\"\"\n        super(Sample, self).__init__(dct)\n\n    def __iter__(self):\n        \"\"\"Iterate over all values in the sample.\n\n        Yields:\n            A tuple of keys and a single value, where the keys are a tuple\n            of strings.\n        \"\"\"\n        yield from self.flatten().items()\n\n\ndef _deepiter_dict(dct):\n    \"\"\"Iterate over all key, value pairs of a (possibly nested) dictionary.\n    In this case, all keys of the nested dicts are summarised in a tuple.\n\n    Args:\n        dct: dict object to iterate.\n\n    Yields:\n        Tuple of keys (itself a tuple) and the corresponding value.\n\n    Examples:\n        >>> list(_deepiter_dict({\"a\": {\"b\": 1, \"c\": 2}, \"d\": 3}))\n        [(('a', 'b'), 1), (('a', 'c'), 2), (('d',), 3)]\n    \"\"\"\n\n    def chained_keys_iter(prefix_keys, dct_tmp):\n        for key, val in dct_tmp.items():\n            chained_keys = prefix_keys + (key,)\n            if isinstance(val, dict):\n                yield from chained_keys_iter(chained_keys, val)\n            else:\n                yield chained_keys, val\n\n    yield from chained_keys_iter((), dct)\n"
  },
  {
    "path": "hypertunity/optimisation/__init__.py",
    "content": "from .base import *\nfrom .bo import *\nfrom .exhaustive import *\nfrom .random import *\n"
  },
  {
    "path": "hypertunity/optimisation/base.py",
    "content": "\"\"\"Defines the API of every optimiser and implements common logic.\"\"\"\n\nimport abc\nimport math\nfrom dataclasses import dataclass\nfrom typing import Any, Dict, List, Sequence\n\nfrom hypertunity.domain import Domain, Sample\n\n__all__ = [\n    \"EvaluationScore\",\n    \"HistoryPoint\",\n    \"Optimiser\",\n    \"Optimizer\",\n    \"ExhaustedSearchSpaceError\"\n]\n\n\n@dataclass(frozen=True, order=True)\nclass EvaluationScore:\n    \"\"\"A tuple of the evaluation value of the objective function\n    and a variance if known.\n    \"\"\"\n    value: float\n    variance: float = 0.0\n\n    def __str__(self):\n        return f\"{self.value:.3f} ± {math.sqrt(self.variance):.1f}\"\n\n\n@dataclass(frozen=True)\nclass HistoryPoint:\n    \"\"\"A tuple of a :class:`Sample` at which the objective has been evaluated\n    and the corresponding metrics. The metrics are supplied as :obj:`dict`\n    mapping of a :obj:`str` metric name to an :class:`EvaluationScore`.\n    \"\"\"\n    sample: Sample\n    metrics: Dict[str, EvaluationScore]\n\n\nclass Optimiser:\n    \"\"\"Abstract class :class:`Optimiser` for all optimisers.\n\n    It must be implemented by all subclasses in this package.\n\n    Every :class:`Optimiser` instance can be run for one single step using the\n    :py:meth:`run_step` method. The :class:`Optimiser` does not perform the\n    evaluation of the objective function but only proposes values from its\n    domain. Therefore an evaluation history must be supplied via the\n    :py:meth`update` method. The history can be erased and the\n    :class:`Optimiser` brought to the initial state via the :py:meth:`reset`\n    method.\n    \"\"\"\n\n    DEFAULT_METRIC_NAME = \"score\"\n\n    def __init__(self, domain: Domain):\n        \"\"\"Initialise the optimiser with a domain.\n\n        Args:\n            domain: :class:`Domain`. 
The domain of the objective function.\n        \"\"\"\n        self.domain = domain\n        self._history: List[HistoryPoint] = []\n\n    @property\n    def history(self):\n        \"\"\"Return the accumulated optimisation history.\"\"\"\n        return self._history\n\n    @history.setter\n    def history(self, history: List[HistoryPoint]):\n        \"\"\"Set the optimiser history.\n\n        This method can be used to warm-start an optimiser.\n\n        Args:\n            history: :obj:`List[HistoryPoint]`. New history which will\n                **overwrite** the old one.\n        \"\"\"\n        self.reset()\n        for hp in history:\n            self.update(hp.sample, hp.metrics)\n\n    @abc.abstractmethod\n    def run_step(self, batch_size, *args: Any, **kwargs: Any) -> List[Sample]:\n        \"\"\"Perform one step of optimisation and suggest the next sample to\n        evaluate.\n\n        Args:\n            batch_size: (optional) :obj:`int`. The number of samples to\n                suggest at once.\n            *args: optional arguments for the Optimiser.\n            **kwargs: optional keyword arguments for the Optimiser.\n\n        Returns:\n            A :obj:`List[Sample]` with the suggested samples to evaluate.\n        \"\"\"\n        raise NotImplementedError\n\n    def update(self, x, fx, **kwargs):\n        \"\"\"Update the optimiser's history with new points.\n\n        Args:\n            x: :class:`Sample` or :obj:`List[Sample]`. The samples at which the\n                objective function has been evaluated.\n            fx: :class:`EvaluationScore` or :obj:`List[EvaluationScore]`. 
The\n                evaluation scores at the corresponding samples.\n        \"\"\"\n        if isinstance(x, Sample):\n            self._update_history(x, fx)\n        elif (isinstance(x, Sequence)\n              and isinstance(fx, Sequence)\n              and len(x) == len(fx)):\n            for i, j in zip(x, fx):\n                self._update_history(i, j)\n        else:\n            raise ValueError(\"Update values for `x` and `f(x)` must be either \"\n                             \"a `Sample` and an evaluation or a list thereof.\")\n\n    def _update_history(self, x, fx):\n        if isinstance(fx, (float, int)):\n            history_point = HistoryPoint(\n                sample=x,\n                metrics={self.DEFAULT_METRIC_NAME: EvaluationScore(fx)}\n            )\n        elif isinstance(fx, EvaluationScore):\n            history_point = HistoryPoint(\n                sample=x, metrics={self.DEFAULT_METRIC_NAME: fx})\n        elif isinstance(fx, Dict):\n            metrics = {}\n            for key, val in fx.items():\n                if isinstance(val, (float, int)):\n                    metrics[key] = EvaluationScore(val)\n                else:\n                    metrics[key] = val\n            history_point = HistoryPoint(sample=x, metrics=metrics)\n        else:\n            raise TypeError(\n                \"Cannot update history for one sample and multiple evaluations.\"\n                \" Use batched update instead and provide a list of samples and \"\n                \"a list of evaluation metrics.\")\n        self.history.append(history_point)\n\n    def reset(self):\n        \"\"\"Reset the optimiser to the initial state.\"\"\"\n        self._history.clear()\n\n\nclass ExhaustedSearchSpaceError(Exception):\n    pass\n\n\nOptimizer = Optimiser\n"
  },
  {
    "path": "hypertunity/optimisation/bo.py",
    "content": "\"\"\"Bayesian Optimisation using Gaussian Process regression.\"\"\"\n\nfrom multiprocessing import cpu_count\nfrom typing import Any, Dict, List, Sequence, Tuple, Type, TypeVar, Union\n\nimport GPy\nimport GPyOpt\nimport numpy as np\nfrom GPyOpt.core import errors as gpyopt_err\n\nfrom hypertunity import utils\nfrom hypertunity.domain import Domain, Sample\nfrom hypertunity.optimisation.base import (\n    EvaluationScore,\n    ExhaustedSearchSpaceError,\n    Optimiser\n)\n\n__all__ = [\n    \"BayesianOptimisation\",\n    \"BayesianOptimization\"\n]\n\nGPyOptSample = TypeVar(\"GPyOptSample\", List[List], np.ndarray)\nGPyOptDomain = List[Dict[str, Any]]\nGPyOptCategoricalValueMapper = Dict[str, Dict[Any, int]]\nGPyOptDiscreteTypeMapper = Dict[str, Dict[Any, type]]\n\n\nclass BayesianOptimisation(Optimiser):\n    \"\"\"Bayesian Optimiser using `GPyOpt` as a backend.\"\"\"\n\n    CONTINUOUS_TYPE = \"continuous\"\n    DISCRETE_TYPE = \"discrete\"\n    CATEGORICAL_TYPE = \"categorical\"\n\n    def __init__(self, domain, seed=None):\n        \"\"\"Initialise the optimiser's domain.\n\n        Args:\n            domain: :class:`Domain`. The domain of the objective function.\n            seed: (optional) :obj:`int`. The seed of the optimiser. 
Used for\n                reproducibility purposes.\n        \"\"\"\n        np.random.seed(seed)\n        domain = Domain(domain.as_dict(), seed=seed)\n        super(BayesianOptimisation, self).__init__(domain)\n        converted_and_mappers = self._convert_to_gpyopt_domain(self.domain)\n        (\n            self.gpyopt_domain,\n            self._categorical_value_mapper,\n            self._discrete_type_mapper\n        ) = converted_and_mappers\n        self._inv_categorical_value_mapper = {\n            name: {v: k for k, v in mapping.items()}\n            for name, mapping in self._categorical_value_mapper.items()\n        }\n        self._data_x = np.array([[]])\n        self._data_fx = np.array([[]])\n        self.__is_empty_data = True\n\n    @staticmethod\n    def _convert_to_gpyopt_domain(\n            orig_domain: Domain\n    ) -> Tuple[GPyOptDomain,\n               GPyOptCategoricalValueMapper,\n               GPyOptDiscreteTypeMapper]:\n        \"\"\"Convert a :class:`Domain` type object to :obj:`GPyOptDomain`.\n\n        Args:\n            orig_domain: :class:`Domain` to convert.\n\n        Returns:\n            A tuple of the converted :obj:`GPyOptDomain` object and a value\n            mapper to assign each categorical value to an integer\n            (0, 1, 2, 3 ...). This is done to abstract away the type of the\n            categorical domain from the `GPyOpt` internals and thus arbitrary\n            types are supported.\n\n        Notes:\n            The categorical options must be hashable. 
This behaviour may change\n            in the future.\n        \"\"\"\n        gpyopt_domain = []\n        value_mapper = {}\n        type_mapper = {}\n        flat_domain = orig_domain.flatten()\n        for names, vals in flat_domain.items():\n            dim_name = utils.join_strings(names)\n            domain_type = Domain.get_type(vals)\n            if domain_type == Domain.Continuous:\n                dim_type = BayesianOptimisation.CONTINUOUS_TYPE\n            elif domain_type == Domain.Discrete:\n                dim_type = BayesianOptimisation.DISCRETE_TYPE\n                type_mapper[dim_name] = {v: type(v) for v in vals}\n            elif domain_type == Domain.Categorical:\n                dim_type = BayesianOptimisation.CATEGORICAL_TYPE\n                value_mapper[dim_name] = {v: i for i, v in enumerate(vals)}\n                vals = tuple(range(len(vals)))\n            else:\n                raise ValueError(\n                    f\"Badly specified subdomain {names} with values {vals}.\"\n                )\n            gpyopt_domain.append({\n                \"name\": dim_name,\n                \"type\": dim_type,\n                \"domain\": tuple(vals)\n            })\n        assert len(gpyopt_domain) == len(orig_domain), \\\n            \"Mismatching dimensionality after domain conversion.\"\n        return gpyopt_domain, value_mapper, type_mapper\n\n    def _convert_to_gpyopt_sample(self, orig_sample: Sample) -> GPyOptSample:\n        \"\"\"Convert a sample of type :class:`Sample` to type\n        :obj:`GPyOptSample`.\n\n        The inverse conversion is performed by\n        `self._convert_from_gpyopt_sample`.\n\n        Args:\n            orig_sample: :class:`Sample` type object to be converted.\n\n        Returns:\n            A :obj:`GPyOptSample` type object with the same values as\n            `orig_sample`.\n        \"\"\"\n        gpyopt_sample = []\n        # 
iterate in the order of the GPyOpt domain names\n        for dim in self.gpyopt_domain:\n            keys = utils.split_string(dim[\"name\"])\n            val = orig_sample[keys]\n            if dim[\"type\"] == BayesianOptimisation.CATEGORICAL_TYPE:\n                val = self._categorical_value_mapper[dim[\"name\"]][val]\n            gpyopt_sample.append(val)\n        return np.asarray(gpyopt_sample)\n\n    def _convert_from_gpyopt_sample(self, gpyopt_sample: GPyOptSample) -> Sample:\n        \"\"\"Convert :obj:`GPyOptSample` type object to the corresponding\n        :class:`Sample` type.\n\n        Args:\n            gpyopt_sample: :obj:`GPyOptSample` object to be converted.\n\n        Returns:\n            A :class:`Sample` type object with the same values as\n                `gpyopt_sample`.\n        \"\"\"\n        if len(self.gpyopt_domain) != len(gpyopt_sample):\n            raise ValueError(\n                f\"Cannot convert sample with mismatching dimensionality. \"\n                f\"The original space has {len(self.domain)} dimensions and the \"\n                f\"sample {len(gpyopt_sample)} dimensions.\"\n            )\n        orig_sample = {}\n        for dim, value in zip(self.gpyopt_domain, gpyopt_sample):\n            names = utils.split_string(dim[\"name\"])\n            sub_dim = orig_sample\n            for name in names[:-1]:\n                if name not in sub_dim:\n                    sub_dim[name] = {}\n                sub_dim = sub_dim[name]\n            if dim[\"type\"] == BayesianOptimisation.CATEGORICAL_TYPE:\n                sub_dim[names[-1]] = self._inv_categorical_value_mapper[dim[\"name\"]][value]\n            elif dim[\"type\"] == BayesianOptimisation.DISCRETE_TYPE:\n                sub_dim[names[-1]] = self._discrete_type_mapper[dim[\"name\"]][value](value)\n            else:\n                sub_dim[names[-1]] = value\n        return Sample(orig_sample)\n\n    @utils.support_american_spelling\n    def run_step(\n            
self,\n            batch_size: int = 1,\n            minimise: bool = False,\n            **kwargs\n    ) -> List[Sample]:\n        \"\"\"Run one step of Bayesian optimisation with a GP regression surrogate\n        model.\n\n        The first sample of the domain is chosen at random. Only after the model\n        has been updated with at least one (data point, evaluation score)-pair\n        are the GPs built and the acquisition function computed and optimised.\n\n        Args:\n            batch_size: (optional) :obj:`int`. The number of samples to suggest\n                at once. If larger than one, there is no guarantee for the\n                optimality of the number of probes.\n            minimise: (optional) :obj:`bool`. Whether the objective should be\n                minimised.\n            **kwargs: optional keyword arguments which will be passed to the\n                backend `GPyOpt.methods.BayesianOptimization` optimiser.\n\n        Keyword Args:\n            model: :obj:`str` or :obj:`GPy.Model` object. The surrogate model\n                used by the backend optimiser.\n            kernel: :obj:`GPy.kern.Kern` object. The kernel used by the model.\n            variance: :obj:`float`. 
The variance of the objective function.\n\n        Returns:\n            A list of `batch_size`-many :class:`Sample` instances at which the\n            objective should be evaluated next.\n\n        Raises:\n            :class:`ExhaustedSearchSpaceError`: if the domain is discrete and\n            gets exhausted.\n        \"\"\"\n        if self.__is_empty_data:\n            next_samples = [self.domain.sample() for _ in range(batch_size)]\n        else:\n            assert len(self._data_x) > 0 and len(self._data_fx) > 0, \\\n                \"Cannot initialise BO from empty data.\"\n            default_kwargs = {\n                \"num_cores\": min(batch_size, cpu_count() - 1),\n                \"normalize_Y\": True,\n                \"acquisition_type\": \"EI\",\n                \"de_duplication\": True,\n                \"model_type\": \"GP\",\n                \"evaluator_type\": \"local_penalization\" if batch_size > 1 else \"sequential\"\n            }\n            if \"model\" in kwargs:\n                model = kwargs.pop(\"model\")\n                # NOTE: Remove this test for model type after the bug in GPyOpt\n                #  is fixed: https://github.com/SheffieldML/GPyOpt/issues/183\n                if (isinstance(model, str)\n                        and model.lower() == \"gp_mcmc\"\n                        and batch_size > 1):\n                    raise ValueError(\n                        \"GP_MCMC model cannot be used with a batch size > 1 \"\n                        \"due to a bug in GPyOpt: \"\n                        \"https://github.com/SheffieldML/GPyOpt/issues/183\"\n                    )\n                kernel = kwargs.pop(\"kernel\", None)\n                variance = kwargs.pop(\"variance\", None)\n                default_kwargs[\"model\"] = self._build_model(\n                    model, kernel, variance\n                )\n                if (variance is not None\n                        and all(np.atleast_1d(np.isclose(variance, 
0.0)))):\n                    default_kwargs[\"exact_feval\"] = True\n            default_kwargs = _overwrite_dict(default_kwargs, kwargs)\n\n            # NOTE: as of GPyOpt 1.2.5 adding new data to an existing model is\n            #  not yet possible, hence the object recreation. This behaviour\n            #  might be changed in future versions. In this case the code should\n            #  be refactored such that `bo` is initialised once and `update`\n            #  takes care of the extension of the (X, Y) samples.\n            bo = GPyOpt.methods.BayesianOptimization(\n                f=None, domain=self.gpyopt_domain,\n                maximize=not minimise,\n                X=self._data_x,\n                # NOTE: the following hack is necessary due to a bug in GPyOpt.\n                #  The code should be updated once this gets fixed:\n                #  https://github.com/SheffieldML/GPyOpt/issues/180\n                Y=(-1 + 2 * minimise) * self._data_fx,\n                initial_design_numdata=len(self._data_x),\n                batch_size=batch_size,\n                **default_kwargs)\n            try:\n                gpyopt_samples = bo.suggest_next_locations()\n            except gpyopt_err.FullyExploredOptimizationDomainError as err:\n                raise ExhaustedSearchSpaceError from err\n            next_samples = [self._convert_from_gpyopt_sample(s)\n                            for s in gpyopt_samples]\n        return next_samples\n\n    def _build_model(self, model: Union[str, Type[GPy.Model]] = \"GP\",\n                     kernel: GPy.kern.Kern = None,\n                     variance: float = None):\n        \"\"\"Build the surrogate model for the GPyOpt BayesianOptimisation.\n\n        The default model is 'gp'. 
When more than 25 samples have already\n        been evaluated, a sparse GP is used to speed up computation.\n\n        Args:\n            model: :obj:`str` or :obj:`GPy.Model`, the GP regression model.\n            kernel: :obj:`GPy.kern.Kern`, the kernel of the GP regression model.\n            variance: :obj:`float`, the variance of the evaluations\n                (used only if supported by the model).\n\n        Returns:\n            A :obj:`GPy.Model` instance.\n        \"\"\"\n        if isinstance(model, GPy.Model):\n            return model\n        if isinstance(model, str):\n            model = model.lower()\n            if model == \"gp\":\n                return GPyOpt.models.GPModel(kernel=kernel, noise_var=variance,\n                                             sparse=len(self._data_x) > 25)\n            if model == \"gp_mcmc\":\n                return GPyOpt.models.GPModel_MCMC(\n                    kernel=kernel,\n                    noise_var=variance\n                )\n            raise ValueError(\n                f\"Unknown model {model}. When supplying a custom kernel or \"\n                f\"the variance of the objective function, the model has to be \"\n                f\"one from {{'GP', 'GP_MCMC'}}. Otherwise you should supply a \"\n                f\"custom `GPy.Model` instance.\"\n            )\n        raise TypeError(\"Argument `model` must be of type str or `GPy.Model`.\")\n\n    def update(self, x, fx, **kwargs):\n        \"\"\"Update the surrogate model with the domain sample `x` and the\n        function evaluation `fx`.\n\n        Args:\n            x: :class:`Sample`. One sample of the domain of the objective\n                function.\n            fx: a :obj:`float`, an :class:`EvaluationScore` or a :obj:`dict`.\n                The evaluation scores of the objective evaluated at `x`. 
If\n                given as :obj:`dict` then it must be a mapping from metric names\n                to :class:`EvaluationScore` or :obj:`float` results.\n            **kwargs: unused by this model.\n        \"\"\"\n        super(BayesianOptimisation, self).update(x, fx)\n        # both `converted_x` and `array_fx` must be 2dim arrays\n        if isinstance(x, Sample):\n            converted_x, array_fx = self._convert_evaluation_sample(x, fx)\n        elif (isinstance(x, Sequence)\n              and isinstance(fx, Sequence)\n              and len(x) == len(fx)):\n            # convert each (sample, evaluation) pair and stack the results\n            # into numpy arrays\n            converted_x, array_fx = map(\n                np.concatenate, zip(*[self._convert_evaluation_sample(i, j)\n                                      for i, j in zip(x, fx)]))\n        else:\n            raise ValueError(\n                \"Update values for `x` and `f(x)` must be either \"\n                \"`Sample` and an evaluation or a list thereof.\"\n            )\n\n        if self._data_x.size == 0:\n            self._data_x = converted_x\n            self._data_fx = array_fx\n        else:\n            self._data_x = np.concatenate([self._data_x, converted_x])\n            self._data_fx = np.concatenate([self._data_fx, array_fx])\n        self.__is_empty_data = False\n\n    def _convert_evaluation_sample(self, x, fx):\n        if isinstance(fx, (float, int)):\n            array_fx = np.array([[fx]])\n        elif isinstance(fx, EvaluationScore):\n            array_fx = np.array([[fx.value]])\n        elif isinstance(fx, Dict):\n            if len(fx) != 1:\n                raise NotImplementedError(\n                    \"Currently only evaluations with a single metric are supported.\"\n                )\n            # the single metric may be an `EvaluationScore` or a plain number\n            metric = next(iter(fx.values()))\n            if isinstance(metric, EvaluationScore):\n                metric = metric.value\n            array_fx = np.array([[metric]])\n        else:\n            raise TypeError(\n                \"Cannot update history for one sample and 
multiple evaluations.\"\n                \" Use batched update instead and provide a list of samples and \"\n                \"a list of evaluation metrics.\"\n            )\n        converted_x = self._convert_to_gpyopt_sample(x).reshape(1, -1)\n        return converted_x, array_fx\n\n    def reset(self):\n        \"\"\"Reset the optimiser for a fresh start.\"\"\"\n        super(BayesianOptimisation, self).reset()\n        self._data_x = np.array([])\n        self._data_fx = np.array([])\n        self.__is_empty_data = True\n\n\nBayesianOptimization = BayesianOptimisation\n\n\ndef _overwrite_dict(old_dict, new_dict):\n    updated_old = {}\n    # copy the old dict\n    for key, value in old_dict.items():\n        updated_old[key] = value\n    # overwrite the existing and add the new values\n    for key, value in new_dict.items():\n        updated_old[key] = value\n    return updated_old\n"
  },
  {
    "path": "hypertunity/optimisation/exhaustive.py",
    "content": "\"\"\"Optimisation by exhaustive search, aka grid search.\"\"\"\n\nfrom typing import List\n\nfrom hypertunity.domain import Domain, DomainNotIterableError, Sample\nfrom hypertunity.optimisation.base import ExhaustedSearchSpaceError, Optimiser\n\n__all__ = [\n    \"GridSearch\"\n]\n\n\nclass GridSearch(Optimiser):\n    \"\"\"Grid search pseudo-optimiser.\"\"\"\n\n    def __init__(self,\n                 domain: Domain,\n                 sample_continuous: bool = False,\n                 seed: int = None):\n        \"\"\"Initialise the :class:`GridSearch` optimiser from a discrete domain.\n\n        If the domain contains continuous subspaces, then they could be sampled\n        if `sample_continuous` is enabled.\n\n        Args:\n            domain: :class:`Domain`. The domain to iterate over.\n            sample_continuous: (optional) :obj:`bool`. Whether to sample the\n                continuous subspaces of the domain.\n            seed: (optional) :obj:`int`. Seed for the sampling of the continuous\n                subspace if necessary.\n        \"\"\"\n        if domain.is_continuous and not sample_continuous:\n            raise DomainNotIterableError(\n                \"Cannot perform grid search on (partially) continuous domain. 
\"\n                \"To enable grid search in this case, set the argument \"\n                \"'sample_continuous' to True.\"\n            )\n        super(GridSearch, self).__init__(domain)\n        (\n            discrete_domain,\n            categorical_domain,\n            continuous_domain\n        ) = domain.split_by_type()\n        # unify the discrete and the categorical into one,\n        # as they can be iterated:\n        self.discrete_domain = discrete_domain + categorical_domain\n        if seed is not None:\n            self.continuous_domain = Domain(\n                continuous_domain.as_dict(), seed=seed\n            )\n        else:\n            self.continuous_domain = continuous_domain\n        self._discrete_domain_iter = iter(self.discrete_domain)\n        self._is_exhausted = len(self.discrete_domain) == 0\n        self.__exhausted_err = ExhaustedSearchSpaceError(\n            \"The domain has been exhausted. Reset the optimiser to start again.\"\n        )\n\n    def run_step(self, batch_size: int = 1, **kwargs) -> List[Sample]:\n        \"\"\"Get the next `batch_size` samples from the Cartesian-product walk\n        over the domain.\n\n        Args:\n            batch_size: (optional) :obj:`int`. The number of samples to suggest\n                at once.\n\n        Returns:\n            A list of :class:`Sample` instances from the domain.\n\n        Raises:\n            :class:`ExhaustedSearchSpaceError`: if the (discrete part of the)\n                domain is fully exhausted and no samples can be generated.\n\n        Notes:\n            This method does not guarantee that the returned list of\n            :class:`Samples` will be of length `batch_size`. 
This is due to the\n            size of the domain and the fact that samples will not be repeated.\n        \"\"\"\n        if self._is_exhausted:\n            raise self.__exhausted_err\n\n        samples = []\n        for i in range(batch_size):\n            try:\n                discrete = next(self._discrete_domain_iter)\n            except StopIteration:\n                self._is_exhausted = True\n                break\n            if self.continuous_domain:\n                continuous = self.continuous_domain.sample()\n                samples.append(discrete + continuous)\n            else:\n                samples.append(discrete)\n        if samples:\n            return samples\n        raise self.__exhausted_err\n\n    def reset(self):\n        \"\"\"Reset the optimiser to the beginning of the Cartesian-product walk.\"\"\"\n        super(GridSearch, self).reset()\n        self._discrete_domain_iter = iter(self.discrete_domain)\n        self._is_exhausted = len(self.discrete_domain) == 0\n"
  },
  {
    "path": "hypertunity/optimisation/random.py",
    "content": "\"\"\"Optimisation by a uniformly random search.\"\"\"\n\nfrom typing import List\n\nfrom hypertunity.domain import Domain, Sample\nfrom hypertunity.optimisation.base import Optimiser\n\n__all__ = [\n    \"RandomSearch\"\n]\n\n\nclass RandomSearch(Optimiser):\n    \"\"\"Uniform random sampling pseudo-optimiser.\"\"\"\n\n    def __init__(self, domain: Domain, seed: int = None):\n        \"\"\"Initialise the :class:`RandomSearch` search space.\n\n        Args:\n            domain: :class:`Domain`. The domain of the objective function.\n                It will be sampled uniformly using the :py:meth:`sample()`\n                method of the :class:`Domain`.\n            seed: (optional) :obj:`int`. The seed for the domain sampling.\n        \"\"\"\n        if seed is not None:\n            domain = Domain(domain.as_dict(), seed=seed)\n        super(RandomSearch, self).__init__(domain)\n\n    def run_step(self, batch_size=1, **kwargs) -> List[Sample]:\n        \"\"\"Sample uniformly the domain for `batch_size` number of times.\n\n        Args:\n            batch_size: (optional) :obj:`int`. The number of samples to return\n                at one step.\n\n        Returns:\n            A list of `batch_size` many :class:`Sample` instances.\n        \"\"\"\n        return [self.domain.sample() for _ in range(batch_size)]\n"
  },
  {
    "path": "hypertunity/optimisation/tests/__init__.py",
    "content": ""
  },
  {
    "path": "hypertunity/optimisation/tests/_common.py",
    "content": "import numpy as np\n\nfrom hypertunity.optimisation import EvaluationScore\n\nCONT_1D_ARGMAX = 3.989333\nCONT_1D_MAX = 5.958363\n\n\ndef continuous_1d(x):\n    \"\"\"Compute x * sin(2x) + 2 if x in [0, 5] else 0.\"\"\"\n    fx = np.atleast_1d(x * np.sin(2 * x) + 2)\n    fx[np.logical_and(x < 0, x > 5)] = 0.\n    return fx\n\n\nCONT_HETEROSCED_1D_ARGMAX = 0.0\nCONT_HETEROSCED_1D_MAX = 2.0\n\n\ndef continuous_heteroscedastic_1d(x):\n    \"\"\"Compute 0.2 * x^4 - x^2 + 2 + eps\n    where eps ~ N(0, |0.2 * x| + 1e-7) and x in [-2., 2]\n    \"\"\"\n    rng = np.random.RandomState(7)\n    noise = rng.normal(0., 0.2 * np.abs(x) + 1e-7)\n    fx = np.atleast_1d(0.2 * x**4 - x**2 + 2 + noise)\n    fx[np.logical_and(x < -2., x > 2.)] = 0.\n    return fx\n\n\nHETEROGEN_3D_ARGMAX = (6.0, \"sqr\", 0)\nHETEROGEN_3D_MAX = 36.0\n\n\ndef heterogeneous_3d(x, y, z):\n    \"\"\"Compute `continuous_1d` + z if y == 'sin', else return x**2 - 3 * z\n    where x is continuous, y is categorical (\"sin\", \"sqr\"), z is discrete.\n\n    Args:\n        x: float or np.ndarray, continuous variable         [-5.0, 6.0]\n        y: str, categorical variable                        (\"sin\", \"sqr\")\n        z: float or int or np.ndarray, discrete variable    (0, 1, 2, 3)\n    \"\"\"\n    if y == \"sin\":\n        return (continuous_1d(x) + z)[0]\n    elif y == \"sqr\" and z in [0, 1, 2, 3]:\n        return x**2 - 3 * z\n    else:\n        raise ValueError(\"`y` can only be 'sin' or 'sqr' and z [0, 1, 2, 3].\")\n\n\nDISCRETE_3D_ARGMAX = (4, 5, \"large\")\nDISCRETE_3D_MAX = 3.0\n\n\ndef discrete_3d(x, y, z):\n    \"\"\"Compute c * x * y where c = 0.1 if z == \"small\" else 0.15.\n\n    `x` and `y` are discrete numerical values, z is categorical.\n\n    Args:\n        x: int, discrete variable                           (1, 2, 3, 4)\n        y: int, discrete variable                           (-3, 2, 5)\n        z: str, categorical variable                        (\"small\", 
\"large\")\n    \"\"\"\n    if (x not in {1, 2, 3, 4}\n            and y not in {-3, 2, 5}\n            and z not in {\"small\", \"large\"}):\n        raise ValueError(\"Outside the allowed domain.\")\n    if z == \"small\":\n        return 0.1 * x * y\n    return 0.15 * x * y\n\n\ndef evaluate_continuous_1d(opt, batch_size, n_steps, **kwargs):\n    all_samples = []\n    all_evaluations = []\n    for i in range(n_steps):\n        samples = opt.run_step(batch_size, minimise=False, **kwargs)\n        evaluations = continuous_1d(np.array([s[\"x\"] for s in samples]))\n        opt.update(samples, [EvaluationScore(ev) for ev in evaluations], )\n        # gather the samples and evaluations for later assessment\n        all_samples.extend([s[\"x\"] for s in samples])\n        all_evaluations.extend(evaluations)\n    best_eval_index = int(np.argmax(all_evaluations))\n    best_sample = all_samples[best_eval_index]\n    best_eval = all_evaluations[best_eval_index]\n    assert np.isclose(best_sample, CONT_1D_ARGMAX, atol=1e-1)\n    assert np.isclose(best_eval, CONT_1D_MAX, atol=1e-1)\n\n\ndef evaluate_heterogeneous_3d(opt, batch_size, n_steps):\n    all_samples = []\n    all_evaluations = []\n    for i in range(n_steps):\n        samples = opt.run_step(batch_size, minimise=False)\n        evaluations = [heterogeneous_3d(s[\"x\"], s[\"y\"], s[\"z\"])\n                       for s in samples]\n        opt.update(samples, [EvaluationScore(ev) for ev in evaluations], )\n        # gather the samples and evaluations for later assessment\n        all_samples.extend([(s[\"x\"], s[\"y\"], s[\"z\"]) for s in samples])\n        all_evaluations.extend(evaluations)\n    best_eval_index = int(np.argmax(all_evaluations))\n    best_sample = all_samples[best_eval_index]\n    best_eval = all_evaluations[best_eval_index]\n    assert np.isclose(best_sample[0], HETEROGEN_3D_ARGMAX[0], atol=1.0)\n    assert best_sample[1:] == HETEROGEN_3D_ARGMAX[1:]\n    assert np.isclose(best_eval, 
HETEROGEN_3D_MAX, atol=1.0)\n\n\ndef evaluate_discrete_3d(opt, batch_size, n_steps):\n    all_samples = []\n    all_evaluations = []\n    for i in range(n_steps):\n        samples = opt.run_step(batch_size, minimise=False)\n        evaluations = [discrete_3d(s[\"x\"], s[\"y\"], s[\"z\"]) for s in samples]\n        opt.update(samples, [EvaluationScore(ev) for ev in evaluations], )\n        # gather the samples and evaluations for later assessment\n        all_samples.extend([(s[\"x\"], s[\"y\"], s[\"z\"]) for s in samples])\n        all_evaluations.extend(evaluations)\n    best_eval_index = int(np.argmax(all_evaluations))\n    best_sample = all_samples[best_eval_index]\n    best_eval = all_evaluations[best_eval_index]\n    assert best_sample == DISCRETE_3D_ARGMAX\n    assert best_eval == DISCRETE_3D_MAX\n"
  },
  {
    "path": "hypertunity/optimisation/tests/test_bo.py",
    "content": "import GPy\nimport numpy as np\nimport pytest\n\nfrom hypertunity.domain import Domain\nfrom hypertunity.optimisation import base, bo\n\nfrom . import _common as test_utils\n\n\ndef test_bo_update_and_reset():\n    domain = Domain({\"a\": {\"b\": [2, 3], \"d\": {\"f\": [3, 4]}}, \"c\": [0, 0.1]})\n    bayes_opt = bo.BayesianOptimisation(domain, seed=7)\n    samples = []\n    n_reps = 3\n    for i in range(n_reps):\n        samples.extend(bayes_opt.run_step(batch_size=1, minimise=False))\n        bayes_opt.update(samples[-1], base.EvaluationScore(2. * i))\n    assert len(bayes_opt._data_x) == n_reps\n    assert len(bayes_opt._data_fx) == n_reps\n    assert np.all(\n        bayes_opt._data_x == np.array([bayes_opt._convert_to_gpyopt_sample(s)\n                                       for s in samples])\n    )\n    assert np.all(\n        bayes_opt._data_fx == 2. * np.arange(n_reps).reshape(n_reps, 1)\n    )\n    bayes_opt.reset()\n    assert len(bayes_opt.history) == 0\n\n\ndef test_bo_set_history():\n    n_samples = 10\n    domain = Domain({\"a\": {\"b\": [2, 3]}, \"c\": [0, 0.1]})\n    history = [\n        base.HistoryPoint(\n            domain.sample(),\n            {\"score\": base.EvaluationScore(float(i))}\n        )\n        for i in range(n_samples)\n    ]\n    bayes_opt = bo.BayesianOptimisation(domain, seed=7)\n    bayes_opt.history = history\n    assert bayes_opt.history == history\n    assert len(bayes_opt._data_x) == len(bayes_opt._data_fx) == len(history)\n\n\n@pytest.mark.slow\ndef test_bo_simple_continuous():\n    domain = Domain({\"x\": [-1., 6.]})\n    bayes_opt = bo.BayesianOptimization(domain=domain, seed=7)\n    test_utils.evaluate_continuous_1d(bayes_opt, batch_size=2, n_steps=7)\n\n\n@pytest.mark.slow\ndef test_bo_simple_mixed():\n    domain = Domain({\"x\": [-5., 6.], \"y\": {\"sin\", \"sqr\"}, \"z\": set(range(4))})\n    bayes_opt = bo.BayesianOptimization(domain=domain, seed=7)\n    
test_utils.evaluate_heterogeneous_3d(bayes_opt, batch_size=7, n_steps=3)\n\n\n@pytest.mark.slow\ndef test_bo_custom_model():\n    domain = Domain({\"x\": [-2., 2.]})\n    bayes_opt = bo.BayesianOptimisation(domain=domain, seed=7)\n    kernel = GPy.kern.RBF(1) + GPy.kern.Bias(1)\n    n_steps = 3\n    batch_size = 3\n    all_samples = []\n    all_evaluations = []\n    first_samples = bayes_opt.run_step(batch_size=batch_size, minimise=False)\n    xs = np.atleast_2d([s[\"x\"] for s in first_samples])\n    ys = np.atleast_2d(test_utils.continuous_heteroscedastic_1d(\n        np.array([s[\"x\"] for s in first_samples]))\n    )\n    for i in range(n_steps):\n        custom_model = GPy.models.GPHeteroscedasticRegression(xs, ys, kernel)\n        samples = bayes_opt.run_step(\n            batch_size,\n            minimise=False,\n            model=custom_model\n        )\n        evaluations = test_utils.continuous_heteroscedastic_1d(\n            np.array([s[\"x\"] for s in samples])\n        )\n        bayes_opt.update(\n            samples, [base.EvaluationScore(ev) for ev in evaluations]\n        )\n        xs = np.concatenate(\n            [xs, np.atleast_2d([s[\"x\"] for s in samples])], axis=0\n        )\n        ys = np.concatenate([ys, np.atleast_2d(evaluations)], axis=0)\n        # gather the samples and evaluations for later assessment\n        all_samples.extend([s[\"x\"] for s in samples])\n        all_evaluations.extend(evaluations)\n    best_eval_index = int(np.argmax(all_evaluations))\n    best_sample = all_samples[best_eval_index]\n    assert np.isclose(\n        best_sample, test_utils.CONT_HETEROSCED_1D_ARGMAX, atol=1e-1\n    )\n\n\n@pytest.mark.skip(\"Due to https://github.com/SheffieldML/GPyOpt/issues/260\"\n                  \" using GP_MCMC model can not be tested yet.\")\n@pytest.mark.slow\ndef test_bo_gp_mcmc_model():\n    domain = Domain({\"x\": [-1., 6.]})\n    bayes_opt = bo.BayesianOptimization(domain=domain, seed=7)\n    
test_utils.evaluate_continuous_1d(\n        bayes_opt,\n        batch_size=1,\n        n_steps=7,\n        model=\"GP_MCMC\",\n        evaluator_type=\"sequential\"\n    )\n"
  },
  {
    "path": "hypertunity/optimisation/tests/test_exhaustive.py",
    "content": "import pytest\n\nfrom hypertunity.domain import Domain\nfrom hypertunity.optimisation import exhaustive\n\nfrom . import _common as test_utils\n\n\ndef test_grid_simple_discrete():\n    domain = Domain({\n        \"x\": {1, 2, 3, 4},\n        \"y\": {-3, 2, 5},\n        \"z\": {\"small\", \"large\"}\n    })\n    gs = exhaustive.GridSearch(domain=domain)\n    test_utils.evaluate_discrete_3d(gs, batch_size=4, n_steps=3 * 2)\n    with pytest.raises(exhaustive.ExhaustedSearchSpaceError):\n        gs.run_step(batch_size=4)\n    gs.reset()\n    assert len(gs.run_step(batch_size=4)) == 4\n\n\ndef test_grid_simple_mixed():\n    domain = Domain({\"x\": [-5., 6.], \"y\": {\"sin\", \"sqr\"}, \"z\": set(range(4))})\n    with pytest.raises(exhaustive.DomainNotIterableError):\n        _ = exhaustive.GridSearch(domain)\n    gs = exhaustive.GridSearch(domain, sample_continuous=True, seed=93)\n    assert len(gs.run_step(batch_size=8)) == 8\n\n\ndef test_update():\n    domain = Domain({\"x\": {-5., 6.}})\n    gs = exhaustive.GridSearch(domain)\n    gs.update([domain.sample() for _ in range(10)], list(range(10)))\n    gs.update(domain.sample(), {\"score\": 23.0})\n    gs.update(domain.sample(), 2.0)\n    assert len(gs.history) == 12\n"
  },
  {
    "path": "hypertunity/optimisation/tests/test_random.py",
    "content": "from hypertunity.domain import Domain\nfrom hypertunity.optimisation import random\n\nfrom . import _common as test_utils\n\n\ndef test_random_simple_continuous():\n    domain = Domain({\"x\": [-1., 6.]})\n    rs = random.RandomSearch(domain=domain, seed=7)\n    test_utils.evaluate_continuous_1d(rs, batch_size=50, n_steps=2)\n\n\ndef test_random_simple_mixed():\n    domain = Domain({\"x\": [-5., 6.], \"y\": {\"sin\", \"sqr\"}, \"z\": set(range(4))})\n    rs = random.RandomSearch(domain=domain, seed=1)\n    test_utils.evaluate_heterogeneous_3d(rs, batch_size=50, n_steps=25)\n\n\ndef test_update():\n    domain = Domain({\"x\": [-5., 6.]})\n    rs = random.RandomSearch(domain)\n    rs.update([domain.sample() for _ in range(4)], list(range(4)))\n    rs.update(domain.sample(), {\"score\": 23.0})\n    rs.update(domain.sample(), 2.0)\n    assert len(rs.history) == 6\n    rs.reset()\n    assert len(rs.history) == 0\n"
  },
  {
    "path": "hypertunity/reports/__init__.py",
    "content": "from .base import Reporter\nfrom .table import Table\n"
  },
  {
    "path": "hypertunity/reports/base.py",
    "content": "import abc\nimport datetime\nimport os\nfrom typing import Any, Callable, Dict, List, Optional, Tuple, Union\n\nimport tinydb\n\nfrom hypertunity.domain import Domain, Sample\nfrom hypertunity.optimisation.base import EvaluationScore, HistoryPoint\n\n__all__ = [\n    \"Reporter\"\n]\n\nHistoryEntryType = Union[\n    HistoryPoint,\n    Tuple[Sample, Union[float, Dict[str, float], Dict[str, EvaluationScore]]]\n]\n\n\nclass Reporter:\n    \"\"\"Abstract class :class:`Reporter` for result visualisation.\"\"\"\n\n    def __init__(self, domain: Domain,\n                 metrics: List[str],\n                 primary_metric: str = \"\",\n                 database_path: str = None):\n        \"\"\"Initialise the base reporter with domain and metrics.\n\n        Args:\n            domain: A :class:`Domain` from which all evaluated samples are drawn.\n            metrics: :obj:`List[str]` with names of the metrics used during\n                evaluation.\n            primary_metric: (optional) :obj:`str` primary metric from `metrics`.\n                This is used to determine the best sample. Defaults to the first one.\n            database_path: (optional) :obj:`str` path to the database for\n                storing experiment history on disk. 
Defaults to in-memory storage.\n        \"\"\"\n        self.domain = domain\n        if not metrics:\n            self.metrics = [\"score\"]\n        else:\n            self.metrics = metrics\n        if not primary_metric:\n            self.primary_metric = self.metrics[0]\n        else:\n            self.primary_metric = primary_metric\n\n        self._default_table_name = f\"trial_{datetime.datetime.now().isoformat()}\"\n        if database_path is not None:\n            if not os.path.exists(database_path):\n                os.makedirs(database_path)\n            db_path = os.path.join(database_path, \"db.json\")\n            self._db = tinydb.TinyDB(\n                db_path,\n                sort_keys=True,\n                indent=4,\n                separators=(',', ': ')\n            )\n        else:\n            from tinydb.storages import MemoryStorage\n            self._db = tinydb.TinyDB(storage=MemoryStorage,\n                                     default_table=self._default_table_name)\n        self._db_default_table = self._db.table(self._default_table_name)\n\n    @property\n    def database(self):\n        \"\"\"Return the logging database.\"\"\"\n        return self._db\n\n    @property\n    def default_database_table(self):\n        \"\"\"Return the default database table name.\"\"\"\n        return self._default_table_name\n\n    def log(self, entry: HistoryEntryType, **kwargs: Any):\n        \"\"\"Create an entry for an optimisation history point in the\n        :class:`Reporter`.\n\n        Args:\n            entry: :class:`HistoryPoint` or :obj:`Tuple[Sample, Dict]`.\n                The history point to log. If given as a tuple of :class:`Sample`\n                instance and a mapping from metric names to results, the\n                variance of the evaluation noise can be supplied by adding\n                an entry in the dict with the metric name and the suffix '_var'.\n            **kwargs: (optional) :obj:`Any`. 
Additional arguments for the\n                logging implementation in a subclass.\n\n        Keyword Args:\n            meta: (optional) additional information to be logged in the database\n                for this entry.\n        \"\"\"\n        if isinstance(entry, Tuple):\n            log_fn = self._log_tuple\n        elif isinstance(entry, HistoryPoint):\n            self._add_to_db(entry, kwargs.pop(\"meta\", None))\n            log_fn = self._log_history_point\n        else:\n            raise TypeError(\n                \"The history point can be either a tuple or a \"\n                \"`HistoryPoint` type object.\"\n            )\n        log_fn(entry, **kwargs)\n\n    def _log_tuple(self, entry: Tuple, **kwargs):\n        \"\"\"Helper function to convert the history entry from tuple to\n        :class:`HistoryPoint` and then log it using the overridden method\n        :meth:`_log_history_point`.\n        \"\"\"\n        if not (len(entry) == 2 and isinstance(entry[0], Sample)\n                and isinstance(entry[1], (Dict, EvaluationScore, float))):\n            raise ValueError(f\"Malformed history entry tuple: {entry}.\")\n        sample, metrics_obj = entry\n        if isinstance(metrics_obj, (float, EvaluationScore)):\n            # use default name for score column\n            metrics_obj = {self.primary_metric: metrics_obj}\n        metrics = {}\n        # create a properly formatted metrics dict of type Dict[str, EvaluationScore]\n        for name, val in metrics_obj.items():\n            if name in metrics:\n                continue\n            if name.endswith(\"_var\"):\n                # slice off the suffix; str.rstrip would strip any trailing\n                # '_', 'v', 'a' or 'r' characters, e.g. 'error_var' -> 'erro'\n                metric_name = name[:-len(\"_var\")]\n                if (metric_name not in metrics_obj\n                        or not isinstance(metrics_obj[metric_name], float)):\n                    raise ValueError(\n                        f\"Metrics dict does not contain a proper value \"\n                        f\"for metric {metric_name}.\"\n                    
)\n                metrics[metric_name] = EvaluationScore(\n                    value=metrics_obj[metric_name],\n                    variance=val\n                )\n            elif isinstance(val, EvaluationScore):\n                metrics[name] = val\n            elif isinstance(val, float):\n                metrics[name] = EvaluationScore(\n                    value=val,\n                    variance=metrics_obj.get(f\"{name}_var\", 0.0)\n                )\n        entry = HistoryPoint(sample=sample, metrics=metrics)\n        self._add_to_db(entry, kwargs.pop(\"meta\", None))\n        self._log_history_point(entry, **kwargs)\n\n    @abc.abstractmethod\n    def _log_history_point(self, entry: HistoryPoint, **kwargs: Any):\n        \"\"\"Abstract method to override.\n\n        Log the :class:`HistoryPoint` entry into the reporter.\n\n        Args:\n            entry: :class:`HistoryPoint`. The sample and evaluation metrics to log.\n        \"\"\"\n        raise NotImplementedError\n\n    def _add_to_db(self, entry: HistoryPoint, meta: Any = None):\n        document = self._convert_history_to_doc(entry)\n        if meta is not None:\n            document[\"meta\"] = meta\n        self._db_default_table.insert(document)\n\n    def get_best(self, criterion: Union[str, Callable] = \"max\") -> Optional[Dict[str, Any]]:\n        \"\"\"Return the entry from the database which corresponds to the best\n        scoring experiment.\n\n        Args:\n            criterion: :obj:`str` or :obj:`Callable`. The function used to\n                determine whether the highest or lowest score is requested. If\n                several evaluation metrics are present, then a custom `criterion`\n                must be supplied.\n\n        Returns:\n            JSON object or `None` if the database is empty. 
The content of the\n            database for the best experiment.\n        \"\"\"\n        if not self._db_default_table:\n            return None\n        if isinstance(criterion, str):\n            predefined = {\"max\": max, \"min\": min}\n            if criterion not in predefined:\n                raise ValueError(\n                    f\"Unknown criterion for finding best experiment. \"\n                    f\"Select one from {list(predefined.keys())} \"\n                    f\"or supply a custom function.\"\n                )\n            selection_fn = predefined[criterion]\n        elif isinstance(criterion, Callable):\n            selection_fn = criterion\n        else:\n            raise TypeError(\"The criterion must be of type str or Callable.\")\n        return self._get_best_from_db(selection_fn)\n\n    def _get_best_from_db(self, selection_fn: Callable):\n        best_entry = self._db_default_table.get(doc_id=1)\n        best_score = best_entry[\"metrics\"][self.primary_metric][\"value\"]\n        for entry in self._db_default_table:\n            current_score = entry[\"metrics\"][self.primary_metric][\"value\"]\n            new_score = selection_fn(current_score, best_score)\n            if new_score != best_score:\n                best_entry = entry\n                best_score = new_score\n        return best_entry\n\n    def from_history(self, history: List[HistoryEntryType]):\n        \"\"\"Load the reporter with data from an entry of evaluations.\n\n        Args:\n            history: :obj:`List[HistoryPoint]` or :obj:`Tuple`. 
The sequence of\n                evaluations comprised of samples and metrics.\n        \"\"\"\n        for h in history:\n            self.log(h)\n\n    def from_database(self, database: Union[str, tinydb.TinyDB], table: str = None):\n        \"\"\"Load history from a database supplied as a path to a file or a\n        :obj:`tinydb.TinyDB` object.\n\n        Args:\n            database: :obj:`str` or :obj:`tinydb.TinyDB`. The database to load.\n            table: (optional) :obj:`str`. The table to load from the database.\n                This argument is not required if the database has only one table.\n\n        Raises:\n            :class:`ValueError`: if the database contains more than one table\n                and `table` is not given.\n        \"\"\"\n        if isinstance(database, str):\n            db = tinydb.TinyDB(database, sort_keys=True, indent=4, separators=(',', ': '))\n        elif isinstance(database, tinydb.TinyDB):\n            db = database\n        else:\n            raise TypeError(\"The database must be of type str or tinydb.TinyDB.\")\n        if len(db.tables()) > 1 and table is None:\n            raise ValueError(\n                \"Ambiguous database with multiple tables. \"\n                \"Specify a table name.\"\n            )\n        if table is None:\n            table = list(db.tables())[0]\n        self._db = db\n        self._db_default_table = self._db.table(table)\n\n    def to_history(self, table: str = None) -> List[HistoryPoint]:\n        \"\"\"Export the reporter logged history from a database table to an\n        optimiser-friendly history.\n\n        Args:\n            table: (optional) :obj:`str`. 
The name of the table to export.\n                Defaults to the one created during reporter initialisation.\n\n        Returns:\n            A list of :class:`HistoryPoint` objects which can be loaded into\n            an :class:`Optimiser` instance.\n        \"\"\"\n        history = []\n        if table is None:\n            default_table = self._db_default_table\n        else:\n            default_table = self._db.table(table)\n        for doc in default_table:\n            history.append(self._convert_doc_to_history(doc))\n        return history\n\n    @staticmethod\n    def _convert_history_to_doc(entry: HistoryPoint) -> Dict:\n        db_entry = {\n            \"sample\": entry.sample.as_dict(),\n            \"metrics\": {k: {\n                \"value\": v.value,\n                \"variance\": v.variance\n            } for k, v in entry.metrics.items()}\n        }\n        return db_entry\n\n    @staticmethod\n    def _convert_doc_to_history(document: Dict) -> HistoryPoint:\n        hist_point = HistoryPoint(\n            sample=Sample(document[\"sample\"]),\n            metrics={k: EvaluationScore(v[\"value\"], v[\"variance\"])\n                     for k, v in document[\"metrics\"].items()}\n        )\n        return hist_point\n"
  },
  {
    "path": "hypertunity/reports/table.py",
    "content": "from typing import Any, List, Union\n\nimport beautifultable as bt\nimport numpy as np\nimport tinydb\n\nfrom hypertunity import utils\nfrom hypertunity.domain import Domain\nfrom hypertunity.optimisation.base import HistoryPoint\n\nfrom .base import Reporter\n\n__all__ = [\n    \"Table\"\n]\n\n\nclass Table(Reporter):\n    \"\"\"A :class:`Reporter` subclass to print and store a formatted table of\n    the results.\n    \"\"\"\n\n    def __init__(self, domain: Domain,\n                 metrics: List[str],\n                 primary_metric: str = \"\",\n                 database_path: str = None):\n        \"\"\"Initialise the table reporter with domain and metrics.\n\n        Args:\n            domain: A :class:`Domain` from which all evaluated samples are drawn.\n            metrics: :obj:`List[str]` with names of the metrics used during evaluation.\n            primary_metric: (optional) :obj:`str` primary metric from `metrics`.\n                This is used to determine the best sample. Defaults to the first one.\n            database_path: (optional) :obj:`str` path to the database for\n                storing experiment history on disk. 
Defaults to in-memory storage.\n        \"\"\"\n        super(Table, self).__init__(\n            domain, metrics, primary_metric, database_path\n        )\n        self._table = bt.BeautifulTable()\n        self._table.set_style(bt.STYLE_SEPARATED)\n        dim_names = [\".\".join(dns) for dns in self.domain.flatten()]\n        self._table.column_headers = [\"No.\", *dim_names, *self.metrics]\n\n    def __str__(self):\n        \"\"\"Return the string representation of the table.\"\"\"\n        return str(self._table)\n\n    @property\n    def data(self) -> np.ndarray:\n        \"\"\"Return the table as a numpy array.\"\"\"\n        return np.array(self._table)\n\n    def _log_history_point(self, entry: HistoryPoint, **kwargs: Any):\n        \"\"\"Create an entry for a :class:`HistoryPoint` in the table.\n\n        Args:\n            entry: :class:`HistoryPoint`. The sample and evaluation metrics\n                to log as a new row of the table.\n        \"\"\"\n        id_ = len(self._table)\n        row = [id_ + 1,\n               *entry.sample.flatten().values(),\n               *entry.metrics.values()]\n        self._table.append_row(row)\n\n    @utils.support_american_spelling\n    def format(self, order: str = \"none\", emphasise: bool = False) -> str:\n        \"\"\"Format the table and return it as a string.\n\n        Supported formatting is sorting and emphasising of the best result.\n\n        Args:\n            order: (optional) :obj:`str`. The order of sorting by the primary\n                metric. Can be \"none\", \"ascending\" or \"descending\".\n                Defaults to \"none\".\n            emphasise: (optional) :obj:`bool`. 
Whether to emphasise the best\n                experiment by marking it in yellow and blinking if supported.\n                Defaults to `False`.\n\n        Returns:\n            :obj:`str` of the formatted table.\n        \"\"\"\n        table_copy = self._table.copy()\n        if order not in [\"none\", \"descending\", \"ascending\"]:\n            raise ValueError(\n                \"`order` argument can only be 'none', 'ascending' \"\n                \"or 'descending'.\"\n            )\n        if order != \"none\":\n            table_copy.sort(\n                key=self.primary_metric,\n                reverse=order == \"descending\"\n            )\n        if emphasise:\n            best_row_ind = int(np.argmax(\n                list(table_copy.get_column(self.primary_metric))\n            ))\n            emphasised_best_row = map(\n                lambda x: f\"\\033[33;5;7m{x}\\033[0m\", table_copy[best_row_ind]\n            )\n            table_copy.update_row(best_row_ind, emphasised_best_row)\n        return str(table_copy)\n\n    def from_database(self, database: Union[str, tinydb.TinyDB], table: str = None):\n        \"\"\"Load history from a database supplied as a path to a file or a\n        :obj:`tinydb.TinyDB` object.\n\n        Args:\n            database: :obj:`str` or :obj:`tinydb.TinyDB`. The database to load.\n            table: (optional) :obj:`str`. The table to load from the database.\n                This argument is not required if the database has only one table.\n\n        Raises:\n            :class:`ValueError`: if the database contains more than one table\n            and `table` is not given.\n        \"\"\"\n        super(Table, self).from_database(database, table)\n        for doc in self._db_default_table:\n            history_point = self._convert_doc_to_history(doc)\n            self._log_history_point(history_point)\n"
  },
  {
    "path": "hypertunity/reports/tensorboard.py",
    "content": "import os\nimport sys\nfrom typing import Any, Dict, List, Union\n\nimport tinydb\n\nfrom hypertunity import utils\nfrom hypertunity.domain import Domain, Sample\nfrom hypertunity.optimisation.base import HistoryPoint\n\nfrom .base import Reporter\n\ntry:\n    import tensorflow as tf\n    from tensorboard.plugins.hparams import api as hp\nexcept ImportError as err:\n    raise ImportError(\"Install tensorflow>=1.14 and tensorboard>=1.14 \"\n                      \"to support the HParams plugin.\") from err\n\n\n__all__ = [\n    \"Tensorboard\"\n]\n\nEAGER_MODE = tf.executing_eagerly()\nsession_builder = tf.compat.v1.Session\nif str(tf.version.VERSION) < \"2.\":\n    summary_file_writer = tf.compat.v2.summary.create_file_writer\n    summary_scalar = tf.compat.v2.summary.scalar\nelse:\n    summary_file_writer = tf.summary.create_file_writer\n    summary_scalar = tf.summary.scalar\n\n\nclass Tensorboard(Reporter):\n    \"\"\"A :class:`Reporter` subclass to visualise the results in Tensorboard.\n\n    It utilises Tensorboard's HParams plugin as a dashboard for the summary of\n    the optimisation. This class prepares and creates entries with the scalar\n    data of the experiment trials, containing the domain sample and the\n    corresponding metrics.\n\n    Notes:\n        The user is responsible for launching TensorBoard in the browser.\n    \"\"\"\n\n    def __init__(self, domain: Domain, metrics: List[str], logdir: str,\n                 primary_metric: str = \"\",\n                 database_path: str = None):\n        \"\"\"Initialise the TensorBoard reporter.\n\n        Args:\n            domain: :class:`Domain`. The domain to which all evaluated samples belong.\n            metrics: :obj:`List[str]`. The names of the metrics.\n            logdir: :obj:`str`. Path to a folder for storing the Tensorboard events.\n            primary_metric: (optional) :obj:`str`. 
Primary metric from `metrics`.\n                This is used by the :py:meth:`format` method to determine the\n                sorting column and the best value. Default is the first one.\n            database_path: (optional) :obj:`str`. The path to the database for\n                storing experiment history on disk. Default is in-memory storage.\n        \"\"\"\n        super(Tensorboard, self).__init__(\n            domain, metrics, primary_metric, database_path\n        )\n        self._hparams_domain = self._convert_to_hparams_domain(self.domain)\n        if not os.path.exists(logdir):\n            os.makedirs(logdir)\n        self._logdir = logdir\n        self._experiment_counter = 0\n        self._set_up()\n        print(f\"Run 'tensorboard --logdir={logdir}' to launch \"\n              f\"the visualisation in TensorBoard\", file=sys.stderr)\n\n    @staticmethod\n    def _convert_to_hparams_domain(domain: Domain) -> Dict[str, hp.HParam]:\n        hparams = {}\n        for var_name, dim in domain.flatten().items():\n            dim_type = Domain.get_type(dim)\n            joined_name = utils.join_strings(var_name, join_char=\"/\")\n            if dim_type == Domain.Continuous:\n                hp_dim_type = hp.RealInterval\n                vals = list(map(float, dim))\n            elif dim_type in [Domain.Discrete, Domain.Categorical]:\n                hp_dim_type = hp.Discrete\n                vals = (dim,)\n            else:\n                raise TypeError(\n                    f\"Cannot map subdomain of type {dim_type} \"\n                    f\"to a known HParams domain.\"\n                )\n            hparams[joined_name] = hp.HParam(joined_name, hp_dim_type(*vals))\n        return hparams\n\n    def _convert_to_hparams_sample(self, sample: Sample) -> Dict[hp.HParam, Any]:\n        hparams = {}\n        for name, val in sample:\n            joined_name = utils.join_strings(name, join_char=\"/\")\n            
hparams[self._hparams_domain[joined_name]] = val\n        return hparams\n\n    def _set_up(self):\n        with summary_file_writer(self._logdir).as_default():\n            hp.hparams_config(\n                hparams=self._hparams_domain.values(),\n                metrics=[hp.Metric(m) for m in self.metrics])\n\n    @staticmethod\n    def _log_tf_eager_mode(params, metrics, full_experiment_dir):\n        \"\"\"Log in eager mode.\"\"\"\n        with summary_file_writer(full_experiment_dir).as_default():\n            hp.hparams(params)\n            for metric_name, metric_value in metrics.items():\n                summary_scalar(metric_name, metric_value.value, step=1)\n\n    @staticmethod\n    def _log_tf_graph_mode(params, metrics, full_experiment_dir):\n        \"\"\"Log in legacy graph execution mode with session creation.\"\"\"\n        with summary_file_writer(full_experiment_dir).as_default() as fw, session_builder() as sess:\n            sess.run(fw.init())\n            sess.run(hp.hparams(params))\n            for metric_name, metric_value in metrics.items():\n                sess.run(summary_scalar(metric_name, metric_value.value, step=1))\n            sess.run(fw.flush())\n\n    def _log_history_point(self, entry: HistoryPoint, experiment_dir: str = None):\n        \"\"\"Create an entry for a :class:`HistoryPoint` in Tensorboard.\n\n        Args:\n            entry: :class:`HistoryPoint`. The sample and evaluation metrics to log.\n            experiment_dir: (optional) :obj:`str`. The directory name where to\n                store all experiment related data. It will be prefixed by the\n                `logdir` path which is provided on initialisation of the\n                :class:`Tensorboard` object. 
Default is 'experiment_[number]'.\n        \"\"\"\n        converted = self._convert_to_hparams_sample(entry.sample)\n        if not experiment_dir:\n            experiment_dir = f\"experiment_{str(self._experiment_counter)}\"\n            self._experiment_counter += 1\n        full_experiment_dir = os.path.join(self._logdir, experiment_dir)\n        if EAGER_MODE:\n            self._log_tf_eager_mode(converted, entry.metrics, full_experiment_dir)\n        else:\n            self._log_tf_graph_mode(converted, entry.metrics, full_experiment_dir)\n\n    def from_database(self, database: Union[str, tinydb.TinyDB], table: str = None):\n        \"\"\"Load history from a database supplied as a path to a file or a\n        :obj:`tinydb.TinyDB` object.\n\n        Args:\n            database: :obj:`str` or :obj:`tinydb.TinyDB`. The database to load.\n            table: (optional) :obj:`str`. The table to load from the database.\n                This argument is not required if the database has only one table.\n\n        Raises:\n            :class:`ValueError`: if the database contains more than one table\n            and `table` is not given.\n        \"\"\"\n        super(Tensorboard, self).from_database(database, table)\n        for doc in self._db_default_table:\n            history_point = self._convert_doc_to_history(doc)\n            self._log_history_point(history_point)\n"
  },
  {
    "path": "hypertunity/reports/tests/__init__.py",
    "content": ""
  },
  {
    "path": "hypertunity/reports/tests/conftest.py",
    "content": "import pytest\n\nfrom hypertunity.domain import Domain\nfrom hypertunity.optimisation.base import EvaluationScore, HistoryPoint\n\n\n@pytest.fixture(scope=\"session\")\ndef generated_history():\n    domain = Domain({\n        \"x\": [-5., 6.],\n        \"y\": {\"sin\", \"sqr\"},\n        \"z\": set(range(4))\n    }, seed=7)\n    n_samples = 10\n    history = [HistoryPoint(sample=domain.sample(),\n                            metrics={\"metric_1\": EvaluationScore(float(i)),\n                                     \"metric_2\": EvaluationScore(i * 2.)})\n               for i in range(n_samples)]\n    if len(history) == 1:\n        history = history[0]\n    return history, domain\n"
  },
  {
    "path": "hypertunity/reports/tests/test_table.py",
    "content": "import os\nimport tempfile\n\nfrom hypertunity.optimisation.base import EvaluationScore\n\nfrom ..table import Table\n\n\ndef test_from_to_history(generated_history):\n    history, domain = generated_history\n    rep = Table(\n        domain,\n        metrics=[\"metric_1\", \"metric_2\"],\n        primary_metric=\"metric_1\"\n    )\n    rep.from_history(history)\n    data_history = [\n        [i + 1, *list(h.sample.flatten().values()), *list(h.metrics.values())]\n        for i, h in enumerate(history)\n    ]\n    assert rep.data.tolist() == data_history\n    assert rep.to_history() == history\n\n\ndef test_from_tuple_and_history_point(generated_history):\n    history, domain = generated_history\n    hist_point = history[0]\n    rep = Table(\n        domain,\n        metrics=[\"metric_1\", \"metric_2\"],\n        primary_metric=\"metric_1\"\n    )\n    rep.log(hist_point)\n    sample = domain.sample()\n    rep.log((sample, {\"metric_1\": 1.0, \"metric_2\": 2.0, \"metric_2_var\": 3.0}))\n    assert rep.data.tolist() == [\n        [1, *list(hist_point.sample.flatten().values()),\n         *list(hist_point.metrics.values())],\n        [2, *list(sample.flatten().values()),\n         EvaluationScore(1.0), EvaluationScore(2.0, 3.0)]\n    ]\n\n\ndef test_database_and_get_best(generated_history):\n    history, domain = generated_history\n    with tempfile.TemporaryDirectory() as db_dir:\n        rep = Table(\n            domain,\n            metrics=[\"metric_1\", \"metric_2\"],\n            database_path=db_dir\n        )\n        best_meta, best_metrics, best_sample = {}, {}, {}\n        best_score = float(\"-inf\")\n        for i, hp in enumerate(history):\n            rep.log(hp, meta={\"id\": i})\n            if hp.metrics[\"metric_1\"].value > best_score:\n                best_meta = {\"id\": i}\n                best_metrics = {k: {\"value\": v.value, \"variance\": v.variance}\n                                for k, v in hp.metrics.items()}\n           
     best_sample = hp.sample.as_dict()\n                best_score = hp.metrics[\"metric_1\"].value\n\n        assert len(rep.database.table(rep.default_database_table)) == len(history)\n        best_entry = rep.get_best(criterion=\"max\")\n        assert best_entry[\"meta\"] == best_meta\n        assert best_entry[\"metrics\"] == best_metrics\n        assert best_entry[\"sample\"] == best_sample\n\n        rep2 = Table(domain, metrics=[\"metric_1\", \"metric_2\"])\n        rep2.from_database(rep.database, table=rep.default_database_table)\n        rep3 = Table(domain, metrics=[\"metric_1\", \"metric_2\"])\n        rep3.from_database(os.path.join(db_dir, \"db.json\"),\n                           table=rep.default_database_table)\n\n        assert str(rep) == str(rep2) == str(rep3)\n        assert rep.get_best() == rep2.get_best() == rep3.get_best()\n"
  },
  {
    "path": "hypertunity/reports/tests/test_tensorboard.py",
    "content": "import os\nimport tempfile\n\nfrom ..tensorboard import Tensorboard\n\n\ndef test_from_to_history(generated_history):\n    history, domain = generated_history\n    with tempfile.TemporaryDirectory() as tmp_dir:\n        rep = Tensorboard(\n            domain,\n            metrics=[\"metric_1\", \"metric_2\"],\n            logdir=tmp_dir\n        )\n        rep.from_history(history)\n        assert len([dirname for dirname in os.listdir(tmp_dir)\n                    if dirname.startswith(\"experiment_\")]) == len(history)\n        for root, dirs, files in os.walk(tmp_dir):\n            assert all(map(lambda x: x.startswith(\"events.out.tfevents\"), files))\n        assert rep.to_history() == history\n\n\ndef test_from_tuple_and_history_point(generated_history):\n    history, domain = generated_history\n    hist_point = history[0]\n    with tempfile.TemporaryDirectory() as tmp_dir:\n        rep = Tensorboard(\n            domain,\n            metrics=[\"metric_1\", \"metric_2\"],\n            logdir=tmp_dir\n        )\n        rep.log(hist_point)\n        rep.log((domain.sample(),\n                 {\"metric_1\": 1.0, \"metric_2\": 2.0, \"metric_2_var\": 3.0}))\n        assert len([dirname for dirname in os.listdir(tmp_dir)\n                    if dirname.startswith(\"experiment_\")]) == 2\n        for root, dirs, files in os.walk(tmp_dir):\n            assert all(map(lambda x: x.startswith(\"events.out.tfevents\"), files))\n"
  },
  {
    "path": "hypertunity/scheduling/__init__.py",
    "content": "from .jobs import *\nfrom .scheduler import *\n"
  },
  {
    "path": "hypertunity/scheduling/jobs.py",
    "content": "\"\"\"Definition of `Job` and `Result` classes used to encapsulate an experiment\nand the corresponding outcomes.\n\"\"\"\n\nimport enum\nimport importlib\nimport os\nimport pickle\nimport re\nimport subprocess\nimport sys\nimport tempfile\nimport time\nfrom dataclasses import dataclass, field\nfrom functools import partial\nfrom typing import Any, Callable, Dict, List, Tuple, Union\n\n__all__ = [\n    \"Job\",\n    \"SlurmJob\",\n    \"Result\"\n]\n\n# Global registries to control the job and result id assignment\n_JOB_REGISTRY = set()\n_RESULT_REGISTRY = set()\n_ID_COUNTER = -1\n\n\ndef reset_registry():\n    \"\"\"Reset the global job and result registries.\n\n    Notes:\n        This function should be used with care as it will allow for jobs with\n        repeating IDs to be created. As a consequence, two or more\n        :class:`Result` objects might coexist end make the actual experiment\n        outcome ambiguous.\n    \"\"\"\n    global _ID_COUNTER\n    _JOB_REGISTRY.clear()\n    _RESULT_REGISTRY.clear()\n    _ID_COUNTER = -1\n\n\ndef generate_id():\n    \"\"\"Generate a new, unused integer job id.\"\"\"\n    global _ID_COUNTER\n    _ID_COUNTER += 1\n    return _ID_COUNTER\n\n\ndef import_script(path):\n    \"\"\"Import a module or script by a given path.\n\n    Args:\n        path: :obj:`str`, can be either a module import of the form\n            [package.]*[module] if the outer most package is in the\n            `PYTHONPATH`, or a path to an arbitrary python script.\n\n    Returns:\n        The loaded python script as a module.\n    \"\"\"\n    try:\n        module = importlib.import_module(path)\n    except ModuleNotFoundError:\n        if not os.path.isfile(path):\n            raise FileNotFoundError(f\"Cannot find script {path}.\")\n        if not os.path.basename(path).endswith(\".py\"):\n            raise ValueError(\n\n                f\"Expected a python script ending with *.py, \"\n                f\"found 
{os.path.basename(path)}.\")\n        import_path = os.path.dirname(os.path.abspath(path))\n        sys.path.append(import_path)\n        module = importlib.import_module(\n            # splitext drops only the extension; str.rstrip('.py') would\n            # also strip trailing '.', 'p' and 'y' characters from the name\n            os.path.splitext(os.path.basename(path))[0],\n            package=f\"{os.path.basename(import_path)}\"\n        )\n        sys.path.pop()\n    return module\n\n\ndef run_command(cmd: List[str]) -> str:\n    \"\"\"Execute a command in the shell.\n\n    Args:\n        cmd: :obj:`List[str]`. The command with its arguments to execute.\n\n    Returns:\n        The standard output of the command.\n\n    Raises:\n        :obj:`OSError`: if the standard error stream is not empty.\n    \"\"\"\n    ps = subprocess.run(args=cmd, capture_output=True)\n    if ps.stderr:\n        raise OSError(f\"Failed running {' '.join(cmd)} with error message: \"\n                      f\"{ps.stderr.decode('utf-8')}.\")\n    return ps.stdout.decode(\"utf-8\")\n\n\ndef get_callable_from_script(script_path: str, func_name: str = \"main\") -> Callable:\n    \"\"\"Create a callable that imports a script module and calls one of its\n    functions, `main` by default.\n\n    Args:\n        script_path: str, the file path to the python script to run. It can\n            either be given as a module i.e. 
in the [package.]*[module] form,\n            or as a path to a *.py file in case it is not added into the\n            PYTHONPATH environment variable.\n        func_name: str, the name of the function to run.\n\n    Returns:\n        The wrapper which calls a function from the script module.\n\n    Raises:\n          `AttributeError` if the script does not define a `func_name` function.\n    \"\"\"\n\n    def wrapper(*args):\n        module = import_script(script_path)\n        if not hasattr(module, func_name):\n            raise AttributeError(\n                f\"Cannot find {func_name} function in {script_path}.\"\n            )\n        return getattr(module, func_name)(*args)\n\n    return wrapper\n\n\ndef run_script_with_args(binary: str, script_path: str, *args: Any, **kwargs: Any):\n    \"\"\"Run script using a binary and command line arguments.\n\n    Args:\n        binary: str, the binary to run the script with, e.g. 'python'.\n        script_path: str, the path to the script.\n        *args: Any, a collection of arguments which will be converted to string\n            and passed on to the run command.\n        **kwargs: Any, keyword arguments which will be converted to named script\n            arguments.\n\n    Returns:\n        The contents of the results, which the script is assumed to store,\n        given an output file path as an argument.\n\n    Raises:\n        FileNotFoundError if the script cannot be found.\n\n    Notes:\n        It assumes that the script will store the results on disk using the\n        path provided by the last of the command line arguments.\n    \"\"\"\n    if not os.path.isfile(script_path):\n        raise FileNotFoundError(f\"Cannot find script {script_path}.\")\n    with tempfile.TemporaryDirectory() as tmpdir:\n        output_file = os.path.join(tmpdir, \"results.pkl\")\n        args_as_str, kwargs_as_str = [], []\n        if args:\n            args_as_str.extend([*map(str, args), output_file])\n        if kwargs:\n  
          kwargs_as_str.extend([\n                str(item) for k_v in kwargs.items() for item in k_v\n            ])\n            kwargs_as_str.extend([\"--output_file\", output_file])\n        run_command([binary, script_path, *args_as_str, *kwargs_as_str])\n        return fetch_result(output_file)\n\n\ndef fetch_result(output_file, n_trials: int = 5, waiting_time: float = 1.0) -> Any:\n    \"\"\"Load the output file.\n\n    Args:\n        output_file: str, a path to the output file.\n        n_trials: int, optional number of attempts to load the file, after\n            which None is returned.\n        waiting_time: float, time in seconds to wait before retrying to load\n            the file.\n\n    Returns:\n        The unpickled output file if found, else None.\n    \"\"\"\n    if output_file is None:\n        return None\n    for _ in range(n_trials):\n        if os.path.isfile(output_file):\n            break\n        time.sleep(waiting_time)\n    else:\n        return None\n    with open(output_file, 'rb') as fp:\n        return pickle.load(fp)\n\n\n@dataclass(frozen=True)\nclass Job:\n    \"\"\"Default :class:`Job` class defining an experiment as a runnable task on\n    the local machine.\n\n    The job is defined by a callable function or a script task. In the case of\n    the former the `args` will be passed directly to it upon calling. Otherwise\n    either a module will be run as a script with command line arguments or a\n    function, attribute of the module, will be called with the `args` as input.\n    In both cases a :class:`Result` object will be returned.\n\n    Attributes:\n        id: :obj:`int`. The job identifier. Must be unique.\n        args: :obj:`tuple` or :obj:`dict`. 
The arguments or keyword arguments\n            for the callable function or script.\n        task: :obj:`Callable` or :obj:`str`, a python function to run or a\n            file path to a python script.\n    \"\"\"\n    task: Union[Callable, str]\n    args: Union[Tuple, Dict] = ()\n    id: int = field(default_factory=generate_id)\n    meta: Any = None\n\n    # job related constants\n    _JOB_SCRIPT_FUNC_SEPARATOR = \":\"\n    _JOB_DEFAULT_BINARY = \"source\"\n    _JOB_SCRIPT_FUNC_SEPARATION_REGEX = r\"[^\\w\\/\\.]+\"\n\n    def __post_init__(self):\n        if not isinstance(self.task, (Callable, str)):\n            raise ValueError(\n                \"Job's task must be either a callable function \"\n                \"or a path to a script.\"\n            )\n        if self.id in _JOB_REGISTRY:\n            raise ValueError(\n                f\"Job with an ID {self.id} is already created. \"\n                f\"Reusing IDs is prohibited.\"\n            )\n        _JOB_REGISTRY.add(self.id)\n\n    def __hash__(self):\n        return hash(str(self.id))\n\n    def __call__(self, *args, **kwargs) -> 'Result':\n        all_args = args\n        all_kwargs = kwargs\n        if isinstance(self.args, Tuple):\n            all_args += self.args\n        else:\n            all_kwargs = dict(**kwargs, **self.args)\n        if isinstance(self.task, Callable):\n            runnable = self.task\n        else:\n            runnable = self._build_callable()\n        return Result(id=self.id, data=runnable(*all_args, **all_kwargs))\n\n    def _build_callable(self):\n        \"\"\"Create a function from a string task.\n\n        If the task is of the form /path/to/script.py::func_to_run, split the\n        path from the func and return a script.func_to_run callable.\n        If the task is of the form /path/to/script.py, then return a\n        python /path/to/script.py callable.\n        \"\"\"\n        if self._JOB_SCRIPT_FUNC_SEPARATOR in self.task:\n            # split the task 
string by the [:]+ marker\n            script_path, func_name = re.split(\n                self._JOB_SCRIPT_FUNC_SEPARATION_REGEX, self.task\n            )\n            assert script_path and func_name, \\\n                f\"Empty path {script_path} or function name {func_name}\"\n            runnable = get_callable_from_script(script_path, func_name)\n        else:\n            binary = self._infer_binary()\n            runnable = partial(run_script_with_args, binary, self.task)\n        return runnable\n\n    def _infer_binary(self):\n        if isinstance(self.meta, dict) and \"binary\" in self.meta:\n            return self.meta[\"binary\"]\n        if self.task.endswith(\".py\"):\n            return \"python\"\n        if self.task.endswith(\".sh\"):\n            return \"bash\"\n        return self._JOB_DEFAULT_BINARY\n\n\nclass SlurmJobState(enum.Enum):\n    \"\"\"Some of the most frequently encountered slurm job statuses.\"\"\"\n\n    PENDING = 0\n    RUNNING = 1\n    COMPLETED = 2\n    FAILED = 3\n    CANCELLED = 4\n    UNKNOWN = 5\n\n    @classmethod\n    def from_string(cls, state: str):\n        if state == \"running\":\n            return cls.RUNNING\n        if state == \"pending\":\n            return cls.PENDING\n        if state == \"completed\":\n            return cls.COMPLETED\n        if state == \"failed\":\n            return cls.FAILED\n        if state == \"cancelled\":\n            return cls.CANCELLED\n        return cls.UNKNOWN\n\n\n@dataclass(frozen=True)\nclass SlurmJob(Job):\n    \"\"\"A :class:`Job` subclass to schedule tasks on Slurm.\n\n    Runs an 'sbatch' command in the shell with the script.\n\n    Attributes:\n        output_file: (optional) :obj:`str`. Path to the file where the executed\n            script will dump the result file. 
If none is provided, a temporary\n            file will be created.\n    \"\"\"\n\n    output_file: str = None\n\n    # slurm shell commands\n    _SLURM_CMD_PUSH = [\"sbatch\"]\n    _SLURM_CMD_KILL = [\"scancel\"]\n    _SLURM_CMD_INFO = [\"scontrol\", \"show\", \"job\"]\n\n    # slurm script elements\n    _SLURM_SCRIPT_PREAMBLE = \"#!/bin/bash\"\n    _SLURM_SCRIPT_LINE_PREFIX = \"#SBATCH\"\n    _SLURM_SCRIPT_JOB_NAME = \"--job-name\"\n    _SLURM_SCRIPT_OUT_NAME = \"--output\"\n    _SLURM_SCRIPT_RESOURCES_MEM = \"--mem\"\n    _SLURM_SCRIPT_RESOURCES_TIME = \"--time\"\n    _SLURM_SCRIPT_RESOURCES_CPU = \"--cpus-per-task\"\n    _SLURM_SCRIPT_RESOURCES_GPU = \"--gres\"\n\n    # other macros\n    _SLURM_JOB_STATE_REGEX = r\"JobState=(RUNNING|PENDING|COMPLETED|FAILED|CANCELLED)\"\n\n    def __post_init__(self):\n        if not isinstance(self.task, str):\n            raise ValueError(\"Slurm job must be defined with a script to run.\")\n        super(SlurmJob, self).__post_init__()\n\n    def __call__(self) -> 'Result':\n        res = self._execute_job()\n        return Result(id=self.id, data=res)\n\n    def _execute_job(self) -> Any:\n        with tempfile.NamedTemporaryFile(mode=\"w+t\", suffix=\".sh\") as fp:\n            contents = self._create_slurm_script()\n            fp.writelines(contents)\n            fp.seek(0)\n            response = run_command(self._SLURM_CMD_PUSH + [f\"{fp.name}\"])\n        slurm_id = int(re.search(r\"[\\d]+\", response).group())\n        while True:\n            slurm_status = self._query_job_status(slurm_id)\n            if slurm_status in [SlurmJobState.RUNNING, SlurmJobState.PENDING]:\n                time.sleep(1)\n            elif slurm_status in [SlurmJobState.CANCELLED, SlurmJobState.FAILED]:\n                return None\n            elif slurm_status == SlurmJobState.COMPLETED:\n                return fetch_result(self.output_file)\n            else:\n                raise RuntimeError(f\"Unknown state of slurm job 
{slurm_id}.\")\n\n    def _create_slurm_script(self) -> List[str]:\n        if not self.meta:\n            raise ValueError(f\"Cannot infer slurm job parameters. \"\n                             f\"Fill in meta dict in job {self.id}.\")\n        else:\n            # Preamble, job name and output log filename definitions\n            content_lines = [\n                f\"{self._SLURM_SCRIPT_PREAMBLE}\\n\",\n                f\"{self._SLURM_SCRIPT_LINE_PREFIX} \"\n                f\"{self._SLURM_SCRIPT_JOB_NAME}=job_{self.id}\\n\",\n                f\"{self._SLURM_SCRIPT_LINE_PREFIX} \"\n                f\"{self._SLURM_SCRIPT_OUT_NAME}=log_%j.txt\\n\"]\n            # Resources specification\n            n_cpus = int(self.meta.get(\"resources\", {}).get(\"cpu\", 1))\n            if n_cpus >= 1:\n                content_lines.append(\n                    f\"{self._SLURM_SCRIPT_LINE_PREFIX} \"\n                    f\"{self._SLURM_SCRIPT_RESOURCES_CPU}={n_cpus}\\n\"\n                )\n            gpus = str(self.meta.get(\"resources\", {}).get(\"gpu\", \"\"))\n            if gpus:\n                if gpus.isnumeric():\n                    gpus = f\"gpu:{gpus}\"\n                content_lines.append(\n                    f\"{self._SLURM_SCRIPT_LINE_PREFIX} \"\n                    f\"{self._SLURM_SCRIPT_RESOURCES_GPU}={gpus}\\n\"\n                )\n            mem = str(self.meta.get(\"resources\", {}).get(\"memory\", \"\"))\n            if mem:\n                content_lines.append(\n                    f\"{self._SLURM_SCRIPT_LINE_PREFIX} \"\n                    f\"{self._SLURM_SCRIPT_RESOURCES_MEM}={mem}\\n\"\n                )\n            limit_time = str(self.meta.get(\"resources\", {}).get(\"time\", \"\"))\n            if limit_time:\n                content_lines.append(\n                    f\"{self._SLURM_SCRIPT_LINE_PREFIX} \"\n                    f\"{self._SLURM_SCRIPT_RESOURCES_TIME}={limit_time}\\n\"\n                )\n            # Task specification\n      
      binary = self.meta.get(\"binary\", \"python\")\n            if isinstance(self.args, Tuple):\n                # build positional arguments\n                script_args = ' '.join([*map(str, self.args), self.output_file])\n            else:\n                # build named arguments\n                script_args = ' '.join([\n                    *(str(item)\n                      for key_val in self.args.items()\n                      for item in key_val),\n                    \"--output_file\", self.output_file\n                ])\n            content_lines.append(f\"{binary} {self.task} {script_args}\")\n        return content_lines\n\n    def _query_job_status(self, slurm_id: int) -> SlurmJobState:\n        response = run_command(self._SLURM_CMD_INFO + [str(slurm_id)])\n        job_state = re.search(self._SLURM_JOB_STATE_REGEX, response)\n        if job_state is not None:\n            job_state = job_state.group(1).lower()\n            return SlurmJobState.from_string(job_state)\n\n\n@dataclass(frozen=True)\nclass Result:\n    \"\"\"A :class:`Result` class to store the output of the executed :class:`Job`.\n\n     It shares the same id as the job which generated it.\n\n    Attributes:\n        id: :obj:`int`. The identifier of the `Result` object which corresponds\n            to the job that has been run.\n        data: :obj:`Any`. The output data of the job.\n    \"\"\"\n    data: Any\n    id: int\n\n    def __post_init__(self):\n        if self.id in _RESULT_REGISTRY:\n            raise ValueError(\n                f\"Result with an ID {self.id} is already created. \"\n                f\"Reusing IDs is prohibited.\"\n            )\n        _RESULT_REGISTRY.add(self.id)\n"
  },
  {
    "path": "hypertunity/scheduling/scheduler.py",
    "content": "\"\"\"A scheduler for running jobs locally in a parallel manner using joblib as\na backend.\n\"\"\"\n\nimport multiprocessing as mp\nimport time\nfrom typing import List\n\nimport joblib\n\nfrom hypertunity import utils\n\nfrom .jobs import Job, Result\n\n__all__ = [\n    \"Scheduler\"\n]\n\n\nclass Scheduler:\n    \"\"\"A manager for parallel execution of jobs.\n\n    A job must be of type :class:`Job` which produces a :class:`Result`\n    object upon successful completion. The scheduler maintains a job and\n    result queues.\n\n    Notes:\n        This class should be used as a context manager.\n    \"\"\"\n\n    def __init__(self, n_parallel: int = None):\n        \"\"\"Setup the job and results queues.\n\n        Args:\n            n_parallel: (optional) :obj:`int`. The number of jobs that can be\n                run in parallel. Defaults to `None` in which case all but one\n                available CPUs will be used.\n        \"\"\"\n        self._job_queue = mp.Queue()\n        self._result_queue = mp.Queue()\n        self._is_queue_closed = False\n\n        if n_parallel is None:\n            self.n_parallel = -2  # using all CPUs but 1\n        else:\n            self.n_parallel = max(n_parallel, 1)\n        self._servant = mp.Process(target=self._run_servant)\n        self._interrupt_event = mp.Event()\n        self._servant.start()\n\n    def __del__(self):\n        \"\"\"Clean up subprocesses on object deletion.\n\n        Close the queues and join all subprocesses before the object is deleted.\n        \"\"\"\n        if not self._is_queue_closed:\n            self.exit()\n        if self._servant.is_alive():\n            self._servant.terminate()\n\n    def __enter__(self):\n        \"\"\"Enter the context manager.\"\"\"\n        return self\n\n    def __exit__(self, exc_type, exc_val, exc_tb):\n        \"\"\"Exit the context manager.\"\"\"\n        self.exit()\n\n    def _run_servant(self):\n        \"\"\"Run the pool of workers on 
the dispatched jobs, fetched from the job\n        queue and collect the results into the result queue.\n\n        Notes:\n            The runner waits until all jobs drained from the job queue have\n            finished before any results are written to the result queue.\n        \"\"\"\n        # TODO: Switch backend back to default \"loky\", after the leakage\n        #  of semaphores is fixed\n        with joblib.Parallel(n_jobs=self.n_parallel,\n                             backend=\"multiprocessing\") as parallel:\n            while not self._interrupt_event.is_set():\n                current_jobs = utils.drain_queue(self._job_queue)\n                if not current_jobs:\n                    continue\n                # the order of the results corresponds to that of the jobs\n                # and the IDs don't need to be shuffled.\n                ids = [job.id for job in current_jobs]\n                # TODO: in a future version of joblib, this could be a generator\n                #  and then the inputs would be stored immediately in the results\n                #  queue. Be ready to update whenever this PR gets merged:\n                #  https://github.com/joblib/joblib/pull/588\n                results = parallel(joblib.delayed(job)() for job in current_jobs)\n                assert len(ids) == len(results)\n                for res in results:\n                    self._result_queue.put_nowait(res)\n\n    def dispatch(self, jobs: List[Job]):\n        \"\"\"Dispatch the jobs for parallel execution.\n\n        This method is non-blocking.\n\n        Args:\n            jobs: :obj:`List[Job]`. 
A list of jobs to run whenever resources\n                are available.\n\n        Notes:\n            Although the jobs are scheduled to run immediately, the actual\n            execution may take place after an indefinite delay if the job\n            runner is occupied with older jobs.\n        \"\"\"\n        for job in jobs:\n            self._job_queue.put_nowait(job)\n\n    def collect(self, n_results: int, timeout: float = None) -> List[Result]:\n        \"\"\"Collect all the available results or wait until they become available.\n\n        Args:\n            n_results: :obj:`int`, number of results to wait for.\n                If `n_results` ≤ 0 then all available results will be returned.\n            timeout: (optional) :obj:`float`, number of seconds to wait for\n                results to appear. If None (default) then it will wait until\n                all `n_results` are collected.\n\n        Returns:\n            A list of :class:`Result` objects with length `n_results` at least.\n\n        Notes:\n            If `n_results` is overestimated and timeout is None, then this\n            method will hang forever. 
Therefore it is recommended that a timeout\n            is set.\n\n        Raises:\n            :obj:`TimeoutError`: if more than `timeout` seconds elapse before a\n            :class:`Result` is collected.\n        \"\"\"\n        if n_results > 0:\n            results = []\n            for i in range(n_results):\n                results.append(self._result_queue.get(block=True, timeout=timeout))\n        else:\n            results = utils.drain_queue(self._result_queue)\n        return results\n\n    def interrupt(self):\n        \"\"\"Interrupt the scheduler and all running jobs.\"\"\"\n        self._interrupt_event.set()\n\n    def exit(self):\n        \"\"\"Exit the scheduler by closing the queues and terminating the\n        servant process.\n        \"\"\"\n        if not self._is_queue_closed:\n            utils.drain_queue(self._job_queue, close_queue=True)\n            self._job_queue.join_thread()\n            utils.drain_queue(self._result_queue, close_queue=True)\n            self._result_queue.join_thread()\n            self._is_queue_closed = True\n        self.interrupt()\n        # wait a bit for the subprocess to exit gracefully\n        n_retries = 3\n        while self._servant.is_alive() and n_retries > 0:\n            n_retries -= 1\n            time.sleep(0.05)\n        self._servant.terminate()\n"
  },
  {
    "path": "hypertunity/scheduling/tests/__init__.py",
    "content": ""
  },
  {
    "path": "hypertunity/scheduling/tests/script.py",
    "content": "import argparse\nimport os\nimport pickle\nimport sys\n\n\nclass DoNotReplaceAction(argparse.Action):\n    def __call__(self, parser, namespace, values, option_string=None):\n        if getattr(namespace, self.dest) is None:\n            setattr(namespace, self.dest, values)\n\n\ndef parse_args(args):\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"x\", nargs='?', type=int, action=DoNotReplaceAction)\n    parser.add_argument(\"--x\", type=int)\n    parser.add_argument(\"y\", nargs='?', type=float, action=DoNotReplaceAction)\n    parser.add_argument(\"--y\", type=float)\n    parser.add_argument(\"z\", nargs='?', type=str, action=DoNotReplaceAction)\n    parser.add_argument(\"--z\", type=str)\n    parser.add_argument(\"output_file\", nargs='?', type=str, action=DoNotReplaceAction)\n    parser.add_argument(\"--output_file\", type=str)\n    return parser.parse_args(args)\n\n\ndef main(x: int, y: float, z: str) -> float:\n    if z.endswith(tuple(\"0123456789\")):\n        return y * x\n    return y * x**2\n\n\nif __name__ == '__main__':\n    parsed_args = parse_args(sys.argv[1:])\n    result = main(parsed_args.x, parsed_args.y, parsed_args.z)\n    print(result)\n    output_dir = os.path.dirname(parsed_args.output_file)\n    if not os.path.exists(output_dir):\n        os.makedirs(output_dir)\n    with open(parsed_args.output_file, 'wb') as fp:\n        pickle.dump(result, fp)\n"
  },
  {
    "path": "hypertunity/scheduling/tests/test_jobs.py",
    "content": "import pytest\n\nfrom ..jobs import Job\n\n\ndef test_repeating_id():\n    _ = Job(task=sum, args=(), id=-100)\n    with pytest.raises(ValueError):\n        _ = Job(task=max, args=(), id=-100)\n    _ = Job(task=sum, args=(), id=-99)\n\n\ndef test_callable_job():\n    job_args = (131212, 123123123)\n    job = Job(task=lambda x, y: x + y, args=job_args)\n    result = job()\n    assert result.data == sum(job_args)\n"
  },
  {
    "path": "hypertunity/scheduling/tests/test_scheduler.py",
    "content": "import os\nimport tempfile\n\nimport pytest\n\nfrom hypertunity.domain import Domain, Sample\nfrom hypertunity.optimisation import base\n\nfrom ..jobs import Job, SlurmJob\nfrom ..scheduler import Scheduler\nfrom . import script\n\n\n@pytest.fixture(scope=\"module\")\ndef shared_slurm_tmp_dir():\n    return \"/tmp\"\n\n\ndef square(sample: Sample) -> base.EvaluationScore:\n    return base.EvaluationScore(sample[\"x\"]**2)\n\n\ndef run_jobs(jobs):\n    with Scheduler(n_parallel=2) as scheduler:\n        scheduler.dispatch(jobs)\n        results = scheduler.collect(n_results=len(jobs), timeout=60.0)\n    assert len(results) == len(jobs)\n    assert all([r.id == j.id for r, j in zip(results, jobs)])\n    return results\n\n\n@pytest.mark.timeout(10.0)\ndef test_local_from_script_and_function():\n    domain = Domain({\n        \"x\": {0, 1, 2, 3},\n        \"y\": [-1., 1.],\n        \"z\": {\"123\", \"abc\"}\n    }, seed=7)\n    jobs = [Job(task=\"hypertunity/scheduling/tests/script.py::main\",\n                args=(*domain.sample().as_namedtuple(),)) for _ in range(10)]\n    results = run_jobs(jobs)\n    assert all([r.data == script.main(*j.args) for r, j in zip(results, jobs)])\n\n\n@pytest.mark.timeout(10.0)\ndef test_local_from_script_and_cmdline_args():\n    domain = Domain({\n        \"x\": {0, 1, 2, 3},\n        \"y\": [-1., 1.],\n        \"z\": {\"123\", \"abc\"}\n    }, seed=7)\n    jobs = [Job(task=\"hypertunity/scheduling/tests/script.py\",\n                args=(*domain.sample().as_namedtuple(),),\n                meta={\"binary\": \"python\"}) for _ in range(10)]\n    results = run_jobs(jobs)\n    assert all([r.data == script.main(*j.args) for r, j in zip(results, jobs)])\n\n\n@pytest.mark.timeout(10.0)\ndef test_local_from_script_and_cmdline_named_args():\n    domain = Domain({\n        \"--x\": {0, 1, 2, 3},\n        \"--y\": [-1., 1.],\n        \"--z\": {\"acb123\", \"abc\"}\n    }, seed=7)\n    jobs = 
[Job(task=\"hypertunity/scheduling/tests/script.py\",\n                args=domain.sample().as_dict(),\n                meta={\"binary\": \"python\"}) for _ in range(10)]\n    results = run_jobs(jobs)\n    assert all([\n        r.data == script.main(**{k.lstrip(\"-\"): v for k, v in j.args.items()})\n        for r, j in zip(results, jobs)\n    ])\n\n\n@pytest.mark.timeout(10.0)\ndef test_local_from_fn():\n    domain = Domain({\"x\": [0., 1.]}, seed=7)\n    jobs = [Job(task=square, args=(domain.sample(),)) for _ in range(10)]\n    results = run_jobs(jobs)\n    assert all([r.data.value == square(*j.args).value\n                for r, j in zip(results, jobs)])\n\n\n@pytest.mark.slurm\n@pytest.mark.timeout(60.0)\ndef test_slurm_from_script(shared_slurm_tmp_dir):\n    domain = Domain({\n        \"x\": {0, 1, 2, 3},\n        \"y\": [-1., 1.],\n        \"z\": {\"123\", \"abc\"}\n    }, seed=7)\n    jobs, dirs = [], []\n    n_jobs = 4\n    for i in range(n_jobs):\n        sample = domain.sample()\n        dirs.append(tempfile.TemporaryDirectory(dir=shared_slurm_tmp_dir))\n        jobs.append(SlurmJob(\n            task=\"hypertunity/scheduling/tests/script.py\",\n            args=(*sample.as_namedtuple(),),\n            output_file=f\"{os.path.join(dirs[-1].name, 'results.pkl')}\",\n            meta={\"binary\": \"python\", \"resources\": {\"cpu\": 1}}\n        ))\n    results = run_jobs(jobs)\n    assert all([r.data == script.main(*j.args) for r, j in zip(results, jobs)])\n    # clean-up the temporary dirs\n    for d in dirs:\n        d.cleanup()\n"
  },
  {
    "path": "hypertunity/tests/__init__.py",
    "content": ""
  },
  {
    "path": "hypertunity/tests/test_domain.py",
    "content": "from collections import namedtuple\n\nimport pytest\n\nfrom hypertunity.domain import (\n    Domain,\n    DomainNotIterableError,\n    DomainSpecificationError,\n    Sample\n)\n\n\n@pytest.mark.parametrize(\"domain,expectation\", [\n    ({1: {\"b\": [2, 3]}, \"c\": [0, 0.1]},\n     pytest.raises(DomainSpecificationError)),\n    ({\"a\": {\"b\": (1, 2, 3, 4)}, \"c\": [0, 0.1]},\n     pytest.raises(DomainSpecificationError)),\n    ({\"a\": {\"b\": lambda x: x}, \"c\": [0, 0.1]},\n     pytest.raises(DomainSpecificationError)),\n    # this one should fail from the ast.literal_eval parsing\n    ('{\"a\": {\"b\": lambda x: x}, \"c\": [0, 0.1]}',\n     pytest.raises(ValueError))\n])\ndef test_invalid_domain(domain, expectation):\n    with expectation:\n        Domain(domain)\n\n\n@pytest.mark.parametrize(\"domain\", [\n    {\"a\": {\"b\": {0, 1}}, \"c\": [0, 0.1]},\n    '{\"a\": {\"b\": {0, 1}}, \"c\": [0, 0.1]}'\n])\ndef test_valid_domain(domain):\n    Domain(domain)\n\n\ndef test_eq():\n    d1 = Domain({\"a\": {\"b\": [2, 3]}, \"c\": [0, 0.1]})\n    d2 = Domain({\"a\": {\"b\": [2, 3]}, \"c\": [0, 0.1]})\n    assert d1 == d2\n\n\ndef test_flatten():\n    dom = Domain({\"a\": {\"b\": [0, 1]}, \"c\": {0, 0.1}})\n    assert dom.flatten() == {(\"a\", \"b\"): [0, 1], (\"c\",): {0, 0.1}}\n\n\ndef test_addition():\n    domain_all = Domain({\n        \"a\": [1, 2],\n        \"b\": {\"c\": {1, 2, 3}, \"d\": {\"o1\", \"o2\"}},\n        \"e\": {3, 4, 5}\n    })\n    domain_1 = Domain({\"a\": [1, 2], \"b\": {\"c\": {1, 2, 3}}})\n    domain_2 = Domain({\"b\": {\"d\": {\"o1\", \"o2\"}}})\n    domain_3 = Domain({\"e\": {3, 4, 5}})\n    assert domain_1 + domain_2 + domain_3 == domain_all\n    with pytest.raises(ValueError):\n        _ = domain_1 + domain_1\n\n\ndef test_serialisation():\n    domain = Domain({\"a\": [1, 2], \"b\": {\"c\": {1, 2, 3}, \"d\": {\"o1\", \"o2\"}}})\n    serialised = domain.serialise()\n    deserialised = Domain.deserialise(serialised)\n    
assert deserialised == domain\n\n\ndef test_as_dict():\n    dict_domain = {\"a\": {\"b\": [2, 3]}, \"c\": [0, 0.1]}\n    domain = Domain(dict_domain)\n    assert domain.as_dict() == dict_domain\n\n\ndef test_as_namedtuple():\n    domain = Domain({\"a\": {\"b\": {2, 3, 4}}, \"c\": [0, 0.1]})\n    nt = domain.as_namedtuple()\n    assert nt.a == namedtuple(\"_\", \"b\")({2, 3, 4})\n    assert nt.a.b == {2, 3, 4}\n    assert nt.c == [0, 0.1]\n\n\ndef test_from_list():\n    lst = [\n        ((\"a\", \"b\"), {2, 3, 4}),\n        ((\"c\",), {0, 0.1}),\n        ((\"d\", \"e\", \"f\"), {0, 1}),\n        ((\"d\", \"g\"), {2, 3})\n    ]\n    domain_true = Domain({\n        \"a\": {\"b\": {2, 3, 4}},\n        \"c\": {0, 0.1},\n        \"d\": {\"e\": {\"f\": {0, 1}}, \"g\": {2, 3}}\n    })\n    domain_from_list = Domain.from_list(lst)\n    assert domain_true == domain_from_list\n    assert lst == list(domain_true.flatten().items())\n\n\ndef test_fail_iter_cont_domain():\n    with pytest.raises(DomainNotIterableError):\n        list(iter(Domain({\"a\": {\"b\": {2, 3, 4}}, \"c\": [0, 0.1]})))\n\n\ndef test_iter():\n    discrete_domain = Domain({\n        \"a\": {\"b\": {2, 3, 4}, \"j\": {\"d\": {5, 6}, \"f\": {\"g\": {7}}}},\n        \"c\": {\"op1\", 0.1}\n    })\n    all_samples = set(iter(discrete_domain))\n    assert all_samples == {\n        Sample({'a': {'b': 2, 'j': {'d': 5, 'f': {'g': 7}}}, 'c': 'op1'}),\n        Sample({'a': {'b': 3, 'j': {'d': 5, 'f': {'g': 7}}}, 'c': 'op1'}),\n        Sample({'a': {'b': 4, 'j': {'d': 5, 'f': {'g': 7}}}, 'c': 'op1'}),\n        Sample({'a': {'b': 2, 'j': {'d': 6, 'f': {'g': 7}}}, 'c': 'op1'}),\n        Sample({'a': {'b': 3, 'j': {'d': 6, 'f': {'g': 7}}}, 'c': 'op1'}),\n        Sample({'a': {'b': 4, 'j': {'d': 6, 'f': {'g': 7}}}, 'c': 'op1'}),\n        Sample({'a': {'b': 2, 'j': {'d': 5, 'f': {'g': 7}}}, 'c': 0.1}),\n        Sample({'a': {'b': 3, 'j': {'d': 5, 'f': {'g': 7}}}, 'c': 0.1}),\n        Sample({'a': {'b': 4, 'j': {'d': 5, 'f': 
{'g': 7}}}, 'c': 0.1}),\n        Sample({'a': {'b': 2, 'j': {'d': 6, 'f': {'g': 7}}}, 'c': 0.1}),\n        Sample({'a': {'b': 3, 'j': {'d': 6, 'f': {'g': 7}}}, 'c': 0.1}),\n        Sample({'a': {'b': 4, 'j': {'d': 6, 'f': {'g': 7}}}, 'c': 0.1})\n    }\n\n\ndef test_sampling():\n    domain = Domain({\"a\": {\"b\": {2, 3, 4}}, \"c\": [0, 0.1]})\n    for i in range(10):\n        sample = domain.sample()\n        assert sample[\"a\"][\"b\"] in {2, 3, 4} and 0. <= sample[\"c\"] <= 0.1\n\n\ndef test_split_by_type():\n    domain = Domain({\"x\": [1, 2], \"y\": {-3, 2, 5}, \"z\": {\"small\", 1, 0.1}})\n    discr, cat, cont = domain.split_by_type()\n    assert sum(domain.split_by_type(), Domain({})) == domain\n    assert discr == Domain({\"y\": {-3, 2, 5}})\n    assert cat == Domain({\"z\": {\"small\", 1, 0.1}})\n    assert cont == Domain({\"x\": [1, 2]})\n"
  },
  {
    "path": "hypertunity/tests/test_trial.py",
    "content": "import pytest\n\nfrom hypertunity import Domain, Trial\nfrom hypertunity.optimisation import RandomSearch\nfrom hypertunity.reports import Table\nfrom hypertunity.scheduling import Job\nfrom hypertunity.scheduling.tests.test_scheduler import run_jobs\n\n\ndef foo(x, y, z):\n    return x**2 + y**2 - z**3\n\n\n@pytest.mark.timeout(60.0)\ndef test_trial_with_callable():\n    domain = Domain({\"x\": [-1., 1.], \"y\": [-2, 2], \"z\": {1, 2, 3, 4}})\n    trial = Trial(objective=foo, domain=domain,\n                  optimiser=\"random_search\",\n                  database_path=None,\n                  seed=7, metrics=[\"score\"])\n    n_steps = 10\n    batch_size = 2\n    trial.run(n_steps, batch_size=batch_size, n_parallel=batch_size)\n\n    rs = RandomSearch(domain=domain, seed=7)\n    rep = Table(domain, metrics=[\"score\"])\n    for i in range(n_steps):\n        samples = rs.run_step(batch_size=batch_size, minimise=False)\n        results = [foo(*s.as_namedtuple(), ) for s in samples]\n        for sample_eval in zip(samples, results):\n            rep.log(sample_eval)\n\n    assert len(trial.optimiser.history) == n_steps * batch_size\n    assert str(rep.format(order=\"ascending\")) == str(\n        trial.reporter.format(order=\"ascending\")\n    )\n\n\n@pytest.mark.timeout(60.0)\ndef test_trial_with_script():\n    domain = Domain({\n        \"--x\": {0, 1, 2, 3},\n        \"--y\": [-1., 1.],\n        \"--z\": {\"123\", \"abc\"}\n    })\n    trial = Trial(objective=\"hypertunity/scheduling/tests/script.py\",\n                  domain=domain,\n                  optimiser=\"random_search\",\n                  database_path=None,\n                  seed=7, metrics=[\"score\"])\n    batch_size = 4\n    trial.run(n_steps=1, batch_size=batch_size, n_parallel=batch_size)\n\n    rs = RandomSearch(domain=domain, seed=7)\n    samples = rs.run_step(batch_size=batch_size)\n    jobs = [Job(task=\"hypertunity/scheduling/tests/script.py\",\n                
args=s.as_dict(),\n                meta={\"binary\": \"python\"}) for s in samples]\n    results = [r.data for r in run_jobs(jobs)]\n    assert results == [h.metrics[\"score\"].value\n                       for h in trial.optimiser.history]\n"
  },
  {
    "path": "hypertunity/tests/test_utils.py",
    "content": "import queue\n\nimport pytest\n\nfrom .. import utils\n\ntry:\n    from contextlib import nullcontext\nexcept ImportError:\n    from contextlib import contextmanager\n\n    @contextmanager\n    def nullcontext():\n        yield\n\n\ndef test_support_american_spelling():\n\n    @utils.support_american_spelling\n    def gb_spelling_func(minimise, optimise, maximise):\n        return minimise, optimise, maximise\n\n    expected = (True, 1, None)\n    assert gb_spelling_func(minimise=True, optimise=1, maximise=None) == expected\n    assert gb_spelling_func(minimize=True, optimize=1, maximize=None) == expected\n\n\n@pytest.mark.parametrize(\"test_input,expectation\", [\n    ((\"vxc\", \"\", \"\", \"___\"), nullcontext()),\n    ((\"_\", \"_\", \"\"), nullcontext()),\n    ((\"asd\",), nullcontext()),\n    ((\"asd\", \"dxcv\"), nullcontext()),\n    ((\"asd\", \"\\\\\", \"\\n\"), pytest.raises(ValueError))\n])\ndef test_split_and_join_strings(test_input, expectation):\n    with expectation:\n        assert test_input == utils.split_string(\n            utils.join_strings(test_input, join_char=\"_\"),\n            split_char=\"_\"\n        )\n\n\ndef test_drain_queue():\n    q = queue.Queue(10)\n    elems = list(range(10))\n    for i in elems:\n        q.put(i)\n    items = utils.drain_queue(q)\n    assert items == elems\n    with pytest.raises(queue.Empty):\n        q.get_nowait()\n"
  },
  {
    "path": "hypertunity/trial.py",
    "content": "\"\"\"A wrapper class for conducting multiple experiments, scheduling jobs and\nsaving results.\n\"\"\"\n\nfrom typing import Callable, Type, Union\n\nfrom hypertunity import optimisation, reports, utils\nfrom hypertunity.domain import Domain\nfrom hypertunity.optimisation import Optimiser\nfrom hypertunity.reports import Reporter\nfrom hypertunity.scheduling import Job, Scheduler, SlurmJob\n\n__all__ = [\n    \"Trial\"\n]\n\nOptimiserTypes = Union[str, Type[Optimiser], Optimiser]\nReporterTypes = Union[str, Type[Reporter], Reporter]\n\n\nclass Trial:\n    \"\"\"High-level API class for running hyperparameter optimisation.\n    This class encapsulates optimiser querying, job building, scheduling and\n    results collection as well as checkpointing and report generation.\n    \"\"\"\n\n    @utils.support_american_spelling\n    def __init__(self, objective: Union[Callable, str],\n                 domain: Domain,\n                 optimiser: OptimiserTypes = \"bo\",\n                 reporter: ReporterTypes = \"table\",\n                 device: str = \"local\",\n                 **kwargs):\n        \"\"\"Initialise the :class:`Trial` experiment manager.\n\n        Args:\n            objective: :obj:`Callable` or :obj:`str`. The objective function or\n                script to run.\n            domain: :class:`Domain`. The optimisation domain of the objective\n                function.\n            optimiser: :class:`Optimiser` or :obj:`str`. The optimiser method\n                for domain sampling.\n            reporter: :class:`Reporter` or :obj:`str`. The reporting method for\n                the results.\n            device: :obj:`str`. The host device running the evaluations. Can be\n                'local' or 'slurm'.\n            **kwargs: additional parameters for the optimiser, reporter and\n                scheduler.\n\n        Keyword Args:\n            timeout: :obj:`float`. 
The number of seconds to wait for a\n                :class:`Job` instance to finish. Default is 259200 seconds,\n                i.e. 3 days.\n        \"\"\"\n        self.objective = objective\n        self.domain = domain\n        self.optimiser = self._init_optimiser(optimiser, **kwargs)\n        self.reporter = self._init_reporter(reporter, **kwargs)\n        self.scheduler = Scheduler\n        # 259200 is the number of seconds contained in 3 days\n        self._timeout = kwargs.get(\"timeout\", 259200.0)\n        self._job = self._init_job(device)\n\n    def _init_optimiser(self, optimiser: OptimiserTypes, **kwargs) -> Optimiser:\n        if isinstance(optimiser, str):\n            optimiser_class = get_optimiser(optimiser)\n        # guard with isinstance(..., type): issubclass raises on instances\n        elif isinstance(optimiser, type) and issubclass(optimiser, Optimiser):\n            optimiser_class = optimiser\n        elif isinstance(optimiser, Optimiser):\n            return optimiser\n        else:\n            raise TypeError(\n                \"An optimiser must be either a string, \"\n                \"an Optimiser type or an Optimiser instance.\"\n            )\n        opt_kwargs = {}\n        if \"seed\" in kwargs:\n            opt_kwargs[\"seed\"] = kwargs[\"seed\"]\n        return optimiser_class(self.domain, **opt_kwargs)\n\n    def _init_reporter(self, reporter: ReporterTypes, **kwargs) -> Reporter:\n        if isinstance(reporter, str):\n            reporter_class = get_reporter(reporter)\n        # guard with isinstance(..., type): issubclass raises on instances\n        elif isinstance(reporter, type) and issubclass(reporter, Reporter):\n            reporter_class = reporter\n        elif isinstance(reporter, Reporter):\n            return reporter\n        else:\n            raise TypeError(\"A reporter must be either a string, \"\n                            \"a Reporter type or a Reporter instance.\")\n        rep_kwargs = {\"metrics\": kwargs.get(\"metrics\", [\"score\"]),\n                      \"database_path\": kwargs.get(\"database_path\", \".\")}\n        if not issubclass(reporter_class, reports.Table):\n            
rep_kwargs[\"logdir\"] = kwargs.get(\"logdir\", \"tensorboard/\")\n        return reporter_class(self.domain, **rep_kwargs)\n\n    @staticmethod\n    def _init_job(device: str) -> Type[Job]:\n        device = device.lower()\n        if device == \"local\":\n            return Job\n        if device == \"slurm\":\n            return SlurmJob\n        raise ValueError(\n            f\"Unknown device {device}. Select one from {{'local', 'slurm'}}.\"\n        )\n\n    def run(self, n_steps: int, n_parallel: int = 1, **kwargs):\n        \"\"\"Run the optimisation and objective function evaluation.\n\n        Args:\n            n_steps: :obj:`int`. The total number of optimisation steps.\n            n_parallel: (optional) :obj:`int`. The number of jobs that can be\n                scheduled at once.\n            **kwargs: additional keyword arguments for the optimisation,\n                supplied to the :py:meth:`run_step` method of the\n                :class:`Optimiser` instance.\n\n        Keyword Args:\n            batch_size: (optional) :obj:`int`. The number of samples that are\n                suggested at once. Default is 1.\n            minimise: (optional) :obj:`bool`. If the optimiser is\n                :class:`BayesianOptimisation` then this flag tells whether the\n                objective function is being minimised or maximised. Otherwise\n                it has no effect. 
Default is `False`.\n        \"\"\"\n        batch_size = kwargs.get(\"batch_size\", 1)\n        n_parallel = min(n_parallel, batch_size)\n        with self.scheduler(n_parallel=n_parallel) as scheduler:\n            for i in range(n_steps):\n                samples = self.optimiser.run_step(\n                    batch_size=batch_size,\n                    minimise=kwargs.get(\"minimise\", False)\n                )\n                jobs = [\n                    self._job(task=self.objective, args=s.as_dict())\n                    for s in samples\n                ]\n                scheduler.dispatch(jobs)\n                evaluations = [\n                    r.data for r in scheduler.collect(\n                        n_results=batch_size, timeout=self._timeout\n                    )\n                ]\n                self.optimiser.update(samples, evaluations)\n                for s, e, j in zip(samples, evaluations, jobs):\n                    self.reporter.log((s, e), meta={\"job_id\": j.id})\n\n\ndef get_optimiser(name: str) -> Type[Optimiser]:\n    name = name.lower()\n    if name.startswith((\"bayes\", \"bo\")):\n        return optimisation.BayesianOptimisation\n    if name.startswith(\"random\"):\n        return optimisation.RandomSearch\n    if name.startswith((\"grid\", \"exhaustive\")):\n        return optimisation.GridSearch\n    raise ValueError(\n        f\"Unknown optimiser {name}. Select one from \"\n        f\"{{'bayesian_optimisation', 'random_search', 'grid_search'}}.\"\n    )\n\n\ndef get_reporter(name: str) -> Type[Reporter]:\n    name = name.lower()\n    if name.startswith(\"table\"):\n        return reports.Table\n    if name.startswith((\"tensor\", \"tb\")):\n        # imported lazily, since tensorboard is an optional extra\n        from hypertunity.reports import tensorboard as tb\n        return tb.Tensorboard\n    raise ValueError(\n        f\"Unknown reporter {name}. Select one from {{'table', 'tensorboard'}}.\"\n    )\n"
  },
  {
    "path": "hypertunity/utils.py",
    "content": "import queue\nfrom functools import wraps\n\nGB_US_SPELLING = {\n    \"minimise\": \"minimize\",\n    \"maximise\": \"maximize\",\n    \"optimise\": \"optimize\",\n    \"optimiser\": \"optimizer\",\n    \"emphasise\": \"emphasize\"\n}\n\nUS_GB_SPELLING = {us: gb for gb, us in GB_US_SPELLING.items()}\n\n\ndef support_american_spelling(func):\n    \"\"\"Convert American spelling keyword arguments to British\n    (default for hypertunity).\n\n    Args:\n        func: a Python callable to decorate.\n\n    Returns:\n        The decorated function which supports American keyword arguments.\n    \"\"\"\n\n    # using functools.wraps(func) enables automated documentation generation\n    # for more information see: https://github.com/sphinx-doc/sphinx/issues/3783\n    @wraps(func)\n    def british_spelling_func(*args, **kwargs):\n        gb_kwargs = {US_GB_SPELLING.get(kw, kw): val\n                     for kw, val in kwargs.items()}\n        return func(*args, **gb_kwargs)\n\n    return british_spelling_func\n\n\ndef join_strings(strings, join_char=\"_\"):\n    \"\"\"Join list of strings with an underscore.\n\n    The strings must contain string.printable characters only, otherwise an\n    exception is raised. 
If one of the strings already contains an underscore, it\n    will be replaced by a null character.\n\n    Args:\n        strings: iterable of strings.\n        join_char: str, the character to join with.\n\n    Returns:\n        The joined string with an underscore character.\n\n    Examples:\n    ```python\n        >>> join_strings(['asd', '', '_xcv__'])\n        'asd__\\x00xcv\\x00\\x00'\n    ```\n\n    Raises:\n        ValueError if a string contains an unprintable character.\n    \"\"\"\n    all_cleaned = []\n    for s in strings:\n        if not s.isprintable():\n            raise ValueError(\n                \"Encountered unexpected name containing non-printable characters.\"\n            )\n        all_cleaned.append(s.replace(join_char, \"\\0\"))\n    return join_char.join(all_cleaned)\n\n\ndef split_string(joined, split_char=\"_\"):\n    \"\"\"Split joined string and substitute back the null characters with an\n    underscore if necessary.\n\n    Inverse function of `join_strings(strings)`.\n\n    Args:\n        joined: str, the joined representation of the substrings.\n        split_char: str, the character to split by.\n\n    Returns:\n        A tuple of strings with the splitting character (underscore) removed.\n\n    Examples:\n    ```python\n        >>> split_string('asd__\\x00xcv\\x00\\x00')\n        ('asd', '', '_xcv__')\n    ```\n    \"\"\"\n    strings = joined.split(split_char)\n    strings_copy = []\n    for s in strings:\n        strings_copy.append(s.replace(\"\\0\", split_char))\n    return tuple(strings_copy)\n\n\ndef drain_queue(q, close_queue=False):\n    \"\"\"Get all items from a queue until an `Empty` exception is raised.\n\n    Args:\n        q: `Queue`, the queue to drain.\n        close_queue: bool, whether to close the queue, such that no other\n        object can be put in. 
Default is False.\n\n    Returns:\n        List of all items from the queue.\n    \"\"\"\n    items = []\n    while True:\n        try:\n            it = q.get_nowait()\n        except queue.Empty:\n            break\n        items.append(it)\n    if close_queue:\n        q.close()\n    return items\n"
  },
  {
    "path": "setup.py",
    "content": "import re\n\nfrom setuptools import setup, find_packages\n\nwith open(\"hypertunity/__init__.py\", \"r\", encoding=\"utf8\") as f:\n    version = re.search(r\"__version__ = [\\'\\\"](.*?)[\\'\\\"]\", f.read()).group(1)\n\nwith open(\"README.md\", \"r\", encoding=\"utf8\") as f:\n    readme = f.read()\n\nrequired_packages = [\n    \"beautifultable>=0.7.0\",\n    \"dataclasses;python_version<'3.7'\",\n    \"gpy>=1.9.8\",\n    \"gpyopt==1.2.5\",\n    \"joblib>=0.13.2\",\n    \"matplotlib>=3.0\",\n    \"numpy>=1.16\",\n    \"tinydb>=3.13.0\"\n]\n\nextras = {\n    \"tensorboard\": [\"tensorflow>=1.14.0\", \"tensorboard>=1.14.0\"],\n    \"tests\": [\"pytest>=4.6.3\", \"pytest-timeout>=1.3.3\"],\n    \"docs\": [\"sphinx>=2.2.0\", \"sphinx_rtd_theme>=0.4.3\"]\n}\n\nclassifiers = [\n    \"Development Status :: 5 - Production/Stable\",\n    \"Intended Audience :: Developers\",\n    \"Intended Audience :: Education\",\n    \"Intended Audience :: Science/Research\",\n    \"License :: OSI Approved :: Apache Software License\",\n    \"Programming Language :: Python :: 3.6\",\n    \"Programming Language :: Python :: 3.7\",\n    \"Programming Language :: Python :: 3.8\",\n    \"Topic :: Software Development :: Libraries\",\n    \"Topic :: Software Development :: Libraries :: Python Modules\"\n]\n\nsetup(\n    name=\"hypertunity\",\n    version=version,\n    author=\"Georgi Dikov\",\n    author_email=\"gvdikov@gmail.com\",\n    url=\"https://github.com/gdikov/hypertunity\",\n    description=\"A toolset for distributed black-box hyperparameter optimisation.\",\n    long_description=readme,\n    long_description_content_type='text/markdown',\n    packages=find_packages(exclude=[\"*.tests\", \"*.tests.*\", \"tests.*\", \"tests\"]),\n    python_requires=\">=3.6\",\n    install_requires=required_packages,\n    extras_require=extras,\n    classifiers=classifiers\n)\n"
  }
]