[
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "content": "---\nname: Bug report\nabout: Create a report to help us improve\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**Describe the bug**\nA clear and concise description of what the bug is.\n\n**To Reproduce**\nSteps to reproduce the behavior:\n1. Go to '...'\n2. Click on '....'\n3. Scroll down to '....'\n4. See error\n\n**Expected behavior**\nA clear and concise description of what you expected to happen.\n\n**Screenshots**\nIf applicable, add screenshots to help explain your problem.\n\n**Desktop (please complete the following information):**\n - OS: [e.g. iOS]\n - Browser [e.g. chrome, safari]\n - Version [e.g. 22]\n\n**Smartphone (please complete the following information):**\n - Device: [e.g. iPhone6]\n - OS: [e.g. iOS8.1]\n - Browser [e.g. stock browser, safari]\n - Version [e.g. 22]\n\n**Additional context**\nAdd any other context about the problem here.\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "content": "---\nname: Feature request\nabout: Suggest an idea for this project\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**Is your feature request related to a problem? Please describe.**\nA clear and concise description of what the problem is. Ex. I'm always frustrated when [...]\n\n**Describe the solution you'd like**\nA clear and concise description of what you want to happen.\n\n**Describe alternatives you've considered**\nA clear and concise description of any alternative solutions or features you've considered.\n\n**Additional context**\nAdd any other context or screenshots about the feature request here.\n"
  },
  {
    "path": ".gitignore",
    "content": "*.cprof\n\n# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\npip-wheel-metadata/\nshare/python-wheels/\n*.egg-info/\n.installed.cfg\n*.egg\nMANIFEST\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Sphinx documentation\ndocs/_build/\n\n# mkdocs documentation\n/site\n\n# data storage from tensorboard\nnasim/agents/runs\nruns/\n\n.ipynb_checkpoints/\n\n*.ipynb\n"
  },
  {
    "path": ".readthedocs.yaml",
    "content": "# .readthedocs.yaml\n# Read the Docs configuration file\n# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details\n\n# Required\nversion: 2\n\n# Set the version of Python and other tools you might need\nbuild:\n  os: ubuntu-20.04\n  tools:\n    python: \"3.8\"\n\n# Build documentation in the docs/ directory with Sphinx\nsphinx:\n   configuration: docs/source/conf.py\n   builder: html\n   fail_on_warning: false\n\n# Optionally declare the Python requirements required to build your docs\npython:\n   install:\n     - method: pip\n       path: .\n     - requirements: docs/requirements.txt"
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "content": "# Contributor Covenant Code of Conduct\n\n## Our Pledge\n\nIn the interest of fostering an open and welcoming environment, we as\ncontributors and maintainers pledge to making participation in our project and\nour community a harassment-free experience for everyone, regardless of age, body\nsize, disability, ethnicity, sex characteristics, gender identity and expression,\nlevel of experience, education, socio-economic status, nationality, personal\nappearance, race, religion, or sexual identity and orientation.\n\n## Our Standards\n\nExamples of behavior that contributes to creating a positive environment\ninclude:\n\n* Using welcoming and inclusive language\n* Being respectful of differing viewpoints and experiences\n* Gracefully accepting constructive criticism\n* Focusing on what is best for the community\n* Showing empathy towards other community members\n\nExamples of unacceptable behavior by participants include:\n\n* The use of sexualized language or imagery and unwelcome sexual attention or\n advances\n* Trolling, insulting/derogatory comments, and personal or political attacks\n* Public or private harassment\n* Publishing others' private information, such as a physical or electronic\n address, without explicit permission\n* Other conduct which could reasonably be considered inappropriate in a\n professional setting\n\n## Our Responsibilities\n\nProject maintainers are responsible for clarifying the standards of acceptable\nbehavior and are expected to take appropriate and fair corrective action in\nresponse to any instances of unacceptable behavior.\n\nProject maintainers have the right and responsibility to remove, edit, or\nreject comments, commits, code, wiki edits, issues, and other contributions\nthat are not aligned to this Code of Conduct, or to ban temporarily or\npermanently any contributor for other behaviors that they deem inappropriate,\nthreatening, offensive, or harmful.\n\n## Scope\n\nThis Code of Conduct applies both within 
project spaces and in public spaces\nwhen an individual is representing the project or its community. Examples of\nrepresenting a project or community include using an official project e-mail\naddress, posting via an official social media account, or acting as an appointed\nrepresentative at an online or offline event. Representation of a project may be\nfurther defined and clarified by project maintainers.\n\n## Enforcement\n\nInstances of abusive, harassing, or otherwise unacceptable behavior may be\nreported by contacting the project team at Jonathon.schwartz@anu.edu.au. All\ncomplaints will be reviewed and investigated and will result in a response that\nis deemed necessary and appropriate to the circumstances. The project team is\nobligated to maintain confidentiality with regard to the reporter of an incident.\nFurther details of specific enforcement policies may be posted separately.\n\nProject maintainers who do not follow or enforce the Code of Conduct in good\nfaith may face temporary or permanent repercussions as determined by other\nmembers of the project's leadership.\n\n## Attribution\n\nThis Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,\navailable at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html\n\n[homepage]: https://www.contributor-covenant.org\n\nFor answers to common questions about this code of conduct, see\nhttps://www.contributor-covenant.org/faq\n"
  },
  {
    "path": "CONTRIBUTING.rst",
    "content": "Development\n===========\n\nNASim is a work in progress and contributions are welcome via pull request.\n\nFor more information, you can check out this link : |how_to_contrib|.\n\n.. |how_to_contrib| raw:: html\n\n   <a href=\"https://guides.github.com/activities/contributing-to-open-source/#contributing\" target=\"_blank\">Contributing to an open source Project on github</a>\n\nGuidelines\n----------\n\nHere are a few guidelines for this project.\n\n* Simplicity: Be easy to use but also easy to understand when one digs into the code. Any additional code should be justified by the usefulness of the feature.\n\nThese guidelines come of course in addition to all good practices for open source development.\n\n.. _naming_conv:\n\nCode style\n----------\n\nThis project follows the `PEP 8 <https://www.python.org/dev/peps/pep-0008/>`_ style guide, please follow this with your contributions.\n\nAdditionally:\n* If a variable is intended to be 'private', it is prefixed by an underscore.\n\nDocumentation\n-------------\n\nAll contributions should be accompanied with at least in code docstrings, when applicable. This project uses `Sphinx <https://www.sphinx-doc.org/>`_ for documentation generation and uses `Numpy style docstrings <https://numpydoc.readthedocs.io/>`_.\n\nPlease see code in this project for example or check out this `example <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html#example-numpy>`_.\n"
  },
  {
    "path": "LICENSE.md",
    "content": "\nThe MIT License (MIT)\n\nCopyright (c) 2018 \n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.rst",
    "content": "**Status**: Stable release. No extra development is planned, but still being maintained (bug fixes, etc).\n\n\nNetwork Attack Simulator\n========================\n\n|docs|\n\nNetwork Attack Simulator (NASim) is a simulated computer network complete with vulnerabilities, scans and exploits designed to be used as a testing environment for AI agents and planning techniques applied to network penetration testing.\n\n\nInstallation\n------------\n\nThe easiest way to install the latest version of NASim hosted on PyPi is via pip::\n\n  $ pip install nasim\n\n\nTo install dependencies for running the DQN test agent (this is needed to run the demo) run::\n\n  $ pip install nasim[dqn]\n\n\nTo get the latest bleeding edge version and install in development mode see the `Install docs <https://networkattacksimulator.readthedocs.io/en/latest/tutorials/installation.html>`_\n\n\nDemo\n----\n\nTo see NASim in action, you can run the provided demo to interact with an environment directly or see a pre-trained AI agent in action.\n\nTo run the `tiny` benchmark scenario demo in interactive mode run::\n\n  $ python -m nasim.demo tiny\n\n\nThis will then run an interactive console where the user can see the current state and choose the next action to take. 
The goal of the scenario is to *compromise* every host with a non-zero value.\n\nSee `here <https://networkattacksimulator.readthedocs.io/en/latest/reference/scenarios/benchmark_scenarios.html>`_ for the full list of scenarios.\n\nTo run the `tiny` benchmark scenario demo using the pre-trained AI agent, first ensure the DQN dependencies are installed (see *Installation* section above), then run::\n\n  $ python -m nasim.demo tiny -ai\n\n\n**Note:** Currently you can only run the AI demo for the `tiny` scenario.\n\n\nDocumentation\n-------------\n\nThe documentation is available at: https://networkattacksimulator.readthedocs.io/\n\n\n\nUsing with gymnasium\n---------------------\n\nNASim implements the `Gymnasium <https://github.com/Farama-Foundation/Gymnasium/tree/main>`_ environment interface and so can be used with any algorithm that is developed for that interface.\n\nSee `Starting NASim using gymnasium <https://networkattacksimulator.readthedocs.io/en/latest/tutorials/gym_load.html>`_.\n\n\nAuthors\n-------\n\n**Jonathon Schwartz** - Jonathon.schwartz@anu.edu.au\n\n\nLicense\n-------\n\n`MIT`_ © 2020, Jonathon Schwartz\n\n.. 
_MIT: LICENSE\n\n\nWhat's new\n----------\n\n\n- 2023-05-14 (v 0.12.0) (MINOR release)\n\n  + Renamed `NASimEnv.get_minimum_actions -> NASimEnv.get_minumum_hops` to better reflect what it does (thanks @rzvnbr for the suggestion).\n\n\n- 2023-03-13 (v 0.11.0) (MINOR release)\n\n  + Migrated to `gymnasium (formerly Open AI gym) <https://github.com/Farama-Foundation/Gymnasium/>`_ from OpenAI gym (thanks @rzvnbr for the suggestion).\n  + Fixed bug with action string representation (thanks @rzvnbr for the bug report)\n  + Added \"sim to real considerations\" explanation document to the docs (thanks @Tudyx for the suggestion)\n\n- 2023-02-27 (v 0.10.1) (MICRO release)\n\n  + Fixed bug for host based actions (thanks @nguyen-thanh20 for the bug report)\n\n- 2022-07-30 (v 0.10.0) (MINOR release)\n\n  + Fixed typos (thanks @francescoluciano)\n  + Updates to be compatible with the latest version of the OpenAI gym API (v0.25) (see `Open AI gym API docs <https://www.gymlibrary.ml/content/api/>`_ for details), notable changes include\n\n    * Updated naming convention when initializing environments using the ``gym.make`` API (see `gym load docs <https://networkattacksimulator.readthedocs.io/en/latest/tutorials/gym_load.html>`_ for details).\n    * Updated reset function to match new gym API (shouldn't break any implementations using the old API)\n    * Updated step function to match new gym API. It now returns two bools: the first specifies whether the terminal/goal state has been reached, and the second specifies whether the episode has ended because the scenario step limit (if any) has been reached. 
This change may break implementations, and you may (or may not) need to specify ``new_step_api=True`` when initializing the gym environment, i.e. ``gym.make(env_id, new_step_api=True)``\n\n- 2022-05-19 (v 0.9.1) (MICRO release)\n\n  + Fixed a few bugs and added some tests (thanks @simonsays1980 for the bug reports)\n\n- 2021-12-20 (v 0.9.0) (MINOR release)\n\n  + The value of a host is now observed when any level of access is gained on a host. This makes it so that agents can learn to decide whether to invest time in gaining root access on a host or not, depending on the host's value (thanks @jaromiru for the proposal).\n  + Initial observation of reachable hosts now contains the host's address (thanks @jaromiru).\n  + Added some support for custom address space bounds when using the scenario generator (thanks @jaromiru for the suggestion).\n\n- 2021-3-15 (v 0.8.0) (MINOR release)\n\n  + Added the option of specifying a 'value' for each host when defining a custom network using the .YAML format (thanks @Joe-zsc for the suggestion).\n  + Added the 'small-honeypot' scenario to included scenarios.\n\n- 2020-12-24 (v 0.7.5) (MICRO release)\n\n  + Added 'undefined error' to observation to fix issue with initial and later observations being indistinguishable.\n\n- 2020-12-17 (v 0.7.4) (MICRO release)\n\n  + Fixed issues with incorrect observation of host 'value' and 'discovery_value'. 
Now, when in partially observable mode, the agent will correctly only observe these values on the step that they are received.\n  + Some other minor code formatting fixes\n\n- 2020-09-23 (v 0.7.3) (MICRO release)\n\n  + Fixed issue with scenario YAML files not being included with PyPi package\n  + Added final policy visualisation option to DQN and Q-Learning agents\n\n- 2020-09-20 (v 0.7.2) (MICRO release)\n\n  + Fixed bug with 're-registering' Gym environments when reloading modules\n  + Added example implementations of Tabular Q-Learning: `agents/ql_agent.py` and `agents/ql_replay.py`\n  + Added `Agents` section to docs, along with other minor doc updates\n\n- 2020-09-20 (v 0.7.1) (MICRO release)\n\n  + Added some scripts for running random benchmarks and describing benchmark scenarios\n  + Added some more docs (including for creating custom scenarios) and updated other docs\n\n- 2020-09-20 (v 0.7.0) (MINOR release)\n\n  + Implemented host based firewalls\n  + Added privilege escalation\n  + Added a demo script, including a pre-trained agent for the 'tiny' scenario\n  + Fix to upper bound calculation (factored in reward for discovering a host)\n\n- 2020-08-02 (v 0.6.0) (MINOR release)\n\n  + Implemented compatibility with gym.make()\n  + Updated docs for loading and interacting with NASimEnv\n  + Added extra functions to nasim.scenarios to make it easier to load scenarios separately from a NASimEnv\n  + Fixed bug to do with class attributes and creating different scenarios in the same python session\n  + Fixed up bruteforce agent and tests\n\n- 2020-07-31 (v 0.5.0) (MINOR release)\n\n  + First official release on PyPi\n  + Cleaned up dependencies, setup.py, etc. and some small fixes\n\n\n.. |docs| image:: https://readthedocs.org/projects/networkattacksimulator/badge/\n    :target: https://networkattacksimulator.readthedocs.io/en/latest/?badge=latest\n    :alt: Documentation Status\n    :scale: 100%\n"
  },
  {
    "path": "docs/Makefile",
    "content": "# Minimal makefile for Sphinx documentation\n#\n\n# You can set these variables from the command line, and also\n# from the environment for the first two.\nSPHINXOPTS    ?=\nSPHINXBUILD   ?= sphinx-build\nSOURCEDIR     = source\nBUILDDIR      = build\n\n# Put it first so that \"make\" without argument is like \"make help\".\nhelp:\n\t@$(SPHINXBUILD) -M help \"$(SOURCEDIR)\" \"$(BUILDDIR)\" $(SPHINXOPTS) $(O)\n\n.PHONY: help Makefile\n\n# Catch-all target: route all unknown targets to Sphinx using the new\n# \"make mode\" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).\n%: Makefile\n\t@$(SPHINXBUILD) -M $@ \"$(SOURCEDIR)\" \"$(BUILDDIR)\" $(SPHINXOPTS) $(O)\n"
  },
  {
    "path": "docs/make.bat",
    "content": "@ECHO OFF\r\n\r\npushd %~dp0\r\n\r\nREM Command file for Sphinx documentation\r\n\r\nif \"%SPHINXBUILD%\" == \"\" (\r\n\tset SPHINXBUILD=sphinx-build\r\n)\r\nset SOURCEDIR=source\r\nset BUILDDIR=build\r\n\r\nif \"%1\" == \"\" goto help\r\n\r\n%SPHINXBUILD% >NUL 2>NUL\r\nif errorlevel 9009 (\r\n\techo.\r\n\techo.The 'sphinx-build' command was not found. Make sure you have Sphinx\r\n\techo.installed, then set the SPHINXBUILD environment variable to point\r\n\techo.to the full path of the 'sphinx-build' executable. Alternatively you\r\n\techo.may add the Sphinx directory to PATH.\r\n\techo.\r\n\techo.If you don't have Sphinx installed, grab it from\r\n\techo.http://sphinx-doc.org/\r\n\texit /b 1\r\n)\r\n\r\n%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%\r\ngoto end\r\n\r\n:help\r\n%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%\r\n\r\n:end\r\npopd\r\n"
  },
  {
    "path": "docs/requirements.txt",
    "content": "nasim\nsphinx\nsphinx-autobuild\nsphinx-rtd-theme\n"
  },
  {
    "path": "docs/source/community/acknowledgements.rst",
    "content": ".. _acknowledgements:\n\nAcknowledgements\n================\n\n* Inspiration for the documentation was taken from the `DeeR <https://deer.readthedocs.io/en/master/>`_ project.\n"
  },
  {
    "path": "docs/source/community/contact.rst",
    "content": "Contact\n=======\nQuestions? Please contact Jonathon.schwartz@anu.edu.au.\n"
  },
  {
    "path": "docs/source/community/development.rst",
    "content": ".. _dev:\n\nDevelopment\n===========\n\nNASim is a work in progress and contributions are welcome via pull request.\n\nFor more information, you can check out this link : |how_to_contrib|.\n\n.. |how_to_contrib| raw:: html\n\n   <a href=\"https://guides.github.com/activities/contributing-to-open-source/#contributing\" target=\"_blank\">Contributing to an open source Project on github</a>\n\nGuidelines\n----------\n\nHere are a few guidelines for this project.\n\n* Simplicity: Be easy to use but also easy to understand when one digs into the code. Any additional code should be justified by the usefulness of the feature.\n\nThese guidelines come of course in addition to all good practices for open source development.\n\n.. _naming_conv:\n\nCode style\n----------\n\nThis project follows the `PEP 8 <https://www.python.org/dev/peps/pep-0008/>`_ style guide, please follow this with your contributions.\n\nAdditionally:\n* If a variable is intended to be 'private', it is prefixed by an underscore.\n\nDocumentation\n-------------\n\nAll contributions should be accompanied with at least in code docstrings, when applicable. This project uses `Sphinx <https://www.sphinx-doc.org/>`_ for documentation generation and uses `Numpy style docstrings <https://numpydoc.readthedocs.io/>`_.\n\nPlease see code in this project for example or check out this `example <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html#example-numpy>`_.\n"
  },
  {
    "path": "docs/source/community/distributing.rst",
    "content": ".. _distribution:\n\nDistribution\n============\n\nThis document contains some notes on distributing NASim via PyPi. This is mainly as a reminder for the steps to take when releasing an update.\n\n.. note:: Unless specified otherwise, all bash commands are assumed to be executed from the root directory of the NASim package.\n\n\nBefore pushing to master\n~~~~~~~~~~~~~~~~~~~~~~~~\n\n1. Ensure all tests are passing by running:\n\n.. code-block:: bash\n\n   cd test\n   pytest\n\n2. Ensure updates are included in the *What's new* section of the *README.rst* and *docs/source/index.rst* files (this step can be ignored for very small changes)\n3. Ensure any necessary updates have been included in the documentation.\n4. Make sure the documentation can be built by running:\n\n.. code-block:: bash\n\n   cd docs\n   make html\n\n5. Ensure ``setup.py`` has been updated to reflect any version and/or dependency changes.\n\n\nAfter changes have been pushed\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nIf pushing a new version (MAJOR, MINOR, or MICRO), do the following:\n\n1. Add a tag with the release number to the commit.\n2. On github create a new release and link it to the tagged commit\n3. Publish the new release to PyPi:\n\n.. code-block:: bash\n\n   # build distributions\n   python setup.py sdist bdist_wheel\n\n   # upload latest distribution builds to pypi\n   # this will ask for PyPi username and password\n   python -m twine upload dist/* --skip-existing\n\n\n4. Login to https://pypi.org/ and verify latest version is added correctly.\n5. Visit https://networkattacksimulator.readthedocs.io/en/latest/index.html and check documentation has updated correctly (make sure to refresh browser cache to ensure your looking at the latest version.)\n"
  },
  {
    "path": "docs/source/community/index.rst",
    "content": ".. _community:\n\nCommunity & Development\n=======================\n\n.. toctree::\n    :maxdepth: 1\n\n    development\n    license\n    contact\n    acknowledgements\n    distributing\n"
  },
  {
    "path": "docs/source/community/license.rst",
    "content": "License\n=======\n\nThe MIT License (MIT)\n\nCopyright (c) 2018\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "docs/source/conf.py",
    "content": "# Configuration file for the Sphinx documentation builder.\n#\n# This file only contains a selection of the most common options. For a full\n# list see the documentation:\n# https://www.sphinx-doc.org/en/master/usage/configuration.html\n\n# -- Path setup --------------------------------------------------------------\n\n# If extensions (or modules to document with autodoc) are in another directory,\n# add these directories to sys.path here. If the directory is relative to the\n# documentation root, use os.path.abspath to make it absolute, like shown here.\n#\nimport os\nimport sys\nimport nasim\nsys.path.insert(0, os.path.abspath(os.path.join('..', '..')))\n\n\n# -- Project information -----------------------------------------------------\n\nproject = 'NASim'\ncopyright = '2020, Jonathon Schwartz'\nauthor = 'Jonathon Schwartz'\n\n# The full version, including alpha/beta/rc tags\nrelease = nasim.__version__\n\n\n# -- General configuration ---------------------------------------------------\n\n# Add any Sphinx extension module names here, as strings. 
They can be\n# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom\n# ones.\nextensions = [\n    'sphinx.ext.autodoc',\n    'sphinx.ext.coverage',\n    'sphinx.ext.napoleon'\n]\n\n# Add any paths that contain templates here, relative to this directory.\ntemplates_path = ['_templates']\n\n# List of patterns, relative to source directory, that match files and\n# directories to ignore when looking for source files.\n# This pattern also affects html_static_path and html_extra_path.\nexclude_patterns = []\n\n# Explicitly assign the master document\n# This is required for the readthedocs.org build to work correctly\nmaster_doc = 'index'\n\n\n# -- to include special methods ---------------------------------------------\n\ndef skip(app, what, name, obj, would_skip, options):\n    if name == \"__init__\":\n        return False\n    return would_skip\n\n\ndef setup(app):\n    app.connect(\"autodoc-skip-member\", skip)\n\n\n# -- Options for HTML output -------------------------------------------------\n\n# The theme to use for HTML and HTML Help pages.  See the documentation for\n# a list of builtin themes.\n#\n# html_theme = 'alabaster'\nhtml_theme = 'sphinx_rtd_theme'\n\n# Add any paths that contain custom static files (such as style sheets) here,\n# relative to this directory. They are copied after the builtin static files,\n# so a file named \"default.css\" will overwrite the builtin \"default.css\".\nhtml_static_path = ['_static']\n"
  },
  {
    "path": "docs/source/explanations/index.rst",
    "content": ".. _explanations:\n\nExplanations\n============\n\nMore technical explanations related to NASim.\n\n.. toctree::\n    :maxdepth: 1\n\n    scenario_generation\n    sim_to_real\n"
  },
  {
    "path": "docs/source/explanations/scenario_generation.rst",
    "content": ".. _scenario_generation_explanation:\n\nScenario Generation Explanation\n===============================\n\nGenerating the scenarios involves a number of design decisions that strongly determine the form of the network being generated. This document aims to explain some of the more technical details of generating the scenarios when using the :ref:`scenario_generator` class.\n\nThe scenario generator is based heavily on prior work, specifically:\n\n- `Sarraute, Carlos, Olivier Buffet, and Jörg Hoffmann. \"POMDPs make better hackers: Accounting for uncertainty in penetration testing.\" Twenty-Sixth AAAI Conference on Artificial Intelligence. 2012. <https://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/viewPaper/4996>`_\n- `Speicher, Patrick, et al. \"Towards Automated Network Mitigation Analysis (extended).\" arXiv preprint arXiv:1705.05088 (2017). <https://arxiv.org/abs/1705.05088>`_\n\nNetwork Topology\n----------------\n\nDescription to come. Till then we recommend reading the papers linked above, especially the appendix of Speicher et al (2017).\n\n.. _correlated_configurations:\n\nCorrelated Configurations\n-------------------------\n\nWhen generating a scenario with ``uniform=False`` the scenario will be generated with host configurations being correlated. This means that rather than the OS and services it is running being chosen uniformly at random from the available OSs and services, they are chosen randomly with increased probability given to OSs and services that are being run by other hosts whose configuration was generated earlier.\n\n\nSpecifically, the distribution of configurations of each host in the network are generated using a Nested Dirichlet Process, so that across the network hosts will have corelated configurations (i.e. certain services/configurations will be more common across hosts on the network). 
The correlation can be controlled using three parameters: ``alpha_H``, ``alpha_V``, and ``lambda_V``.\n\n``alpha_H`` and ``alpha_V`` control the degree of correlation, with lower values leading to greater correlation.\n\n``lambda_V`` controls the average number of services running per host, with higher values meaning more services (and so more vulnerable hosts) on average.\n\nAll three parameters must have a positive value, with the defaults being ``alpha_H=2.0``, ``alpha_V=2.0``, and ``lambda_V=1.0``, which tends to generate networks with fairly correlated configurations where hosts have only a single vulnerability on average.\n\n\n.. _generated_exploit_probs:\n\nGenerated Exploit Probabilities\n-------------------------------\n\nThe success probability of each exploit is determined by the value of the ``exploit_probs`` argument, as follows:\n\n- ``exploit_probs=None`` - probabilities are generated randomly from a uniform distribution over the interval (0, 1).\n- ``exploit_probs=float`` - the probability of each exploit is set to the float value, which must be a valid probability.\n- ``exploit_probs=list[float]`` - the probability of each exploit is set to the corresponding float value in the list. This requires that the length of the list matches the number of exploits as specified by the ``num_exploits`` argument.\n- ``exploit_probs=\"mixed\"`` - probabilities are chosen from a set distribution based on the `CVSS attack complexity <https://www.first.org/cvss/v2/guide>`_ distribution of the `top 10 vulnerabilities in 2017 <https://go.recordedfuture.com/hubfs/reports/cta-2018-0327.pdf>`_. Specifically, exploit probabilities are chosen from [0.3, 0.6, 0.9], which correspond to high, medium, and low attack complexity, respectively, with probabilities [0.2, 0.4, 0.4].\n\nFor deterministic exploits set ``exploit_probs=1.0``.\n\n\nFirewall\n--------\n\nThe firewall restricts which services can be communicated with between hosts on different subnets. 
This is mostly done by selecting services at random to block between each subnet, with some constraints.\n\nFirst, there is no firewall between subnets within the user zone, so communication between hosts on different user subnets is allowed for all services.\n\nSecond, the number of services blocked between zones (i.e. between the internet, DMZ, sensitive, and user zones) is controlled by the ``restrictiveness`` parameter.\n\nThird, to ensure that the goal can be reached, traffic from at least one service running on each subnet is allowed between each zone. This may mean more services are allowed than the ``restrictiveness`` parameter specifies.\n"
  },
  {
    "path": "docs/source/explanations/sim_to_real.rst",
    "content": ".. _sim_to_real_explanation:\n\nSim-to-Real Gap Considerations\n==============================\n\nNASim is a fairly simplified simulator of network penetration testing. It's main goal is to capture some of the key features of network pentesting in a easy-to-use and fast simulator so that it can be used for rapid testing and prototyping of algorithms before these algorithms are tested on more realistic environments. That is to say there is a bit of gap between the scenarios in NASim and the real world.\n\nIn this document we wanted to lay down some considerations to think about when trying to extend your algorithm beyond NASim. This is by no means an exhaustive list, but will hopefully give you something to think about for the next steps, and also give an explanation of some of the design decisions made in NASim.\n\n.. note:: This document is a work in progress so if you have any thoughts, useful references, etc on the topic of applying autonomous penetration testing in the real-world please reach out via email or open an issue on github.\n\nHandling Partial Observability\n------------------------------\n\nOne of the big assumptions made by NASim is that the pentester agent has access to the network addresses of every host in the network, even in partially observable mode. This information is given to the agent in it's list of actions. In practice in the real-world, depending on the scenario, this assumption may be invalid, and part of the challenge for the pentester is to be able to discover new hosts as they navigate through the network.\n\nThe main reason NASim is implemented with the network addresses being known is so that the action space size could be fixed, making it simpler to use with typical Deep Reinforcement Learning algorithms (i.e. 
with neural nets with fixed size input and output layers).\n\nOne of the research challenges is to develop algorithms that can handle action spaces that change as the pentester discovers more network addresses, or, more realistically, that the pentester's action space is multi-dimensional and includes choosing an address and exploit/scan/etc separately. There is actually some support for this built into NASim with the nasim.envs.action.ParameterisedActionSpace action space (see :ref:`actions`), but even using that action space some information about the size of the network is given to the pentester.\n\nAt this stage there are no plans to update NASim to support the no-information action space. This is partially due to time, but also to keep NASim simple and stable, and because a number of more realistic environments are now being developed (e.g. `CybORG <https://github.com/cage-challenge/CybORG>`_).\n\nOne avenue for handling a changing action space is to use auto-regressive actions, as was done by `AlphaStar <https://www.deepmind.com/blog/alphastar-mastering-the-real-time-strategy-game-starcraft-ii>`_.\n"
  },
  {
    "path": "docs/source/index.rst",
    "content": "Welcome to Network Attack Simulator's documentation!\n====================================================\n\nNetwork Attack Simulator (NASim) is a lightweight, high-level network attack simulator written in python. It is designed to be used for rapid testing of autonomous pen-testing agents using reinforcement learning and planning. It is a simulator by definition so does not replicate all details of attacking a real system but it instead aims to capture some of the more salient features of network pen-testing such as the large and changing sizes of the state and action spaces, partial observability and varied network topology.\n\nThe environment is modelled after the `gymnasium (formerly Open AI gym) <https://github.com/Farama-Foundation/Gymnasium/>`_ interface.\n\n\nWhat's new\n----------\n\nVersion 0.12.0\n**************\n\n+ Renamed `NASimEnv.get_minimum_actions -> NASimEnv.get_minumum_hops` to better reflect what it does (thanks @rzvnbr for the suggestion).\n\n\nVersion 0.11.0\n**************\n\n+ Migrated to `gymnasium (formerly Open AI gym) <https://github.com/Farama-Foundation/Gymnasium/>`_ fromOpen AI gym (thanks @rzvnbr for the suggestion).\n+ Fixed bug with action string representation (thanks @rzvnbr for the bug report)\n+ Added \"sim to real considerations\" explanation document to the docs (thanks @Tudyx for the suggestion)\n\n\nVersion 0.10.1\n**************\n\n+ Fixed bug for host based actions (thanks @nguyen-thanh20 for the bug report)\n\n\nVersion 0.10.0\n**************\n\n+ Fixed typos (thanks @francescoluciano)\n+ Updates to be compatible with latest version of OpenAI gym API (v0.25) (see `Open AI gym API docs <https://www.gymlibrary.ml/content/api/>`_ for details), notable changes include\n\n  * Updated naming convention when initializing environments using the ``gym.make`` API (see `gym load docs <https://networkattacksimulator.readthedocs.io/en/latest/tutorials/gym_load.html>`_ for details.)\n  * Updated reset function to 
match new gym API (shouldn't break any implementations using old API)\n  * Updated step function to match new gym API. It now returns two bools: the first specifies whether the terminal/goal state has been reached, and the second specifies whether the episode was terminated because the scenario step limit (if any exists) was reached. This change may break implementations, and you may need to specify it (or not) when initializing the gym environment using ``gym.make(env_id, new_step_api=True)``\n\n\nVersion 0.9.1\n*************\n\n- Fixed a few bugs and added some tests (thanks @simonsays1980 for the bug reports)\n\n\nVersion 0.9.0\n*************\n\n- The value of a host is now observed when any level of access is gained on a host. This makes it so that agents can learn to decide whether to invest time in gaining root access on a host or not, depending on the host's value (thanks @jaromiru for the proposal).\n- Initial observation of reachable hosts now contains the host's address (thanks @jaromiru).\n- Added some support for custom address space bounds when using the scenario generator (thanks @jaromiru for the suggestion).\n\n\nVersion 0.8.0\n*************\n\n- Added option of specifying a 'value' for each host when defining a custom network using the .YAML format (thanks @Joe-zsc for the suggestion).\n- Added the 'small-honeypot' scenario to included scenarios.\n\n\nVersion 0.7.5\n*************\n\n- Added 'undefined error' to observation to fix issue with initial and later observations being indistinguishable.\n\n\nVersion 0.7.4\n*************\n\n- Fixed issues with incorrect observation of host 'value' and 'discovery_value'. 
Now, when in partially observable mode, the agent will correctly only observe these values on the step that they are received\n- Some other minor code formatting fixes\n\n\nVersion 0.7.3\n*************\n\n- Fixed issue with scenario YAML files not being included with the PyPi package\n- Added final policy visualisation option to DQN and Q-Learning agents\n\n\nVersion 0.7.2\n*************\n\n- Fixed bug with 're-registering' Gym environments when reloading modules\n- Added example implementations of Tabular Q-Learning: `agents/ql_agent.py` and `agents/ql_replay.py`\n- Added `Agents` section to docs, along with other minor doc updates\n\n\nVersion 0.7.1\n*************\n\n- Added some scripts for running random benchmarks and describing benchmark scenarios\n- Added some more docs (including for creating custom scenarios) and updated other docs\n\n\nVersion 0.7\n***********\n\n- Implemented host based firewalls\n- Added privilege escalation\n- Added a demo script, including a pre-trained agent for the 'tiny' scenario\n- Fix to upper bound calculation (factored in reward for discovering a host)\n\n\nVersion 0.6\n***********\n\n- Implemented compatibility with gym.make()\n- Updated docs for loading and interacting with NASimEnv\n- Added extra functions to nasim.scenarios to make it easier to load scenarios separately from a NASimEnv\n- Fixed bug to do with class attributes and creating different scenarios in the same Python session\n- Fixed up bruteforce agent and tests\n\n\nVersion 0.5\n***********\n\n- First official release on PyPi\n- Cleaned up dependencies, setup.py, etc. and some small fixes\n- First stable version\n\n\nThe Docs\n--------\n\n.. toctree::\n   :maxdepth: 2\n\n   tutorials/index\n   reference/index\n   explanations/index\n   community/index\n\n\nHow should I cite NASim?\n------------------------\n\nPlease cite NASim in your publications if you use it in your research. Here is an example BibTeX entry:\n\n.. 
code-block:: bibtex\n\n    @misc{schwartz2019nasim,\n      title={NASim: Network Attack Simulator},\n      author={Schwartz, Jonathon and Kurniawati, Hanna},\n      year={2019},\n      howpublished={\\url{https://networkattacksimulator.readthedocs.io/}},\n    }\n\n\n\nIndices and tables\n==================\n\n* :ref:`genindex`\n* :ref:`modindex`\n* :ref:`search`\n\n.. _GitHub: https://github.com/Jjschwartz/NetworkAttackSimulator\n"
  },
  {
    "path": "docs/source/reference/agents/index.rst",
    "content": ".. _agents_reference:\n\nAgents Reference\n================\n\nThis page provides a short summary of the agents that come with the NASim library.\n\nAvailable Agents\n----------------\n\nThe agent implementations that come with NASim include:\n\n* **keyboard_agent.py**: An agent that is controlled by the user via terminal inputs.\n* **random_agent.py**: A random agent that selects an action randomly from all available actions at each time step.\n* **bruteforce_agent.py**: An agent that repeatedly cycles through all available actions in order.\n* **ql_agent.py**: A Tabular, epsilod-greedy Q-Learning reinforcement learning agent.\n* **ql_replay_agent.py**: A Tabular, epsilod-greedy Q-Learning reinforcement learning agent (same as above) that incorporates an experience replay.\n* **dqn_agent.py**: A Deep Q-Network reinforcement learning agent using experience replay and a target Q-Network.\n\n\nRunning Agents\n--------------\n\nEach agent file defines a main function so can be run in python via the terminal, with the specific scenario and settings specified as command line arguments:\n\n\n.. code-block:: bash\n\n    cd nasim/agents\n    # to run a different agent, simply replace .py file with desired file\n    # to run a different scenario, simply replace 'tiny' with desired scenario\n    python bruteforce_agent.py tiny\n\n    # to get details on command line arguments available (e.g. hyperparameters for Q-Learning and DQN agents)\n    python bruteforce_agent.py --help\n\n\nA description and details of how to run each agent can be found at the top of each agent file.\n\n\nViewing Agent Policies\n----------------------\n\nFor the DQN and Tabular Q-Learning agents you can optionally also view the final policies learned by the agents after training has finished:\n\n.. 
code-block:: bash\n\n    # simply include the --render_eval flag with the DQN and Q-Learning agents\n    python ql_agent.py tiny --render_eval\n\n\nThis will show a single episode of the agent, displaying the actions the agent performs along with the observations and rewards the agent receives.\n"
  },
  {
    "path": "docs/source/reference/envs/actions.rst",
    "content": ".. _`actions`:\n\nActions\n=======\n\n.. automodule:: nasim.envs.action\n   :members:\n"
  },
  {
    "path": "docs/source/reference/envs/environment.rst",
    "content": ".. _`environment`:\n\nEnvironment\n===========\n\n.. automodule:: nasim.envs.environment\n   :members:\n"
  },
  {
    "path": "docs/source/reference/envs/host_vector.rst",
    "content": ".. _`host_vector`:\n\nHostVector\n==========\n\n.. automodule:: nasim.envs.host_vector\n   :members:\n"
  },
  {
    "path": "docs/source/reference/envs/index.rst",
    "content": ".. _env_reference:\n\nEnvironment Reference\n=====================\n\nTechnical reference material for classes and functions used to interact with the NASim Environment.\n\n.. toctree::\n    :maxdepth: 1\n\n    actions\n    environment\n    host_vector\n    observation\n    state\n"
  },
  {
    "path": "docs/source/reference/envs/observation.rst",
    "content": ".. _`observation`:\n\nObservation\n===========\n\n.. automodule:: nasim.envs.observation\n   :members:\n"
  },
  {
    "path": "docs/source/reference/envs/state.rst",
    "content": ".. _`state`:\n\nState\n=====\n\n.. automodule:: nasim.envs.state\n   :members:\n"
  },
  {
    "path": "docs/source/reference/index.rst",
    "content": ".. _reference:\n\nReference\n=========\n\nTechnical reference material.\n\n.. toctree::\n    :maxdepth: 2\n\n    load\n    agents/index\n    envs/index\n    scenarios/index\n"
  },
  {
    "path": "docs/source/reference/load.rst",
    "content": ".. _nasim_init:\n\nNASimEnv load reference\n=======================\n\nTechnical reference material for different functions for creating a new NASim Environment.\n\n.. automodule:: nasim\n   :members:\n"
  },
  {
    "path": "docs/source/reference/scenarios/benchmark_scenarios.rst",
    "content": ".. _benchmark_scenarios:\n\nBenchmark Scenarios\n===================\n\nThere are a number of existing scenarios that come with NASim. They cover a range of complexities and sizes and are intended to be used to help with benchmarking algorithms. Additionally, there are two flavours of existing scenarios: **static** and **generated**.\n\n.. note:: For full list of benchmark scenarios see :ref:`all_benchmark_scenarios`.\n\n**Static** scenarios are predefined and will be exactly the same every time they are loaded. They are defined in .yaml files in the `nasim/scenarios/benchmark/` directory.\n\n**Generated** are scenario generated using the :ref:`scenario_generator` based on some parameters. While certain features of the each scenario will remain constant between generations (e.g. number of hosts, services, exploits), other features may change (e.g. specific host configurations, firewall settings, exploit probabilities) depending on the random seed.\n\n\n.. _all_benchmark_scenarios:\n\nAll benchmark scenarios\n-----------------------\n\nThe following table provides details of each benchmark scenario currently available in NASim.\n\n.. csv-table:: NASim Benchmark scenarios\n   :file: benchmark_scenarios_table.csv\n   :header-rows: 1\n\n\nThe number of actions is calculated as *Hosts X (Exploits + PrivEscs + 4)*. The +4 is for the 4 scans available for each host (OSScan, ServiceScan, ProcessScan, and SubnetScan).\n\nThe number of states is calculated as *Hosts X 2^(3 + OS + Services) X 3 *. Here the first 3 comes from the *compromised*, *reachable* and *discovered* features of the state and the base of 2 is due to all state features being boolean (present/absent). The second 3 comes from the number of possible access levels possible on a host.\n\nThe table below provides mean steps to reach the goal and reward (+/- stdev) for a uniform random agent, with scores averaged over 100 runs.\n\n.. 
csv-table:: NASim Benchmark scenarios Agent scores\n   :file: benchmark_scenarios_agent_scores.csv\n   :header-rows: 2\n\n\nNotes on the scenarios\n----------------------\n\nThe *tiny*, *small*, *medium*, *large*, and *huge* (and their generated versions) are all based on the network scenarios first used by:\n\n- `Sarraute, Carlos, Olivier Buffet, and Jörg Hoffmann. \"POMDPs make better hackers: Accounting for uncertainty in penetration testing.\" Twenty-Sixth AAAI Conference on Artificial Intelligence. 2012. <https://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/viewPaper/4996>`_\n- `Speicher, Patrick, et al. \"Towards Automated Network Mitigation Analysis (extended).\" arXiv preprint arXiv:1705.05088 (2017). <https://arxiv.org/abs/1705.05088>`_\n\nThe *pocp-1-gen* and *pocp-2-gen* scenarios are based on the work by:\n\n- `Shmaryahu, D., Shani, G., Hoffmann, J., & Steinmetz, M. (2018, June). Simulated penetration testing as contingent planning. In Twenty-Eighth International Conference on Automated Planning and Scheduling. <https://www.aaai.org/ocs/index.php/ICAPS/ICAPS18/paper/viewPaper/17766>`_\n\nThe other scenarios were created by the author after looking at some Google Images results for network layouts, and playing around with different interesting network topologies.\n"
  },
  {
    "path": "docs/source/reference/scenarios/benchmark_scenarios_agent_scores.csv",
    "content": "Scenario Name,Steps,Total Reward\ntiny,108.02 +/- 43.82,91.98 +/- 43.82\ntiny-hard,135.31 +/- 65.56,21.05 +/- 85.45\ntiny-small,319.56 +/- 124.26,-225.86 +/- 167.14\nsmall,501.94 +/- 181.40,-469.80 +/- 241.99\nsmall-honeypot,448.72 +/- 151.62,-476.08 +/- 222.41\nsmall-linear,566.00 +/- 177.08,-555.08 +/- 241.06\nmedium,1371.45 +/- 420.41,-1875.29 +/- 660.62\nmedium-single-site,654.89 +/- 385.76,-782.17 +/- 581.14\nmedium-multi-site,1060.94 +/- 389.86,-1394.71 +/- 590.89\ntiny-gen,86.56 +/- 40.16,116.43 +/- 40.15\ntiny-gen-rgoal,98.94 +/- 47.83,104.02 +/- 47.80\nsmall-gen,435.73 +/- 205.61,-228.53 +/- 214.34\nsmall-gen-rgoal,423.52 +/- 226.68,-218.62 +/- 240.20\nmedium-gen,1002.94 +/- 468.10,-788.64 +/- 481.86\nlarge-gen,2548.62 +/- 1224.08,-2327.34 +/- 1241.92\nhuge-gen,6303.86 +/- 2403.40,-6075.69 +/- 2434.77\npocp-1-gen,15189.46 +/- 6879.75,-14947.80 +/- 6887.43\npocp-2-gen,17211.38 +/- 5855.83,-16871.05 +/- 5864.58\n"
  },
  {
    "path": "docs/source/reference/scenarios/benchmark_scenarios_table.csv",
    "content": "Name,Type,Subnets,Hosts,OS,Services,Processes,Exploits,PrivEscs,Actions,Observation Dims,States,Step Limit\ntiny,static,4,3,1,1,1,1,1,18,4X14,576,1000\ntiny-hard,static,4,3,2,3,2,3,2,27,4X18,9216,1000\ntiny-small,static,5,5,2,3,2,3,2,45,6X20,15360,1000\nsmall,static,5,8,2,3,2,3,2,72,9X23,24576,1000\nsmall-honeypot,static,5,8,2,3,2,3,2,72,9X23,24576,1000\nsmall-linear,static,7,8,2,3,2,3,2,72,9X22,24576,1000\nmedium,static,6,16,2,5,3,5,3,192,17X27,393216,2000\nmedium-single-site,static,2,16,2,5,3,5,3,192,17x34,393216,2000\nmedium-multi-site,static,7,16,2,5,3,5,3,192,17X29,393216,2000\ntiny-gen,generated,4,3,1,1,1,1,1,18,4X14,576,1000\ntiny-gen-rangoal,generated,4,3,1,1,1,1,1,18,4X14,576,1000\nsmall-gen,generated,5,8,2,3,2,3,2,72,9X23,24576,1000\nsmall-gen-rangoal,generated,5,8,2,3,2,3,2,72,9X23,24576,1000\nmedium-gen,generated,6,16,2,5,2,5,2,176,17X26,196608,2000\nlarge-gen,generated,8,23,3,7,3,7,3,322,24X32,4521984,5000\nhuge-gen,generated,11,38,4,10,4,10,4,684,39X40,2.39E+08,10000\npocp-1-gen,generated,10,35,2,50,2,60,2,2310,36X75,1.51E+19,30000\npocp-2-gen,generated,21,95,3,10,3,30,3,3515,96X48,1.49E+08,30000\n"
  },
  {
    "path": "docs/source/reference/scenarios/generator.rst",
    "content": ".. _scenario_generator:\n\nScenario Generator\n===================\n\n.. automodule:: nasim.scenarios.generator\n   :members:\n"
  },
  {
    "path": "docs/source/reference/scenarios/index.rst",
    "content": ".. _scenario_reference:\n\nScenario Reference\n==================\n\nTechnical reference material for classes and functions used to generate and load Scenarios to use with the NASim Environment.\n\n.. toctree::\n    :maxdepth: 1\n\n    benchmark_scenarios\n    generator\n"
  },
  {
    "path": "docs/source/tutorials/creating_scenarios.rst",
    "content": ".. _`creating_scenarios_tute`:\n\nCreating Custom Scenarios\n=========================\n\nWith NASim it is possible to use custom scenarios defined in a valid YAML file. In this tutorial we will cover how to create and run you own custom scenario.\n\n.. _'defining_custom_yaml':\n\nDefining a custom scenario using YAML\n-------------------------------------\n\nBefore we dive into writing a new custom YAML scenario it is worth having a look at some examples. NASim comes with a number of benchmark YAML scenarios which can be found in the ``nasim/scenarios/benchmark`` directory (or view on github `here <https://github.com/Jjschwartz/NetworkAttackSimulator/tree/master/nasim/scenarios/benchmark>`_). For this tutorial we will be using the ``tiny.yaml`` scenario as an example.\n\nA custom scenarios in NASim requires definining components: the network and the pen-tester.\n\n\nDefining the network\n^^^^^^^^^^^^^^^^^^^^\n\nThe network is defined by the following sections:\n\n   1. **subnets**: size of each subnet in network\n   2. **topology**: an adjacency matrix defining which subnets are connected\n   3. **os**: names of available operating systems on network\n   4. **services**: names of available services on network\n   5. **processes**: names of available processes on network\n   6. **hosts**: a dictionary of hosts on the network and their configurations\n   7. **firewall**: definition of the subnet firewalls\n\n\nSubnets\n\"\"\"\"\"\"\"\n\nThis property defines the number of subnets on the network and the size of each. It is simply defined as an ordered list of integers. The address of the first subnet in the list is *1*, the second subnet is *2*, and so on. The address of *0* is reserved for the \"internet\" subnet (see topology section below). For example, the ``tiny`` network contains 3 subnets all of size 1:\n\n.. 
code-block:: yaml\n\n   subnets: [1, 1, 1]\n\n   # or alternatively\n\n   subnets:\n     - 1\n     - 1\n     - 1\n\n\nTopology\n\"\"\"\"\"\"\"\"\n\nThe topology is defined by an adjacency matrix with a row and column for every subnet in the network, along with an additional row and column designating the \"internet\" subnet, i.e. connections to outside the network. The first row and column are reserved for the \"internet\" subnet. A connection between subnets is indicated with a ``1`` while no connection is indicated with a ``0``. Note that we assume that connections are symmetric and that a subnet is connected with itself.\n\nFor the ``tiny`` network, subnet *1* is a public subnet so it is connected to the internet, indicated by a ``1`` in row 1, column 2 and row 2, column 1. Subnet *1* is also connected with subnets *2* and *3*, indicated by a ``1`` in the relevant cells, while subnets *2* and *3* are private and not connected directly to the internet, indicated by the ``0`` values.\n\n.. code-block:: yaml\n\n   topology: [[ 1, 1, 0, 0],\n              [ 1, 1, 1, 1],\n              [ 0, 1, 1, 1],\n              [ 0, 1, 1, 1]]\n\n\n\nOS, services, processes\n\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\n\nSimilar to how we defined the subnet list, the **os**, **services** and **processes** are defined by a simple list. The names of the items in each list can be anything, but note that they will be used to validate the host configurations, exploits, etc., so they need to match up with the values used in those sections.\n\nContinuing our example, the ``tiny`` scenario includes one OS: *linux*, one service: *ssh*, and one process: *tomcat*:\n\n.. 
code-block:: yaml\n\n   os:\n     - linux\n   services:\n     - ssh\n   processes:\n     - tomcat\n\n\nHost Configurations\n\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\n\nThe host configuration section is a mapping from host addresses to their configurations, where the address is a ``(subnet number, host number)`` tuple and the configuration must include the host's OS, services running, processes running, and optional host firewall settings.\n\nThere are a few things to note when defining a host:\n\n   1. The number of hosts defined for each subnet needs to match the size of each subnet\n   2. Host addresses within a subnet must start from ``0`` and count up from there (i.e. three hosts in subnet *1* would have addresses ``(1, 0)``, ``(1, 1)``, and ``(1, 2)``)\n   3. The names of any OS, service, and process must match values provided in the **os**, **services** and **processes** sections of the YAML file.\n   4. Each host must have an OS and at least one service running. It is okay for hosts to have no processes running (which can be indicated using an empty list ``[]``).\n\n**Host firewalls** are defined as a mapping from host address to the list of services to deny from that host. Host addresses must be a valid address of a host in the network and any services must also match services defined in the services section. Finally, if a host address is not part of the firewall then it is assumed all traffic is allowed from that host, at the host level (it may still be blocked by a subnet firewall).\n\n**Host Value** is the optional value the agent will receive when compromising the host. Unlike the values in the *sensitive_hosts* section, this value can be negative as well as zero or positive. This makes it possible to set additional host-specific rewards or penalties, for example setting a negative reward for a 'honeypot' host on the network. A couple of things to note:\n\n  1. Host value is optional and will default to 0.\n  2. 
For any *sensitive hosts* the value must either not be specified or it must match the value specified in the *sensitive_hosts* section of the file.\n  3. As with *sensitive hosts*, the agent will only receive the value as a reward when they compromise the host.\n\nHere is the example host configurations section for the ``tiny`` scenario, where a host firewall is defined only for host ``(1, 0)`` and host ``(1, 0)`` has a value of ``0`` (we could leave the value unspecified in this case for the same result; we include it here as an example):\n\n.. code-block:: yaml\n\n   host_configurations:\n     (1, 0):\n       os: linux\n       services: [ssh]\n       processes: [tomcat]\n       # which services to deny between individual hosts\n       firewall:\n         (3, 0): [ssh]\n       value: 0\n     (2, 0):\n       os: linux\n       services: [ssh]\n       processes: [tomcat]\n       firewall:\n         (1, 0): [ssh]\n     (3, 0):\n       os: linux\n       services: [ssh]\n       processes: [tomcat]\n\n\nFirewall\n\"\"\"\"\"\"\"\"\n\nThe final section for defining the network is the firewall, which is defined as a mapping from ``(subnet number, subnet number)`` tuples to the list of services to allow. Some things to note about defining firewalls:\n\n   1. A firewall rule can only be defined between subnets that are connected in the topology adjacency matrix.\n   2. Each rule defines which services are allowed in a single direction, from the first subnet in the tuple to the second subnet in the tuple (i.e. (source subnet, destination subnet))\n   3. An empty list means all traffic will be blocked from source to destination\n\nHere is the firewall definition for the ``tiny`` scenario where SSH traffic is allowed between all subnets, except from subnet 1 to 0 and from 1 to 2.\n\n.. 
code-block:: yaml\n\n    # two rows for each connection between subnets as defined by topology\n    # one for each direction of connection\n    # lists which services to allow\n    firewall:\n      (0, 1): [ssh]\n      (1, 0): []\n      (1, 2): []\n      (2, 1): [ssh]\n      (1, 3): [ssh]\n      (3, 1): [ssh]\n      (2, 3): [ssh]\n      (3, 2): [ssh]\n\n\nAnd with that we have covered everything needed to define the scenario's network. Next up is defining the pen-tester.\n\n\nDefining the pen-tester\n^^^^^^^^^^^^^^^^^^^^^^^\n\nThe pen-tester is defined by these sections:\n\n   1. **sensitive_hosts**: a dictionary containing the address of sensitive/target hosts and their value\n   2. **exploits**: a dictionary of exploits\n   3. **privilege_escalation**: a dictionary of privilege escalation actions\n   4. **os_scan_cost**: cost of using OS scan\n   5. **service_scan_cost**: cost of using service scan\n   6. **process_scan_cost**: cost of using process scan\n   7. **subnet_scan_cost**: cost of using subnet scan\n   8. **step_limit**: the maximum number of actions the pen-tester can perform in a single episode\n\n\nSensitive hosts\n\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\n\nThis section specifies the addresses and values of the target hosts in the network. When the pen-tester gains root access on these hosts they will receive the specified value as a reward. The *sensitive_hosts* section is a dictionary where the entries are address-value pairs, where the address is a ``(subnet number, host number)`` tuple and the value is a non-negative float or integer.\n\nIn the ``tiny`` scenario the pen-tester is aiming to get root access on the hosts ``(2, 0)`` and ``(3, 0)``, both of which have a value of 100:\n\n.. code-block:: yaml\n\n    sensitive_hosts:\n      (2, 0): 100\n      (3, 0): 100\n\n\nExploits\n\"\"\"\"\"\"\"\"\n\nThe exploits section is a dictionary which maps exploit names to exploit definitions. Every scenario requires at least one exploit. 
An exploit definition is a dictionary which must include the following entries:\n\n  1. **service**: the name of the service the exploit targets.\n\n     - Note, the value must match the name of a service defined in the **services** section of the network definition.\n\n  2. **os**: the name of the operating system the exploit targets or ``none`` if the exploit works on all OSs.\n\n     - If the value is not ``none`` it must match the name of an OS defined in the **os** section of the network definition\n\n  3. **prob**: the probability that the exploit succeeds given all preconditions are met (i.e. the target host is discovered and reachable, and the host is running the target service and OS)\n  4. **cost**: the cost of performing the action. This should be a non-negative int or float and can represent the cost of the action in any sense desired (financial, time, traffic generated, etc)\n  5. **access**: the resulting access the pen-tester will get on the target host if the exploit succeeds. This can be either *user* or *root*.\n\n\nThe names of the exploits can be anything you desire, so long as they are immutable and hashable (i.e. strings, ints, tuples) and unique.\n\nThe ``tiny`` example scenario has only a single exploit ``e_ssh`` which targets the SSH service running on linux hosts, has a cost of 1, and results in user-level access:\n\n.. code-block:: yaml\n\n    exploits:\n      e_ssh:\n        service: ssh\n        os: linux\n        prob: 0.8\n        cost: 1\n        access: user\n\n\nPrivilege Escalation\n\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\n\nSimilar to the exploits section, the privilege escalation section is a dictionary which maps privilege escalation action names to their definitions. A privilege escalation action definition is a dictionary which must include the following entries:\n\n  1. 
**process**: the name of the process the action targets.\n\n     - The value must match the name of a process defined in the **processes** section of the network definition.\n\n  2. **os**: the name of the operating system the action targets or ``none`` if the action works on all OSs.\n\n     - If the value is not ``none`` it must match the name of an OS defined in the **os** section of the network definition.\n\n  3. **prob**: the probability that the action succeeds given all preconditions are met (i.e. the pen-tester has access to the target host, and the host is running the target process and OS)\n  4. **cost**: the cost of performing the action. This should be a non-negative int or float and can represent the cost of the action in any sense desired (financial, time, traffic generated, etc)\n  5. **access**: the resulting access the pen-tester will get on the target host if the action succeeds. This can be either *user* or *root*.\n\nSimilar to exploits, the name of each privilege escalation action can be anything you desire, so long as they are immutable and hashable (i.e. strings, ints, tuples) and unique.\n\n.. note:: It is not required that a scenario has any privilege escalation actions defined. In this case define the privilege escalation section to be empty: ``privilege_escalation: {}``.\n\n          Note however that you will need to make sure that it is possible to get root access on the sensitive hosts using only exploits, otherwise the pen-tester will never be able to reach the goal.\n\nThe ``tiny`` example scenario has a single privilege escalation action ``pe_tomcat`` which targets the tomcat process running on linux hosts, has a cost of 1, and results in root-level access:\n\n.. code-block:: yaml\n\n    privilege_escalation:\n      pe_tomcat:\n        process: tomcat\n        os: linux\n        prob: 1.0\n        cost: 1\n        access: root\n\n\nScan costs\n\"\"\"\"\"\"\"\"\"\"\n\nEach scan must have a non-negative cost associated with it. 
This cost can represent whatever you wish and will be factored into the reward the agent receives each time a scan is performed.\n\nScan costs are easy to define, requiring only a non-negative float or integer value. You must specify the cost of all scans. Here, in the example ``tiny`` scenario, we define a cost of 1 for all scans:\n\n.. code-block:: yaml\n\n    service_scan_cost: 1\n    os_scan_cost: 1\n    subnet_scan_cost: 1\n    process_scan_cost: 1\n\n\nStep limit\n\"\"\"\"\"\"\"\"\"\"\n\nThe step limit defines the maximum number of steps (i.e. actions) the pen-tester has to reach the goal within a single episode. During simulation, once the step limit is reached the episode is considered done, with the agent having failed to reach the goal.\n\nDefining the step limit is easy since it requires only a positive integer value. For example, here we define a step limit of 1000 for the ``tiny`` scenario:\n\n.. code-block:: yaml\n\n    step_limit: 1000\n\n\n\nWith that we have everything we need to define a custom scenario. Running the scenario is even easier!\n\n\n.. _`running_custom_yaml`:\n\nRunning a custom YAML scenario\n------------------------------\n\nTo create a ``NASimEnv`` from a custom YAML scenario file we use the ``nasim.load()`` function:\n\n.. code-block:: python\n\n   import nasim\n   env = nasim.load('path/to/custom/scenario.yaml')\n\n\nThe load function also takes some additional parameters to control the observation mode and observation and action spaces for the environment, see :ref:`nasim_init` for reference and :ref:`env_params` for explanation.\n\nIf there are any issues with the format of your file you should receive some (hopefully) helpful error messages when attempting to load it. Once the environment is loaded successfully you can interact with it as normal (see :ref:`env_tute` for more details).\n"
  },
  {
    "path": "docs/source/tutorials/environment.rst",
    "content": ".. _`env_tute`:\n\nInteracting with NASim Environment\n==================================\n\nAssuming you are comfortable loading an environment from a scenario (see :ref:`loading_tute` or :ref:`gym_load_tute`), interacting with a NASim Environment is very easy and follows the same interface as `gymnasium <https://github.com/Farama-Foundation/Gymnasium/>`_.\n\n\nStarting the environment\n------------------------\n\nThe first step is simply loading the environment::\n\n  import nasim\n  # load my environment in the desired way (make_benchmark, load, generate)\n  env = nasim.make_benchmark(\"tiny\")\n\n  # or using gym\n  import gymnasium as gym\n  env = gym.make(\"nasim:Tiny-PO-v0\")\n\n\nHere we are using the default environment parameters: ``fully_obs=False``, ``flat_actions=True``, and ``flat_obs=True``.\n\nThe number of actions can be retrieved from the environment ``action_space`` attribute as follows::\n\n  # When flat_actions=True\n  num_actions = env.action_space.n\n\n  # When flat_actions=False\n  nvec_actions = env.action_space.nvec\n\n\nThe shape of the observations can be retrieved from the environment ``observation_space`` attribute as follows::\n\n  obs_shape = env.observation_space.shape\n\n\n\nGetting the initial observation and resetting the environment\n-------------------------------------------------------------\n\nTo reset the environment and get the initial observation, use the ``reset()`` function::\n\n  o, info = env.reset()\n\n\nThe ``info`` return value contains optional auxiliary information.\n\n\nPerforming a single step\n------------------------\n\nA step in the environment can be taken using the ``step(action)`` function. Here ``action`` can take a few different forms depending on whether you are using ``flat_actions=True`` or ``flat_actions=False``; for our example we can simply pass an integer with 0 <= action < N, which specifies the index of the action in the action space. 
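To make the flat indexing concrete, here is a small standalone sketch of how a flat action index can decode into a (host, action-type) pair. The host addresses, action names, and ordering below are hypothetical and purely illustrative; NASim's actual action ordering may differ:

```python
# Hypothetical decode of a flat action index into a (host, action-type) pair.
# Illustrative only -- NASim's real action ordering may differ.
hosts = [(1, 0), (2, 0), (3, 0)]  # assumed host addresses
action_types = ["exploit", "service_scan", "os_scan", "subnet_scan"]

def decode(index):
    """Map a flat action index to a (host, action-type) pair."""
    host = hosts[index // len(action_types)]
    action = action_types[index % len(action_types)]
    return host, action

num_actions = len(hosts) * len(action_types)
print(num_actions)  # 12
print(decode(5))    # ((2, 0), 'service_scan')
```

This also shows why the size of the flat action space grows with both the number of hosts and the number of per-host actions.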
The ``step`` function then returns a ``(Observation, float, bool, bool, dict)`` tuple corresponding to the observation, reward, done, step limit reached, and auxiliary info, respectively::\n\n  action = # integer with 0 <= action < env.action_space.n\n  o, r, done, step_limit_reached, info = env.step(action)\n\n\nIf ``done=True`` then the goal has been reached and the episode is over. Alternatively, if the current scenario has a step limit and ``step_limit_reached=True`` then the step limit has been reached. In both cases it is recommended to stop or reset the environment, otherwise there is no guarantee of what will happen (especially in the first case).\n\n\nVisualizing the environment\n---------------------------\n\nYou can use the ``render()`` function to get a human-readable visualization of the state of the environment. To use render correctly make sure to pass ``render_mode=\"human\"`` to the environment initialization function::\n\n  import nasim\n  # load my environment in the desired way (make_benchmark, load, generate)\n  env = nasim.make_benchmark(\"tiny\", render_mode=\"human\")\n\n  # or using gym\n  import gymnasium as gym\n  env = gym.make(\"nasim:Tiny-PO-v0\", render_mode=\"human\")\n\n  env.reset()\n  # render the environment\n  # (if render_mode=\"human\" is not passed during initialization this will do nothing)\n  env.render()\n\n\nAn example agent\n----------------\n\nSome example agents are provided in the ``nasim/agents`` directory. 
Here is a quick example of a hypothetical agent interacting with the environment::\n\n  import nasim\n\n  env = nasim.make_benchmark(\"tiny\")\n\n  agent = AnAgent(...)\n\n  o, info = env.reset()\n  total_reward = 0\n  done = False\n  step_limit_reached = False\n  while not done and not step_limit_reached:\n      a = agent.choose_action(o)\n      o, r, done, step_limit_reached, info = env.step(a)\n      total_reward += r\n\n  print(\"Done\")\n  print(\"Total reward =\", total_reward)\n\n\nIt's as simple as that.\n"
  },
  {
    "path": "docs/source/tutorials/gym_load.rst",
    "content": ".. _`gym_load_tute`:\n\nStarting NASim using Gymnasium\n===============================\n\nOn startup NASim also registers each benchmark scenario as a `Gymnasium <https://github.com/Farama-Foundation/Gymnasium/>`_ environment, allowing NASim benchmark environments to be loaded using ``gymnasium.make()``.\n\n:ref:`all_benchmark_scenarios` can be loaded using ``gymnasium.make()``.\n\n.. note:: Custom scenarios must be loaded using the nasim library directly, see :ref:`loading_tute`.\n\n\nEnvironment Naming\n------------------\n\nUnlike when starting an environment using the ``nasim`` library directly, where environment modes are specified as arguments to the ``nasim.make_benchmark()`` function, when using ``gymnasium.make()`` the scenario and mode are specified in a single name.\n\nWhen using ``gymnasium.make()`` each environment has the following naming convention:\n\n  ``ScenarioName[PO][2D][VA]-vX``\n\nWhere:\n\n- ``ScenarioName`` is the name of the benchmark scenario in CamelCase\n- ``[PO]`` is optional and specifies the environment is in partially observable mode; if it is not included the environment is in fully observable mode.\n- ``[2D]`` is optional and specifies the environment is to return 2D observations; if it is not included the environment returns 1D observations.\n- ``[VA]`` is optional and specifies the environment is to accept vector actions (parametrised actions); if it is not included the environment expects integer (flat) actions.\n- ``vX`` is the environment version. 
Currently (as of version ``0.10.0``) all environments are on ``v0``.\n\nFor example, the 'tiny' benchmark scenario in partially observable mode with flat action-space and flat observation-space has the name:\n\n  ``TinyPO-v0``\n\nOr the 'small-gen' benchmark scenario in fully observable mode with parametrised action-space and flat observation-space has the name:\n\n  ``SmallGenVA-v0``\n\n\nOr the 'medium-single-site' benchmark scenario in partially observable mode with parametrised action-space and 2D observation-space has the name:\n\n  ``MediumSingleSitePO2DVA-v0``\n\n\n.. note:: See :ref:`env_params` for more explanation of the different modes.\n\n\nUsage\n-----\n\nNow that we understand the naming of environments, making a new environment using ``gym.make()`` is easy.\n\nFor example, to create a new ``TinyPO-v0`` environment:\n\n.. code:: python\n\n   import gymnasium as gym\n   env = gym.make(\"nasim:TinyPO-v0\")\n\n   # to specify render mode\n   env = gym.make(\"nasim:TinyPO-v0\", render_mode=\"human\")\n"
  },
  {
    "path": "docs/source/tutorials/index.rst",
    "content": ".. _tutorials:\n\nTutorials\n=========\n\n.. toctree::\n    :maxdepth: 1\n\n    installation\n    loading\n    gym_load\n    environment\n    scenarios\n    creating_scenarios\n"
  },
  {
    "path": "docs/source/tutorials/installation.rst",
    "content": ".. _installation:\n\nInstallation\n==============\n\n\nDependencies\n--------------\n\nThis framework is tested to work under Python 3.7 or later.\n\nThe required dependencies:\n\n* Python >= 3.7\n* Gym >= 0.17\n* NumPy >= 1.18\n* PyYaml >= 5.3\n\nFor rendering:\n\n* NetworkX >= 2.4\n* prettytable >= 0.7.2\n* Matplotlib >= 3.1.3\n\nWe recommend using the bleeding-edge version, installed by following the :ref:`dev-install` instructions. If you want a simpler installation procedure and do not intend to modify the learning algorithms etc. yourself, you can look at the :ref:`user-install`.\n\n.. _user-install:\n\nUser install instructions\n--------------------------\n\nNASim is available on PyPI and can be installed with ``pip`` using the following command:\n\n.. code-block:: bash\n\n    pip install nasim\n\n\nThis will install the base package, which includes all dependencies needed to use NASim. You can also install the dependencies for building the docs, running tests, and running the DQN example agent separately or all together, as follows:\n\n.. code-block:: bash\n\n    # install dependencies for building docs\n    pip install nasim[docs]\n\n    # install dependencies for running tests\n    pip install nasim[test]\n\n    # install dependencies for running dqn_agent\n    pip install nasim[dqn]\n\n    # install all dependencies\n    pip install nasim[all]\n\n\n\n.. _dev-install:\n\nDeveloper install instructions\n-------------------------------\n\nAs a developer, you can set yourself up with the bleeding-edge version of NASim with:\n\n.. code-block:: bash\n\n    git clone -b master https://github.com/Jjschwartz/NetworkAttackSimulator.git\n\n\nYou can install the framework as a package along with all dependencies with (you can remove the '[all]' if you just want the base install):\n\n.. code-block:: bash\n\n    pip install -e .[all]\n"
  },
  {
    "path": "docs/source/tutorials/loading.rst",
    "content": ".. _`loading_tute`:\n\nStarting a NASim Environment\n============================\n\nInteraction with NASim is done primarily via the :class:`~nasim.envs.environment.NASimEnv` class, which handles a simulated network environment as defined by the chosen scenario.\n\nThere are two ways to start a new environment: (i) via the nasim library directly, or (ii) using the `gym.make()` function of the gymnasium library.\n\nIn this tutorial we will be covering the first method. For the second method check out :ref:`gym_load_tute`.\n\n\n.. _`env_params`:\n\nEnvironment Settings\n--------------------\n\nFor initialization the NASimEnv class takes a scenario definition and three optional arguments.\n\nThe scenario defines the network properties and the pen-tester specific information (e.g. exploits available, etc). For this tutorial we are going to stick to how to start a new environment; details on scenarios are covered in :ref:`scenarios_tute`.\n\nThe three optional arguments control the environment modes:\n\n- ``fully_obs`` : The observability mode of the environment, if True then uses fully observable mode, otherwise partially observable (default=False)\n- ``flat_actions`` : If True then uses a flat action space, otherwise uses a parameterised action space (default=True).\n- ``flat_obs`` : If True then uses a 1D observation space, otherwise uses a 2D observation space (default=True)\n\n\nIf using fully observable mode (``fully_obs=True``) then the entire state of the network and the attack is observed after each step. This is 'easy' mode and does not reflect the reality of pen-testing, but it is useful for getting started and sanity checking algorithms and environments. When using partially observable mode (``fully_obs=False``) the agent starts with no knowledge of the location, configuration, and value of the hosts on the network and receives only observations of features directly related to the action performed at each step. 
This is 'hard' mode and reflects the reality of pen-testing more accurately.\n\nWhether the environment is fully or partially observable has no effect on the size and shape of the action and observation spaces or how the agent interacts with the environment. It will have significant implications for the algorithms used to solve the environment, but that is beyond the scope of this tutorial.\n\nUsing ``flat_actions=True`` means our action space is made up of N discrete actions, where N is based on the number of hosts in the network and the number of exploits and scans available. For our example there are 3 hosts, 1 exploit and 3 scans (OS, Service, and Subnet), for a total of 3 * (1 + 3) = 12 actions. If ``flat_actions=False`` then each action is a vector, with each element of the vector specifying a parameter of the action. For more info see :ref:`actions`.\n\nUsing ``flat_obs=True`` means the observations returned will be a 1D vector. Otherwise, if ``flat_obs=False``, observations will be a 2D matrix. For an explanation of the features of this vector see :ref:`observation`.\n\n\n.. _`loading_env`:\n\nLoading an Environment from a Scenario\n--------------------------------------\n\nNASim environments can be constructed from scenarios in three ways: making an existing scenario, loading from a .yaml file, and generating from parameters.\n\n.. note:: Each of the methods described below also accepts `fully_obs`, `flat_actions` and `flat_obs` boolean arguments.\n\n\n.. _`make_existing`:\n\nMaking an existing scenario\n^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nThis is the easiest method for loading a new environment and closely matches the `OpenAI gym <https://github.com/openai/gym>`_ way of doing things. Loading an existing scenario is as easy as::\n\n  import nasim\n  env = nasim.make_benchmark(\"tiny\")\n\nAnd you are done.\n\nYou can also pass in a random seed using the `seed` argument, which will have an effect when using a generated scenario.\n\n.. 
note::  This method only works with the benchmark scenarios that come with NASim (for the full list see the :ref:`benchmark_scenarios`).\n\n\nLoading a scenario from a YAML file\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nIf you wish to load an existing or custom scenario defined in a YAML file, this is also very straightforward::\n\n  import nasim\n  env = nasim.load(\"path/to/scenario.yaml\")\n\nAnd once again, you are done (given your file is in a valid format)!\n\n\nGenerating a scenario\n^^^^^^^^^^^^^^^^^^^^^\n\nThe final method for loading a new environment is to generate it using the NASim scenario generator. There are quite a number of parameters that can be used to control what scenario is generated (for a full list see the :ref:`scenario_generator` class), but the two key parameters are the number of hosts in the network and the number of services running (which also controls the number of exploits, unless otherwise specified).\n\nTo generate a new environment with 5 hosts running a possible 3 services::\n\n  import nasim\n  env = nasim.generate(5, 3)\n\nAnd you're done! If you want to pass in some other parameters (say the number of possible operating systems) these can be passed in as keyword arguments::\n\n  env = nasim.generate(5, 3, num_os=3)\n\n\nOnce again, for a full list of available parameters refer to the :ref:`scenario_generator` documentation.\n"
  },
  {
    "path": "docs/source/tutorials/scenarios.rst",
    "content": ".. _`scenarios_tute`:\n\nUnderstanding Scenarios\n=======================\n\nA scenario in NASim defines all the necessary properties for creating a network environment. Each scenario definition can be broken down into two components: the network configuration and the pen-tester.\n\nNetwork Configuration\n---------------------\n\nThe network configuration is defined by the following properties:\n\n- *subnets*: the number and size of the subnets in the network.\n- *topology*: how the different subnets in the network are connected\n- *host configurations*: the address, OS, services, processes, and firewalls for each host in the network\n- *firewall*: which communication is prevented between subnets\n\n*Note*, for the host configurations we are, in general, only interested in services and processes that the pen-tester has exploits for, so we will typically ignore any non-vulnerable services and processes in order to reduce the problem size.\n\nPen-Tester\n----------\n\nThe pen-tester is defined by:\n\n- *exploits*: the set of exploits available to the pen-tester\n- *privescs*: the set of privilege escalation actions available to the pen-tester\n- *scan costs*: the cost of performing each type of scan (service, OS, process, and subnet)\n- *sensitive hosts*: the target hosts on the network and their value\n\nExample Scenario\n----------------\n\nTo illustrate these properties here we show an example scenario, where the aim of the pen-tester is to gain root access to the server in the sensitive subnet and one of the hosts in the user subnet.\n\nThe figure below shows the layout of our example network.\n\n.. 
image:: example_network.png\n  :width: 700\n\nFrom the figure we can see that this network has the following properties:\n\n- *subnets*: three subnets: DMZ with a single server, Sensitive with a single server, and User with three user machines.\n- *topology*: Only the DMZ is connected to the internet, while all subnets in the network are interconnected.\n- *host configurations*: The address, OS, services, and processes running on each host are shown next to each host (e.g. the server in the DMZ subnet has address (1, 0), has a linux OS, is running http and ssh services, and the tomcat process). The host firewall settings are shown in the table in the top-right of the figure. Here only host *(1, 0)* has a firewall configured, which blocks any SSH connections from hosts *(3, 0)* and *(3, 1)*.\n- *firewall*: The arrows above and below the firewalls indicate which services can be communicated with in each direction between subnets and between the DMZ subnet and the internet (e.g. the internet can communicate with http services running on hosts in the DMZ, while the firewall blocks no communication from the DMZ to the internet).\n\nNext we need to define our pen-tester, which we specify based on the scenario we wish to simulate.\n\n- *exploits*: for this scenario the pen-tester has access to three exploits\n\n  1. *ssh_exploit*: which exploits the ssh service running on a windows machine, has a cost of 2, a success probability of 0.6, and results in user level access if successful.\n  2. *ftp_exploit*: which exploits the ftp service running on a linux machine, has a cost of 1, a success probability of 0.9, and results in root level access if successful.\n  3. *http_exploit*: which exploits the http service running on any OS, has a cost of 3, a success probability of 1.0, and results in user level access if successful.\n\n- *privescs*: for this scenario the pen-tester has access to two privilege escalation actions\n\n  1. 
*pe_tomcat*: exploits the tomcat process running on a linux machine to gain root access. It has a cost of 1 and success probability of 1.0.\n  2. *pe_daclsvc*: exploits the daclsvc process running on a windows machine to gain root access. It has a cost of 1 and success probability of 1.0.\n\n- *scan costs*: here we need to specify the cost of each type of scan\n\n  1. *service_scan*: 1\n  2. *os_scan*: 2\n  3. *process_scan*: 1\n  4. *subnet_scan*: 1\n\n- *sensitive hosts*: here we have two target hosts\n\n  1. *(2, 0), 1000* : the server on the sensitive subnet, which has a value of 1000.\n  2. *(3, 2), 1000* : the last host on the user subnet, which has a value of 1000.\n\nAnd with that our scenario is fully defined and we have everything we need to run an attack simulation.\n"
  },
  {
    "path": "nasim/__init__.py",
    "content": "import gymnasium as gym\nfrom gymnasium.envs.registration import register\n\nfrom nasim.envs import NASimEnv\nfrom nasim.scenarios.benchmark import AVAIL_BENCHMARKS\nfrom nasim.scenarios import \\\n    make_benchmark_scenario, load_scenario, generate_scenario\n\n\n__all__ = ['make_benchmark', 'load', 'generate']\n\n\ndef make_benchmark(scenario_name,\n                   seed=None,\n                   fully_obs=False,\n                   flat_actions=True,\n                   flat_obs=True,\n                   render_mode=None):\n    \"\"\"Make a new benchmark NASim environment.\n\n    Parameters\n    ----------\n    scenario_name : str\n        the name of the benchmark environment\n    seed : int, optional\n        random seed to use to generate environment (default=None)\n    fully_obs : bool, optional\n        the observability mode of environment, if True then uses fully\n        observable mode, otherwise partially observable (default=False)\n    flat_actions : bool, optional\n        if true then uses a flat action space, otherwise will use\n        parameterised action space (default=True).\n    flat_obs : bool, optional\n        if true then uses a 1D observation space. 
If False\n        will use a 2D observation space (default=True)\n    render_mode : str, optional\n        The render mode to use for the environment.\n\n    Returns\n    -------\n    NASimEnv\n        a new environment instance\n\n    Raises\n    ------\n    NotImplementedError\n        if scenario_name does not match any implemented benchmark scenarios.\n    \"\"\"\n    env_kwargs = {\"fully_obs\": fully_obs,\n                  \"flat_actions\": flat_actions,\n                  \"flat_obs\": flat_obs,\n                  \"render_mode\": render_mode}\n    scenario = make_benchmark_scenario(scenario_name, seed)\n    return NASimEnv(scenario, **env_kwargs)\n\n\ndef load(path,\n         fully_obs=False,\n         flat_actions=True,\n         flat_obs=True,\n         name=None,\n         render_mode=None):\n    \"\"\"Load NASim Environment from a .yaml scenario file.\n\n    Parameters\n    ----------\n    path : str\n        path to the .yaml scenario file\n    fully_obs : bool, optional\n        The observability mode of environment, if True then uses fully\n        observable mode, otherwise partially observable (default=False)\n    flat_actions : bool, optional\n        if true then uses a flat action space, otherwise will use\n        parameterised action space (default=True).\n    flat_obs : bool, optional\n        if true then uses a 1D observation space. 
If False\n        will use a 2D observation space (default=True)\n    name : str, optional\n        the scenario's name, if None the name will be generated from path\n        (default=None)\n    render_mode : str, optional\n        The render mode to use for the environment.\n\n    Returns\n    -------\n    NASimEnv\n        a new environment object\n    \"\"\"\n    env_kwargs = {\"fully_obs\": fully_obs,\n                  \"flat_actions\": flat_actions,\n                  \"flat_obs\": flat_obs,\n                  \"render_mode\": render_mode}\n    scenario = load_scenario(path, name=name)\n    return NASimEnv(scenario, **env_kwargs)\n\n\ndef generate(num_hosts,\n             num_services,\n             fully_obs=False,\n             flat_actions=True,\n             flat_obs=True,\n             render_mode=None,\n             **params):\n    \"\"\"Construct Environment from an auto generated network.\n\n    Parameters\n    ----------\n    num_hosts : int\n        number of hosts to include in network (minimum is 3)\n    num_services : int\n        number of services to use in environment (minimum is 1)\n    fully_obs : bool, optional\n        The observability mode of environment, if True then uses fully\n        observable mode, otherwise partially observable (default=False)\n    flat_actions : bool, optional\n        if true then uses a flat action space, otherwise will use\n        parameterised action space (default=True).\n    flat_obs : bool, optional\n        if true then uses a 1D observation space. 
If False\n        will use a 2D observation space (default=True)\n    render_mode : str, optional\n        The render mode to use for the environment.\n    params : dict, optional\n        generator params (see :class:`ScenarioGenerator` for full list)\n\n    Returns\n    -------\n    NASimEnv\n        a new environment object\n    \"\"\"\n    env_kwargs = {\"fully_obs\": fully_obs,\n                  \"flat_actions\": flat_actions,\n                  \"flat_obs\": flat_obs,\n                  \"render_mode\": render_mode}\n    scenario = generate_scenario(num_hosts, num_services, **params)\n    return NASimEnv(scenario, **env_kwargs)\n\n\ndef _register(id, entry_point, kwargs, nondeterministic, force=True):\n    \"\"\"Registers NASim as a Gymnasium Environment.\n\n    Handles issues with re-registering gym environments.\n    \"\"\"\n    if id in gym.envs.registry:\n        if not force:\n            return\n        del gym.envs.registry[id]\n    register(\n        id=id,\n        entry_point=entry_point,\n        kwargs=kwargs,\n        nondeterministic=nondeterministic\n    )\n\n\nfor benchmark in AVAIL_BENCHMARKS:\n    # PO - partially observable\n    # 2D - use 2D Obs\n    # VA - use param actions\n    # tiny should yield Tiny and tiny-small should yield TinySmall\n    for fully_obs in [True, False]:\n        name = ''.join([g.capitalize() for g in benchmark.split(\"-\")])\n        if not fully_obs:\n            name = f\"{name}PO\"\n\n        _register(\n            id=f\"{name}-v0\",\n            entry_point='nasim.envs:NASimGymEnv',\n            kwargs={\n                \"scenario\": benchmark,\n                \"fully_obs\": fully_obs,\n                \"flat_actions\": True,\n                \"flat_obs\": True\n            },\n            nondeterministic=True\n        )\n\n        _register(\n            id=f\"{name}2D-v0\",\n            entry_point='nasim.envs:NASimGymEnv',\n            kwargs={\n                \"scenario\": benchmark,\n             
   \"fully_obs\": fully_obs,\n                \"flat_actions\": True,\n                \"flat_obs\": False\n            },\n            nondeterministic=True\n        )\n\n        _register(\n            id=f\"{name}VA-v0\",\n            entry_point='nasim.envs:NASimGymEnv',\n            kwargs={\n                \"scenario\": benchmark,\n                \"fully_obs\": fully_obs,\n                \"flat_actions\": False,\n                \"flat_obs\": True\n            },\n            nondeterministic=True\n        )\n\n        _register(\n            id=f\"{name}2DVA-v0\",\n            entry_point='nasim.envs:NASimGymEnv',\n            kwargs={\n                \"scenario\": benchmark,\n                \"fully_obs\": fully_obs,\n                \"flat_actions\": False,\n                \"flat_obs\": False\n            },\n            nondeterministic=True\n        )\n\n__version__ = \"0.12.0\"\n"
  },
  {
    "path": "nasim/agents/__init__.py",
    "content": ""
  },
  {
    "path": "nasim/agents/bruteforce_agent.py",
    "content": "\"\"\"A bruteforce agent that repeatedly cycles through all available actions in\norder.\n\nTo run the 'tiny' benchmark scenario with default settings, run the following from\nthe nasim/agents dir:\n\n$ python bruteforce_agent.py tiny\n\nThis will run the agent and display progress and final results to stdout.\n\nTo see available running arguments:\n\n$ python bruteforce_agent.py --help\n\"\"\"\n\nfrom itertools import product\n\nimport nasim\n\nLINE_BREAK = \"-\"*60\n\n\ndef run_bruteforce_agent(env, step_limit=1e6, verbose=True):\n    \"\"\"Run bruteforce agent on nasim environment.\n\n    Parameters\n    ----------\n    env : nasim.NASimEnv\n        the nasim environment to run agent on\n    step_limit : int, optional\n        the maximum number of steps to run agent for (default=1e6)\n    verbose : bool, optional\n        whether to print out progress messages or not (default=True)\n\n    Returns\n    -------\n    int\n        timesteps agent ran for\n    float\n        the total reward received by the agent\n    bool\n        whether the goal was reached or not\n    \"\"\"\n    if verbose:\n        print(LINE_BREAK)\n        print(\"STARTING EPISODE\")\n        print(LINE_BREAK)\n        print(\"t: Reward\")\n\n    env.reset()\n    total_reward = 0\n    done = False\n    env_step_limit_reached = False\n    steps = 0\n    cycle_complete = False\n\n    if env.flat_actions:\n        act = 0\n    else:\n        act_iter = product(*[range(n) for n in env.action_space.nvec])\n\n    while not done and not env_step_limit_reached and steps < step_limit:\n        if env.flat_actions:\n            act = (act + 1) % env.action_space.n\n            cycle_complete = (steps > 0 and act == 0)\n        else:\n            try:\n                act = next(act_iter)\n                cycle_complete = False\n            except StopIteration:\n                act_iter = product(*[range(n) for n in env.action_space.nvec])\n                act = next(act_iter)\n               
 cycle_complete = True\n\n        _, rew, done, env_step_limit_reached, _ = env.step(act)\n        total_reward += rew\n\n        if cycle_complete and verbose:\n            print(f\"{steps}: {total_reward}\")\n        steps += 1\n\n    if done and verbose:\n        print(LINE_BREAK)\n        print(\"EPISODE FINISHED\")\n        print(LINE_BREAK)\n        print(f\"Goal reached = {env.goal_reached()}\")\n        print(f\"Total steps = {steps}\")\n        print(f\"Total reward = {total_reward}\")\n    elif verbose:\n        print(LINE_BREAK)\n        print(\"STEP LIMIT REACHED\")\n        print(LINE_BREAK)\n\n    if done:\n        done = env.goal_reached()\n\n    return steps, total_reward, done\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"env_name\", type=str, help=\"benchmark scenario name\")\n    parser.add_argument(\"-s\", \"--seed\", type=int, default=0,\n                        help=\"random seed\")\n    parser.add_argument(\"-o\", \"--partially_obs\", action=\"store_true\",\n                        help=\"Partially Observable Mode\")\n    parser.add_argument(\"-p\", \"--param_actions\", action=\"store_true\",\n                        help=\"Use Parameterised action space\")\n    parser.add_argument(\"-f\", \"--box_obs\", action=\"store_true\",\n                        help=\"Use 2D observation space\")\n    args = parser.parse_args()\n\n    nasimenv = nasim.make_benchmark(\n        args.env_name,\n        args.seed,\n        not args.partially_obs,\n        not args.param_actions,\n        not args.box_obs\n    )\n    if not args.param_actions:\n        print(nasimenv.action_space.n)\n    else:\n        print(nasimenv.action_space.nvec)\n    run_bruteforce_agent(nasimenv)\n"
  },
  {
    "path": "nasim/agents/dqn_agent.py",
    "content": "\"\"\"An example DQN Agent.\n\nIt uses pytorch 1.5+ and tensorboard libraries (HINT: these dependencies can\nbe installed by running pip install nasim[dqn])\n\nTo run 'tiny' benchmark scenario with default settings, run the following from\nthe nasim/agents dir:\n\n$ python dqn_agent.py tiny\n\nTo see detailed results using tensorboard:\n\n$ tensorboard --logdir runs/\n\nTo see available hyperparameters:\n\n$ python dqn_agent.py --help\n\nNotes\n-----\n\nThis is by no means a state of the art implementation of DQN, but is designed\nto be an example implementation that can be used as a reference for building\nyour own agents.\n\"\"\"\nimport random\nfrom pprint import pprint\n\nfrom gymnasium import error\nimport numpy as np\n\nimport nasim\n\ntry:\n    import torch\n    import torch.nn as nn\n    import torch.optim as optim\n    import torch.nn.functional as F\n    from torch.utils.tensorboard import SummaryWriter\nexcept ImportError as e:\n    raise error.DependencyNotInstalled(\n        f\"{e}. 
(HINT: you can install dqn_agent dependencies by running \"\n        \"'pip install nasim[dqn]'.)\"\n    )\n\n\nclass ReplayMemory:\n\n    def __init__(self, capacity, s_dims, device=\"cpu\"):\n        self.capacity = capacity\n        self.device = device\n        self.s_buf = np.zeros((capacity, *s_dims), dtype=np.float32)\n        self.a_buf = np.zeros((capacity, 1), dtype=np.int64)\n        self.next_s_buf = np.zeros((capacity, *s_dims), dtype=np.float32)\n        self.r_buf = np.zeros(capacity, dtype=np.float32)\n        self.done_buf = np.zeros(capacity, dtype=np.float32)\n        self.ptr, self.size = 0, 0\n\n    def store(self, s, a, next_s, r, done):\n        self.s_buf[self.ptr] = s\n        self.a_buf[self.ptr] = a\n        self.next_s_buf[self.ptr] = next_s\n        self.r_buf[self.ptr] = r\n        self.done_buf[self.ptr] = done\n        self.ptr = (self.ptr + 1) % self.capacity\n        self.size = min(self.size+1, self.capacity)\n\n    def sample_batch(self, batch_size):\n        sample_idxs = np.random.choice(self.size, batch_size)\n        batch = [self.s_buf[sample_idxs],\n                 self.a_buf[sample_idxs],\n                 self.next_s_buf[sample_idxs],\n                 self.r_buf[sample_idxs],\n                 self.done_buf[sample_idxs]]\n        return [torch.from_numpy(buf).to(self.device) for buf in batch]\n\n\nclass DQN(nn.Module):\n    \"\"\"A simple Deep Q-Network \"\"\"\n\n    def __init__(self, input_dim, layers, num_actions):\n        super().__init__()\n        self.layers = nn.ModuleList([nn.Linear(input_dim[0], layers[0])])\n        for l in range(1, len(layers)):\n            self.layers.append(nn.Linear(layers[l-1], layers[l]))\n        self.out = nn.Linear(layers[-1], num_actions)\n\n    def forward(self, x):\n        for layer in self.layers:\n            x = F.relu(layer(x))\n        x = self.out(x)\n        return x\n\n    def save_DQN(self, file_path):\n        torch.save(self.state_dict(), file_path)\n\n    def 
load_DQN(self, file_path):\n        self.load_state_dict(torch.load(file_path))\n\n    def get_action(self, x):\n        with torch.no_grad():\n            if len(x.shape) == 1:\n                x = x.view(1, -1)\n            return self.forward(x).max(1)[1]\n\n\nclass DQNAgent:\n    \"\"\"A simple Deep Q-Network Agent \"\"\"\n\n    def __init__(self,\n                 env,\n                 seed=None,\n                 lr=0.001,\n                 training_steps=20000,\n                 batch_size=32,\n                 replay_size=10000,\n                 final_epsilon=0.05,\n                 exploration_steps=10000,\n                 gamma=0.99,\n                 hidden_sizes=[64, 64],\n                 target_update_freq=1000,\n                 verbose=True,\n                 **kwargs):\n\n        # This DQN implementation only works for flat actions\n        assert env.flat_actions\n        self.verbose = verbose\n        if self.verbose:\n            print(f\"\\nRunning DQN with config:\")\n            pprint(locals())\n\n        # set seeds\n        self.seed = seed\n        if self.seed is not None:\n            np.random.seed(self.seed)\n\n        # environment setup\n        self.env = env\n\n        self.num_actions = self.env.action_space.n\n        self.obs_dim = self.env.observation_space.shape\n\n        # logger setup\n        self.logger = SummaryWriter()\n\n        # Training related attributes\n        self.lr = lr\n        self.exploration_steps = exploration_steps\n        self.final_epsilon = final_epsilon\n        self.epsilon_schedule = np.linspace(1.0,\n                                            self.final_epsilon,\n                                            self.exploration_steps)\n        self.batch_size = batch_size\n        self.discount = gamma\n        self.training_steps = training_steps\n        self.steps_done = 0\n\n        # Neural Network related attributes\n        self.device = torch.device(\"cuda\"\n                           
        if torch.cuda.is_available()\n                                   else \"cpu\")\n        self.dqn = DQN(self.obs_dim,\n                       hidden_sizes,\n                       self.num_actions).to(self.device)\n        if self.verbose:\n            print(f\"\\nUsing Neural Network running on device={self.device}:\")\n            print(self.dqn)\n\n        self.target_dqn = DQN(self.obs_dim,\n                              hidden_sizes,\n                              self.num_actions).to(self.device)\n        self.target_update_freq = target_update_freq\n\n        self.optimizer = optim.Adam(self.dqn.parameters(), lr=self.lr)\n        self.loss_fn = nn.SmoothL1Loss()\n\n        # replay setup\n        self.replay = ReplayMemory(replay_size,\n                                   self.obs_dim,\n                                   self.device)\n\n    def save(self, save_path):\n        self.dqn.save_DQN(save_path)\n\n    def load(self, load_path):\n        self.dqn.load_DQN(load_path)\n\n    def get_epsilon(self):\n        if self.steps_done < self.exploration_steps:\n            return self.epsilon_schedule[self.steps_done]\n        return self.final_epsilon\n\n    def get_egreedy_action(self, o, epsilon):\n        if random.random() > epsilon:\n            o = torch.from_numpy(o).float().to(self.device)\n            return self.dqn.get_action(o).cpu().item()\n        return random.randint(0, self.num_actions-1)\n\n    def optimize(self):\n        batch = self.replay.sample_batch(self.batch_size)\n        s_batch, a_batch, next_s_batch, r_batch, d_batch = batch\n\n        # get q_vals for each state and the action performed in that state\n        q_vals_raw = self.dqn(s_batch)\n        q_vals = q_vals_raw.gather(1, a_batch).squeeze()\n\n        # get target q val = max val of next state\n        with torch.no_grad():\n            target_q_val_raw = self.target_dqn(next_s_batch)\n            target_q_val = target_q_val_raw.max(1)[0]\n            target = r_batch 
+ self.discount*(1-d_batch)*target_q_val\n\n        # calculate loss\n        loss = self.loss_fn(q_vals, target)\n\n        # optimize the model\n        self.optimizer.zero_grad()\n        loss.backward()\n        self.optimizer.step()\n\n        if self.steps_done % self.target_update_freq == 0:\n            self.target_dqn.load_state_dict(self.dqn.state_dict())\n\n        q_vals_max = q_vals_raw.max(1)[0]\n        mean_v = q_vals_max.mean().item()\n        return loss.item(), mean_v\n\n    def train(self):\n        if self.verbose:\n            print(\"\\nStarting training\")\n\n        num_episodes = 0\n        training_steps_remaining = self.training_steps\n\n        while self.steps_done < self.training_steps:\n            ep_results = self.run_train_episode(training_steps_remaining)\n            ep_return, ep_steps, goal = ep_results\n            num_episodes += 1\n            training_steps_remaining -= ep_steps\n\n            self.logger.add_scalar(\"episode\", num_episodes, self.steps_done)\n            self.logger.add_scalar(\n                \"epsilon\", self.get_epsilon(), self.steps_done\n            )\n            self.logger.add_scalar(\n                \"episode_return\", ep_return, self.steps_done\n            )\n            self.logger.add_scalar(\n                \"episode_steps\", ep_steps, self.steps_done\n            )\n            self.logger.add_scalar(\n                \"episode_goal_reached\", int(goal), self.steps_done\n            )\n\n            if num_episodes % 10 == 0 and self.verbose:\n                print(f\"\\nEpisode {num_episodes}:\")\n                print(f\"\\tsteps done = {self.steps_done} / \"\n                      f\"{self.training_steps}\")\n                print(f\"\\treturn = {ep_return}\")\n                print(f\"\\tgoal = {goal}\")\n\n        self.logger.close()\n        if self.verbose:\n            print(\"Training complete\")\n            print(f\"\\nEpisode {num_episodes}:\")\n            print(f\"\\tsteps 
done = {self.steps_done} / {self.training_steps}\")\n            print(f\"\\treturn = {ep_return}\")\n            print(f\"\\tgoal = {goal}\")\n\n    def run_train_episode(self, step_limit):\n        o, _ = self.env.reset()\n        done = False\n        env_step_limit_reached = False\n\n        steps = 0\n        episode_return = 0\n\n        while not done and not env_step_limit_reached and steps < step_limit:\n            a = self.get_egreedy_action(o, self.get_epsilon())\n\n            next_o, r, done, env_step_limit_reached, _ = self.env.step(a)\n            self.replay.store(o, a, next_o, r, done)\n            self.steps_done += 1\n            loss, mean_v = self.optimize()\n            self.logger.add_scalar(\"loss\", loss, self.steps_done)\n            self.logger.add_scalar(\"mean_v\", mean_v, self.steps_done)\n\n            o = next_o\n            episode_return += r\n            steps += 1\n\n        return episode_return, steps, self.env.goal_reached()\n\n    def run_eval_episode(self,\n                         env=None,\n                         render=False,\n                         eval_epsilon=0.05,\n                         render_mode=\"human\"):\n        if env is None:\n            env = self.env\n\n        original_render_mode = env.render_mode\n        env.render_mode = render_mode\n\n        o, _ = env.reset()\n        done = False\n        env_step_limit_reached = False\n\n        steps = 0\n        episode_return = 0\n\n        line_break = \"=\"*60\n        if render:\n            print(\"\\n\" + line_break)\n            print(f\"Running EVALUATION using epsilon = {eval_epsilon:.4f}\")\n            print(line_break)\n            env.render()\n            input(\"Initial state. 
Press enter to continue..\")\n\n        while not done and not env_step_limit_reached:\n            a = self.get_egreedy_action(o, eval_epsilon)\n            next_o, r, done, env_step_limit_reached, _ = env.step(a)\n            o = next_o\n            episode_return += r\n            steps += 1\n            if render:\n                print(\"\\n\" + line_break)\n                print(f\"Step {steps}\")\n                print(line_break)\n                print(f\"Action Performed = {env.action_space.get_action(a)}\")\n                env.render()\n                print(f\"Reward = {r}\")\n                print(f\"Done = {done}\")\n                print(f\"Step limit reached = {env_step_limit_reached}\")\n                input(\"Press enter to continue..\")\n\n                if done or env_step_limit_reached:\n                    print(\"\\n\" + line_break)\n                    print(\"EPISODE FINISHED\")\n                    print(line_break)\n                    print(f\"Goal reached = {env.goal_reached()}\")\n                    print(f\"Total steps = {steps}\")\n                    print(f\"Total reward = {episode_return}\")\n\n        env.render_mode = original_render_mode\n        return episode_return, steps, env.goal_reached()\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"env_name\", type=str, help=\"benchmark scenario name\")\n    parser.add_argument(\"--render_eval\", action=\"store_true\",\n                        help=\"Renders final policy\")\n    parser.add_argument(\"-o\", \"--partially_obs\", action=\"store_true\",\n                        help=\"Partially Observable Mode\")\n    parser.add_argument(\"--hidden_sizes\", type=int, nargs=\"*\",\n                        default=[64, 64],\n                        help=\"(default=[64, 
64])\")\n    parser.add_argument(\"--lr\", type=float, default=0.001,\n                        help=\"Learning rate (default=0.001)\")\n    parser.add_argument(\"-t\", \"--training_steps\", type=int, default=20000,\n                        help=\"training steps (default=20000)\")\n    parser.add_argument(\"--batch_size\", type=int, default=32,\n                        help=\"(default=32)\")\n    parser.add_argument(\"--target_update_freq\", type=int, default=1000,\n                        help=\"(default=1000)\")\n    parser.add_argument(\"--seed\", type=int, default=0,\n                        help=\"(default=0)\")\n    parser.add_argument(\"--replay_size\", type=int, default=100000,\n                        help=\"(default=100000)\")\n    parser.add_argument(\"--final_epsilon\", type=float, default=0.05,\n                        help=\"(default=0.05)\")\n    parser.add_argument(\"--init_epsilon\", type=float, default=1.0,\n                        help=\"(default=1.0)\")\n    parser.add_argument(\"--exploration_steps\", type=int, default=10000,\n                        help=\"(default=10000)\")\n    parser.add_argument(\"--gamma\", type=float, default=0.99,\n                        help=\"(default=0.99)\")\n    parser.add_argument(\"--quiet\", action=\"store_false\",\n                        help=\"Run in quiet mode\")\n    args = parser.parse_args()\n\n    env = nasim.make_benchmark(args.env_name,\n                               args.seed,\n                               fully_obs=not args.partially_obs,\n                               flat_actions=True,\n                               flat_obs=True)\n    dqn_agent = DQNAgent(env, verbose=args.quiet, **vars(args))\n    dqn_agent.train()\n    dqn_agent.run_eval_episode(render=args.render_eval)\n"
  },
  {
    "path": "nasim/agents/keyboard_agent.py",
    "content": "\"\"\"An agent that lets the user interact with NASim using the keyboard.\n\nTo run 'tiny' benchmark scenario with default settings, run the following from\nthe nasim/agents dir:\n\n$ python keyboard_agent.py tiny\n\nThis will run the agent and display the game in stdout.\n\nTo see available running arguments:\n\n$ python keyboard_agent.py --help\n\"\"\"\nimport nasim\nfrom nasim.envs.action import Exploit, PrivilegeEscalation\n\n\nLINE_BREAK = \"-\"*60\nLINE_BREAK2 = \"=\"*60\n\n\ndef print_actions(action_space):\n    for a in range(action_space.n):\n        print(f\"{a} {action_space.get_action(a)}\")\n    print(LINE_BREAK)\n\n\ndef choose_flat_action(env):\n    print_actions(env.action_space)\n    while True:\n        try:\n            idx = int(input(\"Choose action number: \"))\n            action = env.action_space.get_action(idx)\n            print(f\"Performing: {action}\")\n            return action\n        except Exception:\n            print(\"Invalid choice. Try again.\")\n\n\ndef display_actions(actions):\n    action_names = list(actions)\n    for i, name in enumerate(action_names):\n        a_def = actions[name]\n        output = [f\"{i} {name}:\"]\n        output.extend([f\"{k}={v}\" for k, v in a_def.items()])\n        print(\" \".join(output))\n\n\ndef choose_item(items):\n    while True:\n        try:\n            idx = int(input(\"Choose number: \"))\n            return items[idx]\n        except Exception:\n            print(\"Invalid choice. Try again.\")\n\n\ndef choose_param_action(env):\n    print(\"1. Choose Action Type:\")\n    print(\"----------------------\")\n    for i, atype in enumerate(env.action_space.action_types):\n        print(f\"{i} {atype.__name__}\")\n    while True:\n        try:\n            atype_idx = int(input(\"Choose index: \"))\n            # check idx valid\n            atype = env.action_space.action_types[atype_idx]\n            break\n        except Exception:\n            print(\"Invalid choice. 
Try again.\")\n\n    print(\"------------------------\")\n    print(\"2. Choose Target Subnet:\")\n    print(\"------------------------\")\n    num_subnets = env.action_space.nvec[1]\n    while True:\n        try:\n            subnet = int(input(f\"Choose subnet in [1, {num_subnets}]: \"))\n            if subnet < 1 or subnet > num_subnets:\n                raise ValueError()\n            break\n        except Exception:\n            print(\"Invalid choice. Try again.\")\n\n    print(\"----------------------\")\n    print(\"3. Choose Target Host:\")\n    print(\"----------------------\")\n    num_hosts = env.scenario.subnets[subnet]\n    while True:\n        try:\n            host = int(input(f\"Choose host in [0, {num_hosts-1}]: \"))\n            if host < 0 or host > num_hosts-1:\n                raise ValueError()\n            break\n        except Exception:\n            print(\"Invalid choice. Try again.\")\n\n    # subnet-1, since action_space handles exclusion of internet subnet\n    avec = [atype_idx, subnet-1, host, 0, 0]\n    if atype not in (Exploit, PrivilegeEscalation):\n        action = env.action_space.get_action(avec)\n        print(\"----------------\")\n        print(f\"ACTION SELECTED: {action}\")\n        return action\n\n    target = (subnet, host)\n    if atype == Exploit:\n        print(\"------------------\")\n        print(\"4. Choose Exploit:\")\n        print(\"------------------\")\n        exploits = env.scenario.exploits\n        display_actions(exploits)\n        e_name = choose_item(list(exploits))\n        action = Exploit(name=e_name, target=target, **exploits[e_name])\n    else:\n        print(\"------------------\")\n        print(\"4. 
Choose Privilege Escalation:\")\n        print(\"------------------\")\n        privescs = env.scenario.privescs\n        display_actions(privescs)\n        pe_name = choose_item(list(privescs))\n        action = PrivilegeEscalation(\n            name=pe_name, target=target, **privescs[pe_name]\n        )\n\n    print(\"----------------\")\n    print(f\"ACTION SELECTED: {action}\")\n    return action\n\n\ndef choose_action(env):\n    input(\"Press enter to choose next action..\")\n    print(\"\\n\" + LINE_BREAK2)\n    print(\"CHOOSE ACTION\")\n    print(LINE_BREAK2)\n    if env.flat_actions:\n        return choose_flat_action(env)\n    return choose_param_action(env)\n\n\ndef run_keyboard_agent(env):\n    \"\"\"Run Keyboard agent\n\n    Parameters\n    ----------\n    env : NASimEnv\n        the environment\n\n    Returns\n    -------\n    int\n        final return\n    int\n        steps taken\n    bool\n        whether goal reached or not\n    \"\"\"\n    print(LINE_BREAK2)\n    print(\"STARTING EPISODE\")\n    print(LINE_BREAK2)\n\n    o, _ = env.reset()\n    env.render()\n    total_reward = 0\n    total_steps = 0\n    done = False\n    step_limit_reached = False\n    while not done and not step_limit_reached:\n        a = choose_action(env)\n        o, r, done, step_limit_reached, _ = env.step(a)\n        total_reward += r\n        total_steps += 1\n        print(\"\\n\" + LINE_BREAK2)\n        print(\"OBSERVATION RECEIVED\")\n        print(LINE_BREAK2)\n        env.render()\n        print(f\"Reward={r}\")\n        print(f\"Done={done}\")\n        print(f\"Step limit reached={step_limit_reached}\")\n        print(LINE_BREAK)\n\n    return total_reward, total_steps, done\n\n\ndef run_generative_keyboard_agent(env, render_mode=\"human\"):\n    \"\"\"Run Keyboard agent in generative mode.\n\n    The experience is the same as the normal mode; this is mainly useful\n    for testing.\n\n    Parameters\n    ----------\n    env : NASimEnv\n        the environment\n    
render_mode : str, optional\n        display mode for environment (default=\"human\")\n\n    Returns\n    -------\n    int\n        final return\n    int\n        steps taken\n    bool\n        whether goal reached or not\n    \"\"\"\n    print(LINE_BREAK2)\n    print(\"STARTING EPISODE\")\n    print(LINE_BREAK2)\n\n    o, _ = env.reset()\n    s = env.current_state\n    env.render_state(render_mode, s)\n    env.render_obs(render_mode, o)\n\n    total_reward = 0\n    total_steps = 0\n    done = False\n    while not done:\n        a = choose_action(env)\n        ns, o, r, done, _ = env.generative_step(s, a)\n        total_reward += r\n        total_steps += 1\n        print(LINE_BREAK2)\n        print(\"NEXT STATE\")\n        print(LINE_BREAK2)\n        env.render_state(render_mode, ns)\n        print(\"\\n\" + LINE_BREAK2)\n        print(\"OBSERVATION RECEIVED\")\n        print(LINE_BREAK2)\n        env.render_obs(render_mode, o)\n        print(f\"Reward={r}\")\n        print(f\"Done={done}\")\n        print(LINE_BREAK)\n        s = ns\n\n    if done:\n        done = env.goal_reached()\n\n    return total_reward, total_steps, done\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"env_name\", type=str,\n                        help=\"benchmark scenario name\")\n    parser.add_argument(\"-s\", \"--seed\", type=int, default=None,\n                        help=\"random seed (default=None)\")\n    parser.add_argument(\"-o\", \"--partially_obs\", action=\"store_true\",\n                        help=\"Partially Observable Mode\")\n    parser.add_argument(\"-p\", \"--param_actions\", action=\"store_true\",\n                        help=\"Use Parameterised action space\")\n    parser.add_argument(\"-g\", \"--use_generative\", action=\"store_true\",\n                        help=(\"Generative environment mode. 
This makes no\"\n                              \" difference for the player, but is useful\"\n                              \" for testing.\"))\n    args = parser.parse_args()\n\n    env = nasim.make_benchmark(args.env_name,\n                               args.seed,\n                               fully_obs=not args.partially_obs,\n                               flat_actions=not args.param_actions,\n                               flat_obs=True,\n                               render_mode=\"human\")\n    if args.use_generative:\n        total_reward, steps, goal = run_generative_keyboard_agent(env,\n                                                                  render_mode=\"human\")\n    else:\n        total_reward, steps, goal = run_keyboard_agent(env)\n\n    print(LINE_BREAK2)\n    print(\"EPISODE FINISHED\")\n    print(LINE_BREAK)\n    print(f\"Goal reached = {goal}\")\n    print(f\"Total reward = {total_reward}\")\n    print(f\"Steps taken = {steps}\")\n"
  },
  {
    "path": "nasim/agents/ql_agent.py",
    "content": "\"\"\"An example Tabular, epsilon greedy Q-Learning Agent.\n\nThis agent does not use an Experience replay (see the 'ql_replay_agent.py')\n\nIt uses pytorch 1.5+ tensorboard library for logging (HINT: these dependencies\ncan be installed by running pip install nasim[dqn])\n\nTo run 'tiny' benchmark scenario with default settings, run the following from\nthe nasim/agents dir:\n\n$ python ql_agent.py tiny\n\nTo see detailed results using tensorboard:\n\n$ tensorboard --logdir runs/\n\nTo see available hyperparameters:\n\n$ python ql_agent.py --help\n\nNotes\n-----\n\nThis is by no means a state of the art implementation of Tabular Q-Learning.\nIt is designed to be an example implementation that can be used as a reference\nfor building your own agents and for simple experimental comparisons.\n\"\"\"\nimport random\nimport numpy as np\nfrom pprint import pprint\n\nimport nasim\n\ntry:\n    from torch.utils.tensorboard import SummaryWriter\nexcept ImportError as e:\n    from gymnasium import error\n    raise error.DependencyNotInstalled(\n        f\"{e}. 
(HINT: you can install tabular_q_learning_agent dependencies \"\n        \"by running 'pip install nasim[dqn]'.)\"\n    )\n\n\nclass TabularQFunction:\n    \"\"\"Tabular Q-Function \"\"\"\n\n    def __init__(self, num_actions):\n        self.q_func = dict()\n        self.num_actions = num_actions\n\n    def __call__(self, x):\n        return self.forward(x)\n\n    def forward(self, x):\n        if isinstance(x, np.ndarray):\n            # use builtin int: the np.int alias was removed in NumPy 1.24\n            x = str(x.astype(int))\n        if x not in self.q_func:\n            self.q_func[x] = np.zeros(self.num_actions, dtype=np.float32)\n        return self.q_func[x]\n\n    def forward_batch(self, x_batch):\n        return np.asarray([self.forward(x) for x in x_batch])\n\n    def update_batch(self, s_batch, a_batch, delta_batch):\n        for s, a, delta in zip(s_batch, a_batch, delta_batch):\n            q_vals = self.forward(s)\n            q_vals[a] += delta\n\n    def update(self, s, a, delta):\n        q_vals = self.forward(s)\n        q_vals[a] += delta\n\n    def get_action(self, x):\n        return int(self.forward(x).argmax())\n\n    def display(self):\n        pprint(self.q_func)\n\n\nclass TabularQLearningAgent:\n    \"\"\"A Tabular, 
epsilon greedy Q-Learning Agent \"\"\"\n\n    def __init__(self,\n                 env,\n                 seed=None,\n                 lr=0.001,\n                 training_steps=10000,\n                 final_epsilon=0.05,\n                 exploration_steps=10000,\n                 gamma=0.99,\n                 verbose=True,\n                 **kwargs):\n\n        # This implementation only works for flat actions\n        assert env.flat_actions\n        self.verbose = verbose\n        if self.verbose:\n            print(\"\\nRunning Tabular Q-Learning with config:\")\n            pprint(locals())\n\n        # set seeds\n        self.seed = seed\n        if self.seed is not None:\n            np.random.seed(self.seed)\n\n        # environment setup\n        self.env = env\n\n        self.num_actions = self.env.action_space.n\n        self.obs_dim = self.env.observation_space.shape\n\n        # logger setup\n        self.logger = SummaryWriter()\n\n        # Training related attributes\n        self.lr = lr\n        self.exploration_steps = exploration_steps\n        self.final_epsilon = final_epsilon\n        self.epsilon_schedule = np.linspace(\n            1.0, self.final_epsilon, self.exploration_steps\n        )\n        self.discount = gamma\n        self.training_steps = training_steps\n        self.steps_done = 0\n\n        # Q-Function\n        self.qfunc = TabularQFunction(self.num_actions)\n\n    def get_epsilon(self):\n        if self.steps_done < self.exploration_steps:\n            return self.epsilon_schedule[self.steps_done]\n        return self.final_epsilon\n\n    def get_egreedy_action(self, o, epsilon):\n        if random.random() > epsilon:\n            return self.qfunc.get_action(o)\n        return random.randint(0, self.num_actions-1)\n\n    def optimize(self, s, a, next_s, r, done):\n        # get q_val for state and action performed in that state\n        q_vals_raw = self.qfunc.forward(s)\n        q_val = 
q_vals_raw[a]\n\n        # get target q val = max val of next state\n        target_q_val = self.qfunc.forward(next_s).max()\n        target = r + self.discount * (1-done) * target_q_val\n\n        # calculate error and update\n        td_error = target - q_val\n        td_delta = self.lr * td_error\n\n        # optimize the model\n        self.qfunc.update(s, a, td_delta)\n\n        s_value = q_vals_raw.max()\n        return td_error, s_value\n\n    def train(self):\n        if self.verbose:\n            print(\"\\nStarting training\")\n\n        num_episodes = 0\n        training_steps_remaining = self.training_steps\n\n        while self.steps_done < self.training_steps:\n            ep_results = self.run_train_episode(training_steps_remaining)\n            ep_return, ep_steps, goal = ep_results\n            num_episodes += 1\n            training_steps_remaining -= ep_steps\n\n            self.logger.add_scalar(\"episode\", num_episodes, self.steps_done)\n            self.logger.add_scalar(\n                \"epsilon\", self.get_epsilon(), self.steps_done\n            )\n            self.logger.add_scalar(\n                \"episode_return\", ep_return, self.steps_done\n            )\n            self.logger.add_scalar(\n                \"episode_steps\", ep_steps, self.steps_done\n            )\n            self.logger.add_scalar(\n                \"episode_goal_reached\", int(goal), self.steps_done\n            )\n\n            if num_episodes % 10 == 0 and self.verbose:\n                print(f\"\\nEpisode {num_episodes}:\")\n                print(f\"\\tsteps done = {self.steps_done} / \"\n                      f\"{self.training_steps}\")\n                print(f\"\\treturn = {ep_return}\")\n                print(f\"\\tgoal = {goal}\")\n\n        self.logger.close()\n        if self.verbose:\n            print(\"Training complete\")\n            print(f\"\\nEpisode {num_episodes}:\")\n            print(f\"\\tsteps done = {self.steps_done} / 
{self.training_steps}\")\n            print(f\"\\treturn = {ep_return}\")\n            print(f\"\\tgoal = {goal}\")\n\n    def run_train_episode(self, step_limit):\n        s, _ = self.env.reset()\n        done = False\n        env_step_limit_reached = False\n\n        steps = 0\n        episode_return = 0\n\n        while not done and not env_step_limit_reached and steps < step_limit:\n            a = self.get_egreedy_action(s, self.get_epsilon())\n\n            next_s, r, done, env_step_limit_reached, _ = self.env.step(a)\n            self.steps_done += 1\n            td_error, s_value = self.optimize(s, a, next_s, r, done)\n            self.logger.add_scalar(\"td_error\", td_error, self.steps_done)\n            self.logger.add_scalar(\"s_value\", s_value, self.steps_done)\n\n            s = next_s\n            episode_return += r\n            steps += 1\n\n        return episode_return, steps, self.env.goal_reached()\n\n    def run_eval_episode(self,\n                         env=None,\n                         render=False,\n                         eval_epsilon=0.05,\n                         render_mode=\"human\"):\n        if env is None:\n            env = self.env\n\n        original_render_mode = env.render_mode\n        env.render_mode = render_mode\n\n        s, _ = env.reset()\n        done = False\n        env_step_limit_reached = False\n\n        steps = 0\n        episode_return = 0\n\n        line_break = \"=\"*60\n        if render:\n            print(\"\\n\" + line_break)\n            print(f\"Running EVALUATION using epsilon = {eval_epsilon:.4f}\")\n            print(line_break)\n            env.render()\n            input(\"Initial state. 
Press enter to continue..\")\n\n        while not done and not env_step_limit_reached:\n            a = self.get_egreedy_action(s, eval_epsilon)\n            next_s, r, done, env_step_limit_reached, _ = env.step(a)\n            s = next_s\n            episode_return += r\n            steps += 1\n            if render:\n                print(\"\\n\" + line_break)\n                print(f\"Step {steps}\")\n                print(line_break)\n                print(f\"Action Performed = {env.action_space.get_action(a)}\")\n                env.render()\n                print(f\"Reward = {r}\")\n                print(f\"Done = {done}\")\n                print(f\"Step limit reached = {env_step_limit_reached}\")\n                input(\"Press enter to continue..\")\n\n                if done or env_step_limit_reached:\n                    print(\"\\n\" + line_break)\n                    print(\"EPISODE FINISHED\")\n                    print(line_break)\n                    print(f\"Goal reached = {env.goal_reached()}\")\n                    print(f\"Total steps = {steps}\")\n                    print(f\"Total reward = {episode_return}\")\n\n        env.render_mode = original_render_mode\n        return episode_return, steps, env.goal_reached()\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"env_name\", type=str, help=\"benchmark scenario name\")\n    parser.add_argument(\"--render_eval\", action=\"store_true\",\n                        help=\"Renders final policy\")\n    parser.add_argument(\"--lr\", type=float, default=0.001,\n                        help=\"Learning rate (default=0.001)\")\n    parser.add_argument(\"-t\", \"--training_steps\", type=int, default=10000,\n                        help=\"training steps (default=10000)\")\n    parser.add_argument(\"--batch_size\", type=int, default=32,\n                        help=\"(default=32)\")\n    parser.add_argument(\"--seed\", type=int, 
default=0,\n                        help=\"(default=0)\")\n    parser.add_argument(\"--replay_size\", type=int, default=100000,\n                        help=\"(default=100000)\")\n    parser.add_argument(\"--final_epsilon\", type=float, default=0.05,\n                        help=\"(default=0.05)\")\n    parser.add_argument(\"--init_epsilon\", type=float, default=1.0,\n                        help=\"(default=1.0)\")\n    parser.add_argument(\"-e\", \"--exploration_steps\", type=int, default=10000,\n                        help=\"(default=10000)\")\n    parser.add_argument(\"--gamma\", type=float, default=0.99,\n                        help=\"(default=0.99)\")\n    parser.add_argument(\"--quiet\", action=\"store_false\",\n                        help=\"Run in quiet mode\")\n    args = parser.parse_args()\n\n    env = nasim.make_benchmark(\n        args.env_name,\n        args.seed,\n        fully_obs=True,\n        flat_actions=True,\n        flat_obs=True\n    )\n    ql_agent = TabularQLearningAgent(\n        env, verbose=args.quiet, **vars(args)\n    )\n    ql_agent.train()\n    ql_agent.run_eval_episode(render=args.render_eval)\n"
  },
  {
    "path": "nasim/agents/ql_replay_agent.py",
    "content": "\"\"\"An example Tabular, epsilon greedy Q-Learning Agent using experience replay.\n\nThe replay can help improve learning stability and speed (in terms of learning\nper training step), at the cost of increased memory and computation use.\n\nIt uses pytorch 1.5+ tensorboard library for logging (HINT: these dependencies\ncan be installed by running pip install nasim[dqn])\n\nTo run 'tiny' benchmark scenario with default settings, run the following from\nthe nasim/agents dir:\n\n$ python ql_replay_agent.py tiny\n\nTo see detailed results using tensorboard:\n\n$ tensorboard --logdir runs/\n\nTo see available hyperparameters:\n\n$ python ql_replay_agent.py --help\n\nNotes\n-----\n\nThis is by no means a state of the art implementation of Tabular Q-Learning.\nIt is designed to be an example implementation that can be used as a reference\nfor building your own agents and for simple experimental comparisons.\n\"\"\"\nimport random\nfrom pprint import pprint\n\nimport numpy as np\n\nimport nasim\n\ntry:\n    from torch.utils.tensorboard import SummaryWriter\nexcept ImportError as e:\n    from gymnasium import error\n    raise error.DependencyNotInstalled(\n        f\"{e}. 
(HINT: you can install ql_replay_agent dependencies \"\n        \"by running 'pip install nasim[dqn]'.)\"\n    )\n\n\nclass ReplayMemory:\n    \"\"\"Experience Replay for Tabular Q-Learning agent \"\"\"\n\n    def __init__(self, capacity, s_dims):\n        self.capacity = capacity\n        self.s_buf = np.zeros((capacity, *s_dims), dtype=np.float32)\n        self.a_buf = np.zeros((capacity, 1), dtype=np.int32)\n        self.next_s_buf = np.zeros((capacity, *s_dims), dtype=np.float32)\n        self.r_buf = np.zeros(capacity, dtype=np.float32)\n        self.done_buf = np.zeros(capacity, dtype=np.float32)\n        self.ptr, self.size = 0, 0\n\n    def store(self, s, a, next_s, r, done):\n        self.s_buf[self.ptr] = s\n        self.a_buf[self.ptr] = a\n        self.next_s_buf[self.ptr] = next_s\n        self.r_buf[self.ptr] = r\n        self.done_buf[self.ptr] = done\n        self.ptr = (self.ptr + 1) % self.capacity\n        self.size = min(self.size+1, self.capacity)\n\n    def sample_batch(self, batch_size):\n        sample_idxs = np.random.choice(self.size, batch_size)\n        batch = [self.s_buf[sample_idxs],\n                 self.a_buf[sample_idxs],\n                 self.next_s_buf[sample_idxs],\n                 self.r_buf[sample_idxs],\n                 self.done_buf[sample_idxs]]\n        return batch\n\n\nclass TabularQFunction:\n    \"\"\"Tabular Q-Function \"\"\"\n\n    def __init__(self, num_actions):\n        self.q_func = dict()\n        self.num_actions = num_actions\n\n    def __call__(self, x):\n        return self.forward(x)\n\n    def forward(self, x):\n        if isinstance(x, np.ndarray):\n            # np.int was removed in NumPy 1.24, use the builtin int instead\n            x = str(x.astype(int))\n        if x not in self.q_func:\n            self.q_func[x] = np.zeros(self.num_actions, dtype=np.float32)\n        return self.q_func[x]\n\n    def forward_batch(self, x_batch):\n        return np.asarray([self.forward(x) for x in x_batch])\n\n    def update(self, s_batch, a_batch, delta_batch):\n        
for s, a, delta in zip(s_batch, a_batch, delta_batch):\n            q_vals = self.forward(s)\n            q_vals[a] += delta\n\n    def get_action(self, x):\n        return int(self.forward(x).argmax())\n\n    def display(self):\n        pprint(self.q_func)\n\n\nclass TabularQLearningAgent:\n    \"\"\"A Tabular, epsilon greedy Q-Learning Agent using Experience Replay \"\"\"\n\n    def __init__(self,\n                 env,\n                 seed=None,\n                 lr=0.001,\n                 training_steps=10000,\n                 batch_size=32,\n                 replay_size=10000,\n                 final_epsilon=0.05,\n                 exploration_steps=10000,\n                 gamma=0.99,\n                 verbose=True,\n                 **kwargs):\n\n        # This implementation only works for flat actions\n        assert env.flat_actions\n        self.verbose = verbose\n        if self.verbose:\n            print(\"\\nRunning Tabular Q-Learning with config:\")\n            pprint(locals())\n\n        # set seeds (seed the stdlib random module too, since it is used\n        # for epsilon-greedy action selection)\n        self.seed = seed\n        if self.seed is not None:\n            np.random.seed(self.seed)\n            random.seed(self.seed)\n\n        # environment setup\n        self.env = env\n\n        self.num_actions = self.env.action_space.n\n        self.obs_dim = self.env.observation_space.shape\n\n        # logger setup\n        self.logger = SummaryWriter()\n\n        # Training related attributes\n        self.lr = lr\n        self.exploration_steps = exploration_steps\n        self.final_epsilon = final_epsilon\n        self.epsilon_schedule = np.linspace(\n            1.0, self.final_epsilon, self.exploration_steps\n        )\n        self.batch_size = batch_size\n        self.discount = gamma\n        self.training_steps = training_steps\n        self.steps_done = 0\n\n        # Q-Function\n        self.qfunc = TabularQFunction(self.num_actions)\n\n        # replay setup\n        self.replay = ReplayMemory(replay_size, self.obs_dim)\n\n    def get_epsilon(self):\n        if 
self.steps_done < self.exploration_steps:\n            return self.epsilon_schedule[self.steps_done]\n        return self.final_epsilon\n\n    def get_egreedy_action(self, o, epsilon):\n        if random.random() > epsilon:\n            return self.qfunc.get_action(o)\n        return random.randint(0, self.num_actions-1)\n\n    def optimize(self):\n        batch = self.replay.sample_batch(self.batch_size)\n        s_batch, a_batch, next_s_batch, r_batch, d_batch = batch\n\n        # get q_vals for each state and the action performed in that state\n        q_vals_raw = self.qfunc.forward_batch(s_batch)\n        q_vals = np.take_along_axis(q_vals_raw, a_batch, axis=1).squeeze()\n\n        # get target q val = max val of next state\n        target_q_val_raw = self.qfunc.forward_batch(next_s_batch)\n        target_q_val = target_q_val_raw.max(axis=1)\n        target = r_batch + self.discount*(1-d_batch)*target_q_val\n\n        # calculate error and update\n        td_error = target - q_vals\n        td_delta = self.lr * td_error\n\n        # optimize the model\n        self.qfunc.update(s_batch, a_batch, td_delta)\n\n        q_vals_max = q_vals_raw.max(axis=1)\n        mean_v = q_vals_max.mean().item()\n        mean_td_error = np.absolute(td_error).mean().item()\n        return mean_td_error, mean_v\n\n    def train(self):\n        if self.verbose:\n            print(\"\\nStarting training\")\n\n        num_episodes = 0\n        training_steps_remaining = self.training_steps\n\n        while self.steps_done < self.training_steps:\n            ep_results = self.run_train_episode(training_steps_remaining)\n            ep_return, ep_steps, goal = ep_results\n            num_episodes += 1\n            training_steps_remaining -= ep_steps\n\n            self.logger.add_scalar(\"episode\", num_episodes, self.steps_done)\n            self.logger.add_scalar(\n                \"epsilon\", self.get_epsilon(), self.steps_done\n            )\n            self.logger.add_scalar(\n  
              \"episode_return\", ep_return, self.steps_done\n            )\n            self.logger.add_scalar(\n                \"episode_steps\", ep_steps, self.steps_done\n            )\n            self.logger.add_scalar(\n                \"episode_goal_reached\", int(goal), self.steps_done\n            )\n\n            if num_episodes % 10 == 0 and self.verbose:\n                print(f\"\\nEpisode {num_episodes}:\")\n                print(f\"\\tsteps done = {self.steps_done} / \"\n                      f\"{self.training_steps}\")\n                print(f\"\\treturn = {ep_return}\")\n                print(f\"\\tgoal = {goal}\")\n\n        self.logger.close()\n        if self.verbose:\n            print(\"Training complete\")\n            print(f\"\\nEpisode {num_episodes}:\")\n            print(f\"\\tsteps done = {self.steps_done} / {self.training_steps}\")\n            print(f\"\\treturn = {ep_return}\")\n            print(f\"\\tgoal = {goal}\")\n\n    def run_train_episode(self, step_limit):\n        # gymnasium-style reset returns an (observation, info) tuple\n        o, _ = self.env.reset()\n        done = False\n        env_step_limit_reached = False\n\n        steps = 0\n        episode_return = 0\n\n        while not done and not env_step_limit_reached and steps < step_limit:\n            a = self.get_egreedy_action(o, self.get_epsilon())\n\n            next_o, r, done, env_step_limit_reached, _ = self.env.step(a)\n            self.replay.store(o, a, next_o, r, done)\n            self.steps_done += 1\n            mean_td_error, mean_v = self.optimize()\n            self.logger.add_scalar(\n                \"mean_td_error\", mean_td_error, self.steps_done\n            )\n            self.logger.add_scalar(\"mean_v\", mean_v, self.steps_done)\n\n            o = next_o\n            episode_return += r\n            steps += 1\n\n        return episode_return, steps, self.env.goal_reached()\n\n    def run_eval_episode(self,\n                         env=None,\n                         render=False,\n                         
eval_epsilon=0.05,\n                         render_mode=\"readable\"):\n        if env is None:\n            env = self.env\n        # gymnasium-style reset returns an (observation, info) tuple\n        o, _ = env.reset()\n        done = False\n        env_step_limit_reached = False\n\n        steps = 0\n        episode_return = 0\n\n        line_break = \"=\"*60\n        if render:\n            print(\"\\n\" + line_break)\n            print(f\"Running EVALUATION using epsilon = {eval_epsilon:.4f}\")\n            print(line_break)\n            env.render(render_mode)\n            input(\"Initial state. Press enter to continue..\")\n\n        while not done and not env_step_limit_reached:\n            a = self.get_egreedy_action(o, eval_epsilon)\n            next_o, r, done, env_step_limit_reached, _ = env.step(a)\n            o = next_o\n            episode_return += r\n            steps += 1\n            if render:\n                print(\"\\n\" + line_break)\n                print(f\"Step {steps}\")\n                print(line_break)\n                print(f\"Action Performed = {env.action_space.get_action(a)}\")\n                env.render(render_mode)\n                print(f\"Reward = {r}\")\n                print(f\"Done = {done}\")\n                print(f\"Step limit reached = {env_step_limit_reached}\")\n                input(\"Press enter to continue..\")\n\n                if done or env_step_limit_reached:\n                    print(\"\\n\" + line_break)\n                    print(\"EPISODE FINISHED\")\n                    print(line_break)\n                    print(f\"Goal reached = {env.goal_reached()}\")\n                    print(f\"Total steps = {steps}\")\n                    print(f\"Total reward = {episode_return}\")\n\n        return episode_return, steps, env.goal_reached()\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"env_name\", type=str, help=\"benchmark scenario name\")\n    parser.add_argument(\"--render_eval\", 
action=\"store_true\",\n                        help=\"Renders final policy\")\n    parser.add_argument(\"--lr\", type=float, default=0.001,\n                        help=\"Learning rate (default=0.001)\")\n    parser.add_argument(\"-t\", \"--training_steps\", type=int, default=10000,\n                        help=\"training steps (default=10000)\")\n    parser.add_argument(\"--batch_size\", type=int, default=32,\n                        help=\"(default=32)\")\n    parser.add_argument(\"--seed\", type=int, default=0,\n                        help=\"(default=0)\")\n    parser.add_argument(\"--replay_size\", type=int, default=100000,\n                        help=\"(default=100000)\")\n    parser.add_argument(\"--final_epsilon\", type=float, default=0.05,\n                        help=\"(default=0.05)\")\n    parser.add_argument(\"--init_epsilon\", type=float, default=1.0,\n                        help=\"(default=1.0)\")\n    parser.add_argument(\"--exploration_steps\", type=int, default=10000,\n                        help=\"(default=10000)\")\n    parser.add_argument(\"--gamma\", type=float, default=0.99,\n                        help=\"(default=0.99)\")\n    parser.add_argument(\"--quiet\", action=\"store_false\",\n                        help=\"Run in quiet mode (suppress progress output)\")\n    args = parser.parse_args()\n\n    env = nasim.make_benchmark(args.env_name,\n                               args.seed,\n                               fully_obs=True,\n                               flat_actions=True,\n                               flat_obs=True)\n    ql_agent = TabularQLearningAgent(\n        env, verbose=args.quiet, **vars(args)\n    )\n    ql_agent.train()\n    ql_agent.run_eval_episode(render=args.render_eval)\n
  },
  {
    "path": "nasim/agents/random_agent.py",
    "content": "\"\"\"A random agent that selects a random action at each step\n\nTo run 'tiny' benchmark scenario with default settings, run the following from\nthe nasim/agents dir:\n\n$ python random_agent.py tiny\n\nThis will run the agent and display progress and final results to stdout.\n\nTo see available running arguments:\n\n$ python random_agent.py --help\n\"\"\"\n\nimport numpy as np\n\nimport nasim\n\nLINE_BREAK = \"-\"*60\n\n\ndef run_random_agent(env, step_limit=1e6, verbose=True):\n    if verbose:\n        print(LINE_BREAK)\n        print(\"STARTING EPISODE\")\n        print(LINE_BREAK)\n        print(f\"t: Reward\")\n\n    env.reset()\n    total_reward = 0\n    done = False\n    env_step_limit_reached = False\n    t = 0\n    a = 0\n\n    while not done and not env_step_limit_reached and t < step_limit:\n        a = env.action_space.sample()\n        _, r, done, env_step_limit_reached, _ = env.step(a)\n        total_reward += r\n        if (t+1) % 100 == 0 and verbose:\n            print(f\"{t}: {total_reward}\")\n        t += 1\n\n    if (done or env_step_limit_reached) and verbose:\n        print(LINE_BREAK)\n        print(\"EPISODE FINISHED\")\n        print(LINE_BREAK)\n        print(f\"Total steps = {t}\")\n        print(f\"Total reward = {total_reward}\")\n    elif verbose:\n        print(LINE_BREAK)\n        print(\"STEP LIMIT REACHED\")\n        print(LINE_BREAK)\n\n    if done:\n        done = env.goal_reached()\n\n    return t, total_reward, done\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"env_name\", type=str,\n                        help=\"benchmark scenario name\")\n    parser.add_argument(\"-s\", \"--seed\", type=int, default=0,\n                        help=\"random seed\")\n    parser.add_argument(\"-r\", \"--runs\", type=int, default=1,\n                        help=\"number of random runs to perform (default=1)\")\n    parser.add_argument(\"-o\", 
\"--partially_obs\", action=\"store_true\",\n                        help=\"Partially Observable Mode\")\n    parser.add_argument(\"-p\", \"--param_actions\", action=\"store_true\",\n                        help=\"Use Parameterised action space\")\n    parser.add_argument(\"-f\", \"--box_obs\", action=\"store_true\",\n                        help=\"Use 2D observation space\")\n    args = parser.parse_args()\n\n    seed = args.seed\n    run_steps = []\n    run_rewards = []\n    run_goals = 0\n    for i in range(args.runs):\n        env = nasim.make_benchmark(args.env_name,\n                                   seed,\n                                   not args.partially_obs,\n                                   not args.param_actions,\n                                   not args.box_obs)\n        steps, reward, done = run_random_agent(env, verbose=False)\n        run_steps.append(steps)\n        run_rewards.append(reward)\n        run_goals += int(done)\n        seed += 1\n\n        if args.runs > 1:\n            print(f\"Run {i}:\")\n            print(f\"\\tSteps = {steps}\")\n            print(f\"\\tReward = {reward}\")\n            print(f\"\\tGoal reached = {done}\")\n\n    run_steps = np.array(run_steps)\n    run_rewards = np.array(run_rewards)\n\n    print(LINE_BREAK)\n    print(\"Random Agent Runs Complete\")\n    print(LINE_BREAK)\n    print(f\"Mean steps = {run_steps.mean():.2f} +/- {run_steps.std():.2f}\")\n    print(f\"Mean rewards = {run_rewards.mean():.2f} \"\n          f\"+/- {run_rewards.std():.2f}\")\n    print(f\"Goals reached = {run_goals} / {args.runs}\")\n"
  },
  {
    "path": "nasim/demo.py",
    "content": "\"\"\"Script for running NASim demo\n\nUsage\n-----\n\n$ python demo.py [-ai] [-h] env_name\n\"\"\"\n\nimport os.path as osp\n\nimport nasim\nfrom nasim.agents.dqn_agent import DQNAgent\nfrom nasim.agents.keyboard_agent import run_keyboard_agent\n\n\nDQN_POLICY_DIR = osp.join(\n    osp.dirname(osp.abspath(__file__)),\n    \"agents\",\n    \"policies\"\n)\nDQN_POLICIES = {\n    \"tiny\": osp.join(DQN_POLICY_DIR, \"dqn_tiny.pt\"),\n    \"small\": osp.join(DQN_POLICY_DIR, \"dqn_small.pt\")\n}\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser(\n        description=(\n            \"NASim demo. Play as the hacker, trying to gain access\"\n            \" to sensitive information on the network, or run a pre-trained\"\n            \" AI hacker.\"\n        )\n    )\n    parser.add_argument(\"env_name\", type=str,\n                        help=\"benchmark scenario name\")\n    parser.add_argument(\"-ai\", \"--run_ai\", action=\"store_true\",\n                        help=(\"Run AI policy (currently only supported for\"\n                              \" 'tiny' and 'small' environments)\"))\n    args = parser.parse_args()\n\n    if args.run_ai:\n        assert args.env_name in DQN_POLICIES, \\\n            (\"AI demo only supported for the following environments:\"\n             f\" {list(DQN_POLICIES)}\")\n\n    env = nasim.make_benchmark(\n        args.env_name,\n        fully_obs=True,\n        flat_actions=True,\n        flat_obs=True,\n        render_mode=\"human\"\n    )\n\n    line_break = f\"\\n{'-'*60}\"\n    print(line_break)\n    print(f\"Running Demo on {args.env_name} environment\")\n    if args.run_ai:\n        print(\"Using AI policy\")\n        print(line_break)\n        dqn_agent = DQNAgent(env, verbose=False, **vars(args))\n        dqn_agent.load(DQN_POLICIES[args.env_name])\n        ret, steps, goal = dqn_agent.run_eval_episode(\n            env, True, 0.01, \"human\"\n        )\n    else:\n        
print(\"Player controlled\")\n        print(line_break)\n        ret, steps, goal = run_keyboard_agent(env)\n\n    print(line_break)\n    print(\"Episode Complete\")\n    print(line_break)\n    if goal:\n        print(\"Goal accomplished. Sensitive data retrieved!\")\n    print(f\"Final Score={ret}\")\n    print(f\"Steps taken={steps}\")\n
  },
  {
    "path": "nasim/envs/__init__.py",
    "content": "from nasim.envs.gym_env import NASimGymEnv\nfrom nasim.envs.environment import NASimEnv\n"
  },
  {
    "path": "nasim/envs/action.py",
    "content": "\"\"\"Action related classes for the NASim environment.\n\nThis module contains the different action classes that are used\nto implement actions within a NASim environment, along within the\ndifferent ActionSpace classes, and the ActionResult class.\n\nNotes\n-----\n\n**Actions:**\n\nEvery action inherits from the base :class:`Action` class, which defines\nsome common attributes and functions. Different types of actions\nare implemented as subclasses of the Action class.\n\nAction types implemented:\n\n- :class:`Exploit`\n- :class:`PrivilegeEscalation`\n- :class:`ServiceScan`\n- :class:`OSScan`\n- :class:`SubnetScan`\n- :class:`ProcessScan`\n- :class:`NoOp`\n\n**Action Spaces:**\n\nThere are two types of action spaces, depending on if you are using flat\nactions or not:\n\n- :class:`FlatActionSpace`\n- :class:`ParameterisedActionSpace`\n\n\"\"\"\n\nimport math\nimport numpy as np\nfrom gymnasium import spaces\n\nfrom nasim.envs.utils import AccessLevel\n\n\ndef load_action_list(scenario):\n    \"\"\"Load list of actions for environment for given scenario\n\n    Parameters\n    ----------\n    scenario : Scenario\n        the scenario\n\n    Returns\n    -------\n    list\n        list of all actions in environment\n    \"\"\"\n    action_list = []\n    for address in scenario.address_space:\n        action_list.append(\n            ServiceScan(address, scenario.service_scan_cost)\n        )\n        action_list.append(\n            OSScan(address, scenario.os_scan_cost)\n        )\n        action_list.append(\n            SubnetScan(address, scenario.subnet_scan_cost)\n        )\n        action_list.append(\n            ProcessScan(address, scenario.process_scan_cost)\n        )\n        for e_name, e_def in scenario.exploits.items():\n            exploit = Exploit(e_name, address, **e_def)\n            action_list.append(exploit)\n        for pe_name, pe_def in scenario.privescs.items():\n            privesc = PrivilegeEscalation(pe_name, address, 
**pe_def)\n            action_list.append(privesc)\n    return action_list\n\n\nclass Action:\n    \"\"\"The base abstract action class in the environment\n\n    There are multiple types of actions (e.g. exploit, scan, etc.), but every\n    action has some common attributes.\n\n    ...\n\n    Attributes\n    ----------\n    name : str\n        the name of action\n    target : (int, int)\n        the (subnet, host) address of target of the action. The target of the\n        action could be the address of a host that the action is being used\n        against (e.g. for exploits or targeted scans) or could be the host that\n        the action is being executed on (e.g. for subnet scans).\n    cost : float\n        the cost of performing the action\n    prob : float\n        the success probability of the action. This is the probability that\n        the action works given that its preconditions are met. E.g. a remote\n        exploit targeting a host that you cannot communicate with will always\n        fail. For deterministic actions this will be 1.0.\n    req_access : AccessLevel\n        the required access level to perform action. For on-host actions\n        (i.e. subnet scan, process scan, and privilege escalation) this will\n        be the access on the target. For remote actions (i.e. 
service scan,\n        os scan, and exploits) this will be the access on a pivot host (i.e.\n        a compromised host that can reach the target).\n    \"\"\"\n\n    def __init__(self,\n                 name,\n                 target,\n                 cost,\n                 prob=1.0,\n                 req_access=AccessLevel.USER,\n                 **kwargs):\n        \"\"\"\n        Parameters\n        ---------\n        name : str\n            name of action\n        target : (int, int)\n            address of target\n        cost : float\n            cost of performing action\n        prob : float, optional\n            probability of success for a given action (default=1.0)\n        req_access : AccessLevel, optional\n            the required access level to perform action\n            (default=AccessLevel.USER)\n        \"\"\"\n        assert 0 <= prob <= 1.0\n        self.name = name\n        self.target = target\n        self.cost = cost\n        self.prob = prob\n        self.req_access = req_access\n\n    def is_exploit(self):\n        \"\"\"Check if action is an exploit\n\n        Returns\n        -------\n        bool\n            True if action is exploit, otherwise False\n        \"\"\"\n        return isinstance(self, Exploit)\n\n    def is_privilege_escalation(self):\n        \"\"\"Check if action is privilege escalation action\n\n        Returns\n        -------\n        bool\n            True if action is privilege escalation action, otherwise False\n        \"\"\"\n        return isinstance(self, PrivilegeEscalation)\n\n    def is_scan(self):\n        \"\"\"Check if action is a scan\n\n        Returns\n        -------\n        bool\n            True if action is scan, otherwise False\n        \"\"\"\n        return isinstance(self, (ServiceScan, OSScan, SubnetScan, ProcessScan))\n\n    def is_remote(self):\n        \"\"\"Check if action is a remote action\n\n        A remote action is one where the target host is a remote host (i.e. 
the\n        action is not performed locally on the target)\n\n        Returns\n        -------\n        bool\n            True if action is remote, otherwise False\n        \"\"\"\n        return isinstance(self, (ServiceScan, OSScan, Exploit))\n\n    def is_service_scan(self):\n        \"\"\"Check if action is a service scan\n\n        Returns\n        -------\n        bool\n            True if action is service scan, otherwise False\n        \"\"\"\n        return isinstance(self, ServiceScan)\n\n    def is_os_scan(self):\n        \"\"\"Check if action is an OS scan\n\n        Returns\n        -------\n        bool\n            True if action is an OS scan, otherwise False\n        \"\"\"\n        return isinstance(self, OSScan)\n\n    def is_subnet_scan(self):\n        \"\"\"Check if action is a subnet scan\n\n        Returns\n        -------\n        bool\n            True if action is a subnet scan, otherwise False\n        \"\"\"\n        return isinstance(self, SubnetScan)\n\n    def is_process_scan(self):\n        \"\"\"Check if action is a process scan\n\n        Returns\n        -------\n        bool\n            True if action is a process scan, otherwise False\n        \"\"\"\n        return isinstance(self, ProcessScan)\n\n    def is_noop(self):\n        \"\"\"Check if action is a do nothing action.\n\n        Returns\n        -------\n        bool\n            True if action is a noop action, otherwise False\n        \"\"\"\n        return isinstance(self, NoOp)\n\n    def __str__(self):\n        return (f\"{self.__class__.__name__}: \"\n                f\"target={self.target}, \"\n                f\"cost={self.cost:.2f}, \"\n                f\"prob={self.prob:.2f}, \"\n                f\"req_access={self.req_access}\")\n\n    def __hash__(self):\n        return hash(self.__str__())\n\n    def __eq__(self, other):\n        if self is other:\n            return True\n        if not isinstance(other, type(self)):\n            return False\n        if 
self.target != other.target:\n            return False\n        if not (math.isclose(self.cost, other.cost)\n                and math.isclose(self.prob, other.prob)):\n            return False\n        return self.req_access == other.req_access\n\n\nclass Exploit(Action):\n    \"\"\"An Exploit action in the environment\n\n    Inherits from the base Action Class.\n\n    ...\n\n    Attributes\n    ----------\n    service : str\n        the service targeted by exploit\n    os : str\n        the OS targeted by exploit. If None then exploit works for all OSs.\n    access : int\n        the access level gained on target if exploit succeeds.\n    \"\"\"\n\n    def __init__(self,\n                 name,\n                 target,\n                 cost,\n                 service,\n                 os=None,\n                 access=0,\n                 prob=1.0,\n                 req_access=AccessLevel.USER,\n                 **kwargs):\n        \"\"\"\n        Parameters\n        ---------\n        target : (int, int)\n            address of target\n        cost : float\n            cost of performing action\n        service : str\n            the target service\n        os : str, optional\n            the target OS of exploit, if None then exploit works for all OS\n            (default=None)\n        access : int, optional\n            the access level gained on target if exploit succeeds (default=0)\n        prob : float, optional\n            probability of success (default=1.0)\n        req_access : AccessLevel, optional\n            the required access level to perform action\n            (default=AccessLevel.USER)\n        \"\"\"\n        super().__init__(name=name,\n                         target=target,\n                         cost=cost,\n                         prob=prob,\n                         req_access=req_access)\n        self.os = os\n        self.service = service\n        self.access = access\n\n    def __str__(self):\n        return 
(f\"{super().__str__()}, os={self.os}, \"\n                f\"service={self.service}, access={self.access}\")\n\n    def __eq__(self, other):\n        if not super().__eq__(other):\n            return False\n        return self.service == other.service \\\n            and self.os == other.os \\\n            and self.access == other.access\n\n\nclass PrivilegeEscalation(Action):\n    \"\"\"A privilege escalation action in the environment\n\n    Inherits from the base Action Class.\n\n    ...\n\n    Attributes\n    ----------\n    process : str\n        the process targeted by the privilege escalation. If None the action\n        works independent of a process\n    os : str\n        the OS targeted by privilege escalation. If None then action works\n        for all OSs.\n    access : int\n        the access level resulting from privilege escalation action\n    \"\"\"\n\n    def __init__(self,\n                 name,\n                 target,\n                 cost,\n                 access,\n                 process=None,\n                 os=None,\n                 prob=1.0,\n                 req_access=AccessLevel.USER,\n                 **kwargs):\n        \"\"\"\n        Parameters\n        ---------\n        target : (int, int)\n            address of target\n        cost : float\n            cost of performing action\n        access : int\n            the access level resulting from the privilege escalation\n        process : str, optional\n            the target process, if None the action does not require a process\n            to work (default=None)\n        os : str, optional\n            the target OS of privilege escalation action, if None then action\n            works for all OS (default=None)\n        prob : float, optional\n            probability of success (default=1.0)\n        req_access : AccessLevel, optional\n            the required access level to perform action\n            (default=AccessLevel.USER)\n        \"\"\"\n        
super().__init__(name=name,\n                         target=target,\n                         cost=cost,\n                         prob=prob,\n                         req_access=req_access)\n        self.access = access\n        self.os = os\n        self.process = process\n\n    def __str__(self):\n        return (f\"{super().__str__()}, os={self.os}, \"\n                f\"process={self.process}, access={self.access}\")\n\n    def __eq__(self, other):\n        if not super().__eq__(other):\n            return False\n        return self.process == other.process \\\n            and self.os == other.os \\\n            and self.access == other.access\n\n\nclass ServiceScan(Action):\n    \"\"\"A Service Scan action in the environment\n\n    Inherits from the base Action Class.\n    \"\"\"\n\n    def __init__(self,\n                 target,\n                 cost,\n                 prob=1.0,\n                 req_access=AccessLevel.USER,\n                 **kwargs):\n        \"\"\"\n        Parameters\n        ---------\n        target : (int, int)\n            address of target\n        cost : float\n            cost of performing action\n        prob : float, optional\n            probability of success for a given action (default=1.0)\n        req_access : AccessLevel, optional\n            the required access level to perform action\n            (default=AccessLevel.USER)\n        \"\"\"\n        super().__init__(\"service_scan\",\n                         target=target,\n                         cost=cost,\n                         prob=prob,\n                         req_access=req_access,\n                         **kwargs)\n\n\nclass OSScan(Action):\n    \"\"\"An OS Scan action in the environment\n\n    Inherits from the base Action Class.\n    \"\"\"\n\n    def __init__(self,\n                 target,\n                 cost,\n                 prob=1.0,\n                 req_access=AccessLevel.USER,\n                 **kwargs):\n        \"\"\"\n        
Parameters\n        ---------\n        target : (int, int)\n            address of target\n        cost : float\n            cost of performing action\n        prob : float, optional\n            probability of success for a given action (default=1.0)\n        req_access : AccessLevel, optional\n            the required access level to perform action\n            (default=AccessLevel.USER)\n        \"\"\"\n        super().__init__(\"os_scan\",\n                         target=target,\n                         cost=cost,\n                         prob=prob,\n                         req_access=req_access,\n                         **kwargs)\n\n\nclass SubnetScan(Action):\n    \"\"\"A Subnet Scan action in the environment\n\n    Inherits from the base Action Class.\n    \"\"\"\n\n    def __init__(self,\n                 target,\n                 cost,\n                 prob=1.0,\n                 req_access=AccessLevel.USER,\n                 **kwargs):\n        \"\"\"\n        Parameters\n        ---------\n        target : (int, int)\n            address of target\n        cost : float\n            cost of performing action\n        prob : float, optional\n            probability of success for a given action (default=1.0)\n        req_access : AccessLevel, optional\n            the required access level to perform action\n            (default=AccessLevel.USER)\n        \"\"\"\n        super().__init__(\"subnet_scan\",\n                         target=target,\n                         cost=cost,\n                         prob=prob,\n                         req_access=req_access,\n                         **kwargs)\n\n\nclass ProcessScan(Action):\n    \"\"\"A Process Scan action in the environment\n\n    Inherits from the base Action Class.\n    \"\"\"\n\n    def __init__(self,\n                 target,\n                 cost,\n                 prob=1.0,\n                 req_access=AccessLevel.USER,\n                 **kwargs):\n        \"\"\"\n        
Parameters\n        ---------\n        target : (int, int)\n            address of target\n        cost : float\n            cost of performing action\n        prob : float, optional\n            probability of success for a given action (default=1.0)\n        req_access : AccessLevel, optional\n            the required access level to perform action\n            (default=AccessLevel.USER)\n        \"\"\"\n        super().__init__(\"process_scan\",\n                         target=target,\n                         cost=cost,\n                         prob=prob,\n                         req_access=req_access,\n                         **kwargs)\n\n\nclass NoOp(Action):\n    \"\"\"A do nothing action in the environment\n\n    Inherits from the base Action Class\n    \"\"\"\n\n    def __init__(self, *args, **kwargs):\n        super().__init__(name=\"noop\",\n                         target=(1, 0),\n                         cost=0,\n                         prob=1.0,\n                         req_access=AccessLevel.NONE)\n\n\nclass ActionResult:\n    \"\"\"A dataclass for storing the results of an Action.\n\n    These results are then used to update the full state and observation.\n\n    ...\n\n    Attributes\n    ----------\n    success : bool\n        True if exploit/scan was successful, False otherwise\n    value : float\n        value gained from action. Is the value of the host if successfully\n        exploited, otherwise 0\n    services : dict\n        services identified by action.\n    os : dict\n        OS identified by action\n    processes : dict\n        processes identified by action\n    access : dict\n        access gained by action\n    discovered : dict\n        host addresses discovered by action\n    connection_error : bool\n        True if action failed due to connection error (e.g. could\n        not reach target)\n    permission_error : bool\n        True if action failed due to a permission error (e.g. 
incorrect access\n        level to perform action)\n    undefined_error : bool\n        True if action failed due to an undefined error (e.g. random exploit\n        failure)\n    newly_discovered : dict\n        host addresses discovered for the first time by action\n    \"\"\"\n\n    def __init__(self,\n                 success,\n                 value=0.0,\n                 services=None,\n                 os=None,\n                 processes=None,\n                 access=None,\n                 discovered=None,\n                 connection_error=False,\n                 permission_error=False,\n                 undefined_error=False,\n                 newly_discovered=None):\n        \"\"\"\n        Parameters\n        ----------\n        success : bool\n            True if exploit/scan was successful, False otherwise\n        value : float, optional\n            value gained from action (default=0.0)\n        services : dict, optional\n            services identified by action (default=None={})\n        os : dict, optional\n            OS identified by action (default=None={})\n        processes : dict, optional\n            processes identified by action (default=None={})\n        access : dict, optional\n            access gained by action (default=None={})\n        discovered : dict, optional\n            host addresses discovered by action (default=None={})\n        connection_error : bool, optional\n            True if action failed due to connection error (default=False)\n        permission_error : bool, optional\n            True if action failed due to a permission error (default=False)\n        undefined_error : bool, optional\n            True if action failed due to an undefined error (default=False)\n        newly_discovered : dict, optional\n            host addresses discovered for first time by action (default=None)\n        \"\"\"\n        self.success = success\n        self.value = value\n        self.services = {} if services is None else 
services\n        self.os = {} if os is None else os\n        self.processes = {} if processes is None else processes\n        self.access = {} if access is None else access\n        self.discovered = {} if discovered is None else discovered\n        self.connection_error = connection_error\n        self.permission_error = permission_error\n        self.undefined_error = undefined_error\n        if newly_discovered is not None:\n            self.newly_discovered = newly_discovered\n        else:\n            self.newly_discovered = {}\n\n    def info(self):\n        \"\"\"Get results as dict\n\n        Returns\n        -------\n        dict\n            action results information\n        \"\"\"\n        return dict(\n            success=self.success,\n            value=self.value,\n            services=self.services,\n            os=self.os,\n            processes=self.processes,\n            access=self.access,\n            discovered=self.discovered,\n            connection_error=self.connection_error,\n            permission_error=self.permission_error,\n            undefined_error=self.undefined_error,\n            newly_discovered=self.newly_discovered\n        )\n\n    def __str__(self):\n        output = [\"ActionResult:\"]\n        for k, val in self.info().items():\n            output.append(f\"  {k}={val}\")\n        return \"\\n\".join(output)\n\n\nclass FlatActionSpace(spaces.Discrete):\n    \"\"\"Flat Action space for NASim environment.\n\n    Inherits and implements the gym.spaces.Discrete action space\n\n    ...\n\n    Attributes\n    ----------\n    n : int\n        the number of actions in the action space\n    actions : list of Actions\n        the list of the Actions in the action space\n    \"\"\"\n\n    def __init__(self, scenario):\n        \"\"\"\n        Parameters\n        ---------\n        scenario : Scenario\n            scenario description\n        \"\"\"\n        self.actions = load_action_list(scenario)\n        
super().__init__(len(self.actions))\n\n    def get_action(self, action_idx):\n        \"\"\"Get Action object corresponding to action idx\n\n        Parameters\n        ----------\n        action_idx : int\n            the action idx\n\n        Returns\n        -------\n        Action\n            Corresponding Action object\n        \"\"\"\n        assert isinstance(action_idx, int), \\\n            (\"When using flat action space, action must be an integer\"\n             f\" or an Action object: {action_idx} is invalid\")\n        return self.actions[action_idx]\n\n\nclass ParameterisedActionSpace(spaces.MultiDiscrete):\n    \"\"\"A parameterised action space for NASim environment.\n\n    Inherits and implements the gym.spaces.MultiDiscrete action space, where\n    each dimension corresponds to a different action parameter.\n\n    The action parameters (in order) are:\n\n    0. Action Type = [0, 5]\n\n       Where:\n\n         0=Exploit,\n\n         1=PrivilegeEscalation,\n\n         2=ServiceScan,\n\n         3=OSScan,\n\n         4=SubnetScan,\n\n         5=ProcessScan,\n\n    1. Subnet = [0, #subnets-1]\n\n       -1 since we don't include the internet subnet\n\n    2. Host = [0, max subnets size-1]\n    3. OS = [0, #OS]\n\n       Where 0=None.\n\n    4. Service = [0, #services - 1]\n    5. 
Process = [0, #processes]\n\n       Where 0=None.\n\n    Note that OS, Service and Process are only important for exploits and\n    privilege escalation actions.\n\n    ...\n\n    Attributes\n    ----------\n    nvec : Numpy.Array\n        vector of the size of each parameter\n    actions : list of Actions\n        the list of all the Actions in the action space\n    \"\"\"\n\n    action_types = [\n        Exploit,\n        PrivilegeEscalation,\n        ServiceScan,\n        OSScan,\n        SubnetScan,\n        ProcessScan\n    ]\n\n    def __init__(self, scenario):\n        \"\"\"\n        Parameters\n        ----------\n        scenario : Scenario\n            scenario description\n        \"\"\"\n        self.scenario = scenario\n        self.actions = load_action_list(scenario)\n\n        nvec = [\n            len(self.action_types),\n            len(self.scenario.subnets)-1,\n            max(self.scenario.subnets),\n            self.scenario.num_os+1,\n            self.scenario.num_services,\n            self.scenario.num_processes\n        ]\n\n        super().__init__(nvec)\n\n    def get_action(self, action_vec):\n        \"\"\"Get Action object corresponding to action vector.\n\n        Parameters\n        ----------\n        action_vec : list of ints or tuple of ints or Numpy.Array\n            the action vector\n\n        Returns\n        -------\n        Action\n            Corresponding Action object\n\n        Notes\n        -----\n        1. if host# specified in action vector is greater than\n           the number of hosts in the specified subnet, then host#\n           will be changed to host# % subnet size.\n        2. 
if action is an exploit and parameters do not match\n           any exploit definition in the scenario description then\n           a NoOp action is returned with 0 cost.\n        \"\"\"\n        assert isinstance(action_vec, (list, tuple, np.ndarray)), \\\n            (\"When using parameterised action space, action must be an Action\"\n             f\" object, a list or a numpy array: {action_vec} is invalid\")\n        a_class = self.action_types[action_vec[0]]\n        # need to add one to subnet to account for Internet subnet\n        subnet = action_vec[1]+1\n        host = action_vec[2] % self.scenario.subnets[subnet]\n\n        target = (subnet, host)\n\n        if a_class not in (Exploit, PrivilegeEscalation):\n            # can ignore other action parameters\n            kwargs = self._get_scan_action_def(a_class)\n            return a_class(target=target, **kwargs)\n\n        os = None if action_vec[3] == 0 else self.scenario.os[action_vec[3]-1]\n\n        if a_class == Exploit:\n            # have to make sure it is valid choice\n            # and also get constant params (name, cost, prob, access)\n            service = self.scenario.services[action_vec[4]]\n            a_def = self._get_exploit_def(service, os)\n        else:\n            # privilege escalation\n            # have to make sure it is valid choice\n            # and also get constant params (name, cost, prob, access)\n            proc = self.scenario.processes[action_vec[5]]\n            a_def = self._get_privesc_def(proc, os)\n\n        if a_def is None:\n            return NoOp()\n        return a_class(target=target, **a_def)\n\n    def _get_scan_action_def(self, a_class):\n        \"\"\"Get the constants for scan actions definitions \"\"\"\n        if a_class == ServiceScan:\n            cost = self.scenario.service_scan_cost\n        elif a_class == OSScan:\n            cost = self.scenario.os_scan_cost\n        elif a_class == SubnetScan:\n            cost = 
self.scenario.subnet_scan_cost\n        elif a_class == ProcessScan:\n            cost = self.scenario.process_scan_cost\n        else:\n            raise TypeError(f\"Not implemented for Action class {a_class}\")\n        return {\"cost\": cost}\n\n    def _get_exploit_def(self, service, os):\n        \"\"\"Check if exploit parameters are valid \"\"\"\n        e_map = self.scenario.exploit_map\n        if service not in e_map:\n            return None\n        if os not in e_map[service]:\n            return None\n        return e_map[service][os]\n\n    def _get_privesc_def(self, proc, os):\n        \"\"\"Check if privilege escalation parameters are valid \"\"\"\n        pe_map = self.scenario.privesc_map\n        if proc not in pe_map:\n            return None\n        if os not in pe_map[proc]:\n            return None\n        return pe_map[proc][os]\n"
  },
  {
    "path": "nasim/envs/environment.py",
    "content": "\"\"\" The main Environment class for NASim: NASimEnv.\n\nThe NASimEnv class is the main interface for agents interacting with NASim.\n\"\"\"\nimport gymnasium as gym\nfrom gymnasium import spaces\nimport numpy as np\n\nfrom nasim.envs.state import State\nfrom nasim.envs.render import Viewer\nfrom nasim.envs.network import Network\nfrom nasim.envs.observation import Observation\nfrom nasim.envs.action import Action, FlatActionSpace, ParameterisedActionSpace\n\n\nclass NASimEnv(gym.Env):\n    \"\"\" A simulated computer network environment for pen-testing.\n\n    Implements the gymnasium interface.\n\n    ...\n\n    Attributes\n    ----------\n    name : str\n        the environment scenario name\n    scenario : Scenario\n        Scenario object, defining the properties of the environment\n    action_space : FlatActionSpace or ParameterisedActionSpace\n        Action space for environment.\n        If *flat_action=True* then this is a discrete action space (which\n        subclasses gymnasium.spaces.Discrete), so each action is represented by an\n        integer.\n        If *flat_action=False* then this is a parameterised action space (which\n        subclasses gymnasium.spaces.MultiDiscrete), so each action is represented\n        using a list of parameters.\n    observation_space : gymnasium.spaces.Box\n        observation space for environment.\n        If *flat_obs=True* then observations are represented by a 1D vector,\n        otherwise observations are represented as a 2D matrix.\n    current_state : State\n        the current state of the environment\n    last_obs : Observation\n        the last observation that was generated by environment\n    steps : int\n        the number of steps performed since last reset (this does not include\n        generative steps)\n\n    \"\"\"\n    metadata = {'render_modes': [\"human\", \"ansi\"]}\n    render_mode = None\n    reward_range = (-float('inf'), float('inf'))\n\n    action_space = None\n    
observation_space = None\n    current_state = None\n    last_obs = None\n\n    def __init__(self,\n                 scenario,\n                 fully_obs=False,\n                 flat_actions=True,\n                 flat_obs=True,\n                 render_mode=None):\n        \"\"\"\n        Parameters\n        ----------\n        scenario : Scenario\n            Scenario object, defining the properties of the environment\n        fully_obs : bool, optional\n            The observability mode of environment, if True then uses fully\n            observable mode, otherwise is partially observable (default=False)\n        flat_actions : bool, optional\n            If true then uses a flat action space, otherwise will use a\n            parameterised action space (default=True).\n        flat_obs : bool, optional\n            If true then uses a 1D observation space, otherwise uses a 2D\n            observation space (default=True)\n        render_mode : str, optional\n            The render mode to use for the environment.\n        \"\"\"\n        self.name = scenario.name\n        self.scenario = scenario\n        self.fully_obs = fully_obs\n        self.flat_actions = flat_actions\n        self.flat_obs = flat_obs\n        self.render_mode = render_mode\n\n        self.network = Network(scenario)\n        self.current_state = State.generate_initial_state(self.network)\n        self._renderer = None\n        self.reset()\n\n        if self.flat_actions:\n            self.action_space = FlatActionSpace(self.scenario)\n        else:\n            self.action_space = ParameterisedActionSpace(self.scenario)\n\n        if self.flat_obs:\n            obs_shape = self.last_obs.shape_flat()\n        else:\n            obs_shape = self.last_obs.shape()\n        obs_low, obs_high = Observation.get_space_bounds(self.scenario)\n        self.observation_space = spaces.Box(\n            low=obs_low, high=obs_high, shape=obs_shape\n        )\n\n        self.steps = 0\n\n    def 
reset(self, *, seed=None, options=None):\n        \"\"\"Reset the state of the environment and return the initial observation.\n\n        Implements gymnasium.Env.reset().\n\n        Parameters\n        ----------\n        seed : int, optional\n            the optional seed for the environment's RNG\n        options : dict, optional\n            optional environment options (does nothing in NASim at the moment)\n\n        Returns\n        -------\n        numpy.Array\n            the initial observation of the environment\n        dict\n            auxiliary information regarding reset\n        \"\"\"\n        super().reset(seed=seed, options=options)\n        self.steps = 0\n        self.current_state = self.network.reset(self.current_state)\n        self.last_obs = self.current_state.get_initial_observation(\n            self.fully_obs\n        )\n\n        if self.flat_obs:\n            obs = self.last_obs.numpy_flat()\n        else:\n            obs = self.last_obs.numpy()\n\n        return obs, {}\n\n    def step(self, action):\n        \"\"\"Run one step of the environment using action.\n\n        Implements gymnasium.Env.step().\n\n        Parameters\n        ----------\n        action : Action or int or list or NumpyArray\n            Action to perform. If not Action object, then if using\n            flat actions this should be an int and if using non-flat actions\n            this should be an indexable array.\n\n        Returns\n        -------\n        numpy.Array\n            observation from performing action\n        float\n            reward from performing action\n        bool\n            whether the episode reached a terminal state or not (i.e. 
all\n            target machines have been successfully compromised)\n        bool\n            whether the episode has reached the step limit (if one exists)\n        dict\n            auxiliary information regarding step\n            (see :func:`nasim.env.action.ActionResult.info`)\n        \"\"\"\n        next_state, obs, reward, done, info = self.generative_step(\n            self.current_state,\n            action\n        )\n        self.current_state = next_state\n        self.last_obs = obs\n\n        if self.flat_obs:\n            obs = obs.numpy_flat()\n        else:\n            obs = obs.numpy()\n\n        self.steps += 1\n\n        step_limit_reached = (\n            self.scenario.step_limit is not None\n            and self.steps >= self.scenario.step_limit\n        )\n\n        return obs, reward, done, step_limit_reached, info\n\n    def generative_step(self, state, action):\n        \"\"\"Run one step of the environment using action in given state.\n\n        Parameters\n        ----------\n        state : State\n            The state to perform the action in\n        action : Action, int, list, NumpyArray\n            Action to perform. 
If not Action object, then if using\n            flat actions this should be an int and if using non-flat actions\n            this should be an indexable array.\n\n        Returns\n        -------\n        State\n            the next state after action was performed\n        Observation\n            observation from performing action\n        float\n            reward from performing action\n        bool\n            whether a terminal state has been reached or not\n        dict\n            auxiliary information regarding step\n            (see :func:`nasim.env.action.ActionResult.info`)\n        \"\"\"\n        if not isinstance(action, Action):\n            action = self.action_space.get_action(action)\n\n        next_state, action_obs = self.network.perform_action(\n            state, action\n        )\n        obs = next_state.get_observation(\n            action, action_obs, self.fully_obs\n        )\n        done = self.goal_reached(next_state)\n        reward = action_obs.value - action.cost\n        return next_state, obs, reward, done, action_obs.info()\n\n    def generate_random_initial_state(self):\n        \"\"\"Generates a random initial state for environment.\n\n        This only randomizes the host configurations (os, services)\n        using a uniform distribution, so may result in networks where\n        it is not possible to reach the goal.\n\n        Returns\n        -------\n        State\n            A random initial state\n        \"\"\"\n        return State.generate_random_initial_state(self.network)\n\n    def generate_initial_state(self):\n        \"\"\"Generate the initial state for the environment.\n\n        Returns\n        -------\n        State\n            The initial state\n\n        Notes\n        -----\n        This does not reset the current state of the environment (use\n        :func:`reset` for that).\n        \"\"\"\n        return State.generate_initial_state(self.network)\n\n    def render(self):\n        \"\"\"Render 
environment.\n\n        Implements gymnasium.Env.render().\n\n        See render module for more details on modes and symbols.\n\n        \"\"\"\n        if self.render_mode is None:\n            return\n        return self.render_obs(mode=self.render_mode, obs=self.last_obs)\n\n    def render_obs(self, mode=\"human\", obs=None):\n        \"\"\"Render observation.\n\n        See render module for more details on modes and symbols.\n\n        Parameters\n        ----------\n        mode : str\n            rendering mode\n        obs : Observation or numpy.ndarray, optional\n            the observation to render, if None will render last observation.\n            If numpy.ndarray it must be in format that matches Observation\n            (i.e. ndarray returned by step method) (default=None)\n        \"\"\"\n        if mode is None:\n            return\n\n        if obs is None:\n            obs = self.last_obs\n\n        if not isinstance(obs, Observation):\n            obs = Observation.from_numpy(obs, self.current_state.shape())\n\n        if self._renderer is None:\n            self._renderer = Viewer(self.network)\n\n        if mode in (\"human\", \"ansi\"):\n            return self._renderer.render_readable(obs)\n        else:\n            raise NotImplementedError(\n                \"Please choose correct render mode from :\"\n                f\"{self.metadata['render_modes']}\"\n            )\n\n    def render_state(self, mode=\"human\", state=None):\n        \"\"\"Render state.\n\n        See render module for more details on modes and symbols.\n\n        If mode = \"ansi\":\n            Machines displayed in rows, with one row for each subnet and\n            hosts displayed in order of id within subnet\n\n        Parameters\n        ----------\n        mode : str\n            rendering mode\n        state : State or numpy.ndarray, optional\n            the State to render, if None will render current state\n            If numpy.ndarray it must be in format that 
matches State\n            (i.e. ndarray returned by generative_step method) (default=None)\n        \"\"\"\n        if mode is None:\n            return\n\n        if state is None:\n            state = self.current_state\n\n        if not isinstance(state, State):\n            state = State.from_numpy(state,\n                                     self.current_state.shape(),\n                                     self.current_state.host_num_map)\n\n        if self._renderer is None:\n            self._renderer = Viewer(self.network)\n\n        if mode in (\"human\", \"ansi\"):\n            return self._renderer.render_readable_state(state)\n        else:\n            raise NotImplementedError(\n                \"Please choose correct render mode from : \"\n                f\"{self.metadata['render_modes']}\"\n            )\n\n    def render_action(self, action):\n        \"\"\"Renders human readable version of action.\n\n        This is mainly useful for getting a text description of the action\n        that corresponds to a given integer.\n\n        Parameters\n        ----------\n        action : Action or int or list or NumpyArray\n            Action to render. 
If not Action object, then if using\n            flat actions this should be an int and if using non-flat actions\n            this should be an indexable array.\n        \"\"\"\n        if not isinstance(action, Action):\n            action = self.action_space.get_action(action)\n        print(action)\n\n    def render_episode(self, episode, width=7, height=7):\n        \"\"\"Render an episode as sequence of network graphs, where an episode\n        is a sequence of (state, action, reward, done) tuples generated from\n        interactions with environment.\n\n        Parameters\n        ----------\n        episode : list\n            list of (State, Action, reward, done) tuples\n        width : int\n            width of GUI window\n        height : int\n            height of GUI window\n        \"\"\"\n        if self._renderer is None:\n            self._renderer = Viewer(self.network)\n        self._renderer.render_episode(episode, width, height)\n\n    def render_network_graph(self, ax=None, show=False):\n        \"\"\"Render a plot of network as a graph with hosts as nodes arranged\n        into subnets and showing connections between subnets. 
Renders current\n        state of network.\n\n        Parameters\n        ----------\n        ax : Axes\n            matplotlib axis to plot graph on, or None to plot on new axis\n        show : bool\n            whether to display plot, or simply setup plot and showing plot\n            can be handled elsewhere by user\n        \"\"\"\n        if self._renderer is None:\n            self._renderer = Viewer(self.network)\n        state = self.current_state\n        self._renderer.render_graph(state, ax, show)\n\n    def get_minimum_hops(self):\n        \"\"\"Get the minimum number of network hops required to reach targets.\n\n        That is, the minimum number of hosts that must be traversed in the\n        network in order to reach all sensitive hosts on the network starting\n        from the initial state\n\n        Returns\n        -------\n        int\n            minimum possible number of network hops to reach target hosts\n        \"\"\"\n        return self.network.get_minimal_hops()\n\n    def get_action_mask(self):\n        \"\"\"Get a vector mask for valid actions.\n\n        Returns\n        -------\n        ndarray\n            numpy vector of 1's and 0's, one for each action. 
Where an\n            index will be 1 if action is valid given current state, or\n            0 if action is invalid.\n        \"\"\"\n        assert isinstance(self.action_space, FlatActionSpace), \\\n            \"Can only use action mask function when using flat action space\"\n        mask = np.zeros(self.action_space.n, dtype=np.int64)\n        for a_idx in range(self.action_space.n):\n            action = self.action_space.get_action(a_idx)\n            if self.network.host_discovered(action.target):\n                mask[a_idx] = 1\n        return mask\n\n    def get_score_upper_bound(self):\n        \"\"\"Get the theoretical upper bound for total reward for scenario.\n\n        The theoretical upper bound score is where the agent exploits only a\n        single host in each subnet that is required to reach sensitive hosts\n        along the shortest path in network graph, and exploits all the\n        sensitive hosts (i.e. the minimum network hops). This assumes an\n        action cost of 1 and that each sensitive host is exploitable from any\n        other connected subnet (which may not be true, hence being an upper\n        bound).\n\n        Returns\n        -------\n        float\n            theoretical max score\n        \"\"\"\n        max_reward = self.network.get_total_sensitive_host_value()\n        max_reward += self.network.get_total_discovery_value()\n        max_reward -= self.network.get_minimal_hops()\n        return max_reward\n\n    def goal_reached(self, state=None):\n        \"\"\"Check if the state is the goal state.\n\n        The goal state is when all sensitive hosts have been compromised.\n\n        Parameters\n        ----------\n        state : State, optional\n            a state, if None will use current_state of environment\n            (default=None)\n\n        Returns\n        -------\n        bool\n            True if state is goal state, otherwise False.\n        \"\"\"\n        if state is None:\n            state = self.current_state\n       
 return self.network.all_sensitive_hosts_compromised(state)\n\n    def __str__(self):\n        output = [\n            \"NASimEnv:\",\n            f\"name={self.name}\",\n            f\"fully_obs={self.fully_obs}\",\n            f\"flat_actions={self.flat_actions}\",\n            f\"flat_obs={self.flat_obs}\"\n        ]\n        return \"\\n  \".join(output)\n\n    def close(self):\n        if self._renderer is not None:\n            self._renderer.close()\n            self._renderer = None\n"
  },
  {
    "path": "nasim/envs/gym_env.py",
    "content": "from nasim.envs.environment import NASimEnv\nfrom nasim.scenarios import Scenario, make_benchmark_scenario\n\n\nclass NASimGymEnv(NASimEnv):\n    \"\"\"A wrapper around the NASimEnv compatible with gymnasium.make()\n\n    See nasim.NASimEnv for details.\n    \"\"\"\n\n    def __init__(self,\n                 scenario,\n                 fully_obs=False,\n                 flat_actions=True,\n                 flat_obs=True,\n                 render_mode=None):\n        \"\"\"\n        Parameters\n        ----------\n        scenario : str or nasim.scenarios.Scenario\n            either the name of benchmark environment (str) or a nasim Scenario\n            instance\n        fully_obs : bool, optional\n            the observability mode of environment, if True then uses fully\n            observable mode, otherwise partially observable (default=False)\n        flat_actions : bool, optional\n            if true then uses a flat action space, otherwise will use\n            parameterised action space (default=True).\n        flat_obs : bool, optional\n            if true then uses a 1D observation space. If False\n            will use a 2D observation space (default=True)\n        render_mode : str, optional\n            The render mode to use for the environment.\n        \"\"\"\n        if not isinstance(scenario, Scenario):\n            scenario = make_benchmark_scenario(scenario)\n        super().__init__(scenario,\n                         fully_obs=fully_obs,\n                         flat_actions=flat_actions,\n                         flat_obs=flat_obs,\n                         render_mode=render_mode)\n"
  },
  {
    "path": "nasim/envs/host_vector.py",
    "content": "\"\"\" This module contains the HostVector class.\n\nThis is the main class for storing and updating the state of a single host\nin the NASim environment.\n\"\"\"\n\nimport numpy as np\n\nfrom nasim.envs.utils import AccessLevel\nfrom nasim.envs.action import ActionResult\n\n\nclass HostVector:\n    \"\"\" A Vector representation of a single host in NASim.\n\n    Each host is represented as a vector (1D numpy array) for efficiency and to\n    make it easier to use with deep learning agents. The vector is made up of\n    multiple features arranged in a consistent way.\n\n    Features in the vector, listed in order, are:\n\n    1. subnet address - one-hot encoding with length equal to the number\n                        of subnets\n    2. host address - one-hot encoding with length equal to the maximum number\n                      of hosts in any subnet\n    3. compromised - bool\n    4. reachable - bool\n    5. discovered - bool\n    6. value - float\n    7. discovery value - float\n    8. access - int\n    9. OS - bool for each OS in scenario (only one OS has value of true)\n    10. services running - bool for each service in scenario\n    11. 
processes running - bool for each process in scenario\n\n    Notes\n    -----\n    - The size of the vector is equal to:\n\n        #subnets + max #hosts in any subnet + 6 + #OS + #services + #processes.\n\n    - Where the +6 is for compromised, reachable, discovered, value,\n      discovery_value, and access features\n    - The vector is a float vector so True/False is actually represented as\n      1.0/0.0.\n\n    \"\"\"\n\n    # class properties that are the same for all hosts\n    # these are set when calling vectorize method\n    # the bounds on address space (used for one hot encoding of host address)\n    address_space_bounds = None\n    # number of OS in scenario\n    num_os = None\n    # map from OS name to its index in host vector\n    os_idx_map = {}\n    # number of services in scenario\n    num_services = None\n    # map from service name to its index in host vector\n    service_idx_map = {}\n    # number of processes in scenario\n    num_processes = None\n    # map from process name to its index in host vector\n    process_idx_map = {}\n    # size of state for host vector (i.e. 
len of vector)\n    state_size = None\n\n    # vector position constants\n    # to be initialized\n    _subnet_address_idx = 0\n    _host_address_idx = None\n    _compromised_idx = None\n    _reachable_idx = None\n    _discovered_idx = None\n    _value_idx = None\n    _discovery_value_idx = None\n    _access_idx = None\n    _os_start_idx = None\n    _service_start_idx = None\n    _process_start_idx = None\n\n    def __init__(self, vector):\n        self.vector = vector\n\n    @classmethod\n    def vectorize(cls, host, address_space_bounds, vector=None):\n        if cls.address_space_bounds is None:\n            cls._initialize(\n                address_space_bounds, host.services, host.os, host.processes\n            )\n\n        if vector is None:\n            vector = np.zeros(cls.state_size, dtype=np.float32)\n        else:\n            assert len(vector) == cls.state_size\n\n        vector[cls._subnet_address_idx + host.address[0]] = 1\n        vector[cls._host_address_idx + host.address[1]] = 1\n        vector[cls._compromised_idx] = int(host.compromised)\n        vector[cls._reachable_idx] = int(host.reachable)\n        vector[cls._discovered_idx] = int(host.discovered)\n        vector[cls._value_idx] = host.value\n        vector[cls._discovery_value_idx] = host.discovery_value\n        vector[cls._access_idx] = host.access\n        for os_num, (os_key, os_val) in enumerate(host.os.items()):\n            vector[cls._get_os_idx(os_num)] = int(os_val)\n        for srv_num, (srv_key, srv_val) in enumerate(host.services.items()):\n            vector[cls._get_service_idx(srv_num)] = int(srv_val)\n        host_procs = host.processes.items()\n        for proc_num, (proc_key, proc_val) in enumerate(host_procs):\n            vector[cls._get_process_idx(proc_num)] = int(proc_val)\n        return cls(vector)\n\n    @classmethod\n    def vectorize_random(cls, host, address_space_bounds, vector=None):\n        hvec = cls.vectorize(host, address_space_bounds, vector)\n        # random 
variables\n        for srv_num in cls.service_idx_map.values():\n            srv_val = np.random.randint(0, 2)\n            hvec.vector[cls._get_service_idx(srv_num)] = srv_val\n\n        chosen_os = np.random.choice(list(cls.os_idx_map.values()))\n        for os_num in cls.os_idx_map.values():\n            hvec.vector[cls._get_os_idx(os_num)] = int(os_num == chosen_os)\n\n        for proc_num in cls.process_idx_map.values():\n            proc_val = np.random.randint(0, 2)\n            hvec.vector[cls._get_process_idx(proc_num)] = proc_val\n        return hvec\n\n    @property\n    def compromised(self):\n        return self.vector[self._compromised_idx]\n\n    @compromised.setter\n    def compromised(self, val):\n        self.vector[self._compromised_idx] = int(val)\n\n    @property\n    def discovered(self):\n        return self.vector[self._discovered_idx]\n\n    @discovered.setter\n    def discovered(self, val):\n        self.vector[self._discovered_idx] = int(val)\n\n    @property\n    def reachable(self):\n        return self.vector[self._reachable_idx]\n\n    @reachable.setter\n    def reachable(self, val):\n        self.vector[self._reachable_idx] = int(val)\n\n    @property\n    def address(self):\n        return (\n            self.vector[self._subnet_address_idx_slice()].argmax(),\n            self.vector[self._host_address_idx_slice()].argmax()\n        )\n\n    @property\n    def value(self):\n        return self.vector[self._value_idx]\n\n    @property\n    def discovery_value(self):\n        return self.vector[self._discovery_value_idx]\n\n    @property\n    def access(self):\n        return self.vector[self._access_idx]\n\n    @access.setter\n    def access(self, val):\n        self.vector[self._access_idx] = int(val)\n\n    @property\n    def services(self):\n        services = {}\n        for srv, srv_num in self.service_idx_map.items():\n            services[srv] = self.vector[self._get_service_idx(srv_num)]\n        return services\n\n    
@property\n    def os(self):\n        os = {}\n        for os_key, os_num in self.os_idx_map.items():\n            os[os_key] = self.vector[self._get_os_idx(os_num)]\n        return os\n\n    @property\n    def processes(self):\n        processes = {}\n        for proc, proc_num in self.process_idx_map.items():\n            processes[proc] = self.vector[self._get_process_idx(proc_num)]\n        return processes\n\n    def is_running_service(self, srv):\n        srv_num = self.service_idx_map[srv]\n        return bool(self.vector[self._get_service_idx(srv_num)])\n\n    def is_running_os(self, os):\n        os_num = self.os_idx_map[os]\n        return bool(self.vector[self._get_os_idx(os_num)])\n\n    def is_running_process(self, proc):\n        proc_num = self.process_idx_map[proc]\n        return bool(self.vector[self._get_process_idx(proc_num)])\n\n    def perform_action(self, action):\n        \"\"\"Perform the given action against this host\n\n        Arguments\n        ---------\n        action : Action\n            the action to perform\n\n        Returns\n        -------\n        HostVector\n            the resulting state of the host after the action\n        ActionResult\n            the result of the action\n        \"\"\"\n        next_state = self.copy()\n        if action.is_service_scan():\n            result = ActionResult(True, 0, services=self.services)\n            return next_state, result\n\n        if action.is_os_scan():\n            return next_state, ActionResult(True, 0, os=self.os)\n\n        if action.is_exploit():\n            if self.is_running_service(action.service) and \\\n               (action.os is None or self.is_running_os(action.os)):\n                # service and os are present so exploit is successful\n                value = 0\n                next_state.compromised = True\n                if not self.access == AccessLevel.ROOT:\n                    # ensure a machine is not rewarded twice\n                    # and access doesn't 
decrease\n                    next_state.access = action.access\n                    if action.access == AccessLevel.ROOT:\n                        value = self.value\n\n                result = ActionResult(\n                    True,\n                    value=value,\n                    services=self.services,\n                    os=self.os,\n                    access=action.access\n                )\n                return next_state, result\n\n        # following actions are on host so require correct access\n        if not (self.compromised and action.req_access <= self.access):\n            result = ActionResult(False, 0, permission_error=True)\n            return next_state, result\n\n        if action.is_process_scan():\n            result = ActionResult(\n                True, 0, access=self.access, processes=self.processes\n            )\n            return next_state, result\n\n        if action.is_privilege_escalation():\n            has_proc = (\n                action.process is None\n                or self.is_running_process(action.process)\n            )\n            has_os = (\n                action.os is None or self.is_running_os(action.os)\n            )\n            if has_proc and has_os:\n                # host compromised and proc and os are present\n                # so privesc is successful\n                value = 0.0\n                if not self.access == AccessLevel.ROOT:\n                    # ensure a machine is not rewarded twice\n                    # and access doesn't decrease\n                    next_state.access = action.access\n                    if action.access == AccessLevel.ROOT:\n                        value = self.value\n                result = ActionResult(\n                    True,\n                    value=value,\n                    processes=self.processes,\n                    os=self.os,\n                    access=action.access\n                )\n                return next_state, result\n\n        # 
action failed due to host config not meeting preconditions\n        return next_state, ActionResult(False, 0)\n\n    def observe(self,\n                address=False,\n                compromised=False,\n                reachable=False,\n                discovered=False,\n                access=False,\n                value=False,\n                discovery_value=False,\n                services=False,\n                processes=False,\n                os=False):\n        obs = np.zeros(self.state_size, dtype=np.float32)\n        if address:\n            subnet_slice = self._subnet_address_idx_slice()\n            host_slice = self._host_address_idx_slice()\n            obs[subnet_slice] = self.vector[subnet_slice]\n            obs[host_slice] = self.vector[host_slice]\n        if compromised:\n            obs[self._compromised_idx] = self.vector[self._compromised_idx]\n        if reachable:\n            obs[self._reachable_idx] = self.vector[self._reachable_idx]\n        if discovered:\n            obs[self._discovered_idx] = self.vector[self._discovered_idx]\n        if value:\n            obs[self._value_idx] = self.vector[self._value_idx]\n        if discovery_value:\n            v = self.vector[self._discovery_value_idx]\n            obs[self._discovery_value_idx] = v\n        if access:\n            obs[self._access_idx] = self.vector[self._access_idx]\n        if os:\n            idxs = self._os_idx_slice()\n            obs[idxs] = self.vector[idxs]\n        if services:\n            idxs = self._service_idx_slice()\n            obs[idxs] = self.vector[idxs]\n        if processes:\n            idxs = self._process_idx_slice()\n            obs[idxs] = self.vector[idxs]\n        return obs\n\n    def readable(self):\n        return self.get_readable(self.vector)\n\n    def copy(self):\n        vector_copy = np.copy(self.vector)\n        return HostVector(vector_copy)\n\n    def numpy(self):\n        return self.vector\n\n    @classmethod\n    def 
_initialize(cls, address_space_bounds, services, os_info, processes):\n        cls.os_idx_map = {}\n        cls.service_idx_map = {}\n        cls.process_idx_map = {}\n        cls.address_space_bounds = address_space_bounds\n        cls.num_os = len(os_info)\n        cls.num_services = len(services)\n        cls.num_processes = len(processes)\n        cls._update_vector_idxs()\n        for os_num, (os_key, os_val) in enumerate(os_info.items()):\n            cls.os_idx_map[os_key] = os_num\n        for srv_num, (srv_key, srv_val) in enumerate(services.items()):\n            cls.service_idx_map[srv_key] = srv_num\n        for proc_num, (proc_key, proc_val) in enumerate(processes.items()):\n            cls.process_idx_map[proc_key] = proc_num\n\n    @classmethod\n    def _update_vector_idxs(cls):\n        cls._subnet_address_idx = 0\n        cls._host_address_idx = cls.address_space_bounds[0]\n        cls._compromised_idx = (\n            cls._host_address_idx + cls.address_space_bounds[1]\n        )\n        cls._reachable_idx = cls._compromised_idx + 1\n        cls._discovered_idx = cls._reachable_idx + 1\n        cls._value_idx = cls._discovered_idx + 1\n        cls._discovery_value_idx = cls._value_idx + 1\n        cls._access_idx = cls._discovery_value_idx + 1\n        cls._os_start_idx = cls._access_idx + 1\n        cls._service_start_idx = cls._os_start_idx + cls.num_os\n        cls._process_start_idx = cls._service_start_idx + cls.num_services\n        cls.state_size = cls._process_start_idx + cls.num_processes\n\n    @classmethod\n    def _subnet_address_idx_slice(cls):\n        return slice(cls._subnet_address_idx, cls._host_address_idx)\n\n    @classmethod\n    def _host_address_idx_slice(cls):\n        return slice(cls._host_address_idx, cls._compromised_idx)\n\n    @classmethod\n    def _get_service_idx(cls, srv_num):\n        return cls._service_start_idx+srv_num\n\n    @classmethod\n    def _service_idx_slice(cls):\n        return 
slice(cls._service_start_idx, cls._process_start_idx)\n\n    @classmethod\n    def _get_os_idx(cls, os_num):\n        return cls._os_start_idx+os_num\n\n    @classmethod\n    def _os_idx_slice(cls):\n        return slice(cls._os_start_idx, cls._service_start_idx)\n\n    @classmethod\n    def _get_process_idx(cls, proc_num):\n        return cls._process_start_idx+proc_num\n\n    @classmethod\n    def _process_idx_slice(cls):\n        return slice(cls._process_start_idx, cls.state_size)\n\n    @classmethod\n    def get_readable(cls, vector):\n        readable_dict = dict()\n        hvec = cls(vector)\n        readable_dict[\"Address\"] = hvec.address\n        readable_dict[\"Compromised\"] = bool(hvec.compromised)\n        readable_dict[\"Reachable\"] = bool(hvec.reachable)\n        readable_dict[\"Discovered\"] = bool(hvec.discovered)\n        readable_dict[\"Value\"] = hvec.value\n        readable_dict[\"Discovery Value\"] = hvec.discovery_value\n        readable_dict[\"Access\"] = hvec.access\n        for os_name in cls.os_idx_map:\n            readable_dict[f\"{os_name}\"] = hvec.is_running_os(os_name)\n        for srv_name in cls.service_idx_map:\n            readable_dict[f\"{srv_name}\"] = hvec.is_running_service(srv_name)\n        for proc_name in cls.process_idx_map:\n            readable_dict[f\"{proc_name}\"] = hvec.is_running_process(proc_name)\n\n        return readable_dict\n\n    @classmethod\n    def reset(cls):\n        \"\"\"Resets any class variables.\n\n        This is used to avoid errors when changing scenarios within a single\n        python session\n        \"\"\"\n        cls.address_space_bounds = None\n\n    def __repr__(self):\n        return f\"Host: {self.address}\"\n\n    def __hash__(self):\n        return hash(str(self.vector))\n\n    def __eq__(self, other):\n        if self is other:\n            return True\n        if not isinstance(other, HostVector):\n            return False\n        return np.array_equal(self.vector, 
other.vector)\n"
  },
  {
    "path": "nasim/envs/network.py",
    "content": "import numpy as np\n\nfrom nasim.envs.action import ActionResult\nfrom nasim.envs.utils import get_minimal_hops_to_goal, min_subnet_depth, AccessLevel\n\n# column in topology adjacency matrix that represents connection between\n# subnet and public\nINTERNET = 0\n\n\nclass Network:\n    \"\"\"A computer network \"\"\"\n\n    def __init__(self, scenario):\n        self.hosts = scenario.hosts\n        self.host_num_map = scenario.host_num_map\n        self.subnets = scenario.subnets\n        self.topology = scenario.topology\n        self.firewall = scenario.firewall\n        self.address_space = scenario.address_space\n        self.address_space_bounds = scenario.address_space_bounds\n        self.sensitive_addresses = scenario.sensitive_addresses\n        self.sensitive_hosts = scenario.sensitive_hosts\n\n    def reset(self, state):\n        \"\"\"Reset the network state to initial state \"\"\"\n        next_state = state.copy()\n        for host_addr in self.address_space:\n            host = next_state.get_host(host_addr)\n            host.compromised = False\n            host.access = AccessLevel.NONE\n            host.reachable = self.subnet_public(host_addr[0])\n            host.discovered = host.reachable\n        return next_state\n\n    def perform_action(self, state, action):\n        \"\"\"Perform the given Action against the network.\n\n        Arguments\n        ---------\n        state : State\n            the current state\n        action : Action\n            the action to perform\n\n        Returns\n        -------\n        State\n            the state after the action is performed\n        ActionObservation\n            the result from the action\n        \"\"\"\n        tgt_subnet, tgt_id = action.target\n        assert 0 < tgt_subnet < len(self.subnets)\n        assert tgt_id <= self.subnets[tgt_subnet]\n\n        next_state = state.copy()\n\n        if action.is_noop():\n            return next_state, ActionResult(True)\n\n        
if not state.host_reachable(action.target) \\\n           or not state.host_discovered(action.target):\n            result = ActionResult(False, 0.0, connection_error=True)\n            return next_state, result\n\n        has_req_permission = self.has_required_remote_permission(state, action)\n        if action.is_remote() and not has_req_permission:\n            result = ActionResult(False, 0.0, permission_error=True)\n            return next_state, result\n\n        if action.is_exploit() \\\n           and not self.traffic_permitted(\n                    state, action.target, action.service\n           ):\n            result = ActionResult(False, 0.0, connection_error=True)\n            return next_state, result\n\n        host_compromised = state.host_compromised(action.target)\n        if action.is_privilege_escalation() and not host_compromised:\n            result = ActionResult(False, 0.0, connection_error=True)\n            return next_state, result\n\n        if action.is_exploit() and host_compromised:\n            # host already compromised so exploits don't fail due to randomness\n            pass\n        elif np.random.rand() > action.prob:\n            return next_state, ActionResult(False, 0.0, undefined_error=True)\n\n        if action.is_subnet_scan():\n            return self._perform_subnet_scan(next_state, action)\n\n        t_host = state.get_host(action.target)\n        next_host_state, action_obs = t_host.perform_action(action)\n        next_state.update_host(action.target, next_host_state)\n        self._update(next_state, action, action_obs)\n        return next_state, action_obs\n\n    def _perform_subnet_scan(self, next_state, action):\n        if not next_state.host_compromised(action.target):\n            result = ActionResult(False, 0.0, connection_error=True)\n            return next_state, result\n\n        if not next_state.host_has_access(action.target, action.req_access):\n            result = ActionResult(False, 0.0, 
permission_error=True)\n            return next_state, result\n\n        discovered = {}\n        newly_discovered = {}\n        discovery_reward = 0\n        target_subnet = action.target[0]\n        for h_addr in self.address_space:\n            newly_discovered[h_addr] = False\n            discovered[h_addr] = False\n            if self.subnets_connected(target_subnet, h_addr[0]):\n                host = next_state.get_host(h_addr)\n                discovered[h_addr] = True\n                if not host.discovered:\n                    newly_discovered[h_addr] = True\n                    host.discovered = True\n                    discovery_reward += host.discovery_value\n\n        obs = ActionResult(\n            True,\n            discovery_reward,\n            discovered=discovered,\n            newly_discovered=newly_discovered\n        )\n        return next_state, obs\n\n    def _update(self, state, action, action_obs):\n        if action.is_exploit() and action_obs.success:\n            self._update_reachable(state, action.target)\n\n    def _update_reachable(self, state, compromised_addr):\n        \"\"\"Updates the reachable status of hosts on network, based on current\n        state and newly exploited host\n        \"\"\"\n        comp_subnet = compromised_addr[0]\n        for addr in self.address_space:\n            if state.host_reachable(addr):\n                continue\n            if self.subnets_connected(comp_subnet, addr[0]):\n                state.set_host_reachable(addr)\n\n    def get_sensitive_hosts(self):\n        return self.sensitive_addresses\n\n    def is_sensitive_host(self, host_address):\n        return host_address in self.sensitive_addresses\n\n    def subnets_connected(self, subnet_1, subnet_2):\n        return self.topology[subnet_1][subnet_2] == 1\n\n    def subnet_traffic_permitted(self, src_subnet, dest_subnet, service):\n        if src_subnet == dest_subnet:\n            # in same subnet so permitted\n            return 
True\n        if not self.subnets_connected(src_subnet, dest_subnet):\n            return False\n        return service in self.firewall[(src_subnet, dest_subnet)]\n\n    def host_traffic_permitted(self, src_addr, dest_addr, service):\n        dest_host = self.hosts[dest_addr]\n        return dest_host.traffic_permitted(src_addr, service)\n\n    def has_required_remote_permission(self, state, action):\n        \"\"\"Checks attacker has necessary permissions for remote action \"\"\"\n        if self.subnet_public(action.target[0]):\n            return True\n\n        for src_addr in self.address_space:\n            if not state.host_compromised(src_addr):\n                continue\n            if action.is_scan() and \\\n               not self.subnets_connected(src_addr[0], action.target[0]):\n                continue\n            if action.is_exploit() and \\\n               not self.subnet_traffic_permitted(\n                   src_addr[0], action.target[0], action.service\n               ):\n                continue\n            if state.host_has_access(src_addr, action.req_access):\n                return True\n        return False\n\n    def traffic_permitted(self, state, host_addr, service):\n        \"\"\"Checks whether the subnet and host firewalls permits traffic to a\n        given host and service, based on current set of compromised hosts on\n        network.\n        \"\"\"\n        for src_addr in self.address_space:\n            if not state.host_compromised(src_addr) and \\\n               not self.subnet_public(src_addr[0]):\n                continue\n            if not self.subnet_traffic_permitted(\n                    src_addr[0], host_addr[0], service\n            ):\n                continue\n            if self.host_traffic_permitted(src_addr, host_addr, service):\n                return True\n        return False\n\n    def subnet_public(self, subnet):\n        return self.topology[subnet][INTERNET] == 1\n\n    def 
get_number_of_subnets(self):\n        return len(self.subnets)\n\n    def all_sensitive_hosts_compromised(self, state):\n        for host_addr in self.sensitive_addresses:\n            if not state.host_has_access(host_addr, AccessLevel.ROOT):\n                return False\n        return True\n\n    def get_total_sensitive_host_value(self):\n        total = 0\n        for host_value in self.sensitive_hosts.values():\n            total += host_value\n        return total\n\n    def get_total_discovery_value(self):\n        total = 0\n        for host in self.hosts.values():\n            total += host.discovery_value\n        return total\n\n    def get_minimal_hops(self):\n        return get_minimal_hops_to_goal(\n            self.topology, self.sensitive_addresses\n        )\n\n    def get_subnet_depths(self):\n        return min_subnet_depth(self.topology)\n\n    def __str__(self):\n        output = \"\\n--- Network ---\\n\"\n        output += \"Subnets: \" + str(self.subnets) + \"\\n\"\n        output += \"Topology:\\n\"\n        for row in self.topology:\n            output += f\"\\t{row}\\n\"\n        output += \"Sensitive hosts: \\n\"\n        for addr, value in self.sensitive_hosts.items():\n            output += f\"\\t{addr}: {value}\\n\"\n        output += \"Hosts:\\n\"\n        for m in self.hosts.values():\n            output += str(m) + \"\\n\"\n        output += \"Firewall:\\n\"\n        for c, a in self.firewall.items():\n            output += f\"\\t{c}: {a}\\n\"\n        return output\n"
  },
  {
    "path": "nasim/envs/observation.py",
    "content": "import numpy as np\n\nfrom nasim.envs.utils import AccessLevel\nfrom nasim.envs.host_vector import HostVector\n\n\nclass Observation:\n    \"\"\"An observation for NASim.\n\n    Each observation is a 2D tensor with a row for each host and an additional\n    row containing auxiliary observations. Each host row is a host_vector (for\n    details see :class:`HostVector`) while the auxiliary\n    row contains non-host specific observations (see Notes section).\n\n    ...\n\n    Attributes\n    ----------\n    obs_shape : (int, int)\n        the shape of the observation\n    aux_row : int\n        the row index for the auxiliary row\n    tensor : numpy.ndarray\n        2D Numpy array storing the observation\n\n    Notes\n    -----\n    The auxiliary row is the final row in the observation tensor and has the\n    following features (in order):\n\n    1. Action success - True (1) or False (0)\n        indicates whether the action succeeded or failed\n    2. Connection error - True (1) or False (0)\n        indicates whether there was a connection error or not\n    3. Permission error - True (1) or False (0)\n        indicates whether there was a permission error or not\n    4. Undefined error - True (1) or False (0)\n        indicates whether there was an undefined error or not (e.g. failure due\n        to stochastic nature of exploits)\n\n    Since the number of features in the auxiliary row is less than the number\n    of features in each host row, the remainder of the row is all zeros.\n    \"\"\"\n\n    # obs vector positions for auxiliary observations\n    _success_idx = 0\n    _conn_error_idx = _success_idx + 1\n    _perm_error_idx = _conn_error_idx + 1\n    _undef_error_idx = _perm_error_idx + 1\n\n    def __init__(self, state_shape):\n        \"\"\"\n        Parameters\n        ----------\n        state_shape : (int, int)\n            2D shape of the state (i.e. 
num_hosts, host_vector_size)\n        \"\"\"\n        self.obs_shape = (state_shape[0]+1, state_shape[1])\n        self.aux_row = self.obs_shape[0]-1\n        self.tensor = np.zeros(self.obs_shape, dtype=np.float32)\n\n    @staticmethod\n    def get_space_bounds(scenario):\n        value_bounds = scenario.host_value_bounds\n        discovery_bounds = scenario.host_discovery_value_bounds\n        obs_low = min(\n            0,\n            value_bounds[0],\n            discovery_bounds[0]\n        )\n        obs_high = max(\n            1,\n            value_bounds[1],\n            discovery_bounds[1],\n            AccessLevel.ROOT,\n            scenario.address_space_bounds[0],\n            scenario.address_space_bounds[1]\n        )\n        return (obs_low, obs_high)\n\n    @classmethod\n    def from_numpy(cls, o_array, state_shape):\n        obs = cls(state_shape)\n        if o_array.shape != (state_shape[0]+1, state_shape[1]):\n            o_array = o_array.reshape(state_shape[0]+1, state_shape[1])\n        obs.tensor = o_array\n        return obs\n\n    def from_state(self, state):\n        self.tensor[:self.aux_row] = state.tensor\n\n    def from_action_result(self, action_result):\n        success = int(action_result.success)\n        self.tensor[self.aux_row][self._success_idx] = success\n        con_err = int(action_result.connection_error)\n        self.tensor[self.aux_row][self._conn_error_idx] = con_err\n        perm_err = int(action_result.permission_error)\n        self.tensor[self.aux_row][self._perm_error_idx] = perm_err\n        undef_err = int(action_result.undefined_error)\n        self.tensor[self.aux_row][self._undef_error_idx] = undef_err\n\n    def from_state_and_action(self, state, action_result):\n        self.from_state(state)\n        self.from_action_result(action_result)\n\n    def update_from_host(self, host_idx, host_obs_vector):\n        self.tensor[host_idx][:] = host_obs_vector\n\n    @property\n    def success(self):\n        
\"\"\"Whether the action succeded or not\n\n        Returns\n        -------\n        bool\n            True if the action succeeded, otherwise False\n        \"\"\"\n        return bool(self.tensor[self.aux_row][self._success_idx])\n\n    @property\n    def connection_error(self):\n        \"\"\"Whether there was a connection error or not\n\n        Returns\n        -------\n        bool\n            True if there was a connection error, otherwise False\n        \"\"\"\n        return bool(self.tensor[self.aux_row][self._conn_error_idx])\n\n    @property\n    def permission_error(self):\n        \"\"\"Whether there was a permission error or not\n\n        Returns\n        -------\n        bool\n            True if there was a permission error, otherwise False\n        \"\"\"\n        return bool(self.tensor[self.aux_row][self._perm_error_idx])\n\n    @property\n    def undefined_error(self):\n        \"\"\"Whether there was an undefined error or not\n\n        Returns\n        -------\n        bool\n            True if there was a undefined error, otherwise False\n        \"\"\"\n        return bool(self.tensor[self.aux_row][self._undef_error_idx])\n\n    def shape_flat(self):\n        \"\"\"Get the flat (1D) shape of the Observation.\n\n        Returns\n        -------\n        (int, )\n            the flattened shape of observation\n        \"\"\"\n        return self.numpy_flat().shape\n\n    def shape(self):\n        \"\"\"Get the (2D) shape of the observation\n\n        Returns\n        -------\n        (int, int)\n            the 2D shape of the observation\n        \"\"\"\n        return self.obs_shape\n\n    def numpy_flat(self):\n        \"\"\"Get the flattened observation tensor\n\n        Returns\n        -------\n        numpy.ndarray\n            the flattened (1D) observation tenser\n        \"\"\"\n        return self.tensor.flatten()\n\n    def numpy(self):\n        \"\"\"Get the observation tensor\n\n        Returns\n        -------\n        
numpy.ndarray\n            the (2D) observation tensor\n        \"\"\"\n        return self.tensor\n\n    def get_readable(self):\n        \"\"\"Get a human readable version of the observation\n\n        Returns\n        -------\n        list[dict]\n            list of host observations as human-readable dictionaries\n        dict[str, bool]\n            auxiliary observation dictionary\n        \"\"\"\n        host_obs = []\n        for host_idx in range(self.obs_shape[0]-1):\n            host_obs_vec = self.tensor[host_idx]\n            readable_dict = HostVector.get_readable(host_obs_vec)\n            host_obs.append(readable_dict)\n\n        aux_obs = {\n            \"Success\": self.success,\n            \"Connection Error\": self.connection_error,\n            \"Permission Error\": self.permission_error,\n            \"Undefined Error\": self.undefined_error\n        }\n        return host_obs, aux_obs\n\n    def __str__(self):\n        return str(self.tensor)\n\n    def __eq__(self, other):\n        return np.array_equal(self.tensor, other.tensor)\n\n    def __hash__(self):\n        return hash(str(self.tensor))\n"
  },
  {
    "path": "nasim/envs/render.py",
    "content": "\"\"\"This module contains functions and classes for rendering NASim \"\"\"\nimport math\nimport random\nimport tkinter as Tk\nimport networkx as nx\nfrom prettytable import PrettyTable\n\n# import order important here\ntry:\n    import matplotlib\n    matplotlib.use('TkAgg')\n    import matplotlib.pyplot as plt         # noqa E402\n    from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg     # noqa E402\n    import matplotlib.patches as mpatches   # noqa E402\nexcept Exception as ex:\n    import warnings\n    warnings.warn(\n        f\"Unable to import Matplotlib with TkAgg backend due to following \"\n        f\"exception: \\\"{type(ex)} {ex}\\\". NASIM can still run but GUI \"\n        f\"functionallity may not work as expected.\"\n    )\n\n# Agent node in graph\nAGENT = (0, 0)\n\n# Colors and symbols for describing state of host\nCOLORS = ['yellow', 'orange', 'magenta', 'green', 'blue', 'red', 'black']\nSYMBOLS = ['C', 'R', 'S', 'c', 'r', 'o', 'A']\n\n\nclass Viewer:\n    \"\"\"A class for visualizing the network state from NASimEnv\"\"\"\n\n    def __init__(self, network):\n        \"\"\"\n        Arguments\n        ---------\n        network : Network\n            network of environment\n        \"\"\"\n        self.network = network\n        self.subnets = self._get_subnets(network)\n        self.positions = self._get_host_positions(network)\n\n    def render_graph(self, state, ax=None, show=False, width=5, height=6):\n        \"\"\"Render graph structure represention of network\n\n        Arguments\n        ---------\n        state : State\n            state of network user wants to view (Typically will be\n            initial state)\n        ax : Axes\n            matplotlib axis to plot graph on, or None to plot on new axis\n        show : bool\n            whether to display plot, or simply construct plot\n        width : int\n            width of GUI window\n        height : int\n            height of GUI window\n        
\"\"\"\n        G = self._construct_graph(state)\n        colors = []\n        labels = {}\n        for n in list(G.nodes):\n            colors.append(G.nodes[n][\"color\"])\n            labels[n] = G.nodes[n][\"label\"]\n\n        if ax is None:\n            fig = plt.figure(figsize=(width, height))\n            ax = fig.add_subplot(111)\n        else:\n            fig = ax.get_figure()\n\n        nx.draw_networkx_nodes(G,\n                               self.positions,\n                               node_size=1000,\n                               node_color=colors,\n                               ax=ax)\n        nx.draw_networkx_labels(G,\n                                self.positions,\n                                labels,\n                                font_size=10,\n                                font_weight=\"bold\")\n        nx.draw_networkx_edges(G, self.positions)\n        ax.axis('off')\n        ax.set_xlim(left=0.0, right=100.0)\n        # ax.set_ylim(bottom=0.0, top=100.0)\n\n        legend_entries = EpisodeViewer.legend(compromised=False)\n        ax.legend(handles=legend_entries, fontsize=12, loc=2)\n\n        if show:\n            fig.tight_layout()\n            plt.show()\n            plt.close(fig)\n\n    def render_episode(self, episode, width=7, height=5):\n        \"\"\"Display an episode from Cyber Attack Simulator Environment in a seperate\n        window. 
Where an episode is a sequence of (state, action, reward, done)\n        tuples generated from interactions with environment.\n\n        Arguments\n        ---------\n        episode : list\n            list of (State, Action, reward, done) tuples\n        width : int\n            width of GUI window\n        height : int\n            height of GUI window\n        \"\"\"\n        init_ep_state = episode[0][0]\n        G = self._construct_graph(init_ep_state)\n        EpisodeViewer(episode, G, self.network.sensitive_hosts, width, height)\n\n    def render_readable(self, obs):\n        \"\"\"Print a readable tabular version of observation to stdout\n\n        Arguments\n        ---------\n        obs : Observation\n            observation to view\n        \"\"\"\n        host_obs, aux_obs = obs.get_readable()\n        aux_table = self._construct_table_from_dict(aux_obs)\n        host_table = self._construct_table_from_list_of_dicts(host_obs)\n        print(\"Observation:\")\n        print(aux_table)\n        print(host_table)\n\n    def render_readable_state(self, state):\n        \"\"\"Print a readable tabular version of observation to stdout\n\n        Arguments\n        ---------\n        state : State\n            state to view\n        \"\"\"\n        host_obs = state.get_readable()\n        host_table = self._construct_table_from_list_of_dicts(host_obs)\n        print(\"State:\")\n        print(host_table)\n\n    def close(self):\n        \"\"\"Close renderer.\"\"\"\n        plt.close(\"all\")\n\n    def _construct_table_from_dict(self, d):\n        headers = list(d.keys())\n        table = PrettyTable(headers)\n        row = [str(d[k]) for k in headers]\n        table.add_row(row)\n        return table\n\n    def _construct_table_from_list_of_dicts(self, l):\n        headers = list(l[0].keys())\n        table = PrettyTable(headers)\n        for d in l:\n            row = [str(d[k]) for k in headers]\n            table.add_row(row)\n        return table\n\n    
def _construct_graph(self, state):\n        \"\"\"Create a network graph from the current state\n\n        Arguments\n        ---------\n        state : State\n            current state of network\n\n        Returns\n        -------\n        G : Graph\n            NetworkX Graph representing state of network\n        \"\"\"\n        G = nx.Graph()\n        sensitive_hosts = self.network.sensitive_hosts\n\n        # Create a fully connected graph for each subnet\n        for subnet in self.subnets:\n            for m in subnet:\n                node_color = get_host_representation(state,\n                                                     sensitive_hosts,\n                                                     m,\n                                                     COLORS)\n                node_pos = self.positions[m]\n                G.add_node(m, color=node_color, pos=node_pos, label=str(m))\n            for x in subnet:\n                for y in subnet:\n                    if x == y:\n                        continue\n                    G.add_edge(x, y)\n\n        # Retrieve first host in each subnet\n        subnet_prime_nodes = []\n        for subnet in self.subnets:\n            subnet_prime_nodes.append(subnet[0])\n        # Connect connected subnets by creating edge between first host from\n        # each subnet\n        for x in subnet_prime_nodes:\n            for y in subnet_prime_nodes:\n                if x == y:\n                    continue\n                if self.network.subnets_connected(x[0], y[0]):\n                    G.add_edge(x, y)\n\n        return G\n\n    def _get_host_positions(self, network):\n        \"\"\"Get list of positions for each host in episode\n\n        Arguments\n        ---------\n        network : Network\n            network object describing network configuration of environment\n            episode was generated from\n        \"\"\"\n        address_space = network.address_space\n        depths = 
network.get_subnet_depths()\n        max_depth = max(depths)\n        # list of lists where each list contains subnet_id of subnets with\n        # same depth\n        subnets_by_depth = [[] for i in range(max_depth + 1)]\n        for subnet_id, subnet_depth in enumerate(depths):\n            if subnet_id == 0:\n                continue\n            subnets_by_depth[subnet_depth].append(subnet_id)\n\n        # max value of position in figure\n        max_pos = 100\n        # for spacing between rows and columns and spread of nodes within\n        # subnet\n        margin = 10\n        row_height = max_pos / (max_depth + 1)\n\n        # positions are randomly assigned within regions of display based on\n        # subnet number\n        positions = {}\n        for m in address_space:\n            m_subnet = m[0]\n            m_depth = depths[m_subnet]\n            # row is dependent on depth of subnet\n            row_max = max_pos - (m_depth * row_height)\n            row_min = max_pos - ((m_depth + 1) * row_height)\n            # col width is dependent on number of subnets at same depth\n            num_cols = len(subnets_by_depth[m_depth])\n            col_width = max_pos / num_cols\n            # col of host dependent on subnet_id relative to other subnets of\n            # same depth\n            m_col = subnets_by_depth[m_depth].index(m_subnet)\n            col_min = m_col * col_width\n            col_max = (m_col + 1) * col_width\n            # randomly sample position of host within row and column of subnet\n            col_pos, row_pos = self._get_host_position(\n                m, positions, address_space, row_min, row_max, col_min,\n                col_max, margin\n            )\n            positions[m] = (col_pos, row_pos)\n\n        # get position of agent, which is just to the right of the first host\n        # in the network\n        first_m_pos = positions[address_space[0]]\n        agent_row = first_m_pos[1]\n        agent_col = min(first_m_pos[0] + margin * 
4, max_pos - margin)\n        positions[AGENT] = (agent_col, agent_row)\n\n        return positions\n\n    def _get_host_position(self, m, positions, address_space, row_min, row_max,\n                           col_min, col_max, margin):\n        \"\"\"Get the position of m within the bounds of (row_min, row_max,\n        col_min, col_max) while trying to make the distance between the\n        positions of any two hosts in the same subnet greater than some\n        threshold.\n        \"\"\"\n        subnet_hosts = []\n        for other_m in address_space:\n            if other_m == m:\n                continue\n            if other_m[0] == m[0]:\n                subnet_hosts.append(other_m)\n\n        threshold = 8\n        col_margin = (col_max - col_min) / 4\n        col_mid = col_max - ((col_max - col_min) / 2)\n        m_y = random.uniform(row_min + margin, row_max - margin)\n        m_x = random.uniform(col_mid - col_margin, col_mid + col_margin)\n\n        # only try 100 times\n        good = False\n        n = 0\n        while n < 100 and not good:\n            good = True\n            m_x = random.uniform(col_mid - col_margin, col_mid + col_margin)\n            m_y = random.uniform(row_min + margin, row_max - margin)\n            for other_m in subnet_hosts:\n                if other_m not in positions:\n                    continue\n                other_x, other_y = positions[other_m]\n                dist = math.hypot(m_x - other_x, m_y - other_y)\n                if dist < threshold:\n                    good = False\n                    break\n            n += 1\n        return m_x, m_y\n\n    def _get_subnets(self, network):\n        \"\"\"Get list of hosts organized into subnets\n\n        Arguments\n        ---------\n        network : Network\n            the environment network\n\n        Returns\n        -------\n        list[list[(int, int)]]\n            addresses with each list containing hosts on same subnet\n        \"\"\"\n        subnets 
= [[] for i in range(network.get_number_of_subnets())]\n        for m in network.address_space:\n            subnets[m[0]].append(m)\n        # add internet host\n        subnets[0].append(AGENT)\n        return subnets\n\n\nclass EpisodeViewer:\n    \"\"\"Displays sequence of observations from NASimEnv in a separate window\"\"\"\n\n    def __init__(self, episode, G, sensitive_hosts, width=7, height=7):\n        self.episode = episode\n        self.G = G\n        self.sensitive_hosts = sensitive_hosts\n        # used for moving between timesteps in episode\n        self.timestep = 0\n        self._setup_GUI(width, height)\n        # draw first observation\n        self._next_graph()\n        # Initialize GUI drawing loop\n        Tk.mainloop()\n\n    def _setup_GUI(self, width, height):\n        \"\"\"Set up all the elements for the GUI for displaying the network graphs.\n\n        Initializes object variables:\n            Tk root : the root window for GUI\n            FigureCanvasTkAgg canvas : the canvas object to draw figure onto\n            Figure fig : the figure that holds axes\n            Axes axes : the matplotlib figure axes to draw onto\n        \"\"\"\n        # The GUI root window\n        self.root = Tk.Tk()\n        self.root.wm_title(\"Cyber Attack Simulator\")\n        self.root.wm_protocol(\"WM_DELETE_WINDOW\", self._close)\n        # matplotlib figure to house networkX graph\n        self.fig = plt.figure(figsize=(width, height))\n        self.axes = self.fig.add_subplot(111)\n        self.fig.tight_layout()\n        self.fig.subplots_adjust(top=0.8)\n        # a tk.DrawingArea\n        self.canvas = FigureCanvasTkAgg(self.fig, master=self.root)\n        self.canvas.draw()\n        self.canvas.get_tk_widget().pack(side=Tk.TOP, fill=Tk.BOTH, expand=1)\n        # buttons for moving between observations\n        back = Tk.Button(self.root, text=\"back\", command=self._previous_graph)\n        back.pack()\n        next = Tk.Button(self.root, 
text=\"next\", command=self._next_graph)\n        next.pack()\n\n    def _close(self):\n        plt.close('all')\n        self.root.destroy()\n\n    def _next_graph(self):\n        if self.timestep < len(self.episode):\n            t_state = self.episode[self.timestep][0]\n            self.G = self._update_graph(self.G, t_state)\n            self._draw_graph(self.G)\n            self.timestep += 1\n\n    def _previous_graph(self):\n        if self.timestep > 1:\n            self.timestep -= 2\n            self._next_graph()\n\n    def _update_graph(self, G, state):\n        # update colour of each host in network as necessary\n        for m in list(G.nodes):\n            if m == AGENT:\n                continue\n            node_color = get_host_representation(\n                state, self.sensitive_hosts, m, COLORS\n            )\n            G.nodes[m][\"color\"] = node_color\n        return G\n\n    def _draw_graph(self, G):\n        pos = {}\n        colors = []\n        labels = {}\n        for n in list(G.nodes):\n            colors.append(G.nodes[n][\"color\"])\n            labels[n] = G.nodes[n][\"label\"]\n            pos[n] = G.nodes[n][\"pos\"]\n\n        # clear window and redraw graph\n        self.axes.cla()\n        nx.draw_networkx_nodes(\n            G, pos, node_color=colors, node_size=1500, ax=self.axes\n        )\n        nx.draw_networkx_labels(\n            G, pos, labels, font_size=12, font_weight=\"bold\"\n        )\n        nx.draw_networkx_edges(G, pos)\n        plt.axis('off')\n        # generate and plot legend\n        # legend_entries = self.legend()\n        # plt.legend(handles=legend_entries, fontsize=16)\n        # add title\n        state, action, reward, done = self.episode[self.timestep]\n        if done:\n            title = (\n                f\"t={self.timestep}\\nGoal reached\\ntotal reward={reward}\"\n            )\n        else:\n            title = f\"t={self.timestep}\\n{action}\\nreward={reward}\"\n        ax_title = 
self.axes.set_title(title, fontsize=16, pad=10)\n        ax_title.set_y(1.05)\n\n        xticks = self.axes.get_xticks()\n        yticks = self.axes.get_yticks()\n        # shift half a step to the left\n        xmin = (3*xticks[0] - xticks[1])/2.\n        ymin = (3*yticks[0] - yticks[1])/2.\n        # shift half a step to the right\n        xmax = (3*xticks[-1] - xticks[-2])/2.\n        ymax = (3*yticks[-1] - yticks[-2])/2.\n\n        self.axes.set_xlim(left=xmin, right=xmax)\n        self.axes.set_ylim(bottom=ymin, top=ymax)\n        # self.fig.savefig(\"t_{}.png\".format(self.timestep))\n        self.canvas.draw()\n\n    @staticmethod\n    def legend(compromised=True):\n        \"\"\"\n        Manually set up the display legend\n        \"\"\"\n        a = mpatches.Patch(color='black', label='Agent')\n        s = mpatches.Patch(color='magenta', label='Sensitive (S)')\n        c = mpatches.Patch(color='green', label='Compromised (C)')\n        r = mpatches.Patch(color='blue', label='Reachable (R)')\n        legend_entries = [a, s, c, r]\n        if compromised:\n            sc = mpatches.Patch(color='yellow', label='S & C')\n            sr = mpatches.Patch(color='orange', label='S & R')\n            o = mpatches.Patch(color='red', label='not S, C or R')\n            legend_entries.extend([sc, sr, o])\n        return legend_entries\n\n\ndef get_host_representation(state, sensitive_hosts, m, representation):\n    \"\"\"Get the representation of a host based on current state\n\n    Arguments\n    ---------\n    state : State\n        current state\n    sensitive_hosts : list\n        list of addresses of sensitive hosts on network\n    m : (int, int)\n        host address\n    representation : list\n        list of different representations (e.g. 
color or symbol)\n\n    Returns\n    -------\n    str\n        host color\n    \"\"\"\n    # agent not in state so return straight away\n    if m == AGENT:\n        return representation[6]\n    compromised = state.host_compromised(m)\n    reachable = state.host_reachable(m)\n    sensitive = m in sensitive_hosts\n    if sensitive:\n        if compromised:\n            output = representation[0]\n        elif reachable:\n            output = representation[1]\n        else:\n            output = representation[2]\n    elif compromised:\n        output = representation[3]\n    elif reachable:\n        output = representation[4]\n    else:\n        output = representation[5]\n    return output\n"
  },
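The branch structure of `get_host_representation` in `render.py` can be seen in isolation with plain booleans. The `host_color` helper below is a hypothetical, condensed stand-in (the real function takes a `State` object and a host address); the `COLORS` list and the branch order are taken directly from the module:

```python
# Colors as defined in nasim/envs/render.py
COLORS = ['yellow', 'orange', 'magenta', 'green', 'blue', 'red', 'black']


def host_color(compromised, reachable, sensitive, agent=False):
    """Hypothetical sketch of get_host_representation's lookup logic."""
    if agent:                  # agent node is always drawn black
        return COLORS[6]
    if sensitive:
        if compromised:
            return COLORS[0]   # sensitive and compromised
        if reachable:
            return COLORS[1]   # sensitive and reachable
        return COLORS[2]       # sensitive only
    if compromised:
        return COLORS[3]       # compromised
    if reachable:
        return COLORS[4]       # reachable
    return COLORS[5]           # none of the above
```

This matches the legend built in `EpisodeViewer.legend`: magenta for sensitive, green for compromised, blue for reachable, with yellow/orange for the sensitive-and-compromised/reachable combinations.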
  {
    "path": "nasim/envs/state.py",
    "content": "import numpy as np\n\nfrom nasim.envs.host_vector import HostVector\nfrom nasim.envs.observation import Observation\n\n\nclass State:\n    \"\"\"A state in the NASim Environment.\n\n    Each row in the state tensor represents the state of a single host on the\n    network. For details on host the state a single host is represented see\n    :class:`HostVector`\n\n    ...\n\n    Attributes\n    ----------\n    tensor : numpy.Array\n        tensor representation of the state of network\n    host_num_map : dict\n        mapping from host address to host number (this is used\n        to map host address to host row in the network tensor)\n    \"\"\"\n\n    def __init__(self, network_tensor, host_num_map):\n        \"\"\"\n        Parameters\n        ----------\n        state_tensor : np.Array\n            the tensor representation of the network state\n        host_num_map : dict\n            mapping from host address to host number (this is used\n            to map host address to host row in the network tensor)\n        \"\"\"\n        self.tensor = network_tensor\n        self.host_num_map = host_num_map\n\n    @classmethod\n    def tensorize(cls, network):\n        h0 = network.hosts[(1, 0)]\n        h0_vector = HostVector.vectorize(h0, network.address_space_bounds)\n        tensor = np.zeros(\n            (len(network.hosts), h0_vector.state_size),\n            dtype=np.float32\n        )\n        for host_addr, host in network.hosts.items():\n            host_num = network.host_num_map[host_addr]\n            HostVector.vectorize(\n                host, network.address_space_bounds, tensor[host_num]\n            )\n        return cls(tensor, network.host_num_map)\n\n    @classmethod\n    def generate_initial_state(cls, network):\n        cls.reset()\n        state = cls.tensorize(network)\n        return network.reset(state)\n\n    @classmethod\n    def generate_random_initial_state(cls, network):\n        h0 = network.hosts[(1, 0)]\n        
h0_vector = HostVector.vectorize_random(\n            h0, network.address_space_bounds\n        )\n        tensor = np.zeros(\n            (len(network.hosts), h0_vector.state_size),\n            dtype=np.float32\n        )\n        for host_addr, host in network.hosts.items():\n            host_num = network.host_num_map[host_addr]\n            HostVector.vectorize_random(\n                host, network.address_space_bounds, tensor[host_num]\n            )\n        state = cls(tensor, network.host_num_map)\n        # ensure host state set correctly\n        return network.reset(state)\n\n    @classmethod\n    def from_numpy(cls, s_array, state_shape, host_num_map):\n        if s_array.shape != state_shape:\n            s_array = s_array.reshape(state_shape)\n        return State(s_array, host_num_map)\n\n    @classmethod\n    def reset(cls):\n        \"\"\"Reset any class attributes for state \"\"\"\n        HostVector.reset()\n\n    @property\n    def hosts(self):\n        hosts = []\n        for host_addr in self.host_num_map:\n            hosts.append((host_addr, self.get_host(host_addr)))\n        return hosts\n\n    def copy(self):\n        new_tensor = np.copy(self.tensor)\n        return State(new_tensor, self.host_num_map)\n\n    def get_initial_observation(self, fully_obs):\n        \"\"\"Get the initial observation of network.\n\n        Returns\n        -------\n        Observation\n            an observation object\n        \"\"\"\n        obs = Observation(self.shape())\n        if fully_obs:\n            obs.from_state(self)\n            return obs\n\n        for host_addr, host in self.hosts:\n            if not host.reachable:\n                continue\n            host_obs = host.observe(address=True,\n                                    reachable=True,\n                                    discovered=True)\n            host_idx = self.get_host_idx(host_addr)\n            obs.update_from_host(host_idx, host_obs)\n        return obs\n\n    def 
get_observation(self, action, action_result, fully_obs):\n        \"\"\"Get observation given last action and action result\n\n        Parameters\n        ----------\n        action : Action\n            last action performed\n        action_result : ActionResult\n            observation from performing action\n        fully_obs : bool\n            whether problem is fully observable or not\n\n        Returns\n        -------\n        Observation\n            an observation object\n        \"\"\"\n        obs = Observation(self.shape())\n        obs.from_action_result(action_result)\n        if fully_obs:\n            obs.from_state(self)\n            return obs\n\n        if action.is_noop():\n            return obs\n\n        if not action_result.success:\n            # action failed so no observation\n            return obs\n\n        t_idx, t_host = self.get_host_and_idx(action.target)\n        obs_kwargs = dict(\n            address=True,       # must be true for success\n            compromised=False,\n            reachable=True,     # must be true for success\n            discovered=True,    # must be true for success\n            value=False,\n            # discovery_value=False,    # this is only added as needed\n            services=False,\n            processes=False,\n            os=False,\n            access=False\n        )\n        if action.is_exploit():\n            # exploit action, so get all observations for host\n            obs_kwargs[\"compromised\"] = True\n            obs_kwargs[\"services\"] = True\n            obs_kwargs[\"os\"] = True\n            obs_kwargs[\"access\"] = True\n            obs_kwargs[\"value\"] = True\n        elif action.is_privilege_escalation():\n            obs_kwargs[\"compromised\"] = True\n            obs_kwargs[\"access\"] = True\n        elif action.is_service_scan():\n            obs_kwargs[\"services\"] = True\n        elif action.is_os_scan():\n            obs_kwargs[\"os\"] = True\n        elif 
action.is_process_scan():\n            obs_kwargs[\"processes\"] = True\n            obs_kwargs[\"access\"] = True\n        elif action.is_subnet_scan():\n            for host_addr in action_result.discovered:\n                discovered = action_result.discovered[host_addr]\n                if not discovered:\n                    continue\n                d_idx, d_host = self.get_host_and_idx(host_addr)\n                newly_discovered = action_result.newly_discovered[host_addr]\n                d_obs = d_host.observe(\n                    discovery_value=newly_discovered, **obs_kwargs\n                )\n                obs.update_from_host(d_idx, d_obs)\n            # this is for target host (where scan was performed on)\n            obs_kwargs[\"compromised\"] = True\n        else:\n            raise NotImplementedError(f\"Action {action} not implemented\")\n        target_obs = t_host.observe(**obs_kwargs)\n        obs.update_from_host(t_idx, target_obs)\n        return obs\n\n    def shape_flat(self):\n        return self.numpy_flat().shape\n\n    def shape(self):\n        return self.tensor.shape\n\n    def numpy_flat(self):\n        return self.tensor.flatten()\n\n    def numpy(self):\n        return self.tensor\n\n    def update_host(self, host_addr, host_vector):\n        host_idx = self.host_num_map[host_addr]\n        self.tensor[host_idx] = host_vector.vector\n\n    def get_host(self, host_addr):\n        host_idx = self.host_num_map[host_addr]\n        return HostVector(self.tensor[host_idx])\n\n    def get_host_idx(self, host_addr):\n        return self.host_num_map[host_addr]\n\n    def get_host_and_idx(self, host_addr):\n        host_idx = self.host_num_map[host_addr]\n        return host_idx, HostVector(self.tensor[host_idx])\n\n    def host_reachable(self, host_addr):\n        return self.get_host(host_addr).reachable\n\n    def host_compromised(self, host_addr):\n        return self.get_host(host_addr).compromised\n\n    def 
host_discovered(self, host_addr):\n        return self.get_host(host_addr).discovered\n\n    def host_has_access(self, host_addr, access_level):\n        return self.get_host(host_addr).access >= access_level\n\n    def set_host_compromised(self, host_addr):\n        self.get_host(host_addr).compromised = True\n\n    def set_host_reachable(self, host_addr):\n        self.get_host(host_addr).reachable = True\n\n    def set_host_discovered(self, host_addr):\n        self.get_host(host_addr).discovered = True\n\n    def get_host_value(self, host_address):\n        # look up via host_num_map: self.hosts is a list of (addr, host)\n        # tuples, so it cannot be indexed by address directly\n        return self.get_host(host_address).value\n\n    def host_is_running_service(self, host_addr, service):\n        return self.get_host(host_addr).is_running_service(service)\n\n    def host_is_running_os(self, host_addr, os):\n        return self.get_host(host_addr).is_running_os(os)\n\n    def get_total_host_value(self):\n        total_value = 0\n        for host_addr in self.host_num_map:\n            host = self.get_host(host_addr)\n            total_value += host.value\n        return total_value\n\n    def state_size(self):\n        return self.tensor.size\n\n    def get_readable(self):\n        host_obs = []\n        for host_addr in self.host_num_map:\n            host = self.get_host(host_addr)\n            readable_dict = host.readable()\n            host_obs.append(readable_dict)\n        return host_obs\n\n    def __str__(self):\n        output = \"\\n--- State ---\\n\"\n        output += \"Hosts:\\n\"\n        for host in self.hosts:\n            output += str(host) + \"\\n\"\n        return output\n\n    def __hash__(self):\n        return hash(str(self.tensor))\n\n    def __eq__(self, other):\n        return np.array_equal(self.tensor, other.tensor)\n"
  },
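The core indexing pattern in `State` is that `host_num_map` translates a host address tuple into a row of the state tensor. A minimal standalone sketch of that pattern, using a toy map and 2-feature host vectors (both illustrative, not the real `HostVector` layout):

```python
import numpy as np

# Toy address -> row-index map, mirroring State.host_num_map
host_num_map = {(1, 0): 0, (2, 0): 1, (3, 0): 2}

# One row per host; 2 features per host here purely for illustration
tensor = np.zeros((3, 2), dtype=np.float32)


def update_host(host_addr, vector):
    # overwrite this host's row, as State.update_host does
    tensor[host_num_map[host_addr]] = vector


def get_host(host_addr):
    # fetch this host's row, as State.get_host does (the real class
    # wraps the row in a HostVector rather than returning it raw)
    return tensor[host_num_map[host_addr]]


update_host((2, 0), [1.0, 0.5])
```

Because `get_host` returns a view into the tensor, mutating the returned row updates the state in place, which is what makes setters like `set_host_compromised` work on the shared tensor.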
  {
    "path": "nasim/envs/utils.py",
    "content": "import enum\nimport numpy as np\nfrom queue import deque\nfrom itertools import permutations\n\nINTERNET = 0\n\n\nclass OneHotBool(enum.IntEnum):\n    NONE = 0\n    TRUE = 1\n    FALSE = 2\n\n    @staticmethod\n    def from_bool(b):\n        if b:\n            return OneHotBool.TRUE\n        return OneHotBool.FALSE\n\n    def __str__(self):\n        return self.name\n\n    def __repr__(self):\n        return self.name\n\n\nclass ServiceState(enum.IntEnum):\n    # values for possible service knowledge states\n    UNKNOWN = 0     # service may or may not be running on host\n    PRESENT = 1     # service is running on the host\n    ABSENT = 2      # service not running on the host\n\n    def __str__(self):\n        return self.name\n\n    def __repr__(self):\n        return self.name\n\n\nclass AccessLevel(enum.IntEnum):\n    NONE = 0\n    USER = 1\n    ROOT = 2\n\n    def __str__(self):\n        return self.name\n\n    def __repr__(self):\n        return self.name\n\n\ndef get_minimal_hops_to_goal(topology, sensitive_addresses):\n    \"\"\"Get minimum network hops required to reach all sensitive hosts.\n\n    Starting from outside the network (i.e. 
can only reach exposed subnets).\n\n    Returns\n    -------\n    int\n        minimum number of network hops to reach all sensitive hosts\n    \"\"\"\n    num_subnets = len(topology)\n    max_value = np.iinfo(np.int16).max\n    distance = np.full((num_subnets, num_subnets),\n                       max_value,\n                       dtype=np.int16)\n\n    # set distances for each edge to 1\n    for s1 in range(num_subnets):\n        for s2 in range(num_subnets):\n            if s1 == s2:\n                distance[s1][s2] = 0\n            elif topology[s1][s2] == 1:\n                distance[s1][s2] = 1\n    # find all-pairs shortest path distances (Floyd-Warshall)\n    for k in range(num_subnets):\n        for i in range(num_subnets):\n            for j in range(num_subnets):\n                if distance[i][k] == max_value \\\n                   or distance[k][j] == max_value:\n                    dis = max_value\n                else:\n                    dis = distance[i][k] + distance[k][j]\n                if distance[i][j] > dis:\n                    distance[i][j] = dis\n\n    # get list of all subnets we need to visit\n    subnets_to_visit = [INTERNET]\n    for subnet, host in sensitive_addresses:\n        if subnet not in subnets_to_visit:\n            subnets_to_visit.append(subnet)\n\n    # find minimum shortest path that visits internet subnet and all\n    # sensitive subnets by checking all possible permutations\n    shortest = max_value\n    for pm in permutations(subnets_to_visit):\n        pm_sum = 0\n        for i in range(len(pm) - 1):\n            pm_sum += distance[pm[i]][pm[i+1]]\n        shortest = min(shortest, pm_sum)\n\n    return shortest\n\n\ndef min_subnet_depth(topology):\n    \"\"\"Find the minimum depth of each subnet in the network graph in terms of steps\n    from an exposed subnet to each subnet\n\n    Parameters\n    ----------\n    topology : 2D matrix\n        An adjacency matrix representing the network, with 
first subnet\n        representing the internet (i.e. exposed)\n\n    Returns\n    -------\n    depths : list\n        depth of each subnet ordered by subnet index in topology\n    \"\"\"\n    num_subnets = len(topology)\n\n    assert len(topology[0]) == num_subnets\n\n    depths = []\n    Q = deque()\n    for subnet in range(num_subnets):\n        if topology[subnet][INTERNET] == 1:\n            depths.append(0)\n            Q.appendleft(subnet)\n        else:\n            depths.append(float('inf'))\n\n    while len(Q) > 0:\n        parent = Q.pop()\n        for child in range(num_subnets):\n            if topology[parent][child] == 1:\n                # child is connected to parent\n                if depths[child] > depths[parent] + 1:\n                    depths[child] = depths[parent] + 1\n                    Q.appendleft(child)\n    return depths\n"
  },
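The BFS in `min_subnet_depth` can be exercised standalone. The snippet below reproduces the function from `utils.py` (with `deque` imported from `collections`) and runs it on a small hand-built chain topology; the 4-subnet adjacency matrix is illustrative:

```python
from collections import deque

INTERNET = 0


def min_subnet_depth(topology):
    """BFS over the subnet adjacency matrix, as in nasim/envs/utils.py."""
    num_subnets = len(topology)
    depths = []
    Q = deque()
    # subnets with an edge to the internet are exposed (depth 0)
    for subnet in range(num_subnets):
        if topology[subnet][INTERNET] == 1:
            depths.append(0)
            Q.appendleft(subnet)
        else:
            depths.append(float('inf'))
    # breadth-first relaxation of depths from the exposed subnets
    while len(Q) > 0:
        parent = Q.pop()
        for child in range(num_subnets):
            if topology[parent][child] == 1 \
               and depths[child] > depths[parent] + 1:
                depths[child] = depths[parent] + 1
                Q.appendleft(child)
    return depths


# Chain: internet (0) <-> subnet 1 <-> subnet 2 <-> subnet 3
topology = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
```

On this chain, subnet 1 is exposed (depth 0), subnet 2 sits one hop behind it, and subnet 3 two hops behind; unreachable subnets would keep a depth of `float('inf')`.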
  {
    "path": "nasim/scenarios/__init__.py",
    "content": "from nasim.scenarios.utils import INTERNET\nfrom nasim.scenarios.scenario import Scenario\nfrom nasim.scenarios.loader import ScenarioLoader\nfrom nasim.scenarios.generator import ScenarioGenerator\nimport nasim.scenarios.benchmark as benchmark\n\n\ndef make_benchmark_scenario(scenario_name, seed=None):\n    \"\"\"Generate or Load a benchmark Scenario.\n\n    Parameters\n    ----------\n    scenario_name : str\n        the name of the benchmark environment\n    seed : int, optional\n        random seed to use to generate environment (default=None)\n\n    Returns\n    -------\n    Scenario\n        a new scenario instance\n\n    Raises\n    ------\n    NotImplementederror\n        if scenario_name does no match any implemented benchmark scenarios.\n    \"\"\"\n    if scenario_name in benchmark.AVAIL_GEN_BENCHMARKS:\n        params = benchmark.AVAIL_GEN_BENCHMARKS[scenario_name]\n        params['seed'] = seed\n        return generate_scenario(**params)\n    elif scenario_name in benchmark.AVAIL_STATIC_BENCHMARKS:\n        scenario_def = benchmark.AVAIL_STATIC_BENCHMARKS[scenario_name]\n        return load_scenario(scenario_def[\"file\"], name=scenario_name)\n    else:\n        raise NotImplementedError(\n            f\"Benchmark scenario '{scenario_name}' not available.\"\n            f\"Available scenarios are: {benchmark.AVAIL_BENCHMARKS}\"\n        )\n\n\ndef generate_scenario(num_hosts, num_services, **params):\n    \"\"\"Generate Scenario from network parameters.\n\n    Parameters\n    ----------\n    num_hosts : int\n        number of hosts to include in network (minimum is 3)\n    num_services : int\n        number of services to use in environment (minimum is 1)\n    params : dict, optional\n        generator params (see :class:`ScenarioGenertor` for full list)\n\n    Returns\n    -------\n    Scenario\n        a new scenario object\n    \"\"\"\n    generator = ScenarioGenerator()\n    return generator.generate(num_hosts, num_services, 
**params)\n\n\ndef load_scenario(path, name=None):\n    \"\"\"Load NASim Environment from a .yaml scenario file.\n\n    Parameters\n    ----------\n    path : str\n        path to the .yaml scenario file\n    name : str, optional\n        the scenarios name, if None name will be generated from path\n        (default=None)\n\n    Returns\n    -------\n    Scenario\n        a new scenario object\n    \"\"\"\n    loader = ScenarioLoader()\n    return loader.load(path, name=name)\n\n\ndef get_scenario_max(scenario_name):\n    if scenario_name in benchmark.AVAIL_GEN_BENCHMARKS:\n        return benchmark.AVAIL_GEN_BENCHMARKS[scenario_name][\"max_score\"]\n    elif scenario_name in benchmark.AVAIL_STATIC_BENCHMARKS:\n        return benchmark.AVAIL_STATIC_BENCHMARKS[scenario_name][\"max_score\"]\n    return None\n"
  },
  {
    "path": "nasim/scenarios/benchmark/__init__.py",
    "content": "import os.path as osp\n\nfrom nasim.scenarios.benchmark.generated import AVAIL_GEN_BENCHMARKS\n\nBENCHMARK_DIR = osp.dirname(osp.abspath(__file__))\n\nAVAIL_STATIC_BENCHMARKS = {\n    \"tiny\": {\n        \"file\": osp.join(BENCHMARK_DIR, \"tiny.yaml\"),\n        \"name\": \"tiny\",\n        \"step_limit\": 1000,\n        \"max_score\": 195\n    },\n    \"tiny-hard\": {\n        \"file\": osp.join(BENCHMARK_DIR, \"tiny-hard.yaml\"),\n        \"name\": \"tiny-hard\",\n        \"step_limit\": 1000,\n        \"max_score\": 192\n    },\n    \"tiny-small\": {\n        \"file\": osp.join(BENCHMARK_DIR, \"tiny-small.yaml\"),\n        \"name\": \"tiny-small\",\n        \"step_limit\": 1000,\n        \"max_score\": 189\n    },\n    \"small\": {\n        \"file\": osp.join(BENCHMARK_DIR, \"small.yaml\"),\n        \"name\": \"small\",\n        \"step_limit\": 1000,\n        \"max_score\": 186\n    },\n    \"small-honeypot\": {\n        \"file\": osp.join(BENCHMARK_DIR, \"small-honeypot.yaml\"),\n        \"name\": \"small-honeypot\",\n        \"step_limit\": 1000,\n        \"max_score\": 186\n    },\n    \"small-linear\": {\n        \"file\": osp.join(BENCHMARK_DIR, \"small-linear.yaml\"),\n        \"name\": \"small-linear\",\n        \"step_limit\": 1000,\n        \"max_score\": 187\n    },\n    \"medium\": {\n        \"file\": osp.join(BENCHMARK_DIR, \"medium.yaml\"),\n        \"name\": \"medium\",\n        \"step_limit\": 2000,\n        \"max_score\": 190\n    },\n    \"medium-single-site\": {\n        \"file\": osp.join(BENCHMARK_DIR, \"medium-single-site.yaml\"),\n        \"name\": \"medium-single-site\",\n        \"step_limit\": 2000,\n        \"max_score\": 195\n    },\n    \"medium-multi-site\": {\n        \"file\": osp.join(BENCHMARK_DIR, \"medium-multi-site.yaml\"),\n        \"name\": \"medium-multi-site\",\n        \"step_limit\": 2000,\n        \"max_score\": 190\n    },\n}\n\nAVAIL_BENCHMARKS = list(AVAIL_STATIC_BENCHMARKS.keys()) \\\n              
      + list(AVAIL_GEN_BENCHMARKS.keys())\n"
  },
  {
    "path": "nasim/scenarios/benchmark/generated.py",
    "content": "\"\"\"A collection of definitions for generated benchmark scenarios.\n\nEach generated scenario is defined by the a number of parameters that\ncontrol the size of the problem (see scenario.generator for more info):\n\nThere are also some parameters, where default values are used for all\nscenarios, see DEFAULTS dict.\n\"\"\"\n\n# generated environment constants\nDEFAULTS = dict(\n    num_exploits=None,\n    num_privescs=None,\n    r_sensitive=100,\n    r_user=100,\n    exploit_cost=1,\n    exploit_probs='mixed',\n    privesc_cost=1,\n    privesc_probs=1.0,\n    service_scan_cost=1,\n    os_scan_cost=1,\n    subnet_scan_cost=1,\n    process_scan_cost=1,\n    uniform=False,\n    alpha_H=2.0,\n    alpha_V=2.0,\n    lambda_V=1.0,\n    random_goal=False,\n    base_host_value=1,\n    host_discovery_value=1,\n    step_limit=1000,\n    address_space_bounds=None\n)\n\n# Generated Scenario definitions\nTINY_GEN = {**DEFAULTS,\n            \"name\": \"tiny-gen\",\n            \"num_hosts\": 3,\n            \"num_os\": 1,\n            \"num_services\": 1,\n            \"num_processes\": 1,\n            \"restrictiveness\": 1}\nTINY_GEN_RGOAL = {**DEFAULTS,\n                  \"name\": \"tiny-gen-rangoal\",\n                  \"num_hosts\": 3,\n                  \"num_os\": 1,\n                  \"num_services\": 1,\n                  \"num_processes\": 1,\n                  \"restrictiveness\": 1,\n                  \"random_goal\": True}\nSMALL_GEN = {**DEFAULTS,\n             \"name\": \"small-gen\",\n             \"num_hosts\": 8,\n             \"num_os\": 2,\n             \"num_services\": 3,\n             \"num_processes\": 2,\n             \"restrictiveness\": 2}\nSMALL_GEN_RGOAL = {**DEFAULTS,\n                   \"name\": \"small-gen-rangoal\",\n                   \"num_hosts\": 8,\n                   \"num_os\": 2,\n                   \"num_services\": 3,\n                   \"num_processes\": 2,\n                   \"restrictiveness\": 2,\n            
       \"random_goal\": True}\nMEDIUM_GEN = {**DEFAULTS,\n              \"name\": \"medium-gen\",\n              \"num_hosts\": 16,\n              \"num_os\": 2,\n              \"num_services\": 5,\n              \"num_processes\": 2,\n              \"restrictiveness\": 3,\n              \"step_limit\": 2000}\nLARGE_GEN = {**DEFAULTS,\n             \"name\": \"large-gen\",\n             \"num_hosts\": 23,\n             \"num_os\": 3,\n             \"num_services\": 7,\n             \"num_processes\": 3,\n             \"restrictiveness\": 3,\n             \"step_limit\": 5000}\nHUGE_GEN = {**DEFAULTS,\n            \"name\": \"huge-gen\",\n            \"num_hosts\": 38,\n            \"num_os\": 4,\n            \"num_services\": 10,\n            \"num_processes\": 4,\n            \"restrictiveness\": 3,\n            \"step_limit\": 10000}\nPOCP_1_GEN = {**DEFAULTS,\n              \"name\": \"pocp-1-gen\",\n              \"num_hosts\": 35,\n              \"num_os\": 2,\n              \"num_services\": 50,\n              \"num_exploits\": 60,\n              \"num_processes\": 2,\n              \"restrictiveness\": 5,\n              \"step_limit\": 30000}\nPOCP_2_GEN = {**DEFAULTS,\n              \"name\": \"pocp-2-gen\",\n              \"num_hosts\": 95,\n              \"num_os\": 3,\n              \"num_services\": 10,\n              \"num_exploits\": 30,\n              \"num_processes\": 3,\n              \"restrictiveness\": 5,\n              \"step_limit\": 30000}\n\n\nAVAIL_GEN_BENCHMARKS = {\n    \"tiny-gen\": TINY_GEN,\n    \"tiny-gen-rgoal\": TINY_GEN_RGOAL,\n    \"small-gen\": SMALL_GEN,\n    \"small-gen-rgoal\": SMALL_GEN_RGOAL,\n    \"medium-gen\": MEDIUM_GEN,\n    \"large-gen\": LARGE_GEN,\n    \"huge-gen\": HUGE_GEN,\n    \"pocp-1-gen\": POCP_1_GEN,\n    \"pocp-2-gen\": POCP_2_GEN\n}\n"
  },
  {
    "path": "nasim/scenarios/benchmark/medium-multi-site.yaml",
    "content": "# A WAN which has multiple 3 remote sites (subnets) connected to the main site\n# sensitive hosts:\n# 1) a server in server subnet on the main site,\n# 2) a host in user subnet in main site\n#\n# main site has 3 subnets (1 server, 1 DMZ, 1 user)\n# subnet 1 = main site DMZ (exposed, but not vulnerable) - contains 2 webservers\n# subnet 2 = main site server (not exposed) - contains 2 data servers\n# subnet 3 = main site user (not exposed) - contains 6 user hosts\n# subnet 4 = remote site 1 (exposed) - contains 2 user hosts\n# subnet 5 = remote site 2 (exposed) - contains 2 user hosts\n# subnet 6 = remote site 3 (exposed) - contains 2 user hosts\n# each remote site is connected to main site server subnet\n#\n# 16 hosts\n# 6 subnets\n# 2 OS\n# 5 services\n# 3 processes\n# 5 exploits\n# 3 priv esc\n#\n# |A| = 16 * (5 + 3 + 4) = 192\n#\n# Optimal path:\n#  (e_samba, (6, 1)) -> (subnet_scan, (6, 1)) -> (e_smtp, (2, 1)) -> (pe_schtask, (2, 1))\n#     -> (e_http, (3, 1)) -> (e_ssh, (3, 4)) -> (pe_tomcat, (3, 4))\n#  Score = 200 - (2 + 3 + 2 + 3) = 190\n#\nsubnets: [2, 2, 6, 2, 2, 2]\ntopology: [[ 1, 1, 0, 0, 1, 1, 1],    # 0 - internet\n           [ 1, 1, 1, 1, 0, 0, 0],    # 1 - MS-DMZ\n           [ 0, 1, 1, 1, 1, 1, 1],    # 2 - MS-Server\n           [ 0, 1, 1, 1, 0, 0, 0],    # 3 - MS-User\n           [ 1, 0, 1, 0, 1, 0, 0],    # 4 - RS-1\n           [ 1, 0, 1, 0, 0, 1, 0],    # 5 - RS-2\n           [ 1, 0, 1, 0, 0, 0, 1]]    # 6 - RS-3\nsensitive_hosts:\n  (2, 1): 100\n  (3, 4): 100\nos:\n  - linux\n  - windows\nservices:\n  - ssh\n  - ftp\n  - http\n  - samba\n  - smtp\nprocesses:\n  - tomcat\n  - daclsvc\n  - schtask\nexploits:\n  e_ssh:\n    service: ssh\n    os: linux\n    prob: 0.9\n    cost: 3\n    access: user\n  e_ftp:\n    service: ftp\n    os: windows\n    prob: 0.6\n    cost: 1\n    access: root\n  e_http:\n    service: http\n    os: None\n    prob: 0.9\n    cost: 2\n    access: user\n  e_samba:\n    service: samba\n    os: linux\n    prob: 
0.3\n    cost: 2\n    access: root\n  e_smtp:\n    service: smtp\n    os: windows\n    prob: 0.6\n    cost: 3\n    access: user\nprivilege_escalation:\n  pe_tomcat:\n    process: tomcat\n    os: linux\n    prob: 1.0\n    cost: 1\n    access: root\n  pe_daclsvc:\n    process: daclsvc\n    os: windows\n    prob: 1.0\n    cost: 1\n    access: root\n  pe_schtask:\n    process: schtask\n    os: windows\n    prob: 1.0\n    cost: 1\n    access: root\nservice_scan_cost: 1\nos_scan_cost: 1\nsubnet_scan_cost: 1\nprocess_scan_cost: 1\nhost_configurations:\n  (1, 0):\n    os: linux\n    services: [ssh]\n    processes: [tomcat]\n  (1, 1):\n    os: linux\n    services: [ssh]\n    processes: [tomcat]\n  (2, 0):\n    os: windows\n    services: [smtp]\n    processes: []\n  (2, 1):\n    os: windows\n    services: [smtp]\n    processes: [schtask]\n  (3, 0):\n    os: linux\n    services: [ssh]\n    processes: [tomcat]\n  (3, 1):\n    os: linux\n    services: [ssh, http]\n    processes: []\n  (3, 2):\n    os: linux\n    services: [ssh]\n    processes: []\n  (3, 3):\n    os: linux\n    services: [ssh]\n    processes: []\n  (3, 4):\n    os: linux\n    services: [ssh]\n    processes: [tomcat]\n  (3, 5):\n    os: linux\n    services: [ssh]\n    processes: []\n  (4, 0):\n    os: windows\n    services: [ftp]\n    processes: [daclsvc]\n  (4, 1):\n    os: windows\n    services: [ftp]\n    processes: [daclsvc]\n  (5, 0):\n    os: windows\n    services: [ftp]\n    processes: [daclsvc, schtask]\n  (5, 1):\n    os: windows\n    services: [ftp, http]\n    processes: []\n  (6, 0):\n    os: linux\n    services: [ssh]\n    processes: [tomcat]\n  (6, 1):\n    os: windows\n    services: [ssh, samba]\n    processes: []\nfirewall:\n  (0, 1): []\n  (1, 0): []\n  (0, 4): []\n  (4, 0): []\n  (0, 5): [http]\n  (5, 0): []\n  (0, 6): [samba]\n  (6, 0): []\n  (1, 2): []\n  (2, 1): [ssh]\n  (1, 3): []\n  (3, 1): [ssh]\n  (2, 3): [http]\n  (3, 2): [smtp]\n  (2, 4): [ftp]\n  (4, 2): [smtp]\n  (2, 5): [ftp]\n  (5, 
2): [smtp]\n  (2, 6): [ftp, ssh]\n  (6, 2): [smtp]\nstep_limit: 2000\n"
  },
  {
    "path": "nasim/scenarios/benchmark/medium-single-site.yaml",
    "content": "# A network with a single subnet that has one vulnerable host that must be compromised\n# to access other hosts behind firewall\n#\n# 1 subnet\n# 16 hosts\n# 2 OS\n# 5 services\n# 3 processes\n# 5 exploits\n# 3 priv esc\n#\n# |A| = 16 * (5 + 3 + 4) = 192\n#\n# Optimal path:\n#  (e_http, (1, 7) or (1, 15)) -> (e_smtp, (1, 3)) -> (pe_schtask, (1, 3))\n#       -> (e_ssh, (1, 8)) -> (pe_tomcat, (1, 8))\n#  Score = 200 - (2 + 3 + 1 + 3 + 1) = 190\n#\nsubnets: [16]\ntopology: [[ 1, 1],\n           [ 1, 1]]\nsensitive_hosts:\n  (1, 3): 100\n  (1, 8): 100\nos:\n  - linux\n  - windows\nservices:\n  - ssh\n  - ftp\n  - http\n  - samba\n  - smtp\nprocesses:\n  - tomcat\n  - daclsvc\n  - schtask\nexploits:\n  e_ssh:\n    service: ssh\n    os: linux\n    prob: 0.9\n    cost: 3\n    access: user\n  e_ftp:\n    service: ftp\n    os: windows\n    prob: 0.6\n    cost: 1\n    access: root\n  e_http:\n    service: http\n    os: None\n    prob: 0.9\n    cost: 2\n    access: user\n  e_samba:\n    service: samba\n    os: linux\n    prob: 0.3\n    cost: 2\n    access: root\n  e_smtp:\n    service: smtp\n    os: windows\n    prob: 0.6\n    cost: 3\n    access: user\nprivilege_escalation:\n  pe_tomcat:\n    process: tomcat\n    os: linux\n    prob: 1.0\n    cost: 1\n    access: root\n  pe_daclsvc:\n    process: daclsvc\n    os: windows\n    prob: 1.0\n    cost: 1\n    access: root\n  pe_schtask:\n    process: schtask\n    os: windows\n    prob: 1.0\n    cost: 1\n    access: root\nservice_scan_cost: 1\nos_scan_cost: 1\nsubnet_scan_cost: 1\nprocess_scan_cost: 1\nhost_configurations:\n  (1, 0):\n    os: linux\n    services: [ftp]\n    processes: [tomcat]\n  (1, 1):\n    os: linux\n    services: [ftp, ssh]\n    processes: [tomcat]\n  (1, 2):\n    os: windows\n    services: [ftp]\n    processes: [schtask]\n  (1, 3):\n    os: windows\n    services: [smtp]\n    processes: [schtask]\n  (1, 4):\n    os: windows\n    services: [ftp]\n    processes: [schtask]\n  (1, 5):\n    os: 
linux\n    services: [ftp, ssh]\n    processes: [tomcat]\n  (1, 6):\n    os: windows\n    services: [ftp]\n    processes: [daclsvc]\n  (1, 7):\n    os: windows\n    services: [http]\n    processes: []\n  (1, 8):\n    os: linux\n    services: [ssh]\n    processes: [tomcat]\n  (1, 9):\n    os: windows\n    services: [ftp]\n    processes: [schtask]\n  (1, 10):\n    os: windows\n    services: [ssh]\n    processes: []\n  (1, 11):\n    os: windows\n    services: [ftp]\n    processes: [daclsvc]\n  (1, 12):\n    os: windows\n    services: [ftp, ssh]\n    processes: []\n  (1, 13):\n    os: windows\n    services: [ftp]\n    processes: []\n  (1, 14):\n    os: windows\n    services: [ftp]\n    processes: [schtask]\n  (1, 15):\n    os: linux\n    services: [http]\n    processes: []\nfirewall:\n  (0, 1): [http]\n  (1, 0): []\nstep_limit: 2000\n"
  },
  {
    "path": "nasim/scenarios/benchmark/medium.yaml",
    "content": "# A medium standard (one public subnet) network configuration\n#\n# 16 hosts\n# 5 subnets\n# 2 OS\n# 5 services\n# 3 processes\n# 5 exploits\n# 3 priv esc\n#\n# |A| = 16 * (5 + 3 + 4) = 192\n#\n# Optimal path:\n#  (e_http, (1, 0)) -> subnet_scan -> (e_smtp, (2, 0)) -> (pe_schtask, (2, 0) -> (e_http, (3, 1))\n#      -> subnet_scan -> (e_ssh, (5, 0)) -> (e_samba, (5, 0))\n#  Score = 200 - (2+1+3+1+2+1+3+2) = 185\n#\n\nsubnets: [1, 1, 5, 5, 4]\ntopology: [[ 1, 1, 0, 0, 0, 0],\n           [ 1, 1, 1, 1, 0, 0],\n           [ 0, 1, 1, 1, 0, 0],\n           [ 0, 1, 1, 1, 1, 1],\n           [ 0, 0, 0, 1, 1, 0],\n           [ 0, 0, 0, 1, 0, 1]]\nsensitive_hosts:\n  (2, 0): 100\n  (5, 0): 100\nos:\n  - linux\n  - windows\nservices:\n  - ssh\n  - ftp\n  - http\n  - samba\n  - smtp\nprocesses:\n  - tomcat\n  - daclsvc\n  - schtask\nexploits:\n  e_ssh:\n    service: ssh\n    os: linux\n    prob: 0.9\n    cost: 3\n    access: user\n  e_ftp:\n    service: ftp\n    os: windows\n    prob: 0.6\n    cost: 1\n    access: root\n  e_http:\n    service: http\n    os: None\n    prob: 0.9\n    cost: 2\n    access: user\n  e_samba:\n    service: samba\n    os: linux\n    prob: 0.3\n    cost: 2\n    access: root\n  e_smtp:\n    service: smtp\n    os: windows\n    prob: 0.6\n    cost: 3\n    access: user\nprivilege_escalation:\n  pe_tomcat:\n    process: tomcat\n    os: linux\n    prob: 1.0\n    cost: 1\n    access: root\n  pe_daclsvc:\n    process: daclsvc\n    os: windows\n    prob: 1.0\n    cost: 1\n    access: root\n  pe_schtask:\n    process: schtask\n    os: windows\n    prob: 1.0\n    cost: 1\n    access: root\nservice_scan_cost: 1\nos_scan_cost: 1\nsubnet_scan_cost: 1\nprocess_scan_cost: 1\nhost_configurations:\n  (1, 0):\n    os: linux\n    services: [http]\n    processes: []\n  (2, 0):\n    os: windows\n    services: [smtp]\n    processes: [schtask]\n  (3, 0):\n    os: windows\n    services: [ftp]\n    processes: [schtask]\n  (3, 1):\n    os: windows\n    services: 
[ftp, http]\n    processes: [daclsvc]\n  (3, 2):\n    os: windows\n    services: [ftp]\n    processes: []\n  (3, 3):\n    os: windows\n    services: [ftp]\n    processes: [schtask]\n  (3, 4):\n    os: windows\n    services: [ftp]\n    processes: [schtask]\n  (4, 0):\n    os: linux\n    services: [ssh]\n    processes: []\n  (4, 1):\n    os: linux\n    services: [ssh]\n    processes: []\n  (4, 2):\n    os: linux\n    services: [ssh]\n    processes: []\n  (4, 3):\n    os: windows\n    services: [ssh, ftp]\n    processes: [tomcat]\n  (4, 4):\n    os: windows\n    services: [ssh, ftp]\n    processes: [tomcat]\n  (5, 0):\n    os: linux\n    services: [ssh, samba]\n    processes: []\n  (5, 1):\n    os: linux\n    services: [ssh, http]\n    processes: [tomcat]\n  (5, 2):\n    os: linux\n    services: [ssh]\n    processes: []\n  (5, 3):\n    os: linux\n    services: [ssh]\n    processes: []\nfirewall:\n  (0, 1): [http]\n  (1, 0): []\n  (1, 2): [smtp]\n  (2, 1): [ssh]\n  (1, 3): []\n  (3, 1): [ssh]\n  (2, 3): [http]\n  (3, 2): [smtp]\n  (3, 4): [ssh, ftp]\n  (4, 3): [ftp, ssh]\n  (3, 5): [ssh, ftp]\n  (5, 3): [ftp, ssh]\nstep_limit: 2000\n"
  },
  {
    "path": "nasim/scenarios/benchmark/small-honeypot.yaml",
    "content": "# A small standard (one public network) network configuration containing a\n# honeypot host (3, 2).\n#\n# 4 subnets\n# 8 hosts\n# 2 OS\n# 3 services\n# 2 processes\n# 3 exploits\n# 2 priv esc\n#\n# Optimal path:\n#  (e_http, (1, 0)) -> subnet_scan -> (e_ssh, (2, 0)) -> (pe_tomcat, (2, 0))\n#      -> (e_http, (3, 1)) -> subnet_scan -> (e_ssh, (4, 0)\n#      -> (pe_tomcat, (4, 0))\n#  Score = 200 - (2 + 1 + 3 + 1 + 2 + 1 + 3 + 1) = 186\n#\nsubnets: [1, 1, 5, 1]\ntopology: [[ 1, 1, 0, 0, 0],\n           [ 1, 1, 1, 1, 0],\n           [ 0, 1, 1, 1, 0],\n           [ 0, 1, 1, 1, 1],\n           [ 0, 0, 0, 1, 1]]\nsensitive_hosts:\n  (2, 0): 100\n  (4, 0): 100\nos:\n  - linux\n  - windows\nservices:\n  - ssh\n  - ftp\n  - http\nprocesses:\n  - tomcat\n  - daclsvc\nexploits:\n  e_ssh:\n    service: ssh\n    os: linux\n    prob: 0.9\n    cost: 3\n    access: user\n  e_ftp:\n    service: ftp\n    os: windows\n    prob: 0.6\n    cost: 1\n    access: user\n  e_http:\n    service: http\n    os: None\n    prob: 0.9\n    cost: 2\n    access: user\nprivilege_escalation:\n  pe_tomcat:\n    process: tomcat\n    os: linux\n    prob: 1.0\n    cost: 1\n    access: root\n  pe_daclsvc:\n    process: daclsvc\n    os: windows\n    prob: 1.0\n    cost: 1\n    access: root\nservice_scan_cost: 1\nos_scan_cost: 1\nsubnet_scan_cost: 1\nprocess_scan_cost: 1\nhost_configurations:\n  (1, 0):\n    os: linux\n    services: [http]\n    processes: []\n  (2, 0):\n    os: linux\n    services: [ssh, ftp]\n    processes: [tomcat]\n  (3, 0):\n    os: windows\n    services: [ftp]\n    processes: []\n  (3, 1):\n    os: windows\n    services: [ftp, http]\n    processes: [daclsvc]\n  (3, 2):\n    os: windows\n    services: [ftp, http]\n    processes: [daclsvc]\n    # This host is the honeypot so has large negative value\n    value: -100\n  (3, 3):\n    os: windows\n    services: [ftp]\n    processes: []\n  (3, 4):\n    os: windows\n    services: [ftp]\n    processes: [daclsvc]\n  (4, 0):\n    
os: linux\n    services: [ssh, ftp]\n    processes: [tomcat]\n# two rows for each connection between subnets as defined by topology,\n# one for each direction of the connection,\n# listing which services to allow\nfirewall:\n  (0, 1): [http]\n  (1, 0): []\n  (1, 2): [ssh]\n  (2, 1): [ssh]\n  (1, 3): []\n  (3, 1): [ssh]\n  (2, 3): [http]\n  (3, 2): [ftp]\n  (3, 4): [ssh, ftp]\n  (4, 3): [ftp]\nstep_limit: 1000\n"
  },
  {
    "path": "nasim/scenarios/benchmark/small-linear.yaml",
    "content": "# A small network with\n#\n# 6 subnets\n# 8 hosts\n# 2 OS\n# 3 services\n# 2 processes\n# 3 exploits\n# 2 priv esc\n#\n# - subnets organized in a linear network\n# - sensitive documents located in two middle subnets\n# - end subnets are both connected to internet\n# - two middle subnets are not connected to each other\n#\n# Optimal path:\n#  (e_http, (1, 0)) -> subnet_scan -> (e_ssh, (2, 0)) -> subnet_scan -> (e_ssh, (3, 1)) -> (e_ftp, (3, 0))\n#  (e_http, (6, 0)) -> subnet_scan -> (e_ssh, (5, 0)) -> subnet_scan -> (e_http, (4, 0)) -> (pe_daclsvc, (4, 0))\n#  Score = 200 - (2+1+3+1+3+1+2+1+3+1+1+1) = 179\n#\nsubnets: [1, 1, 2, 1, 2, 1]\ntopology: [[ 1, 1, 0, 0, 0, 0, 1],  # 0 connected to 1 and 6\n           [ 1, 1, 1, 0, 0, 0, 0],  # 1 connected to 0 and 2\n           [ 0, 1, 1, 1, 0, 0, 0],  # 2 connected to 1 and 3\n           [ 0, 0, 1, 1, 1, 0, 0],  # 3 connected to 2 and 4\n           [ 0, 0, 0, 1, 1, 1, 0],  # 4 connected to 3 and 5\n           [ 0, 0, 0, 0, 1, 1, 1],  # 5 connected to 4 and 6\n           [ 1, 0, 0, 0, 0, 1, 1]]  # 6 connected to 5 and 0\nsensitive_hosts:\n  (3, 0): 100\n  (4, 0): 100\nos:\n  - linux\n  - windows\nservices:\n  - ssh\n  - ftp\n  - http\nprocesses:\n  - tomcat\n  - daclsvc\nexploits:\n  e_ssh:\n    service: ssh\n    os: linux\n    prob: 0.9\n    cost: 3\n    access: user\n  e_ftp:\n    service: ftp\n    os: windows\n    prob: 0.6\n    cost: 1\n    access: root\n  e_http:\n    service: http\n    os: None\n    prob: 0.9\n    cost: 2\n    access: user\nprivilege_escalation:\n  pe_tomcat:\n    process: tomcat\n    os: linux\n    prob: 1.0\n    cost: 1\n    access: root\n  pe_daclsvc:\n    process: daclsvc\n    os: windows\n    prob: 1.0\n    cost: 1\n    access: root\nservice_scan_cost: 1\nos_scan_cost: 1\nsubnet_scan_cost: 1\nprocess_scan_cost: 1\nhost_configurations:\n  (1, 0):\n    os: linux\n    services: [http]\n    processes: []\n  (2, 0):\n    os: linux\n    services: [ssh, ftp]\n    processes: [tomcat]\n  
(3, 0):\n    os: windows\n    services: [ftp]\n    processes: []\n  (3, 1):\n    os: linux\n    services: [ssh]\n    processes: []\n  (4, 0):\n    os: windows\n    services: [http]\n    processes: [daclsvc]\n  (5, 0):\n    os: linux\n    services: [ftp, ssh]\n    processes: []\n  (5, 1):\n    os: windows\n    services: [ftp]\n    processes: [daclsvc]\n  (6, 0):\n    os: linux\n    services: [http]\n    processes: [tomcat]\n# two rows for each connection between subnets as defined by topology,\n# one for each direction of the connection,\n# listing which services to allow\nfirewall:\n  (0, 1): [http]\n  (1, 0): []\n  (1, 2): [ssh, ftp]\n  (2, 1): [http]\n  (2, 3): [ssh]\n  (3, 2): [ssh, ftp]\n  (3, 4): []  # no traffic permitted between middle networks\n  (4, 3): []  # no traffic permitted between middle networks\n  (4, 5): [ftp]\n  (5, 4): [ftp, http]\n  (5, 6): [http]\n  (6, 5): [ssh]\n  (6, 0): []\n  (0, 6): [http]\nstep_limit: 1000\n"
  },
  {
    "path": "nasim/scenarios/benchmark/small.yaml",
    "content": "# A small standard (one public network) network configuration\n#\n# 4 subnets\n# 8 hosts\n# 2 OS\n# 3 services\n# 2 processes\n# 3 exploits\n# 2 priv esc\n#\n# Optimal path:\n#  (e_http, (1, 0)) -> subnet_scan -> (e_ssh, (2, 0)) -> (pe_tomcat, (2, 0))\n#      -> (e_http, (3, 1)) -> subnet_scan -> (e_ssh, (4, 0)\n#      -> (pe_tomcat, (4, 0))\n#  Score = 200 - (2 + 1 + 3 + 1 + 2 + 1 + 3 + 1) = 186\n#\nsubnets: [1, 1, 5, 1]\ntopology: [[ 1, 1, 0, 0, 0],\n           [ 1, 1, 1, 1, 0],\n           [ 0, 1, 1, 1, 0],\n           [ 0, 1, 1, 1, 1],\n           [ 0, 0, 0, 1, 1]]\nsensitive_hosts:\n  (2, 0): 100\n  (4, 0): 100\nos:\n  - linux\n  - windows\nservices:\n  - ssh\n  - ftp\n  - http\nprocesses:\n  - tomcat\n  - daclsvc\nexploits:\n  e_ssh:\n    service: ssh\n    os: linux\n    prob: 0.9\n    cost: 3\n    access: user\n  e_ftp:\n    service: ftp\n    os: windows\n    prob: 0.6\n    cost: 1\n    access: user\n  e_http:\n    service: http\n    os: None\n    prob: 0.9\n    cost: 2\n    access: user\nprivilege_escalation:\n  pe_tomcat:\n    process: tomcat\n    os: linux\n    prob: 1.0\n    cost: 1\n    access: root\n  pe_daclsvc:\n    process: daclsvc\n    os: windows\n    prob: 1.0\n    cost: 1\n    access: root\nservice_scan_cost: 1\nos_scan_cost: 1\nsubnet_scan_cost: 1\nprocess_scan_cost: 1\nhost_configurations:\n  (1, 0):\n    os: linux\n    services: [http]\n    processes: []\n  (2, 0):\n    os: linux\n    services: [ssh, ftp]\n    processes: [tomcat]\n  (3, 0):\n    os: windows\n    services: [ftp]\n    processes: []\n  (3, 1):\n    os: windows\n    services: [ftp, http]\n    processes: [daclsvc]\n  (3, 2):\n    os: windows\n    services: [ftp]\n    processes: [daclsvc]\n  (3, 3):\n    os: windows\n    services: [ftp]\n    processes: []\n  (3, 4):\n    os: windows\n    services: [ftp]\n    processes: [daclsvc]\n  (4, 0):\n    os: linux\n    services: [ssh, ftp]\n    processes: [tomcat]\n# two row for each connection between subnets as defined by 
topology\n# one for each direction of connection\n# list which services to allow\nfirewall:\n  (0, 1): [http]\n  (1, 0): []\n  (1, 2): [ssh]\n  (2, 1): [ssh]\n  (1, 3): []\n  (3, 1): [ssh]\n  (2, 3): [http]\n  (3, 2): [ftp]\n  (3, 4): [ssh, ftp]\n  (4, 3): [ftp]\nstep_limit: 1000\n"
  },
  {
    "path": "nasim/scenarios/benchmark/tiny-hard.yaml",
    "content": "# A harder version of the tiny standard (one public network) network configuration\n#\n# 3 subnets\n# 3 hosts\n# 2 OS\n# 3 services\n# 2 processes\n# 3 exploits\n# 2 priv esc actions\n#\n# Optimal path:\n#  (e_http, (1, 0)) -> subnet scan -> (e_ssh, (2, 0)) -> (pe_tomcat, (2, 0)) -> (e_ftp, (3, 0))\n#  Score = 200 - (2 + 1 + 3 + 1 + 1) = 192\n#\nsubnets: [1, 1, 1]\ntopology: [[ 1, 1, 0, 0],\n           [ 1, 1, 1, 1],\n           [ 0, 1, 1, 1],\n           [ 0, 1, 1, 1]]\nsensitive_hosts:\n  (2, 0): 100\n  (3, 0): 100\nos:\n  - linux\n  - windows\nservices:\n  - ssh\n  - ftp\n  - http\nprocesses:\n  - tomcat\n  - daclsvc\nexploits:\n  e_ssh:\n    service: ssh\n    os: linux\n    prob: 0.9\n    cost: 3\n    access: user\n  e_ftp:\n    service: ftp\n    os: windows\n    prob: 0.6\n    cost: 1\n    access: root\n  e_http:\n    service: http\n    os: None\n    prob: 0.9\n    cost: 2\n    access: user\nprivilege_escalation:\n  pe_tomcat:\n    process: tomcat\n    os: linux\n    prob: 1.0\n    cost: 1\n    access: root\n  pe_daclsvc:\n    process: daclsvc\n    os: windows\n    prob: 1.0\n    cost: 1\n    access: root\nservice_scan_cost: 1\nos_scan_cost: 1\nsubnet_scan_cost: 1\nprocess_scan_cost: 1\nhost_configurations:\n  (1, 0):\n    os: linux\n    services: [http]\n    processes: []\n  (2, 0):\n    os: linux\n    services: [ssh, ftp]\n    processes: [tomcat]\n  (3, 0):\n    os: windows\n    services: [ftp]\n    processes: [daclsvc]\n# two row for each connection between subnets as defined by topology\n# one for each direction of connection\n# list which services to allow\nfirewall:\n  (0, 1): [http]\n  (1, 0): []\n  (1, 2): [ssh]\n  (2, 1): [ssh]\n  (1, 3): []\n  (3, 1): [ssh]\n  (2, 3): [ftp, ssh]\n  (3, 2): [ftp, ssh]\nstep_limit: 1000\n"
  },
  {
    "path": "nasim/scenarios/benchmark/tiny-small.yaml",
    "content": "# A tiny-small standard (one public network) network configuration\n# (Not quite tiny, not quite small)\n#\n# 4 subnets\n# 5 hosts\n# 2 OS\n# 3 services\n# 2 processes\n# 3 exploits\n# 2 priv esc actions\n#\n# Optimal path:\n#  (e_http, (1, 0)) -> subnet_scan -> (e_ssh, (2, 0)) -> (pe_tomcat, (2,0)) -> (e_http, (3, 1))\n#        -> subnet_scan -> (e_ftp, (4, 0))\n#  Score = 200 - (2 + 1 + 3 + 1 + 2 + 1 + 1) = 189\n#\nsubnets: [1, 1, 2, 1]\ntopology: [[ 1, 1, 0, 0, 0],\n           [ 1, 1, 1, 1, 0],\n           [ 0, 1, 1, 1, 0],\n           [ 0, 1, 1, 1, 1],\n           [ 0, 0, 0, 1, 1]]\nsensitive_hosts:\n  (2, 0): 100\n  (4, 0): 100\nos:\n  - linux\n  - windows\nservices:\n  - ssh\n  - ftp\n  - http\nprocesses:\n  - tomcat\n  - daclsvc\nexploits:\n  e_ssh:\n    service: ssh\n    os: linux\n    prob: 0.9\n    cost: 3\n    access: user\n  e_ftp:\n    service: ftp\n    os: windows\n    prob: 0.6\n    cost: 1\n    access: root\n  e_http:\n    service: http\n    os: None\n    prob: 0.9\n    cost: 2\n    access: user\nprivilege_escalation:\n  pe_tomcat:\n    process: tomcat\n    os: linux\n    prob: 1.0\n    cost: 1\n    access: root\n  pe_daclsvc:\n    process: daclsvc\n    os: windows\n    prob: 1.0\n    cost: 1\n    access: root\nservice_scan_cost: 1\nos_scan_cost: 1\nsubnet_scan_cost: 1\nprocess_scan_cost: 1\nhost_configurations:\n  (1, 0):\n    os: linux\n    services: [http]\n    processes: [tomcat]\n  (2, 0):\n    os: linux\n    services: [ssh, ftp]\n    processes: [tomcat]\n  (3, 0):\n    os: windows\n    services: [ftp]\n    processes: []\n  (3, 1):\n    os: windows\n    services: [ftp, http]\n    processes: [daclsvc]\n  (4, 0):\n    os: windows\n    services: [ssh, ftp]\n    processes: []\n# two row for each connection between subnets as defined by topology\n# one for each direction of connection\n# list which services to allow\nfirewall:\n  (0, 1): [http]\n  (1, 0): []\n  (1, 2): [ssh]\n  (2, 1): [ssh]\n  (1, 3): []\n  (3, 1): [ssh]\n  (2, 3): 
[http]\n  (3, 2): [ftp]\n  (3, 4): [ssh, ftp]\n  (4, 3): [ftp]\nstep_limit: 1000\n"
  },
  {
    "path": "nasim/scenarios/benchmark/tiny.yaml",
    "content": "# A tiny standard (one public network) network configuration\n#\n# 3 hosts\n# 3 subnets\n# 1 service\n# 1 process\n# 1 os\n# 1 exploit\n# 1 privilege escalation\n#\n# Optimal path:\n# (e_ssh, (1, 0)) -> subnet_scan -> (e_ssh, (3, 0)) -> (pe_tomcat, (3, 0))\n#     -> (e_ssh, (2, 0)) -> (pe_tomcat, (2, 0))\n# Score = 200 - (6*1) = 195\n#\nsubnets: [1, 1, 1]\ntopology: [[ 1, 1, 0, 0],\n           [ 1, 1, 1, 1],\n           [ 0, 1, 1, 1],\n           [ 0, 1, 1, 1]]\nsensitive_hosts:\n  (2, 0): 100\n  (3, 0): 100\nos:\n  - linux\nservices:\n  - ssh\nprocesses:\n  - tomcat\nexploits:\n  e_ssh:\n    service: ssh\n    os: linux\n    prob: 0.8\n    cost: 1\n    access: user\nprivilege_escalation:\n  pe_tomcat:\n    process: tomcat\n    os: linux\n    prob: 1.0\n    cost: 1\n    access: root\nservice_scan_cost: 1\nos_scan_cost: 1\nsubnet_scan_cost: 1\nprocess_scan_cost: 1\nhost_configurations:\n  (1, 0):\n    os: linux\n    services: [ssh]\n    processes: [tomcat]\n    # which services to deny between individual hosts\n    firewall:\n      (3, 0): [ssh]\n  (2, 0):\n    os: linux\n    services: [ssh]\n    processes: [tomcat]\n    firewall:\n      (1, 0): [ssh]\n  (3, 0):\n    os: linux\n    services: [ssh]\n    processes: [tomcat]\n# two row for each connection between subnets as defined by topology\n# one for each direction of connection\n# list which services to allow\nfirewall:\n  (0, 1): [ssh]\n  (1, 0): []\n  (1, 2): []\n  (2, 1): [ssh]\n  (1, 3): [ssh]\n  (3, 1): [ssh]\n  (2, 3): [ssh]\n  (3, 2): [ssh]\nstep_limit: 1000\n"
  },
  {
    "path": "nasim/scenarios/generator.py",
    "content": "\"\"\"This module contains functionality for generating scenarios.\n\nSpecifically, it generates network configurations and action space\nconfigurations based on number of hosts and services in network using standard\nformula.\n\"\"\"\nimport math\nimport numpy as np\n\nimport nasim.scenarios.utils as u\nfrom nasim.scenarios import Scenario\nfrom nasim.scenarios.host import Host\n\n# Constants for generating network\nUSER_SUBNET_SIZE = 5\nHOST_ASSIGNMENT_PERIOD = 40\nDMZ = 1\nSENSITIVE = 2\nUSER = 3\n\n# Number of time to attempt to find valid vulnerable config\nVUL_RETRIES = 5\n\n\nclass ScenarioGenerator:\n    \"\"\"Generates a scenario based on standard formula\n\n    For explanation of the details of how scenarios are generated see\n    :ref:`scenario_generation_explanation`.\n\n    Notes\n    -----\n\n    **Exploit Probabilities**:\n\n    Success probabilities of each exploit are determined based on the value of\n    the ``exploit_probs`` argument, as follows:\n\n    - ``exploit_probs=None`` - probabilities generated randomly from uniform\n      distribution\n    - ``exploit_probs=\"mixed\"`` - probabilities are chosen from [0.3, 0.6, 0.9]\n      with probability [0.2, 0.4, 0.4] (see :ref:`generated_exploit_probs` for\n      explanation).\n    - ``exploit_probs=float`` - probability of each exploit is set to value\n    - ``exploit_probs=list[float]`` - probability of each exploit is set to\n      corresponding value in list\n\n    For deterministic exploits set ``exploit_probs=1.0``.\n\n    **Privilege Escalation Probabilities**:\n\n    Success probabilities of each privilege escalation are determined based\n    on the value of the ``privesc_probs`` argument, and are determined the same\n    as for exploits with the exclusion of the \"mixed\" option.\n\n    **Host Configuration distribution**:\n\n    1. if ``uniform=True`` then host configurations are chosen uniformly at\n       random from set of all valid possible configurations\n    2. 
if ``uniform=False`` host configurations are chosen to be correlated\n       (see :ref:`correlated_configurations` for explanation)\n\n\n    \"\"\"\n\n    def generate(self,\n                 num_hosts,\n                 num_services,\n                 num_os=2,\n                 num_processes=2,\n                 num_exploits=None,\n                 num_privescs=None,\n                 r_sensitive=10,\n                 r_user=10,\n                 exploit_cost=1,\n                 exploit_probs=1.0,\n                 privesc_cost=1,\n                 privesc_probs=1.0,\n                 service_scan_cost=1,\n                 os_scan_cost=1,\n                 subnet_scan_cost=1,\n                 process_scan_cost=1,\n                 uniform=False,\n                 alpha_H=2.0,\n                 alpha_V=2.0,\n                 lambda_V=1.0,\n                 restrictiveness=5,\n                 random_goal=False,\n                 base_host_value=1,\n                 host_discovery_value=1,\n                 seed=None,\n                 name=None,\n                 step_limit=None,\n                 address_space_bounds=None,\n                 **kwargs):\n        \"\"\"Generate the network configuration based on standard formula.\n\n        Parameters\n        ----------\n        num_hosts : int\n            number of hosts to include in network (minimum is 3)\n        num_services : int\n            number of services running on network (minimum is 1)\n        num_os : int, optional\n            number of OS running on network (minimum is 1) (default=2)\n        num_processes : int, optional\n            number of processes running on hosts on network (minimum is 1)\n            (default=2)\n        num_exploits : int, optional\n            number of exploits to use. minimum is 1. If None will use\n            num_services (default=None)\n        num_privescs : int, optional\n            number of privilege escalation actions to use. 
minimum is 1.\n            If None will use num_processes (default=None)\n        r_sensitive : float, optional\n            reward for sensitive subnet documents (default=10)\n        r_user : float, optional\n            reward for user subnet documents (default=10)\n        exploit_cost : int or float, optional\n            cost for an exploit (default=1)\n        exploit_probs : None, float, list of floats or \"mixed\", optional\n            success probability of exploits (default=1.0)\n        privesc_cost : int or float, optional\n            cost for a privilege escalation action (default=1)\n        privesc_probs : None, float, list of floats, optional\n            success probability of privilege escalation actions (default=1.0)\n        service_scan_cost : int or float, optional\n            cost for a service scan (default=1)\n        os_scan_cost : int or float, optional\n            cost for an os scan (default=1)\n        subnet_scan_cost : int or float, optional\n            cost for a subnet scan (default=1)\n        process_scan_cost : int or float, optional\n            cost for a process scan (default=1)\n        uniform : bool, optional\n            whether to use uniform distribution or correlated host configs\n            (default=False)\n        alpha_H : float, optional\n            (only used when uniform=False) Scaling/concentration parameter for\n            controlling correlation between host configurations (must be > 0)\n            (default=2.0)\n        alpha_V : float, optional\n            (only used when uniform=False) scaling/concentration parameter for\n            controlling correlation between services across host configurations\n            (must be > 0) (default=2.0)\n        lambda_V : float, optional\n            (only used when uniform=False) parameter for controlling average\n            number of services running per host configuration (must be > 0)\n            (default=1.0)\n        restrictiveness : int, optional\n  
          max number of services allowed to pass through firewalls between\n            zones (default=5)\n        random_goal : bool, optional\n            whether to randomly assign the goal user host or not\n            (default=False)\n        base_host_value : int, optional,\n            value of non sensitive hosts (default=1)\n        host_discovery_value : int, optional\n            value of discovering a host for the first time (default=1)\n        seed : int, optional\n            random number generator seed (default=None)\n        name : str, optional\n            name of the scenario, if None one will be generated (default=None)\n        step_limit : int, optional\n            max number of steps permitted in a single episode, if None there is\n            no limit (default=None)\n        address_space_bounds : (int, int), optional\n            bounds for the (subnet#, host#) address space. If None bounds will\n            be determined by the number of subnets in the scenario and the max\n            number of hosts in any subnet.\n\n        Returns\n        -------\n        Scenario\n            scenario description\n        \"\"\"\n        assert 0 < num_services\n        assert 2 < num_hosts\n        assert 0 < num_processes\n        assert num_exploits is None or 0 < num_exploits\n        assert num_privescs is None or 0 < num_privescs\n        assert 0 < num_os\n        assert 0 < r_sensitive and 0 < r_user\n        assert 0 < alpha_H and 0 < alpha_V and 0 < lambda_V\n        assert 0 < restrictiveness\n\n        if seed is not None:\n            np.random.seed(seed)\n\n        if num_exploits is None:\n            num_exploits = num_services\n\n        if num_privescs is None:\n            num_privescs = num_processes\n\n        self._generate_subnets(num_hosts)\n        self._generate_topology()\n        self._generate_address_space_bounds(address_space_bounds)\n        self._generate_os(num_os)\n        self._generate_services(num_services)\n  
      self._generate_processes(num_processes)\n        self._generate_exploits(num_exploits, exploit_cost, exploit_probs)\n        self._generate_privescs(num_privescs, privesc_cost, privesc_probs)\n        self._generate_sensitive_hosts(r_sensitive, r_user, random_goal)\n        self.base_host_value = base_host_value\n        self.host_discovery_value = host_discovery_value\n        if uniform:\n            self._generate_uniform_hosts()\n        else:\n            self._generate_correlated_hosts(alpha_H, alpha_V, lambda_V)\n        self._ensure_host_vulnerability()\n        self._generate_firewall(restrictiveness)\n        self.service_scan_cost = service_scan_cost\n        self.os_scan_cost = os_scan_cost\n        self.subnet_scan_cost = subnet_scan_cost\n        self.process_scan_cost = process_scan_cost\n\n        if name is None:\n            name = f\"gen_H{num_hosts}_E{num_exploits}_S{num_services}\"\n        self.name = name\n\n        self.step_limit = step_limit\n\n        return self._construct_scenario()\n\n    def _construct_scenario(self):\n        scenario_dict = dict()\n        scenario_dict[u.SUBNETS] = self.subnets\n        scenario_dict[u.ADDRESS_SPACE_BOUNDS] = self.address_space_bounds\n        scenario_dict[u.TOPOLOGY] = self.topology\n        scenario_dict[u.SERVICES] = self.services\n        scenario_dict[u.PROCESSES] = self.processes\n        scenario_dict[u.OS] = self.os\n        scenario_dict[u.SENSITIVE_HOSTS] = self.sensitive_hosts\n        scenario_dict[u.EXPLOITS] = self.exploits\n        scenario_dict[u.PRIVESCS] = self.privescs\n        scenario_dict[u.SERVICE_SCAN_COST] = self.service_scan_cost\n        scenario_dict[u.OS_SCAN_COST] = self.os_scan_cost\n        scenario_dict[u.SUBNET_SCAN_COST] = self.subnet_scan_cost\n        scenario_dict[u.PROCESS_SCAN_COST] = self.process_scan_cost\n        scenario_dict[u.FIREWALL] = self.firewall\n        scenario_dict[u.HOSTS] = self.hosts\n        scenario_dict[u.STEP_LIMIT] = 
self.step_limit\n        scenario = Scenario(\n            scenario_dict, name=self.name, generated=True\n        )\n        return scenario\n\n    def _generate_subnets(self, num_hosts):\n        # Internet (0) and sensitive (2) subnets both start with 1 host\n        subnets = [1]\n        # For every HOST_ASSIGNMENT_PERIOD hosts we have:\n        # first host assigned to DMZ (1),\n        dmz_hosts = math.ceil(num_hosts / HOST_ASSIGNMENT_PERIOD)\n        subnets.append(dmz_hosts)\n\n        # second host assigned sensitive (2)\n        sensitive_hosts = math.ceil(num_hosts / (HOST_ASSIGNMENT_PERIOD+1))\n        subnets.append(sensitive_hosts)\n\n        # remainder of hosts go into user subnet tree\n        num_user_hosts = num_hosts - dmz_hosts - sensitive_hosts\n        num_full_user_subnets = num_user_hosts // USER_SUBNET_SIZE\n        subnets += [USER_SUBNET_SIZE] * num_full_user_subnets\n        if (num_user_hosts % USER_SUBNET_SIZE) != 0:\n            subnets.append(num_user_hosts % USER_SUBNET_SIZE)\n        self.subnets = subnets\n\n    def _generate_topology(self):\n        # including internet subnet\n        num_subnets = len(self.subnets)\n        topology = np.zeros((num_subnets, num_subnets))\n        # DMZ subnet is connected to sensitive and first user subnet and also\n        # to internet\n        for row in range(USER + 1):\n            for col in range(USER + 1):\n                if row == u.INTERNET and col > DMZ:\n                    continue\n                if row > DMZ and col == u.INTERNET:\n                    continue\n                topology[row][col] = 1\n        if num_subnets == USER + 1:\n            self.topology = topology\n            return\n        # all other subnets are part of user binary tree\n        for row in range(USER, num_subnets):\n            # subnet connected to itself\n            topology[row][row] = 1\n            # position in tree\n            pos = row - USER\n            if pos > 0:\n                
parent = ((pos - 1) // 2) + 3\n                topology[row][parent] = 1\n            child_left = ((2 * pos) + 1) + 3\n            child_right = ((2 * pos) + 2) + 3\n            if child_left < num_subnets:\n                topology[row][child_left] = 1\n            if child_right < num_subnets:\n                topology[row][child_right] = 1\n        self.topology = topology\n\n    def _generate_address_space_bounds(self, address_space_bounds):\n        if address_space_bounds is None:\n            address_space_bounds = (len(self.subnets), max(self.subnets))\n\n        err_msg = (\n            \"address_space_bounds must be None or a tuple/list of length 2 \"\n            f\"containing positive ints. '{address_space_bounds}' is invalid\"\n        )\n        assert isinstance(address_space_bounds, (tuple, list)), err_msg\n        address_space_bounds = tuple(address_space_bounds)\n\n        assert len(address_space_bounds) == 2, err_msg\n        for val in address_space_bounds:\n            assert isinstance(val, int) and 0 < val, err_msg\n        assert address_space_bounds[0] >= len(self.subnets), \\\n            (\"Number of subnets in address bound must be >= number of subnets\"\n             f\" in the scenario. '{address_space_bounds[0]}' is invalid\")\n        assert address_space_bounds[1] >= max(self.subnets), \\\n            (\"Number of hosts in address bound must be >= number of hosts \"\n             \"in the largest subnet in the scenario. 
\"\n             f\"'{address_space_bounds[1]}' is invalid\")\n        self.address_space_bounds = address_space_bounds\n\n    def _generate_os(self, num_os):\n        self.os = [f\"os_{i}\" for i in range(num_os)]\n\n    def _generate_services(self, num_services):\n        self.services = [f\"srv_{s}\" for s in range(num_services)]\n\n    def _generate_processes(self, num_processes):\n        self.processes = [f\"proc_{s}\" for s in range(num_processes)]\n\n    def _generate_exploits(self, num_exploits, exploit_cost, exploit_probs):\n        exploits = {}\n        exploit_probs = self._get_action_probs(num_exploits, exploit_probs)\n        # add None since some exploits might work for all OS\n        possible_os = self.os + [None]\n        # we create one exploit per service\n        exploits_added = 0\n        while exploits_added < num_exploits:\n            srv = np.random.choice(self.services)\n            os = np.random.choice(possible_os)\n            al = np.random.randint(u.USER_ACCESS, u.ROOT_ACCESS+1)\n            e_name = f\"e_{srv}\"\n            if os is not None:\n                e_name += f\"_{os}\"\n            if e_name not in exploits:\n                exploits[e_name] = {\n                    u.EXPLOIT_SERVICE: srv,\n                    u.EXPLOIT_OS: os,\n                    u.EXPLOIT_PROB: exploit_probs[exploits_added],\n                    u.EXPLOIT_COST: exploit_cost,\n                    u.EXPLOIT_ACCESS: al\n                }\n                exploits_added += 1\n        self.exploits = exploits\n\n    def _generate_privescs(self, num_privesc, privesc_cost, privesc_probs):\n        privescs = {}\n        privesc_probs = self._get_action_probs(num_privesc, privesc_probs)\n        # add None since some privesc might work for all OS\n        possible_os = self.os + [None]\n\n        # need to ensure there is a privesc for each OS,\n        # or >= 1 OS agnostic privesc\n        # This ensures we can make it possible to get ROOT access on a\n   
     # host, independent of the exploit the host is vulnerable to\n        if num_privesc < len(self.os):\n            os_choices = [None]\n            os_choices.extend(\n                list(np.random.choice(possible_os, num_privesc-1))\n            )\n        else:\n            while True:\n                os_choices = list(\n                    np.random.choice(possible_os, num_privesc)\n                )\n                if None in os_choices \\\n                   or all([os in os_choices for os in self.os]):\n                    break\n\n        # randomly assign a process to each privesc action\n        privescs_added = 0\n        while privescs_added < num_privesc:\n            proc = np.random.choice(self.processes)\n            os = os_choices[privescs_added]\n            pe_name = f\"pe_{proc}\"\n            if os is not None:\n                pe_name += f\"_{os}\"\n            if pe_name not in privescs:\n                privescs[pe_name] = {\n                    u.PRIVESC_PROCESS: proc,\n                    u.PRIVESC_OS: os,\n                    u.PRIVESC_PROB: privesc_probs[privescs_added],\n                    u.PRIVESC_COST: privesc_cost,\n                    u.PRIVESC_ACCESS: u.ROOT_ACCESS\n                }\n                privescs_added += 1\n        self.privescs = privescs\n\n    def _get_action_probs(self, num_actions, action_probs):\n        if action_probs is None:\n            action_probs = np.random.random_sample(num_actions)\n        elif action_probs == 'mixed':\n            # success probability of low, med, high attack complexity\n            if num_actions == 1:\n                # when there is only 1 action, ignore the low probability level\n                # since it could lead to unnecessarily long attack paths\n                levels = [0.6, 0.9]\n                probs = [0.5, 0.5]\n            else:\n                levels = [0.3, 0.6, 0.9]\n                probs = [0.2, 0.4, 0.4]\n            action_probs = np.random.choice(levels, 
num_actions, p=probs)\n        elif type(action_probs) is list:\n            assert len(action_probs) == num_actions, \\\n                (\"Length of action probability list must equal number of\"\n                 \" actions\")\n            for a in action_probs:\n                assert 0.0 < a <= 1.0, \\\n                    \"Action probabilities in list must be in (0.0, 1.0]\"\n        else:\n            assert isinstance(action_probs, float), \\\n                (\"Action probabilities must be float, list of floats or \"\n                 \"'mixed' (exploit only)\")\n            assert 0.0 < action_probs <= 1.0, \\\n                \"Action probability float must be in (0.0, 1.0]\"\n            action_probs = [action_probs] * num_actions\n\n        return action_probs\n\n    def _generate_sensitive_hosts(self, r_sensitive, r_user, random_goal):\n        sensitive_hosts = {}\n        # first sensitive host is first host in SENSITIVE network\n        sensitive_hosts[(SENSITIVE, 0)] = r_sensitive\n\n        # second sensitive host in USER network\n        if random_goal and len(self.subnets) > SENSITIVE:\n            # randomly choose user host to be goal\n            subnet_id = np.random.randint(USER, len(self.subnets))\n            host_id = np.random.randint(0, self.subnets[subnet_id])\n            sensitive_hosts[(subnet_id, host_id)] = r_user\n        else:\n            # otherwise the last host in the last USER subnet is the goal\n            sensitive_hosts[(len(self.subnets)-1, self.subnets[-1]-1)] = r_user\n        self.sensitive_hosts = sensitive_hosts\n\n    def _generate_uniform_hosts(self):\n        hosts = dict()\n        srv_config_set, proc_config_set = self._possible_host_configs()\n        num_srv_configs = len(srv_config_set)\n        num_proc_configs = len(proc_config_set)\n\n        for subnet, size in enumerate(self.subnets):\n            if subnet == u.INTERNET:\n                continue\n            for h in range(size):\n                srv_cfg = 
srv_config_set[np.random.choice(num_srv_configs)]\n                srv_cfg = self._convert_to_service_map(srv_cfg)\n\n                proc_cfg = proc_config_set[np.random.choice(num_proc_configs)]\n                proc_cfg = self._convert_to_process_map(proc_cfg)\n\n                os = np.random.choice(self.os)\n                os_cfg = self._convert_to_os_map(os)\n\n                address = (subnet, h)\n                value = self._get_host_value(address)\n                host = Host(\n                    address=address,\n                    os=os_cfg.copy(),\n                    services=srv_cfg.copy(),\n                    processes=proc_cfg.copy(),\n                    firewall={},\n                    value=value,\n                    discovery_value=self.host_discovery_value\n                )\n                hosts[address] = host\n        self.hosts = hosts\n\n    def _possible_host_configs(self):\n        \"\"\"Generate set of all possible host service and process configurations\n        based on number of services and processes in environment.\n\n        Note: Each host is vulnerable to at least one exploit and one privesc,\n        so there is no configuration where all services and processes are\n        absent.\n\n        Returns\n        -------\n        list[list]\n            all possible service configurations, where each configuration is\n            a list of bools corresponding to the presence or absence of a\n            service\n        list[list]\n            all possible process configurations, same as above except for\n            processes\n        \"\"\"\n        # remove last permutation which is all False\n        srv_configs = self._permutations(len(self.services))[:-1]\n        proc_configs = self._permutations(len(self.processes))[:-1]\n        return srv_configs, proc_configs\n\n    def _permutations(self, n):\n        \"\"\"Generate list of all possible permutations of n bools\n\n        N.B First permutation in list is always 
the all True permutation\n        and final permutation in list is always the all False permutation.\n\n        perms[0] = [True, ..., True]\n        perms[-1] = [False, ..., False]\n\n        Parameters\n        ----------\n        n : int\n            bool list length\n\n        Returns\n        -------\n        perms : list[list]\n            all possible permutations of n bools\n        \"\"\"\n        # base cases\n        if n <= 0:\n            return []\n        if n == 1:\n            return [[True], [False]]\n\n        perms = []\n        for p in self._permutations(n - 1):\n            perms.append([True] + p)\n            perms.append([False] + p)\n        return perms\n\n    def _generate_correlated_hosts(self, alpha_H, alpha_V, lambda_V):\n        hosts = dict()\n        prev_configs = []\n        prev_os = []\n        prev_srvs = []\n        prev_procs = []\n        host_num = 0\n        for subnet, size in enumerate(self.subnets):\n            if subnet == u.INTERNET:\n                continue\n            for m in range(size):\n                os, services, processes = self._get_host_config(\n                    host_num,\n                    alpha_H,\n                    prev_configs,\n                    alpha_V,\n                    lambda_V,\n                    prev_os,\n                    prev_srvs,\n                    prev_procs\n                )\n                os_cfg = self._convert_to_os_map(os)\n                service_cfg = self._convert_to_service_map(services)\n                process_cfg = self._convert_to_process_map(processes)\n                host_num += 1\n                address = (subnet, m)\n                value = self._get_host_value(address)\n                host = Host(\n                    address=address,\n                    os=os_cfg.copy(),\n                    services=service_cfg.copy(),\n                    processes=process_cfg.copy(),\n                    firewall={},\n                    value=value,\n   
                 discovery_value=self.host_discovery_value\n                )\n                hosts[address] = host\n        self.hosts = hosts\n\n    def _get_host_config(self,\n                         host_num,\n                         alpha_H,\n                         prev_configs,\n                         alpha_V,\n                         lambda_V,\n                         prev_os,\n                         prev_srvs,\n                         prev_procs):\n        \"\"\"Select a host configuration from all possible configurations\n        using a Nested Dirichlet Process\n        \"\"\"\n        if host_num == 0 \\\n           or np.random.rand() < (alpha_H / (alpha_H + host_num - 1)):\n            # if first host or with prob proportional to alpha_H\n            # choose new config\n            new_config = self._sample_config(\n                alpha_V, prev_srvs, lambda_V, prev_os, prev_procs\n            )\n        else:\n            # sample uniformly from previously sampled configs\n            new_config = prev_configs[np.random.choice(len(prev_configs))]\n        prev_configs.append(new_config)\n        return new_config\n\n    def _sample_config(self,\n                       alpha_V,\n                       prev_srvs,\n                       lambda_V,\n                       prev_os,\n                       prev_procs):\n        \"\"\"Sample a host configuration from all possible configurations\n        using a Dirichlet Process\n        \"\"\"\n        os = self._dirichlet_sample(\n            alpha_V, self.os, prev_os\n        )\n\n        new_services_cfg = self._dirichlet_process(\n            alpha_V, lambda_V, len(self.services), prev_srvs\n        )\n\n        new_process_cfg = self._dirichlet_process(\n            alpha_V, lambda_V, len(self.processes), prev_procs\n        )\n\n        return os, new_services_cfg, new_process_cfg\n\n    def _dirichlet_process(self,\n                           alpha_V,\n                          
 lambda_V,\n                           num_options,\n                           prev_vals):\n        \"\"\"Sample from all possible configurations using a Dirichlet Process \"\"\"\n        # no options present by default\n        new_cfg = [False for i in range(num_options)]\n\n        # randomly get number of times to sample using Poisson dist with\n        # minimum 1 option choice\n        n = max(np.random.poisson(lambda_V), 1)\n\n        # draw n samples from Dirichlet Process\n        # (alpha_V, uniform dist of services)\n        for i in range(n):\n            if i == 0 or np.random.rand() < (alpha_V / (alpha_V + i - 1)):\n                # draw randomly from uniform dist over services\n                x = np.random.randint(0, num_options)\n            else:\n                # draw uniformly at random from previous choices\n                x = np.random.choice(prev_vals)\n            new_cfg[x] = True\n            prev_vals.append(x)\n        return new_cfg\n\n    def _dirichlet_sample(self, alpha_V, choices, prev_vals):\n        \"\"\"Sample single choice using a Dirichlet Process \"\"\"\n        # sample an os from Dirichlet Process (alpha_V, uniform dist of OSs)\n        if len(prev_vals) == 0 \\\n           or np.random.rand() < (alpha_V / (alpha_V + len(prev_vals) - 1)):\n            # draw randomly from uniform dist over choices\n            choice = np.random.choice(choices)\n        else:\n            # draw uniformly at random from previous choices\n            choice = np.random.choice(prev_vals)\n        prev_vals.append(choice)\n        return choice\n\n    def _is_sensitive_host(self, addr):\n        return addr in self.sensitive_hosts\n\n    def _convert_to_service_map(self, config):\n        \"\"\"Converts list of bools to a map from service name -> bool \"\"\"\n        service_map = {}\n        for srv, val in zip(self.services, config):\n            service_map[srv] = val\n        return service_map\n\n    def _convert_to_process_map(self, config):\n        \"\"\"Converts list of bools to a map from process name -> bool \"\"\"\n        process_map = {}\n        for proc, val in zip(self.processes, config):\n            process_map[proc] = val\n        return process_map\n\n    def _convert_to_os_map(self, os):\n        \"\"\"Converts an OS string to a map from os name -> bool\n\n        N.B. also adds an entry for None os, which makes it easier for\n        vectorizing and checking if an exploit will work (since exploits can\n        have os=None)\n        \"\"\"\n        os_map = {}\n        for os_name in self.os:\n            os_map[os_name] = os_name == os\n        return os_map\n\n    def _ensure_host_vulnerability(self):\n        \"\"\"Ensures each subnet has at least one vulnerable host and all\n        sensitive hosts are vulnerable\n        \"\"\"\n        vulnerable_subnets = set()\n        for host_addr, host in self.hosts.items():\n            if not self._is_sensitive_host(host_addr) \\\n               and host_addr[0] in vulnerable_subnets:\n                continue\n\n            if self._is_sensitive_host(host_addr):\n                if not self._host_is_vulnerable(host, u.ROOT_ACCESS):\n                    self._update_host_to_vulnerable(host, u.ROOT_ACCESS)\n                vulnerable_subnets.add(host_addr[0])\n            elif self._host_is_vulnerable(host):\n                vulnerable_subnets.add(host_addr[0])\n\n        for subnet, size in enumerate(self.subnets):\n            if subnet in vulnerable_subnets or subnet == u.INTERNET:\n                continue\n            host_num = np.random.randint(size)\n            host = self.hosts[(subnet, host_num)]\n            self._update_host_to_vulnerable(host)\n            vulnerable_subnets.add(subnet)\n\n    def _host_is_vulnerable(self, host, access_level=u.USER_ACCESS):\n        for e_def in self.exploits.values():\n            if self._host_is_vulnerable_to_exploit(host, e_def):\n                if e_def[u.EXPLOIT_ACCESS] >= access_level:\n                    return True\n                for pe_def in self.privescs.values():\n                    if self._host_is_vulnerable_to_privesc(host, pe_def):\n                        return True\n        return False\n\n    def _host_is_vulnerable_to_exploit(self, host, exploit_def):\n        e_srv = exploit_def[u.EXPLOIT_SERVICE]\n        e_os = exploit_def[u.EXPLOIT_OS]\n        if not host.services[e_srv]:\n            return False\n        return e_os is None or host.os[e_os]\n\n    def _host_is_vulnerable_to_privesc(self, host, privesc_def):\n        pe_proc = privesc_def[u.PRIVESC_PROCESS]\n        pe_os = privesc_def[u.PRIVESC_OS]\n        if not host.processes[pe_proc]:\n            return False\n        return pe_os is None or host.os[pe_os]\n\n    def _update_host_to_vulnerable(self, host, access_level=u.USER_ACCESS):\n        \"\"\"Update host config so it's vulnerable to at least one exploit \"\"\"\n        # choose an exploit randomly and make host vulnerable to it\n        # will retry X times before giving up\n        # If vulnerable config is not found in X tries then the scenario\n        # probably needs more options (processes, privesc actions)\n        for i in range(VUL_RETRIES):\n            success, e_def = self._update_host_exploit_vulnerability(\n                host, False\n            )\n            # don't need to check success since should always succeed\n            # in finding exploit, when there is no constraint on OS\n            if e_def[u.EXPLOIT_ACCESS] >= access_level:\n                return\n            # Need to ensure host is now vulnerable to >= 1 privesc action\n            success, pe_def = self._update_host_privesc_vulnerability(\n                host, True\n            )\n            if success:\n                return\n\n        raise AssertionError(\n            f\"After {VUL_RETRIES} attempts, unable to find a privilege\"\n            \" escalation action for target OS, when looking for\"\n            \" vulnerable host configuration,\"\n            
\" try again using more privilege escalation actions or processes\"\n        )\n\n    def _update_host_exploit_vulnerability(self, host, os_constraint):\n        # choose an exploit randomly and make host vulnerable to it\n        if not os_constraint:\n            # can change host OS, so all exploits valid\n            valid_e = list(self.exploits.values())\n        else:\n            # exploits must match OS of host, or be OS agnostic\n            # since cannot change host OS\n            valid_e = []\n            for e_def in self.exploits.values():\n                e_os = e_def[u.EXPLOIT_OS]\n                if e_os is None or host.os[e_os]:\n                    valid_e.append(e_def)\n\n            if len(valid_e) == 0:\n                return False, None\n\n        e_def = np.random.choice(valid_e)\n        host.services[e_def[u.EXPLOIT_SERVICE]] = True\n        if e_def[u.EXPLOIT_OS] is not None and not os_constraint:\n            self._update_host_os(host, e_def[u.EXPLOIT_OS])\n\n        return True, e_def\n\n    def _update_host_privesc_vulnerability(self, host, os_constraint):\n        # choose an exploit randomly and make host vulnerable to it\n        if not os_constraint:\n            # no OS constraint\n            valid_pe = list(self.privescs.values())\n        else:\n            valid_pe = []\n            for pe_def in self.privescs.values():\n                pe_os = pe_def[u.PRIVESC_OS]\n                if pe_os is None or host.os[pe_os]:\n                    valid_pe.append(pe_def)\n\n            if len(valid_pe) == 0:\n                return False, None\n\n        pe_def = np.random.choice(valid_pe)\n        host.processes[pe_def[u.PRIVESC_PROCESS]] = True\n        if pe_def[u.PRIVESC_OS] is not None and not os_constraint:\n            self._update_host_os(host, pe_def[u.PRIVESC_OS])\n\n        return True, pe_def\n\n    def _update_host_os(self, host, os):\n        # must set all to false first, so only one host OS is true\n        for os_name 
in host.os.keys():\n            host.os[os_name] = False\n        host.os[os] = True\n\n    def _get_host_value(self, address):\n        return float(self.sensitive_hosts.get(address, self.base_host_value))\n\n    def _generate_firewall(self, restrictiveness):\n        \"\"\"Generate the firewall rules.\n\n        Parameters\n        ----------\n        restrictiveness : int\n            parameter that controls how many services are blocked by\n            firewall between zones (i.e. between internet, DMZ, sensitive\n            and user zones).\n\n        Returns\n        -------\n        dict\n            firewall rules that are a mapping from (src, dest) connection to\n            set of allowed services, which defines for each service whether\n            traffic using that service is allowed between pairs of subnets.\n\n        Notes\n        -----\n        Traffic from at least one service running on each subnet will be\n        allowed between each zone. This may mean more services will be allowed\n        than restrictiveness parameter.\n        \"\"\"\n        num_subnets = len(self.subnets)\n        firewall = {}\n\n        # find services running on each subnet that are vulnerable\n        subnet_services = {}\n        subnet_services[u.INTERNET] = set()\n        for host_addr, host in self.hosts.items():\n            subnet = host_addr[0]\n            if subnet not in subnet_services:\n                subnet_services[subnet] = set()\n            for e_def in self.exploits.values():\n                if self._host_is_vulnerable_to_exploit(host, e_def):\n                    subnet_services[subnet].add(e_def[u.EXPLOIT_SERVICE])\n\n        for src in range(num_subnets):\n            for dest in range(num_subnets):\n                if src == dest or not self.topology[src][dest]:\n                    # no inter subnet connection so no firewall\n                    continue\n                elif src > SENSITIVE and dest > SENSITIVE:\n                    # all 
services allowed between user subnets\n                    allowed = set(self.services)\n                    firewall[(src, dest)] = allowed\n                    continue\n                # else src and dest in different zones => block services based\n                # on restrictiveness\n                dest_avail = subnet_services[dest].copy()\n                if len(dest_avail) < restrictiveness:\n                    # restrictiveness not limiting allowed traffic, all\n                    # services allowed\n                    firewall[(src, dest)] = dest_avail.copy()\n                    continue\n                # add at least one service to allowed services\n                dest_allowed = np.random.choice(list(dest_avail))\n                # for dest subnet choose available services up to\n                # restrictiveness limit or all services\n                dest_avail.remove(dest_allowed)\n                allowed = set()\n                allowed.add(dest_allowed)\n                while len(allowed) < restrictiveness:\n                    dest_allowed = np.random.choice(list(dest_avail))\n                    if dest_allowed not in allowed:\n                        allowed.add(dest_allowed)\n                        dest_avail.remove(dest_allowed)\n                firewall[(src, dest)] = allowed\n        self.firewall = firewall\n"
  },
  {
    "path": "nasim/scenarios/host.py",
"content": "\nclass Host:\n    \"\"\"A single host in the network.\n\n    Note this class is mainly used to store initial scenario data for a host.\n    The HostVector class is used to store and track the current state of a\n    host (for efficiency and ease of use reasons).\n    \"\"\"\n\n    def __init__(self,\n                 address,\n                 os,\n                 services,\n                 processes,\n                 firewall,\n                 value=0.0,\n                 discovery_value=0.0,\n                 compromised=False,\n                 reachable=False,\n                 discovered=False,\n                 access=0):\n        \"\"\"\n        Arguments\n        ---------\n        address : (int, int)\n            address of host as (subnet, id)\n        os : dict\n            an os_name: bool dictionary indicating which OS the host is running\n        services : dict\n            a (service_name, bool) dictionary indicating which services\n            are present/absent\n        processes : dict\n            a (process_name, bool) dictionary indicating which processes are\n            running on host or not\n        firewall : dict\n            an (addr, denied services) dictionary defining which services are\n            blocked from other hosts in the network. 
If the other host is not in\n            the firewall, all services are assumed allowed\n        value : float, optional\n            value of the host (default=0.0)\n        discovery_value : float, optional\n            the reward gained for discovering the host (default=0.0)\n        compromised : bool, optional\n            whether host has been compromised or not (default=False)\n        reachable : bool, optional\n            whether host is reachable by attacker or not (default=False)\n        discovered : bool, optional\n            whether host has been discovered by attacker or not\n            (default=False)\n        access : int, optional\n            access level of attacker on host (default=0)\n        \"\"\"\n        self.address = address\n        self.os = os\n        self.services = services\n        self.processes = processes\n        self.firewall = firewall\n        self.value = value\n        self.discovery_value = discovery_value\n        self.compromised = compromised\n        self.reachable = reachable\n        self.discovered = discovered\n        self.access = access\n\n    def is_running_service(self, service):\n        return self.services[service]\n\n    def is_running_os(self, os):\n        return self.os[os]\n\n    def is_running_process(self, process):\n        return self.processes[process]\n\n    def traffic_permitted(self, addr, service):\n        return service not in self.firewall.get(addr, [])\n\n    def __str__(self):\n        output = [\"Host: {\"]\n        output.append(f\"\\taddress: {self.address}\")\n        output.append(f\"\\tcompromised: {self.compromised}\")\n        output.append(f\"\\treachable: {self.reachable}\")\n        output.append(f\"\\tvalue: {self.value}\")\n        output.append(f\"\\taccess: {self.access}\")\n\n        output.append(\"\\tOS: {\")\n        for os_name, val in self.os.items():\n            output.append(f\"\\t\\t{os_name}: {val}\")\n        output.append(\"\\t}\")\n\n        
output.append(\"\\tservices: {\")\n        for name, val in self.services.items():\n            output.append(f\"\\t\\t{name}: {val}\")\n        output.append(\"\\t}\")\n\n        output.append(\"\\tprocesses: {\")\n        for name, val in self.processes.items():\n            output.append(f\"\\t\\t{name}: {val}\")\n        output.append(\"\\t}\")\n\n        output.append(\"\\tfirewall: {\")\n        for addr, val in self.firewall.items():\n            output.append(f\"\\t\\t{addr}: {val}\")\n        output.append(\"\\t}\")\n        return \"\\n\".join(output)\n\n    def __repr__(self):\n        return f\"Host: {self.address}\"\n"
  },
  {
    "path": "nasim/scenarios/loader.py",
"content": "\"\"\"This module contains functionality for loading network scenarios from yaml\nfiles.\n\"\"\"\nimport math\n\nimport nasim.scenarios.utils as u\nfrom nasim.scenarios import Scenario\nfrom nasim.scenarios.host import Host\n\n\n# dictionary of valid key names and value types for config file\nVALID_CONFIG_KEYS = {\n    u.SUBNETS: list,\n    u.TOPOLOGY: list,\n    u.SENSITIVE_HOSTS: dict,\n    u.OS: list,\n    u.SERVICES: list,\n    u.PROCESSES: list,\n    u.EXPLOITS: dict,\n    u.PRIVESCS: dict,\n    u.SERVICE_SCAN_COST: (int, float),\n    u.SUBNET_SCAN_COST: (int, float),\n    u.OS_SCAN_COST: (int, float),\n    u.PROCESS_SCAN_COST: (int, float),\n    u.HOST_CONFIGS: dict,\n    u.FIREWALL: dict\n}\n\nOPTIONAL_CONFIG_KEYS = {u.STEP_LIMIT: int}\n\nVALID_ACCESS_VALUES = [\"user\", \"root\", u.USER_ACCESS, u.ROOT_ACCESS]\nACCESS_LEVEL_MAP = {\n    \"user\": u.USER_ACCESS,\n    \"root\": u.ROOT_ACCESS\n}\n\n\n# required keys for exploits\nEXPLOIT_KEYS = {\n    u.EXPLOIT_SERVICE: str,\n    u.EXPLOIT_OS: str,\n    u.EXPLOIT_PROB: (int, float),\n    u.EXPLOIT_COST: (int, float),\n    u.EXPLOIT_ACCESS: (str, int)\n}\n\n# required keys for privesc actions\nPRIVESC_KEYS = {\n    u.PRIVESC_OS: str,\n    u.PRIVESC_PROCESS: str,\n    u.PRIVESC_PROB: (int, float),\n    u.PRIVESC_COST: (int, float),\n    u.PRIVESC_ACCESS: (str, int)\n}\n\n# required keys for host configs\nHOST_CONFIG_KEYS = {\n    u.HOST_OS: (str, type(None)),\n    u.HOST_SERVICES: list,\n    u.HOST_PROCESSES: list\n}\n\n\nclass ScenarioLoader:\n\n    def load(self, file_path, name=None):\n        \"\"\"Load the scenario from file\n\n        Arguments\n        ---------\n        file_path : str\n            path to scenario file\n        name : str, optional\n            the scenario's name, if None name will be generated from file path\n            (default=None)\n\n        Returns\n        -------\n        scenario_dict : dict\n            dictionary with scenario definition\n\n        Raises\n        
------\n        Exception\n            If the file cannot be loaded or the scenario file is invalid.\n        \"\"\"\n        self.yaml_dict = u.load_yaml(file_path)\n        if name is None:\n            name = u.get_file_name(file_path)\n        self.name = name\n        self._check_scenario_sections_valid()\n\n        self._parse_subnets()\n        self._parse_topology()\n        self._parse_os()\n        self._parse_services()\n        self._parse_processes()\n        self._parse_sensitive_hosts()\n        self._parse_exploits()\n        self._parse_privescs()\n        self._parse_scan_costs()\n        self._parse_host_configs()\n        self._parse_firewall()\n        self._parse_hosts()\n        self._parse_step_limit()\n        return self._construct_scenario()\n\n    def _construct_scenario(self):\n        scenario_dict = dict()\n        scenario_dict[u.SUBNETS] = self.subnets\n        scenario_dict[u.TOPOLOGY] = self.topology\n        scenario_dict[u.OS] = self.os\n        scenario_dict[u.SERVICES] = self.services\n        scenario_dict[u.PROCESSES] = self.processes\n        scenario_dict[u.SENSITIVE_HOSTS] = self.sensitive_hosts\n        scenario_dict[u.EXPLOITS] = self.exploits\n        scenario_dict[u.PRIVESCS] = self.privescs\n        scenario_dict[u.OS_SCAN_COST] = self.os_scan_cost\n        scenario_dict[u.SERVICE_SCAN_COST] = self.service_scan_cost\n        scenario_dict[u.SUBNET_SCAN_COST] = self.subnet_scan_cost\n        scenario_dict[u.PROCESS_SCAN_COST] = self.process_scan_cost\n        scenario_dict[u.FIREWALL] = self.firewall\n        scenario_dict[u.HOSTS] = self.hosts\n        scenario_dict[u.STEP_LIMIT] = self.step_limit\n        return Scenario(\n            scenario_dict, name=self.name, generated=False\n        )\n\n    def _check_scenario_sections_valid(self):\n        \"\"\"Checks if scenario dictionary contains all required sections and\n        that they are of valid type.\n        \"\"\"\n        # 0. 
check correct number of keys\n        assert len(self.yaml_dict) >= len(VALID_CONFIG_KEYS), \\\n            (f\"Too few config file keys: {len(self.yaml_dict)} \"\n             f\"< {len(VALID_CONFIG_KEYS)}\")\n\n        # 1. check keys are valid and values are correct type\n        for k, v in self.yaml_dict.items():\n            assert k in VALID_CONFIG_KEYS or k in OPTIONAL_CONFIG_KEYS, \\\n                f\"{k} not a valid config file key\"\n\n            if k in VALID_CONFIG_KEYS:\n                expected_type = VALID_CONFIG_KEYS[k]\n            else:\n                expected_type = OPTIONAL_CONFIG_KEYS[k]\n\n            assert isinstance(v, expected_type), \\\n                (f\"{v} invalid type for config file key '{k}': {type(v)}\"\n                 f\" != {expected_type}\")\n\n    def _parse_subnets(self):\n        subnets = self.yaml_dict[u.SUBNETS]\n        self._validate_subnets(subnets)\n        # insert internet subnet\n        subnets.insert(0, 1)\n        self.subnets = subnets\n        self.num_hosts = sum(subnets)-1\n\n    def _validate_subnets(self, subnets):\n        # check subnets is valid list of positive ints\n        assert len(subnets) > 0, \"Subnets cannot be empty list\"\n        for subnet_size in subnets:\n            assert type(subnet_size) is int and subnet_size > 0, \\\n                f\"{subnet_size} invalid subnet size, must be positive int\"\n\n    def _parse_topology(self):\n        topology = self.yaml_dict[u.TOPOLOGY]\n        self._validate_topology(topology)\n        self.topology = topology\n\n    def _validate_topology(self, topology):\n        # check topology is valid adjacency matrix\n        assert len(topology) == len(self.subnets), \\\n            (\"Number of rows in topology adjacency matrix must equal \"\n             f\"number of subnets: {len(topology)} != {len(self.subnets)}\")\n\n        for row in topology:\n            assert isinstance(row, list), \\\n                \"topology must be 2D adjacency 
matrix (i.e. list of lists)\"\n            assert len(row) == len(self.subnets), \\\n                (\"Number of columns in topology matrix must equal number of\"\n                 f\" subnets: {len(row)} != {len(self.subnets)}\")\n            for col in row:\n                assert isinstance(col, int) and (col == 1 or col == 0), \\\n                    (\"Subnet_connections adjacency matrix must contain only\"\n                     f\" 1 (connected) or 0 (not connected): {col} invalid\")\n\n    def _parse_os(self):\n        os = self.yaml_dict[u.OS]\n        self._validate_os(os)\n        self.os = os\n\n    def _validate_os(self, os):\n        assert len(os) > 0, \\\n            f\"{len(os)}. Invalid number of OSs, must be >= 1\"\n        assert len(os) == len(set(os)), \\\n            f\"{os}. OSs must not contain duplicates\"\n\n    def _parse_services(self):\n        services = self.yaml_dict[u.SERVICES]\n        self._validate_services(services)\n        self.services = services\n\n    def _validate_services(self, services):\n        assert len(services) > 0, \\\n            f\"{len(services)}. Invalid number of services, must be > 0\"\n        assert len(services) == len(set(services)), \\\n            f\"{services}. Services must not contain duplicates\"\n\n    def _parse_processes(self):\n        processes = self.yaml_dict[u.PROCESSES]\n        self._validate_processes(processes)\n        self.processes = processes\n\n    def _validate_processes(self, processes):\n        assert len(processes) >= 1, \\\n            f\"{len(processes)}. Invalid number of processes, must be > 0\"\n        assert len(processes) == len(set(processes)), \\\n            f\"{processes}. 
Processes must not contain duplicates\"\n\n    def _parse_sensitive_hosts(self):\n        sensitive_hosts = self.yaml_dict[u.SENSITIVE_HOSTS]\n        self._validate_sensitive_hosts(sensitive_hosts)\n\n        self.sensitive_hosts = dict()\n        for address, value in sensitive_hosts.items():\n            self.sensitive_hosts[eval(address)] = value\n\n    def _validate_sensitive_hosts(self, sensitive_hosts):\n        # check sensitive_hosts is valid dict of (subnet, id) : value\n        assert len(sensitive_hosts) > 0, \\\n            (\"Number of sensitive hosts must be >= 1: \"\n             f\"{len(sensitive_hosts)} not >= 1\")\n\n        assert len(sensitive_hosts) <= self.num_hosts, \\\n            (\"Number of sensitive hosts must be <= total number of \"\n             f\"hosts: {len(sensitive_hosts)} not <= {self.num_hosts}\")\n\n        # sensitive hosts must be valid addresses\n        for address, value in sensitive_hosts.items():\n            subnet_id, host_id = eval(address)\n            assert self._is_valid_subnet_ID(subnet_id), \\\n                (\"Invalid sensitive host tuple: subnet_id must be a valid\"\n                 f\" subnet: {subnet_id} != non-negative int less than \"\n                 f\"{len(self.subnets) + 1}\")\n\n            assert self._is_valid_host_address(subnet_id, host_id), \\\n                (\"Invalid sensitive host tuple: host_id must be a valid\"\n                 f\" int: {host_id} != non-negative int less than\"\n                 f\" {self.subnets[subnet_id]}\")\n\n            assert isinstance(value, (float, int)) and value > 0, \\\n                (f\"Invalid sensitive host tuple: invalid value: {value}\"\n                 f\" != a positive int or float\")\n\n        # sensitive hosts must not contain duplicate addresses\n        for i, m in enumerate(sensitive_hosts.keys()):\n            h1_addr = eval(m)\n            for j, n in enumerate(sensitive_hosts.keys()):\n                if i == j:\n                    
continue\n                h2_addr = eval(n)\n                assert h1_addr != h2_addr, \\\n                    (\"Sensitive hosts list must not contain duplicate host \"\n                     f\"addresses: {m} == {n}\")\n\n    def _is_valid_subnet_ID(self, subnet_ID):\n        if type(subnet_ID) is not int \\\n           or subnet_ID < 1 \\\n           or subnet_ID > len(self.subnets):\n            return False\n        return True\n\n    def _is_valid_host_address(self, subnet_ID, host_ID):\n        if not self._is_valid_subnet_ID(subnet_ID):\n            return False\n        if type(host_ID) is not int \\\n           or host_ID < 0 \\\n           or host_ID >= self.subnets[subnet_ID]:\n            return False\n        return True\n\n    def _parse_exploits(self):\n        exploits = self.yaml_dict[u.EXPLOITS]\n        self._validate_exploits(exploits)\n        self.exploits = exploits\n\n    def _validate_exploits(self, exploits):\n        for e_name, e in exploits.items():\n            self._validate_single_exploit(e_name, e)\n\n    def _validate_single_exploit(self, e_name, e):\n        assert isinstance(e, dict), \\\n            f\"{e_name}. Exploit must be a dict.\"\n\n        for k, t in EXPLOIT_KEYS.items():\n            assert k in e, f\"{e_name}. Exploit missing key: '{k}'\"\n            assert isinstance(e[k], t), \\\n                f\"{e_name}. Exploit '{k}' incorrect type. Expected {t}\"\n\n        assert e[u.EXPLOIT_SERVICE] in self.services, \\\n            (f\"{e_name}. Exploit target service invalid: \"\n             f\"'{e[u.EXPLOIT_SERVICE]}'\")\n\n        if str(e[u.EXPLOIT_OS]).lower() == \"none\":\n            e[u.EXPLOIT_OS] = None\n\n        assert e[u.EXPLOIT_OS] is None or e[u.EXPLOIT_OS] in self.os, \\\n            (f\"{e_name}. Exploit target OS is invalid. '{e[u.EXPLOIT_OS]}'.\"\n             \" Should be None or one of the OS in the os list.\")\n\n        assert 0 <= e[u.EXPLOIT_PROB] <= 1, \\\n            (f\"{e_name}. 
Exploit probability, '{e[u.EXPLOIT_PROB]}' not \"\n             \"a valid probability\")\n\n        assert e[u.EXPLOIT_COST] > 0, f\"{e_name}. Exploit cost must be > 0.\"\n\n        assert e[u.EXPLOIT_ACCESS] in VALID_ACCESS_VALUES, \\\n            (f\"{e_name}. Exploit access value '{e[u.EXPLOIT_ACCESS]}' \"\n             f\"invalid. Must be one of {VALID_ACCESS_VALUES}\")\n\n        if isinstance(e[u.EXPLOIT_ACCESS], str):\n            e[u.EXPLOIT_ACCESS] = ACCESS_LEVEL_MAP[e[u.EXPLOIT_ACCESS]]\n\n    def _parse_privescs(self):\n        self.privescs = self.yaml_dict[u.PRIVESCS]\n        self._validate_privescs(self.privescs)\n\n    def _validate_privescs(self, privescs):\n        for pe_name, pe in privescs.items():\n            self._validate_single_privesc(pe_name, pe)\n\n    def _validate_single_privesc(self, pe_name, pe):\n        s_name = \"Privilege Escalation\"\n\n        assert isinstance(pe, dict), f\"{pe_name}. {s_name} must be a dict.\"\n\n        for k, t in PRIVESC_KEYS.items():\n            assert k in pe, f\"{pe_name}. {s_name} missing key: '{k}'\"\n            assert isinstance(pe[k], t), \\\n                (f\"{pe_name}. {s_name} '{k}' incorrect type. Expected {t}\")\n\n        assert pe[u.PRIVESC_PROCESS] in self.processes, \\\n            (f\"{pe_name}. {s_name} target process invalid: \"\n             f\"'{pe[u.PRIVESC_PROCESS]}'\")\n\n        if str(pe[u.PRIVESC_OS]).lower() == \"none\":\n            pe[u.PRIVESC_OS] = None\n\n        assert pe[u.PRIVESC_OS] is None or pe[u.PRIVESC_OS] in self.os, \\\n            (f\"{pe_name}. {s_name} target OS is invalid. '{pe[u.PRIVESC_OS]}'.\"\n             f\" Should be None or one of the OS in the os list.\")\n\n        assert 0 <= pe[u.PRIVESC_PROB] <= 1.0, \\\n            (f\"{pe_name}. {s_name} probability, '{pe[u.PRIVESC_PROB]}' not \"\n             \"a valid probability\")\n\n        assert pe[u.PRIVESC_COST] > 0, \\\n            f\"{pe_name}. 
{s_name} cost must be > 0.\"\n\n        assert pe[u.PRIVESC_ACCESS] in VALID_ACCESS_VALUES, \\\n            (f\"{pe_name}. {s_name} access value '{pe[u.PRIVESC_ACCESS]}' \"\n             f\"invalid. Must be one of {VALID_ACCESS_VALUES}\")\n\n        if isinstance(pe[u.PRIVESC_ACCESS], str):\n            pe[u.PRIVESC_ACCESS] = ACCESS_LEVEL_MAP[pe[u.PRIVESC_ACCESS]]\n\n    def _parse_scan_costs(self):\n        self.os_scan_cost = self.yaml_dict[u.OS_SCAN_COST]\n        self.service_scan_cost = self.yaml_dict[u.SERVICE_SCAN_COST]\n        self.subnet_scan_cost = self.yaml_dict[u.SUBNET_SCAN_COST]\n        self.process_scan_cost = self.yaml_dict[u.PROCESS_SCAN_COST]\n        for (n, c) in [\n                (\"OS\", self.os_scan_cost),\n                (\"Service\", self.service_scan_cost),\n                (\"Subnet\", self.subnet_scan_cost),\n                (\"Process\", self.process_scan_cost)\n        ]:\n            self._validate_scan_cost(n, c)\n\n    def _validate_scan_cost(self, scan_name, scan_cost):\n        assert scan_cost >= 0, f\"{scan_name} Scan Cost must be >= 0.\"\n\n    def _parse_host_configs(self):\n        self.host_configs = self.yaml_dict[u.HOST_CONFIGS]\n        self._validate_host_configs(self.host_configs)\n\n    def _validate_host_configs(self, host_configs):\n        assert len(host_configs) == self.num_hosts, \\\n            (\"Number of host configurations must match the number of hosts \"\n             f\"in network: {len(host_configs)} != {self.num_hosts}\")\n\n        assert self._has_all_host_addresses(host_configs.keys()), \\\n            (\"Host configurations must have no duplicates and have an\"\n             \" address for each host on network.\")\n\n        for addr, cfg in host_configs.items():\n            self._validate_host_config(addr, cfg)\n\n    def _has_all_host_addresses(self, addresses):\n        \"\"\"Check that list of (subnet_ID, host_ID) tuples contains all\n        addresses on network based on subnets list\n     
   \"\"\"\n        for s_id, s_size in enumerate(self.subnets[1:]):\n            for m in range(s_size):\n                # +1 to s_id since first subnet is 1\n                if str((s_id + 1, m)) not in addresses:\n                    return False\n        return True\n\n    def _validate_host_config(self, addr, cfg):\n        \"\"\"Check if a host config is valid, given the list of available exploits.\n        N.B. each host config must contain at least one service\n        \"\"\"\n        err_prefix = f\"Host {addr}\"\n        assert isinstance(cfg, dict) and len(cfg) >= len(HOST_CONFIG_KEYS), \\\n            (f\"{err_prefix} configurations must be a dict of length >= \"\n             f\"{len(HOST_CONFIG_KEYS)}. {cfg} is invalid\")\n\n        for k in HOST_CONFIG_KEYS:\n            assert k in cfg, f\"{err_prefix} configuration missing key: {k}\"\n\n        host_services = cfg[u.HOST_SERVICES]\n        for service in host_services:\n            assert service in self.services, \\\n                (f\"{err_prefix} Invalid service in configuration services \"\n                 f\"list: {service}\")\n\n        assert len(host_services) == len(set(host_services)), \\\n            (f\"{err_prefix} configuration services list cannot contain \"\n             \"duplicates\")\n\n        host_processes = cfg[u.HOST_PROCESSES]\n        for process in host_processes:\n            assert process in self.processes, \\\n                (f\"{err_prefix} invalid process in configuration processes\"\n                 f\" list: {process}\")\n\n        assert len(host_processes) == len(set(host_processes)), \\\n            (f\"{err_prefix} configuration processes list cannot contain \"\n             \"duplicates\")\n\n        host_os = cfg[u.HOST_OS]\n        assert host_os in self.os, \\\n            f\"{err_prefix} invalid os in configuration: {host_os}\"\n\n        fw_err_prefix = f\"{err_prefix} {u.HOST_FIREWALL}\"\n        if u.HOST_FIREWALL in cfg:\n            firewall 
= cfg[u.HOST_FIREWALL]\n            assert isinstance(firewall, dict), \\\n                (f\"{fw_err_prefix} must be a dictionary, with host \"\n                 \"addresses as keys and a list of denied services as values. \"\n                 f\"{firewall} is invalid.\")\n            # use a separate loop variable so the host's own addr is not\n            # shadowed by the firewall entry addresses\n            for fw_addr, srv_list in firewall.items():\n                self._validate_host_address(fw_addr, err_prefix)\n                assert self._is_valid_firewall_setting(srv_list), \\\n                    (f\"{fw_err_prefix} setting must be a list, contain only \"\n                     f\"valid services and contain no duplicates: {srv_list}\"\n                     \" is not valid\")\n        else:\n            cfg[u.HOST_FIREWALL] = dict()\n\n        v_err_prefix = f\"{err_prefix} {u.HOST_VALUE}\"\n        if u.HOST_VALUE in cfg:\n            host_value = cfg[u.HOST_VALUE]\n            assert isinstance(host_value, (int, float)), \\\n                (f\"{v_err_prefix} must be an integer or float value. \"\n                 f\"{host_value} is invalid\")\n\n            # host config keys are address strings while sensitive host keys\n            # are (subnet, host) tuples, so parse before comparing\n            if eval(addr) in self.sensitive_hosts:\n                sh_value = self.sensitive_hosts[eval(addr)]\n                assert math.isclose(host_value, sh_value), \\\n                    (f\"{v_err_prefix} for a sensitive host must either match \"\n                     f\"the value specified in the {u.SENSITIVE_HOSTS} section \"\n                     f\"or be excluded from the host config. The value {host_value} \"\n                     f\"is invalid as it does not match value {sh_value}.\")\n\n    def _validate_host_address(self, addr, err_prefix=\"\"):\n        try:\n            addr = eval(addr)\n        except Exception:\n            raise AssertionError(\n                f\"{err_prefix} address invalid. Must be (subnet, host) tuple\"\n                f\" of integers. 
{addr} is invalid.\"\n            )\n        assert isinstance(addr, tuple) \\\n            and len(addr) == 2 \\\n            and all([isinstance(a, int) for a in addr]), \\\n            (f\"{err_prefix} address invalid. Must be (subnet, host) tuple\"\n             f\" of integers. {addr} is invalid.\")\n        assert 0 < addr[0] < len(self.subnets), \\\n            (f\"{err_prefix} address invalid. Subnet address must be in range\"\n             f\" 0 < subnet addr < {len(self.subnets)}. {addr[0]} is invalid.\")\n        assert 0 <= addr[1] < self.subnets[addr[0]], \\\n            (f\"{err_prefix} address invalid. Host address must be in range \"\n             f\"0 <= host addr < {self.subnets[addr[0]]}. {addr[1]} is invalid.\")\n        return True\n\n    def _parse_firewall(self):\n        firewall = self.yaml_dict[u.FIREWALL]\n        self._validate_firewall(firewall)\n        # convert (subnet_id, subnet_id) string to tuple\n        self.firewall = {}\n        for connect, v in firewall.items():\n            self.firewall[eval(connect)] = v\n\n    def _validate_firewall(self, firewall):\n        assert self._contains_all_required_firewalls(firewall), \\\n            (\"Firewall dictionary must contain two entries for each subnet \"\n             \"connection in network (including from outside) as defined by \"\n             \"network topology matrix\")\n\n        for f in firewall.values():\n            assert self._is_valid_firewall_setting(f), \\\n                (\"Firewall setting must be a list, contain only valid \"\n                 f\"services and contain no duplicates: {f} is not valid\")\n\n    def _contains_all_required_firewalls(self, firewall):\n        for src, row in enumerate(self.topology):\n            for dest, col in enumerate(row):\n                if src == dest:\n                    continue\n                if col == 1 and (str((src, dest)) not in firewall\n                                 or str((dest, src)) not in firewall):\n        
            return False\n        return True\n\n    def _is_valid_firewall_setting(self, f):\n        if not isinstance(f, list):\n            return False\n        for service in f:\n            if service not in self.services:\n                return False\n        # no duplicate services allowed\n        return len(f) == len(set(f))\n\n    def _parse_hosts(self):\n        \"\"\"Returns ordered dictionary of hosts in network, with address as\n        keys and host objects as values\n        \"\"\"\n        hosts = dict()\n        for address, h_cfg in self.host_configs.items():\n            formatted_address = eval(address)\n            os_cfg, srv_cfg, proc_cfg = self._construct_host_config(h_cfg)\n            value = self._get_host_value(formatted_address, h_cfg)\n            hosts[formatted_address] = Host(\n                address=formatted_address,\n                os=os_cfg,\n                services=srv_cfg,\n                processes=proc_cfg,\n                firewall=h_cfg[u.HOST_FIREWALL],\n                value=value\n            )\n        self.hosts = hosts\n\n    def _construct_host_config(self, host_cfg):\n        os_cfg = {}\n        for os_name in self.os:\n            os_cfg[os_name] = os_name == host_cfg[u.HOST_OS]\n        services_cfg = {}\n        for service in self.services:\n            services_cfg[service] = service in host_cfg[u.HOST_SERVICES]\n        processes_cfg = {}\n        for process in self.processes:\n            processes_cfg[process] = process in host_cfg[u.HOST_PROCESSES]\n        return os_cfg, services_cfg, processes_cfg\n\n    def _get_host_value(self, address, host_cfg):\n        if address in self.sensitive_hosts:\n            return float(self.sensitive_hosts[address])\n        return float(host_cfg.get(u.HOST_VALUE, u.DEFAULT_HOST_VALUE))\n\n    def _parse_step_limit(self):\n        if u.STEP_LIMIT not in self.yaml_dict:\n            
step_limit = None\n        else:\n            step_limit = self.yaml_dict[u.STEP_LIMIT]\n            assert step_limit > 0, \\\n                f\"Step limit must be positive int: {step_limit} is invalid\"\n\n        self.step_limit = step_limit\n"
  },
  {
    "path": "nasim/scenarios/scenario.py",
    "content": "import math\nfrom pprint import pprint\n\nimport nasim.scenarios.utils as u\n\n\nclass Scenario:\n\n    def __init__(self, scenario_dict, name=None, generated=False):\n        self.scenario_dict = scenario_dict\n        self.name = name\n        self.generated = generated\n        self._e_map = None\n        self._pe_map = None\n\n        # this is used for consistent positioning of\n        # host state and obs in state and obs matrices\n        self.host_num_map = {}\n        for host_num, host_addr in enumerate(self.hosts):\n            self.host_num_map[host_addr] = host_num\n\n    @property\n    def step_limit(self):\n        return self.scenario_dict.get(u.STEP_LIMIT, None)\n\n    @property\n    def services(self):\n        return self.scenario_dict[u.SERVICES]\n\n    @property\n    def num_services(self):\n        return len(self.services)\n\n    @property\n    def os(self):\n        return self.scenario_dict[u.OS]\n\n    @property\n    def num_os(self):\n        return len(self.os)\n\n    @property\n    def processes(self):\n        return self.scenario_dict[u.PROCESSES]\n\n    @property\n    def num_processes(self):\n        return len(self.processes)\n\n    @property\n    def access_levels(self):\n        return u.ROOT_ACCESS\n\n    @property\n    def exploits(self):\n        return self.scenario_dict[u.EXPLOITS]\n\n    @property\n    def privescs(self):\n        return self.scenario_dict[u.PRIVESCS]\n\n    @property\n    def exploit_map(self):\n        \"\"\"A nested dictionary for all exploits in scenario.\n\n        I.e. 
{service_name: {\n                 os_name: {\n                     name: e_name,\n                     cost: e_cost,\n                     prob: e_prob,\n                     access: e_access\n                 }\n             }\n        \"\"\"\n        if self._e_map is None:\n            e_map = {}\n            for e_name, e_def in self.exploits.items():\n                srv_name = e_def[u.EXPLOIT_SERVICE]\n                if srv_name not in e_map:\n                    e_map[srv_name] = {}\n                srv_map = e_map[srv_name]\n\n                os = e_def[u.EXPLOIT_OS]\n                if os not in srv_map:\n                    srv_map[os] = {\n                        \"name\": e_name,\n                        u.EXPLOIT_SERVICE: srv_name,\n                        u.EXPLOIT_OS: os,\n                        u.EXPLOIT_COST: e_def[u.EXPLOIT_COST],\n                        u.EXPLOIT_PROB: e_def[u.EXPLOIT_PROB],\n                        u.EXPLOIT_ACCESS: e_def[u.EXPLOIT_ACCESS]\n                    }\n            self._e_map = e_map\n        return self._e_map\n\n    @property\n    def privesc_map(self):\n        \"\"\"A nested dictionary for all privilege escalation actions in scenario.\n\n        I.e. 
{process_name: {\n                 os_name: {\n                     name: pe_name,\n                     cost: pe_cost,\n                     prob: pe_prob,\n                     access: pe_access\n                 }\n             }\n        \"\"\"\n        if self._pe_map is None:\n            pe_map = {}\n            for pe_name, pe_def in self.privescs.items():\n                proc_name = pe_def[u.PRIVESC_PROCESS]\n                if proc_name not in pe_map:\n                    pe_map[proc_name] = {}\n                proc_map = pe_map[proc_name]\n\n                os = pe_def[u.PRIVESC_OS]\n                if os not in proc_map:\n                    proc_map[os] = {\n                        \"name\": pe_name,\n                        u.PRIVESC_PROCESS: proc_name,\n                        u.PRIVESC_OS: os,\n                        u.PRIVESC_COST: pe_def[u.PRIVESC_COST],\n                        u.PRIVESC_PROB: pe_def[u.PRIVESC_PROB],\n                        u.PRIVESC_ACCESS: pe_def[u.PRIVESC_ACCESS]\n                    }\n            self._pe_map = pe_map\n        return self._pe_map\n\n    @property\n    def subnets(self):\n        return self.scenario_dict[u.SUBNETS]\n\n    @property\n    def topology(self):\n        return self.scenario_dict[u.TOPOLOGY]\n\n    @property\n    def sensitive_hosts(self):\n        return self.scenario_dict[u.SENSITIVE_HOSTS]\n\n    @property\n    def sensitive_addresses(self):\n        return list(self.sensitive_hosts.keys())\n\n    @property\n    def firewall(self):\n        return self.scenario_dict[u.FIREWALL]\n\n    @property\n    def hosts(self):\n        return self.scenario_dict[u.HOSTS]\n\n    @property\n    def address_space(self):\n        return list(self.hosts.keys())\n\n    @property\n    def service_scan_cost(self):\n        return self.scenario_dict[u.SERVICE_SCAN_COST]\n\n    @property\n    def os_scan_cost(self):\n        return self.scenario_dict[u.OS_SCAN_COST]\n\n    @property\n    def 
subnet_scan_cost(self):\n        return self.scenario_dict[u.SUBNET_SCAN_COST]\n\n    @property\n    def process_scan_cost(self):\n        return self.scenario_dict[u.PROCESS_SCAN_COST]\n\n    @property\n    def address_space_bounds(self):\n        return self.scenario_dict.get(\n            u.ADDRESS_SPACE_BOUNDS, (len(self.subnets), max(self.subnets))\n        )\n\n    @property\n    def host_value_bounds(self):\n        \"\"\"The min and max values of host in scenario\n\n        Returns\n        -------\n        (float, float)\n            (min, max) tuple of host values\n        \"\"\"\n        min_value = math.inf\n        max_value = -math.inf\n        for host in self.hosts.values():\n            min_value = min(min_value, host.value)\n            max_value = max(max_value, host.value)\n        return (min_value, max_value)\n\n    @property\n    def host_discovery_value_bounds(self):\n        \"\"\"The min and max discovery values of hosts in scenario\n\n        Returns\n        -------\n        (float, float)\n            (min, max) tuple of host values\n        \"\"\"\n        min_value = math.inf\n        max_value = -math.inf\n        for host in self.hosts.values():\n            min_value = min(min_value, host.discovery_value)\n            max_value = max(max_value, host.discovery_value)\n        return (min_value, max_value)\n\n    def display(self):\n        pprint(self.scenario_dict)\n\n    def get_action_space_size(self):\n        num_exploits = len(self.exploits)\n        num_privescs = len(self.privescs)\n        # OSScan, ServiceScan, SubnetScan, ProcessScan\n        num_scans = 4\n        actions_per_host = num_exploits + num_privescs + num_scans\n        return len(self.hosts) * actions_per_host\n\n    def get_state_space_size(self):\n        # compromised, reachable, discovered\n        host_aux_bin_features = 3\n        num_bin_features = (\n            host_aux_bin_features\n            + self.num_os\n            + self.num_services\n        
    + self.num_processes\n        )\n        # access\n        num_tri_features = 1\n        host_states = 2**num_bin_features * 3**num_tri_features\n        return len(self.hosts) * host_states\n\n    def get_state_dims(self):\n        # compromised, reachable, discovered, value, discovery_value, access\n        host_aux_features = 6\n        host_state_size = (\n            self.address_space_bounds[0]\n            + self.address_space_bounds[1]\n            + host_aux_features\n            + self.num_os\n            + self.num_services\n            + self.num_processes\n        )\n        return len(self.hosts), host_state_size\n\n    def get_observation_dims(self):\n        state_dims = self.get_state_dims()\n        return state_dims[0]+1, state_dims[1]\n\n    def get_description(self):\n        description = {\n            \"Name\": self.name,\n            \"Type\": \"generated\" if self.generated else \"static\",\n            \"Subnets\": len(self.subnets),\n            \"Hosts\": len(self.hosts),\n            \"OS\": self.num_os,\n            \"Services\": self.num_services,\n            \"Processes\": self.num_processes,\n            \"Exploits\": len(self.exploits),\n            \"PrivEscs\": len(self.privescs),\n            \"Actions\": self.get_action_space_size(),\n            \"Observation Dims\": self.get_observation_dims(),\n            \"States\": self.get_state_space_size(),\n            \"Step Limit\": self.step_limit\n        }\n        return description\n"
  },
  {
    "path": "nasim/scenarios/utils.py",
    "content": "import os\nimport yaml\nimport os.path as osp\n\n\nSCENARIO_DIR = osp.dirname(osp.abspath(__file__))\n\n# default subnet address for internet\nINTERNET = 0\n\n# Constants\nNUM_ACCESS_LEVELS = 2\nNO_ACCESS = 0\nUSER_ACCESS = 1\nROOT_ACCESS = 2\nDEFAULT_HOST_VALUE = 0\n\n# scenario property keys\nSUBNETS = \"subnets\"\nTOPOLOGY = \"topology\"\nSENSITIVE_HOSTS = \"sensitive_hosts\"\nSERVICES = \"services\"\nOS = \"os\"\nPROCESSES = \"processes\"\nEXPLOITS = \"exploits\"\nPRIVESCS = \"privilege_escalation\"\nSERVICE_SCAN_COST = \"service_scan_cost\"\nOS_SCAN_COST = \"os_scan_cost\"\nSUBNET_SCAN_COST = \"subnet_scan_cost\"\nPROCESS_SCAN_COST = \"process_scan_cost\"\nHOST_CONFIGS = \"host_configurations\"\nFIREWALL = \"firewall\"\nHOSTS = \"host\"\nSTEP_LIMIT = \"step_limit\"\nACCESS_LEVELS = \"access_levels\"\nADDRESS_SPACE_BOUNDS = \"address_space_bounds\"\n\n# scenario exploit keys\nEXPLOIT_SERVICE = \"service\"\nEXPLOIT_OS = \"os\"\nEXPLOIT_PROB = \"prob\"\nEXPLOIT_COST = \"cost\"\nEXPLOIT_ACCESS = \"access\"\n\n# scenario privilege escalation keys\nPRIVESC_PROCESS = \"process\"\nPRIVESC_OS = \"os\"\nPRIVESC_PROB = \"prob\"\nPRIVESC_COST = \"cost\"\nPRIVESC_ACCESS = \"access\"\n\n# host configuration keys\nHOST_SERVICES = \"services\"\nHOST_PROCESSES = \"processes\"\nHOST_OS = \"os\"\nHOST_FIREWALL = \"firewall\"\nHOST_VALUE = \"value\"\n\n\ndef load_yaml(file_path):\n    \"\"\"Load yaml file located at file path.\n\n    Parameters\n    ----------\n    file_path : str\n        path to yaml file\n\n    Returns\n    -------\n    dict\n        contents of yaml file\n\n    Raises\n    ------\n    Exception\n        if theres an issue loading file. 
\"\"\"\n    with open(file_path) as fin:\n        content = yaml.load(fin, Loader=yaml.FullLoader)\n    return content\n\n\ndef get_file_name(file_path):\n    \"\"\"Extracts the file or dir name from file path\n\n    Parameters\n    ----------\n    file_path : str\n        file path\n\n    Returns\n    -------\n    str\n        file name with any path and extensions removed\n    \"\"\"\n    full_file_name = file_path.split(os.sep)[-1]\n    file_name = full_file_name.split(\".\")[0]\n    return file_name\n"
  },
  {
    "path": "nasim/scripts/describe_scenarios.py",
    "content": "\"\"\"This script will output description statistics of all benchmark\nscenarios.\n\nIt will output a table to stdout (and optionally to a .csv file) which\ncontains the following headers:\n\n- Name : the scenarios name\n- Type : static or generated\n- Subnets : the number of subnets\n- Hosts : the number of hosts\n- OS : the number of OS\n- Services : the number of services\n- Processes : the number of processes\n- Exploits : the number of exploits\n- PrivEsc : the number of priviledge escalation actions\n- Actions : the total number of actions available to agent\n- States : the total number of states\n- Step limit : the step limit for the scenario\n\nUsage\n-----\n\n$ python describe_scenarios.py [-o --output filename.csv]\n\n\"\"\"\nimport prettytable\n\nfrom nasim.scenarios import make_benchmark_scenario\nfrom nasim.scenarios.benchmark import AVAIL_BENCHMARKS\n\n\ndef describe_scenarios(output=None):\n    rows = []\n    headers = None\n    for name in AVAIL_BENCHMARKS:\n        scenario = make_benchmark_scenario(name, seed=0)\n        des = scenario.get_description()\n        if headers is None:\n            headers = list(des.keys())\n\n        if des[\"States\"] > 1e8:\n            des[\"States\"] = f\"{des['States']:.2E}\"\n\n        rows.append([str(des[h]) for h in headers])\n\n    table = prettytable.PrettyTable(headers)\n    for row in rows:\n        table.add_row(row)\n\n    print(table)\n\n    if output is not None:\n        print(f\"\\nSaving to {output}\")\n        with open(output, \"w\") as fout:\n            fout.write(\",\".join(headers) + \"\\n\")\n            for row in rows:\n                fout.write(\",\".join(row) + \"\\n\")\n\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"-o\", \"--output\", type=str, default=None,\n                        help=\"File name to output as CSV too\")\n    args = parser.parse_args()\n\n    
describe_scenarios(args.output)\n"
  },
  {
    "path": "nasim/scripts/run_dqn_policy.py",
    "content": "\"\"\"A script for running a pre-trained DQN agent\n\nNote, user must ensure the DQN policy matches the NASim\nEnvironment used to train it in terms of size.\n\nE.g. A policy trained on the 'tiny-gen' env can be tested\nagainst the 'tiny' env since they both have the same Action\nand Observation spaces.\n\nBut a policy trained on 'tiny-gen' could not be used on the\n'small' environment (or any non-'tiny' environment for that\nmatter)\n\"\"\"\n\nimport nasim\nfrom nasim.agents.dqn_agent import DQNAgent\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"env_name\", type=str, help=\"benchmark scenario name\")\n    parser.add_argument(\"policy_path\", type=str, help=\"path to policy\")\n    parser.add_argument(\"-o\", \"--partially_obs\", action=\"store_true\",\n                        help=\"Partially Observable Mode\")\n    parser.add_argument(\"--eval_eps\", type=int, default=1,\n                        help=\"Number of episodes to run (default=1)\")\n    parser.add_argument(\"--seed\", type=int, default=0,\n                        help=\"Random seed (default=0)\")\n    parser.add_argument(\"--epsilon\", type=float, default=0.05,\n                        help=(\"Epsilon (i.e. 
random action probability) to use \"\n                              \"(default=0.05)\"))\n    parser.add_argument(\"--render\", action=\"store_true\",\n                        help=\"Render the episode/s\")\n    args = parser.parse_args()\n\n    env = nasim.make_benchmark(args.env_name,\n                               args.seed,\n                               fully_obs=not args.partially_obs,\n                               flat_actions=True,\n                               flat_obs=True)\n    dqn_agent = DQNAgent(env, verbose=False, **vars(args))\n    dqn_agent.load(args.policy_path)\n\n    total_ret = 0\n    total_steps = 0\n    goals = 0\n    print(f\"\\n{'-'*60}\\nRunning DQN Policy:\\n\\t{args.policy_path}\\n{'-'*60}\")\n    for i in range(args.eval_eps):\n        ret, steps, goal = dqn_agent.run_eval_episode(\n            env, args.render, args.epsilon\n        )\n        print(f\"Episode {i} return={ret}, steps={steps}, goal reached={goal}\")\n        total_ret += ret\n        total_steps += steps\n        goals += int(goal)\n\n    print(f\"\\n{'-'*60}\\nDone\\n{'-'*60}\")\n    print(f\"Average Return = {total_ret / args.eval_eps:.2f}\")\n    print(f\"Average Steps = {total_steps / args.eval_eps:.2f}\")\n    print(f\"Goals = {goals} / {args.eval_eps}\")\n"
  },
  {
    "path": "nasim/scripts/run_random_benchmarks.py",
    "content": "\"\"\"This script runs the random agent for all benchmarks scenarios\n\nThe mean (+/- stdev) steps and reward are reported in table to stdout\n(and to optional CSV file)\n\nUsage\n-----\n$ python run_random_benchmarks.py [-n --num_cpus NUM_CPUS]\n     [-o --output OUTPUT_FILENAME] [-s --num_seeds NUM_SEEDS]\n\n\"\"\"\nimport os\nimport numpy as np\nimport multiprocessing as mp\nfrom prettytable import PrettyTable\n\nimport nasim\nfrom nasim.agents.random_agent import run_random_agent\nfrom nasim.scenarios.benchmark import AVAIL_BENCHMARKS\n\n\ndef print_msg(msg):\n    print(f\"[PID={os.getpid()}] {msg}\")\n\n\nclass Result:\n\n    def __init__(self, name):\n        self.name = name\n        self.run_steps = []\n        self.run_rewards = []\n\n    def add(self, steps, reward):\n        self.run_steps.append(steps)\n        self.run_rewards.append(reward)\n\n    def summarize(self):\n        steps_mean = np.mean(self.run_steps)\n        steps_std = np.std(self.run_steps)\n        reward_mean = np.mean(self.run_rewards)\n        reward_std = np.std(self.run_rewards)\n        return steps_mean, steps_std, reward_mean, reward_std\n\n    def get_formatted_summary(self):\n        steps_mean, steps_std, reward_mean, reward_std = self.summarize()\n        return (\n            f\"{steps_mean:.2f} +/- {steps_std:.2f}\",\n            f\"{reward_mean:.2f} +/- {reward_std:.2f}\"\n        )\n\n\ndef run_scenario(args):\n    scenario_name, seed = args\n    print_msg(f\"Running '{scenario_name}' scenario with seed={seed}\")\n    env = nasim.make_benchmark(scenario_name, seed, False, True, True)\n    steps, total_reward, done = run_random_agent(env, verbose=False)\n    return {\n        \"Name\": scenario_name,\n        \"Seed\": seed,\n        \"Steps\": steps,\n        \"Total reward\": total_reward\n    }\n\n\ndef collate_results(results):\n    scenario_results = {}\n    for res in results:\n        name = res[\"Name\"]\n        if name not in 
scenario_results:\n            scenario_results[name] = Result(name)\n        scenario_results[name].add(res[\"Steps\"], res[\"Total reward\"])\n    return scenario_results\n\n\ndef output_results(results, output=None):\n    headers = [\"Scenario Name\", \"Steps\", \"Total Reward\"]\n    rows = []\n    for name in AVAIL_BENCHMARKS:\n        rows.append([\n            name, *results[name].get_formatted_summary()\n        ])\n\n    table = PrettyTable(headers)\n    for row in rows:\n        table.add_row(row)\n\n    print(table)\n\n    if output is not None:\n        with open(output, \"w\") as fout:\n            fout.write(\",\".join(headers) + \"\\n\")\n            for row in rows:\n                fout.write(\",\".join(row) + \"\\n\")\n\n\ndef run_random_benchmark(num_cpus=1, num_seeds=10, output=None):\n    run_args_list = []\n    for name in AVAIL_BENCHMARKS:\n        for seed in range(num_seeds):\n            run_args_list.append((name, seed))\n\n    with mp.Pool(num_cpus) as p:\n        results = p.map(run_scenario, run_args_list)\n\n    results = collate_results(results)\n    output_results(results, output)\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"-n\", \"--num_cpus\", type=int, default=1,\n                        help=\"Number of CPUS to use in parallel (default=1)\")\n    parser.add_argument(\"-o\", \"--output\", type=str, default=None,\n                        help=\"File name to output CSV to\")\n    parser.add_argument(\"-s\", \"--num_seeds\", type=int, default=10,\n                        help=(\"Number of seeds to run for each scenario\"\n                              \" (default=10)\"))\n    args = parser.parse_args()\n\n    run_random_benchmark(**vars(args))\n"
  },
  {
    "path": "nasim/scripts/train_dqn.py",
    "content": "\"\"\"A script for training a DQN agent and storing best policy \"\"\"\n\nimport nasim\nfrom nasim.agents.dqn_agent import DQNAgent\n\n\nclass BestDQN(DQNAgent):\n    \"\"\"A DQN Agent which saves best policy found during training \"\"\"\n\n    def __init__(self,\n                 env,\n                 save_path,\n                 eval_epsilon=0.01,\n                 **kwargs):\n        super().__init__(env, **kwargs)\n        self.save_path = save_path\n        self.eval_epsilon = eval_epsilon\n        self.best_score = -float(\"inf\")\n\n    def run_train_episode(self, step_limit):\n        ep_ret, steps, goal_reached = super().run_train_episode(step_limit)\n\n        if self.steps_done > self.exploration_steps:\n            eval_ret, _, _ = self.run_eval_episode(\n                eval_epsilon=self.eval_epsilon\n            )\n            if eval_ret > self.best_score:\n                print(f\"Saving New Best Score = {ep_ret}\")\n                self.best_score = eval_ret\n                self.save(self.save_path)\n\n        return ep_ret, steps, goal_reached\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"env_name\", type=str, help=\"benchmark scenario name\")\n    parser.add_argument(\"save_path\", type=str, help=\"save path for agent\")\n    parser.add_argument(\"-o\", \"--partially_obs\", action=\"store_true\",\n                        help=\"Partially Observable Mode\")\n    parser.add_argument(\"--eval_epsilon\", type=float, default=0.01,\n                        help=\"Epsilon to use for evaluation (default=0.01)\")\n    parser.add_argument(\"--hidden_sizes\", type=int, nargs=\"*\",\n                        default=[64, 64],\n                        help=\"(default=[64. 
64])\")\n    parser.add_argument(\"--lr\", type=float, default=0.001,\n                        help=\"Learning rate (default=0.001)\")\n    parser.add_argument(\"--training_steps\", type=int, default=10000,\n                        help=\"training steps (default=10000)\")\n    parser.add_argument(\"--batch_size\", type=int, default=32,\n                        help=\"(default=32)\")\n    parser.add_argument(\"--target_update_freq\", type=int, default=1000,\n                        help=\"(default=1000)\")\n    parser.add_argument(\"--seed\", type=int, default=0,\n                        help=\"(default=0)\")\n    parser.add_argument(\"--replay_size\", type=int, default=100000,\n                        help=\"(default=100000)\")\n    parser.add_argument(\"--final_epsilon\", type=float, default=0.05,\n                        help=\"(default=0.05)\")\n    parser.add_argument(\"--exploration_steps\", type=int, default=5000,\n                        help=\"(default=5000)\")\n    parser.add_argument(\"--gamma\", type=float, default=0.99,\n                        help=\"(default=0.99)\")\n    args = parser.parse_args()\n    assert args.training_steps > args.exploration_steps\n\n    env = nasim.make_benchmark(args.env_name,\n                               args.seed,\n                               fully_obs=not args.partially_obs,\n                               flat_actions=True,\n                               flat_obs=True)\n    dqn_agent = BestDQN(env, **vars(args))\n    dqn_agent.train()\n\n    print(f\"\\n{'-'*60}\\nDone\\n{'-'*60}\")\n    print(f\"Best Policy score = {dqn_agent.best_score}\")\n    print(f\"Policy saved to: {dqn_agent.save_path}\")\n"
  },
  {
    "path": "nasim/scripts/visualize_graph.py",
    "content": "\"\"\"Environment network graph visualizer\n\nThis script allows the user to visualize the network graph for a chosen\nbenchmark scenario.\n\"\"\"\n\nimport nasim\n\n\nif __name__ == \"__main__\":\n    import argparse\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"scenario_name\", type=str,\n                        help=\"benchmark scenario name\")\n    parser.add_argument(\"-s\", \"--seed\", type=int, default=0,\n                        help=\"random seed (default=0)\")\n    args = parser.parse_args()\n\n    env = nasim.make_benchmark(args.scenario_name, args.seed)\n    env.render_network_graph(show=True)\n"
  },
  {
    "path": "setup.py",
    "content": "import pathlib\n\nfrom setuptools import setup, find_packages\n\nextras = {\n    'dqn': [\n        'torch>=1.5',\n        'tensorboard>=2.2'\n    ],\n    'docs': [\n        'sphinx>=3.0',\n        'sphinx-rtd-theme>=0.4'\n    ],\n    'test': [\n        'pytest>=5.4'\n    ]\n}\n\nextras['all'] = [item for group in extras.values() for item in group]\n\n\ndef get_version():\n    \"\"\"Gets the posggym version.\"\"\"\n    path = pathlib.Path(__file__).absolute().parent / \"nasim\" / \"__init__.py\"\n    content = path.read_text()\n\n    for line in content.splitlines():\n        if line.startswith(\"__version__\"):\n            return line.strip().split()[-1].strip().strip('\"')\n    raise RuntimeError(\"bad version data in __init__.py\")\n\n\nsetup(\n    name='nasim',\n    version=get_version(),\n    url=\"https://networkattacksimulator.readthedocs.io\",\n    description=\"A simple and fast simulator for remote network pen-testing\",\n    long_description=open('README.rst').read(),\n    long_description_content_type='text/x-rst',\n    author=\"Jonathon Schwartz\",\n    author_email=\"Jonathon.Schwartz@anu.edu.au\",\n    license=\"MIT\",\n    packages=[\n        package for package in find_packages()\n        if package.startswith('nasim')\n    ],\n    install_requires=[\n        'gymnasium>=0.26',\n        'numpy>=1.18',\n        'networkx>=2.4',\n        'matplotlib>=3.1',\n        'pyyaml>=5.3',\n        'prettytable>=0.7'\n    ],\n    extras_require=extras,\n    python_requires='>=3.8',\n    package_data={\n        'nasim': ['scenarios/benchmark/*.yaml']\n    },\n    project_urls={\n        'Documentation': \"https://networkattacksimulator.readthedocs.io\",\n        'Source': \"https://github.com/Jjschwartz/NetworkAttackSimulator/\",\n    },\n    classifiers=[\n        'Development Status :: 3 - Alpha',\n        'License :: OSI Approved :: MIT License',\n        'Programming Language :: Python :: 3',\n        'Programming Language :: Python :: 
3.8',\n    ],\n    zip_safe=False\n)\n"
  },
  {
    "path": "test/__init__.py",
    "content": ""
  },
  {
    "path": "test/test_bruteforce.py",
    "content": "\"\"\"Runs bruteforce agent on environment for different scenarios and\nusing different parameters to check no exceptions occur\n\"\"\"\n\nimport pytest\n\nimport nasim\nfrom nasim.scenarios.benchmark import \\\n    AVAIL_GEN_BENCHMARKS, AVAIL_STATIC_BENCHMARKS\nfrom nasim.agents.bruteforce_agent import run_bruteforce_agent\n\n\n@pytest.mark.parametrize(\"scenario\", AVAIL_STATIC_BENCHMARKS)\n@pytest.mark.parametrize(\"seed\", [0, 666])\n@pytest.mark.parametrize(\"fully_obs\", [True, False])\n@pytest.mark.parametrize(\"flat_actions\", [True, False])\n@pytest.mark.parametrize(\"flat_obs\", [True, False])\ndef test_bruteforce_static(scenario, seed, fully_obs, flat_actions, flat_obs):\n    \"\"\"Tests all static benchmark scenarios using every possible environment\n    setting, using bruteforce agent, checking for any errors\n    \"\"\"\n    env = nasim.make_benchmark(scenario,\n                               seed=seed,\n                               fully_obs=fully_obs,\n                               flat_actions=flat_actions,\n                               flat_obs=flat_obs,\n                               render_mode=None)\n    run_bruteforce_agent(env, verbose=False)\n\n\n@pytest.mark.parametrize(\"scenario\", AVAIL_GEN_BENCHMARKS)\n@pytest.mark.parametrize(\"seed\", [0, 30, 666])\n@pytest.mark.parametrize(\"fully_obs\", [True, False])\n@pytest.mark.parametrize(\"flat_actions\", [True, False])\n@pytest.mark.parametrize(\"flat_obs\", [True, False])\ndef test_bruteforce_gen(scenario, seed, fully_obs, flat_actions, flat_obs):\n    \"\"\"Tests all generated benchmark scenarios using every possible environment\n    setting, using bruteforce agent, checking for any errors\n    \"\"\"\n    env = nasim.make_benchmark(scenario,\n                               seed=seed,\n                               fully_obs=fully_obs,\n                               flat_actions=flat_actions,\n                               flat_obs=flat_obs,\n                        
       render_mode=None)\n    run_bruteforce_agent(env, verbose=False)\n"
  },
  {
    "path": "test/test_env.py",
    "content": "\"\"\"Runs some general tests on environment\"\"\"\n\nimport pytest\n\nimport nasim\nfrom nasim.scenarios.benchmark import \\\n    AVAIL_GEN_BENCHMARKS, AVAIL_STATIC_BENCHMARKS\n\n\ndef test_render_error():\n    env = nasim.make_benchmark(\"tiny\", render_mode=\"a bad mode str\")\n    env.reset()\n    with pytest.raises(NotImplementedError):\n        env.render()\n\n\ndef test_render_readable():\n    env = nasim.make_benchmark(\"tiny\", render_mode=\"human\")\n    env.reset()\n    env.render()\n\n\ndef test_render_state_error():\n    env = nasim.make_benchmark(\"tiny\")\n    env.reset()\n    with pytest.raises(NotImplementedError):\n        env.render_state(mode=\"a bad mode str\")\n\n\ndef test_render_state_readable():\n    env = nasim.make_benchmark(\"tiny\")\n    env.reset()\n    env.render_state(mode=\"human\")\n\n\n@pytest.mark.parametrize(\"flat_actions\", [True, False])\ndef test_render_action(flat_actions):\n    env = nasim.make_benchmark(\"tiny\", flat_actions=flat_actions)\n    env.reset()\n    env.render_action(env.action_space.sample())\n\n\n@pytest.mark.parametrize(\n    (\"scenario\", \"expected_value\"),\n    [(\"tiny\", 0.0), (\"small\", 0.0)]\n)\ndef test_get_total_discovery_value(scenario, expected_value):\n    env = nasim.make_benchmark(scenario)\n    env.reset()\n    actual_value = env.network.get_total_discovery_value()\n    assert actual_value == expected_value\n\n\n@pytest.mark.parametrize(\n    (\"scenario\", \"expected_value\"),\n    [(\"tiny\", 200.0), (\"small\", 200.0)]\n)\ndef test_get_total_sensitive_host_value(scenario, expected_value):\n    env = nasim.make_benchmark(scenario)\n    env.reset()\n    actual_value = env.network.get_total_sensitive_host_value()\n    assert actual_value == expected_value\n\n\n@pytest.mark.parametrize(\n    (\"scenario\", \"expected_value\"),\n    [(\"tiny\", 3), (\"small\", 4)]\n)\ndef test_get_minumum_hops(scenario, expected_value):\n    env = nasim.make_benchmark(scenario)\n    
env.reset()\n    actual_value = env.get_minimum_hops()\n    assert actual_value == expected_value\n"
  },
  {
    "path": "test/test_generator.py",
    "content": "\"\"\"Runs bruteforce agent on environment for different scenarios and\nusing different parameters to check no exceptions occur\n\"\"\"\n\nimport pytest\n\nimport nasim\nfrom nasim.scenarios.benchmark import \\\n    AVAIL_GEN_BENCHMARKS\n\n\n@pytest.mark.parametrize(\"scenario\", AVAIL_GEN_BENCHMARKS)\n@pytest.mark.parametrize(\"seed\", list(range(100)))\ndef test_generator(scenario, seed):\n    \"\"\"Tests generating all generated benchmark scenarios using a range of\n    seeds, checking for any errors\n    \"\"\"\n    nasim.make_benchmark(scenario, seed=seed)\n"
  },
  {
    "path": "test/test_gym_bruteforce.py",
    "content": "\"\"\"Runs bruteforce agent on environment for different scenarios and\nusing different parameters to check no exceptions occur.\n\nTests loading environments using gym.make()\n\"\"\"\nfrom importlib import reload\n\nimport gymnasium as gym\nimport pytest\n\nimport nasim\nfrom nasim.scenarios.benchmark import AVAIL_BENCHMARKS\nfrom nasim.agents.bruteforce_agent import run_bruteforce_agent\n\n\ndef test_gym_reload():\n    \"\"\"Tests there is no issue when reloading gym \"\"\"\n    reload(gym)\n    reload(nasim)\n\n@pytest.mark.parametrize(\"scenario\", AVAIL_BENCHMARKS)\n@pytest.mark.parametrize(\"po\", ['', 'PO'])\n@pytest.mark.parametrize(\"obs\", ['', '2D'])\n@pytest.mark.parametrize(\"actions\", ['', 'VA'])\n@pytest.mark.parametrize(\"v\", ['v0'])\ndef test_bruteforce(scenario, po, obs, actions,v):\n    \"\"\"Tests all benchmark scenarios using every possible environment\n    setting, using bruteforce agent, checking for any errors\n    \"\"\"\n    name = ''.join([g.capitalize() for g in scenario.split(\"-\")])\n    name = f\"nasim:{name}{po}{obs}{actions}-{v}\"\n    env = gym.make(name, render_mode=None)\n    run_bruteforce_agent(env, verbose=False)\n"
  }
]