Repository: Jjschwartz/NetworkAttackSimulator
Branch: master
Commit: 4f26de37cfdc
Files: 91
Total size: 357.3 KB

Directory structure:
gitextract_bolyar94/

├── .github/
│   └── ISSUE_TEMPLATE/
│       ├── bug_report.md
│       └── feature_request.md
├── .gitignore
├── .readthedocs.yaml
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.rst
├── LICENSE.md
├── README.rst
├── docs/
│   ├── Makefile
│   ├── make.bat
│   ├── requirements.txt
│   └── source/
│       ├── community/
│       │   ├── acknowledgements.rst
│       │   ├── contact.rst
│       │   ├── development.rst
│       │   ├── distributing.rst
│       │   ├── index.rst
│       │   └── license.rst
│       ├── conf.py
│       ├── explanations/
│       │   ├── index.rst
│       │   ├── scenario_generation.rst
│       │   └── sim_to_real.rst
│       ├── index.rst
│       ├── reference/
│       │   ├── agents/
│       │   │   └── index.rst
│       │   ├── envs/
│       │   │   ├── actions.rst
│       │   │   ├── environment.rst
│       │   │   ├── host_vector.rst
│       │   │   ├── index.rst
│       │   │   ├── observation.rst
│       │   │   └── state.rst
│       │   ├── index.rst
│       │   ├── load.rst
│       │   └── scenarios/
│       │       ├── benchmark_scenarios.rst
│       │       ├── benchmark_scenarios_agent_scores.csv
│       │       ├── benchmark_scenarios_table.csv
│       │       ├── generator.rst
│       │       └── index.rst
│       └── tutorials/
│           ├── creating_scenarios.rst
│           ├── environment.rst
│           ├── gym_load.rst
│           ├── index.rst
│           ├── installation.rst
│           ├── loading.rst
│           └── scenarios.rst
├── nasim/
│   ├── __init__.py
│   ├── agents/
│   │   ├── __init__.py
│   │   ├── bruteforce_agent.py
│   │   ├── dqn_agent.py
│   │   ├── keyboard_agent.py
│   │   ├── policies/
│   │   │   └── dqn_tiny.pt
│   │   ├── ql_agent.py
│   │   ├── ql_replay_agent.py
│   │   └── random_agent.py
│   ├── demo.py
│   ├── envs/
│   │   ├── __init__.py
│   │   ├── action.py
│   │   ├── environment.py
│   │   ├── gym_env.py
│   │   ├── host_vector.py
│   │   ├── network.py
│   │   ├── observation.py
│   │   ├── render.py
│   │   ├── state.py
│   │   └── utils.py
│   ├── scenarios/
│   │   ├── __init__.py
│   │   ├── benchmark/
│   │   │   ├── __init__.py
│   │   │   ├── generated.py
│   │   │   ├── medium-multi-site.yaml
│   │   │   ├── medium-single-site.yaml
│   │   │   ├── medium.yaml
│   │   │   ├── small-honeypot.yaml
│   │   │   ├── small-linear.yaml
│   │   │   ├── small.yaml
│   │   │   ├── tiny-hard.yaml
│   │   │   ├── tiny-small.yaml
│   │   │   └── tiny.yaml
│   │   ├── generator.py
│   │   ├── host.py
│   │   ├── loader.py
│   │   ├── scenario.py
│   │   └── utils.py
│   └── scripts/
│       ├── describe_scenarios.py
│       ├── run_dqn_policy.py
│       ├── run_random_benchmarks.py
│       ├── train_dqn.py
│       └── visualize_graph.py
├── setup.py
└── test/
    ├── __init__.py
    ├── test_bruteforce.py
    ├── test_env.py
    ├── test_generator.py
    └── test_gym_bruteforce.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/ISSUE_TEMPLATE/bug_report.md
================================================
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Describe the bug**
A clear and concise description of what the bug is.

**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**Desktop (please complete the following information):**
 - OS: [e.g. iOS]
 - Browser [e.g. chrome, safari]
 - Version [e.g. 22]

**Smartphone (please complete the following information):**
 - Device: [e.g. iPhone6]
 - OS: [e.g. iOS8.1]
 - Browser [e.g. stock browser, safari]
 - Version [e.g. 22]

**Additional context**
Add any other context about the problem here.


================================================
FILE: .github/ISSUE_TEMPLATE/feature_request.md
================================================
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.


================================================
FILE: .gitignore
================================================
*.cprof

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Sphinx documentation
docs/_build/

# mkdocs documentation
/site

# data storage from tensorboard
nasim/agents/runs
runs/

.ipynb_checkpoints/

*.ipynb


================================================
FILE: .readthedocs.yaml
================================================
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the version of Python and other tools you might need
build:
  os: ubuntu-20.04
  tools:
    python: "3.8"

# Build documentation in the docs/ directory with Sphinx
sphinx:
   configuration: docs/source/conf.py
   builder: html
   fail_on_warning: false

# Optionally declare the Python requirements required to build your docs
python:
   install:
     - method: pip
       path: .
     - requirements: docs/requirements.txt

================================================
FILE: CODE_OF_CONDUCT.md
================================================
# Contributor Covenant Code of Conduct

## Our Pledge

In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to making participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, sex characteristics, gender identity and expression,
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.

## Our Standards

Examples of behavior that contributes to creating a positive environment
include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or
 advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
 address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
 professional setting

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.

## Scope

This Code of Conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community. Examples of
representing a project or community include using an official project e-mail
address, posting via an official social media account, or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at Jonathon.schwartz@anu.edu.au. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq


================================================
FILE: CONTRIBUTING.rst
================================================
Development
===========

NASim is a work in progress and contributions are welcome via pull request.

For more information, you can check out this link : |how_to_contrib|.

.. |how_to_contrib| raw:: html

   <a href="https://guides.github.com/activities/contributing-to-open-source/#contributing" target="_blank">Contributing to an open source Project on github</a>

Guidelines
----------

Here are a few guidelines for this project.

* Simplicity: be easy to use, but also easy to understand when one digs into the code. Any additional code should be justified by the usefulness of the feature.

These guidelines are, of course, in addition to all good practices for open source development.

.. _naming_conv:

Code style
----------

This project follows the `PEP 8 <https://www.python.org/dev/peps/pep-0008/>`_ style guide, please follow this with your contributions.

Additionally:

* If a variable is intended to be 'private', it is prefixed by an underscore.

Documentation
-------------

All contributions should be accompanied by, at a minimum, in-code docstrings, where applicable. This project uses `Sphinx <https://www.sphinx-doc.org/>`_ for documentation generation and uses `Numpy style docstrings <https://numpydoc.readthedocs.io/>`_.

Please see the code in this project for examples, or check out this `example <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html#example-numpy>`_.


================================================
FILE: LICENSE.md
================================================

The MIT License (MIT)

Copyright (c) 2018 

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: README.rst
================================================
**Status**: Stable release. No further development is planned, but the project is still maintained (bug fixes, etc).


Network Attack Simulator
========================

|docs|

Network Attack Simulator (NASim) is a simulated computer network, complete with vulnerabilities, scans and exploits, designed as a testing environment for AI agents and planning techniques applied to network penetration testing.


Installation
------------

The easiest way to install the latest version of NASim hosted on PyPI is via pip::

  $ pip install nasim


To install dependencies for running the DQN test agent (this is needed to run the demo) run::

  $ pip install nasim[dqn]


To get the latest bleeding edge version and install in development mode see the `Install docs <https://networkattacksimulator.readthedocs.io/en/latest/tutorials/installation.html>`_


Demo
----

To see NASim in action, you can run the provided demo to interact with an environment directly or see a pre-trained AI agent in action.

To run the `tiny` benchmark scenario demo in interactive mode run::

  $ python -m nasim.demo tiny


This will then run an interactive console where the user can see the current state and choose the next action to take. The goal of the scenario is to *compromise* every host with a non-zero value.

See `here <https://networkattacksimulator.readthedocs.io/en/latest/reference/scenarios/benchmark_scenarios.html>`_ for the full list of scenarios.

To run the `tiny` benchmark scenario demo using the pre-trained AI agent, first ensure the DQN dependencies are installed (see *Installation* section above), then run::

  $ python -m nasim.demo tiny -ai


**Note:** Currently you can only run the AI demo for the `tiny` scenario.


Documentation
-------------

The documentation is available at: https://networkattacksimulator.readthedocs.io/



Using with gymnasium
---------------------

NASim implements the `Gymnasium <https://github.com/Farama-Foundation/Gymnasium/tree/main>`_ environment interface and so can be used with any algorithm that is developed for that interface.

See `Starting NASim using gymnasium <https://networkattacksimulator.readthedocs.io/en/latest/tutorials/gym_load.html>`_.
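Since NASim follows the standard Gymnasium ``reset``/``step`` API, any agent loop written against that interface works unchanged. The sketch below illustrates that interaction loop; ``DummyEnv`` is a hypothetical stand-in used here so the snippet is self-contained, and with NASim installed you would instead create the environment via ``gymnasium.make`` with a registered NASim scenario id (see the gym load tutorial linked above).

```python
import random


class DummyEnv:
    """Hypothetical stand-in mimicking the Gymnasium reset/step API shape."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self, seed=None):
        random.seed(seed)
        self.steps = 0
        return [0.0], {}  # (observation, info)

    def step(self, action):
        self.steps += 1
        obs = [float(self.steps)]
        reward = 1.0 if action == 1 else 0.0
        terminated = action == 1                   # goal state reached
        truncated = self.steps >= self.max_steps   # step limit hit
        return obs, reward, terminated, truncated, {}


env = DummyEnv()
obs, info = env.reset(seed=0)
total_reward = 0.0
done = False
while not done:
    action = random.randint(0, 1)                 # random policy for the sketch
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated
```

Note the two separate booleans returned by ``step``: ``terminated`` for reaching a terminal/goal state and ``truncated`` for hitting a step limit, matching the Gymnasium convention.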


Authors
-------

**Jonathon Schwartz** - Jonathon.schwartz@anu.edu.au


License
-------

`MIT`_ © 2020, Jonathon Schwartz

.. _MIT: LICENSE


What's new
----------


- 2023-05-14 (v 0.12.0) (MINOR release)

  + Renamed `NASimEnv.get_minimum_actions -> NASimEnv.get_minimum_hops` to better reflect what it does (thanks @rzvnbr for the suggestion).


- 2023-03-13 (v 0.11.0) (MINOR release)

  + Migrated to `gymnasium (formerly OpenAI Gym) <https://github.com/Farama-Foundation/Gymnasium/>`_ from OpenAI Gym (thanks @rzvnbr for the suggestion).
  + Fixed bug with action string representation (thanks @rzvnbr for the bug report)
  + Added "sim to real considerations" explanation document to the docs (thanks @Tudyx for the suggestion)

- 2023-02-27 (v 0.10.1) (MICRO release)

  + Fixed bug for host based actions (thanks @nguyen-thanh20 for the bug report)

- 2022-07-30 (v 0.10.0) (MINOR release)

  + Fixed typos (thanks @francescoluciano)
  + Updates to be compatible with latest version of OpenAI gym API (v0.25) (see `Open AI gym API docs <https://www.gymlibrary.ml/content/api/>`_ for details), notable changes include

    * Updated naming convention when initializing environments using the ``gym.make`` API (see `gym load docs <https://networkattacksimulator.readthedocs.io/en/latest/tutorials/gym_load.html>`_ for details.)
    * Updated reset function to match the new gym API (shouldn't break any implementations using the old API)
    * Updated step function to match the new gym API. It now returns two bools: the first specifies whether the terminal/goal state has been reached, and the second specifies whether the episode was terminated because the scenario step limit (if one exists) was reached. This change may break existing implementations, and you may need to specify (or not) ``new_step_api=True`` when initializing the gym environment, e.g. ``gym.make(env_id, new_step_api=True)``

- 2022-05-19 (v 0.9.1) (MICRO release)

  + Fixed a few bugs and added some tests (thanks @simonsays1980 for the bug reports)

- 2021-12-20 (v 0.9.0) (MINOR release)

  + The value of a host is now observed when any level of access is gained on a host. This makes it so that agents can learn to decide whether to invest time in gaining root access on a host or not, depending on the host's value (thanks @jaromiru for the proposal).
  + Initial observation of reachable hosts now contains the host's address (thanks @jaromiru).
  + Added some support for custom address space bounds when using the scenario generator (thanks @jaromiru for the suggestion).

- 2021-3-15 (v 0.8.0) (MINOR release)

  + Added option of specifying a 'value' for each host when defining a custom network using the .YAML format (thanks @Joe-zsc for the suggestion).
  + Added the 'small-honeypot' scenario to included scenarios.

- 2020-12-24 (v 0.7.5) (MICRO release)

  + Added 'undefined error' to observation to fix issue with initial and later observations being indistinguishable.

- 2020-12-17 (v 0.7.4) (MICRO release)

  + Fixed issues with incorrect observation of host 'value' and 'discovery_value'. Now, when in partially observable mode, the agent will correctly only observe these values on the step that they are received.
  + Some other minor code formatting fixes

- 2020-09-23 (v 0.7.3) (MICRO release)

  + Fixed issue with scenario YAML files not being included with PyPi package
  + Added final policy visualisation option to DQN and Q-Learning agents

- 2020-09-20 (v 0.7.2) (MICRO release)

  + Fixed bug with 're-registering' Gym environments when reloading modules
  + Added example implementations of Tabular Q-Learning: `agents/ql_agent.py` and `agents/ql_replay.py`
  + Added `Agents` section to docs, along with other minor doc updates

- 2020-09-20 (v 0.7.1) (MICRO release)

  + Added some scripts for running random benchmarks and describing benchmark scenarios
  + Added some more docs (including for creating custom scenarios) and updated other docs

- 2020-09-20 (v 0.7.0) (MINOR release)

  + Implemented host based firewalls
  + Added privilege escalation
  + Added a demo script, including a pre-trained agent for the 'tiny' scenario
  + Fix to upper bound calculation (factored in reward for discovering a host)

- 2020-08-02 (v 0.6.0) (MINOR release)

  + Implemented compatibility with gym.make()
  + Updated docs for loading and interacting with NASimEnv
  + Added extra functions to nasim.scenarios to make it easier to load scenarios separately from a NASimEnv
  + Fixed bug to do with class attributes and creating different scenarios in same python session
  + Fixed up bruteforce agent and tests

- 2020-07-31 (v 0.5.0) (MINOR release)

  + First official release on PyPi
  + Cleaned up dependencies, setup.py, etc and some small fixes


.. |docs| image:: https://readthedocs.org/projects/networkattacksimulator/badge/
    :target: https://networkattacksimulator.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status
    :scale: 100%


================================================
FILE: docs/Makefile
================================================
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS    ?=
SPHINXBUILD   ?= sphinx-build
SOURCEDIR     = source
BUILDDIR      = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)


================================================
FILE: docs/make.bat
================================================
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.http://sphinx-doc.org/
	exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd


================================================
FILE: docs/requirements.txt
================================================
nasim
sphinx
sphinx-autobuild
sphinx-rtd-theme


================================================
FILE: docs/source/community/acknowledgements.rst
================================================
.. _acknowledgements:

Acknowledgements
================

* Inspiration for the documentation was taken from the `DeeR <https://deer.readthedocs.io/en/master/>`_ project.


================================================
FILE: docs/source/community/contact.rst
================================================
Contact
=======
Questions? Please contact Jonathon.schwartz@anu.edu.au.


================================================
FILE: docs/source/community/development.rst
================================================
.. _dev:

Development
===========

NASim is a work in progress and contributions are welcome via pull request.

For more information, you can check out this link : |how_to_contrib|.

.. |how_to_contrib| raw:: html

   <a href="https://guides.github.com/activities/contributing-to-open-source/#contributing" target="_blank">Contributing to an open source Project on github</a>

Guidelines
----------

Here are a few guidelines for this project.

* Simplicity: be easy to use, but also easy to understand when one digs into the code. Any additional code should be justified by the usefulness of the feature.

These guidelines are, of course, in addition to all good practices for open source development.

.. _naming_conv:

Code style
----------

This project follows the `PEP 8 <https://www.python.org/dev/peps/pep-0008/>`_ style guide, please follow this with your contributions.

Additionally:

* If a variable is intended to be 'private', it is prefixed by an underscore.

Documentation
-------------

All contributions should be accompanied by, at a minimum, in-code docstrings, where applicable. This project uses `Sphinx <https://www.sphinx-doc.org/>`_ for documentation generation and uses `Numpy style docstrings <https://numpydoc.readthedocs.io/>`_.

Please see the code in this project for examples, or check out this `example <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html#example-numpy>`_.


================================================
FILE: docs/source/community/distributing.rst
================================================
.. _distribution:

Distribution
============

This document contains some notes on distributing NASim via PyPi. This is mainly as a reminder for the steps to take when releasing an update.

.. note:: Unless specified otherwise, all bash commands are assumed to be executed from the root directory of the NASim package.


Before pushing to master
~~~~~~~~~~~~~~~~~~~~~~~~

1. Ensure all tests are passing by running:

.. code-block:: bash

   cd test
   pytest

2. Ensure updates are included in the *What's new* section of the *README.rst* and *docs/source/index.rst* files (this step can be ignored for very small changes)
3. Ensure any necessary updates have been included in the documentation.
4. Make sure the documentation can be built by running:

.. code-block:: bash

   cd docs
   make html

5. Ensure ``setup.py`` has been updated to reflect any version and/or dependency changes.


After changes have been pushed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If pushing a new version (MAJOR, MINOR, or MICRO), do the following:

1. Add a tag with the release number to the commit.
2. On github create a new release and link it to the tagged commit
3. Publish the new release to PyPi:

.. code-block:: bash

   # build distributions
   python setup.py sdist bdist_wheel

   # upload latest distribution builds to pypi
   # this will ask for PyPi username and password
   python -m twine upload dist/* --skip-existing


4. Log in to https://pypi.org/ and verify the latest version was added correctly.
5. Visit https://networkattacksimulator.readthedocs.io/en/latest/index.html and check that the documentation has updated correctly (make sure to refresh your browser cache to ensure you're looking at the latest version).


================================================
FILE: docs/source/community/index.rst
================================================
.. _community:

Community & Development
=======================

.. toctree::
    :maxdepth: 1

    development
    license
    contact
    acknowledgements
    distributing


================================================
FILE: docs/source/community/license.rst
================================================
License
=======

The MIT License (MIT)

Copyright (c) 2018

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: docs/source/conf.py
================================================
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
import nasim
sys.path.insert(0, os.path.abspath(os.path.join('..', '..')))


# -- Project information -----------------------------------------------------

project = 'NASim'
copyright = '2020, Jonathon Schwartz'
author = 'Jonathon Schwartz'

# The full version, including alpha/beta/rc tags
release = nasim.__version__


# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.coverage',
    'sphinx.ext.napoleon'
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []

# Explicitly assign the master document
# This is required for the readthedocs.org build to work correctly
master_doc = 'index'


# -- to include special methods ---------------------------------------------

def skip(app, what, name, obj, would_skip, options):
    if name == "__init__":
        return False
    return would_skip


def setup(app):
    app.connect("autodoc-skip-member", skip)


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages.  See the documentation for
# a list of builtin themes.
#
# html_theme = 'alabaster'
html_theme = 'sphinx_rtd_theme'

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']


================================================
FILE: docs/source/explanations/index.rst
================================================
.. _explanations:

Explanations
============

More technical explanations related to NASim.

.. toctree::
    :maxdepth: 1

    scenario_generation
    sim_to_real


================================================
FILE: docs/source/explanations/scenario_generation.rst
================================================
.. _scenario_generation_explanation:

Scenario Generation Explanation
===============================

Generating the scenarios involves a number of design decisions that strongly determine the form of the network being generated. This document aims to explain some of the more technical details of generating the scenarios when using the :ref:`scenario_generator` class.

The scenario generator is based heavily on prior work, specifically:

- `Sarraute, Carlos, Olivier Buffet, and Jörg Hoffmann. "POMDPs make better hackers: Accounting for uncertainty in penetration testing." Twenty-Sixth AAAI Conference on Artificial Intelligence. 2012. <https://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/viewPaper/4996>`_
- `Speicher, Patrick, et al. "Towards Automated Network Mitigation Analysis (extended)." arXiv preprint arXiv:1705.05088 (2017). <https://arxiv.org/abs/1705.05088>`_

Network Topology
----------------

Description to come. Until then, we recommend reading the papers linked above, especially the appendix of Speicher et al. (2017).

.. _correlated_configurations:

Correlated Configurations
-------------------------

When generating a scenario with ``uniform=False`` the scenario will be generated with correlated host configurations. This means that rather than each host's OS and services being chosen uniformly at random from the available OSs and services, they are chosen randomly with increased probability given to OSs and services that are run by other hosts whose configurations were generated earlier.


Specifically, the distribution of configurations of each host in the network is generated using a Nested Dirichlet Process, so that hosts across the network will have correlated configurations (i.e. certain services/configurations will be more common across hosts on the network). The correlation can be controlled using three parameters: ``alpha_H``, ``alpha_V``, and ``lambda_V``.

``alpha_H`` and ``alpha_V`` control the degree of correlation, with lower values leading to greater correlation.

``lambda_V`` controls the average number of services running per host; higher values mean more services running (and so more vulnerable hosts) on average.

All three parameters must have a positive value, with the defaults being ``alpha_H=2.0``, ``alpha_V=2.0``, and ``lambda_V=1.0``, which tends to generate networks with fairly correlated configurations where hosts have only a single vulnerability on average.
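
To illustrate the idea (this is a sketch of the sampling scheme, not NASim's actual implementation), a Dirichlet-process style sampler looks like the following, where a lower ``alpha`` makes each new host more likely to copy an earlier host's configuration:

.. code-block:: python

    import random

    def sample_configs(num_hosts, possible_configs, alpha, seed=0):
        """Dirichlet-process style sketch of correlated config sampling.

        With probability alpha / (alpha + n), host n draws a fresh
        configuration uniformly at random; otherwise it copies the
        configuration of a random earlier host, so common configurations
        become more common. Lower alpha => stronger correlation.
        """
        rng = random.Random(seed)
        configs = []
        for n in range(num_hosts):
            if rng.random() < alpha / (alpha + n):
                configs.append(rng.choice(possible_configs))  # fresh draw
            else:
                configs.append(rng.choice(configs))  # copy an earlier host
        return configs

Note that with ``alpha`` close to zero every host after the first simply copies an earlier host, so all hosts end up with identical configurations; large ``alpha`` approaches uniform sampling.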


.. _generated_exploit_probs:

Generated Exploit Probabilities
-------------------------------

Success probabilities of each exploit are determined based on the value of the ``exploit_probs`` argument, as follows:

- ``exploit_probs=None`` - probabilities are generated randomly from a uniform distribution over the interval (0, 1).
- ``exploit_probs=float`` - probability of each exploit is set to the float value, which must be a valid probability.
- ``exploit_probs=list[float]`` - probability of each exploit is set to corresponding float value in list. This requires that the length of the list matches the number of exploits as specified by the ``num_exploits`` argument.
- ``exploit_probs="mixed"`` - probabilities chosen from a set distribution which is based on the `CVSS attack complexity <https://www.first.org/cvss/v2/guide>`_ distribution of `top 10 vulnerabilities in 2017 <https://go.recordedfuture.com/hubfs/reports/cta-2018-0327.pdf>`_. Specifically, exploit probabilities are chosen from [0.3, 0.6, 0.9] which correspond to high, medium and low attack complexity, respectively, with probabilities [0.2, 0.4, 0.4].

For deterministic exploits set ``exploit_probs=1.0``.
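
The selection logic above can be sketched in plain Python (an illustrative sketch only; ``sample_exploit_probs`` is a made-up name, not part of NASim's API):

.. code-block:: python

    import random

    def sample_exploit_probs(num_exploits, exploit_probs="mixed", seed=0):
        """Sketch of how exploit success probabilities are assigned."""
        rng = random.Random(seed)
        if exploit_probs is None:
            # uniform random over (0, 1)
            return [rng.uniform(0.0, 1.0) for _ in range(num_exploits)]
        if isinstance(exploit_probs, float):
            # same fixed probability for every exploit
            return [exploit_probs] * num_exploits
        if exploit_probs == "mixed":
            # 0.3 / 0.6 / 0.9 ~ high / medium / low CVSS attack complexity
            return rng.choices([0.3, 0.6, 0.9], weights=[0.2, 0.4, 0.4],
                               k=num_exploits)
        # otherwise a list of per-exploit probabilities
        assert len(exploit_probs) == num_exploits
        return list(exploit_probs)

    # deterministic exploits: every exploit succeeds with probability 1.0
    sample_exploit_probs(3, exploit_probs=1.0)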


Firewall
--------

The firewall restricts which services can be communicated with between hosts on different subnets. This is mostly done by selecting services at random to block between each subnet, with some constraints.

Firstly, there is no firewall between subnets in the user zone, so communication between hosts on different user subnets is allowed for all services.

Secondly, the number of services blocked between zones (i.e. between the internet, DMZ, sensitive, and user zones) is controlled by the ``restrictiveness`` parameter.

Thirdly, to ensure that the goal can be reached, traffic for at least one service running on each subnet will be allowed between each zone. This may mean more services are allowed than the ``restrictiveness`` parameter specifies.
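
A rough sketch of how the third constraint interacts with ``restrictiveness`` (all names here are illustrative, not NASim's generator code):

.. code-block:: python

    import random

    def choose_allowed_services(services, restrictiveness,
                                guaranteed_service, seed=0):
        """Sketch: pick which services a zone firewall permits.

        At least one service (one running on a host in the destination
        subnet) is always allowed so the goal stays reachable; beyond
        that, services are unblocked at random until at most
        `restrictiveness` services remain blocked.
        """
        rng = random.Random(seed)
        allowed = {guaranteed_service}
        candidates = [s for s in services if s != guaranteed_service]
        rng.shuffle(candidates)
        while len(services) - len(allowed) > restrictiveness and candidates:
            allowed.add(candidates.pop())
        return allowed

Even with a very high ``restrictiveness`` the guaranteed service is still allowed, which is why generated firewalls can end up slightly more permissive than requested.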


================================================
FILE: docs/source/explanations/sim_to_real.rst
================================================
.. _sim_to_real_explanation:

Sim-to-Real Gap Considerations
==============================

NASim is a fairly simplified simulator of network penetration testing. Its main goal is to capture some of the key features of network pentesting in an easy-to-use and fast simulator, so that it can be used for rapid testing and prototyping of algorithms before they are tested in more realistic environments. That is to say, there is a bit of a gap between the scenarios in NASim and the real world.

In this document we lay out some considerations to think about when trying to extend your algorithm beyond NASim. This is by no means an exhaustive list, but it will hopefully give you something to think about for the next steps, and also explain some of the design decisions made in NASim.

.. note:: This document is a work in progress, so if you have any thoughts, useful references, etc. on the topic of applying autonomous penetration testing in the real world, please reach out via email or open an issue on GitHub.

Handling Partial Observability
------------------------------

One of the big assumptions made by NASim is that the pentester agent has access to the network addresses of every host in the network, even in partially observable mode. This information is given to the agent in its list of actions. In practice, depending on the scenario, this assumption may be invalid, and part of the challenge for the pentester is to discover new hosts as they navigate through the network.

The main reason NASim is implemented with the network addresses known is so that the action space size can be fixed, making it simpler to use with typical Deep Reinforcement Learning algorithms (i.e. neural networks with fixed-size input and output layers).
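
To make this concrete, here is a minimal sketch (with made-up hosts and action names, not NASim's actual classes) of why known addresses yield a fixed-size flat action space:

.. code-block:: python

    from itertools import product

    # With every host address known up front, the flat action space is
    # simply the cross product of addresses and action types, so its
    # size is fixed and a DQN can use a fixed-size output layer.
    hosts = [(1, 0), (2, 0), (3, 0)]
    action_types = ["os_scan", "service_scan", "process_scan",
                    "subnet_scan", "exploit_ssh"]
    flat_actions = list(product(hosts, action_types))

    assert len(flat_actions) == len(hosts) * len(action_types)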

One research challenge is to develop algorithms that can handle action spaces that change as the pentester discovers more network addresses. Perhaps more realistic still would be a multi-dimensional action space where the pentester chooses an address and an exploit/scan/etc. separately. There is some support for this built into NASim with the ``nasim.envs.action.ParameterisedActionSpace`` action space (see :ref:`actions`), but even with that action space some information about the size of the network is given to the pentester.

At this stage there are no plans to update NASim to support a no-information action space. This is partially due to time, but also to keep NASim simple and stable, and because there are a lot of better and more realistic environments being developed now (e.g. `CybORG <https://github.com/cage-challenge/CybORG>`_).

One avenue for handling a changing action space is to use auto-regressive actions, as was done by `AlphaStar <https://www.deepmind.com/blog/alphastar-mastering-the-real-time-strategy-game-starcraft-ii>`_.


================================================
FILE: docs/source/index.rst
================================================
Welcome to Network Attack Simulator's documentation!
====================================================

Network Attack Simulator (NASim) is a lightweight, high-level network attack simulator written in Python. It is designed for rapid testing of autonomous pen-testing agents using reinforcement learning and planning. Being a simulator, it does not replicate all the details of attacking a real system; instead it aims to capture some of the more salient features of network pen-testing, such as the large and changing sizes of the state and action spaces, partial observability, and varied network topology.

The environment is modelled after the `gymnasium (formerly Open AI gym) <https://github.com/Farama-Foundation/Gymnasium/>`_ interface.


What's new
----------

Version 0.12.0
**************

+ Renamed `NASimEnv.get_minimum_actions -> NASimEnv.get_minimum_hops` to better reflect what it does (thanks @rzvnbr for the suggestion).


Version 0.11.0
**************

+ Migrated to `gymnasium (formerly Open AI gym) <https://github.com/Farama-Foundation/Gymnasium/>`_ from OpenAI gym (thanks @rzvnbr for the suggestion).
+ Fixed bug with action string representation (thanks @rzvnbr for the bug report)
+ Added "sim to real considerations" explanation document to the docs (thanks @Tudyx for the suggestion)


Version 0.10.1
**************

+ Fixed bug for host based actions (thanks @nguyen-thanh20 for the bug report)


Version 0.10.0
**************

+ Fixed typos (thanks @francescoluciano)
+ Updates to be compatible with latest version of OpenAI gym API (v0.25) (see `Open AI gym API docs <https://www.gymlibrary.ml/content/api/>`_ for details), notable changes include

  * Updated naming convention when initializing environments using the ``gym.make`` API (see `gym load docs <https://networkattacksimulator.readthedocs.io/en/latest/tutorials/gym_load.html>`_ for details.)
  * Updated reset function to match new gym API (shouldn't break any implementations using old API)
  * Updated step function to match new gym API. It now returns two bools: the first specifies whether the terminal/goal state has been reached, and the second specifies whether the episode was terminated due to the scenario step limit (if any) being reached. This change may break implementations, and you may need to specify (or not) the new API when initializing the gym environment using ``gym.make(env_id, new_step_api=True)``


Version 0.9.1
*************

- Fixed a few bugs and added some tests (thanks @simonsays1980 for the bug reports)


Version 0.9.0
*************

- The value of a host is now observed when any level of access is gained on a host. This makes it so that agents can learn to decide whether to invest time in gaining root access on a host or not, depending on the host's value (thanks @jaromiru for the proposal).
- Initial observation of reachable hosts now contains the host's address (thanks @jaromiru).
- Added some support for custom address space bounds in when using scenario generator (thanks @jaromiru for the suggestion).


Version 0.8.0
*************

- Added option of specifying a 'value' for each host when defining a custom network using the .YAML format (thanks @Joe-zsc for the suggestion).
- Added the 'small-honeypot' scenario to included scenarios.


Version 0.7.5
*************

- Added 'undefined error' to observation to fix issue with initial and later observations being indistinguishable.


Version 0.7.4
*************

- Fixed issues with incorrect observation of host 'value' and 'discovery_value'. Now, when in partially observable mode, the agent will correctly only observe these values on the step that they are received
- Some other minor code formatting fixes


Version 0.7.3
*************

- Fixed issue with scenario YAML files not being included with PyPi package
- Added final policy visualisation option to DQN and Q-Learning agents


Version 0.7.2
*************

- Fixed bug with 're-registering' Gym environments when reloading modules
- Added example implementations of Tabular Q-Learning: `agents/ql_agent.py` and `agents/ql_replay.py`
- Added `Agents` section to docs, along with other minor doc updates


Version 0.7.1
*************

- Added some scripts for running random benchmarks and describing benchmark scenarios
- Added some more docs (including for creating custom scenarios) and updated other docs


Version 0.7
***********

- Implemented host based firewalls
- Added privilege escalation
- Added a demo script, including a pre-trained agent for the 'tiny' scenario
- Fix to upper bound calculation (factored in reward for discovering a host)


Version 0.6
***********

- Implemented compatibility with gym.make()
- Updated docs for loading and interacting with NASimEnv
- Added extra functions to nasim.scenarios to make it easier to load scenarios separately from a NASimEnv
- Fixed bug to do with class attributes and creating different scenarios in same python session
- Fixed up bruteforce agent and tests


Version 0.5
***********

- First official release on PyPi
- Cleaned up dependencies, setup.py, etc and some small fixes
- First stable version


The Docs
--------

.. toctree::
   :maxdepth: 2

   tutorials/index
   reference/index
   explanations/index
   community/index


How should I cite NASim?
------------------------

Please cite NASim in your publications if you use it in your research. Here is an example BibTeX entry:

.. code-block:: bibtex

    @misc{schwartz2019nasim,
      title={NASim: Network Attack Simulator},
      author={Schwartz, Jonathon and Kurniawati, Hanna},
      year={2019},
      howpublished={\url{https://networkattacksimulator.readthedocs.io/}},
    }



Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

.. _GitHub: https://github.com/Jjschwartz/NetworkAttackSimulator


================================================
FILE: docs/source/reference/agents/index.rst
================================================
.. _agents_reference:

Agents Reference
================

This page provides a short summary of the agents that come with the NASim library.

Available Agents
----------------

The agent implementations that come with NASim include:

* **keyboard_agent.py**: An agent that is controlled by the user via terminal inputs.
* **random_agent.py**: A random agent that selects an action randomly from all available actions at each time step.
* **bruteforce_agent.py**: An agent that repeatedly cycles through all available actions in order.
* **ql_agent.py**: A tabular, epsilon-greedy Q-Learning reinforcement learning agent.
* **ql_replay_agent.py**: A tabular, epsilon-greedy Q-Learning reinforcement learning agent (same as above) that incorporates experience replay.
* **dqn_agent.py**: A Deep Q-Network reinforcement learning agent using experience replay and a target Q-Network.


Running Agents
--------------

Each agent file defines a main function, so each agent can be run from the terminal, with the specific scenario and settings specified as command line arguments:


.. code-block:: bash

    cd nasim/agents
    # to run a different agent, simply replace .py file with desired file
    # to run a different scenario, simply replace 'tiny' with desired scenario
    python bruteforce_agent.py tiny

    # to get details on command line arguments available (e.g. hyperparameters for Q-Learning and DQN agents)
    python bruteforce_agent.py --help


A description and details of how to run each agent can be found at the top of each agent file.


Viewing Agent Policies
----------------------

For the DQN and Tabular Q-Learning agents you can optionally also view the final policies learned by the agents after training has finished:

.. code-block:: bash

    # simply include the --render_eval flag with the DQN and Q-Learning agents
    python ql_agent.py tiny --render_eval


This will show a single episode of the agent, displaying the actions the agent performs along with the observations and rewards the agent receives.


================================================
FILE: docs/source/reference/envs/actions.rst
================================================
.. _`actions`:

Actions
=======

.. automodule:: nasim.envs.action
   :members:


================================================
FILE: docs/source/reference/envs/environment.rst
================================================
.. _`environment`:

Environment
===========

.. automodule:: nasim.envs.environment
   :members:


================================================
FILE: docs/source/reference/envs/host_vector.rst
================================================
.. _`host_vector`:

HostVector
==========

.. automodule:: nasim.envs.host_vector
   :members:


================================================
FILE: docs/source/reference/envs/index.rst
================================================
.. _env_reference:

Environment Reference
=====================

Technical reference material for classes and functions used to interact with the NASim Environment.

.. toctree::
    :maxdepth: 1

    actions
    environment
    host_vector
    observation
    state


================================================
FILE: docs/source/reference/envs/observation.rst
================================================
.. _`observation`:

Observation
===========

.. automodule:: nasim.envs.observation
   :members:


================================================
FILE: docs/source/reference/envs/state.rst
================================================
.. _`state`:

State
=====

.. automodule:: nasim.envs.state
   :members:


================================================
FILE: docs/source/reference/index.rst
================================================
.. _reference:

Reference
=========

Technical reference material.

.. toctree::
    :maxdepth: 2

    load
    agents/index
    envs/index
    scenarios/index


================================================
FILE: docs/source/reference/load.rst
================================================
.. _nasim_init:

NASimEnv load reference
=======================

Technical reference material for different functions for creating a new NASim Environment.

.. automodule:: nasim
   :members:


================================================
FILE: docs/source/reference/scenarios/benchmark_scenarios.rst
================================================
.. _benchmark_scenarios:

Benchmark Scenarios
===================

There are a number of existing scenarios that come with NASim. They cover a range of complexities and sizes and are intended to be used to help with benchmarking algorithms. Additionally, there are two flavours of existing scenarios: **static** and **generated**.

.. note:: For full list of benchmark scenarios see :ref:`all_benchmark_scenarios`.

**Static** scenarios are predefined and will be exactly the same every time they are loaded. They are defined in .yaml files in the `nasim/scenarios/benchmark/` directory.

**Generated** scenarios are generated using the :ref:`scenario_generator` based on some parameters. While certain features of each scenario will remain constant between generations (e.g. number of hosts, services, exploits), other features may change (e.g. specific host configurations, firewall settings, exploit probabilities) depending on the random seed.


.. _all_benchmark_scenarios:

All benchmark scenarios
-----------------------

The following table provides details of each benchmark scenario currently available in NASim.

.. csv-table:: NASim Benchmark scenarios
   :file: benchmark_scenarios_table.csv
   :header-rows: 1


The number of actions is calculated as *Hosts X (Exploits + PrivEscs + 4)*. The +4 is for the 4 scans available for each host (OSScan, ServiceScan, ProcessScan, and SubnetScan).

The number of states is calculated as *Hosts X 2^(3 + OS + Services + Processes) X 3* (the table's States column matches this formula). The 3 inside the exponent comes from the *compromised*, *reachable* and *discovered* features of the state, and the base of 2 is due to all of these state features being boolean (present/absent). The final multiplier of 3 comes from the number of possible access levels on a host.
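
As a quick sanity check, the Actions and States counts for the *tiny* row of the table above can be reproduced as follows (note that the boolean features in the exponent include the *Processes* count):

.. code-block:: python

    # `tiny` scenario parameters from the table above
    hosts, os_count, services, processes = 3, 1, 1, 1
    exploits, privescs = 1, 1

    num_actions = hosts * (exploits + privescs + 4)  # 4 scans per host
    num_states = hosts * 2 ** (3 + os_count + services + processes) * 3

    assert num_actions == 18   # matches the Actions column
    assert num_states == 576   # matches the States column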

The table below provides mean steps to reach the goal and reward (+/- stdev) for a uniform random agent, with scores averaged over 100 runs.

.. csv-table:: NASim Benchmark scenarios Agent scores
   :file: benchmark_scenarios_agent_scores.csv
   :header-rows: 2


Notes on the scenarios
----------------------

The *tiny*, *small*, *medium*, *large*, and *huge* scenarios (and their generated versions) are all based on the network scenarios first used by:

- `Sarraute, Carlos, Olivier Buffet, and Jörg Hoffmann. "POMDPs make better hackers: Accounting for uncertainty in penetration testing." Twenty-Sixth AAAI Conference on Artificial Intelligence. 2012. <https://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/viewPaper/4996>`_
- `Speicher, Patrick, et al. "Towards Automated Network Mitigation Analysis (extended)." arXiv preprint arXiv:1705.05088 (2017). <https://arxiv.org/abs/1705.05088>`_

The *pocp-1-gen* and *pocp-2-gen* scenarios are based on the work by:

- `Shmaryahu, D., Shani, G., Hoffmann, J., & Steinmetz, M. (2018, June). Simulated penetration testing as contingent planning. In Twenty-Eighth International Conference on Automated Planning and Scheduling. <https://www.aaai.org/ocs/index.php/ICAPS/ICAPS18/paper/viewPaper/17766>`_

The other scenarios were made up by the author after looking at some random Google images of network layouts and playing around with different interesting network topologies.


================================================
FILE: docs/source/reference/scenarios/benchmark_scenarios_agent_scores.csv
================================================
Scenario Name,Steps,Total Reward
tiny,108.02 +/- 43.82,91.98 +/- 43.82
tiny-hard,135.31 +/- 65.56,21.05 +/- 85.45
tiny-small,319.56 +/- 124.26,-225.86 +/- 167.14
small,501.94 +/- 181.40,-469.80 +/- 241.99
small-honeypot,448.72 +/- 151.62,-476.08 +/- 222.41
small-linear,566.00 +/- 177.08,-555.08 +/- 241.06
medium,1371.45 +/- 420.41,-1875.29 +/- 660.62
medium-single-site,654.89 +/- 385.76,-782.17 +/- 581.14
medium-multi-site,1060.94 +/- 389.86,-1394.71 +/- 590.89
tiny-gen,86.56 +/- 40.16,116.43 +/- 40.15
tiny-gen-rgoal,98.94 +/- 47.83,104.02 +/- 47.80
small-gen,435.73 +/- 205.61,-228.53 +/- 214.34
small-gen-rgoal,423.52 +/- 226.68,-218.62 +/- 240.20
medium-gen,1002.94 +/- 468.10,-788.64 +/- 481.86
large-gen,2548.62 +/- 1224.08,-2327.34 +/- 1241.92
huge-gen,6303.86 +/- 2403.40,-6075.69 +/- 2434.77
pocp-1-gen,15189.46 +/- 6879.75,-14947.80 +/- 6887.43
pocp-2-gen,17211.38 +/- 5855.83,-16871.05 +/- 5864.58


================================================
FILE: docs/source/reference/scenarios/benchmark_scenarios_table.csv
================================================
Name,Type,Subnets,Hosts,OS,Services,Processes,Exploits,PrivEscs,Actions,Observation Dims,States,Step Limit
tiny,static,4,3,1,1,1,1,1,18,4X14,576,1000
tiny-hard,static,4,3,2,3,2,3,2,27,4X18,9216,1000
tiny-small,static,5,5,2,3,2,3,2,45,6X20,15360,1000
small,static,5,8,2,3,2,3,2,72,9X23,24576,1000
small-honeypot,static,5,8,2,3,2,3,2,72,9X23,24576,1000
small-linear,static,7,8,2,3,2,3,2,72,9X22,24576,1000
medium,static,6,16,2,5,3,5,3,192,17X27,393216,2000
medium-single-site,static,2,16,2,5,3,5,3,192,17x34,393216,2000
medium-multi-site,static,7,16,2,5,3,5,3,192,17X29,393216,2000
tiny-gen,generated,4,3,1,1,1,1,1,18,4X14,576,1000
tiny-gen-rangoal,generated,4,3,1,1,1,1,1,18,4X14,576,1000
small-gen,generated,5,8,2,3,2,3,2,72,9X23,24576,1000
small-gen-rangoal,generated,5,8,2,3,2,3,2,72,9X23,24576,1000
medium-gen,generated,6,16,2,5,2,5,2,176,17X26,196608,2000
large-gen,generated,8,23,3,7,3,7,3,322,24X32,4521984,5000
huge-gen,generated,11,38,4,10,4,10,4,684,39X40,2.39E+08,10000
pocp-1-gen,generated,10,35,2,50,2,60,2,2310,36X75,1.51E+19,30000
pocp-2-gen,generated,21,95,3,10,3,30,3,3515,96X48,1.49E+08,30000


================================================
FILE: docs/source/reference/scenarios/generator.rst
================================================
.. _scenario_generator:

Scenario Generator
===================

.. automodule:: nasim.scenarios.generator
   :members:


================================================
FILE: docs/source/reference/scenarios/index.rst
================================================
.. _scenario_reference:

Scenario Reference
==================

Technical reference material for classes and functions used to generate and load Scenarios to use with the NASim Environment.

.. toctree::
    :maxdepth: 1

    benchmark_scenarios
    generator


================================================
FILE: docs/source/tutorials/creating_scenarios.rst
================================================
.. _`creating_scenarios_tute`:

Creating Custom Scenarios
=========================

With NASim it is possible to use custom scenarios defined in a valid YAML file. In this tutorial we will cover how to create and run your own custom scenario.

.. _`defining_custom_yaml`:

Defining a custom scenario using YAML
-------------------------------------

Before we dive into writing a new custom YAML scenario it is worth having a look at some examples. NASim comes with a number of benchmark YAML scenarios which can be found in the ``nasim/scenarios/benchmark`` directory (or view on github `here <https://github.com/Jjschwartz/NetworkAttackSimulator/tree/master/nasim/scenarios/benchmark>`_). For this tutorial we will be using the ``tiny.yaml`` scenario as an example.

A custom scenario in NASim requires defining two components: the network and the pen-tester.


Defining the network
^^^^^^^^^^^^^^^^^^^^

The network is defined by the following sections:

   1. **subnets**: size of each subnet in network
   2. **topology**: an adjacency matrix defining which subnets are connected
   3. **os**: names of available operating systems on network
   4. **services**: names of available services on network
   5. **processes**: names of available processes on network
   6. **hosts**: a dictionary of hosts on the network and their configurations
   7. **firewall**: definition of the subnet firewalls


Subnets
"""""""

This property defines the number of subnets on the network and the size of each. It is simply defined as an ordered list of integers. The address of the first subnet in the list is *1*, the second subnet is *2*, and so on. The address *0* is reserved for the "internet" subnet (see the topology section below). For example, the ``tiny`` network contains 3 subnets, all of size 1:

.. code-block:: yaml

   subnets: [1, 1, 1]

   # or alternatively

   subnets:
     - 1
     - 1
     - 1


Topology
""""""""

The topology is defined by an adjacency matrix with a row and column for every subnet in the network, along with an additional row and column designating the "internet" subnet, i.e. the connection to outside the network. The first row and column are reserved for the "internet" subnet. A connection between subnets is indicated with a ``1``, while no connection is indicated with a ``0``. Note that we assume connections are symmetric and that each subnet is connected to itself.

For the ``tiny`` network, subnet *1* is a public subnet so is connected to the internet, indicated by a ``1`` in row 1, column 2 and row 2, column 1. Subnet *1* is also connected with subnets *2* and *3*, indicated by ``1`` in relevant cells, meanwhile subnets *2* and *3* are private and not connected directly to the internet, indicated by the ``0`` values.

.. code-block:: yaml

   topology: [[ 1, 1, 0, 0],
              [ 1, 1, 1, 1],
              [ 0, 1, 1, 1],
              [ 0, 1, 1, 1]]



OS, services, processes
"""""""""""""""""""""""

Similar to how we defined the subnet list, the **os**, **services** and **processes** sections are each defined as a simple list. The names of the items in each list can be anything, but note that they will be used for validating the host configurations, exploits, etc., so they need to match the values used in those sections.

Continuing our example, the ``tiny`` scenario includes one OS: *linux*, one service: *ssh*, and one process: *tomcat*:

.. code-block:: yaml

   os:
     - linux
   services:
     - ssh
   processes:
     - tomcat


Host Configurations
"""""""""""""""""""

The host configuration section is a mapping from host addresses to their configurations, where the address is a ``(subnet number, host number)`` tuple and the configuration must include the host's OS, services running, processes running, and optional host firewall settings.

There are a few things to note when defining a host:

   1. The number of hosts defined for each subnet needs to match the size of each subnet
   2. Host addresses within a subnet must start from ``0`` and count up from there (i.e. three hosts in subnet *1* would have addresses ``(1, 0)``, ``(1, 1)``, and ``(1, 2)``)
   3. The names of any OS, service, and process must match values provided in the **os**, **services** and **processes** sections of the YAML file.
   4. Each host must have an OS and at least one service running. It is okay for hosts to have no processes running (which can be indicated using an empty list ``[]``).
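
A minimal sketch of checking these rules (illustrative data structures only, not NASim's actual scenario loader):

.. code-block:: python

    def validate_hosts(subnets, host_configs, os_names, service_names):
        """Check host addresses and names against the rules above."""
        for subnet_id, size in enumerate(subnets, start=1):
            addrs = [a for a in host_configs if a[0] == subnet_id]
            # rules 1 & 2: one entry per host, numbered 0..size-1
            assert sorted(a[1] for a in addrs) == list(range(size))
        for addr, cfg in host_configs.items():
            # rules 3 & 4: names must be declared; >= 1 service required
            assert cfg["os"] in os_names
            assert cfg["services"]
            assert all(s in service_names for s in cfg["services"])

    validate_hosts(
        subnets=[1, 1, 1],
        host_configs={
            (1, 0): {"os": "linux", "services": ["ssh"]},
            (2, 0): {"os": "linux", "services": ["ssh"]},
            (3, 0): {"os": "linux", "services": ["ssh"]},
        },
        os_names=["linux"],
        service_names=["ssh"],
    )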

**Host firewalls** are defined as a mapping from host address to the list of services to deny from that host. Host addresses must be a valid address of a host in the network and any services must also match services defined in the services section. Finally, if a host address is not part of the firewall then it is assumed all traffic is allowed from that host, at the host level (it may still be blocked by subnet firewall).

**Host value** is the optional value the agent will receive when compromising the host. Unlike the values in the *sensitive_hosts* section, this value can be negative as well as zero or positive. This makes it possible to set additional host-specific rewards or penalties, for example a negative reward for a 'honeypot' host on the network. A couple of things to note:

  1. Host value is optional and will default to 0.
  2. For any *sensitive hosts* the value must either not be specified or it must match the value specified in the *sensitive_hosts* section of the file.
  3. As with *sensitive hosts*, the agent will only receive the value as a reward when they compromise the host.

Here is the example host configurations section for the ``tiny`` scenario, where host firewalls are defined for hosts ``(1, 0)`` and ``(2, 0)``, and host ``(1, 0)`` has a value of ``0`` (we could leave the value unspecified in this case for the same result; we include it here as an example):

.. code-block:: yaml

   host_configurations:
     (1, 0):
       os: linux
       services: [ssh]
       processes: [tomcat]
       # which services to deny between individual hosts
       firewall:
         (3, 0): [ssh]
       value: 0
     (2, 0):
       os: linux
       services: [ssh]
       processes: [tomcat]
       firewall:
         (1, 0): [ssh]
     (3, 0):
       os: linux
       services: [ssh]
       processes: [tomcat]


Firewall
""""""""

The final section for defining the network is the firewall, which is defined as a mapping from ``(subnet number, subnet number)`` tuples to list of services to allow. Some things to note about defining firewalls:

   1. A firewall rule can only be defined between subnets that are connected in the topology adjacency matrix.
   2. Each rule defines which services are allowed in a single direction, from the first subnet in the tuple to the second subnet in the tuple (i.e. (source subnet, destination subnet))
   3. An empty list means all traffic will be blocked from source to destination

Here is the firewall definition for the ``tiny`` scenario where SSH traffic is allowed between all subnets, except from subnet 1 to 0 and from 1 to 2.

.. code-block:: yaml

    # two rows for each connection between subnets as defined by topology
    # one for each direction of connection
    # lists which services to allow
    firewall:
      (0, 1): [ssh]
      (1, 0): []
      (1, 2): []
      (2, 1): [ssh]
      (1, 3): [ssh]
      (3, 1): [ssh]
      (2, 3): [ssh]
      (3, 2): [ssh]


And with that we have covered everything needed to define the scenario's network. Next up is defining the pen-tester.


Defining the pen-tester
^^^^^^^^^^^^^^^^^^^^^^^

The pen-tester is defined by these sections:

   1. **sensitive_hosts**: a dictionary containing the address of sensitive/target hosts and their value
   2. **exploits**: a dictionary of exploits
   3. **privilege_escalation**: a dictionary of privilege escalation actions
   4. **os_scan_cost**: cost of using OS scan
   5. **service_scan_cost**: cost of using service scan
   6. **process_scan_cost**: cost of using process scan
   7. **subnet_scan_cost**: cost of using subnet scan
   8. **step_limit**: the maximum number of actions pen-tester can perform in a single episode


Sensitive hosts
"""""""""""""""

This section specifies the addresses and values of the target hosts in the network. When the pen-tester gains root access on these hosts they will receive the specified value as a reward. The *sensitive_hosts* section is a dictionary where the entries are address-value pairs, where the address is a ``(subnet number, host number)`` tuple and the value is a non-negative float or integer.

In the ``tiny`` scenario the pen-tester is aiming to get root access on the hosts ``(2, 0)`` and ``(3, 0)``, both of which have a value of 100:

.. code-block:: yaml

    sensitive_hosts:
      (2, 0): 100
      (3, 0): 100


Exploits
""""""""

The exploits section is a dictionary which maps exploit names to exploit definitions. Every scenario requires at least one exploit. An exploit definition is a dictionary which must include the following entries:

  1. **service**: the name of the service the exploit targets.

     - Note, the value must match the name of a service defined in the **services** section of the network definition.

  2. **os**: the name of the operating system the exploit targets or ``none`` if the exploit works on all OSs.

     - If the value is not ``none`` it must match the name of an OS defined in the **os** section of the network definition

  3. **prob**: the probability that the exploit succeeds given all preconditions are met (i.e. the target host is discovered and reachable, and the host is running the target service and OS)
  4. **cost**: the cost of performing the action. This should be a non-negative int or float and can represent the cost of the action in any sense desired (financial, time, traffic generated, etc)
  5. **access**: the resulting access the pen-tester will get on the target host if the exploit succeeds. This can be either *user* or *root*.


The names of exploits can be anything you desire, so long as they are immutable and hashable (i.e. strings, ints, tuples) and unique.
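
Whether a candidate set of names satisfies this constraint can be checked with a short sketch (the ``check_exploit_names`` helper below is purely illustrative, not part of NASim):

```python
def check_exploit_names(names):
    """Illustrative check: every exploit name must be hashable and unique."""
    seen = set()
    for name in names:
        hash(name)  # raises TypeError for unhashable names such as lists
        if name in seen:
            raise ValueError(f"duplicate exploit name: {name!r}")
        seen.add(name)
    return True

# strings, tuples, and ints are all valid (hashable) names
print(check_exploit_names(["e_ssh", ("e_ftp", 21), 7]))  # True
```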

The ``tiny`` example scenario has only a single exploit, ``e_ssh``, which targets the SSH service running on linux hosts, has a success probability of 0.8, a cost of 1, and results in user level access:

.. code-block:: yaml

    exploits:
      e_ssh:
        service: ssh
        os: linux
        prob: 0.8
        cost: 1
        access: user


Privilege Escalation
"""""""""""""""""""""

Similar to the exploits section, the privilege escalation section is a dictionary which maps privilege escalation action names to their definitions. A privilege escalation action definition is a dictionary which must include the following entries:

  1. **process**: the name of the process the action targets.

     - The value must match the name of a process defined in the **processes** section of the network definition.

  2. **os**: the name of the operating system the action targets or ``none`` if the action works on all OSs.

     - If the value is not ``none`` it must match the name of an OS defined in the **os** section of the network definition.

  3. **prob**: the probability that the action succeeds given all preconditions are met (i.e. pen-tester has access to target host, and the host is running target process and OS)
  4. **cost**: the cost of performing the action. This should be a non-negative int or float and can represent the cost of the action in any sense desired (financial, time, traffic generated, etc)
  5. **access**: the resulting access the pen-tester will get on the target host if the action succeeds. This can be either *user* or *root*.

As with exploits, the name of each privilege escalation action can be anything you desire, so long as they are immutable and hashable (i.e. strings, ints, tuples) and unique.

.. note:: A scenario is not required to have any privilege escalation actions defined. In that case, define the privilege escalation section to be empty: ``privilege_escalation: {}``.

          Note however that you will then need to make sure it is possible to get root access on the sensitive hosts using only exploits, otherwise the pen-tester will never be able to reach the goal.

The ``tiny`` example scenario has a single privilege escalation action ``pe_tomcat`` which targets the tomcat process running on linux hosts, has a cost of 1 and results in root level access:

.. code-block:: yaml

    privilege_escalation:
      pe_tomcat:
        process: tomcat
        os: linux
        prob: 1.0
        cost: 1
        access: root


Scan costs
""""""""""

Each scan has a non-negative cost associated with it. This cost can represent whatever you wish and is factored into the reward the agent receives each time a scan is performed.

Scan costs are easy to define, requiring only a non-negative float or integer value. You must specify the cost of all scans. Here, in the example ``tiny`` scenario, we define a cost of 1 for all scans:

.. code-block:: yaml

    service_scan_cost: 1
    os_scan_cost: 1
    subnet_scan_cost: 1
    process_scan_cost: 1


Step limit
""""""""""

The step limit defines the maximum number of steps (i.e. actions) the pen-tester has to reach the goal within a single episode. During simulation, once the step limit is reached the episode is considered done, with the agent having failed to reach the goal.

Defining the step limit is easy since it requires only a positive integer value. For example, here we define a step limit of 1000 for the ``tiny`` scenario:

.. code-block:: yaml

    step_limit: 1000



With that we have everything we need to define a custom scenario. Running the scenario is even easier!


.. _`running_custom_yaml`:

Running a custom YAML scenario
------------------------------

To create a ``NASimEnv`` from a custom YAML scenario file we use the ``nasim.load()`` function:

.. code-block:: python

   import nasim
   env = nasim.load('path/to/custom/scenario.yaml')


The load function also takes some additional parameters to control the observation mode and observation and action spaces for the environment, see :ref:`nasim_init` for reference and :ref:`env_params` for explanation.

If there are any issues with the format of your file you should receive some (hopefully helpful) error messages when attempting to load it. Once the environment is loaded successfully you can interact with it as per normal (see :ref:`env_tute` for more details).


================================================
FILE: docs/source/tutorials/environment.rst
================================================
.. _`env_tute`:

Interacting with NASim Environment
==================================

Assuming you are comfortable loading an environment from a scenario (see :ref:`loading_tute` or :ref:`gym_load_tute`), then interacting with a NASim Environment is very easy and follows the same interface as `gymnasium <https://github.com/Farama-Foundation/Gymnasium/>`_.


Starting the environment
------------------------

First thing is simply loading the environment::

  import nasim
  # load my environment in the desired way (make_benchmark, load, generate)
  env = nasim.make_benchmark("tiny")

  # or using gym
  import gymnasium as gym
  env = gym.make("nasim:Tiny-PO-v0")


Here we are using the default environment parameters: ``fully_obs=False``, ``flat_actions=True``, and ``flat_obs=True``.

The number of actions can be retrieved from the environment ``action_space`` attribute as follows::

  # When flat_actions=True
  num_actions = env.action_space.n

  # When flat_actions=False
  nvec_actions = env.action_space.nvec


The shape of the observations can be retrieved from the environment ``observation_space`` attribute as follows::

  obs_shape = env.observation_space.shape



Getting the initial observation and resetting the environment
-------------------------------------------------------------

To reset the environment and get the initial observation, use the ``reset()`` function::

  o, info = env.reset()


The ``info`` return value contains optional auxiliary information.


Performing a single step
------------------------

A step in the environment can be taken using the ``step(action)`` function. The form of ``action`` depends on whether ``flat_actions=True`` or ``flat_actions=False``; in our example we can simply pass an integer with 0 <= action < N, which specifies the index of the action in the action space. The ``step`` function returns a ``(Observation, float, bool, bool, dict)`` tuple corresponding to the observation, reward, done, step limit reached, and auxiliary info, respectively::

  action = ...  # an integer in the range [0, env.action_space.n)
  o, r, done, step_limit_reached, info = env.step(action)


If ``done=True`` then the goal has been reached and the episode is over. Alternatively, if the current scenario has a step limit and ``step_limit_reached=True`` then the step limit has been reached. In both cases it is recommended to stop or reset the environment, otherwise there is no guarantee of what will happen (especially in the first case).


Visualizing the environment
---------------------------

You can use the ``render()`` function to get a human readable visualization of the state of the environment. To use render correctly make sure to pass ``render_mode="human"`` to the environment initialization function::

  import nasim
  # load my environment in the desired way (make_benchmark, load, generate)
  env = nasim.make_benchmark("tiny", render_mode="human")

  # or using gym
  import gymnasium as gym
  env = gym.make("nasim:Tiny-PO-v0", render_mode="human")

  env.reset()
  # render the environment
  # (if render_mode="human" is not passed during initialization this will do nothing)
  env.render()


An example agent
----------------

Some example agents are provided in the ``nasim/agents`` directory. Here is a quick example of a hypothetical agent interacting with the environment::

  import nasim

  env = nasim.make_benchmark("tiny")

  agent = AnAgent(...)

  o, info = env.reset()
  total_reward = 0
  done = False
  step_limit_reached = False
  while not done and not step_limit_reached:
      a = agent.choose_action(o)
      o, r, done, step_limit_reached, info = env.step(a)
      total_reward += r

  print("Done")
  print("Total reward =", total_reward)


It's as simple as that.


================================================
FILE: docs/source/tutorials/gym_load.rst
================================================
.. _`gym_load_tute`:

Starting NASim using Gymnasium
==============================

On startup NASim also registers each benchmark scenario as a `Gymnasium <https://github.com/Farama-Foundation/Gymnasium/>`_ environment, allowing NASim benchmark environments to be loaded using ``gymnasium.make()``.

:ref:`all_benchmark_scenarios` can be loaded using ``gymnasium.make()``.

.. note:: Custom scenarios must be loaded using the nasim library directly, see :ref:`loading_tute`.


Environment Naming
------------------

Unlike when starting an environment using the ``nasim`` library directly, where environment modes are specified as arguments to the ``nasim.make_benchmark()`` function, when using ``gymnasium.make()`` the scenario and mode are specified in a single name.

When using ``gymnasium.make()`` each environment has the following mode and naming convention:

  ``ScenarioName[PO][2D][VA]-vX``

Where:

- ``ScenarioName`` is the name of the benchmark scenario in camel case (e.g. ``tiny`` becomes ``Tiny`` and ``medium-single-site`` becomes ``MediumSingleSite``)
- ``[PO]`` is optional and specifies the environment is in partially observable mode, if it is not included the environment is in fully observable mode.
- ``[2D]`` is optional and specifies the environment is to return 2D observations, if it is not included the environment returns 1D observations.
- ``[VA]`` is optional and specifies the environment is to accept Vector actions (parametrised actions), if it is not included the environment expects integer (flat) actions.
- ``vX`` is the environment version. Currently (as of version ``0.10.0``) all environments are on ``v0``

For example, the 'tiny' benchmark scenario in partially observable mode with flat action-space and flat observation space has the name:

  ``TinyPO-v0``

Or the 'small-gen' benchmark scenario in fully observable mode with parametrised action-space and flat observation-space has the name:

  ``SmallGenVA-v0``


Or the 'medium-single-site' benchmark scenario in partially observable mode with parametrised action-space and 2D observation-space has the name:

  ``MediumSingleSitePO2DVA-v0``
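
The naming convention can be captured in a few lines of Python mirroring the registration loop NASim runs on startup (the ``nasim_env_id`` helper is illustrative, not part of the NASim API):

```python
def nasim_env_id(scenario_name, partially_obs=False, obs_2d=False,
                 vector_actions=False):
    """Build a Gymnasium environment id following NASim's naming convention."""
    # camel-case the scenario name: "medium-single-site" -> "MediumSingleSite"
    name = "".join(part.capitalize() for part in scenario_name.split("-"))
    if partially_obs:
        name += "PO"
    if obs_2d:
        name += "2D"
    if vector_actions:
        name += "VA"
    return f"{name}-v0"

print(nasim_env_id("tiny", partially_obs=True))              # TinyPO-v0
print(nasim_env_id("small-gen", vector_actions=True))        # SmallGenVA-v0
print(nasim_env_id("medium-single-site", True, True, True))  # MediumSingleSitePO2DVA-v0
```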


.. note:: See :ref:`env_params` for more explanation on the different modes.


Usage
-----

Now that we understand the naming of environments, making a new environment using ``gymnasium.make()`` is easy.

For example to create a new ``TinyPO-v0`` environment:

.. code:: python

   import gymnasium as gym
   env = gym.make("nasim:TinyPO-v0")

   # to specify render mode
   env = gym.make("nasim:TinyPO-v0", render_mode="human")


================================================
FILE: docs/source/tutorials/index.rst
================================================
.. _tutorials:

Tutorials
=========

.. toctree::
    :maxdepth: 1

    installation
    loading
    gym_load
    environment
    scenarios
    creating_scenarios


================================================
FILE: docs/source/tutorials/installation.rst
================================================
.. _installation:

Installation
==============


Dependencies
--------------

This framework is tested to work under Python 3.7 or later.

The required dependencies:

* Python >= 3.7
* Gym >= 0.17
* NumPy >= 1.18
* PyYaml >= 5.3

For rendering:

* NetworkX >= 2.4
* prettytable >= 0.7.2
* Matplotlib >= 3.1.3

We recommend using the bleeding-edge version, installed by following the :ref:`dev-install`. If you want a simpler installation procedure and do not intend to modify the learning algorithms etc. yourself, you can look at the :ref:`user-install`.

.. _user-install:

User install instructions
--------------------------

NASim is available on PyPI and can be installed with ``pip`` using the following command:

.. code-block:: bash

    pip install nasim


This will install the base package, which includes all dependencies needed to use NASim. You can also install the dependencies for building the docs, running tests, and running the DQN example agent, either separately or all together, as follows:

.. code-block:: bash

    # install dependencies for building docs
    pip install nasim[docs]

    # install dependencies for running tests
    pip install nasim[test]

    # install dependencies for running dqn_agent
    pip install nasim[dqn]

    # install all dependencies
    pip install nasim[all]



.. _dev-install:

Developer install instructions
-------------------------------

As a developer, you can set yourself up with the bleeding-edge version of NASim with:

.. code-block:: bash

    git clone -b master https://github.com/Jjschwartz/NetworkAttackSimulator.git


You can install the framework as a package along with all dependencies with (remove the ``[all]`` if you just want the base install):

.. code-block:: bash

    pip install -e .[all]


================================================
FILE: docs/source/tutorials/loading.rst
================================================
.. _`loading_tute`:

Starting a NASim Environment
============================

Interaction with NASim is done primarily via the :class:`~nasim.envs.environment.NASimEnv` class, which handles a simulated network environment as defined by the chosen scenario.

There are two ways to start a new environment: (i) via the nasim library directly, or (ii) using the ``gymnasium.make()`` function of the Gymnasium library.

In this tutorial we will be covering the first method. For the second method check out :ref:`gym_load_tute`.


.. _`env_params`:

Environment Settings
--------------------

For initialization the NASimEnv class takes a scenario definition and three optional arguments.

The scenario defines the network properties and the pen-tester specific information (e.g. exploits available, etc). For this tutorial we are going to stick to how to start a new environment; details on scenarios are covered in :ref:`scenarios_tute`.

The three optional arguments control the environment modes:

- ``fully_obs`` : the observability mode of the environment; if True uses fully observable mode, otherwise partially observable (default=False)
- ``flat_actions`` : if True uses a flat action space, otherwise uses a parameterised action space (default=True)
- ``flat_obs`` : if True uses a 1D observation space, otherwise uses a 2D observation space (default=True)


If using fully observable mode (``fully_obs=True``) then the entire state of the network and the attack is observed after each step. This is 'easy' mode and does not reflect the reality of pen-testing, but it is useful for getting started and sanity checking algorithms and environments. When using partially observable mode (``fully_obs=False``) the agent starts with no knowledge of the location, configuration, and value of any host on the network and receives only observations of features directly related to the action performed at each step. This is 'hard' mode and reflects the reality of pen-testing more accurately.

Whether the environment is fully or partially observable has no effect on the size and shape of the action and observation spaces or how the agent interacts with the environment. It will have significant implications for the algorithms used to solve the environment, but that is beyond the scope of this tutorial.

Using ``flat_actions=True`` means our action space is made up of N discrete actions, where N is based on the number of hosts in the network and the number of exploits, privilege escalation actions, and scans available. For the ``tiny`` example there are 3 hosts, each with 1 exploit, 1 privilege escalation action, and 4 scans (OS, service, process, and subnet), for a total of 3 * (1 + 1 + 4) = 18 actions. If ``flat_actions=False`` then each action is a vector, with each element of the vector specifying a parameter of the action. For more info see :ref:`actions`.
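
The flat action-space arithmetic can be sketched in a couple of lines. Note that this counts the privilege escalation action and the process scan that the full ``tiny`` scenario file also defines (the ``flat_action_count`` helper is illustrative, not part of the NASim API):

```python
def flat_action_count(num_hosts, num_exploits, num_privescs, num_scans):
    # flat action space: every exploit, privilege escalation action,
    # and scan is instantiated once for each host in the network
    return num_hosts * (num_exploits + num_privescs + num_scans)

# ``tiny``: 3 hosts, each with 1 exploit, 1 privilege escalation
# action, and 4 scans (OS, service, process, and subnet)
print(flat_action_count(3, 1, 1, 4))  # 18
```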

Using ``flat_obs=True`` means the observations returned will be a 1D vector. If ``flat_obs=False``, observations will be a 2D matrix. For an explanation of the features of this vector see :ref:`observation`.


.. _`loading_env`:

Loading an Environment from a Scenario
--------------------------------------

NASim Environments can be constructed from scenarios in three ways: making an existing scenario, loading from a .yaml file, and generating from parameters.

.. note:: Each of the methods described below also accepts the ``fully_obs``, ``flat_actions`` and ``flat_obs`` boolean arguments.


.. _`make_existing`:

Making an existing scenario
^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is the easiest method for loading a new environment and closely matches the `Gymnasium <https://github.com/Farama-Foundation/Gymnasium/>`_ way of doing things. Loading an existing scenario is as easy as::

  import nasim
  env = nasim.make_benchmark("tiny")

And you are done.

You can also pass in a random seed using the ``seed`` argument, which only has an effect when using a generated scenario.

.. note::  This method only works with the benchmark scenarios that come with NASim (for the full list see the :ref:`benchmark_scenarios`).


Loading a scenario from a YAML file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you wish to load an existing or custom scenario defined in a YAML file, this is also very straightforward::

  import nasim
  env = nasim.load("path/to/scenario.yaml")

And once again, you are done (given your file is in a valid format)!


Generating a scenario
^^^^^^^^^^^^^^^^^^^^^

The final method for loading a new environment is to generate it using the NASim scenario generator. There are quite a number of parameters that can be used to control what scenario is generated (for the full list see the :ref:`scenario_generator` class), but the two key parameters are the number of hosts in the network and the number of services running (which also controls the number of exploits, unless otherwise specified).

To generate a new environment with 5 hosts running a possible 3 services::

  import nasim
  env = nasim.generate(5, 3)

And you're done! If you want to pass in some other parameters (say, the number of possible operating systems) these can be passed in as keyword arguments::

  env = nasim.generate(5, 3, num_os=3)


Once again, for a full list of available parameters refer to the :ref:`scenario_generator` documentation.


================================================
FILE: docs/source/tutorials/scenarios.rst
================================================
.. _`scenarios_tute`:

Understanding Scenarios
=======================

A scenario in NASim defines all the necessary properties for creating a network environment. Each scenario definition can be broken down into two components: the network configuration and the pen-tester.

Network Configuration
---------------------

The network configuration is defined by the following properties:

- *subnets*: the number and size of the subnets in the network.
- *topology*: how the different subnets in the network are connected
- *host configurations*: the address, OS, services, processes, and firewalls for each host in the network
- *firewall*: which communication is prevented between subnets

*Note*, for the host configurations we are generally only interested in services and processes that the pen-tester has exploits for, so we typically ignore any non-vulnerable services and processes in order to reduce the problem size.

Pen-Tester
----------

The pen-tester is defined by:

- *exploits*: the set of exploits available to the pen-tester
- *privescs*: the set of privilege escalation actions available to the pen-tester
- *scan costs*: the cost of performing each type of scan (service, OS, process, and subnet)
- *sensitive hosts*: the target hosts on the network and their value

Example Scenario
----------------

To illustrate these properties here we show an example scenario, where the aim of the pen-tester is to gain root access to the server in the sensitive subnet and one of the hosts in the user subnet.

The figure below shows the layout of our example network.

.. image:: example_network.png
  :width: 700

From the figure we can see that this network has the following properties:

- *subnets*: three subnets: DMZ with a single server, Sensitive with a single server and User with three user machines.
- *topology*: Only the DMZ is connected to the internet, while all subnets in the network are interconnected.
- *host configurations*: The address, OS, services, and processes running on each host are shown next to each host (e.g. the server in the DMZ subnet has address (1, 0), has a linux OS, is running http and ssh services, and the tomcat process). The host firewall settings are shown in the table in the top-right of the figure. Here only host *(1, 0)* has a firewall configured, which blocks any SSH connections from hosts *(3, 0)* and *(3, 1)*.
- *firewall*: The arrows above and below the firewalls indicate which services can be communicated with in each direction between subnets and between the DMZ subnet and the internet (e.g. the internet can communicate with http services running on hosts in the DMZ, while the firewall blocks no communication from the DMZ to the internet).

Next we need to define our pen-tester, which we specify based on the scenario we wish to simulate.

- *exploits*: for this scenario the pen-tester has access to three exploits

  1. *ssh_exploit*: which exploits the ssh service running on a windows machine, has a cost of 2, a success probability of 0.6, and results in user level access if successful.
  2. *ftp_exploit*: which exploits the ftp service running on a linux machine, has a cost of 1, a success probability of 0.9, and results in root level access if successful.
  3. *http_exploit*: which exploits the http service running on any OS, has a cost of 3, a success probability of 1.0, and results in user level access if successful.

- *privescs*: for this scenario the pen-tester has access to two privilege escalation actions

  1. *pe_tomcat*: exploits the tomcat process running on a linux machine to gain root access. It has a cost of 1 and success probability of 1.0.
  2. *pe_daclsvc*: exploits the daclsvc process running on a windows machine to gain root access. It has a cost of 1 and success probability of 1.0.

- *scan costs*: here we need to specify the cost of each type of scan

  1. *service_scan*: 1
  2. *os_scan*: 2
  3. *process_scan*: 1
  4. *subnet_scan*: 1

- *sensitive hosts*: here we have two target hosts

  1. *(2, 0), 1000* : the server on the sensitive subnet, which has a value of 1000.
  2. *(3, 2), 1000* : the last host on the user subnet, which has a value of 1000.

And with that our scenario is fully defined and we have everything we need to run an attack simulation.
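
The pen-tester definition above can be collected into a plain data structure. The sketch below is a Python dict that loosely mirrors the layout of a NASim YAML scenario file; it is illustrative only, not a format NASim consumes directly:

```python
# illustrative pen-tester definition for the example scenario
pen_tester = {
    "exploits": {
        "ssh_exploit": {"service": "ssh", "os": "windows",
                        "prob": 0.6, "cost": 2, "access": "user"},
        "ftp_exploit": {"service": "ftp", "os": "linux",
                        "prob": 0.9, "cost": 1, "access": "root"},
        "http_exploit": {"service": "http", "os": None,
                         "prob": 1.0, "cost": 3, "access": "user"},
    },
    "privilege_escalation": {
        "pe_tomcat": {"process": "tomcat", "os": "linux",
                      "prob": 1.0, "cost": 1, "access": "root"},
        "pe_daclsvc": {"process": "daclsvc", "os": "windows",
                       "prob": 1.0, "cost": 1, "access": "root"},
    },
    "service_scan_cost": 1,
    "os_scan_cost": 2,
    "process_scan_cost": 1,
    "subnet_scan_cost": 1,
    "sensitive_hosts": {(2, 0): 1000, (3, 2): 1000},
}

print(len(pen_tester["exploits"]))                  # 3
print(sum(pen_tester["sensitive_hosts"].values()))  # 2000
```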


================================================
FILE: nasim/__init__.py
================================================
import gymnasium as gym
from gymnasium.envs.registration import register

from nasim.envs import NASimEnv
from nasim.scenarios.benchmark import AVAIL_BENCHMARKS
from nasim.scenarios import \
    make_benchmark_scenario, load_scenario, generate_scenario


__all__ = ['make_benchmark', 'load', 'generate']


def make_benchmark(scenario_name,
                   seed=None,
                   fully_obs=False,
                   flat_actions=True,
                   flat_obs=True,
                   render_mode=None):
    """Make a new benchmark NASim environment.

    Parameters
    ----------
    scenario_name : str
        the name of the benchmark environment
    seed : int, optional
        random seed to use to generate environment (default=None)
    fully_obs : bool, optional
        the observability mode of environment, if True then uses fully
        observable mode, otherwise partially observable (default=False)
    flat_actions : bool, optional
        if true then uses a flat action space, otherwise will use
        parameterised action space (default=True).
    flat_obs : bool, optional
        if true then uses a 1D observation space. If False
        will use a 2D observation space (default=True)
    render_mode : str, optional
        The render mode to use for the environment.

    Returns
    -------
    NASimEnv
        a new environment instance

    Raises
    ------
    NotImplementedError
        if scenario_name does not match any implemented benchmark scenarios.
    """
    env_kwargs = {"fully_obs": fully_obs,
                  "flat_actions": flat_actions,
                  "flat_obs": flat_obs,
                  "render_mode": render_mode}
    scenario = make_benchmark_scenario(scenario_name, seed)
    return NASimEnv(scenario, **env_kwargs)


def load(path,
         fully_obs=False,
         flat_actions=True,
         flat_obs=True,
         name=None,
         render_mode=None):
    """Load NASim Environment from a .yaml scenario file.

    Parameters
    ----------
    path : str
        path to the .yaml scenario file
    fully_obs : bool, optional
        The observability mode of environment, if True then uses fully
        observable mode, otherwise partially observable (default=False)
    flat_actions : bool, optional
        if true then uses a flat action space, otherwise will use
        parameterised action space (default=True).
    flat_obs : bool, optional
        if true then uses a 1D observation space. If False
        will use a 2D observation space (default=True)
    name : str, optional
        the scenarios name, if None name will be generated from path
        (default=None)
    render_mode : str, optional
        The render mode to use for the environment.

    Returns
    -------
    NASimEnv
        a new environment object
    """
    env_kwargs = {"fully_obs": fully_obs,
                  "flat_actions": flat_actions,
                  "flat_obs": flat_obs,
                  "render_mode": render_mode}
    scenario = load_scenario(path, name=name)
    return NASimEnv(scenario, **env_kwargs)


def generate(num_hosts,
             num_services,
             fully_obs=False,
             flat_actions=True,
             flat_obs=True,
             render_mode=None,
             **params):
    """Construct Environment from an auto generated network.

    Parameters
    ----------
    num_hosts : int
        number of hosts to include in network (minimum is 3)
    num_services : int
        number of services to use in environment (minimum is 1)
    fully_obs : bool, optional
        The observability mode of environment, if True then uses fully
        observable mode, otherwise partially observable (default=False)
    flat_actions : bool, optional
        if true then uses a flat action space, otherwise will use
        parameterised action space (default=True).
    flat_obs : bool, optional
        if true then uses a 1D observation space. If False
        will use a 2D observation space (default=True)
    render_mode : str, optional
        The render mode to use for the environment.
    params : dict, optional
        generator params (see :class:`ScenarioGenerator` for full list)

    Returns
    -------
    NASimEnv
        a new environment object
    """
    env_kwargs = {"fully_obs": fully_obs,
                  "flat_actions": flat_actions,
                  "flat_obs": flat_obs,
                  "render_mode": render_mode}
    scenario = generate_scenario(num_hosts, num_services, **params)
    return NASimEnv(scenario, **env_kwargs)


def _register(id, entry_point, kwargs, nondeterministic, force=True):
    """Registers NASim as a Gymnasium Environment.

    Handles issues with re-registering gym environments.
    """
    if id in gym.envs.registry:
        if not force:
            return
        del gym.envs.registry[id]
    register(
        id=id,
        entry_point=entry_point,
        kwargs=kwargs,
        nondeterministic=nondeterministic
    )


for benchmark in AVAIL_BENCHMARKS:
    # PO - partially observable
    # 2D - use 2D Obs
    # VA - use param actions
    # tiny should yield Tiny and tiny-small should yield TinySmall
    for fully_obs in [True, False]:
        name = ''.join([g.capitalize() for g in benchmark.split("-")])
        if not fully_obs:
            name = f"{name}PO"

        _register(
            id=f"{name}-v0",
            entry_point='nasim.envs:NASimGymEnv',
            kwargs={
                "scenario": benchmark,
                "fully_obs": fully_obs,
                "flat_actions": True,
                "flat_obs": True
            },
            nondeterministic=True
        )

        _register(
            id=f"{name}2D-v0",
            entry_point='nasim.envs:NASimGymEnv',
            kwargs={
                "scenario": benchmark,
                "fully_obs": fully_obs,
                "flat_actions": True,
                "flat_obs": False
            },
            nondeterministic=True
        )

        _register(
            id=f"{name}VA-v0",
            entry_point='nasim.envs:NASimGymEnv',
            kwargs={
                "scenario": benchmark,
                "fully_obs": fully_obs,
                "flat_actions": False,
                "flat_obs": True
            },
            nondeterministic=True
        )

        _register(
            id=f"{name}2DVA-v0",
            entry_point='nasim.envs:NASimGymEnv',
            kwargs={
                "scenario": benchmark,
                "fully_obs": fully_obs,
                "flat_actions": False,
                "flat_obs": False
            },
            nondeterministic=True
        )

__version__ = "0.12.0"


================================================
FILE: nasim/agents/__init__.py
================================================


================================================
FILE: nasim/agents/bruteforce_agent.py
================================================
"""An bruteforce agent that repeatedly cycles through all available actions in
order.

To run 'tiny' benchmark scenario with default settings, run the following from
the nasim/agents dir:

$ python bruteforce_agent.py tiny

This will run the agent and display progress and final results to stdout.

To see available running arguments:

$ python bruteforce_agent.py --help
"""

from itertools import product

import nasim

LINE_BREAK = "-"*60


def run_bruteforce_agent(env, step_limit=1e6, verbose=True):
    """Run bruteforce agent on nasim environment.

    Parameters
    ----------
    env : nasim.NASimEnv
        the nasim environment to run agent on
    step_limit : int, optional
        the maximum number of steps to run agent for (default=1e6)
    verbose : bool, optional
        whether to print out progress messages or not (default=True)

    Returns
    -------
    int
        timesteps agent ran for
    float
        the total reward received by the agent
    bool
        whether the goal was reached or not
    """
    if verbose:
        print(LINE_BREAK)
        print("STARTING EPISODE")
        print(LINE_BREAK)
        print("t: Reward")

    env.reset()
    total_reward = 0
    done = False
    env_step_limit_reached = False
    steps = 0
    cycle_complete = False

    if env.flat_actions:
        act = 0
    else:
        act_iter = product(*[range(n) for n in env.action_space.nvec])

    while not done and not env_step_limit_reached and steps < step_limit:
        if env.flat_actions:
            act = (act + 1) % env.action_space.n
            cycle_complete = (steps > 0 and act == 0)
        else:
            try:
                act = next(act_iter)
                cycle_complete = False
            except StopIteration:
                act_iter = product(*[range(n) for n in env.action_space.nvec])
                act = next(act_iter)
                cycle_complete = True

        _, rew, done, env_step_limit_reached, _ = env.step(act)
        total_reward += rew

        if cycle_complete and verbose:
            print(f"{steps}: {total_reward}")
        steps += 1

    if done and verbose:
        print(LINE_BREAK)
        print("EPISODE FINISHED")
        print(LINE_BREAK)
        print(f"Goal reached = {env.goal_reached()}")
        print(f"Total steps = {steps}")
        print(f"Total reward = {total_reward}")
    elif verbose:
        print(LINE_BREAK)
        print("STEP LIMIT REACHED")
        print(LINE_BREAK)

    if done:
        done = env.goal_reached()

    return steps, total_reward, done


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("env_name", type=str, help="benchmark scenario name")
    parser.add_argument("-s", "--seed", type=int, default=0,
                        help="random seed")
    parser.add_argument("-o", "--partially_obs", action="store_true",
                        help="Partially Observable Mode")
    parser.add_argument("-p", "--param_actions", action="store_true",
                        help="Use Parameterised action space")
    parser.add_argument("-f", "--box_obs", action="store_true",
                        help="Use 2D observation space")
    args = parser.parse_args()

    nasimenv = nasim.make_benchmark(
        args.env_name,
        args.seed,
        not args.partially_obs,
        not args.param_actions,
        not args.box_obs
    )
    if not args.param_actions:
        print(nasimenv.action_space.n)
    else:
        print(nasimenv.action_space.nvec)
    run_bruteforce_agent(nasimenv)
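For the parameterised case above, `itertools.product` is what lets the agent enumerate every combination of the `MultiDiscrete` action components. A minimal standalone sketch of that enumeration, using an assumed toy `nvec` of `[2, 3]` (2 action types, 3 targets):

```python
from itertools import product

# Assumed example shape for a MultiDiscrete action space:
# 2 action types x 3 targets (not taken from a real scenario).
nvec = [2, 3]

# Enumerate every (action_type, target) pair, exactly as act_iter does above.
actions = list(product(*[range(n) for n in nvec]))

print(len(actions))   # 6 combinations (2 * 3)
print(actions[0])     # (0, 0)
print(actions[-1])    # (1, 2)
```

When the iterator is exhausted, the agent simply rebuilds it and flags `cycle_complete`, so the enumeration restarts from `(0, 0)`.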


================================================
FILE: nasim/agents/dqn_agent.py
================================================
"""An example DQN Agent.

It uses pytorch 1.5+ and tensorboard libraries (HINT: these dependencies can
be installed by running pip install nasim[dqn])

To run 'tiny' benchmark scenario with default settings, run the following from
the nasim/agents dir:

$ python dqn_agent.py tiny

To see detailed results using tensorboard:

$ tensorboard --logdir runs/

To see available hyperparameters:

$ python dqn_agent.py --help

Notes
-----

This is by no means a state-of-the-art implementation of DQN; it is designed
to be an example implementation that can be used as a reference for building
your own agents.
"""
import random
from pprint import pprint

from gymnasium import error
import numpy as np

import nasim

try:
    import torch
    import torch.nn as nn
    import torch.optim as optim
    import torch.nn.functional as F
    from torch.utils.tensorboard import SummaryWriter
except ImportError as e:
    raise error.DependencyNotInstalled(
        f"{e}. (HINT: you can install dqn_agent dependencies by running "
        "'pip install nasim[dqn]'.)"
    )


class ReplayMemory:

    def __init__(self, capacity, s_dims, device="cpu"):
        self.capacity = capacity
        self.device = device
        self.s_buf = np.zeros((capacity, *s_dims), dtype=np.float32)
        self.a_buf = np.zeros((capacity, 1), dtype=np.int64)
        self.next_s_buf = np.zeros((capacity, *s_dims), dtype=np.float32)
        self.r_buf = np.zeros(capacity, dtype=np.float32)
        self.done_buf = np.zeros(capacity, dtype=np.float32)
        self.ptr, self.size = 0, 0

    def store(self, s, a, next_s, r, done):
        self.s_buf[self.ptr] = s
        self.a_buf[self.ptr] = a
        self.next_s_buf[self.ptr] = next_s
        self.r_buf[self.ptr] = r
        self.done_buf[self.ptr] = done
        self.ptr = (self.ptr + 1) % self.capacity
        self.size = min(self.size+1, self.capacity)

    def sample_batch(self, batch_size):
        sample_idxs = np.random.choice(self.size, batch_size)
        batch = [self.s_buf[sample_idxs],
                 self.a_buf[sample_idxs],
                 self.next_s_buf[sample_idxs],
                 self.r_buf[sample_idxs],
                 self.done_buf[sample_idxs]]
        return [torch.from_numpy(buf).to(self.device) for buf in batch]
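`ReplayMemory` above is a fixed-size circular buffer: once `ptr` wraps past `capacity`, the oldest transitions are overwritten while `size` stays capped. A numpy-only sketch of that pointer arithmetic, with a toy capacity of 3 and scalar stand-ins for transitions:

```python
import numpy as np

capacity = 3
buf = np.zeros(capacity, dtype=np.float32)
ptr, size = 0, 0


def store(value):
    # Same update rule as ReplayMemory.store: write at ptr, advance
    # modulo capacity, and cap size at capacity.
    global ptr, size
    buf[ptr] = value
    ptr = (ptr + 1) % capacity
    size = min(size + 1, capacity)


for v in [1.0, 2.0, 3.0, 4.0]:
    store(v)

print(buf.tolist())  # [4.0, 2.0, 3.0] -- the oldest entry (1.0) was overwritten
print(ptr, size)     # 1 3
```

Sampling then draws indices uniformly from `[0, size)`, so only slots that have actually been written are ever returned.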


class DQN(nn.Module):
    """A simple Deep Q-Network """

    def __init__(self, input_dim, layers, num_actions):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(input_dim[0], layers[0])])
        for i in range(1, len(layers)):
            self.layers.append(nn.Linear(layers[i-1], layers[i]))
        self.out = nn.Linear(layers[-1], num_actions)

    def forward(self, x):
        for layer in self.layers:
            x = F.relu(layer(x))
        x = self.out(x)
        return x

    def save_DQN(self, file_path):
        torch.save(self.state_dict(), file_path)

    def load_DQN(self, file_path):
        self.load_state_dict(torch.load(file_path))

    def get_action(self, x):
        with torch.no_grad():
            if len(x.shape) == 1:
                x = x.view(1, -1)
            return self.forward(x).max(1)[1]


class DQNAgent:
    """A simple Deep Q-Network Agent """

    def __init__(self,
                 env,
                 seed=None,
                 lr=0.001,
                 training_steps=20000,
                 batch_size=32,
                 replay_size=10000,
                 final_epsilon=0.05,
                 exploration_steps=10000,
                 gamma=0.99,
                 hidden_sizes=[64, 64],
                 target_update_freq=1000,
                 verbose=True,
                 **kwargs):

        # This DQN implementation only works for flat actions
        assert env.flat_actions
        self.verbose = verbose
        if self.verbose:
            print(f"\nRunning DQN with config:")
            pprint(locals())

        # set seeds
        self.seed = seed
        if self.seed is not None:
            np.random.seed(self.seed)

        # environment setup
        self.env = env

        self.num_actions = self.env.action_space.n
        self.obs_dim = self.env.observation_space.shape

        # logger setup
        self.logger = SummaryWriter()

        # Training related attributes
        self.lr = lr
        self.exploration_steps = exploration_steps
        self.final_epsilon = final_epsilon
        self.epsilon_schedule = np.linspace(1.0,
                                            self.final_epsilon,
                                            self.exploration_steps)
        self.batch_size = batch_size
        self.discount = gamma
        self.training_steps = training_steps
        self.steps_done = 0

        # Neural Network related attributes
        self.device = torch.device("cuda"
                                   if torch.cuda.is_available()
                                   else "cpu")
        self.dqn = DQN(self.obs_dim,
                       hidden_sizes,
                       self.num_actions).to(self.device)
        if self.verbose:
            print(f"\nUsing Neural Network running on device={self.device}:")
            print(self.dqn)

        self.target_dqn = DQN(self.obs_dim,
                              hidden_sizes,
                              self.num_actions).to(self.device)
        self.target_update_freq = target_update_freq

        self.optimizer = optim.Adam(self.dqn.parameters(), lr=self.lr)
        self.loss_fn = nn.SmoothL1Loss()

        # replay setup
        self.replay = ReplayMemory(replay_size,
                                   self.obs_dim,
                                   self.device)

    def save(self, save_path):
        self.dqn.save_DQN(save_path)

    def load(self, load_path):
        self.dqn.load_DQN(load_path)

    def get_epsilon(self):
        if self.steps_done < self.exploration_steps:
            return self.epsilon_schedule[self.steps_done]
        return self.final_epsilon

    def get_egreedy_action(self, o, epsilon):
        if random.random() > epsilon:
            o = torch.from_numpy(o).float().to(self.device)
            return self.dqn.get_action(o).cpu().item()
        return random.randint(0, self.num_actions-1)

    def optimize(self):
        batch = self.replay.sample_batch(self.batch_size)
        s_batch, a_batch, next_s_batch, r_batch, d_batch = batch

        # get q_vals for each state and the action performed in that state
        q_vals_raw = self.dqn(s_batch)
        q_vals = q_vals_raw.gather(1, a_batch).squeeze()

        # get target q val = max val of next state
        with torch.no_grad():
            target_q_val_raw = self.target_dqn(next_s_batch)
            target_q_val = target_q_val_raw.max(1)[0]
            target = r_batch + self.discount*(1-d_batch)*target_q_val

        # calculate loss
        loss = self.loss_fn(q_vals, target)

        # optimize the model
        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()

        if self.steps_done % self.target_update_freq == 0:
            self.target_dqn.load_state_dict(self.dqn.state_dict())

        q_vals_max = q_vals_raw.max(1)[0]
        mean_v = q_vals_max.mean().item()
        return loss.item(), mean_v

    def train(self):
        if self.verbose:
            print("\nStarting training")

        num_episodes = 0
        training_steps_remaining = self.training_steps

        while self.steps_done < self.training_steps:
            ep_results = self.run_train_episode(training_steps_remaining)
            ep_return, ep_steps, goal = ep_results
            num_episodes += 1
            training_steps_remaining -= ep_steps

            self.logger.add_scalar("episode", num_episodes, self.steps_done)
            self.logger.add_scalar(
                "epsilon", self.get_epsilon(), self.steps_done
            )
            self.logger.add_scalar(
                "episode_return", ep_return, self.steps_done
            )
            self.logger.add_scalar(
                "episode_steps", ep_steps, self.steps_done
            )
            self.logger.add_scalar(
                "episode_goal_reached", int(goal), self.steps_done
            )

            if num_episodes % 10 == 0 and self.verbose:
                print(f"\nEpisode {num_episodes}:")
                print(f"\tsteps done = {self.steps_done} / "
                      f"{self.training_steps}")
                print(f"\treturn = {ep_return}")
                print(f"\tgoal = {goal}")

        self.logger.close()
        if self.verbose:
            print("Training complete")
            print(f"\nEpisode {num_episodes}:")
            print(f"\tsteps done = {self.steps_done} / {self.training_steps}")
            print(f"\treturn = {ep_return}")
            print(f"\tgoal = {goal}")

    def run_train_episode(self, step_limit):
        o, _ = self.env.reset()
        done = False
        env_step_limit_reached = False

        steps = 0
        episode_return = 0

        while not done and not env_step_limit_reached and steps < step_limit:
            a = self.get_egreedy_action(o, self.get_epsilon())

            next_o, r, done, env_step_limit_reached, _ = self.env.step(a)
            self.replay.store(o, a, next_o, r, done)
            self.steps_done += 1
            loss, mean_v = self.optimize()
            self.logger.add_scalar("loss", loss, self.steps_done)
            self.logger.add_scalar("mean_v", mean_v, self.steps_done)

            o = next_o
            episode_return += r
            steps += 1

        return episode_return, steps, self.env.goal_reached()

    def run_eval_episode(self,
                         env=None,
                         render=False,
                         eval_epsilon=0.05,
                         render_mode="human"):
        if env is None:
            env = self.env

        original_render_mode = env.render_mode
        env.render_mode = render_mode

        o, _ = env.reset()
        done = False
        env_step_limit_reached = False

        steps = 0
        episode_return = 0

        line_break = "="*60
        if render:
            print("\n" + line_break)
            print(f"Running EVALUATION using epsilon = {eval_epsilon:.4f}")
            print(line_break)
            env.render()
            input("Initial state. Press enter to continue..")

        while not done and not env_step_limit_reached:
            a = self.get_egreedy_action(o, eval_epsilon)
            next_o, r, done, env_step_limit_reached, _ = env.step(a)
            o = next_o
            episode_return += r
            steps += 1
            if render:
                print("\n" + line_break)
                print(f"Step {steps}")
                print(line_break)
                print(f"Action Performed = {env.action_space.get_action(a)}")
                env.render()
                print(f"Reward = {r}")
                print(f"Done = {done}")
                print(f"Step limit reached = {env_step_limit_reached}")
                input("Press enter to continue..")

                if done or env_step_limit_reached:
                    print("\n" + line_break)
                    print("EPISODE FINISHED")
                    print(line_break)
                    print(f"Goal reached = {env.goal_reached()}")
                    print(f"Total steps = {steps}")
                    print(f"Total reward = {episode_return}")

        env.render_mode = original_render_mode
        return episode_return, steps, env.goal_reached()


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("env_name", type=str, help="benchmark scenario name")
    parser.add_argument("--render_eval", action="store_true",
                        help="Renders final policy")
    parser.add_argument("-o", "--partially_obs", action="store_true",
                        help="Partially Observable Mode")
    parser.add_argument("--hidden_sizes", type=int, nargs="*",
                        default=[64, 64],
                        help="(default=[64. 64])")
    parser.add_argument("--lr", type=float, default=0.001,
                        help="Learning rate (default=0.001)")
    parser.add_argument("-t", "--training_steps", type=int, default=20000,
                        help="training steps (default=20000)")
    parser.add_argument("--batch_size", type=int, default=32,
                        help="(default=32)")
    parser.add_argument("--target_update_freq", type=int, default=1000,
                        help="(default=1000)")
    parser.add_argument("--seed", type=int, default=0,
                        help="(default=0)")
    parser.add_argument("--replay_size", type=int, default=100000,
                        help="(default=100000)")
    parser.add_argument("--final_epsilon", type=float, default=0.05,
                        help="(default=0.05)")
    parser.add_argument("--init_epsilon", type=float, default=1.0,
                        help="(default=1.0)")
    parser.add_argument("--exploration_steps", type=int, default=10000,
                        help="(default=10000)")
    parser.add_argument("--gamma", type=float, default=0.99,
                        help="(default=0.99)")
    parser.add_argument("--quite", action="store_false",
                        help="Run in Quite mode")
    args = parser.parse_args()

    env = nasim.make_benchmark(args.env_name,
                               args.seed,
                               fully_obs=not args.partially_obs,
                               flat_actions=True,
                               flat_obs=True)
    dqn_agent = DQNAgent(env, verbose=args.quite, **vars(args))
    dqn_agent.train()
    dqn_agent.run_eval_episode(render=args.render_eval)
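The agent's `epsilon_schedule` is a linear interpolation from 1.0 down to `final_epsilon` over `exploration_steps`, after which epsilon stays fixed. A standalone sketch of `get_epsilon` using the same defaults as above:

```python
import numpy as np

final_epsilon = 0.05
exploration_steps = 10000
schedule = np.linspace(1.0, final_epsilon, exploration_steps)


def get_epsilon(steps_done):
    # Mirrors DQNAgent.get_epsilon: linear decay while exploring,
    # then constant at final_epsilon.
    if steps_done < exploration_steps:
        return schedule[steps_done]
    return final_epsilon


print(get_epsilon(0))      # 1.0
print(get_epsilon(20000))  # 0.05 (past the exploration phase)
```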


================================================
FILE: nasim/agents/keyboard_agent.py
================================================
"""An agent that lets the user interact with NASim using the keyboard.

To run the 'tiny' benchmark scenario with default settings, run the following
from the nasim/agents dir:

$ python keyboard_agent.py tiny

This will run the agent and display the game in stdout.

To see available running arguments:

$ python keyboard_agent.py --help
"""
import nasim
from nasim.envs.action import Exploit, PrivilegeEscalation


LINE_BREAK = "-"*60
LINE_BREAK2 = "="*60


def print_actions(action_space):
    for a in range(action_space.n):
        print(f"{a} {action_space.get_action(a)}")
    print(LINE_BREAK)


def choose_flat_action(env):
    print_actions(env.action_space)
    while True:
        try:
            idx = int(input("Choose action number: "))
            action = env.action_space.get_action(idx)
            print(f"Performing: {action}")
            return action
        except Exception:
            print("Invalid choice. Try again.")


def display_actions(actions):
    action_names = list(actions)
    for i, name in enumerate(action_names):
        a_def = actions[name]
        output = [f"{i} {name}:"]
        output.extend([f"{k}={v}" for k, v in a_def.items()])
        print(" ".join(output))


def choose_item(items):
    while True:
        try:
            idx = int(input("Choose number: "))
            return items[idx]
        except Exception:
            print("Invalid choice. Try again.")


def choose_param_action(env):
    print("1. Choose Action Type:")
    print("----------------------")
    for i, atype in enumerate(env.action_space.action_types):
        print(f"{i} {atype.__name__}")
    while True:
        try:
            atype_idx = int(input("Choose index: "))
            # check idx valid
            atype = env.action_space.action_types[atype_idx]
            break
        except Exception:
            print("Invalid choice. Try again.")

    print("------------------------")
    print("2. Choose Target Subnet:")
    print("------------------------")
    num_subnets = env.action_space.nvec[1]
    while True:
        try:
            subnet = int(input(f"Choose subnet in [1, {num_subnets}]: "))
            if subnet < 1 or subnet > num_subnets:
                raise ValueError()
            break
        except Exception:
            print("Invalid choice. Try again.")

    print("----------------------")
    print("3. Choose Target Host:")
    print("----------------------")
    num_hosts = env.scenario.subnets[subnet]
    while True:
        try:
            host = int(input(f"Choose host in [0, {num_hosts-1}]: "))
            if host < 0 or host > num_hosts-1:
                raise ValueError()
            break
        except Exception:
            print("Invalid choice. Try again.")

    # subnet-1, since action_space handles exclusion of internet subnet
    avec = [atype_idx, subnet-1, host, 0, 0]
    if atype not in (Exploit, PrivilegeEscalation):
        action = env.action_space.get_action(avec)
        print("----------------")
        print(f"ACTION SELECTED: {action}")
        return action

    target = (subnet, host)
    if atype == Exploit:
        print("------------------")
        print("4. Choose Exploit:")
        print("------------------")
        exploits = env.scenario.exploits
        display_actions(exploits)
        e_name = choose_item(list(exploits))
        action = Exploit(name=e_name, target=target, **exploits[e_name])
    else:
        print("------------------")
        print("4. Choose Privilege Escalation:")
        print("------------------")
        privescs = env.scenario.privescs
        display_actions(privescs)
        pe_name = choose_item(list(privescs))
        action = PrivilegeEscalation(
            name=pe_name, target=target, **privescs[pe_name]
        )

    print("----------------")
    print(f"ACTION SELECTED: {action}")
    return action


def choose_action(env):
    input("Press enter to choose next action..")
    print("\n" + LINE_BREAK2)
    print("CHOOSE ACTION")
    print(LINE_BREAK2)
    if env.flat_actions:
        return choose_flat_action(env)
    return choose_param_action(env)


def run_keyboard_agent(env):
    """Run Keyboard agent

    Parameters
    ----------
    env : NASimEnv
        the environment

    Returns
    -------
    int
        final return
    int
        steps taken
    bool
        whether goal reached or not
    """
    print(LINE_BREAK2)
    print("STARTING EPISODE")
    print(LINE_BREAK2)

    o, _ = env.reset()
    env.render()
    total_reward = 0
    total_steps = 0
    done = False
    step_limit_reached = False
    while not done and not step_limit_reached:
        a = choose_action(env)
        o, r, done, step_limit_reached, _ = env.step(a)
        total_reward += r
        total_steps += 1
        print("\n" + LINE_BREAK2)
        print("OBSERVATION RECIEVED")
        print(LINE_BREAK2)
        env.render()
        print(f"Reward={r}")
        print(f"Done={done}")
        print(f"Step limit reached={step_limit_reached}")
        print(LINE_BREAK)

    return total_reward, total_steps, done


def run_generative_keyboard_agent(env, render_mode="human"):
    """Run Keyboard agent in generative mode.

    The experience is the same as in normal mode; this is mainly useful
    for testing.

    Parameters
    ----------
    env : NASimEnv
        the environment
    render_mode : str, optional
        display mode for environment (default="human")

    Returns
    -------
    int
        final return
    int
        steps taken
    bool
        whether goal reached or not
    """
    print(LINE_BREAK2)
    print("STARTING EPISODE")
    print(LINE_BREAK2)

    o, _ = env.reset()
    s = env.current_state
    env.render_state(render_mode, s)
    env.render_obs(render_mode, o)

    total_reward = 0
    total_steps = 0
    done = False
    while not done:
        a = choose_action(env)
        ns, o, r, done, _ = env.generative_step(s, a)
        total_reward += r
        total_steps += 1
        print(LINE_BREAK2)
        print("NEXT STATE")
        print(LINE_BREAK2)
        env.render_state(render_mode, ns)
        print("\n" + LINE_BREAK2)
        print("OBSERVATION RECIEVED")
        print(LINE_BREAK2)
        env.render_obs(render_mode, o)
        print(f"Reward={r}")
        print(f"Done={done}")
        print(LINE_BREAK)
        s = ns

    if done:
        done = env.goal_reached()

    return total_reward, total_steps, done


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("env_name", type=str,
                        help="benchmark scenario name")
    parser.add_argument("-s", "--seed", type=int, default=None,
                        help="random seed (default=None)")
    parser.add_argument("-o", "--partially_obs", action="store_true",
                        help="Partially Observable Mode")
    parser.add_argument("-p", "--param_actions", action="store_true",
                        help="Use Parameterised action space")
    parser.add_argument("-g", "--use_generative", action="store_true",
                        help=("Generative environment mode. This makes no"
                              " difference for the player, but is useful"
                              " for testing."))
    args = parser.parse_args()

    env = nasim.make_benchmark(args.env_name,
                               args.seed,
                               fully_obs=not args.partially_obs,
                               flat_actions=not args.param_actions,
                               flat_obs=True,
                               render_mode="human")
    if args.use_generative:
        total_reward, steps, goal = run_generative_keyboard_agent(env,
                                                                  render_mode="human")
    else:
        total_reward, steps, goal = run_keyboard_agent(env)

    print(LINE_BREAK2)
    print("EPISODE FINISHED")
    print(LINE_BREAK)
    print(f"Goal reached = {goal}")
    print(f"Total reward = {total_reward}")
    print(f"Steps taken = {steps}")


================================================
FILE: nasim/agents/ql_agent.py
================================================
"""An example Tabular, epsilon greedy Q-Learning Agent.

This agent does not use an Experience replay (see the 'ql_replay_agent.py')

It uses pytorch 1.5+ tensorboard library for logging (HINT: these dependencies
can be installed by running pip install nasim[dqn])

To run 'tiny' benchmark scenario with default settings, run the following from
the nasim/agents dir:

$ python ql_agent.py tiny

To see detailed results using tensorboard:

$ tensorboard --logdir runs/

To see available hyperparameters:

$ python ql_agent.py --help

Notes
-----

This is by no means a state-of-the-art implementation of tabular Q-Learning.
It is designed to be an example implementation that can be used as a reference
for building your own agents and for simple experimental comparisons.
"""
import random
import numpy as np
from pprint import pprint

import nasim

try:
    from torch.utils.tensorboard import SummaryWriter
except ImportError as e:
    from gymnasium import error
    raise error.DependencyNotInstalled(
        f"{e}. (HINT: you can install tabular_q_learning_agent dependencies "
        "by running 'pip install nasim[dqn]'.)"
    )


class TabularQFunction:
    """Tabular Q-Function """

    def __init__(self, num_actions):
        self.q_func = dict()
        self.num_actions = num_actions

    def __call__(self, x):
        return self.forward(x)

    def forward(self, x):
        if isinstance(x, np.ndarray):
            # `np.int` was removed in recent NumPy versions; use builtin int
            x = str(x.astype(int))
        if x not in self.q_func:
            self.q_func[x] = np.zeros(self.num_actions, dtype=np.float32)
        return self.q_func[x]

    def forward_batch(self, x_batch):
        return np.asarray([self.forward(x) for x in x_batch])

    def update_batch(self, s_batch, a_batch, delta_batch):
        for s, a, delta in zip(s_batch, a_batch, delta_batch):
            q_vals = self.forward(s)
            q_vals[a] += delta

    def update(self, s, a, delta):
        q_vals = self.forward(s)
        q_vals[a] += delta

    def get_action(self, x):
        return int(self.forward(x).argmax())

    def display(self):
        pprint(self.q_func)
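`TabularQFunction.update` applies the classic tabular Q-learning increment, Q(s,a) += lr * (r + gamma * max_a' Q(s',a') - Q(s,a)), over a lazily initialised dict of per-state Q-vectors. A dict-based sketch of one update with toy numbers (states, rewards, and lr/gamma are all made up for illustration):

```python
import numpy as np

num_actions = 2
q_func = {}


def q(s):
    # Lazily initialise a zero Q-vector per state, as forward() does above.
    if s not in q_func:
        q_func[s] = np.zeros(num_actions, dtype=np.float32)
    return q_func[s]


lr, gamma = 0.5, 0.5
s, a, r, next_s, done = "s0", 1, 2.0, "s1", False

q(next_s)[:] = [0.0, 1.0]                          # assumed next-state values
target = r + gamma * (1 - done) * q(next_s).max()  # 2.0 + 0.5*1.0 = 2.5
td_delta = lr * (target - q(s)[a])                 # 0.5 * 2.5 = 1.25
q(s)[a] += td_delta

print(q("s0").tolist())  # [0.0, 1.25]
```

This is exactly the arithmetic `TabularQLearningAgent.optimize` performs, just inlined for a single transition.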


class TabularQLearningAgent:
    """A Tabular. epsilon greedy Q-Learning Agent using Experience Replay """

    def __init__(self,
                 env,
                 seed=None,
                 lr=0.001,
                 training_steps=10000,
                 final_epsilon=0.05,
                 exploration_steps=10000,
                 gamma=0.99,
                 verbose=True,
                 **kwargs):

        # This implementation only works for flat actions
        assert env.flat_actions
        self.verbose = verbose
        if self.verbose:
            print("\nRunning Tabular Q-Learning with config:")
            pprint(locals())

        # set seeds
        self.seed = seed
        if self.seed is not None:
            np.random.seed(self.seed)

        # environment setup
        self.env = env

        self.num_actions = self.env.action_space.n
        self.obs_dim = self.env.observation_space.shape

        # logger setup
        self.logger = SummaryWriter()

        # Training related attributes
        self.lr = lr
        self.exploration_steps = exploration_steps
        self.final_epsilon = final_epsilon
        self.epsilon_schedule = np.linspace(
            1.0, self.final_epsilon, self.exploration_steps
        )
        self.discount = gamma
        self.training_steps = training_steps
        self.steps_done = 0

        # Q-Function
        self.qfunc = TabularQFunction(self.num_actions)

    def get_epsilon(self):
        if self.steps_done < self.exploration_steps:
            return self.epsilon_schedule[self.steps_done]
        return self.final_epsilon

    def get_egreedy_action(self, o, epsilon):
        if random.random() > epsilon:
            return self.qfunc.get_action(o)
        return random.randint(0, self.num_actions-1)

    def optimize(self, s, a, next_s, r, done):
        # get q_val for state and action performed in that state
        q_vals_raw = self.qfunc.forward(s)
        q_val = q_vals_raw[a]

        # get target q val = max val of next state
        target_q_val = self.qfunc.forward(next_s).max()
        target = r + self.discount * (1-done) * target_q_val

        # calculate error and update
        td_error = target - q_val
        td_delta = self.lr * td_error

        # optimize the model
        self.qfunc.update(s, a, td_delta)

        s_value = q_vals_raw.max()
        return td_error, s_value

    def train(self):
        if self.verbose:
            print("\nStarting training")

        num_episodes = 0
        training_steps_remaining = self.training_steps

        while self.steps_done < self.training_steps:
            ep_results = self.run_train_episode(training_steps_remaining)
            ep_return, ep_steps, goal = ep_results
            num_episodes += 1
            training_steps_remaining -= ep_steps

            self.logger.add_scalar("episode", num_episodes, self.steps_done)
            self.logger.add_scalar(
                "epsilon", self.get_epsilon(), self.steps_done
            )
            self.logger.add_scalar(
                "episode_return", ep_return, self.steps_done
            )
            self.logger.add_scalar(
                "episode_steps", ep_steps, self.steps_done
            )
            self.logger.add_scalar(
                "episode_goal_reached", int(goal), self.steps_done
            )

            if num_episodes % 10 == 0 and self.verbose:
                print(f"\nEpisode {num_episodes}:")
                print(f"\tsteps done = {self.steps_done} / "
                      f"{self.training_steps}")
                print(f"\treturn = {ep_return}")
                print(f"\tgoal = {goal}")

        self.logger.close()
        if self.verbose:
            print("Training complete")
            print(f"\nEpisode {num_episodes}:")
            print(f"\tsteps done = {self.steps_done} / {self.training_steps}")
            print(f"\treturn = {ep_return}")
            print(f"\tgoal = {goal}")

    def run_train_episode(self, step_limit):
        s, _ = self.env.reset()
        done = False
        env_step_limit_reached = False

        steps = 0
        episode_return = 0

        while not done and not env_step_limit_reached and steps < step_limit:
            a = self.get_egreedy_action(s, self.get_epsilon())

            next_s, r, done, env_step_limit_reached, _ = self.env.step(a)
            self.steps_done += 1
            td_error, s_value = self.optimize(s, a, next_s, r, done)
            self.logger.add_scalar("td_error", td_error, self.steps_done)
            self.logger.add_scalar("s_value", s_value, self.steps_done)

            s = next_s
            episode_return += r
            steps += 1

        return episode_return, steps, self.env.goal_reached()

    def run_eval_episode(self,
                         env=None,
                         render=False,
                         eval_epsilon=0.05,
                         render_mode="human"):
        if env is None:
            env = self.env

        original_render_mode = env.render_mode
        env.render_mode = render_mode

        s, _ = env.reset()
        done = False
        env_step_limit_reached = False

        steps = 0
        episode_return = 0

        line_break = "="*60
        if render:
            print("\n" + line_break)
            print(f"Running EVALUATION using epsilon = {eval_epsilon:.4f}")
            print(line_break)
            env.render()
            input("Initial state. Press enter to continue..")

        while not done and not env_step_limit_reached:
            a = self.get_egreedy_action(s, eval_epsilon)
            next_s, r, done, env_step_limit_reached, _ = env.step(a)
            s = next_s
            episode_return += r
            steps += 1
            if render:
                print("\n" + line_break)
                print(f"Step {steps}")
                print(line_break)
                print(f"Action Performed = {env.action_space.get_action(a)}")
                env.render()
                print(f"Reward = {r}")
                print(f"Done = {done}")
                print(f"Step limit reached = {env_step_limit_reached}")
                input("Press enter to continue..")

                if done or env_step_limit_reached:
                    print("\n" + line_break)
                    print("EPISODE FINISHED")
                    print(line_break)
                    print(f"Goal reached = {env.goal_reached()}")
                    print(f"Total steps = {steps}")
                    print(f"Total reward = {episode_return}")

        env.render_mode = original_render_mode
        return episode_return, steps, env.goal_reached()


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("env_name", type=str, help="benchmark scenario name")
    parser.add_argument("--render_eval", action="store_true",
                        help="Renders final policy")
    parser.add_argument("--lr", type=float, default=0.001,
                        help="Learning rate (default=0.001)")
    parser.add_argument("-t", "--training_steps", type=int, default=10000,
                        help="training steps (default=10000)")
    parser.add_argument("--batch_size", type=int, default=32,
                        help="(default=32)")
    parser.add_argument("--seed", type=int, default=0,
                        help="(default=0)")
    parser.add_argument("--replay_size", type=int, default=100000,
                        help="(default=100000)")
    parser.add_argument("--final_epsilon", type=float, default=0.05,
                        help="(default=0.05)")
    parser.add_argument("--init_epsilon", type=float, default=1.0,
                        help="(default=1.0)")
    parser.add_argument("-e", "--exploration_steps", type=int, default=10000,
                        help="(default=10000)")
    parser.add_argument("--gamma", type=float, default=0.99,
                        help="(default=0.99)")
    parser.add_argument("--quiet", action="store_false",
                        help="Run in quiet mode (verbose off)")
    args = parser.parse_args()

    env = nasim.make_benchmark(
        args.env_name,
        args.seed,
        fully_obs=True,
        flat_actions=True,
        flat_obs=True
    )
    ql_agent = TabularQLearningAgent(
        env, verbose=args.quiet, **vars(args)
    )
    ql_agent.train()
    ql_agent.run_eval_episode(render=args.render_eval)


================================================
FILE: nasim/agents/ql_replay_agent.py
================================================
"""An example Tabular, epsilon greedy Q-Learning Agent using experience replay.

The replay can help improve learning stability and speed (in terms of learning
per training step), at the cost of increased memory and computation use.

It uses the PyTorch (1.5+) tensorboard library for logging (HINT: these
dependencies can be installed by running pip install nasim[dqn])

To run 'tiny' benchmark scenario with default settings, run the following from
the nasim/agents dir:

$ python ql_replay_agent.py tiny

To see detailed results using tensorboard:

$ tensorboard --logdir runs/

To see available hyperparameters:

$ python ql_replay_agent.py --help

Notes
-----

This is by no means a state of the art implementation of Tabular Q-Learning.
It is designed to be an example implementation that can be used as a reference
for building your own agents and for simple experimental comparisons.
"""
import random
from pprint import pprint

import numpy as np

import nasim

try:
    from torch.utils.tensorboard import SummaryWriter
except ImportError as e:
    from gymnasium import error
    raise error.DependencyNotInstalled(
        f"{e}. (HINT: you can install ql_replay_agent dependencies "
        "by running 'pip install nasim[dqn]'.)"
    )


class ReplayMemory:
    """Experience Replay for Tabular Q-Learning agent """

    def __init__(self, capacity, s_dims):
        self.capacity = capacity
        self.s_buf = np.zeros((capacity, *s_dims), dtype=np.float32)
        self.a_buf = np.zeros((capacity, 1), dtype=np.int32)
        self.next_s_buf = np.zeros((capacity, *s_dims), dtype=np.float32)
        self.r_buf = np.zeros(capacity, dtype=np.float32)
        self.done_buf = np.zeros(capacity, dtype=np.float32)
        self.ptr, self.size = 0, 0

    def store(self, s, a, next_s, r, done):
        self.s_buf[self.ptr] = s
        self.a_buf[self.ptr] = a
        self.next_s_buf[self.ptr] = next_s
        self.r_buf[self.ptr] = r
        self.done_buf[self.ptr] = done
        self.ptr = (self.ptr + 1) % self.capacity
        self.size = min(self.size+1, self.capacity)

    def sample_batch(self, batch_size):
        sample_idxs = np.random.choice(self.size, batch_size)
        batch = [self.s_buf[sample_idxs],
                 self.a_buf[sample_idxs],
                 self.next_s_buf[sample_idxs],
                 self.r_buf[sample_idxs],
                 self.done_buf[sample_idxs]]
        return batch
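The class above implements a fixed-capacity circular buffer: `store` overwrites the oldest transition once the buffer is full, and `sample_batch` draws uniformly from the filled region only. A minimal standalone sketch of the same mechanics (the `MiniReplay` name is illustrative, not part of nasim):

```python
import numpy as np


class MiniReplay:
    """Fixed-capacity circular experience buffer (sketch of the same idea)."""

    def __init__(self, capacity, s_dim):
        self.capacity = capacity
        self.s_buf = np.zeros((capacity, s_dim), dtype=np.float32)
        self.r_buf = np.zeros(capacity, dtype=np.float32)
        self.ptr, self.size = 0, 0

    def store(self, s, r):
        # write at ptr, overwriting the oldest entry once full
        self.s_buf[self.ptr] = s
        self.r_buf[self.ptr] = r
        self.ptr = (self.ptr + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size):
        # sample uniformly from the filled portion only
        idxs = np.random.choice(self.size, batch_size)
        return self.s_buf[idxs], self.r_buf[idxs]
```

After six stores into a capacity-4 buffer the write pointer has wrapped to index 2 while `size` stays pinned at capacity, so old transitions are silently replaced.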


class TabularQFunction:
    """Tabular Q-Function """

    def __init__(self, num_actions):
        self.q_func = dict()
        self.num_actions = num_actions

    def __call__(self, x):
        return self.forward(x)

    def forward(self, x):
        if isinstance(x, np.ndarray):
            x = str(x.astype(int))
        if x not in self.q_func:
            self.q_func[x] = np.zeros(self.num_actions, dtype=np.float32)
        return self.q_func[x]

    def forward_batch(self, x_batch):
        return np.asarray([self.forward(x) for x in x_batch])

    def update(self, s_batch, a_batch, delta_batch):
        for s, a, delta in zip(s_batch, a_batch, delta_batch):
            q_vals = self.forward(s)
            q_vals[a] += delta

    def get_action(self, x):
        return int(self.forward(x).argmax())

    def display(self):
        pprint(self.q_func)


class TabularQLearningAgent:
    """A Tabular, epsilon-greedy Q-Learning Agent using Experience Replay """

    def __init__(self,
                 env,
                 seed=None,
                 lr=0.001,
                 training_steps=10000,
                 batch_size=32,
                 replay_size=10000,
                 final_epsilon=0.05,
                 exploration_steps=10000,
                 gamma=0.99,
                 verbose=True,
                 **kwargs):

        # This implementation only works for flat actions
        assert env.flat_actions
        self.verbose = verbose
        if self.verbose:
            print("\nRunning Tabular Q-Learning with config:")
            pprint(locals())

        # set seeds
        self.seed = seed
        if self.seed is not None:
            np.random.seed(self.seed)

        # environment setup
        self.env = env

        self.num_actions = self.env.action_space.n
        self.obs_dim = self.env.observation_space.shape

        # logger setup
        self.logger = SummaryWriter()

        # Training related attributes
        self.lr = lr
        self.exploration_steps = exploration_steps
        self.final_epsilon = final_epsilon
        self.epsilon_schedule = np.linspace(
            1.0, self.final_epsilon, self.exploration_steps
        )
        self.batch_size = batch_size
        self.discount = gamma
        self.training_steps = training_steps
        self.steps_done = 0

        # Q-Function
        self.qfunc = TabularQFunction(self.num_actions)

        # replay setup
        self.replay = ReplayMemory(replay_size, self.obs_dim)

    def get_epsilon(self):
        if self.steps_done < self.exploration_steps:
            return self.epsilon_schedule[self.steps_done]
        return self.final_epsilon

    def get_egreedy_action(self, o, epsilon):
        if random.random() > epsilon:
            return self.qfunc.get_action(o)
        return random.randint(0, self.num_actions-1)
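`get_epsilon` walks a precomputed linear schedule from 1.0 down to `final_epsilon` over `exploration_steps`, then holds it constant, and `get_egreedy_action` explores with that probability. A standalone sketch of both pieces (hypothetical helper names):

```python
import random

import numpy as np


def make_epsilon_fn(final_epsilon=0.05, exploration_steps=100):
    # precomputed linear decay, exactly as np.linspace builds it above
    schedule = np.linspace(1.0, final_epsilon, exploration_steps)

    def epsilon_at(step):
        # linear decay during exploration, then constant
        return schedule[step] if step < exploration_steps else final_epsilon

    return epsilon_at


def egreedy(q_vals, epsilon, rng=random):
    # explore with probability epsilon, otherwise act greedily
    if rng.random() > epsilon:
        return int(np.argmax(q_vals))
    return rng.randint(0, len(q_vals) - 1)
```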

    def optimize(self):
        batch = self.replay.sample_batch(self.batch_size)
        s_batch, a_batch, next_s_batch, r_batch, d_batch = batch

        # get q_vals for each state and the action performed in that state
        q_vals_raw = self.qfunc.forward_batch(s_batch)
        q_vals = np.take_along_axis(q_vals_raw, a_batch, axis=1).squeeze()

        # get target q val = max val of next state
        target_q_val_raw = self.qfunc.forward_batch(next_s_batch)
        target_q_val = target_q_val_raw.max(axis=1)
        target = r_batch + self.discount*(1-d_batch)*target_q_val

        # calculate error and update
        td_error = target - q_vals
        td_delta = self.lr * td_error

        # optimize the model
        self.qfunc.update(s_batch, a_batch, td_delta)

        q_vals_max = q_vals_raw.max(axis=1)
        mean_v = q_vals_max.mean().item()
        mean_td_error = np.absolute(td_error).mean().item()
        return mean_td_error, mean_v
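Per sample, `optimize` forms the one-step target r + gamma * (1 - done) * max_a' Q(s', a') and nudges Q(s, a) by lr times the TD error. A small numpy check of the target arithmetic, with made-up illustrative values:

```python
import numpy as np

gamma = 0.99
r = np.array([1.0, 10.0])
d = np.array([0.0, 1.0])        # second transition is terminal
next_q = np.array([[0.5, 2.0],  # illustrative Q-values of the next states
                   [3.0, 4.0]])

# terminal transitions bootstrap nothing: (1 - d) masks the next-state value
target = r + gamma * (1 - d) * next_q.max(axis=1)
# target == [1.0 + 0.99 * 2.0, 10.0] == [2.98, 10.0]
```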

    def train(self):
        if self.verbose:
            print("\nStarting training")

        num_episodes = 0
        training_steps_remaining = self.training_steps

        while self.steps_done < self.training_steps:
            ep_results = self.run_train_episode(training_steps_remaining)
            ep_return, ep_steps, goal = ep_results
            num_episodes += 1
            training_steps_remaining -= ep_steps

            self.logger.add_scalar("episode", num_episodes, self.steps_done)
            self.logger.add_scalar(
                "epsilon", self.get_epsilon(), self.steps_done
            )
            self.logger.add_scalar(
                "episode_return", ep_return, self.steps_done
            )
            self.logger.add_scalar(
                "episode_steps", ep_steps, self.steps_done
            )
            self.logger.add_scalar(
                "episode_goal_reached", int(goal), self.steps_done
            )

            if num_episodes % 10 == 0 and self.verbose:
                print(f"\nEpisode {num_episodes}:")
                print(f"\tsteps done = {self.steps_done} / "
                      f"{self.training_steps}")
                print(f"\treturn = {ep_return}")
                print(f"\tgoal = {goal}")

        self.logger.close()
        if self.verbose:
            print("Training complete")
            print(f"\nEpisode {num_episodes}:")
            print(f"\tsteps done = {self.steps_done} / {self.training_steps}")
            print(f"\treturn = {ep_return}")
            print(f"\tgoal = {goal}")

    def run_train_episode(self, step_limit):
        o, _ = self.env.reset()
        done = False
        env_step_limit_reached = False

        steps = 0
        episode_return = 0

        while not done and not env_step_limit_reached and steps < step_limit:
            a = self.get_egreedy_action(o, self.get_epsilon())

            next_o, r, done, env_step_limit_reached, _ = self.env.step(a)
            self.replay.store(o, a, next_o, r, done)
            self.steps_done += 1
            mean_td_error, mean_v = self.optimize()
            self.logger.add_scalar(
                "mean_td_error", mean_td_error, self.steps_done
            )
            self.logger.add_scalar("mean_v", mean_v, self.steps_done)

            o = next_o
            episode_return += r
            steps += 1

        return episode_return, steps, self.env.goal_reached()

    def run_eval_episode(self,
                         env=None,
                         render=False,
                         eval_epsilon=0.05,
                         render_mode="human"):
        if env is None:
            env = self.env

        original_render_mode = env.render_mode
        env.render_mode = render_mode

        o, _ = env.reset()
        done = False
        env_step_limit_reached = False

        steps = 0
        episode_return = 0

        line_break = "="*60
        if render:
            print("\n" + line_break)
            print(f"Running EVALUATION using epsilon = {eval_epsilon:.4f}")
            print(line_break)
            env.render()
            input("Initial state. Press enter to continue..")

        while not done and not env_step_limit_reached:
            a = self.get_egreedy_action(o, eval_epsilon)
            next_o, r, done, env_step_limit_reached, _ = env.step(a)
            o = next_o
            episode_return += r
            steps += 1
            if render:
                print("\n" + line_break)
                print(f"Step {steps}")
                print(line_break)
                print(f"Action Performed = {env.action_space.get_action(a)}")
                env.render()
                print(f"Reward = {r}")
                print(f"Done = {done}")
                print(f"Step limit reached = {env_step_limit_reached}")
                input("Press enter to continue..")

                if done or env_step_limit_reached:
                    print("\n" + line_break)
                    print("EPISODE FINISHED")
                    print(line_break)
                    print(f"Goal reached = {env.goal_reached()}")
                    print(f"Total steps = {steps}")
                    print(f"Total reward = {episode_return}")

        env.render_mode = original_render_mode
        return episode_return, steps, env.goal_reached()


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("env_name", type=str, help="benchmark scenario name")
    parser.add_argument("--render_eval", action="store_true",
                        help="Renders final policy")
    parser.add_argument("--lr", type=float, default=0.001,
                        help="Learning rate (default=0.001)")
    parser.add_argument("-t", "--training_steps", type=int, default=10000,
                        help="training steps (default=10000)")
    parser.add_argument("--batch_size", type=int, default=32,
                        help="(default=32)")
    parser.add_argument("--seed", type=int, default=0,
                        help="(default=0)")
    parser.add_argument("--replay_size", type=int, default=100000,
                        help="(default=100000)")
    parser.add_argument("--final_epsilon", type=float, default=0.05,
                        help="(default=0.05)")
    parser.add_argument("--init_epsilon", type=float, default=1.0,
                        help="(default=1.0)")
    parser.add_argument("--exploration_steps", type=int, default=10000,
                        help="(default=10000)")
    parser.add_argument("--gamma", type=float, default=0.99,
                        help="(default=0.99)")
    parser.add_argument("--quiet", action="store_false",
                        help="Run in quiet mode (verbose off)")
    args = parser.parse_args()

    env = nasim.make_benchmark(args.env_name,
                               args.seed,
                               fully_obs=True,
                               flat_actions=True,
                               flat_obs=True)
    ql_agent = TabularQLearningAgent(
        env, verbose=args.quiet, **vars(args)
    )
    ql_agent.train()
    ql_agent.run_eval_episode(render=args.render_eval)


================================================
FILE: nasim/agents/random_agent.py
================================================
"""A random agent that selects a random action at each step

To run 'tiny' benchmark scenario with default settings, run the following from
the nasim/agents dir:

$ python random_agent.py tiny

This will run the agent and display progress and final results to stdout.

To see available running arguments:

$ python random_agent.py --help
"""

import numpy as np

import nasim

LINE_BREAK = "-"*60


def run_random_agent(env, step_limit=1e6, verbose=True):
    if verbose:
        print(LINE_BREAK)
        print("STARTING EPISODE")
        print(LINE_BREAK)
        print("t: Reward")

    env.reset()
    total_reward = 0
    done = False
    env_step_limit_reached = False
    t = 0
    a = 0

    while not done and not env_step_limit_reached and t < step_limit:
        a = env.action_space.sample()
        _, r, done, env_step_limit_reached, _ = env.step(a)
        total_reward += r
        if (t+1) % 100 == 0 and verbose:
            print(f"{t}: {total_reward}")
        t += 1

    if (done or env_step_limit_reached) and verbose:
        print(LINE_BREAK)
        print("EPISODE FINISHED")
        print(LINE_BREAK)
        print(f"Total steps = {t}")
        print(f"Total reward = {total_reward}")
    elif verbose:
        print(LINE_BREAK)
        print("STEP LIMIT REACHED")
        print(LINE_BREAK)

    if done:
        done = env.goal_reached()

    return t, total_reward, done


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("env_name", type=str,
                        help="benchmark scenario name")
    parser.add_argument("-s", "--seed", type=int, default=0,
                        help="random seed")
    parser.add_argument("-r", "--runs", type=int, default=1,
                        help="number of random runs to perform (default=1)")
    parser.add_argument("-o", "--partially_obs", action="store_true",
                        help="Partially Observable Mode")
    parser.add_argument("-p", "--param_actions", action="store_true",
                        help="Use Parameterised action space")
    parser.add_argument("-f", "--box_obs", action="store_true",
                        help="Use 2D observation space")
    args = parser.parse_args()

    seed = args.seed
    run_steps = []
    run_rewards = []
    run_goals = 0
    for i in range(args.runs):
        env = nasim.make_benchmark(args.env_name,
                                   seed,
                                   not args.partially_obs,
                                   not args.param_actions,
                                   not args.box_obs)
        steps, reward, done = run_random_agent(env, verbose=False)
        run_steps.append(steps)
        run_rewards.append(reward)
        run_goals += int(done)
        seed += 1

        if args.runs > 1:
            print(f"Run {i}:")
            print(f"\tSteps = {steps}")
            print(f"\tReward = {reward}")
            print(f"\tGoal reached = {done}")

    run_steps = np.array(run_steps)
    run_rewards = np.array(run_rewards)

    print(LINE_BREAK)
    print("Random Agent Runs Complete")
    print(LINE_BREAK)
    print(f"Mean steps = {run_steps.mean():.2f} +/- {run_steps.std():.2f}")
    print(f"Mean rewards = {run_rewards.mean():.2f} "
          f"+/- {run_rewards.std():.2f}")
    print(f"Goals reached = {run_goals} / {args.runs}")


================================================
FILE: nasim/demo.py
================================================
"""Script for running NASim demo

Usage
-----

$ python demo.py [-ai] [-h] env_name
"""

import os.path as osp

import nasim
from nasim.agents.dqn_agent import DQNAgent
from nasim.agents.keyboard_agent import run_keyboard_agent


DQN_POLICY_DIR = osp.join(
    osp.dirname(osp.abspath(__file__)),
    "agents",
    "policies"
)
DQN_POLICIES = {
    "tiny": osp.join(DQN_POLICY_DIR, "dqn_tiny.pt"),
    "small": osp.join(DQN_POLICY_DIR, "dqn_small.pt")
}


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(
        description=(
            "NASim demo. Play as the hacker, trying to gain access"
            " to sensitive information on the network, or run a pre-trained"
            " AI hacker."
        )
    )
    parser.add_argument("env_name", type=str,
                        help="benchmark scenario name")
    parser.add_argument("-ai", "--run_ai", action="store_true",
                        help=("Run AI policy (currently only supported for"
                              " 'tiny' and 'small' environments)"))
    args = parser.parse_args()

    if args.run_ai:
        assert args.env_name in DQN_POLICIES, \
            ("AI demo only supported for the following environments:"
             f" {list(DQN_POLICIES)}")

    env = nasim.make_benchmark(
        args.env_name,
        fully_obs=True,
        flat_actions=True,
        flat_obs=True,
        render_mode="human"
    )

    line_break = f"\n{'-'*60}"
    print(line_break)
    print(f"Running Demo on {args.env_name} environment")
    if args.run_ai:
        print("Using AI policy")
        print(line_break)
        dqn_agent = DQNAgent(env, verbose=False, **vars(args))
        dqn_agent.load(DQN_POLICIES[args.env_name])
        ret, steps, goal = dqn_agent.run_eval_episode(
            env, True, 0.01, "human"
        )
    else:
        print("Player controlled")
        print(line_break)
        ret, steps, goal = run_keyboard_agent(env)

    print(line_break)
    print("Episode Complete")
    print(line_break)
    if goal:
        print("Goal accomplished. Sensitive data retrieved!")
    print(f"Final Score={ret}")
    print(f"Steps taken={steps}")


================================================
FILE: nasim/envs/__init__.py
================================================
from nasim.envs.gym_env import NASimGymEnv
from nasim.envs.environment import NASimEnv


================================================
FILE: nasim/envs/action.py
================================================
"""Action related classes for the NASim environment.

This module contains the different action classes that are used
to implement actions within a NASim environment, along with the
different ActionSpace classes and the ActionResult class.

Notes
-----

**Actions:**

Every action inherits from the base :class:`Action` class, which defines
some common attributes and functions. Different types of actions
are implemented as subclasses of the Action class.

Action types implemented:

- :class:`Exploit`
- :class:`PrivilegeEscalation`
- :class:`ServiceScan`
- :class:`OSScan`
- :class:`SubnetScan`
- :class:`ProcessScan`
- :class:`NoOp`

**Action Spaces:**

There are two types of action spaces, depending on whether you are using
flat actions or not:

- :class:`FlatActionSpace`
- :class:`ParameterisedActionSpace`

"""

import math
import numpy as np
from gymnasium import spaces

from nasim.envs.utils import AccessLevel


def load_action_list(scenario):
    """Load list of actions for environment for given scenario

    Parameters
    ----------
    scenario : Scenario
        the scenario

    Returns
    -------
    list
        list of all actions in environment
    """
    action_list = []
    for address in scenario.address_space:
        action_list.append(
            ServiceScan(address, scenario.service_scan_cost)
        )
        action_list.append(
            OSScan(address, scenario.os_scan_cost)
        )
        action_list.append(
            SubnetScan(address, scenario.subnet_scan_cost)
        )
        action_list.append(
            ProcessScan(address, scenario.process_scan_cost)
        )
        for e_name, e_def in scenario.exploits.items():
            exploit = Exploit(e_name, address, **e_def)
            action_list.append(exploit)
        for pe_name, pe_def in scenario.privescs.items():
            privesc = PrivilegeEscalation(pe_name, address, **pe_def)
            action_list.append(privesc)
    return action_list
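For each host address the loop above emits four scans plus one action per exploit and per privilege-escalation definition, so a flat action space grows to |hosts| * (4 + |exploits| + |privescs|) entries. A quick sketch of that count (hypothetical helper, illustrative numbers):

```python
def flat_action_space_size(num_hosts, num_exploits, num_privescs):
    # 4 scan types per host: service, OS, subnet, and process scans
    return num_hosts * (4 + num_exploits + num_privescs)
```

This linear growth in hosts is one reason larger scenarios become harder for flat-action agents: every additional host adds a full set of actions.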


class Action:
    """The base abstract action class in the environment

    There are multiple types of actions (e.g. exploit, scan, etc.), but every
    action has some common attributes.

    ...

    Attributes
    ----------
    name : str
        the name of action
    target : (int, int)
        the (subnet, host) address of target of the action. The target of the
        action could be the address of a host that the action is being used
        against (e.g. for exploits or targeted scans) or could be the host that
        the action is being executed on (e.g. for subnet scans).
    cost : float
        the cost of performing the action
    prob : float
        the success probability of the action. This is the probability that
        the action works given that its preconditions are met. E.g. a remote
        exploit targeting a host that you cannot communicate with will always
        fail. For deterministic actions this will be 1.0.
    req_access : AccessLevel
        the required access level to perform the action. For on-host actions
        (i.e. subnet scan, process scan, and privilege escalation) this will
        be the access on the target. For remote actions (i.e. service scan,
        os scan, and exploits) this will be the access on a pivot host (i.e.
        a compromised host that can reach the target).
    """

    def __init__(self,
                 name,
                 target,
                 cost,
                 prob=1.0,
                 req_access=AccessLevel.USER,
                 **kwargs):
        """
        Parameters
        ---------
        name : str
            name of action
        target : (int, int)
            address of target
        cost : float
            cost of performing action
        prob : float, optional
            probability of success for a given action (default=1.0)
        req_access : AccessLevel, optional
            the required access level to perform action
            (default=AccessLevel.USER)
        """
        assert 0 <= prob <= 1.0
        self.name = name
        self.target = target
        self.cost = cost
        self.prob = prob
        self.req_access = req_access

    def is_exploit(self):
        """Check if action is an exploit

        Returns
        -------
        bool
            True if action is exploit, otherwise False
        """
        return isinstance(self, Exploit)

    def is_privilege_escalation(self):
        """Check if action is privilege escalation action

        Returns
        -------
        bool
            True if action is privilege escalation action, otherwise False
        """
        return isinstance(self, PrivilegeEscalation)

    def is_scan(self):
        """Check if action is a scan

        Returns
        -------
        bool
            True if action is scan, otherwise False
        """
        return isinstance(self, (ServiceScan, OSScan, SubnetScan, ProcessScan))

    def is_remote(self):
        """Check if action is a remote action

        A remote action is one where the target host is a remote host (i.e. the
        action is not performed locally on the target)

        Returns
        -------
        bool
            True if action is remote, otherwise False
        """
        return isinstance(self, (ServiceScan, OSScan, Exploit))

    def is_service_scan(self):
        """Check if action is a service scan

        Returns
        -------
        bool
            True if action is service scan, otherwise False
        """
        return isinstance(self, ServiceScan)

    def is_os_scan(self):
        """Check if action is an OS scan

        Returns
        -------
        bool
            True if action is an OS scan, otherwise False
        """
        return isinstance(self, OSScan)

    def is_subnet_scan(self):
        """Check if action is a subnet scan

        Returns
        -------
        bool
            True if action is a subnet scan, otherwise False
        """
        return isinstance(self, SubnetScan)

    def is_process_scan(self):
        """Check if action is a process scan

        Returns
        -------
        bool
            True if action is a process scan, otherwise False
        """
        return isinstance(self, ProcessScan)

    def is_noop(self):
        """Check if action is a do nothing action.

        Returns
        -------
        bool
            True if action is a noop action, otherwise False
        """
        return isinstance(self, NoOp)

    def __str__(self):
        return (f"{self.__class__.__name__}: "
                f"target={self.target}, "
                f"cost={self.cost:.2f}, "
                f"prob={self.prob:.2f}, "
                f"req_access={self.req_access}")

    def __hash__(self):
        return hash(self.__str__())

    def __eq__(self, other):
        if self is other:
            return True
        if not isinstance(other, type(self)):
            return False
        if self.target != other.target:
            return False
        if not (math.isclose(self.cost, other.cost)
                and math.isclose(self.prob, other.prob)):
            return False
        return self.req_access == other.req_access


class Exploit(Action):
    """An Exploit action in the environment

    Inherits from the base Action Class.

    ...

    Attributes
    ----------
    service : str
        the service targeted by exploit
    os : str
        the OS targeted by exploit. If None then exploit works for all OSs.
    access : int
        the access level gained on target if exploit succeeds.
    """

    def __init__(self,
                 name,
                 target,
                 cost,
                 service,
                 os=None,
                 access=0,
                 prob=1.0,
                 req_access=AccessLevel.USER,
                 **kwargs):
        """
        Parameters
        ---------
        target : (int, int)
            address of target
        cost : float
            cost of performing action
        service : str
            the target service
        os : str, optional
            the target OS of exploit, if None then exploit works for all OS
            (default=None)
        access : int, optional
            the access level gained on target if exploit succeeds (default=0)
        prob : float, optional
            probability of success (default=1.0)
        req_access : AccessLevel, optional
            the required access level to perform action
            (default=AccessLevel.USER)
        """
        super().__init__(name=name,
                         target=target,
                         cost=cost,
                         prob=prob,
                         req_access=req_access)
        self.os = os
        self.service = service
        self.access = access

    def __str__(self):
        return (f"{super().__str__()}, os={self.os}, "
                f"service={self.service}, access={self.access}")

    def __eq__(self, other):
        if not super().__eq__(other):
            return False
        return self.service == other.service \
            and self.os == other.os \
            and self.access == other.access


class PrivilegeEscalation(Action):
    """A privilege escalation action in the environment

    Inherits from the base Action Class.

    ...

    Attributes
    ----------
    process : str
        the process targeted by the privilege escalation. If None the action
        works independent of a process
    os : str
        the OS targeted by privilege escalation. If None then action works
        for all OSs.
    access : int
        the access level resulting from privilege escalation action
    """

    def __init__(self,
                 name,
                 target,
                 cost,
                 access,
                 process=None,
                 os=None,
                 prob=1.0,
                 req_access=AccessLevel.USER,
                 **kwargs):
        """
        Parameters
        ---------
        target : (int, int)
            address of target
        cost : float
            cost of performing action
        access : int
            the access level resulting from the privilege escalation
        process : str, optional
            the target process, if None the action does not require a process
            to work (default=None)
        os : str, optional
            the target OS of privilege escalation action, if None then action
            works for all OS (default=None)
        prob : float, optional
            probability of success (default=1.0)
        req_access : AccessLevel, optional
            the required access level to perform action
            (default=AccessLevel.USER)
        """
        super().__init__(name=name,
                         target=target,
                         cost=cost,
                         prob=prob,
                         req_access=req_access)
        self.access = access
        self.os = os
        self.process = process

    def __str__(self):
        return (f"{super().__str__()}, os={self.os}, "
                f"process={self.process}, access={self.access}")

    def __eq__(self, other):
        if not super().__eq__(other):
            return False
        return self.process == other.process \
            and self.os == other.os \
            and self.access == other.access


class ServiceScan(Action):
    """A Service Scan action in the environment

    Inherits from the base Action Class.
    """

    def __init__(self,
                 target,
                 cost,
                 prob=1.0,
                 req_access=AccessLevel.USER,
                 **kwargs):
        """
        Parameters
        ----------
        target : (int, int)
            address of target
        cost : float
            cost of performing action
        prob : float, optional
            probability of success for a given action (default=1.0)
        req_access : AccessLevel, optional
            the required access level to perform action
            (default=AccessLevel.USER)
        """
        super().__init__("service_scan",
                         target=target,
                         cost=cost,
                         prob=prob,
                         req_access=req_access,
                         **kwargs)


class OSScan(Action):
    """An OS Scan action in the environment

    Inherits from the base Action Class.
    """

    def __init__(self,
                 target,
                 cost,
                 prob=1.0,
                 req_access=AccessLevel.USER,
                 **kwargs):
        """
        Parameters
        ----------
        target : (int, int)
            address of target
        cost : float
            cost of performing action
        prob : float, optional
            probability of success for a given action (default=1.0)
        req_access : AccessLevel, optional
            the required access level to perform action
            (default=AccessLevel.USER)
        """
        super().__init__("os_scan",
                         target=target,
                         cost=cost,
                         prob=prob,
                         req_access=req_access,
                         **kwargs)


class SubnetScan(Action):
    """A Subnet Scan action in the environment

    Inherits from the base Action Class.
    """

    def __init__(self,
                 target,
                 cost,
                 prob=1.0,
                 req_access=AccessLevel.USER,
                 **kwargs):
        """
        Parameters
        ----------
        target : (int, int)
            address of target
        cost : float
            cost of performing action
        prob : float, optional
            probability of success for a given action (default=1.0)
        req_access : AccessLevel, optional
            the required access level to perform action
            (default=AccessLevel.USER)
        """
        super().__init__("subnet_scan",
                         target=target,
                         cost=cost,
                         prob=prob,
                         req_access=req_access,
                         **kwargs)


class ProcessScan(Action):
    """A Process Scan action in the environment

    Inherits from the base Action Class.
    """

    def __init__(self,
                 target,
                 cost,
                 prob=1.0,
                 req_access=AccessLevel.USER,
                 **kwargs):
        """
        Parameters
        ----------
        target : (int, int)
            address of target
        cost : float
            cost of performing action
        prob : float, optional
            probability of success for a given action (default=1.0)
        req_access : AccessLevel, optional
            the required access level to perform action
            (default=AccessLevel.USER)
        """
        super().__init__("process_scan",
                         target=target,
                         cost=cost,
                         prob=prob,
                         req_access=req_access,
                         **kwargs)


class NoOp(Action):
    """A do nothing action in the environment

    Inherits from the base Action Class
    """

    def __init__(self, *args, **kwargs):
        super().__init__(name="noop",
                         target=(1, 0),
                         cost=0,
                         prob=1.0,
                         req_access=AccessLevel.NONE)


class ActionResult:
    """A dataclass for storing the results of an Action.

    These results are then used to update the full state and observation.

    ...

    Attributes
    ----------
    success : bool
        True if exploit/scan was successful, False otherwise
    value : float
        value gained from action: the value of the host if it was
        successfully exploited, otherwise 0
    services : dict
        services identified by action.
    os : dict
        OS identified by action
    processes : dict
        processes identified by action
    access : dict
        access gained by action
    discovered : dict
        host addresses discovered by action
    connection_error : bool
        True if action failed due to connection error (e.g. could
        not reach target)
    permission_error : bool
        True if action failed due to a permission error (e.g. incorrect access
        level to perform action)
    undefined_error : bool
        True if action failed due to an undefined error (e.g. random exploit
        failure)
    newly_discovered : dict
        host addresses discovered for the first time by action
    """

    def __init__(self,
                 success,
                 value=0.0,
                 services=None,
                 os=None,
                 processes=None,
                 access=None,
                 discovered=None,
                 connection_error=False,
                 permission_error=False,
                 undefined_error=False,
                 newly_discovered=None):
        """
        Parameters
        ----------
        success : bool
            True if exploit/scan was successful, False otherwise
        value : float, optional
            value gained from action (default=0.0)
        services : dict, optional
            services identified by action (default=None={})
        os : dict, optional
            OS identified by action (default=None={})
        processes : dict, optional
            processes identified by action (default=None={})
        access : dict, optional
            access gained by action (default=None={})
        discovered : dict, optional
            host addresses discovered by action (default=None={})
        connection_error : bool, optional
            True if action failed due to connection error (default=False)
        permission_error : bool, optional
            True if action failed due to a permission error (default=False)
        undefined_error : bool, optional
            True if action failed due to an undefined error (default=False)
        newly_discovered : dict, optional
            host addresses discovered for first time by action (default=None)
        """
        self.success = success
        self.value = value
        self.services = {} if services is None else services
        self.os = {} if os is None else os
        self.processes = {} if processes is None else processes
        self.access = {} if access is None else access
        self.discovered = {} if discovered is None else discovered
        self.connection_error = connection_error
        self.permission_error = permission_error
        self.undefined_error = undefined_error
        if newly_discovered is not None:
            self.newly_discovered = newly_discovered
        else:
            self.newly_discovered = {}

    def info(self):
        """Get results as dict

        Returns
        -------
        dict
            action results information
        """
        return dict(
            success=self.success,
            value=self.value,
            services=self.services,
            os=self.os,
            processes=self.processes,
            access=self.access,
            discovered=self.discovered,
            connection_error=self.connection_error,
            permission_error=self.permission_error,
            undefined_error=self.undefined_error,
            newly_discovered=self.newly_discovered
        )

    def __str__(self):
        output = ["ActionObservation:"]
        for k, val in self.info().items():
            output.append(f"  {k}={val}")
        return "\n".join(output)
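The constructor above replaces `None` defaults with fresh dicts so that separate results never share mutable state. A minimal stand-in class (hypothetical, not the nasim class itself) illustrating that pattern:

```python
class MiniResult:
    """Toy stand-in mirroring ActionResult's default handling."""

    def __init__(self, success, value=0.0, services=None):
        self.success = success
        self.value = value
        # a fresh dict per instance, so results never share state
        self.services = {} if services is None else services


a = MiniResult(True, value=10.0)
b = MiniResult(False)
a.services["ssh"] = True   # mutating one result leaves the other untouched
```

Had the signature used `services={}` instead, every instance would share one dict, which is why the `None` sentinel is used.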


class FlatActionSpace(spaces.Discrete):
    """Flat Action space for NASim environment.

    Inherits and implements the gym.spaces.Discrete action space

    ...

    Attributes
    ----------
    n : int
        the number of actions in the action space
    actions : list of Actions
        the list of the Actions in the action space
    """

    def __init__(self, scenario):
        """
        Parameters
        ----------
        scenario : Scenario
            scenario description
        """
        self.actions = load_action_list(scenario)
        super().__init__(len(self.actions))

    def get_action(self, action_idx):
        """Get Action object corresponding to action idx

        Parameters
        ----------
        action_idx : int
            the action idx

        Returns
        -------
        Action
            Corresponding Action object
        """
        assert isinstance(action_idx, int), \
            ("When using flat action space, action must be an integer"
             f" or an Action object: {action_idx} is invalid")
        return self.actions[action_idx]


class ParameterisedActionSpace(spaces.MultiDiscrete):
    """A parameterised action space for NASim environment.

    Inherits and implements the gym.spaces.MultiDiscrete action space, where
    each dimension corresponds to a different action parameter.

    The action parameters (in order) are:

    0. Action Type = [0, 5]

       Where:

         0=Exploit,

         1=PrivilegeEscalation,

         2=ServiceScan,

         3=OSScan,

         4=SubnetScan,

         5=ProcessScan,

    1. Subnet = [0, #subnets-1]

       -1 since we don't include the internet subnet

    2. Host = [0, max subnets size-1]
    3. OS = [0, #OS]

       Where 0=None.

    4. Service = [0, #services - 1]
    5. Process = [0, #processes - 1]

    Note that OS, Service and Process are only important for exploits and
    privilege escalation actions.

    ...

    Attributes
    ----------
    nvec : Numpy.Array
        vector of the size of each action parameter
    actions : list of Actions
        the list of all the Actions in the action space
    """

    action_types = [
        Exploit,
        PrivilegeEscalation,
        ServiceScan,
        OSScan,
        SubnetScan,
        ProcessScan
    ]

    def __init__(self, scenario):
        """
        Parameters
        ----------
        scenario : Scenario
            scenario description
        """
        self.scenario = scenario
        self.actions = load_action_list(scenario)

        nvec = [
            len(self.action_types),
            len(self.scenario.subnets)-1,
            max(self.scenario.subnets),
            self.scenario.num_os+1,
            self.scenario.num_services,
            self.scenario.num_processes
        ]

        super().__init__(nvec)

    def get_action(self, action_vec):
        """Get Action object corresponding to action vector.

        Parameters
        ----------
        action_vec : list of ints or tuple of ints or Numpy.Array
            the action vector

        Returns
        -------
        Action
            Corresponding Action object

        Notes
        -----
        1. if host# specified in action vector is greater than
           the number of hosts in the specified subnet, then host#
           will be changed to host# % subnet size.
        2. if action is an exploit and parameters do not match
           any exploit definition in the scenario description then
           a NoOp action is returned with 0 cost.
        """
        assert isinstance(action_vec, (list, tuple, np.ndarray)), \
            ("When using parameterised action space, action must be an Action"
             f" object, a list or a numpy array: {action_vec} is invalid")
        a_class = self.action_types[action_vec[0]]
        # need to add one to subnet to account for Internet subnet
        subnet = action_vec[1]+1
        host = action_vec[2] % self.scenario.subnets[subnet]

        target = (subnet, host)

        if a_class not in (Exploit, PrivilegeEscalation):
            # can ignore other action parameters
            kwargs = self._get_scan_action_def(a_class)
            return a_class(target=target, **kwargs)

        os = None if action_vec[3] == 0 else self.scenario.os[action_vec[3]-1]

        if a_class == Exploit:
            # have to make sure it is valid choice
            # and also get constant params (name, cost, prob, access)
            service = self.scenario.services[action_vec[4]]
            a_def = self._get_exploit_def(service, os)
        else:
            # privilege escalation
            # have to make sure it is valid choice
            # and also get constant params (name, cost, prob, access)
            proc = self.scenario.processes[action_vec[5]]
            a_def = self._get_privesc_def(proc, os)

        if a_def is None:
            return NoOp()
        return a_class(target=target, **a_def)

    def _get_scan_action_def(self, a_class):
        """Get the constants for scan actions definitions """
        if a_class == ServiceScan:
            cost = self.scenario.service_scan_cost
        elif a_class == OSScan:
            cost = self.scenario.os_scan_cost
        elif a_class == SubnetScan:
            cost = self.scenario.subnet_scan_cost
        elif a_class == ProcessScan:
            cost = self.scenario.process_scan_cost
        else:
            raise TypeError(f"Not implemented for Action class {a_class}")
        return {"cost": cost}

    def _get_exploit_def(self, service, os):
        """Check if exploit parameters are valid """
        e_map = self.scenario.exploit_map
        if service not in e_map:
            return None
        if os not in e_map[service]:
            return None
        return e_map[service][os]

    def _get_privesc_def(self, proc, os):
        """Check if privilege escalation parameters are valid """
        pe_map = self.scenario.privesc_map
        if proc not in pe_map:
            return None
        if os not in pe_map[proc]:
            return None
        return pe_map[proc][os]
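The target-decoding logic in `get_action` (offsetting the subnet index to skip the internet subnet, and wrapping the host index into the subnet's size) can be sketched standalone; the subnet sizes and action vector below are made up for illustration:

```python
# Hypothetical scenario: subnet 0 is the internet, subnet 1 has 3 hosts,
# subnet 2 has 2 hosts.
subnets = [1, 3, 2]
action_vec = [2, 0, 4, 0, 0, 0]   # type=ServiceScan, subnet param=0, host param=4

subnet = action_vec[1] + 1               # +1 skips the internet subnet -> 1
host = action_vec[2] % subnets[subnet]   # wrap into subnet range: 4 % 3 -> 1
target = (subnet, host)
```

This mirrors note 1 of `get_action`: an out-of-range host number is mapped back into the subnet via the modulo, rather than being rejected.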


================================================
FILE: nasim/envs/environment.py
================================================
""" The main Environment class for NASim: NASimEnv.

The NASimEnv class is the main interface for agents interacting with NASim.
"""
import gymnasium as gym
from gymnasium import spaces
import numpy as np

from nasim.envs.state import State
from nasim.envs.render import Viewer
from nasim.envs.network import Network
from nasim.envs.observation import Observation
from nasim.envs.action import Action, FlatActionSpace, ParameterisedActionSpace


class NASimEnv(gym.Env):
    """ A simulated computer network environment for pen-testing.

    Implements the gymnasium interface.

    ...

    Attributes
    ----------
    name : str
        the environment scenario name
    scenario : Scenario
        Scenario object, defining the properties of the environment
    action_space : FlatActionSpace or ParameterisedActionSpace
        Action space for environment.
        If *flat_action=True* then this is a discrete action space (which
        subclasses gymnasium.spaces.Discrete), so each action is represented by an
        integer.
        If *flat_action=False* then this is a parameterised action space (which
        subclasses gymnasium.spaces.MultiDiscrete), so each action is represented
        using a list of parameters.
    observation_space : gymnasium.spaces.Box
        observation space for environment.
        If *flat_obs=True* then observations are represented by a 1D vector,
        otherwise observations are represented as a 2D matrix.
    current_state : State
        the current state of the environment
    last_obs : Observation
        the last observation that was generated by environment
    steps : int
        the number of steps performed since last reset (this does not include
        generative steps)

    """
    metadata = {'render_modes': ["human", "ansi"]}
    render_mode = None
    reward_range = (-float('inf'), float('inf'))

    action_space = None
    observation_space = None
    current_state = None
    last_obs = None

    def __init__(self,
                 scenario,
                 fully_obs=False,
                 flat_actions=True,
                 flat_obs=True,
                 render_mode=None):
        """
        Parameters
        ----------
        scenario : Scenario
            Scenario object, defining the properties of the environment
        fully_obs : bool, optional
            The observability mode of environment, if True then uses fully
            observable mode, otherwise is partially observable (default=False)
        flat_actions : bool, optional
            If True then uses a flat action space, otherwise uses a
            parameterised action space (default=True).
        flat_obs : bool, optional
            If true then uses a 1D observation space, otherwise uses a 2D
            observation space (default=True)
        render_mode : str, optional
            The render mode to use for the environment.
        """
        self.name = scenario.name
        self.scenario = scenario
        self.fully_obs = fully_obs
        self.flat_actions = flat_actions
        self.flat_obs = flat_obs
        self.render_mode = render_mode

        self.network = Network(scenario)
        self.current_state = State.generate_initial_state(self.network)
        self._renderer = None
        self.reset()

        if self.flat_actions:
            self.action_space = FlatActionSpace(self.scenario)
        else:
            self.action_space = ParameterisedActionSpace(self.scenario)

        if self.flat_obs:
            obs_shape = self.last_obs.shape_flat()
        else:
            obs_shape = self.last_obs.shape()
        obs_low, obs_high = Observation.get_space_bounds(self.scenario)
        self.observation_space = spaces.Box(
            low=obs_low, high=obs_high, shape=obs_shape
        )

        self.steps = 0

    def reset(self, *, seed=None, options=None):
        """Reset the state of the environment and returns the initial state.

        Implements gymnasium.Env.reset().

        Parameters
        ----------
        seed : int, optional
            the optional seed for the environments RNG
        options : dict, optional
            optional environment options (does nothing in NASim at the moment)

        Returns
        -------
        numpy.Array
            the initial observation of the environment
        dict
            auxiliary information regarding reset
        """
        super().reset(seed=seed, options=options)
        self.steps = 0
        self.current_state = self.network.reset(self.current_state)
        self.last_obs = self.current_state.get_initial_observation(
            self.fully_obs
        )

        if self.flat_obs:
            obs = self.last_obs.numpy_flat()
        else:
            obs = self.last_obs.numpy()

        return obs, {}

    def step(self, action):
        """Run one step of the environment using action.

        Implements gymnasium.Env.step().

        Parameters
        ----------
        action : Action or int or list or NumpyArray
            Action to perform. If not Action object, then if using
            flat actions this should be an int and if using non-flat actions
            this should be an indexable array.

        Returns
        -------
        numpy.Array
            observation from performing action
        float
            reward from performing action
        bool
            whether the episode reached a terminal state or not (i.e. all
            target machines have been successfully compromised)
        bool
            whether the episode has reached the step limit (if one exists)
        dict
            auxiliary information regarding step
            (see :func:`nasim.env.action.ActionResult.info`)
        """
        next_state, obs, reward, done, info = self.generative_step(
            self.current_state,
            action
        )
        self.current_state = next_state
        self.last_obs = obs

        if self.flat_obs:
            obs = obs.numpy_flat()
        else:
            obs = obs.numpy()

        self.steps += 1

        step_limit_reached = (
            self.scenario.step_limit is not None
            and self.steps >= self.scenario.step_limit
        )

        return obs, reward, done, step_limit_reached, info
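The 5-tuple returned above follows the gymnasium convention of (observation, reward, terminated, truncated, info). A minimal sketch of the resulting interaction loop, using a dummy stand-in environment rather than NASimEnv itself (the episode length and rewards here are invented):

```python
class DummyEnv:
    """Toy environment exposing the same 5-tuple step contract."""

    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0
        return [0.0], {}

    def step(self, action):
        self.steps += 1
        terminated = self.steps >= 3   # pretend the goal is reached at step 3
        truncated = False              # pretend there is no step limit
        return [0.0], -1.0, terminated, truncated, {}


env = DummyEnv()
obs, info = env.reset()
total_reward, terminated, truncated = 0.0, False, False
while not (terminated or truncated):
    obs, reward, terminated, truncated, info = env.step(0)
    total_reward += reward
```

With NASimEnv the `truncated` flag corresponds to the scenario's step limit being reached, while `terminated` corresponds to all sensitive hosts being compromised.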

    def generative_step(self, state, action):
        """Run one step of the environment using action in given state.

        Parameters
        ----------
        state : State
            The state to perform the action in
        action : Action, int, list, NumpyArray
            Action to perform. If not Action object, then if using
            flat actions this should be an int and if using non-flat actions
            this should be an indexable array.

        Returns
        -------
        State
            the next state after action was performed
        Observation
            observation from performing action
        float
            reward from performing action
        bool
            whether a terminal state has been reached or not
        dict
            auxiliary information regarding step
            (see :func:`nasim.env.action.ActionResult.info`)
        """
        if not isinstance(action, Action):
            action = self.action_space.get_action(action)

        next_state, action_obs = self.network.perform_action(
            state, action
        )
        obs = next_state.get_observation(
            action, action_obs, self.fully_obs
        )
        done = self.goal_reached(next_state)
        reward = action_obs.value - action.cost
        return next_state, obs, reward, done, action_obs.info()

    def generate_random_initial_state(self):
        """Generates a random initial state for environment.

        This only randomizes the host configurations (os, services)
        using a uniform distribution, so may result in networks where
        it is not possible to reach the goal.

        Returns
        -------
        State
            A random initial state
        """
        return State.generate_random_initial_state(self.network)

    def generate_initial_state(self):
        """Generate the initial state for the environment.

        Returns
        -------
        State
            The initial state

        Notes
        -----
        This does not reset the current state of the environment (use
        :func:`reset` for that).
        """
        return State.generate_initial_state(self.network)

    def render(self):
        """Render environment.

        Implements gymnasium.Env.render().

        See render module for more details on modes and symbols.

        """
        if self.render_mode is None:
            return
        return self.render_obs(mode=self.render_mode, obs=self.last_obs)

    def render_obs(self, mode="human", obs=None):
        """Render observation.

        See render module for more details on modes and symbols.

        Parameters
        ----------
        mode : str
            rendering mode
        obs : Observation or numpy.ndarray, optional
            the observation to render, if None will render last observation.
            If numpy.ndarray it must be in format that matches Observation
            (i.e. ndarray returned by step method) (default=None)
        """
        if mode is None:
            return

        if obs is None:
            obs = self.last_obs

        if not isinstance(obs, Observation):
            obs = Observation.from_numpy(obs, self.current_state.shape())

        if self._renderer is None:
            self._renderer = Viewer(self.network)

        if mode in ("human", "ansi"):
            return self._renderer.render_readable(obs)
        else:
            raise NotImplementedError(
                "Please choose correct render mode from :"
                f"{self.metadata['render_modes']}"
            )

    def render_state(self, mode="human", state=None):
        """Render state.

        See render module for more details on modes and symbols.

        If mode = "ansi":
            Hosts are displayed in rows, with one row for each subnet and
            hosts ordered by id within each subnet

        Parameters
        ----------
        mode : str
            rendering mode
        state : State or numpy.ndarray, optional
            the State to render, if None will render current state
            If numpy.ndarray it must be in format that matches State
            (i.e. ndarray returned by generative_step method) (default=None)
        """
        if mode is None:
            return

        if state is None:
            state = self.current_state

        if not isinstance(state, State):
            state = State.from_numpy(state,
                                     self.current_state.shape(),
                                     self.current_state.host_num_map)

        if self._renderer is None:
            self._renderer = Viewer(self.network)

        if mode in ("human", "ansi"):
            return self._renderer.render_readable_state(state)
        else:
            raise NotImplementedError(
                "Please choose correct render mode from : "
                f"{self.metadata['render_modes']}"
            )

    def render_action(self, action):
        """Renders human readable version of action.

        This is mainly useful for getting a text description of the action
        that corresponds to a given integer.

        Parameters
        ----------
        action : Action or int or list or NumpyArray
            Action to render. If not Action object, then if using
            flat actions this should be an int and if using non-flat actions
            this should be an indexable array.
        """
        if not isinstance(action, Action):
            action = self.action_space.get_action(action)
        print(action)

    def render_episode(self, episode, width=7, height=7):
        """Render an episode as sequence of network graphs, where an episode
        is a sequence of (state, action, reward, done) tuples generated from
        interactions with environment.

        Parameters
        ----------
        episode : list
            list of (State, Action, reward, done) tuples
        width : int
            width of GUI window
        height : int
            height of GUI window
        """
        if self._renderer is None:
            self._renderer = Viewer(self.network)
        self._renderer.render_episode(episode, width, height)

    def render_network_graph(self, ax=None, show=False):
        """Render a plot of network as a graph with hosts as nodes arranged
        into subnets and showing connections between subnets. Renders current
        state of network.

        Parameters
        ----------
        ax : Axes
            matplotlib axis to plot graph on, or None to plot on new axis
        show : bool
            whether to display the plot immediately, or only set it up so
            that displaying it can be handled elsewhere by the user
        """
        if self._renderer is None:
            self._renderer = Viewer(self.network)
        state = self.current_state
        self._renderer.render_graph(state, ax, show)

    def get_minimum_hops(self):
        """Get the minimum number of network hops required to reach targets.

        That is, the minimum number of hosts that must be traversed in the
        network in order to reach all sensitive hosts, starting from the
        initial state.

        Returns
        -------
        int
            minimum possible number of network hops to reach target hosts
        """
        return self.network.get_minimal_hops()

    def get_action_mask(self):
        """Get a vector mask for valid actions.

        Returns
        -------
        ndarray
            numpy vector of 1's and 0's, one for each action. Where an
            index will be 1 if action is valid given current state, or
            0 if action is invalid.
        """
        assert isinstance(self.action_space, FlatActionSpace), \
            "Can only use action mask function when using flat action space"
        mask = np.zeros(self.action_space.n, dtype=np.int64)
        for a_idx in range(self.action_space.n):
            action = self.action_space.get_action(a_idx)
            if self.network.host_discovered(action.target):
                mask[a_idx] = 1
        return mask
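A sketch of how the returned mask might be used to restrict sampling to valid actions (the mask values below are made up, and the selection logic is one possible approach rather than part of the nasim API):

```python
import random

mask = [1, 0, 0, 1, 1, 0]                     # 1 = valid given current state
valid = [i for i, m in enumerate(mask) if m]  # indices of valid actions

rng = random.Random(0)
choice = rng.choice(valid)                    # sample only among valid actions
```

An RL agent would typically apply such a mask by zeroing (or heavily penalising) the logits of invalid actions before sampling.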

    def get_score_upper_bound(self):
        """Get the theoretical upper bound for total reward for scenario.

        The theoretical upper bound score is achieved when the agent exploits
        only a single host in each subnet that is required to reach the
        sensitive hosts along the shortest path in the network graph, and
        exploits all the sensitive hosts (i.e. the minimum number of network
        hops). It assumes an action cost of 1 and that each sensitive host is
        exploitable from any other connected subnet (which may not be true,
        hence this being an upper bound).

        Returns
        -------
        float
            theoretical max score
        """
        max_reward = self.network.get_total_sensitive_host_value()
        max_reward += self.network.get_total_discovery_value()
        max_reward -= self.network.get_minimal_hops()
        return max_reward
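A worked instance of the formula implemented above, with all numbers hypothetical: total sensitive host value, plus total discovery value, minus the minimal number of network hops (each hop costing one unit-cost action):

```python
sensitive_value = 200.0   # sum of the values of all sensitive hosts
discovery_value = 0.0     # total bonus value for discovering hosts, if any
min_hops = 3              # hosts that must be traversed to reach all targets

upper_bound = sensitive_value + discovery_value - min_hops
```

Any real episode also pays for scans and failed exploits, so achieved scores will generally sit below this bound.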

    def goal_reached(self, state=None):
        """Check if the state is the goal state.

        The goal state is when all sensitive hosts have been compromised.

        Parameters
        ----------
        state : State, optional
            a state, if None will use current_state of environment
            (default=None)

        Returns
        -------
        bool
            True if state is goal state, otherwise False.
        """
        if state is None:
            state = self.current_state
        return self.network.all_sensitive_hosts_compromised(state)

    def __str__(self):
        output = [
            "NASimEnv:",
            f"name={self.name}",
            f"fully_obs={self.fully_obs}",
            f"flat_actions={self.flat_actions}",
            f"flat_obs={self.flat_obs}"
        ]
        return "\n  ".join(output)

    def close(self):
        if self._renderer is not None:
            self._renderer.close()
            self._renderer = None


================================================
FILE: nasim/envs/gym_env.py
================================================
from nasim.envs.environment import NASimEnv
from nasim.scenarios import Scenario, make_benchmark_scenario


class NASimGymEnv(NASimEnv):
    """A wrapper around the NASimEnv compatible with gymnasium.make()

    See nasim.NASimEnv for details.
    """

    def __init__(self,
                 scenario,
                 fully_obs=False,
                 flat_actions=True,
                 flat_obs=True,
                 render_mode=None):
        """
        Parameters
        ----------
        scenario : str or nasim.scenarios.Scenario
            either the name of benchmark environment (str) or a nasim Scenario
            instance
        fully_obs : bool, optional
            the observability mode of environment, if True then uses fully
            observable mode, otherwise partially observable (default=False)
        flat_actions : bool, optional
            if True then uses a flat action space, otherwise uses a
            parameterised action space (default=True)
        flat_obs : bool, optional
            if True then uses a 1D observation space, otherwise uses a
            2D observation space (default=True)
        render_mode : str, optional
            The render mode to use for the environment.
        """
        if not isinstance(scenario, Scenario):
            scenario = make_benchmark_scenario(scenario)
        super().__init__(scenario,
                         fully_obs=fully_obs,
                         flat_actions=flat_actions,
                         flat_obs=flat_obs,
                         render_mode=render_mode)


================================================
FILE: nasim/envs/host_vector.py
================================================
""" This module contains the HostVector class.

This is the main class for storing and updating the state of a single host
in the NASim environment.
"""

import numpy as np

from nasim.envs.utils import AccessLevel
from nasim.envs.action import ActionResult


class HostVector:
    """ A Vector representation of a single host in NASim.

    Each host is represented as a vector (1D numpy array) for efficiency and to
    make it easier to use with deep learning agents. The vector is made up of
    multiple features arranged in a consistent way.

    Features in the vector, listed in order, are:

    1. subnet address - one-hot encoding with length equal to the number
                        of subnets
    2. host address - one-hot encoding with length equal to the maximum number
                      of hosts in any subnet
    3. compromised - bool
    4. reachable - bool
    5. discovered - bool
    6. value - float
    7. discovery value - float
    8. access - int
    9. OS - bool for each OS in scenario (only one OS has value of true)
    10. services running - bool for each service in scenario
    11. processes running - bool for each process in scenario

    Notes
    -----
    - The size of the vector is equal to:

        #subnets + max #hosts in any subnet + 6 + #OS + #services + #processes.

    - Where the +6 is for compromised, reachable, discovered, value,
      discovery_value, and access features
    - The vector is a float vector so True/False is actually represented as
      1.0/0.0.

    """

    # class properties that are the same for all hosts
    # these are set when calling vectorize method
    # the bounds on address space (used for one hot encoding of host address)
    address_space_bounds = None
    # number of OS in scenario
    num_os = None
    # map from OS name to its index in host vector
    os_idx_map = {}
    # number of services in scenario
    num_services = None
    # map from service name to its index in host vector
    service_idx_map = {}
    # number of processes in scenario
    num_processes = None
    # map from process name to its index in host vector
    process_idx_map = {}
    # size of state for host vector (i.e. len of vector)
    state_size = None

    # vector position constants
    # to be initialized
    _subnet_address_idx = 0
    _host_address_idx = None
    _compromised_idx = None
    _reachable_idx = None
    _discovered_idx = None
    _value_idx = None
    _discovery_value_idx = None
    _access_idx = None
    _os_start_idx = None
    _service_start_idx = None
    _process_start_idx = None

    def __init__(self, vector):
        self.vector = vector

    @classmethod
    def vectorize(cls, host, address_space_bounds, vector=None):
        if cls.address_space_bounds is None:
            cls._initialize(
                address_space_bounds, host.services, host.os, host.processes
            )

        if vector is None:
            vector = np.zeros(cls.state_size, dtype=np.float32)
        else:
            assert len(vector) == cls.state_size

        vector[cls._subnet_address_idx + host.address[0]] = 1
        vector[cls._host_address_idx + host.address[1]] = 1
        vector[cls._compromised_idx] = int(host.compromised)
        vector[cls._reachable_idx] = int(host.reachable)
        vector[cls._discovered_idx] = int(host.discovered)
        vector[cls._value_idx] = host.value
        vector[cls._discovery_value_idx] = host.discovery_value
        vector[cls._access_idx] = host.access
        for os_num, (os_key, os_val) in enumerate(host.os.items()):
            vector[cls._get_os_idx(os_num)] = int(os_val)
        for srv_num, (srv_key, srv_val) in enumerate(host.services.items()):
            vector[cls._get_service_idx(srv_num)] = int(srv_val)
        host_procs = host.processes.items()
        for proc_num, (proc_key, proc_val) in enumerate(host_procs):
            vector[cls._get_process_idx(proc_num)] = int(proc_val)
        return cls(vector)

    @classmethod
    def vectorize_random(cls, host, address_space_bounds, vector=None):
        hvec = cls.vectorize(host, address_space_bounds, vector)
        # random variables
        for srv_num in cls.service_idx_map.values():
            srv_val = np.random.randint(0, 2)
            hvec.vector[cls._get_service_idx(srv_num)] = srv_val

        chosen_os = np.random.choice(list(cls.os_idx_map.values()))
        for os_num in cls.os_idx_map.values():
            hvec.vector[cls._get_os_idx(os_num)] = int(os_num == chosen_os)

        for proc_num in cls.process_idx_map.values():
            proc_val = np.random.randint(0, 2)
            hvec.vector[cls._get_process_idx(proc_num)] = proc_val
        return hvec
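The OS loop above sets exactly one flag, preserving the one-hot invariant from the class docstring. A standalone sketch of the same pattern using only the standard library (the OS names are hypothetical):

```python
import random

random.seed(0)
os_idx_map = {"linux": 0, "windows": 1}  # hypothetical OS -> index map

# Pick one OS at random, then write a one-hot encoding of the choice.
vec = [0.0] * len(os_idx_map)
chosen_os = random.choice(list(os_idx_map.values()))
for os_num in os_idx_map.values():
    vec[os_num] = float(os_num == chosen_os)

# Exactly one OS flag is set, whichever OS was chosen.
print(vec, sum(vec))
```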

    @property
    def compromised(self):
        return self.vector[self._compromised_idx]

    @compromised.setter
    def compromised(self, val):
        self.vector[self._compromised_idx] = int(val)

    @property
    def discovered(self):
        return self.vector[self._discovered_idx]

    @discovered.setter
    def discovered(self, val):
        self.vector[self._discovered_idx] = int(val)

    @property
    def reachable(self):
        return self.vector[self._reachable_idx]

    @reachable.setter
    def reachable(self, val):
        self.vector[self._reachable_idx] = int(val)

    @property
    def address(self):
        return (
            self.vector[self._subnet_address_idx_slice()].argmax(),
            self.vector[self._host_address_idx_slice()].argmax()
        )
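The address property recovers the (subnet, host) address by taking the argmax over each one-hot segment, inverting the encoding done in vectorize. A self-contained round trip, with segment sizes invented for illustration:

```python
# One-hot encode a (subnet, host) address, then decode it with argmax,
# mirroring vectorize() and the address property (invented sizes).
num_subnets, max_hosts = 3, 4
vec = [0.0] * (num_subnets + max_hosts)

address = (2, 1)                     # (subnet index, host index)
vec[address[0]] = 1.0                # subnet one-hot occupies vec[0:3]
vec[num_subnets + address[1]] = 1.0  # host one-hot occupies vec[3:7]

def argmax(xs):
    # index of the largest element (first index on ties)
    return max(range(len(xs)), key=xs.__getitem__)

decoded = (argmax(vec[:num_subnets]), argmax(vec[num_subnets:]))
print(decoded)  # (2, 1)
```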

    @property
    def value(self):
        return self.vector[self._value_idx]

    @property
    def discovery_value(self):
        return self.vector[self._discovery_value_idx]

    @property
    def access(self):
        return self.vector[self._access_idx]

    @access.setter
    def access(self, val):
        self.vector[self._access_idx] = int(val)

    @property
    def services(self):
        services = {}
        for srv, srv_num in self.service_idx_map.items():
            services[srv] = self.vector[self._get_service_idx(srv_num)]
        return services

    @property
    def os(self):
        os = {}
        for os_key, os_num in self.os_idx_map.items():
            os[os_key] = self.vector[self._get_os_idx(os_num)]
        return os

    @property
    def processes(self):
        processes = {}
        for proc, proc_num in self.process_idx_map.items():
            processes[proc] = self.vector[self._get_process_idx(proc_num)]
        return processes

    def is_running_service(self, srv):
        srv_num = self.service_idx_map[srv]
        return bool(self.vector[self._get_service_idx(srv_num)])

    def is_running_os(self, os):
        os_num = self.os_idx_map[os]
        return bool(self.vector[self._get_os_idx(os_num)])

    def is_running_process(self, proc):
        proc_num = self.process_idx_map[proc]
        return bool(self.vector[self._get_process_idx(proc_num)])

    def perform_action(self, action):
        """Perform given action against this host

        Arguments
        ---------
        action : Action
            the action to perform

        Returns
        -------
        HostVector
            the resulting state of the host after the action
        ActionResult
            the result of the action
        """
        next_state = self.copy()
        if action.is_service_scan():
            result = ActionResult(True, 0, services=self.services)
            return next_state, result

        if action.is_os_scan():
            return next_state, ActionResult(True, 0, os=self.os)

        if action.is_exploit():
            if self.is_running_service(action.service) and \
               (action.os is None or self.is_running_os(action.os)):
                # service and os are present so exploit is successful
                value = 0
                next_state.compromised = True
                if not self.access == AccessLevel.ROOT:
                    # ensure a machine is not rewarded twice
                    # and access doesn't decrease
                    next_state.access = action.access
                    if action.access == AccessLevel.ROOT:
                        value = self.value

                result = ActionResult(
                    True,
                    value=value,
                    services=self.services,
                    os=self.os,
                    access=action.access
                )
                return next_state, result

        # following actions are on host so require correct access
        if not (self.compromised and action.req_access <= self.access):
            result = ActionResult(False, 0, permission_error=True)
            return next_state, result

        if action.is_process_scan():
            result = ActionResult(
                True, 0, access=self.access, processes=self.processes
            )
            return next_state, result

        if action.is_privilege_escalation():
            has_proc = (
                action.process is None
                or self.is_running_process(action.process)
            )
            has_os = (
                action.os is None or self.is_running_os(action.os)
            )
            if has_proc and has_os:
                # host compromised, and proc and os are present,
                # so privesc is successful
                value = 0.0
                if not self.access == AccessLevel.ROOT:
                    # ensure a machine is not rewarded twice
                    # and access doesn't decrease
                    next_state.access = action.access
                    if action.access == AccessLevel.ROOT:
                        value = self.value

                result = ActionResult(
                    True,
                    value=value,
                    processes=self.processes,
                    os=self.os,
                    access=action.access
                )
                return next_state, result

        # action failed, since required process and/or OS not present
        result = ActionResult(False, 0, undefined_error=True)
        return next_state, result
SYMBOL INDEX (443 symbols across 31 files)

FILE: docs/source/conf.py
  function skip (line 55) | def skip(app, what, name, obj, would_skip, options):
  function setup (line 61) | def setup(app):

FILE: nasim/__init__.py
  function make_benchmark (line 13) | def make_benchmark(scenario_name,
  function load (line 57) | def load(path,
  function generate (line 97) | def generate(num_hosts,
  function _register (line 139) | def _register(id, entry_point, kwargs, nondeterministic, force=True):

FILE: nasim/agents/bruteforce_agent.py
  function run_bruteforce_agent (line 23) | def run_bruteforce_agent(env, step_limit=1e6, verbose=True):

FILE: nasim/agents/dqn_agent.py
  class ReplayMemory (line 47) | class ReplayMemory:
    method __init__ (line 49) | def __init__(self, capacity, s_dims, device="cpu"):
    method store (line 59) | def store(self, s, a, next_s, r, done):
    method sample_batch (line 68) | def sample_batch(self, batch_size):
  class DQN (line 78) | class DQN(nn.Module):
    method __init__ (line 81) | def __init__(self, input_dim, layers, num_actions):
    method forward (line 88) | def forward(self, x):
    method save_DQN (line 94) | def save_DQN(self, file_path):
    method load_DQN (line 97) | def load_DQN(self, file_path):
    method get_action (line 100) | def get_action(self, x):
  class DQNAgent (line 107) | class DQNAgent:
    method __init__ (line 110) | def __init__(self,
    method save (line 182) | def save(self, save_path):
    method load (line 185) | def load(self, load_path):
    method get_epsilon (line 188) | def get_epsilon(self):
    method get_egreedy_action (line 193) | def get_egreedy_action(self, o, epsilon):
    method optimize (line 199) | def optimize(self):
    method train (line 228) | def train(self):
    method run_train_episode (line 270) | def run_train_episode(self, step_limit):
    method run_eval_episode (line 294) | def run_eval_episode(self,

FILE: nasim/agents/keyboard_agent.py
  function print_actions (line 22) | def print_actions(action_space):
  function choose_flat_action (line 28) | def choose_flat_action(env):
  function display_actions (line 40) | def display_actions(actions):
  function choose_item (line 49) | def choose_item(items):
  function choose_param_action (line 58) | def choose_param_action(env):
  function choose_action (line 131) | def choose_action(env):
  function run_keyboard_agent (line 141) | def run_keyboard_agent(env):
  function run_generative_keyboard_agent (line 185) | def run_generative_keyboard_agent(env, render_mode="human"):

FILE: nasim/agents/ql_agent.py
  class TabularQFunction (line 44) | class TabularQFunction:
    method __init__ (line 47) | def __init__(self, num_actions):
    method __call__ (line 51) | def __call__(self, x):
    method forward (line 54) | def forward(self, x):
    method forward_batch (line 61) | def forward_batch(self, x_batch):
    method update_batch (line 64) | def update_batch(self, s_batch, a_batch, delta_batch):
    method update (line 69) | def update(self, s, a, delta):
    method get_action (line 73) | def get_action(self, x):
    method display (line 76) | def display(self):
  class TabularQLearningAgent (line 80) | class TabularQLearningAgent:
    method __init__ (line 83) | def __init__(self,
    method get_epsilon (line 129) | def get_epsilon(self):
    method get_egreedy_action (line 134) | def get_egreedy_action(self, o, epsilon):
    method optimize (line 139) | def optimize(self, s, a, next_s, r, done):
    method train (line 158) | def train(self):
    method run_train_episode (line 200) | def run_train_episode(self, step_limit):
    method run_eval_episode (line 223) | def run_eval_episode(self,

FILE: nasim/agents/ql_replay_agent.py
  class ReplayMemory (line 46) | class ReplayMemory:
    method __init__ (line 49) | def __init__(self, capacity, s_dims):
    method store (line 58) | def store(self, s, a, next_s, r, done):
    method sample_batch (line 67) | def sample_batch(self, batch_size):
  class TabularQFunction (line 77) | class TabularQFunction:
    method __init__ (line 80) | def __init__(self, num_actions):
    method __call__ (line 84) | def __call__(self, x):
    method forward (line 87) | def forward(self, x):
    method forward_batch (line 94) | def forward_batch(self, x_batch):
    method update (line 97) | def update(self, s_batch, a_batch, delta_batch):
    method get_action (line 102) | def get_action(self, x):
    method display (line 105) | def display(self):
  class TabularQLearningAgent (line 109) | class TabularQLearningAgent:
    method __init__ (line 112) | def __init__(self,
    method get_epsilon (line 164) | def get_epsilon(self):
    method get_egreedy_action (line 169) | def get_egreedy_action(self, o, epsilon):
    method optimize (line 174) | def optimize(self):
    method train (line 199) | def train(self):
    method run_train_episode (line 241) | def run_train_episode(self, step_limit):
    method run_eval_episode (line 267) | def run_eval_episode(self,

FILE: nasim/agents/random_agent.py
  function run_random_agent (line 22) | def run_random_agent(env, step_limit=1e6, verbose=True):

FILE: nasim/envs/action.py
  function load_action_list (line 43) | def load_action_list(scenario):
  class Action (line 79) | class Action:
    method __init__ (line 111) | def __init__(self,
    method is_exploit (line 140) | def is_exploit(self):
    method is_privilege_escalation (line 150) | def is_privilege_escalation(self):
    method is_scan (line 160) | def is_scan(self):
    method is_remote (line 170) | def is_remote(self):
    method is_service_scan (line 183) | def is_service_scan(self):
    method is_os_scan (line 193) | def is_os_scan(self):
    method is_subnet_scan (line 203) | def is_subnet_scan(self):
    method is_process_scan (line 213) | def is_process_scan(self):
    method is_noop (line 223) | def is_noop(self):
    method __str__ (line 233) | def __str__(self):
    method __hash__ (line 240) | def __hash__(self):
    method __eq__ (line 243) | def __eq__(self, other):
  class Exploit (line 256) | class Exploit(Action):
    method __init__ (line 273) | def __init__(self,
    method __str__ (line 312) | def __str__(self):
    method __eq__ (line 316) | def __eq__(self, other):
  class PrivilegeEscalation (line 324) | class PrivilegeEscalation(Action):
    method __init__ (line 343) | def __init__(self,
    method __str__ (line 383) | def __str__(self):
    method __eq__ (line 387) | def __eq__(self, other):
  class ServiceScan (line 395) | class ServiceScan(Action):
    method __init__ (line 401) | def __init__(self,
  class OSScan (line 428) | class OSScan(Action):
    method __init__ (line 434) | def __init__(self,
  class SubnetScan (line 461) | class SubnetScan(Action):
    method __init__ (line 467) | def __init__(self,
  class ProcessScan (line 494) | class ProcessScan(Action):
    method __init__ (line 500) | def __init__(self,
  class NoOp (line 527) | class NoOp(Action):
    method __init__ (line 533) | def __init__(self, *args, **kwargs):
  class ActionResult (line 541) | class ActionResult:
    method __init__ (line 578) | def __init__(self,
    method info (line 631) | def info(self):
    method __str__ (line 653) | def __str__(self):
  class FlatActionSpace (line 660) | class FlatActionSpace(spaces.Discrete):
    method __init__ (line 675) | def __init__(self, scenario):
    method get_action (line 685) | def get_action(self, action_idx):
  class ParameterisedActionSpace (line 704) | class ParameterisedActionSpace(spaces.MultiDiscrete):
    method __init__ (line 764) | def __init__(self, scenario):
    method get_action (line 785) | def get_action(self, action_vec):
    method _get_scan_action_def (line 840) | def _get_scan_action_def(self, a_class):
    method _get_exploit_def (line 854) | def _get_exploit_def(self, service, os):
    method _get_privesc_def (line 863) | def _get_privesc_def(self, proc, os):

FILE: nasim/envs/environment.py
  class NASimEnv (line 16) | class NASimEnv(gym.Env):
    method __init__ (line 59) | def __init__(self,
    method reset (line 110) | def reset(self, *, seed=None, options=None):
    method step (line 143) | def step(self, action):
    method generative_step (line 191) | def generative_step(self, state, action):
    method generate_random_initial_state (line 230) | def generate_random_initial_state(self):
    method generate_initial_state (line 244) | def generate_initial_state(self):
    method render (line 259) | def render(self):
    method render_obs (line 271) | def render_obs(self, mode="human", obs=None):
    method render_state (line 305) | def render_state(self, mode="human", state=None):
    method render_action (line 345) | def render_action(self, action):
    method render_episode (line 362) | def render_episode(self, episode, width=7, height=7):
    method render_network_graph (line 380) | def render_network_graph(self, ax=None, show=False):
    method get_minimum_hops (line 398) | def get_minimum_hops(self):
    method get_action_mask (line 412) | def get_action_mask(self):
    method get_score_upper_bound (line 431) | def get_score_upper_bound(self):
    method goal_reached (line 451) | def goal_reached(self, state=None):
    method __str__ (line 471) | def __str__(self):
    method close (line 481) | def close(self):

FILE: nasim/envs/gym_env.py
  class NASimGymEnv (line 5) | class NASimGymEnv(NASimEnv):
    method __init__ (line 11) | def __init__(self,

FILE: nasim/envs/host_vector.py
  class HostVector (line 13) | class HostVector:
    method __init__ (line 82) | def __init__(self, vector):
    method vectorize (line 86) | def vectorize(cls, host, address_space_bounds, vector=None):
    method vectorize_random (line 115) | def vectorize_random(cls, host, address_space_bounds, vector=None):
    method compromised (line 132) | def compromised(self):
    method compromised (line 136) | def compromised(self, val):
    method discovered (line 140) | def discovered(self):
    method discovered (line 144) | def discovered(self, val):
    method reachable (line 148) | def reachable(self):
    method reachable (line 152) | def reachable(self, val):
    method address (line 156) | def address(self):
    method value (line 163) | def value(self):
    method discovery_value (line 167) | def discovery_value(self):
    method access (line 171) | def access(self):
    method access (line 175) | def access(self, val):
    method services (line 179) | def services(self):
    method os (line 186) | def os(self):
    method processes (line 193) | def processes(self):
    method is_running_service (line 199) | def is_running_service(self, srv):
    method is_running_os (line 203) | def is_running_os(self, os):
    method is_running_process (line 207) | def is_running_process(self, proc):
    method perform_action (line 211) | def perform_action(self, action):
    method observe (line 297) | def observe(self,
    method readable (line 338) | def readable(self):
    method copy (line 341) | def copy(self):
    method numpy (line 345) | def numpy(self):
    method _initialize (line 349) | def _initialize(cls, address_space_bounds, services, os_info, processes):
    method _update_vector_idxs (line 366) | def _update_vector_idxs(cls):
    method _subnet_address_idx_slice (line 383) | def _subnet_address_idx_slice(cls):
    method _host_address_idx_slice (line 387) | def _host_address_idx_slice(cls):
    method _get_service_idx (line 391) | def _get_service_idx(cls, srv_num):
    method _service_idx_slice (line 395) | def _service_idx_slice(cls):
    method _get_os_idx (line 399) | def _get_os_idx(cls, os_num):
    method _os_idx_slice (line 403) | def _os_idx_slice(cls):
    method _get_process_idx (line 407) | def _get_process_idx(cls, proc_num):
    method _process_idx_slice (line 411) | def _process_idx_slice(cls):
    method get_readable (line 415) | def get_readable(cls, vector):
    method reset (line 435) | def reset(cls):
    method __repr__ (line 443) | def __repr__(self):
    method __hash__ (line 446) | def __hash__(self):
    method __eq__ (line 449) | def __eq__(self, other):

FILE: nasim/envs/network.py
  class Network (line 11) | class Network:
    method __init__ (line 14) | def __init__(self, scenario):
    method reset (line 25) | def reset(self, state):
    method perform_action (line 36) | def perform_action(self, state, action):
    method _perform_subnet_scan (line 99) | def _perform_subnet_scan(self, next_state, action):
    method _update (line 131) | def _update(self, state, action, action_obs):
    method _update_reachable (line 135) | def _update_reachable(self, state, compromised_addr):
    method get_sensitive_hosts (line 146) | def get_sensitive_hosts(self):
    method is_sensitive_host (line 149) | def is_sensitive_host(self, host_address):
    method subnets_connected (line 152) | def subnets_connected(self, subnet_1, subnet_2):
    method subnet_traffic_permitted (line 155) | def subnet_traffic_permitted(self, src_subnet, dest_subnet, service):
    method host_traffic_permitted (line 163) | def host_traffic_permitted(self, src_addr, dest_addr, service):
    method has_required_remote_permission (line 167) | def has_required_remote_permission(self, state, action):
    method traffic_permitted (line 187) | def traffic_permitted(self, state, host_addr, service):
    method subnet_public (line 204) | def subnet_public(self, subnet):
    method get_number_of_subnets (line 207) | def get_number_of_subnets(self):
    method all_sensitive_hosts_compromised (line 210) | def all_sensitive_hosts_compromised(self, state):
    method get_total_sensitive_host_value (line 216) | def get_total_sensitive_host_value(self):
    method get_total_discovery_value (line 222) | def get_total_discovery_value(self):
    method get_minimal_hops (line 228) | def get_minimal_hops(self):
    method get_subnet_depths (line 233) | def get_subnet_depths(self):
    method __str__ (line 236) | def __str__(self):

FILE: nasim/envs/observation.py
  class Observation (line 7) | class Observation:
    method __init__ (line 51) | def __init__(self, state_shape):
    method get_space_bounds (line 63) | def get_space_bounds(scenario):
    method from_numpy (line 82) | def from_numpy(cls, o_array, state_shape):
    method from_state (line 89) | def from_state(self, state):
    method from_action_result (line 92) | def from_action_result(self, action_result):
    method from_state_and_action (line 102) | def from_state_and_action(self, state, action_result):
    method update_from_host (line 106) | def update_from_host(self, host_idx, host_obs_vector):
    method success (line 110) | def success(self):
    method connection_error (line 121) | def connection_error(self):
    method permission_error (line 132) | def permission_error(self):
    method undefined_error (line 143) | def undefined_error(self):
    method shape_flat (line 153) | def shape_flat(self):
    method shape (line 163) | def shape(self):
    method numpy_flat (line 173) | def numpy_flat(self):
    method numpy (line 183) | def numpy(self):
    method get_readable (line 193) | def get_readable(self):
    method __str__ (line 217) | def __str__(self):
    method __eq__ (line 220) | def __eq__(self, other):
    method __hash__ (line 223) | def __hash__(self):

FILE: nasim/envs/render.py
  class Viewer (line 31) | class Viewer:
    method __init__ (line 34) | def __init__(self, network):
    method render_graph (line 45) | def render_graph(self, state, ax=None, show=False, width=5, height=6):
    method render_episode (line 98) | def render_episode(self, episode, width=7, height=5):
    method render_readable (line 116) | def render_readable(self, obs):
    method render_readable_state (line 131) | def render_readable_state(self, state):
    method close (line 144) | def close(self):
    method _construct_table_from_dict (line 148) | def _construct_table_from_dict(self, d):
    method _construct_table_from_list_of_dicts (line 155) | def _construct_table_from_list_of_dicts(self, l):
    method _construct_graph (line 163) | def _construct_graph(self, state):
    method _get_host_positions (line 209) | def _get_host_positions(self, network):
    method _get_host_position (line 269) | def _get_host_position(self, m, positions, address_space, row_min, row...
    method _get_subnets (line 307) | def _get_subnets(self, network):
  class EpisodeViewer (line 328) | class EpisodeViewer:
    method __init__ (line 331) | def __init__(self, episode, G, sensitive_hosts, width=7, height=7):
    method _setup_GUI (line 343) | def _setup_GUI(self, width, height):
    method _close (line 371) | def _close(self):
    method _next_graph (line 375) | def _next_graph(self):
    method _previous_graph (line 382) | def _previous_graph(self):
    method _update_graph (line 387) | def _update_graph(self, G, state):
    method _draw_graph (line 398) | def _draw_graph(self, G):
    method legend (line 446) | def legend(compromised=True):
  function get_host_representation (line 463) | def get_host_representation(state, sensitive_hosts, m, representation):

FILE: nasim/envs/state.py
  class State (line 7) | class State:
    method __init__ (line 25) | def __init__(self, network_tensor, host_num_map):
    method tensorize (line 39) | def tensorize(cls, network):
    method generate_initial_state (line 54) | def generate_initial_state(cls, network):
    method generate_random_initial_state (line 60) | def generate_random_initial_state(cls, network):
    method from_numpy (line 79) | def from_numpy(cls, s_array, state_shape, host_num_map):
    method reset (line 85) | def reset(cls):
    method hosts (line 90) | def hosts(self):
    method copy (line 96) | def copy(self):
    method get_initial_observation (line 100) | def get_initial_observation(self, fully_obs):
    method get_observation (line 123) | def get_observation(self, action, action_result, fully_obs):
    method shape_flat (line 202) | def shape_flat(self):
    method shape (line 205) | def shape(self):
    method numpy_flat (line 208) | def numpy_flat(self):
    method numpy (line 211) | def numpy(self):
    method update_host (line 214) | def update_host(self, host_addr, host_vector):
    method get_host (line 218) | def get_host(self, host_addr):
    method get_host_idx (line 222) | def get_host_idx(self, host_addr):
    method get_host_and_idx (line 225) | def get_host_and_idx(self, host_addr):
    method host_reachable (line 229) | def host_reachable(self, host_addr):
    method host_compromised (line 232) | def host_compromised(self, host_addr):
    method host_discovered (line 235) | def host_discovered(self, host_addr):
    method host_has_access (line 238) | def host_has_access(self, host_addr, access_level):
    method set_host_compromised (line 241) | def set_host_compromised(self, host_addr):
    method set_host_reachable (line 244) | def set_host_reachable(self, host_addr):
    method set_host_discovered (line 247) | def set_host_discovered(self, host_addr):
    method get_host_value (line 250) | def get_host_value(self, host_address):
    method host_is_running_service (line 253) | def host_is_running_service(self, host_addr, service):
    method host_is_running_os (line 256) | def host_is_running_os(self, host_addr, os):
    method get_total_host_value (line 259) | def get_total_host_value(self):
    method state_size (line 266) | def state_size(self):
    method get_readable (line 269) | def get_readable(self):
    method __str__ (line 277) | def __str__(self):
    method __hash__ (line 284) | def __hash__(self):
    method __eq__ (line 287) | def __eq__(self, other):
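
The `State` listing above exposes both `numpy_flat`/`shape_flat` and `numpy`/`shape` views, and `from_numpy` rebuilds a state from a flat array plus a shape. A minimal stdlib sketch of that flat-to-per-host mapping (the feature count and values here are illustrative, not NASim's actual state layout):

```python
# Sketch: viewing a flat state vector as per-host feature rows.
# The feature count (4) and the example values are illustrative only;
# NASim's real HostVector layout has more fields.

def unflatten_state(flat, num_hosts, num_features):
    """Split a flat list into num_hosts rows of num_features each."""
    assert len(flat) == num_hosts * num_features, "shape mismatch"
    return [flat[i * num_features:(i + 1) * num_features]
            for i in range(num_hosts)]

flat = [1, 0, 0, 1,   # host row 0
        0, 1, 0, 0,   # host row 1
        0, 0, 1, 0]   # host row 2
state = unflatten_state(flat, num_hosts=3, num_features=4)
print(state[0])  # [1, 0, 0, 1]
```

This is the same idea as `from_numpy(s_array, state_shape, host_num_map)`: the flat array and the shape together are enough to recover each host's slice.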

FILE: nasim/envs/utils.py
  class OneHotBool (line 9) | class OneHotBool(enum.IntEnum):
    method from_bool (line 15) | def from_bool(b):
    method __str__ (line 20) | def __str__(self):
    method __repr__ (line 23) | def __repr__(self):
  class ServiceState (line 27) | class ServiceState(enum.IntEnum):
    method __str__ (line 33) | def __str__(self):
    method __repr__ (line 36) | def __repr__(self):
  class AccessLevel (line 40) | class AccessLevel(enum.IntEnum):
    method __str__ (line 45) | def __str__(self):
    method __repr__ (line 48) | def __repr__(self):
  function get_minimal_hops_to_goal (line 52) | def get_minimal_hops_to_goal(topology, sensitive_addresses):
  function min_subnet_depth (line 105) | def min_subnet_depth(topology):
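
The `min_subnet_depth(topology)` helper above, together with the module's `INTERNET = 0` constant and its `deque` import, suggests a breadth-first search over the subnet adjacency matrix starting from the internet subnet. A plausible sketch of that computation (the details are an inference from the symbol index, not the repo's exact code):

```python
from collections import deque

INTERNET = 0  # subnet 0 represents the internet, as in nasim.envs.utils

def min_subnet_depth(topology):
    """BFS hop count of each subnet from the internet subnet.

    topology: square 0/1 adjacency matrix (list of lists).
    Returns a list of depths; unreachable subnets get -1.
    """
    n = len(topology)
    depths = [-1] * n
    depths[INTERNET] = 0
    queue = deque([INTERNET])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if topology[u][v] and depths[v] == -1:
                depths[v] = depths[u] + 1
                queue.append(v)
    return depths

# internet(0) -- subnet1 -- subnet2, with subnet3 also hanging off subnet1
topology = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
]
print(min_subnet_depth(topology))  # [0, 1, 2, 2]
```

A depth like this gives a lower bound on the number of pivots an attacker needs to reach a subnet, which is presumably what `get_minimal_hops_to_goal` builds on.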

FILE: nasim/scenarios/__init__.py
  function make_benchmark_scenario (line 8) | def make_benchmark_scenario(scenario_name, seed=None):
  function generate_scenario (line 42) | def generate_scenario(num_hosts, num_services, **params):
  function load_scenario (line 63) | def load_scenario(path, name=None):
  function get_scenario_max (line 83) | def get_scenario_max(scenario_name):
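
`make_benchmark_scenario(scenario_name, seed=None)` sits alongside `load_scenario` and `generate_scenario`, which suggests a dispatch between YAML-defined (static) and generator-defined benchmarks. A simplified, hypothetical sketch of that dispatch pattern (the benchmark names, file names, and parameters below are illustrative, not the repo's actual tables):

```python
# Hypothetical dispatch between static (YAML) and generated benchmarks.
# Names and parameter values are illustrative only.

STATIC_BENCHMARKS = {"tiny": "tiny.yaml", "small": "small.yaml"}
GENERATED_BENCHMARKS = {"small-gen": {"num_hosts": 8, "num_services": 3}}

def make_benchmark_scenario(scenario_name, seed=None):
    name = scenario_name.lower()
    if name in STATIC_BENCHMARKS:
        # in the real code this would call load_scenario(path)
        return ("load", STATIC_BENCHMARKS[name])
    if name in GENERATED_BENCHMARKS:
        # in the real code this would call generate_scenario(**params)
        params = dict(GENERATED_BENCHMARKS[name], seed=seed)
        return ("generate", params)
    raise NotImplementedError(f"Benchmark scenario '{scenario_name}' not found.")
```

The split mirrors the repository layout: `nasim/scenarios/benchmark/` holds the static YAML files, while `nasim/scenarios/benchmark/generated.py` defines `AVAIL_GEN_BENCHMARKS` for the generator-backed ones.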

FILE: nasim/scenarios/generator.py
  class ScenarioGenerator (line 25) | class ScenarioGenerator:
    method generate (line 66) | def generate(self,
    method _construct_scenario (line 226) | def _construct_scenario(self):
    method _generate_subnets (line 249) | def _generate_subnets(self, num_hosts):
    method _generate_topology (line 269) | def _generate_topology(self):
    method _generate_address_space_bounds (line 302) | def _generate_address_space_bounds(self, address_space_bounds):
    method _generate_os (line 325) | def _generate_os(self, num_os):
    method _generate_services (line 328) | def _generate_services(self, num_services):
    method _generate_processes (line 331) | def _generate_processes(self, num_processes):
    method _generate_exploits (line 334) | def _generate_exploits(self, num_exploits, exploit_cost, exploit_probs):
    method _generate_privescs (line 359) | def _generate_privescs(self, num_privesc, privesc_cost, privesc_probs):
    method _get_action_probs (line 402) | def _get_action_probs(self, num_actions, action_probs):
    method _generate_sensitive_hosts (line 433) | def _generate_sensitive_hosts(self, r_sensitive, r_user, random_goal):
    method _generate_uniform_hosts (line 449) | def _generate_uniform_hosts(self):
    method _possible_host_configs (line 482) | def _possible_host_configs(self):
    method _permutations (line 505) | def _permutations(self, n):
    method _generate_correlated_hosts (line 536) | def _generate_correlated_hosts(self, alpha_H, alpha_V, lambda_V):
    method _get_host_config (line 575) | def _get_host_config(self,
    method _sample_config (line 600) | def _sample_config(self,
    method _dirichlet_process (line 623) | def _dirichlet_process(self,
    method _dirichlet_sample (line 649) | def _dirichlet_sample(self, alpha_V, choices, prev_vals):
    method _is_sensitive_host (line 662) | def _is_sensitive_host(self, addr):
    method _convert_to_service_map (line 665) | def _convert_to_service_map(self, config):
    method _convert_to_process_map (line 672) | def _convert_to_process_map(self, config):
    method _convert_to_os_map (line 679) | def _convert_to_os_map(self, os):
    method _ensure_host_vulnerability (line 691) | def _ensure_host_vulnerability(self):
    method _host_is_vulnerable (line 716) | def _host_is_vulnerable(self, host, access_level=u.USER_ACCESS):
    method _host_is_vulnerable_to_exploit (line 726) | def _host_is_vulnerable_to_exploit(self, host, exploit_def):
    method _host_is_vulnerable_to_privesc (line 733) | def _host_is_vulnerable_to_privesc(self, host, privesc_def):
    method _update_host_to_vulnerable (line 740) | def _update_host_to_vulnerable(self, host, access_level=u.USER_ACCESS):
    method _update_host_exploit_vulnerability (line 767) | def _update_host_exploit_vulnerability(self, host, os_constraint):
    method _update_host_privesc_vulnerability (line 791) | def _update_host_privesc_vulnerability(self, host, os_constraint):
    method _update_host_os (line 813) | def _update_host_os(self, host, os):
    method _get_host_value (line 819) | def _get_host_value(self, address):
    method _generate_firewall (line 822) | def _generate_firewall(self, restrictiveness):
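
The `_generate_correlated_hosts`, `_dirichlet_process`, and `_dirichlet_sample` methods above indicate that host configurations are sampled with a Dirichlet-process-style scheme, so the same configuration tends to recur across hosts (as on real networks built from a few standard images). A generic Chinese-restaurant-process sketch of that idea (this is the standard CRP, not the repo's exact algorithm; the service names are illustrative):

```python
import random

def crp_sample_configs(num_hosts, alpha, draw_new_config, rng=None):
    """Chinese-restaurant-process sampling: with probability alpha/(alpha+i)
    draw a brand-new config, otherwise reuse a previous host's config in
    proportion to how often it has been used. Higher alpha => more variety."""
    rng = rng or random.Random(0)
    configs = []
    for i in range(num_hosts):
        if not configs or rng.random() < alpha / (alpha + i):
            configs.append(draw_new_config(rng))   # open a new "table"
        else:
            configs.append(rng.choice(configs))    # reuse an existing config
    return configs

# Illustrative config: the set of services a host runs.
def draw_new_config(rng):
    services = ["ssh", "http", "ftp", "smtp"]
    return frozenset(s for s in services if rng.random() < 0.5)

hosts = crp_sample_configs(num_hosts=10, alpha=1.0, draw_new_config=draw_new_config)
# With a low alpha, a handful of distinct configs is shared across the 10 hosts.
```

Picking uniformly from the list of previous assignments is what makes reuse proportional to usage counts, since popular configs appear in the list multiple times.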

FILE: nasim/scenarios/host.py
  class Host (line 2) | class Host:
    method __init__ (line 10) | def __init__(self,
    method is_running_service (line 65) | def is_running_service(self, service):
    method is_running_os (line 68) | def is_running_os(self, os):
    method is_running_process (line 71) | def is_running_process(self, process):
    method traffic_permitted (line 74) | def traffic_permitted(self, addr, service):
    method __str__ (line 77) | def __str__(self):
    method __repr__ (line 106) | def __repr__(self):

FILE: nasim/scenarios/loader.py
  class ScenarioLoader (line 64) | class ScenarioLoader:
    method load (line 66) | def load(self, file_path, name=None):
    method _construct_scenario (line 108) | def _construct_scenario(self):
    method _check_scenario_sections_valid (line 129) | def _check_scenario_sections_valid(self):
    method _parse_subnets (line 152) | def _parse_subnets(self):
    method _validate_subnets (line 160) | def _validate_subnets(self, subnets):
    method _parse_topology (line 167) | def _parse_topology(self):
    method _validate_topology (line 172) | def _validate_topology(self, topology):
    method _parse_os (line 189) | def _parse_os(self):
    method _validate_os (line 194) | def _validate_os(self, os):
    method _parse_services (line 200) | def _parse_services(self):
    method _validate_services (line 205) | def _validate_services(self, services):
    method _parse_processes (line 211) | def _parse_processes(self):
    method _validate_processes (line 216) | def _validate_processes(self, processes):
    method _parse_sensitive_hosts (line 222) | def _parse_sensitive_hosts(self):
    method _validate_sensitive_hosts (line 230) | def _validate_sensitive_hosts(self, sensitive_hosts):
    method _is_valid_subnet_ID (line 268) | def _is_valid_subnet_ID(self, subnet_ID):
    method _is_valid_host_address (line 275) | def _is_valid_host_address(self, subnet_ID, host_ID):
    method _parse_exploits (line 284) | def _parse_exploits(self):
    method _validate_exploits (line 289) | def _validate_exploits(self, exploits):
    method _validate_single_exploit (line 293) | def _validate_single_exploit(self, e_name, e):
    method _parse_privescs (line 326) | def _parse_privescs(self):
    method _validate_privescs (line 330) | def _validate_privescs(self, privescs):
    method _validate_single_privesc (line 334) | def _validate_single_privesc(self, pe_name, pe):
    method _parse_scan_costs (line 369) | def _parse_scan_costs(self):
    method _validate_scan_cost (line 382) | def _validate_scan_cost(self, scan_name, scan_cost):
    method _parse_host_configs (line 385) | def _parse_host_configs(self):
    method _validate_host_configs (line 389) | def _validate_host_configs(self, host_configs):
    method _has_all_host_addresses (line 401) | def _has_all_host_addresses(self, addresses):
    method _validate_host_config (line 412) | def _validate_host_config(self, addr, cfg):
    method _validate_host_address (line 479) | def _validate_host_address(self, addr, err_prefix=""):
    method _parse_firewall (line 500) | def _parse_firewall(self):
    method _validate_firewall (line 508) | def _validate_firewall(self, firewall):
    method _contains_all_required_firewalls (line 519) | def _contains_all_required_firewalls(self, firewall):
    method _is_valid_firewall_setting (line 529) | def _is_valid_firewall_setting(self, f):
    method _parse_hosts (line 541) | def _parse_hosts(self):
    method _construct_host_config (line 560) | def _construct_host_config(self, host_cfg):
    method _get_host_value (line 572) | def _get_host_value(self, address, host_cfg):
    method _parse_step_limit (line 577) | def _parse_step_limit(self):
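
Much of the loader above is validation; `_is_valid_host_address(subnet_ID, host_ID)` is a representative example. A minimal sketch of what such a check plausibly does, given that subnet 0 is reserved for the internet (details are assumptions, not the repo's exact code):

```python
def is_valid_host_address(subnets, addr):
    """Check addr = (subnet_ID, host_ID) against the scenario's subnets.

    subnets[i] is the number of hosts in subnet i. Subnet 0 is the
    internet, so hosts live in subnets 1..len(subnets)-1. This mirrors
    the loader's _is_valid_host_address; the exact logic is assumed.
    """
    if not (isinstance(addr, (list, tuple)) and len(addr) == 2):
        return False
    subnet_id, host_id = addr
    if not (1 <= subnet_id < len(subnets)):
        return False
    return 0 <= host_id < subnets[subnet_id]

subnets = [1, 2, 3]  # internet, then a 2-host and a 3-host subnet
print(is_valid_host_address(subnets, (1, 0)))  # True
print(is_valid_host_address(subnets, (0, 0)))  # False: internet holds no hosts
```

Validating every address at load time lets the loader raise a clear error for a malformed YAML scenario instead of failing deep inside the environment later.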

FILE: nasim/scenarios/scenario.py
  class Scenario (line 7) | class Scenario:
    method __init__ (line 9) | def __init__(self, scenario_dict, name=None, generated=False):
    method step_limit (line 23) | def step_limit(self):
    method services (line 27) | def services(self):
    method num_services (line 31) | def num_services(self):
    method os (line 35) | def os(self):
    method num_os (line 39) | def num_os(self):
    method processes (line 43) | def processes(self):
    method num_processes (line 47) | def num_processes(self):
    method access_levels (line 51) | def access_levels(self):
    method exploits (line 55) | def exploits(self):
    method privescs (line 59) | def privescs(self):
    method exploit_map (line 63) | def exploit_map(self):
    method privesc_map (line 97) | def privesc_map(self):
    method subnets (line 131) | def subnets(self):
    method topology (line 135) | def topology(self):
    method sensitive_hosts (line 139) | def sensitive_hosts(self):
    method sensitive_addresses (line 143) | def sensitive_addresses(self):
    method firewall (line 147) | def firewall(self):
    method hosts (line 151) | def hosts(self):
    method address_space (line 155) | def address_space(self):
    method service_scan_cost (line 159) | def service_scan_cost(self):
    method os_scan_cost (line 163) | def os_scan_cost(self):
    method subnet_scan_cost (line 167) | def subnet_scan_cost(self):
    method process_scan_cost (line 171) | def process_scan_cost(self):
    method address_space_bounds (line 175) | def address_space_bounds(self):
    method host_value_bounds (line 181) | def host_value_bounds(self):
    method host_discovery_value_bounds (line 197) | def host_discovery_value_bounds(self):
    method display (line 212) | def display(self):
    method get_action_space_size (line 215) | def get_action_space_size(self):
    method get_state_space_size (line 223) | def get_state_space_size(self):
    method get_state_dims (line 237) | def get_state_dims(self):
    method get_observation_dims (line 250) | def get_observation_dims(self):
    method get_description (line 254) | def get_description(self):
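
`get_action_space_size` presumably counts one flat action per (target host, primitive) pair, where the primitives are the scenario's exploits, privilege escalations, and NASim's four scan types (service, OS, subnet, process). A back-of-envelope sketch under that assumption (the formula is inferred from the symbol index, not copied from the repo):

```python
def action_space_size(num_hosts, num_exploits, num_privescs, num_scans=4):
    """One flat action per (target host, primitive) pair.

    num_scans=4 reflects NASim's service/OS/subnet/process scans.
    Inferred formula -- the repo's get_action_space_size may differ.
    """
    return num_hosts * (num_exploits + num_privescs + num_scans)

# e.g. a 3-host scenario with 1 exploit and 1 privesc:
print(action_space_size(3, 1, 1))  # 18
```

This is why action spaces grow linearly with hosts but multiplicatively once exploits and privescs are added, which matters when comparing the benchmark scenarios' "Actions" column.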

FILE: nasim/scenarios/utils.py
  function load_yaml (line 60) | def load_yaml(file_path):
  function get_file_name (line 82) | def get_file_name(file_path):

FILE: nasim/scripts/describe_scenarios.py
  function describe_scenarios (line 32) | def describe_scenarios(output=None):

FILE: nasim/scripts/run_random_benchmarks.py
  function print_msg (line 22) | def print_msg(msg):
  class Result (line 26) | class Result:
    method __init__ (line 28) | def __init__(self, name):
    method add (line 33) | def add(self, steps, reward):
    method summarize (line 37) | def summarize(self):
    method get_formatted_summary (line 44) | def get_formatted_summary(self):
  function run_scenario (line 52) | def run_scenario(args):
  function collate_results (line 65) | def collate_results(results):
  function output_results (line 75) | def output_results(results, output=None):
  function run_random_benchmark (line 94) | def run_random_benchmark(num_cpus=1, num_seeds=10, output=None):

FILE: nasim/scripts/train_dqn.py
  class BestDQN (line 7) | class BestDQN(DQNAgent):
    method __init__ (line 10) | def __init__(self,
    method run_train_episode (line 20) | def run_train_episode(self, step_limit):

FILE: setup.py
  function get_version (line 22) | def get_version():

FILE: test/test_bruteforce.py
  function test_bruteforce_static (line 18) | def test_bruteforce_static(scenario, seed, fully_obs, flat_actions, flat...
  function test_bruteforce_gen (line 36) | def test_bruteforce_gen(scenario, seed, fully_obs, flat_actions, flat_obs):

FILE: test/test_env.py
  function test_render_error (line 10) | def test_render_error():
  function test_render_readable (line 17) | def test_render_readable():
  function test_render_state_error (line 23) | def test_render_state_error():
  function test_render_state_readable (line 30) | def test_render_state_readable():
  function test_render_action (line 37) | def test_render_action(flat_actions):
  function test_get_total_discovery_value (line 47) | def test_get_total_discovery_value(scenario, expected_value):
  function test_get_total_sensitive_host_value (line 58) | def test_get_total_sensitive_host_value(scenario, expected_value):
  function test_get_minumum_hops (line 69) | def test_get_minumum_hops(scenario, expected_value):

FILE: test/test_generator.py
  function test_generator (line 14) | def test_generator(scenario, seed):

FILE: test/test_gym_bruteforce.py
  function test_gym_reload (line 16) | def test_gym_reload():
  function test_bruteforce (line 26) | def test_bruteforce(scenario, po, obs, actions, v):

Condensed preview — 91 files, each showing path, character count, and a content snippet (full structured content: 386K chars).
[
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "chars": 834,
    "preview": "---\nname: Bug report\nabout: Create a report to help us improve\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**Describe the b"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "chars": 595,
    "preview": "---\nname: Feature request\nabout: Suggest an idea for this project\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**Is your fea"
  },
  {
    "path": ".gitignore",
    "chars": 527,
    "preview": "*.cprof\n\n# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution "
  },
  {
    "path": ".readthedocs.yaml",
    "chars": 584,
    "preview": "# .readthedocs.yaml\n# Read the Docs configuration file\n# See https://docs.readthedocs.io/en/stable/config-file/v2.html f"
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "chars": 3360,
    "preview": "# Contributor Covenant Code of Conduct\n\n## Our Pledge\n\nIn the interest of fostering an open and welcoming environment, w"
  },
  {
    "path": "CONTRIBUTING.rst",
    "chars": 1406,
    "preview": "Development\n===========\n\nNASim is a work in progress and contributions are welcome via pull request.\n\nFor more informati"
  },
  {
    "path": "LICENSE.md",
    "chars": 1068,
    "preview": "\nThe MIT License (MIT)\n\nCopyright (c) 2018 \n\nPermission is hereby granted, free of charge, to any person obtaining a cop"
  },
  {
    "path": "README.rst",
    "chars": 7124,
    "preview": "**Status**: Stable release. No extra development is planned, but still being maintained (bug fixes, etc).\n\n\nNetwork Atta"
  },
  {
    "path": "docs/Makefile",
    "chars": 638,
    "preview": "# Minimal makefile for Sphinx documentation\n#\n\n# You can set these variables from the command line, and also\n# from the "
  },
  {
    "path": "docs/make.bat",
    "chars": 799,
    "preview": "@ECHO OFF\r\n\r\npushd %~dp0\r\n\r\nREM Command file for Sphinx documentation\r\n\r\nif \"%SPHINXBUILD%\" == \"\" (\r\n\tset SPHINXBUILD=sp"
  },
  {
    "path": "docs/requirements.txt",
    "chars": 47,
    "preview": "nasim\nsphinx\nsphinx-autobuild\nsphinx-rtd-theme\n"
  },
  {
    "path": "docs/source/community/acknowledgements.rst",
    "chars": 171,
    "preview": ".. _acknowledgements:\n\nAcknowledgements\n================\n\n* Inspiration for the documentation was taken from the `DeeR <"
  },
  {
    "path": "docs/source/community/contact.rst",
    "chars": 72,
    "preview": "Contact\n=======\nQuestions? Please contact Jonathon.schwartz@anu.edu.au.\n"
  },
  {
    "path": "docs/source/community/development.rst",
    "chars": 1416,
    "preview": ".. _dev:\n\nDevelopment\n===========\n\nNASim is a work in progress and contributions are welcome via pull request.\n\nFor more"
  },
  {
    "path": "docs/source/community/distributing.rst",
    "chars": 1696,
    "preview": ".. _distribution:\n\nDistribution\n============\n\nThis document contains some notes on distributing NASim via PyPi. This is "
  },
  {
    "path": "docs/source/community/index.rst",
    "chars": 174,
    "preview": ".. _community:\n\nCommunity & Development\n=======================\n\n.. toctree::\n    :maxdepth: 1\n\n    development\n    lice"
  },
  {
    "path": "docs/source/community/license.rst",
    "chars": 1083,
    "preview": "License\n=======\n\nThe MIT License (MIT)\n\nCopyright (c) 2018\n\nPermission is hereby granted, free of charge, to any person "
  },
  {
    "path": "docs/source/conf.py",
    "chars": 2446,
    "preview": "# Configuration file for the Sphinx documentation builder.\n#\n# This file only contains a selection of the most common op"
  },
  {
    "path": "docs/source/explanations/index.rst",
    "chars": 164,
    "preview": ".. _explanations:\n\nExplanations\n============\n\nMore technical explanations related to NASim.\n\n.. toctree::\n    :maxdepth:"
  },
  {
    "path": "docs/source/explanations/scenario_generation.rst",
    "chars": 4447,
    "preview": ".. _scenario_generation_explanation:\n\nScenario Generation Explanation\n===============================\n\nGenerating the sc"
  },
  {
    "path": "docs/source/explanations/sim_to_real.rst",
    "chars": 2899,
    "preview": ".. _sim_to_real_explanation:\n\nSim-to-Real Gap Considerations\n==============================\n\nNASim is a fairly simplifie"
  },
  {
    "path": "docs/source/index.rst",
    "chars": 5824,
    "preview": "Welcome to Network Attack Simulator's documentation!\n====================================================\n\nNetwork Attac"
  },
  {
    "path": "docs/source/reference/agents/index.rst",
    "chars": 2038,
    "preview": ".. _agents_reference:\n\nAgents Reference\n================\n\nThis page provides a short summary of the agents that come wit"
  },
  {
    "path": "docs/source/reference/envs/actions.rst",
    "chars": 80,
    "preview": ".. _`actions`:\n\nActions\n=======\n\n.. automodule:: nasim.envs.action\n   :members:\n"
  },
  {
    "path": "docs/source/reference/envs/environment.rst",
    "chars": 97,
    "preview": ".. _`environment`:\n\nEnvironment\n===========\n\n.. automodule:: nasim.envs.environment\n   :members:\n"
  },
  {
    "path": "docs/source/reference/envs/host_vector.rst",
    "chars": 95,
    "preview": ".. _`host_vector`:\n\nHostVector\n==========\n\n.. automodule:: nasim.envs.host_vector\n   :members:\n"
  },
  {
    "path": "docs/source/reference/envs/index.rst",
    "chars": 267,
    "preview": ".. _env_reference:\n\nEnvironment Reference\n=====================\n\nTechnical reference material for classes and functions "
  },
  {
    "path": "docs/source/reference/envs/observation.rst",
    "chars": 97,
    "preview": ".. _`observation`:\n\nObservation\n===========\n\n.. automodule:: nasim.envs.observation\n   :members:\n"
  },
  {
    "path": "docs/source/reference/envs/state.rst",
    "chars": 73,
    "preview": ".. _`state`:\n\nState\n=====\n\n.. automodule:: nasim.envs.state\n   :members:\n"
  },
  {
    "path": "docs/source/reference/index.rst",
    "chars": 160,
    "preview": ".. _reference:\n\nReference\n=========\n\nTechnical reference material.\n\n.. toctree::\n    :maxdepth: 2\n\n    load\n    agents/i"
  },
  {
    "path": "docs/source/reference/load.rst",
    "chars": 193,
    "preview": ".. _nasim_init:\n\nNASimEnv load reference\n=======================\n\nTechnical reference material for different functions f"
  },
  {
    "path": "docs/source/reference/scenarios/benchmark_scenarios.rst",
    "chars": 3156,
    "preview": ".. _benchmark_scenarios:\n\nBenchmark Scenarios\n===================\n\nThere are a number of existing scenarios that come wi"
  },
  {
    "path": "docs/source/reference/scenarios/benchmark_scenarios_agent_scores.csv",
    "chars": 914,
    "preview": "Scenario Name,Steps,Total Reward\ntiny,108.02 +/- 43.82,91.98 +/- 43.82\ntiny-hard,135.31 +/- 65.56,21.05 +/- 85.45\ntiny-s"
  },
  {
    "path": "docs/source/reference/scenarios/benchmark_scenarios_table.csv",
    "chars": 1110,
    "preview": "Name,Type,Subnets,Hosts,OS,Services,Processes,Exploits,PrivEscs,Actions,Observation Dims,States,Step Limit\ntiny,static,4"
  },
  {
    "path": "docs/source/reference/scenarios/generator.rst",
    "chars": 120,
    "preview": ".. _scenario_generator:\n\nScenario Generator\n===================\n\n.. automodule:: nasim.scenarios.generator\n   :members:\n"
  },
  {
    "path": "docs/source/reference/scenarios/index.rst",
    "chars": 260,
    "preview": ".. _scenario_reference:\n\nScenario Reference\n==================\n\nTechnical reference material for classes and functions u"
  },
  {
    "path": "docs/source/tutorials/creating_scenarios.rst",
    "chars": 14490,
    "preview": ".. _`creating_scenarios_tute`:\n\nCreating Custom Scenarios\n=========================\n\nWith NASim it is possible to use cu"
  },
  {
    "path": "docs/source/tutorials/environment.rst",
    "chars": 3802,
    "preview": ".. _`env_tute`:\n\nInteracting with NASim Environment\n==================================\n\nAssuming you are comfortable loa"
  },
  {
    "path": "docs/source/tutorials/gym_load.rst",
    "chars": 2482,
    "preview": ".. _`gym_load_tute`:\n\nStarting NASim using OpenAI gym\n===============================\n\nOn startup NASim also registers e"
  },
  {
    "path": "docs/source/tutorials/index.rst",
    "chars": 163,
    "preview": ".. _tutorials:\n\nTutorials\n=========\n\n.. toctree::\n    :maxdepth: 1\n\n    installation\n    loading\n    gym_load\n    enviro"
  },
  {
    "path": "docs/source/tutorials/installation.rst",
    "chars": 1784,
    "preview": ".. _installation:\n\nInstallation\n==============\n\n\nDependencies\n--------------\n\nThis framework is tested to work under Pyt"
  },
  {
    "path": "docs/source/tutorials/loading.rst",
    "chars": 5220,
    "preview": ".. _`loading_tute`:\n\nStarting a NASim Environment\n============================\n\nInteraction with NASim is done primarily"
  },
  {
    "path": "docs/source/tutorials/scenarios.rst",
    "chars": 4301,
    "preview": ".. _`scenarios_tute`:\n\nUnderstanding Scenarios\n=======================\n\nA scenario in NASim defines all the necessary pr"
  },
  {
    "path": "nasim/__init__.py",
    "chars": 6715,
    "preview": "import gymnasium as gym\nfrom gymnasium.envs.registration import register\n\nfrom nasim.envs import NASimEnv\nfrom nasim.sce"
  },
  {
    "path": "nasim/agents/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "nasim/agents/bruteforce_agent.py",
    "chars": 3577,
    "preview": "\"\"\"An bruteforce agent that repeatedly cycles through all available actions in\norder.\n\nTo run 'tiny' benchmark scenario "
  },
  {
    "path": "nasim/agents/dqn_agent.py",
    "chars": 13783,
    "preview": "\"\"\"An example DQN Agent.\n\nIt uses pytorch 1.5+ and tensorboard libraries (HINT: these dependencies can\nbe installed by r"
  },
  {
    "path": "nasim/agents/keyboard_agent.py",
    "chars": 8104,
    "preview": "\"\"\"An agent that lets the user interact with NASim using the keyboard.\n\nTo run 'tiny' benchmark scenario with default se"
  },
  {
    "path": "nasim/agents/ql_agent.py",
    "chars": 10578,
    "preview": "\"\"\"An example Tabular, epsilon greedy Q-Learning Agent.\n\nThis agent does not use an Experience replay (see the 'ql_repla"
  },
  {
    "path": "nasim/agents/ql_replay_agent.py",
    "chars": 12360,
    "preview": "\"\"\"An example Tabular, epsilon greedy Q-Learning Agent using experience replay.\n\nThe replay can help improve learning st"
  },
  {
    "path": "nasim/agents/random_agent.py",
    "chars": 3379,
    "preview": "\"\"\"A random agent that selects a random action at each step\n\nTo run 'tiny' benchmark scenario with default settings, run"
  },
  {
    "path": "nasim/demo.py",
    "chars": 2175,
    "preview": "\"\"\"Script for running NASim demo\n\nUsage\n-----\n\n$ python demo [-ai] [-h] env_name\n\"\"\"\n\nimport os.path as osp\n\nimport nasi"
  },
  {
    "path": "nasim/envs/__init__.py",
    "chars": 87,
    "preview": "from nasim.envs.gym_env import NASimGymEnv\nfrom nasim.envs.environment import NASimEnv\n"
  },
  {
    "path": "nasim/envs/action.py",
    "chars": 25856,
    "preview": "\"\"\"Action related classes for the NASim environment.\n\nThis module contains the different action classes that are used\nto"
  },
  {
    "path": "nasim/envs/environment.py",
    "chars": 16247,
    "preview": "\"\"\" The main Environment class for NASim: NASimEnv.\n\nThe NASimEnv class is the main interface for agents interacting wit"
  },
  {
    "path": "nasim/envs/gym_env.py",
    "chars": 1585,
    "preview": "from nasim.envs.environment import NASimEnv\nfrom nasim.scenarios import Scenario, make_benchmark_scenario\n\n\nclass NASimG"
  },
  {
    "path": "nasim/envs/host_vector.py",
    "chars": 15719,
    "preview": "\"\"\" This module contains the HostVector class.\n\nThis is the main class for storing and updating the state of a single ho"
  },
  {
    "path": "nasim/envs/network.py",
    "chars": 9205,
    "preview": "import numpy as np\n\nfrom nasim.envs.action import ActionResult\nfrom nasim.envs.utils import get_minimal_hops_to_goal, mi"
  },
  {
    "path": "nasim/envs/observation.py",
    "chars": 6724,
    "preview": "import numpy as np\n\nfrom nasim.envs.utils import AccessLevel\nfrom nasim.envs.host_vector import HostVector\n\n\nclass Obser"
  },
  {
    "path": "nasim/envs/render.py",
    "chars": 17179,
    "preview": "\"\"\"This module contains functions and classes for rendering NASim \"\"\"\nimport math\nimport random\nimport tkinter as Tk\nimp"
  },
  {
    "path": "nasim/envs/state.py",
    "chars": 9335,
    "preview": "import numpy as np\n\nfrom nasim.envs.host_vector import HostVector\nfrom nasim.envs.observation import Observation\n\n\nclass"
  },
  {
    "path": "nasim/envs/utils.py",
    "chars": 3875,
    "preview": "import enum\nimport numpy as np\nfrom queue import deque\nfrom itertools import permutations\n\nINTERNET = 0\n\n\nclass OneHotBo"
  },
  {
    "path": "nasim/scenarios/__init__.py",
    "chars": 2677,
    "preview": "from nasim.scenarios.utils import INTERNET\nfrom nasim.scenarios.scenario import Scenario\nfrom nasim.scenarios.loader imp"
  },
  {
    "path": "nasim/scenarios/benchmark/__init__.py",
    "chars": 1835,
    "preview": "import os.path as osp\n\nfrom nasim.scenarios.benchmark.generated import AVAIL_GEN_BENCHMARKS\n\nBENCHMARK_DIR = osp.dirname"
  },
  {
    "path": "nasim/scenarios/benchmark/generated.py",
    "chars": 3520,
    "preview": "\"\"\"A collection of definitions for generated benchmark scenarios.\n\nEach generated scenario is defined by the a number of"
  },
  {
    "path": "nasim/scenarios/benchmark/medium-multi-site.yaml",
    "chars": 3856,
    "preview": "# A WAN which has multiple 3 remote sites (subnets) connected to the main site\n# sensitive hosts:\n# 1) a server in serve"
  },
  {
    "path": "nasim/scenarios/benchmark/medium-single-site.yaml",
    "chars": 2660,
    "preview": "# A network with a single subnet that has one vulnerable host that must be compromised\n# to access other hosts behind fi"
  },
  {
    "path": "nasim/scenarios/benchmark/medium.yaml",
    "chars": 2954,
    "preview": "# A medium standard (one public subnet) network configuration\n#\n# 16 hosts\n# 5 subnets\n# 2 OS\n# 5 services\n# 3 processes"
  },
  {
    "path": "nasim/scenarios/benchmark/small-honeypot.yaml",
    "chars": 2264,
    "preview": "# A small standard (one public network) network configuration containing a\n# honeypot host (3, 2).\n#\n# 4 subnets\n# 8 hos"
  },
  {
    "path": "nasim/scenarios/benchmark/small-linear.yaml",
    "chars": 2793,
    "preview": "# A small network with\n#\n# 6 subnets\n# 8 hosts\n# 2 OS\n# 3 services\n# 2 processes\n# 3 exploits\n# 2 priv esc\n#\n# - subnets"
  },
  {
    "path": "nasim/scenarios/benchmark/small.yaml",
    "chars": 2145,
    "preview": "# A small standard (one public network) network configuration\n#\n# 4 subnets\n# 8 hosts\n# 2 OS\n# 3 services\n# 2 processes\n"
  },
  {
    "path": "nasim/scenarios/benchmark/tiny-hard.yaml",
    "chars": 1670,
    "preview": "# A harder version of the tiny standard (one public network) network configuration\n#\n# 3 subnets\n# 3 hosts\n# 2 OS\n# 3 se"
  },
  {
    "path": "nasim/scenarios/benchmark/tiny-small.yaml",
    "chars": 1958,
    "preview": "# A tiny-small standard (one public network) network configuration\n# (Not quite tiny, not quite small)\n#\n# 4 subnets\n# 5"
  },
  {
    "path": "nasim/scenarios/benchmark/tiny.yaml",
    "chars": 1464,
    "preview": "# A tiny standard (one public network) network configuration\n#\n# 3 hosts\n# 3 subnets\n# 1 service\n# 1 process\n# 1 os\n# 1 "
  },
  {
    "path": "nasim/scenarios/generator.py",
    "chars": 35703,
    "preview": "\"\"\"This module contains functionality for generating scenarios.\n\nSpecifically, it generates network configurations and a"
  },
  {
    "path": "nasim/scenarios/host.py",
    "chars": 3733,
    "preview": "\nclass Host:\n    \"\"\"A single host in the network.\n\n    Note this class is mainly used to store initial scenario data for"
  },
  {
    "path": "nasim/scenarios/loader.py",
    "chars": 22733,
    "preview": "\"\"\"This module contains functionality for loading network scenarios from yaml\nfiles.\n\"\"\"\nimport math\n\nimport nasim.scena"
  },
  {
    "path": "nasim/scenarios/scenario.py",
    "chars": 7936,
    "preview": "import math\nfrom pprint import pprint\n\nimport nasim.scenarios.utils as u\n\n\nclass Scenario:\n\n    def __init__(self, scena"
  },
  {
    "path": "nasim/scenarios/utils.py",
    "chars": 2003,
    "preview": "import os\nimport yaml\nimport os.path as osp\n\n\nSCENARIO_DIR = osp.dirname(osp.abspath(__file__))\n\n# default subnet addres"
  },
  {
    "path": "nasim/scripts/describe_scenarios.py",
    "chars": 1882,
    "preview": "\"\"\"This script will output description statistics of all benchmark\nscenarios.\n\nIt will output a table to stdout (and opt"
  },
  {
    "path": "nasim/scripts/run_dqn_policy.py",
    "chars": 2422,
    "preview": "\"\"\"A script for running a pre-trained DQN agent\n\nNote, user must ensure the DQN policy matches the NASim\nEnvironment use"
  },
  {
    "path": "nasim/scripts/run_random_benchmarks.py",
    "chars": 3546,
    "preview": "\"\"\"This script runs the random agent for all benchmarks scenarios\n\nThe mean (+/- stdev) steps and reward are reported in"
  },
  {
    "path": "nasim/scripts/train_dqn.py",
    "chars": 3293,
    "preview": "\"\"\"A script for training a DQN agent and storing best policy \"\"\"\n\nimport nasim\nfrom nasim.agents.dqn_agent import DQNAge"
  },
  {
    "path": "nasim/scripts/visualize_graph.py",
    "chars": 597,
    "preview": "\"\"\"Environment network graph visualizer\n\nThis script allows the user to visualize the network graph for a chosen\nbenchma"
  },
  {
    "path": "setup.py",
    "chars": 1916,
    "preview": "import pathlib\n\nfrom setuptools import setup, find_packages\n\nextras = {\n    'dqn': [\n        'torch>=1.5',\n        'tens"
  },
  {
    "path": "test/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "test/test_bruteforce.py",
    "chars": 1971,
    "preview": "\"\"\"Runs bruteforce agent on environment for different scenarios and\nusing different parameters to check no exceptions oc"
  },
  {
    "path": "test/test_env.py",
    "chars": 1949,
    "preview": "\"\"\"Runs some general tests on environment\"\"\"\n\nimport pytest\n\nimport nasim\nfrom nasim.scenarios.benchmark import \\\n    AV"
  },
  {
    "path": "test/test_generator.py",
    "chars": 534,
    "preview": "\"\"\"Runs bruteforce agent on environment for different scenarios and\nusing different parameters to check no exceptions oc"
  },
  {
    "path": "test/test_gym_bruteforce.py",
    "chars": 1109,
    "preview": "\"\"\"Runs bruteforce agent on environment for different scenarios and\nusing different parameters to check no exceptions oc"
  }
]
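Each manifest entry above is a plain JSON object with `path`, `chars`, and a truncated `preview` field. As a minimal sketch of consuming this data structure, the snippet below (with preview fields omitted and only a few sample entries copied from the listing) totals the character counts and finds the largest file:

```python
import json

# Sample entries in the same shape as the manifest above; sizes are
# taken from the listing, "preview" fields dropped for brevity.
manifest = json.loads("""
[
  {"path": "nasim/scenarios/generator.py", "chars": 35703},
  {"path": "nasim/scenarios/loader.py", "chars": 22733},
  {"path": "test/__init__.py", "chars": 0}
]
""")

# Aggregate sizes and locate the biggest file in the sample.
total = sum(entry["chars"] for entry in manifest)
largest = max(manifest, key=lambda e: e["chars"])
print(total)            # 58436
print(largest["path"])  # nasim/scenarios/generator.py
```

The same pattern applies to the full 91-entry manifest once loaded from the downloaded text.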

// ... and 1 more file (download for full content)

About this extraction

This page contains the full source code of the Jjschwartz/NetworkAttackSimulator GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 91 files (357.3 KB), approximately 88.9k tokens, and a symbol index with 443 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
